JP2002163631A

JP2002163631A - Dummy creature system, action forming method for dummy creature for the same system and computer readable storage medium describing program for making the same system action

Info

Publication number: JP2002163631A
Application number: JP2000363303A
Authority: JP
Inventors: Kaoru Suzuki; 薫鈴木
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2000-11-29
Filing date: 2000-11-29
Publication date: 2002-06-07
Anticipated expiration: 2020-11-29
Also published as: JP3854061B2

Abstract

PROBLEM TO BE SOLVED: To provide a dummy creature system, an action forming method for a dummy creature, and a medium supplied as a software application, with which the dummy creature newly generates motion patterns by itself. SOLUTION: A robot 1 (dummy creature system) is provided with a reliability learning part 25, an out-of-target action learning part 26, an other-party action learning part 27, a condition control action learning part 28, and a chain action learning part 29. The robot 1 is composed of a condition input part 21 for detecting conditions outside the robot 1 and holding a dummy emotion, a relational database part 22 for storing plural motion patterns, an action retrieving part 23 for extracting the motion pattern corresponding to the external condition and the dummy emotion, and an action output part 24 for moving the robot 1 according to this motion pattern. Motion patterns that are not stored in the relational database part 22 are generated and stored by the learning parts 25 to 29, so that the number of motion patterns increases and they can be selected quickly.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a simulated creature device, a method for forming the behavior of a simulated creature in a simulated creature device, and a computer-readable storage medium storing a program that forms the behavior of a simulated creature.

[0002]

[Prior Art] In recent years, a variety of software applications in which a character appears on a computer display screen, and a variety of physical robots with character-like personalities, have been released.

[0003] A well-known example implemented as a software application for personal computers is "TEO" (pronounced "Teo"; registered trademark of Fujitsu), which features a character resembling both a dolphin and a bird. A well-known example implemented as an autonomous robot is the metallic dog-shaped robot "AIBO" (registered trademark of Sony). A well-known example implemented as a semi-autonomous robot remotely controlled from an external computer is the wheeled robot "R100" (NEC), which has an owl-like face.

[0004] These software applications and robots are systems that aim to make the character whose form they have been given behave as if it were alive. To that end, each system is built to output responses, such as the character's emotional expressions and actions, according to input from the user.

[0005] Recently, some of these software applications and robots have gained the ability to recognize pattern information such as the user's face image or voice waveform, and are becoming sophisticated enough to respond using such pattern recognition results as input. Such a system is a so-called simulated creature system (or virtual creature system), which attempts to give simulated life to the character it portrays (not necessarily a real creature).

[0006] A simulated creature system recognizes the environment surrounding the computer or robot that embodies it, including actions taken by the user, and is designed to return a predetermined response according to the internal state of the simulated creature, which changes with the recognition results. For example, when the user pays attention to it, the simulated creature expresses joy; when the user teases it, it sulks or gets angry. Word-processing software and industrial robots offer only the dry interactivity needed to carry out practical tasks. Simulated creatures, by contrast, interact at the level of emotions and desires, and they are becoming widely accepted by people of all ages and genders: they have a soothing effect on the user, their lifelike qualities arouse the user's interest and curiosity, and users are pleased to be able to personally own such an elaborate robot with character.

[0007] Three methods of determining the behavior of such simulated creatures have been implemented to date, and conventional simulated creature systems have operated by one of them:
I. Selection-execution method
II. Gradual-release method
III. Reinforcement learning method
Method I is the one adopted by most simulated creature systems sold as toys. One behavior corresponding to the signal input to the system is selected from a plurality of behavior patterns stored in advance in the system's built-in storage unit, and is executed. Which behavior should follow which input signal is built into the system beforehand, and the correspondence between input (situation) and behavior (the behavior pattern) is crafted to produce lifelike responsiveness.

[0008] The signals input to the system are signals from optical sensors or pressure sensors provided on the computer or robot that embodies the system. A predetermined signal is input when, for example, the user passes in front of the system's optical sensor or strokes its pressure sensor. A behavior pattern is a pairing of an input that triggers a behavior with the behavior the simulated creature performs in response. For example, when the user passes in front of the optical sensor, a simulated creature in a software application calls out to the user, while a robot starts walking toward the user; when the user strokes the pressure sensor, the simulated creature makes a barking sound.
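The input-to-behavior pairing of method I can be pictured as a fixed lookup table. The following Python sketch is illustrative only: the signal names, behavior names, and the `select_behavior` function are hypothetical and are not taken from any of the products mentioned above.

```python
# Hypothetical sketch of method I: each input signal maps to one fixed behavior.
BEHAVIOR_PATTERNS = {
    "light_sensor_passed": "walk_toward_user",
    "pressure_sensor_stroked": "bark",
}


def select_behavior(signal):
    """Return the pre-stored behavior for a signal, or None if no pattern matches."""
    return BEHAVIOR_PATTERNS.get(signal)
```

Because the table never changes, the same signal always yields the same behavior, which is the predictability that the description goes on to criticize.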

[0009] However, in method I, a specific behavior pattern is selected for a given input signal and the simulated creature executes it, so the same behavior is repeated every time the situation is the same. Consequently, the more the user uses the simulated creature, the more they memorize its behavior patterns. They can sometimes predict its next action, lose interest and grow bored, or find it unlifelike and unpleasant. Furthermore, the simulated creature is programmed only to produce a fixed action in a given situation. It pays no attention to what situation results from that action and can only react in an extremely momentary fashion.

[0010] As an extension of method I, there is also a method in which several behavior patterns, rather than just one, are prepared for a given input signal and one is chosen on the spot by a random number. However, the creature still acts within a limited set of options prepared in advance and without considering the result, so the above problems remain.

[0011] Method II is used by talking simulated creature systems such as "Furby" (registered trademark of Tiger Electronics, Ltd.). It is the same as method I in that a behavior pattern corresponding to the input signal is selected from a plurality of behavior patterns stored in advance in the system's built-in storage unit, and is executed.

[0012] However, the behavior patterns stored here carry additional information on when they may appear and on the amount of input, so the behavior patterns that can be selected change with the time elapsed since the system was first started and with the amount of input given. For example, up to 100 hours after start-up, three behavior patterns, A, B, and C, can be expressed in response to an input signal, and one of them is executed according to the input signal. From 100 to 1000 hours there are eight behavior patterns, C through J, one of which is executed. Beyond 1000 hours there are behavior patterns C, F, G, and J through Z, one of which is executed. In this way, the simulated creature changes its behavior patterns with the passage of time and with the amount of attention it has received; that is, it appears to the user to grow.
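The time-gated example above (A to C, then C to J, then C, F, G, and J to Z) can be sketched as a function of the hours elapsed since first start-up. This is a minimal illustration under stated assumptions: the function name is hypothetical, the treatment of the exact boundaries at 100 and 1000 hours is a guess because the text does not specify it, and the dependence on input amount is not modeled.

```python
import string

LETTERS = string.ascii_uppercase


def selectable_patterns(hours_since_first_start):
    """Return the behavior patterns available at a given age (hypothetical sketch)."""
    if hours_since_first_start < 100:
        return list("ABC")
    if hours_since_first_start < 1000:
        # C through J: eight patterns.
        return list(LETTERS[LETTERS.index("C"):LETTERS.index("J") + 1])
    # C, F, G, then J through Z.
    return ["C", "F", "G"] + list(LETTERS[LETTERS.index("J"):])
```

The set of options changes over time, but at any moment the creature still picks from a list fixed in advance.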

[0013] Although method II differs from method I in that the selectable behavior patterns change with elapsed time and input amount, it is exactly the same as method I in that the creature acts within a limited set of options prepared in advance and without considering the result. Method II therefore has the same problems as method I.

[0014] Method III is a method with learning capability, adopted to give simulated creatures individuality and to refine their behavior. A frequency-of-expression parameter is attached to each of the behavior patterns stored in advance in the system's built-in storage unit, and the higher a behavior pattern's frequency of expression, the more often it is selected in the situation it corresponds to.

[0015] Here, several behavior patterns are selectable for a single input signal, and when a behavior selected from among them is performed, its frequency of expression changes according to how the user reacts.

[0016] For example, if the user reacts after the behavior by hitting the simulated creature (the input to the pressure sensor is large and brief), the simulated creature judges that it was "scolded" for the behavior it just performed and lowers that behavior's frequency of expression. If instead the user reacts by stroking the simulated creature (the input to the pressure sensor is small and prolonged), the simulated creature judges that it was "praised" for the behavior it just performed and raises that behavior's frequency of expression.

[0017] In this way, by praising or scolding the simulated creature for its actions, the user can teach this artificial life form good and bad behavior in the same natural manner as when dealing with a dog or cat. Learning carried out by giving only an evaluation of whether the result was good or bad is called reinforcement learning. Through reinforcement learning that raises and lowers the frequency of expression of behaviors, a simulated creature that at first performed behaviors regardless of whether they drew praise or scolding comes, as learning progresses, to perform only praised behaviors and to stop performing scolded ones. Because this selectivity among behavior patterns differs from one simulated creature to another depending on how the user has trained it, the simulated creature's individuality can be created.

[0018] However, although reinforcement learning forms the simulated creature's individuality, the creature can still only choose behaviors from a limited set of options prepared in advance. Once learning has peaked under that constraint, only the familiar, frequently expressed behaviors appear, so the user grows bored. In this respect method III has the same problem as methods I and II.

[0019] In methods I to III described above, when the simulated creature acts, one behavior pattern is selected from a limited set that was preset (stored) as suitable for the situation. As time passes, the user becomes able to predict the simulated creature's behavior patterns and loses interest, or, because the creature's reactions become uniform, concludes that it is after all just a toy and stops using it.

[0020] Moreover, although individuality can be imparted, the selected action is always one of the stored behavior patterns, so this individuality is determined merely by whether the range of selection is wide or narrow. The simulated creature never increases its behavior patterns beyond the preset range, nor acts in situations beyond the preset range, so the way it acquires individuality and grows was sometimes unsatisfying to the user.

[0021]

[Problems to Be Solved by the Invention] As described above, in conventional simulated creatures the behavior pattern to be executed is selected from among those stored in advance, so as the user continues to use the simulated creature its behavior becomes predictable and the user's interest fades.

[0022] The present invention was made in view of the above conventional problems. Its object is to provide a simulated creature system, a method for forming the behavior of a simulated creature in a simulated creature device, and a computer-readable storage medium storing a program for forming the behavior of a simulated creature, in which behavior patterns increase autonomously over time so that the user can keep using the simulated creature.

[0023]

[Means for Solving the Problems] To achieve the above object, the simulated creature device of the present invention is a simulated creature device that realizes a simulated creature moving according to a desired motion pattern. The device comprises, inside it: external situation input means for detecting the situation around the simulated creature as external parameter values; internal situation holding means for holding the simulated emotion of the simulated creature as internal parameter values; association database means for storing a plurality of the motion patterns, each as one set of association information consisting of the action the simulated creature is to perform, the external and internal parameter values of the simulated creature before the action, the external and internal parameter values of the simulated creature after the action, and the transition probability of the action; behavior retrieval means for selecting, from the association information in the association database means, the motion pattern that corresponds to the external parameter values and the internal parameter values and that the simulated creature is to perform; and behavior output means for moving the simulated creature based on the selected association information. The device further comprises reliability learning means, provided inside the device, which calculates the similarity between, on the one hand, the external parameter values detected and the internal parameter values held after the selected association information has been executed and, on the other hand, the external parameter values and internal parameter values of the selected association information, and which increases the transition probability of the selected association information when the similarity is at or above a predetermined value and decreases it when the similarity is below the predetermined value.
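The reliability learning step can be sketched as follows. This is a rough illustration under stated assumptions, not the patented implementation: the `Association` field names, the distance-based similarity measure, the threshold, the step size, and the clamping to the range 0 to 1 are all hypothetical, and the sketch assumes that the observed state is compared with the post-action values stored in the selected association information.

```python
from dataclasses import dataclass
import math


@dataclass
class Association:
    """One set of association information (field names are illustrative)."""
    action: str
    pre_state: tuple   # external + internal parameter values before the action
    post_state: tuple  # external + internal parameter values expected after the action
    transition_prob: float


def similarity(a, b):
    """Map the Euclidean distance between two equal-length vectors into (0, 1]."""
    return 1.0 / (1.0 + math.dist(a, b))


def reliability_update(assoc, observed_state, threshold=0.8, step=0.05):
    """Raise the transition probability if the observed result matched, else lower it."""
    if similarity(observed_state, assoc.post_state) >= threshold:
        assoc.transition_prob = min(1.0, assoc.transition_prob + step)
    else:
        assoc.transition_prob = max(0.0, assoc.transition_prob - step)
    return assoc.transition_prob
```

For instance, an association that predicts the state `(1.0, 0.5)` gains probability when exactly that state is observed after the action and loses it when a distant state such as `(5.0, 5.0)` is observed.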

[0024] The behavior forming method of the present invention is a method for a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and that has association database means for storing the motion patterns as association information consisting of a plurality of parameters. The method comprises: a step of detecting the situation around the simulated creature as external parameter values; a step of holding the simulated emotion of the simulated creature as internal parameter values; a behavior retrieval step of selecting, from the association information in the association database means, the motion pattern that corresponds to the external parameter values and the internal parameter values and that the simulated creature is to perform; a behavior output step of moving the simulated creature based on the selected association information; and a reliability learning step of calculating the similarity between, on the one hand, the external parameter values detected and the internal parameter values held after the selected association information has been executed and, on the other hand, the external parameter values and internal parameter values of the selected association information, increasing the transition probability of the selected association information when the similarity is at or above a predetermined value, and decreasing it when the similarity is below the predetermined value.

[0025] The storage medium of the present invention stores, in computer-readable form, a program that causes a simulated creature device to perform behavior formation, the device operating a simulated creature that moves according to a desired motion pattern and having association database means for storing the motion patterns as association information consisting of a plurality of parameters. The program causes the device to: detect the situation around the simulated creature as external parameter values; hold the simulated emotion of the simulated creature as internal parameter values; select, from the association information in the association database means, the motion pattern that corresponds to the external parameter values and the internal parameter values and that the simulated creature is to perform; move the simulated creature based on the selected association information; calculate the similarity between, on the one hand, the external parameter values detected and the internal parameter values held after the selected association information has been executed and, on the other hand, the external parameter values and internal parameter values of the selected association information; and increase the transition probability of the selected association information when the similarity is at or above a predetermined value, and decrease it when the similarity is below the predetermined value.

[0026] The computer program of the present invention is a computer program for operating a simulated creature device that operates a simulated creature moving according to a desired motion pattern and that has association database means for storing the motion patterns as association information consisting of a plurality of parameters. The program has: a function of detecting the situation around the simulated creature as external parameter values; a function of holding the simulated emotion of the simulated creature as internal parameter values; a function of selecting, from the association information in the association database means, the motion pattern that corresponds to the external parameter values and the internal parameter values and that the simulated creature is to perform; a function of moving the simulated creature based on the selected association information; a function of calculating the similarity between, on the one hand, the external parameter values detected and the internal parameter values held after the selected association information has been executed and, on the other hand, the external parameter values and internal parameter values of the selected association information; and a function of increasing the transition probability of the selected association information when the similarity is at or above a predetermined value, and decreasing it when the similarity is below the predetermined value.

[0027] A robot of the present invention is a robot that moves according to a desired motion pattern. The robot comprises, inside it: external situation input means for detecting the situation around the robot as external parameters; internal situation holding means for holding the simulated emotion of the robot as internal parameters; association database means for storing a plurality of the motion patterns, each as one set of association information consisting of the action the robot is to perform, the external and internal parameters of the robot before the action, the external and internal parameters of the robot after the action, and the success probability of the action; behavior retrieval means for selecting, from the association information in the association database means, the motion pattern that corresponds to the external parameters and the internal parameters and that the robot is to perform; and behavior output means for moving the robot based on the selected association information. The robot further comprises: reliability (similarity) calculation means, provided inside the robot, for calculating the similarity between, on the one hand, the external parameters detected and the internal parameters held after the selected association information has been executed and, on the other hand, the external parameters and internal parameters of the selected association information; and non-target behavior learning means, provided inside the robot, which increases the transition probability of the selected association information when the similarity is at or above a predetermined value, decreases it when the similarity is at or below the predetermined value, generates association information that combines as one set the external and internal parameters before the selected association information was executed, the action of the selected association information, the external and internal parameters after the robot's action, and the transition probability of the action, and stores the generated association information in the association database means if it is not already stored there.
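The "store if not already stored" step of non-target behavior learning can be sketched as below. This is a hedged illustration only: the dictionary layout, the `learn_non_target` function, and the initial transition probability are hypothetical, and the text does not say what probability a newly generated association receives or whether generation happens after every action or only after an outcome that missed the expected one.

```python
def learn_non_target(database, pre_state, action, observed_post_state, initial_prob=0.5):
    """Store the actually observed outcome as a new association if it is not stored yet.

    `database` maps (pre_state, action, post_state) to a transition probability.
    Returns True when a new association was added.
    """
    key = (tuple(pre_state), action, tuple(observed_post_state))
    if key in database:
        return False
    database[key] = initial_prob  # assumed starting value, not given in the text
    return True
```

Under this reading, an action that leads somewhere unexpected still enlarges the database, so the set of known motion patterns grows with experience instead of staying fixed.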

[0028] Another robot of the present invention is a robot that moves according to a desired motion pattern. The robot comprises, inside it: external situation input means for detecting the situation around the robot as external parameters; internal situation holding means for holding the simulated emotion of the robot as internal parameters; association database means for storing a plurality of the motion patterns the robot is to perform, each as one set of association information having the action the robot is to perform, the external and internal parameters of the robot before the action, and the external and internal parameters of the robot after the action; behavior retrieval means for selecting, from the association information in the association database means, the motion pattern that corresponds to the external parameters and the internal parameters and that the robot is to perform; and behavior output means for moving the robot based on the selected association information. The robot further comprises other-party behavior learning means which, when a detected object is detected by the external situation input means and the detected object performs an action, generates as one set of association information the action performed by the detected object, the external parameters detected before and after the action, and internal parameters whose values are the emotion of the detected object as predicted from its state before and after the action, and which stores the generated association information in the association database means if it is not already stored there.

【００２９】また、本発明のロボットは、所望の動作パタ
ンによって運動するロボットであって、このロボットの
内部に、前記ロボット周囲の状況を外部パラメータとし
て検知する外部状況入力手段と、前記ロボットの擬似的
感情を内部パラメータとして保持する内部状況保持手段
と、このロボットに行わせる動作パタンを、このロボッ
トに行わせる行動、この行動前の前記ロボットの外部パ
ラメータ及び内部パラメータ、この行動後の前記ロボッ
トの外部パラメータ及び内部パラメータ、を有する一組
の連関情報として、複数記憶する連関データベース手段
と、前記外部パラメータ及び前記内部パラメータに対応
し、このロボットに行わせる前記動作パタンを前記連関
データベース手段の前記連関情報から選択する行動検索
手段と、前記選択された連関情報に基づいて運動させる
行動出力手段と、を有する擬似生物装置において、前記
連関データベース手段に記憶された前記連関情報から同
一の行動を有する前記連関情報を抽出し、抽出された前
記連関情報のうち、前記行動後の前記内部パラメータが
前記行動前の前記内部パラメータよりも増加して前記擬
似的感情が好転される連関情報と、減少して暗転される
連関情報とを比較し、好転される前記連関情報だけに含
まれる前記外部もしくは前記内部パラメータを暗転され
る前記連関情報に追加し、暗転される前記連関情報だけ
に含まれる前記外部もしくは前記内部パラメータを暗転
される前記連関情報から削除し、この外部及び内部パラ
メータの追加及び削除がなされた前記連関情報が前記連
関データベース手段に記憶されていなければ連関情報と
して記憶する状況調整行動学習手段とを具備する。Further, the robot of the present invention is a robot that moves according to a desired motion pattern, the robot internally having: external situation input means for detecting the situation around the robot as external parameters; internal situation holding means for holding the pseudo-emotion of the robot as internal parameters; association database means for storing a plurality of motion patterns to be performed by the robot, each as a set of association information having an action to be performed by the robot, the external and internal parameters of the robot before the action, and the external and internal parameters of the robot after the action; action search means for selecting, from the association information of the association database means, the motion pattern to be performed by the robot that corresponds to the external and internal parameters; and action output means for moving the robot based on the selected association information. The robot further comprises situation adjustment action learning means that extracts, from the association information stored in the association database means, the association information having the same action; compares, among the extracted association information, association information in which the internal parameter after the action increases relative to the internal parameter before the action so that the pseudo-emotion improves with association information in which it decreases so that the pseudo-emotion worsens; adds to the worsening association information the external or internal parameters contained only in the improving association information; deletes from the worsening association information the external or internal parameters contained only in the worsening association information; and, if the association information resulting from these additions and deletions is not stored in the association database means, stores it as association information.

【００３０】また、本発明のロボットは、所望の動作パタ
ンによって運動するロボットであって、このロボットの
内部に、前記ロボット周囲の状況を外部パラメータとし
て検知する外部状況入力手段と、前記ロボットの擬似的
感情を内部パラメータとして保持する内部状況保持手段
と、このロボットに行わせる動作パタンを、このロボッ
トに行わせる行動、この行動前の前記ロボットの外部パ
ラメータ及び内部パラメータ、この行動後の前記ロボッ
トの外部パラメータ及び内部パラメータ、この行動の成
功確率、を一組の連関情報として、複数記憶する連関デ
ータベース手段と、前記外部パラメータ及び前記内部パ
ラメータに対応し、このロボットに行わせる前記動作パ
タンを前記連関データベース手段の前記連関情報から選
択する行動検索手段と、前記選択された連関情報に基づ
いて運動させる行動出力手段と、を有する擬似生物装置
において、前記連関データベース手段に記憶される前記
連関情報のうち、第1の連関情報の行動後の前記ロボッ
トの外部パラメータ及び内部パラメータと、第2の連関
情報の行動前の前記ロボットの外部パラメータ及び内部
パラメータとが略一致する前記第1及び第2の連関情報を
抽出し、前記第1の連関情報の行動前の前記ロボットの
外部パラメータ及び内部パラメータと、前記第1の連関
情報の行動と、前記第2の連関情報の行動と、前記第2の
連関情報の行動後の前記ロボットの外部パラメータ及び
内部パラメータと、を一組の連関情報として生成し、こ
の生成された連関情報が前記連関データベース手段に記
憶されていなければ連関情報として記憶する連鎖行動学
習手段とを具備する。Further, the robot of the present invention is a robot that moves according to a desired motion pattern, the robot internally having: external situation input means for detecting the situation around the robot as external parameters; internal situation holding means for holding the pseudo-emotion of the robot as internal parameters; association database means for storing a plurality of motion patterns to be performed by the robot, each as a set of association information made up of an action to be performed by the robot, the external and internal parameters of the robot before the action, the external and internal parameters of the robot after the action, and the success probability of the action; action search means for selecting, from the association information of the association database means, the motion pattern to be performed by the robot that corresponds to the external and internal parameters; and action output means for moving the robot based on the selected association information. The robot further comprises chain action learning means that extracts, from the association information stored in the association database means, first and second association information for which the external and internal parameters of the robot after the action of the first association information substantially match the external and internal parameters of the robot before the action of the second association information; generates, as a set of association information, the external and internal parameters of the robot before the action of the first association information, the action of the first association information, the action of the second association information, and the external and internal parameters of the robot after the action of the second association information; and, if the generated association information is not stored in the association database means, stores it as association information.

【００３１】[0031]

【発明の実施の形態】以下、本発明の実施の形態につい
て図面を参照して説明する。Embodiments of the present invention will be described below with reference to the drawings.

【００３２】本発明の疑似生物装置が提供する擬似生物
は、その外形として擬人的もしくは擬生物的な形態を有
しており、例えば動物の形状であればハムスター、ネ
コ、サル、イヌ、ウサギ、オウム、クマ、クジラ等を模
したものであり、植物であればヒマワリ、カエデ、スウ
ィートバジル等を模したものである。The simulated creature provided by the simulated creature device of the present invention has an anthropomorphic or creature-like external form. In the shape of an animal it is modeled on, for example, a hamster, cat, monkey, dog, rabbit, parrot, bear, or whale; in the shape of a plant it is modeled on a sunflower, maple, sweet basil, or the like.

【００３３】また、そのため、本発明の擬似生物装置は、
その実施の形態として、擬似生物の姿形を表示可能なコ
ンピュータ（電子機器）であったり、擬似生物を象った
ロボットであったりする。Accordingly, the simulated creature device of the present invention may be embodied as a computer (electronic device) capable of displaying the figure of a simulated creature, or as a robot shaped like a simulated creature.

【００３４】図1乃至図4は本発明に係る疑似生物システ
ムをロボットにて実施した第1の実施形態を示すもので
ある。FIGS. 1 to 4 show a first embodiment in which a simulated biological system according to the present invention is implemented by a robot.

【００３５】図1はイヌ形状のロボットの斜視図であ
る。FIG. 1 is a perspective view of a dog-shaped robot.

【００３６】ロボット1の外形は略イヌであり、複数の
可動部と本体とからなる。可動部は、頭部2、右腕部3、
左腕部4、右足部5、左足部6、しっぽ7からなる。本体
は、胴体部8である。また胴体部8には、頭部2、右腕部
3、左腕部4、右足部5、左足部6、しっぽ7が関節を介し
て接続され、胴体部8に対して運動可能である。この関
節には、駆動力を発生するモータ、モータの駆動力が伝
達されて駆動された各部の時々刻々の姿勢（関節角度）
を検知する角度センサ、が設けられる。The outer shape of the robot 1 is roughly that of a dog, and the robot consists of a plurality of movable parts and a main body. The movable parts are a head 2, a right arm 3, a left arm 4, a right foot 5, a left foot 6, and a tail 7. The main body is a torso 8. The head 2, right arm 3, left arm 4, right foot 5, left foot 6, and tail 7 are connected to the torso 8 via joints and can move relative to it. Each joint is provided with a motor that generates a driving force and an angle sensor that detects the moment-to-moment posture (joint angle) of the part driven by that motor.

【００３７】また、頭部2には、目に該当する個所に発
光色の異なる点滅可能なLED9や撮像カメラ10や赤外線式
距離センサ11、耳に該当する個所にはマイク12、口に該
当する部分にはスピーカ13、額に該当する個所には感圧
センサ14、がそれぞれ設けられる。なお、口に該当する
個所と耳に該当する個所は所定の方向に運動可能であ
る。The head 2 is provided with blinkable LEDs 9 of different emission colors, an imaging camera 10, and an infrared distance sensor 11 at the positions corresponding to the eyes; a microphone 12 at the positions corresponding to the ears; a speaker 13 at the position corresponding to the mouth; and a pressure sensor 14 at the position corresponding to the forehead. The parts corresponding to the mouth and the ears can move in predetermined directions.

【００３８】また、右腕部3、左腕部4、右足部5、左足
部6、しっぽ7、胴体部8にもそれぞれ感圧センサ14が設
けられる。The right arm 3, the left arm 4, the right foot 5, the left foot 6, the tail 7, and the body 8 are also provided with pressure sensors 14, respectively.

【００３９】次に、図2はロボットのブロック構成図で
ある。Next, FIG. 2 is a block diagram of the robot.

【００４０】胴体部8には、制御部15が内蔵される。制
御部15は、CPU16と、ROMとRAMからなるメモリ17とから
なる。CPU16は、各可動部に設けられた各種センサ（撮
像カメラ10、赤外線式距離センサ11、マイク12、スピー
カ13、感圧センサ14、角度センサ）からの信号を受け取
り、また各関節のモータを駆動させるための信号を出力
できるように接続されている。A control unit 15 is built into the torso 8. The control unit 15 consists of a CPU 16 and a memory 17 made up of ROM and RAM. The CPU 16 is connected so that it can receive signals from the various sensors provided on the movable parts (the imaging camera 10, infrared distance sensor 11, microphone 12, speaker 13, pressure sensors 14, and angle sensors) and can output signals for driving the motors of the joints.

【００４１】次に、本発明の機能を図3の本発明の機能
を説明するためのブロック線図を用いて説明する。Next, the functions of the present invention are described with reference to the functional block diagram of FIG. 3.

【００４２】このロボットの機能を「行動発現能力」、
「行動形成能力」の2点について説明する。・行動発現能力行動発現能力を機能させるための構成は、状況入力部2
1、連関データベース部22、行動検索部23、行動出力部2
4である。The functions of this robot are described under two headings: "action expression ability" and "action formation ability".
- Action expression ability: The configuration that provides the action expression ability consists of the situation input unit 21, the association database unit 22, the action search unit 23, and the action output unit 24.

【００４３】状況入力部21は、各種センサ（ロボット1
の各可動部、本体に設けられた撮像カメラ10、赤外線式
距離センサ11、マイク12、感圧センサ14、各関節の角度
センサ）、各種センサからの信号を受け取り処理するCP
U16、信号およびその処理結果を記憶保持しておくメモ
リ17の一部により実現される。すなわち、各種センサか
らの信号を認識する処理がCPU16により実行される。The situation input unit 21 is realized by the various sensors (the imaging camera 10, infrared distance sensor 11, microphone 12, and pressure sensors 14 provided on the movable parts and main body of the robot 1, and the angle sensors of the joints), by the CPU 16, which receives and processes the signals from these sensors, and by a part of the memory 17, which stores the signals and their processing results. That is, the processing that recognizes the signals from the sensors is executed by the CPU 16.

【００４４】また、連関データベース部22は、メモリ17
の一部を使って実現される。The association database unit 22 is realized using a part of the memory 17.

【００４５】また、行動検索部23は、CPU16により実現
される。The action search unit 23 is realized by the CPU 16.

【００４６】また、行動出力部24は、各種出力デバイス
（LED9、スピーカ13、各関節のモータ）、各種出力デバ
イスを制御するCPU16、その制御のための情報を記憶す
るメモリ17の一部により実現される。すなわち、ロボッ
トの動作や表情表出を各種出力デバイスを介して出力す
る処理がCPU16により実行される。The action output unit 24 is realized by the various output devices (the LEDs 9, the speaker 13, and the motors of the joints), by the CPU 16, which controls these output devices, and by a part of the memory 17, which stores the information used for that control. That is, the processing that outputs the robot's movements and expressions through the output devices is executed by the CPU 16.

【００４７】まず、状況入力部21は、ロボット1内外の
状況データを得る。この状況データはM（Mは自然数）個
のパラメータ値を有し、このM個のパラメータはロボッ
ト1の外界から得られる情報に関する外部状態パラメー
タ（外部パラメータ値）と、ロボット1の内部の状態を
示す内部状態パラメータ（内部パラメータ値）とから成
っている。First, the situation input unit 21 obtains situation data from inside and outside the robot 1. This situation data has M parameter values (M is a natural number). The M parameters consist of external state parameters (external parameter values), which relate to information obtained from the world outside the robot 1, and internal state parameters (internal parameter values), which indicate the internal state of the robot 1.

【００４８】外部状態パラメータは、撮像カメラ10、静
電容量式距離センサ11、マイク12、感圧センサ14によっ
て検知され、被検出体の存在、被検出体の感情状態や行
動、被検出体もしくはロボット1自体の行動に対する結
果、ロボット1が検知した周囲の環境の状態である。例
えば、被検出体とは人間などであり、被検出体の感情状
態とは人間の喜怒哀楽などであり、被検出体の行動とは
人間が走る行動であり、また行動に対する結果とは人間
が走った時に何かにぶつかったという状態であり、また
環境の状態とはロボット1の周囲のどの方向に何人の人
間が存在するか、といったことである。The external state parameters are detected by the imaging camera 10, the capacitance-type distance sensor 11, the microphone 12, and the pressure sensors 14. They represent the presence of a detected object, the emotional state and actions of the detected object, the result of an action by the detected object or by the robot 1 itself, and the state of the surrounding environment as detected by the robot 1. For example, the detected object is a human or the like; the emotional state of the detected object is a human emotion such as joy, anger, sorrow, or pleasure; an action of the detected object is a human running; the result of an action is that the human collided with something while running; and the state of the environment is how many people are present in which direction around the robot 1.

【００４９】内部状態パラメータは、外部状態パラメー
タ及び経過時間の影響を受けて状況入力部21内部で形成
されるロボット1独自の擬似的感情や欲求、被検出体に
対するロボット1の好悪判断結果、ロボット1が行った行
動、ロボット1が行動中か否かである。例えば、欲求と
は時間の経過と共に発生する空腹感等の欲求であり、ま
た擬似的感情とは空腹感を有した時に感じるイライラす
る感情であり、また行動とは被検出体が人間であると判
断した時に近付く行動であり、また行動中であるか否か
とは被検出体を検知した後被検出体に近付いているか否
かであり、また被検出体が人間であった場合被検出体に
近接して頭部2を撫でられた（感圧センサ14が検知する
押圧力が小さく長い時間接触あり）時にロボット1に生
成させる満足感なる感情と撫でた人間の顔の記憶、とい
ったことも含む。The internal state parameters are the pseudo-emotions and desires unique to the robot 1, which are formed inside the situation input unit 21 under the influence of the external state parameters and elapsed time; the robot 1's like/dislike judgment of a detected object; the action the robot 1 has performed; and whether the robot 1 is currently acting. For example, a desire is one such as hunger that arises as time passes; a pseudo-emotion is the irritation felt when hungry; an action is approaching a detected object once it has been judged to be a human; and whether the robot is acting is whether it is approaching the detected object after detecting it. The internal state parameters also include, when the detected object is a human and the robot has approached and had its head 2 stroked (the pressure sensor 14 detects a small pressing force with contact over a long time), the feeling of satisfaction generated in the robot 1 and the memory of the face of the human who stroked it.

【００５０】このような外部状態パラメータ及び内部状
態パラメータは、ロボットが扱うことのできる事象や事
物や行動などを識別するためのID番号等の記号データ
と、ロボットの擬似的感情や欲求の強度値を表す数値デ
ータで示される。These external and internal state parameters are expressed as symbol data, such as ID numbers identifying the events, things, and actions that the robot can handle, and as numerical data representing the intensity of the robot's pseudo-emotions and desires.

【００５１】記号データについては、以下の表1のとお
りである。The symbol data is as shown in Table 1 below.

【表１】また、数値データとしては、以下の表2のとおりであ
る。強度値の範囲は例えば0〜255の256段階とする。[Table 1] The numerical data is as shown in Table 2 below. The range of the intensity value is, for example, 256 levels from 0 to 255.

【表２】数値データを記載するパラメータがK個（K≦M）あった
場合には、これらのパラメータはひとまとめにしてK個
のスカラ量から成る一本のK次元ベクトルとして扱うこ
とが可能である。さらに、数値データの各パラメータに
は各々その重要度に応じた重みが予め定義されており、
各パラメータ値は所定の重み係数を乗じられた後にベク
トルとしてまとめられ、さらにそのベクトルの長さ（ノ
ルム）を1に正規化される。この正規化されて生成され
たベクトルを状況ベクトルと定義すると、この状況ベク
トルの向きによって数値データが示している状況の特徴
を表すことができる。[Table 2] When K of the parameters (K ≦ M) carry numerical data, they can be handled together as a single K-dimensional vector made up of K scalar quantities. In addition, a weight reflecting its importance is defined in advance for each numerical parameter: each parameter value is multiplied by its weighting coefficient, the results are gathered into a vector, and the length (norm) of that vector is then normalized to 1. If the vector produced by this normalization is defined as the situation vector, the direction of the situation vector represents the characteristics of the situation indicated by the numerical data.
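
The weighting and normalization step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `situation_vector`, the example intensities, and the weight values are hypothetical, and the handling of an all-zero input is a choice made here, not something the specification defines.

```python
import math

def situation_vector(values, weights):
    """Multiply each numerical parameter by its weight, then scale to unit length."""
    if len(values) != len(weights):
        raise ValueError("values and weights must both have length K")
    weighted = [v * w for v, w in zip(values, weights)]
    norm = math.sqrt(sum(x * x for x in weighted))
    if norm == 0.0:
        # Assumption: an all-zero input has no direction, so return it unchanged.
        return [0.0] * len(weighted)
    return [x / norm for x in weighted]

# Hypothetical K = 3 intensities in the range 0..255 with made-up weights.
vec = situation_vector([221, 0, 25], [1.0, 1.0, 0.5])
```

Because only the direction of the vector survives normalization, two situations whose intensities differ by a common scale factor map to the same situation vector.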

【００５２】この状況データたるM個のパラメータと、
そのうちのK個の数値データから生成される一本の状況
ベクトルとから構成されるデータを状況情報と呼ぶこと
にする。The data consisting of the M parameters that make up the situation data, together with the single situation vector generated from the K numerical parameters among them, is called situation information.

【００５３】この状況情報は、所定周期Tで定期的に検
知形成され、メモリ17内に設けられたリングバッファに
最新のL周期分、すなわち期間（T×L）相当分が記憶さ
れていく。すなわち、新たな状況情報が検知形成される
と、リングバッファの最も古い状況情報が破棄されて、
最新のものに置き換えられる。このリングバッファに蓄
積されている最新L周期分の状況情報の羅列を状況情報
列と呼ぶことにする。This situation information is detected and formed periodically at a predetermined period T, and the latest L periods of it, that is, an amount corresponding to the duration (T × L), are stored in a ring buffer provided in the memory 17. Whenever new situation information is formed, the oldest situation information in the ring buffer is discarded and replaced by the newest. The sequence of situation information for the latest L periods held in the ring buffer is called a situation information sequence.

【００５４】また、後述するロボットの行動形成のため
に、状況入力部21には、状況情報列を記憶するリングバ
ッファが3つ用意されている。In order to form the behavior of the robot described later, the situation input unit 21 is provided with three ring buffers for storing a situation information sequence.

【００５５】1つめは、状況入力部21による定期的な状況
情報の形成に伴って更新される最新の状況情報列を記憶
しておくためのリングバッファ1である。2つめは、ロボ
ットの行動開始直前までの状況情報を記憶しておくため
のリングバッファ2である。そして、3つめは、ロボット
の行動完了直後からの状況情報を記憶しておくためのリ
ングバッファ3である。The first is a ring buffer 1 for storing the latest status information sequence updated as the status information is regularly formed by the status input unit 21. The second is a ring buffer 2 for storing situation information until immediately before the action of the robot starts. The third is a ring buffer 3 for storing status information immediately after the completion of the action of the robot.

【００５６】リングバッファ1は、ロボット内外の現在の
状況を表すものである。また、リングバッファ2は、ロボ
ットが行動を開始した時点でリングバッファ1の全内容
を複写したものであり、後述する行動前の実際の状況Ji
cはこのリングバッファ2の内容として読み出される。そ
して、リングバッファ3は、ロボットが行動を完了した時
点からL周期分の状況情報をリングバッファ1の更新に伴
って写し取ったものであり、後述する行動後の実際の状
況Jdcはこのリングバッファ3の内容として読み出され
る。つまり、リングバッファ2と3は、行動の前後の状況
を記憶しておくためのバッファであり、ロボットは、自
己の行動の前後が実際にどのような状況であったかをこ
こから読み出して利用することができる。Ring buffer 1 represents the current situation inside and outside the robot. Ring buffer 2 is a copy of the entire contents of ring buffer 1 taken at the moment the robot starts an action; the actual situation Jic before the action, described later, is read out as the contents of ring buffer 2. Ring buffer 3 holds the L periods of situation information from the moment the robot completes the action, copied as ring buffer 1 is updated; the actual situation Jdc after the action, described later, is read out as the contents of ring buffer 3. In other words, ring buffers 2 and 3 store the situations before and after an action, and the robot can read from them what the situation actually was before and after its own action.
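
The interplay of the three ring buffers can be sketched as follows. The class name `SituationBuffers` and its method names are hypothetical, and situation information is stood in for by plain integers; the sketch only illustrates the copy-at-start and record-after-completion behavior described above.

```python
from collections import deque

class SituationBuffers:
    """Three ring buffers of length L, as in the description above."""

    def __init__(self, length):
        self.length = length
        self.current = deque(maxlen=length)        # ring buffer 1: latest L periods
        self.before_action = deque(maxlen=length)  # ring buffer 2: snapshot at action start
        self.after_action = deque(maxlen=length)   # ring buffer 3: L periods after completion
        self._recording_after = False

    def push(self, info):
        """Called once per period T with newly formed situation information."""
        self.current.append(info)  # the oldest entry is dropped automatically
        if self._recording_after:
            self.after_action.append(info)
            if len(self.after_action) == self.length:
                self._recording_after = False  # L periods captured, stop copying

    def action_started(self):
        """Copy the whole of ring buffer 1 into ring buffer 2."""
        self.before_action = deque(self.current, maxlen=self.length)

    def action_completed(self):
        """Start filling ring buffer 3 from the next period onward."""
        self.after_action.clear()
        self._recording_after = True

buffers = SituationBuffers(length=3)
for t in range(1, 6):
    buffers.push(t)
buffers.action_started()      # snapshot of the situation before the action
buffers.push(6)
buffers.action_completed()
for t in range(7, 11):
    buffers.push(t)
```

After this run, `before_action` holds the three periods preceding the action and `after_action` the three periods following its completion, while `current` keeps advancing.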

【００５７】連関データベース部22には、連関情報R
（後述）が記憶されている。連関データベース部22はメ
モリ17の一部を使用して形成されており、メモリ17に
は、この他に、各種センサからの信号によって使用者や
机等の事象や事物をロボットが認識するためのパタン辞
書情報、ロボットの行動（動作や表情表出）に必要な身
体制御情報と出力用音データ、状況入力部21により形成
される外部状態パラメータと内部状態パラメータから形
成される状況情報列を保持するためのリングバッファ1
〜3、その他ロボットを動作させたり学習させたりする
ための各種暫定的あるいは恒久的情報が記憶されてい
る。The association database unit 22 stores association information R (described later). The association database unit 22 is formed using a part of the memory 17. The memory 17 additionally stores pattern dictionary information with which the robot recognizes events and things such as the user or a desk from the sensor signals; the body control information and output sound data needed for the robot's actions (movements and expressions); ring buffers 1 to 3, which hold the situation information sequences formed from the external and internal state parameters produced by the situation input unit 21; and various other temporary or permanent information used to operate and train the robot.

【００５８】連関情報Rとは、ある状況Jiから他の状況J
dへと状況を変化させる行動Acが存在するとき（Ji+Ac→
Jdと記述）、Jiを表す状況情報列と、Jdを表す状況情報
列と、Acを表す一つの行動情報とを1組とする情報から
構成されたものであり、R=［Ji,Ac,Jd］と記述される。
連関情報が擁する状況JiやJdは、各々の状況下において
状況入力部21が観測するであろうL周期分の状況情報の
羅列（状況情報列）であり、このような状況情報列が観
測されたらそれはJiやJdであるということを示す見本デ
ータである。Association information R is defined for the case where there is an action Ac that changes the situation from one situation Ji to another situation Jd (written Ji + Ac → Jd). It consists of one set of a situation information sequence representing Ji, a situation information sequence representing Jd, and one piece of action information representing Ac, and is written R = [Ji, Ac, Jd]. The situations Ji and Jd held in association information are each a sequence of L periods of situation information (a situation information sequence) that the situation input unit 21 would observe under that situation. They are sample data indicating that, if such a situation information sequence is observed, the situation is Ji or Jd.

【００５９】このときのJiを初期状況、Jdを目的状況、
JiからJdへの遷移を起こす行動Acを状況遷移行動と呼ぶ
ことにする。状況遷移行動Acは、上述の表1に例示した
単位動作（行動素片）を識別するためのID情報（例えば
表1の0002「歩く」）と、このID情報で識別される行動
の単位動作（行動素片）を実行するタイミング情報（例
えば0秒後）と、必要に応じて付加される行動の対象
（例えば表1の0001「使用者」）などを表す動作パラメ
ータ情報とから成る行動素片情報を一つ以上含んだリス
トとして構成される。例えば、連関情報とは下記のよう
なものである。Here, Ji is called the initial situation, Jd the target situation, and the action Ac that causes the transition from Ji to Jd the situation transition action. The situation transition action Ac is structured as a list containing one or more pieces of action element information. Each piece consists of ID information identifying a unit motion (action element) such as those illustrated in Table 1 above (for example, 0002 "walk" in Table 1), timing information for executing that unit motion (for example, after 0 seconds), and motion parameter information, added as needed, indicating for instance the target of the action (for example, 0001 "user" in Table 1). Examples of association information are given below.

【００６０】連関情報R0001：初期状況Ji ＝「使用者が側にいなくて寂しい」
（寂しさ=221）状況遷移行動Ac ＝「その場で使用者を探す」（行動
素片ID=0004＋0秒後）目的状況Jd ＝「使用者を見つけて嬉しい」（被
検出体ID=0001、喜び=255）連関情報R0002：初期状況Ji ＝「見知らぬ人がいて恐い」（被検
出体ID=0009、恐れ=194）状況遷移行動Ac ＝「その人に吠える」（行動素片ID=
0003＋対象ID=0009＋0秒後）目的状況Jd ＝「見知らぬ人がいなくなって安
心」（恐れ=25）連関情報R0003：初期状況Ji ＝「使用者が側にいなくて寂しい」
（寂しさ=221）状況遷移行動Ac ＝「吠える」（行動素片ID=0003＋0
秒後）目的状況Jd ＝「使用者がやってきて嬉しい」
（被検出体ID=0001、喜び=255）連関情報R0004：初期状況Ji ＝「声が聞こえる」（被検出体ID=00
10＋方位=40°）状況遷移行動Ac ＝「そこへ向かいながら声の主を探
す」（行動素片ID=0002＋方位=40°＋0秒後）（行動素
片ID=0004＋0秒後）目的状況Jd ＝「声の主（使用者）を見つけて嬉
しい」（被検出体ID=0001、喜び=255）また、状況遷移行動Acは、歩く、吠える、見回す等の単
位動作を指示する行動素片情報を一つ以上含んでおり、
この行動素片情報が複数個、例えば「歩きながら声の主
を探す（行動素片情報は、歩く、声の主を探すの2
つ）」というようなものであっても構わない。さらに、
状況Jiによっては何もせず待つだけで自動的に状況Jdに
至ることもあり、対象を特定しない行動素片「待つ」を
1つだけ持つ行動情報Acは結果として何もしないという
行動を表す。Association information R0001:
  Initial situation Ji = "Lonely because the user is not nearby" (loneliness = 221)
  Situation transition action Ac = "Look for the user on the spot" (action element ID = 0004 + after 0 seconds)
  Target situation Jd = "Happy to have found the user" (detected object ID = 0001, joy = 255)
Association information R0002:
  Initial situation Ji = "Afraid because a stranger is present" (detected object ID = 0009, fear = 194)
  Situation transition action Ac = "Bark at that person" (action element ID = 0003 + target ID = 0009 + after 0 seconds)
  Target situation Jd = "Relieved because the stranger has gone" (fear = 25)
Association information R0003:
  Initial situation Ji = "Lonely because the user is not nearby" (loneliness = 221)
  Situation transition action Ac = "Bark" (action element ID = 0003 + after 0 seconds)
  Target situation Jd = "Happy because the user has come" (detected object ID = 0001, joy = 255)
Association information R0004:
  Initial situation Ji = "A voice is heard" (detected object ID = 0010 + azimuth = 40°)
  Situation transition action Ac = "Look for the owner of the voice while heading toward it" (action element ID = 0002 + azimuth = 40° + after 0 seconds) (action element ID = 0004 + after 0 seconds)
  Target situation Jd = "Happy to have found the owner of the voice (the user)" (detected object ID = 0001, joy = 255)
A situation transition action Ac contains one or more pieces of action element information, each specifying a unit motion such as walking, barking, or looking around. It may contain several pieces, as in "look for the owner of the voice while walking" (two action elements: walk, and look for the owner of the voice). Furthermore, depending on the situation Ji, the situation Jd may be reached automatically simply by waiting without doing anything. Action information Ac that holds only the single action element "wait", with no target specified, therefore represents the action of doing nothing.
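
One possible in-memory layout for association information R = [Ji, Ac, Jd] is sketched below, using example R0002 from the text. All class and field names are hypothetical. As a simplifying assumption, each situation is reduced to a single snapshot of symbol and intensity values, whereas the specification describes Ji and Jd as sequences of L periods of situation information.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class ActionElement:
    element_id: str                  # ID of the unit motion, e.g. "0003" (bark)
    delay_s: float = 0.0             # timing information: execute after this many seconds
    target_id: Optional[str] = None  # optional target of the action
    params: Dict[str, float] = field(default_factory=dict)  # e.g. {"azimuth": 40.0}

@dataclass
class Situation:
    description: str
    symbols: Dict[str, str] = field(default_factory=dict)      # symbol data (IDs)
    intensities: Dict[str, int] = field(default_factory=dict)  # numerical data, 0..255

@dataclass
class Association:
    name: str
    initial: Situation           # Ji
    action: List[ActionElement]  # Ac: one or more action elements
    target: Situation            # Jd

r0002 = Association(
    name="R0002",
    initial=Situation("Afraid because a stranger is present",
                      {"detected_object": "0009"}, {"fear": 194}),
    action=[ActionElement("0003", delay_s=0.0, target_id="0009")],
    target=Situation("Relieved because the stranger has gone", {}, {"fear": 25}),
)
```

Keeping Ac as a list lets a single association hold a compound action such as R0004, which combines walking toward a bearing with searching for the owner of the voice.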

【００６１】次に、行動検索部23は、状況入力部21から
出力された現在の状況Jicと、連関データベース部22に
記憶された連関情報R=［Ji,Ac,Jd］の初期状況Jiとを比
較する。Next, the action search unit 23 compares the current situation Jic output from the situation input unit 21 with the initial situation Ji of each piece of association information R = [Ji, Ac, Jd] stored in the association database unit 22.

【００６２】比較は、現在状況Jicと初期状況Jiとから
計算されるスコア情報Scを求めることで行う。現在状況
Jicに対する連関情報Rのスコア情報Sc(Jic,R)は、状況
適合度A(Jic,R)、状況改善効果E(R)、遷移確率S(R)、実
行容易度W(R)の4つのパラメータを有し、Sc(Jic,R)=［A
(Jic,R),E(R),S(R),W(R)］のように表す。複数の連関情
報Rの中からスコア情報Sc(Jic,R)=［A(Jic,R),E(R),S
(R),W(R)］に基づいて一つの解連関情報が抽出される。The comparison is performed by obtaining score information Sc calculated from the current situation Jic and the initial situation Ji. The score information Sc(Jic, R) of association information R for the current situation Jic has four parameters: situation fitness A(Jic, R), situation improvement effect E(R), transition probability S(R), and ease of execution W(R). It is written Sc(Jic, R) = [A(Jic, R), E(R), S(R), W(R)]. From the multiple pieces of association information R, one piece of solution association information is extracted on the basis of this score information.

【００６３】ここで、4つのパラメータA(Jic,R),E(R),S
(R),W(R)の算出方法について説明する。（1）状況適合度A(Jic,R) 2つの状況JaとJbの類似性を評価する尺度として状況類
似度Sj(Ja,Jb)を定義する。状況JaとJbの状況類似度Sj
(Ja,Jb)は、JaとJbのそれぞれが持つ((M-K)×L)個の記
号データの類似度Sk(Ja,Jb)と、JaとJbのそれぞれが持
つ(K×L)個の数値データの類似度Sv(Ja,Jb)の積、すな
わちSj(Ja,Jb)=Sk(Ja,Jb)×Sv(Ja,Jb)として計算され
る。The methods of calculating the four parameters A(Jic, R), E(R), S(R), and W(R) are described next.
(1) Situation fitness A(Jic, R): The situation similarity Sj(Ja, Jb) is defined as a measure of the similarity between two situations Ja and Jb. It is calculated as the product of the similarity Sk(Ja, Jb) of the ((M-K)×L) pieces of symbol data held by each of Ja and Jb and the similarity Sv(Ja, Jb) of the (K×L) pieces of numerical data held by each of Ja and Jb, that is, Sj(Ja, Jb) = Sk(Ja, Jb) × Sv(Ja, Jb).

【００６４】記号データ情報の類似度Sk(Ja,Jb)は、Ja
とJbが各々擁している各時刻(M-K)個でL周期分の記号デ
ータがお互いに一致する個数を((M-K)×L)で正規化した
値として計算される。このSk(Ja,Jb)の値の範囲は｛Sk
(Ja,Jb)：0≦Sk (Ja,Jb)≦1｝である。このとき、Sk(J
a,Jb)が1に近いほど2つの状況JaとJbの記号データは一
致しており、逆に0に近いほど相違している。The symbol data similarity Sk(Ja, Jb) is calculated as the number of symbol data items that coincide between Ja and Jb, counted over the (M-K) items per time step and L periods that each holds, normalized by ((M-K)×L). Its range is 0 ≦ Sk(Ja, Jb) ≦ 1. The closer Sk(Ja, Jb) is to 1, the more closely the symbol data of the two situations Ja and Jb match; the closer it is to 0, the more they differ.
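
A minimal sketch of the symbol data similarity Sk follows. It assumes each situation is given as a list of L periods, each period being a list of (M-K) symbol IDs in a fixed order; the function name, the data layout, and the example IDs are illustrative choices, not taken from the specification.

```python
def symbol_similarity(ja, jb):
    """Fraction of symbol entries that coincide between two situations."""
    if len(ja) != len(jb) or any(len(pa) != len(pb) for pa, pb in zip(ja, jb)):
        raise ValueError("both situations must have the same L x (M-K) shape")
    total = sum(len(period) for period in ja)
    if total == 0:
        raise ValueError("situations contain no symbol data")
    matches = sum(a == b
                  for pa, pb in zip(ja, jb)
                  for a, b in zip(pa, pb))
    return matches / total

# Hypothetical example with L = 2 periods and (M-K) = 2 symbols per period.
ja = [["0001", "0002"], ["0001", "0004"]]
jb = [["0001", "0002"], ["0009", "0004"]]
sk = symbol_similarity(ja, jb)  # three of the four entries coincide
```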

【００６５】また数値データの類似度Sv(Ja,Jb)は、Ja
とJbが各々擁している各時刻の状況ベクトルVaとVbの内
積｛Ip=(Va・Vb)：-1≦Ip≦1｝をL周期分平均した値と
して求められる。状況ベクトルを構成する数値データの
値は0以上なので、全ての状況ベクトルは必ずK次元超空
間の第1象限にあり、そのような2つの状況ベクトルVaと
Vbの成す角は0度から90度までである。したがって、実
際にはVaとVbの内積値の範囲は｛Ip=(Va・Vb)：0≦Ip≦
1｝であり、そのL周期分を平均して求められるSv(Ja,J
b) の値の範囲も｛Sv(Ja,Jb)：0≦Sv(Ja,Jb)≦1｝とな
る。このSv(Ja,Jb)が1（すなわち成す角0度で完全に一
致）に近いほど2つの状況JaとJbの数値データは一致し
ており、逆に0（すなわち成す角90度で完全に直交）に
近いほど相違しているということになる。The numerical data similarity Sv(Ja, Jb) is obtained by averaging over L periods the inner product Ip = (Va・Vb), -1 ≦ Ip ≦ 1, of the situation vectors Va and Vb that Ja and Jb hold at each time step. Because the numerical data values that make up a situation vector are 0 or greater, every situation vector lies in the first quadrant of the K-dimensional hyperspace, and the angle between two such situation vectors Va and Vb is between 0 and 90 degrees. In practice, therefore, the inner product lies in the range 0 ≦ Ip ≦ 1, and Sv(Ja, Jb), its average over L periods, also lies in the range 0 ≦ Sv(Ja, Jb) ≦ 1. The closer Sv(Ja, Jb) is to 1 (an angle of 0 degrees, a complete match), the more closely the numerical data of the two situations Ja and Jb match; the closer it is to 0 (an angle of 90 degrees, fully orthogonal), the more they differ.
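
The averaging of inner products can be sketched as below. The sketch assumes each situation supplies one unit-length situation vector per period; the function name and the example vectors are hypothetical.

```python
def numeric_similarity(va_seq, vb_seq):
    """Average, over L periods, of the inner product of paired situation vectors."""
    if len(va_seq) != len(vb_seq) or not va_seq:
        raise ValueError("both situations must hold the same non-zero number of periods")
    total = 0.0
    for va, vb in zip(va_seq, vb_seq):
        if len(va) != len(vb):
            raise ValueError("situation vectors must have the same dimension K")
        total += sum(a * b for a, b in zip(va, vb))
    return total / len(va_seq)

# Hypothetical example with L = 2 periods and K = 2; every vector has unit length.
sv = numeric_similarity([[1.0, 0.0], [0.6, 0.8]],
                        [[1.0, 0.0], [0.8, 0.6]])
```

Since the vectors are unit length with non-negative components, each inner product is the cosine of an angle between 0 and 90 degrees, which is why the result stays within 0 to 1.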

【００６６】状況類似度Sj(Ja,Jb)は、Sk(Ja,Jb)×Sv(J
a,Jb)として求められる。状況類似度Sj(Ja,Jb)の値の範
囲は、｛Sj(Ja,Jb)：0≦Sj(Ja,Jb)≦1｝であり、このSj
(Ja,Jb)が1に近いほど2つの状況JaとJbは記号面でも数
値面でも一致しており、逆に0に近いほど相違してい
る。The situation similarity Sj(Ja, Jb) is obtained as Sk(Ja, Jb) × Sv(Ja, Jb). Its range is 0 ≦ Sj(Ja, Jb) ≦ 1. The closer Sj(Ja, Jb) is to 1, the more closely the two situations Ja and Jb match in both their symbol data and their numerical data; the closer it is to 0, the more they differ.
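
For reference, the three similarity measures defined in paragraphs [0063] to [0066] can be collected in one place. The notation n_match (the number of coinciding symbol data items) and V_a(t), V_b(t) (the situation vectors at period t) is shorthand introduced here and does not appear in the original.

```latex
\begin{align*}
S_k(J_a, J_b) &= \frac{n_{\mathrm{match}}}{(M-K)\,L}, & 0 \le S_k(J_a, J_b) \le 1,\\
S_v(J_a, J_b) &= \frac{1}{L}\sum_{t=1}^{L} V_a(t)\cdot V_b(t), & 0 \le S_v(J_a, J_b) \le 1,\\
S_j(J_a, J_b) &= S_k(J_a, J_b)\,S_v(J_a, J_b), & 0 \le S_j(J_a, J_b) \le 1.
\end{align*}
```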

【００６７】ここで、遷移前の実際の状況Jicと連関情
報 R=［Ji,Ac,Jd］の初期状況Jiとの間の状況類似度Sj
(Jic,Ji)を状況適合度A(Jic,R)と定義する。このA(Jic,
R)が所定閾値以上あれば、現在状況Jicは連関情報Rの初
期状況Jiによく似ており、この状況JicにおいてRの指示
する行動Acを行うことにより、Rが示している目的状況J
dに至れる可能性があることになる。なお、遷移後の実
際の状況Jdcと連関情報R=［Ji,Ac,Jd］の目的状況Jdと
の間の状況類似度Sj(Jdc,Jd)を遷移達成度B(Jdc,R)と呼
ぶことにする。これは、遷移後の実際の状況Jdcが、目的
とした状況Jdにどれだけ一致しているか、すなわち目的
の遷移がどれだけよく達成されたのかを計る尺度であ
る。このB(Jdc,R)が所定閾値以上あれば、遷移後の状況
Jdcは連関情報Rの目的状況Jdによく似ており、Rの指示
する行動Acを行うことにより、Rが示している目的状況J
dに至れた、すなわち遷移が成功したということにな
る。 (2)状況改善効果E(R) ある連関情報R=［Ji,Ac,Jd］に関する状況改善効果E(R)
は、Rが示す状況遷移行動Acの前後(JiとJd)でロボット1
の快感情強度Cがどのように変化するかを求めたもので
ある。この快感情強度Cは、内部状態パラメータ中の擬
似的感情と欲求とから求められたロボットの快感情を示
すパラメータであり、状況Jに対してC(J)のように記述
される。Here, the situation similarity Sj(Jic, Ji) between the actual situation Jic before the transition and the initial situation Ji of association information R = [Ji, Ac, Jd] is defined as the situation fitness A(Jic, R). If A(Jic, R) is at or above a predetermined threshold, the current situation Jic closely resembles the initial situation Ji of R, and performing the action Ac specified by R in situation Jic may lead to the target situation Jd indicated by R. The situation similarity Sj(Jdc, Jd) between the actual situation Jdc after the transition and the target situation Jd of R = [Ji, Ac, Jd] is called the transition achievement B(Jdc, R). It measures how closely the actual situation Jdc after the transition matches the intended situation Jd, that is, how well the intended transition was achieved. If B(Jdc, R) is at or above a predetermined threshold, the situation Jdc after the transition closely resembles the target situation Jd of R, which means that performing the action Ac specified by R reached the target situation Jd, that is, the transition succeeded.
(2) Situation improvement effect E(R): The situation improvement effect E(R) of association information R = [Ji, Ac, Jd] expresses how the pleasant emotion intensity C of the robot 1 changes between before and after (Ji and Jd) the situation transition action Ac indicated by R. The pleasant emotion intensity C is a parameter representing the robot's pleasant emotion, derived from the pseudo-emotions and desires among the internal state parameters, and is written C(J) for a situation J.
【００６８】ロボット1の快感情は、外界に存在する被
検出体に対する好悪判断結果や欲求充足状況に応じて変
化する。例えば、ロボットが好きな物や好きな相手を発
見した場合、好きな相手と協調したり嫌いな相手に逆ら
ったりした場合、欲求充足可能な対象を検知した後に実
際に欲求が充足された場合、快感情強度Cの値を上昇さ
せる。この時のCの上限は1とする。また、その逆の状況
で、ロボットが嫌いな物や嫌いな相手を発見した場合、
好きな相手と協調できなかったり嫌いな相手に利してし
まったりした場合、欲求充足可能な対象を検知した後で
充足前にその対象が消失してしまったりした場合、快感
情強度Cの値を低下させる。この時のCの下限は-1とす
る。The pleasant emotion of the robot 1 changes according to its like/dislike judgment of detected objects in the outside world and the degree to which its desires are satisfied. For example, the value of the pleasant emotion intensity C is raised when the robot finds a thing or a partner it likes, when it cooperates with a partner it likes or defies a partner it dislikes, or when a desire is actually satisfied after the robot has detected an object capable of satisfying it. The upper limit of C is 1. Conversely, the value of C is lowered when the robot finds a thing or a partner it dislikes, when it fails to cooperate with a partner it likes or ends up benefiting a partner it dislikes, or when an object capable of satisfying a desire disappears after being detected but before the desire is satisfied. The lower limit of C is -1.

【００６９】このように快感情強度C(J)は、ロボット1
にとっての状況Jの好ましさを評価する尺度として機能
する。そして連関情報R=［Ji,Ac,Jd］の目的状況Jdと初
期状況Jiにおける快感情強度C(Jd)とC(Ji)との差｛E(R)
=C(Jd)-C(Ji)：-2≦E(R)≦2｝が正に大きいほど、状況
の好ましさが状況遷移行動Acによって大きく改善される
ことを表している。 (3)遷移確率S(R) ある連関情報R=［Ji,Ac,Jd］に関する遷移確率S(R)は、
その連関情報R=［Ji,Ac,Jd］がどのくらいの確率で正し
いか（成立するか）を表したものである。これは外界に
おける不確定要素のために、連関情報Rに記述された状
況遷移行動Acを初期状況Jiにおいて実行しても確実に目
的状況Jdに到達できるかどうか分からないからである。
状況遷移行動Acを行って目的状況Jdに達した時は、遷移
が成功した（連関情報が成立した）、つまりこの連関情
報Rは実際に起こる遷移（Ji+Ac→Jd）を正しく示してい
たということになる。As described above, the pleasant emotion intensity C(J) functions as a measure of how favorable the situation J is for the robot 1. The more strongly positive the difference E(R) = C(Jd) - C(Ji), with -2 ≦ E(R) ≦ 2, between the pleasant emotion intensities of the target situation Jd and the initial situation Ji of the association information R = [Ji, Ac, Jd], the more the situation transition action Ac improves the favorability of the situation.
(3) Transition probability S(R)
The transition probability S(R) of association information R = [Ji, Ac, Jd] expresses the probability that R is correct (that it holds). It is needed because, owing to uncertain factors in the outside world, there is no guarantee that executing the situation transition action Ac described in R in the initial situation Ji will actually reach the target situation Jd. When the action Ac is performed and the target situation Jd is reached, the transition has succeeded (the association information has held); that is, R correctly described the transition that actually occurred (Ji + Ac → Jd).

【００７０】遷移確率S(R)は、｛S(R)：0≦S(R)≦1｝の
値の範囲で連関情報ごとに計算される。連関情報Rに関
する遷移確率S(R)と記述され、その連関情報Rの実行回
数Nt(R)と実際にその遷移が成功した回数Ns(R)との商S
(R)/=Ns(R)/Nt(R)として計算される。ただし、連関情報
Rがまだ一度も実行されていない初期状態ではNt(R)=0、
Ns(R)=0なので商Ns(R)/Nt(R)を定義できない。この場合
の遷移確率はS(R)=0であるとする。連関情報R に関する
Nt(R)、Ns(R)、S(R)をまとめて遷移確率情報Si(R)=［Ns
(R),Nt(R),S(R)］として表す。 (4)実行容易度W(R) ある連関情報R=［Ji,Ac,Jd］に関する実行容易度W(R)
は、連関情報Rが指示する行動Acの実行に係るコストが
どれくらい低いか、すなわち行動Acがどれだけ実行しや
すいかを計る尺度である。実行容易度W(R)は、行動Acを
構成する行動素片の数（最低でも1個）の逆数｛U(Ac)：
0≦U(Ac)≦1｝として計算される。ここで、U(Ac)を行動
Acの省力指数と呼び、行動Acが少ない動作で完了するほ
ど大きくなる。実行容易度W(R)の値の範囲は｛W(R)：0
＜W(R)≦1｝であり、W(R)が大きいほど行動Acは少ない
動作で完了する簡単な行動であることを表している。The transition probability S(R) is calculated for each piece of association information and takes values in the range 0 ≦ S(R) ≦ 1. It is computed as the quotient S(R) = Ns(R)/Nt(R) of the number of times Ns(R) the transition actually succeeded and the number of times Nt(R) the association information R was executed. In the initial state, however, where R has never been executed, Nt(R) = 0 and Ns(R) = 0, so the quotient Ns(R)/Nt(R) is undefined; in that case the transition probability is taken to be S(R) = 0. Nt(R), Ns(R) and S(R) are collectively written as the transition probability information Si(R) = [Ns(R), Nt(R), S(R)].
(4) Ease of execution W(R)
The ease of execution W(R) of association information R = [Ji, Ac, Jd] measures how low the cost of executing the action Ac indicated by R is, that is, how easy Ac is to execute. W(R) is calculated as the reciprocal U(Ac), with 0 ≦ U(Ac) ≦ 1, of the number of action segments (at least one) that make up Ac. U(Ac) is called the labor-saving index of Ac, and it grows as Ac can be completed with fewer operations. W(R) takes values in the range 0 < W(R) ≦ 1, and a larger W(R) indicates a simpler action that completes with fewer operations.
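The two quantities defined above can be sketched in Python as follows. The class and function names are illustrative assumptions; only the formulas S(R) = Ns(R)/Nt(R) (with S(R) = 0 before any execution) and W(R) = 1 / (number of action segments) come from the text.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TransitionStats:
    """Transition probability information Si(R) = [Ns(R), Nt(R), S(R)]."""
    ns: int = 0  # Ns(R): number of successful transitions
    nt: int = 0  # Nt(R): number of executions

    @property
    def s(self) -> float:
        # S(R) = Ns(R) / Nt(R); defined as 0 while R has never been executed.
        return self.ns / self.nt if self.nt > 0 else 0.0


def ease_of_execution(action_segments: Sequence[str]) -> float:
    """W(R): reciprocal of the number of action segments making up Ac."""
    if not action_segments:
        raise ValueError("an action consists of at least one action segment")
    return 1.0 / len(action_segments)
```

Storing only the counts and deriving S(R) on demand keeps the three components of Si(R) consistent with one another.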

【００７１】このような(1)〜(4)を求めた上で、一つの
解連関情報Rを抽出する。具体的には、状況適合度A(R)
が所定閾値以上あり、かつ状況改善効果E(R)と遷移確率
S(R)と実行容易度W(R)の積、すなわち状況改善期待度
｛Ep(R)=E(R)×S(R)×W(R)：-2≦Ep(R)≦2｝が所定閾値
以上かつ最大となるものが選択される。つまり選択され
た解連関情報は、現在の状況Jiに対してロボットが知っ
ている適用可能な（状況適合度Aが所定閾値以上）行動
のうち、少ない労力で状況が改善される最も良さそうな
（状況改善期待度Epが所定閾値以上で最大となる）行動
である。Once (1) to (4) have been obtained, one piece of solution association information R is extracted. Specifically, the association information selected is the one whose situation conformity A(R) is equal to or greater than a predetermined threshold and whose situation improvement expectation Ep(R) = E(R) × S(R) × W(R), with -2 ≦ Ep(R) ≦ 2, is equal to or greater than a predetermined threshold and is the largest. In other words, among the applicable actions the robot knows for the current situation Ji (those with situation conformity A at or above its threshold), the selected solution association information is the action most likely to improve the situation with little effort (the one whose situation improvement expectation Ep is at or above its threshold and is the largest).
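The selection rule above can be sketched as a filter followed by a maximum. This is a minimal illustration, not the patent's implementation: the `Candidate` type, its field names, and the threshold parameters `a_min` and `ep_min` are assumptions, and the scores A, E, S and W are taken as already computed.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Candidate:
    name: str
    a: float  # situation conformity A(R)
    e: float  # situation improvement effect E(R)
    s: float  # transition probability S(R)
    w: float  # ease of execution W(R)

    @property
    def ep(self) -> float:
        # Situation improvement expectation Ep(R) = E(R) * S(R) * W(R)
        return self.e * self.s * self.w


def select_solution(cands: Sequence[Candidate],
                    a_min: float, ep_min: float) -> Optional[Candidate]:
    """Return the eligible candidate with the largest Ep, or None."""
    eligible = [c for c in cands if c.a >= a_min and c.ep >= ep_min]
    return max(eligible, key=lambda c: c.ep, default=None)
```

Returning `None` when nothing is eligible leaves room for a fallback such as the random trial used by the action search ability.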

【００７２】次に、行動出力部24は、行動検索部23にて
選択された解連関情報に基づいて、所定のタイミングで
行動素片情報の内容を実行する。Next, the action output unit 24 executes the contents of the action segment information at a predetermined timing, based on the solution association information selected by the action search unit 23.

【００７３】例えば、ロボットが室内にひとりで居て寂
しかったとする（状況Jic）。上述したスコア情報Scを
計算し、この状況Jicに類似する初期状況を持つ連関情
報（候補連関情報）を連関データベース部22から抽出す
る。抽出された連関情報が、例えば、下記の3つあった
とする。For example, suppose the robot is alone in a room and feels lonely (situation Jic). The score information Sc described above is calculated, and association information whose initial situation is similar to Jic (candidate association information) is extracted from the association database unit 22. Suppose the following three pieces of association information are extracted.

【００７４】連関情報R1= ［Ji1:使用者が側にいなくて寂しい Ac1:あたりを見回して使用者を探す Jd1:使用者が見つかって嬉しい］（Ep(R1)=E(R1)×S(R1)×W(R1)=1.0×0.3×0.8=0.24）連関情報R2= ［Ji2:使用者が側にいなくて寂しい Ac2:吠える Jd2:使用者がやってきて嬉しい］（Ep(R2)=E(R2)×S(R2)×W(R2)=1.0×0.5×1.0=0.5）連関情報R3= ［Ji3:使用者が側にいなくて寂しい Ac3:歩きながら使用者を探す Jd3:使用者が見つかって嬉しい］（Ep(R3)=E(R3)×S(R3)×W(R3)=1.0×0.7×0.5=0.35）このとき、状況改善期待度Epの閾値が0.1であったとす
ると、この中から閾値を越え最大の状況改善期待度Ep=
0.5を持つ連関情報R2が状況Jicを最も改善する解連関情
報として選択される。Association information R1 = [Ji1: lonely because the user is not nearby; Ac1: look around for the user; Jd1: happy to have found the user] (Ep(R1) = E(R1) × S(R1) × W(R1) = 1.0 × 0.3 × 0.8 = 0.24)
Association information R2 = [Ji2: lonely because the user is not nearby; Ac2: bark; Jd2: happy that the user has come] (Ep(R2) = E(R2) × S(R2) × W(R2) = 1.0 × 0.5 × 1.0 = 0.5)
Association information R3 = [Ji3: lonely because the user is not nearby; Ac3: look for the user while walking; Jd3: happy to have found the user] (Ep(R3) = E(R3) × S(R3) × W(R3) = 1.0 × 0.7 × 0.5 = 0.35)
If the threshold for the situation improvement expectation Ep is 0.1, then R2, which exceeds the threshold and has the largest expectation Ep = 0.5, is selected as the solution association information that best improves the situation Jic.
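The arithmetic of this example can be checked with a few lines of Python. The (E, S, W) triples and the threshold 0.1 are the values quoted in the example; the variable names are illustrative.

```python
import math

# (E, S, W) for each candidate, as quoted in the example above
candidates = {
    "R1": (1.0, 0.3, 0.8),
    "R2": (1.0, 0.5, 1.0),
    "R3": (1.0, 0.7, 0.5),
}
EP_THRESHOLD = 0.1

# Ep(R) = E(R) * S(R) * W(R)
ep = {name: e * s * w for name, (e, s, w) in candidates.items()}

# Keep candidates at or above the threshold, then take the largest Ep.
eligible = {name: v for name, v in ep.items() if v >= EP_THRESHOLD}
best = max(eligible, key=eligible.get)  # "R2"
```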

【００７５】ロボットは選択された解連関情報R2に従っ
て、スピーカ13から吠え声を出力しながら各可動部のモ
ータを動かして吠える身体動作を行う。・行動形成能力行動形成能力を機能させるための構成は、信頼性学習部
25、目標外行動学習部26、他者行動学習部27、状況調整
行動学習部28、連鎖行動学習部29である。Following the selected solution association information R2, the robot outputs a bark from the speaker 13 while driving the motors of its movable parts to perform a barking body motion.
・Action formation ability
The components that implement the action formation ability are the reliability learning unit 25, the non-target action learning unit 26, the other-person action learning unit 27, the situation adjustment action learning unit 28, and the chain action learning unit 29.

【００７６】信頼性学習部25、目標外行動学習部26、他
者行動学習部27、状況調整行動学習部28、連鎖行動学習
部29は、専ら制御部15内のCPU16とメモリ17の一部によ
り実現される。すなわち、これらの構成要素が行う処理
はCPU16により実行される。The reliability learning unit 25, the non-target action learning unit 26, the other-person action learning unit 27, the situation adjustment action learning unit 28, and the chain action learning unit 29 are realized entirely by the CPU 16 and part of the memory 17 in the control unit 15. That is, the processing performed by these components is executed by the CPU 16.

【００７７】この行動形成能力には、更に以下の4つの
能力がある。 1.行動探索能力 2.行動複製能力 3.状況調整能力 4.行動計画能力上述の各能力は、1に対して信頼性学習部25と目標外行
動学習部26、2に対して他者行動学習部27、3に対して状
況調節行動学習部28、4に対して連鎖行動学習部29がそ
れぞれ対応して動作を行う。The action formation ability further comprises the following four abilities.
1. Action search ability
2. Action replication ability
3. Situation adjustment ability
4. Action planning ability
These abilities are served as follows: ability 1 by the reliability learning unit 25 and the non-target action learning unit 26, ability 2 by the other-person action learning unit 27, ability 3 by the situation adjustment action learning unit 28, and ability 4 by the chain action learning unit 29.

【００７８】まず、各能力について説明する。（行動探索能力）行動探索能力は、前述した行動検索部
23と後述する信頼性学習25と目標外行動学習部26によっ
て実行される。First, each ability is described.
(Action search ability) The action search ability is carried out by the action search unit 23 described above, together with the reliability learning unit 25 and the non-target action learning unit 26 described later.

【００７９】ある状況Jicに遭遇したロボットが、その
状況において適用可能（状況適合度Aが所定閾値以上）
で、かつ、少ない労力で状況改善効果を期待できる（状
況改善期待度Epが所定閾値以上）行動（連関情報）をそ
の連関データベース部22に一つも記憶していなかったと
すると、行動検索部23は解連関情報を選択することがで
きない。なぜなら、連関データベース部22には、状況Ji
cにおいて適用可能であっても状況を少ない労力で改善
できる見込みの少ない（状況改善期待度Epが所定閾値未
満）行動か、状況Jicにおいて適用可能でない（状況適
合度Aが所定閾値未満）行動しか記憶されていないから
である。そこで、行動検索部23はこのような場合に限
り、連関データベース部22から乱数により一つの連関情
報を（暫定的な）解連関情報として選択する。Suppose a robot that encounters a situation Jic has stored in its association database unit 22 not a single action (piece of association information) that is both applicable in that situation (situation conformity A at or above its threshold) and expected to improve the situation with little effort (situation improvement expectation Ep at or above its threshold). The action search unit 23 then cannot select solution association information, because the association database unit 22 holds only actions that are applicable in Jic but unlikely to improve the situation with little effort (Ep below its threshold) or actions that are not applicable in Jic (A below its threshold). Only in such a case, therefore, the action search unit 23 selects one piece of association information from the association database unit 22 at random as (provisional) solution association information.
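The fallback described above can be sketched as follows. The function name, the use of strings to stand for association information, and the injected `random.Random` instance are assumptions for illustration; the text states only that one entry is drawn at random when no regular solution exists.

```python
import random
from typing import Optional, Sequence, Tuple


def choose_association(database: Sequence[str],
                       best: Optional[str],
                       rng: random.Random) -> Tuple[str, bool]:
    """Return (association, is_provisional).

    `best` is the result of the normal selection (None if no stored
    association passes both thresholds). Only in that case is an entry
    drawn at random from the whole database as a provisional solution.
    Note that rng.choice raises IndexError if the database is empty.
    """
    if best is not None:
        return best, False
    return rng.choice(list(database)), True
```

Passing the random generator in explicitly makes the exploratory choice reproducible in tests.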

【００８０】選択された連関情報R=[Ji,Ac,Jd]が状況Ji
cにおいて適用可能（状況適合度A(Jic,R)が所定閾値以
上）であり、その行動Acが実行された後の状況Jdcが目
的状況Jdに十分類似していた（遷移達成度B(Jdc,R)が所
定閾値以上）場合にはその遷移確率S(R)を増加させる。
逆に、選択された連関情報Rが適用可能であるにも関わ
らず、その行動Ac実行後の状況Jdcが予定された目標状
況Jdに類似しなかった（遷移達成度B(Jdc,R)が所定閾値
未満）場合には遷移確率S(R)を減少させ、それととも
に、実際の遷移（Jic+Ac→Jdc）を表す新たな連関情報
（目標外行動連関情報）R1=[Jic,Ac,Jdc]を生成し、そ
の遷移確率情報Si(R1)=［Ns(R1),Nt(R1),S(R1)］を1回
の試行で1回成功していることを表す［1,1,1］に設定
し、これらR1とS(R1)を連関データベース部22に記憶す
ることで遷移（Jic+Ac→Jdc）を知識化する。If the selected association information R = [Ji, Ac, Jd] is applicable in the situation Jic (situation conformity A(Jic, R) at or above its threshold) and the situation Jdc after the action Ac has been executed is sufficiently similar to the target situation Jd (transition achievement degree B(Jdc, R) at or above its threshold), the transition probability S(R) is increased.
Conversely, if R is applicable but the situation Jdc after executing Ac is not similar to the intended target situation Jd (B(Jdc, R) below its threshold), S(R) is decreased. In addition, new association information (non-target action association information) R1 = [Jic, Ac, Jdc] representing the actual transition (Jic + Ac → Jdc) is generated, its transition probability information Si(R1) = [Ns(R1), Nt(R1), S(R1)] is set to [1, 1, 1], indicating one success in one trial, and R1 and S(R1) are stored in the association database unit 22, so that the transition (Jic + Ac → Jdc) becomes knowledge.

【００８１】また、選択された連関情報R=[Ji,Ac,Jd]が
状況Jicにおいて適用可能でない（状況適合度A(Jic,R)
が所定閾値未満）場合には、元々初期状況Jiが実際の状
況Jicに類似していないのだから、この連関情報Rの正し
さを示す尺度たる遷移確率S(R)を変更せず、実際の遷移
（Jic+Ac→Jdc）を表す新たな連関情報（目標外行動連
関情報）R1=[Jic,Ac,Jdc] を生成し、その遷移確率情報
Si(R1)=［Ns(R1),Nt(R1),S(R1)］を1回の試行で1回成功
していることを表す［1,1,1］に設定し、これらR1とS(R
1)を連関データベース部22に記憶することでこの遷移の
知識化のみを行う。If, on the other hand, the selected association information R = [Ji, Ac, Jd] is not applicable in the situation Jic (situation conformity A(Jic, R) below its threshold), the initial situation Ji was never similar to the actual situation Jic in the first place. The transition probability S(R), which measures the correctness of R, is therefore left unchanged, and only the knowledge step is performed: new association information (non-target action association information) R1 = [Jic, Ac, Jdc] representing the actual transition (Jic + Ac → Jdc) is generated, its transition probability information Si(R1) = [Ns(R1), Nt(R1), S(R1)] is set to [1, 1, 1], indicating one success in one trial, and R1 and S(R1) are stored in the association database unit 22.
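The three cases described in the two preceding paragraphs can be sketched in Python. This is a hedged illustration: the `Association` class, the function name, and the idea of raising or lowering S(R) by counting the trial in Ns and Nt are assumptions consistent with S(R) = Ns(R)/Nt(R), not a statement of how the patent's units are built. The scores `a` = A(Jic, R) and `b` = B(Jdc, R) are taken as computed elsewhere.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Association:
    ji: str      # initial situation
    ac: str      # action
    jd: str      # target situation
    ns: int = 0  # successes
    nt: int = 0  # executions

    @property
    def s(self) -> float:
        return self.ns / self.nt if self.nt else 0.0


def learn_from_trial(r: Association, jic: str, jdc: str, a: float, b: float,
                     a_min: float, b_min: float,
                     database: List[Association]) -> Optional[Association]:
    """Update the database after executing r.ac in situation jic, ending in jdc."""
    if a >= a_min:            # R was applicable: its reliability is updated
        r.nt += 1
        if b >= b_min:        # target reached: S(R) rises (or stays at 1)
            r.ns += 1
            return None
    # Not applicable, or applicable but the target was missed:
    # record the transition that actually occurred, with Si = [1, 1, 1].
    r1 = Association(jic, r.ac, jdc, ns=1, nt=1)
    database.append(r1)
    return r1
```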

【００８２】このように、状況改善期待度が低かったり
適用できなかったりと、その効果のほどが危ぶまれる連
関情報から一つの連関情報Rを選択してその結果を確か
めることにより、選択された連関情報Rの遷移確率S(R)
を変化させたり（後述する信頼性学習部25による）、実
際に起こった遷移を知識化するための新たな連関情報
（目標外行動連関情報）R1を生成する（後述する目標外
行動学習部26による）などして、行動Acの新たな側面を
ロボットは知ることができる。そして、適用可能な連関
情報を試した場合、結果として遷移確率S(R)が増加され
た連関情報Rは、状況Jicに類似した状況において優先的
に選択されるようになり、遷移確率S(R)が減少された連
関情報Rは、状況Jicに類似した状況においてますます選
択されにくくなる。すなわち、ある状況下で成立する連
関情報（遷移確率Sが増大）や成立しない連関情報（遷
移確率Sが減少）が明らかにされていく。また、適用可
能/不可能を問わず、このように適宜選択された連関情
報Rを実行した結果、その状況下で成立する新たな連関
情報（目標外行動連関情報）R1が明らかにされる。すな
わち、行動によりどのような状況がもたらされるかとい
う知識が、状況を改善できたもの（成功例）も、改善で
きなかったもの（失敗例）も連関情報の形で連関データ
ベース22に蓄積されていく。In this way, by selecting one piece of association information R from among those whose effect is doubtful, whether because their situation improvement expectation is low or because they are not applicable, and checking the result, the robot can learn new aspects of the action Ac: the transition probability S(R) of the selected R is changed (by the reliability learning unit 25 described later), or new association information (non-target action association information) R1 is generated to turn the transition that actually occurred into knowledge (by the non-target action learning unit 26 described later). When applicable association information is tried, association information R whose S(R) has increased comes to be selected preferentially in situations similar to Jic, while association information R whose S(R) has decreased becomes less and less likely to be selected in such situations. That is, it becomes clear which association information holds in a given situation (S increases) and which does not (S decreases). Moreover, whether or not it was applicable, executing association information R selected in this way reveals new association information (non-target action association information) R1 that holds in that situation. Knowledge of what situation an action brings about thus accumulates in the association database 22 in the form of association information, covering both cases that improved the situation (successes) and cases that did not (failures).

【００８３】以上述べた行動探索能力とは、ロボットが
動作を開始し始めた初期段階、適用可能な全ての連関情
報の遷移確率Sがまだ小さい時期に特に発揮されるもの
である。また使用期間が長くなったとしても連関情報に
記述される初期状況の発生頻度が少なくて十分な試行が
為されていない場合も同様である。あるいは、使用場所
や使用者が変わるなどのようにロボット周囲の環境が激
変したために、ロボットが擁している連関情報の多くが
適用不可能になったり成立しなくなったりした場合も同
様である。特定の状況で選択可能な行動が固定的に与え
られている従来のロボットでは、このような劇的な環境
の変動に対処することができなかった。The action search ability described above comes into play particularly in the initial stage after the robot starts operating, when the transition probabilities S of all applicable association information are still small. The same holds even after a long period of use when the initial situation described in a piece of association information occurs so rarely that it has not been tried enough. It also holds when the environment around the robot changes drastically, for example when the place of use or the user changes, so that much of the association information held by the robot becomes inapplicable or no longer holds. Conventional robots, in which the actions selectable in a given situation are fixed, could not cope with such drastic environmental changes.

【００８４】この行動探索能力は次のように働く。例え
ば、使用者に叱られたロボットがこの使用者から逃げよ
うとするが、さらに追いすがられて叱られたという状況
を考える。ロボットは叱られているという状況Jicにお
いて適用可能な連関情報として、「逃げる」を行動情報
Ac1に持つ下記の連関情報R1だけを連関データベース部2
2に記憶していたとする。This action search ability works as follows. Consider, for example, a situation in which a robot scolded by a user tries to run away from the user but is chased and scolded further. Suppose that, as association information applicable in the situation Jic of being scolded, the robot has stored in the association database unit 22 only the following association information R1, whose action information Ac1 is "run away".

【００８５】R1= ［Ji1:叱られている Ac1:逃げる Jd2:逃げ切れた］当然、ロボットはこの連関情報R1を実行するが、使用者
に追いすがられることによって、この連関情報R1が示唆
する「逃げ切れた」状況Jd2に至らず、この連関情報R1
がここで役に立たないことを知らされる。そして、この
連関情報R1の遷移確立S(R)は減少し、何度かこのような
ことを繰り返すうちに連関情報R1の遷移確率S(R)は極め
て小さくなり、状況改善期待度Ep(R)が所定閾値を下回
ったとき、もはやこの連関情報R1を実行することができ
なくなる。もちろん、このときロボットは他に有力な連
関情報を持っていないので、何も実行することができな
い。そこで、行動探索能力により、ロボットは他の連関
情報を試すことになる。おそらく幾つかの無駄なあがき
をした後に、ロボットはたまたま全く別の初期状況「嬉
しい」で選択される「踊る」を行動情報Ac2に持つ下記
の連関情報R2を実行する。R1 = [Ji1: being scolded; Ac1: run away; Jd2: got away]
Naturally the robot executes this association information R1, but because the user chases it, it does not reach the "got away" situation Jd2 that R1 suggests, and it learns that R1 is of no use here. The transition probability S(R) of R1 then decreases; as this is repeated several times, S(R) of R1 becomes extremely small, and once the situation improvement expectation Ep(R) falls below its threshold, R1 can no longer be executed. At this point the robot has no other promising association information, so it cannot execute anything. The action search ability therefore makes the robot try other association information. Probably after a few futile attempts, the robot happens to execute the following association information R2, whose action information Ac2 is "dance" and which is normally selected in the entirely different initial situation "happy".

【００８６】R2= ［Ji2:嬉しい Ac2:踊る Jd2:楽しい］ R1とR2の初期状況Ji1とJi2は全く異なる。「踊る」行動
Ac2は通常なら「叱られている」状況では絶対に選択さ
れない行動である。しかし、行動探索能力によりロボッ
トが「踊りだした」結果、使用者はその怒りをはぐらか
されて追及を止めてしまったとする。その結果、ロボッ
トは「使用者に叱られている」初期状況で、行動「踊
る」を実行することにより、「叱られなくなる」状況に
至れることを実際に経験することで初めて知る。そし
て、この経験は下記に示す新たな連関情報（目標外行動
連関情報）R3として連関データベース部22に追加され、
ロボットの今後の行動に影響を与えるようになる。R2 = [Ji2: happy; Ac2: dance; Jd2: having fun]
The initial situations Ji1 and Ji2 of R1 and R2 are entirely different. The "dance" action Ac2 would normally never be selected in the "being scolded" situation. Suppose, however, that when the robot starts dancing as a result of the action search ability, the user's anger is deflected and the user stops pursuing the robot. The robot thereby learns for the first time, through actual experience, that executing the action "dance" in the initial situation "being scolded by the user" can lead to the situation "no longer being scolded". This experience is added to the association database unit 22 as the new association information (non-target action association information) R3 shown below, and comes to influence the robot's future behavior.

【００８７】目標外行動連関情報R3= ［Ji3: 叱られている Ac3: 踊る Jd3: 叱られなくなる］ここで各種センサがどのような信号を検知するかは予め
メモリ17に記憶されているとする。例えば、使用者がロ
ボットの頭部を叩いて叱ることは、ロボット頭部の感圧
センサに所定値よりも大きな力が所定時間よりも短く検
知される状態としてロボットに認識され、使用者がロボ
ットを怒鳴って叱ることは、ロボットのマイクに所定値
よりも大きな音声が検知される状態としてロボットに認
識される。Non-target action association information R3 = [Ji3: being scolded; Ac3: dance; Jd3: no longer being scolded]
Here it is assumed that the signals the various sensors detect in each case are stored in the memory 17 in advance. For example, a user scolding the robot by hitting its head is recognized by the robot as a state in which the pressure sensor in the robot's head detects a force greater than a predetermined value for less than a predetermined time, and a user scolding the robot by shouting at it is recognized as a state in which the robot's microphone detects a sound louder than a predetermined value.

【００８８】なお、この例でもわかるように、ロボット
が試すことのできる行動は連関情報として連関データベ
ース部22に記憶されている行動に限定されており、それ
を逸脱した行動を実行することはできない。行動探索能
力はあくまでも既知の行動の新たな効果を知るための経
験能力である。（行動複製能力）上記行動探索能力によって、ロボット
は既知の行動（例えば「踊る」）の新たな効果、すなわ
ち予測できなかった遷移を実際に起こった遷移から発見
することができるようになった。この結果、ロボットは
行動の内容自体を増やすことはできないものの、行動を
使いこなす知識を増やすことができるようになる。しか
しながら、まだロボットは新たな行動を発見することは
できない。As this example shows, the actions the robot can try are limited to those stored as association information in the association database unit 22; it cannot execute an action outside that set. The action search ability is strictly an experiential ability for learning new effects of known actions.
(Action replication ability) With the action search ability described above, the robot becomes able to discover a new effect of a known action (for example, "dance"), that is, a transition it could not have predicted, from the transition that actually occurred. As a result, although the robot cannot increase its repertoire of actions themselves, it can increase its knowledge of how to use them. The robot still cannot, however, discover new actions.

【００８９】一方、後述する他者行動学習部27によって
実行される行動複製能力は、ロボットが、自己の周囲で
他者が行う未知の行動を観察し、「この状況でこうすれ
ばこうできるのだ」という情報を得て、これを新たな連
関情報（他者行動連関情報）Rとして連関データベース
部22に記憶していくことで、自己の新たな行動を獲得す
る行動模倣の能力ある。The action replication ability, carried out by the other-person action learning unit 27 described later, is by contrast an ability to imitate actions: the robot observes unknown actions performed by others around it, obtains information of the form "in this situation, doing this achieves that", and stores it in the association database unit 22 as new association information (other-person action association information) R, thereby acquiring new actions of its own.

【００９０】例えば、使用者Xが泣いていた場合に、別
の使用者Yがこの泣いている使用者Xの背中をさすったり
抱きしめたりすることで、泣いている使用者Xが泣きや
み笑顔を見せたのをロボットが観察していたとする。For example, suppose a user X is crying and the robot observes that another user Y rubs the back of the crying user X or hugs X, whereupon X stops crying and smiles.

【００９１】この一連の動作をロボットは、撮像カメ
ラ、マイクで検知しており、使用者Xの「泣いている」
状況Jiが、使用者Yによる「背中をさすられたり」、
「抱きしめられたり」する行動Acによって、「泣きやみ
笑顔を見せる」状況Jdに遷移することを認識する。な
お、予めメモリ17には、泣いている表情、泣きやむ/泣
いていない表情、笑顔等の表情が記憶されており、ロボ
ットが認識できるようになっているものとする。また、
背中をさする（行動素片：前肢を対象に接触させる＋行
動素片：前肢を滑らせる）、抱きしめる（行動素片：対
象を両前肢で抱える＋行動素片：しばらくそのままでい
る）といった行動も、予めメモリ17に記憶されている行
動素片の連鎖として認識可能である。The robot detects this series of events with its imaging camera and microphone, and recognizes that the situation Ji in which user X "is crying" changes, through the action Ac of user Y "rubbing X's back" or "hugging X", into the situation Jd in which X "stops crying and smiles". It is assumed that facial expressions such as a crying face, a face that has stopped crying or is not crying, and a smiling face are stored in the memory 17 in advance so that the robot can recognize them. Actions such as rubbing the back (action segment: bring a forelimb into contact with the target + action segment: slide the forelimb) and hugging (action segment: hold the target with both forelimbs + action segment: stay that way for a while) are likewise recognizable as chains of action segments stored in the memory 17 in advance.

【００９２】そこで、ロボットは、誰かが「泣いてい
る」初期状況Jiを「泣きやむ」もしくは「笑顔を見せ
る」目的状況Jdに至らせるには、この泣いている人の
「背中をさする」、もしくは「抱きしめる」という行動
Acによって達成できることを知り、これらJi，Ac，Jdを
新たな一組の連関情報（他者行動連関情報）R=[Ji,Ac,J
d]としてまとめ、この他者行動連関情報Rを連関データ
ベース部22に追加して行動それ自体を増加させる。The robot thus learns that the initial situation Ji in which someone "is crying" can be brought to the target situation Jd "stops crying" or "smiles" by the action Ac of "rubbing the back of" or "hugging" the crying person. It combines Ji, Ac and Jd into a new piece of association information (other-person action association information) R = [Ji, Ac, Jd] and adds R to the association database unit 22, thereby increasing its repertoire of actions itself.

【００９３】なお、ロボットがこのように見真似により
複製することのできる行動は、メモリ17に記憶される行
動素片の組み合わせとしてロボットが実行できる行動に
限定されており、それを逸脱した行動を複製することは
できない。行動複製能力はあくまでも既知の行動素片に
よる既知もしくは新規の組み合わせから成る行動の新た
な効果を他者の行動を観察することにより知るための経
験能力である。（状況調整能力）目的状況が同様の状況Jdを目指しなが
ら、異なる初期状況Ji1とJi2から出発する2つの異なる
連関情報R1=[Ji1,Ac1,Jd]とR2=[Ji2,Ac2,Jd]があったと
き、初期状況Ji1とJi2の差異や、行動Ac1とAc2の差異に
よって、R1の遷移確率S(R1)は高く、R2の遷移確率S(R2)
が低いということが起こる。つまり、Ji1からならJdに
至りやすいが、Ji2からはJdに至ることが稀であるとい
うことがある。The actions the robot can replicate by imitation in this way are limited to those it can execute as combinations of the action segments stored in the memory 17; it cannot replicate an action outside that set. The action replication ability is strictly an experiential ability for learning, by observing the actions of others, new effects of actions composed of known or novel combinations of known action segments.
(Situation adjustment ability) Suppose there are two different pieces of association information R1 = [Ji1, Ac1, Jd] and R2 = [Ji2, Ac2, Jd] that aim at the same target situation Jd but start from different initial situations Ji1 and Ji2. Because of the difference between Ji1 and Ji2, or between the actions Ac1 and Ac2, it can happen that the transition probability S(R1) of R1 is high while the transition probability S(R2) of R2 is low. In other words, Jd is easy to reach from Ji1 but is rarely reached from Ji2.

【００９４】後述する状況調整行動学習部28により実行
される状況調整能力とは、このような遷移が成功しにく
い状況Ji2を初期状況とし、遷移が成功しやすい状況Ji1
を目的状況とする新たな連関情報（状況調節行動連関情
報）R3=[Ji2,Ac3,Ji1]を生成して連関データベース部22
に追加し、Ji2からJdに至りやすくする補助的な準備行
動を獲得する能力のことである。The situation adjustment ability, carried out by the situation adjustment action learning unit 28 described later, is the ability to acquire an auxiliary preparatory action that makes Jd easier to reach from Ji2. It does so by generating new association information (situation adjustment action association information) R3 = [Ji2, Ac3, Ji1], whose initial situation is the situation Ji2 from which the transition rarely succeeds and whose target situation is the situation Ji1 from which it readily succeeds, and adding R3 to the association database unit 22.

【００９５】これは、失敗しやすい連関情報R2=[Ji2,Ac
2,Jd]によりJi2からJdへ直接遷移（Ji2+Ac2→Jd）させ
ようとする代わりに、まず状況調節行動連関情報R3=[Ji
2,Ac3,Ji1]により失敗しやすい状況Ji2から成功しやす
い状況Ji1へ至り、次に連関情報R1=[Ji1,Ac1,Jd]により
Ji1から最終目標Jdへ至る2段階遷移（Ji2+Ac3→Ji1+Ac1
→Jd）の行動パタンをロボットが学習することである。
このとき、状況調節行動連関情報R3は、遷移に失敗しや
すい状況Ji2に特有の条件（失敗要因）を取り除き、成
功しやすい状況Ji1に特有の条件（成功要因）を実現
し、連関情報R1による遷移のお膳立てをする準備行動を
表す連関情報である。準備行動が成功すれば、失敗要因
を含まず成功要因を含む状況Ji1が実現するので、目的
状況Jdへの遷移が成功する可能性が高くなる。That is, instead of attempting the direct transition from Ji2 to Jd (Ji2 + Ac2 → Jd) with the failure-prone association information R2 = [Ji2, Ac2, Jd], the robot learns a two-stage action pattern (Ji2 + Ac3 → Ji1 + Ac1 → Jd): it first moves from the failure-prone situation Ji2 to the success-prone situation Ji1 using the situation adjustment action association information R3 = [Ji2, Ac3, Ji1], and then moves from Ji1 to the final goal Jd using the association information R1 = [Ji1, Ac1, Jd].
Here R3 is association information representing a preparatory action that removes the conditions specific to the failure-prone situation Ji2 (failure factors), realizes the conditions specific to the success-prone situation Ji1 (success factors), and so sets the stage for the transition by R1. If the preparatory action succeeds, the situation Ji1, which contains the success factors and none of the failure factors, is realized, and the transition to the target situation Jd becomes more likely to succeed.

【００９６】例えば、下記の2つの連関情報R1とR2があ
ったとする。For example, assume that there are the following two pieces of association information R1 and R2.

【００９７】連関情報R1= ［Ji1:ロボットがソファの上に一人でいて寂しい Ac1:その場で使用者を探す Jd:使用者を見つけて嬉しい］ (S(R1)=0.6) 連関情報R2= ［Ji2:ロボットがソファの後ろに一人でいて寂しい Ac2:その場で使用者を探す Jd:使用者を見つけて嬉しい］ (S(R2)=0.01) このとき、連関情報R1の遷移確率S(R1)が所定値以上あ
り、連関情報R2の遷移確率S(R2)よりも所定値以上大き
かったとする。これは、ソファの後ろよりソファの上の
方が使用者を見つけやすいことに由来し、遷移確率Sの
違いがそのことを物語っている。失敗要因はソファの後
ろにいることであり、成功要因はソファの上にいること
である。このとき、ロボットは状況Ji2から状況Ji1に至
る下記の新たな連関情報（状況調節行動連関情報）R3を
生成する。なお、このR3の遷移確率情報Si(R3)=［Ns(R
3),Nt(R3),S(R3)］は、この遷移が未試行なので［0,0,
0］に設定される。Association information R1 = [Ji1: the robot is alone on the sofa and lonely; Ac1: look for the user from where it is; Jd: happy to have found the user] (S(R1) = 0.6)
Association information R2 = [Ji2: the robot is alone behind the sofa and lonely; Ac2: look for the user from where it is; Jd: happy to have found the user] (S(R2) = 0.01)
Suppose that the transition probability S(R1) of R1 is at or above a predetermined value and exceeds the transition probability S(R2) of R2 by at least a predetermined amount. This reflects the fact that the user is easier to find from on top of the sofa than from behind it, and the difference in S bears that out. The failure factor is being behind the sofa; the success factor is being on the sofa. The robot then generates the following new association information (situation adjustment action association information) R3 leading from Ji2 to Ji1. Because this transition has not yet been tried, the transition probability information Si(R3) = [Ns(R3), Nt(R3), S(R3)] of R3 is set to [0, 0, 0].

【００９８】状況調節行動連関情報R3= ［Ji2: ロボットがソファの後ろに一人でいて寂しい Ac3: ソファの上に登る Ji1: ロボットがソファの上に一人でいて寂しい］ (S(R3)=0.0：初期値) なお、この例ではロボットの快感情強度Cは増大しない
ので、この遷移の状況改善効果Eは0である。このよう
に、状況調節能力は状況改善効果Eの優劣に関係なく、
準備行動R3をロボットの行動パタンに加える。状況改善
期待度Ep(R3)が足りない（遷移確率S(R3)が足りない）
準備行動R3の生成当初は、R3は行動探索能力により選択
される。特にR2が役に立たないのであるから、同じ初期
状況に適合するR3が試行される可能性は高い。Situation adjustment action association information R3 = [Ji2: the robot is alone behind the sofa and lonely; Ac3: climb onto the sofa; Ji1: the robot is alone on the sofa and lonely] (S(R3) = 0.0: initial value)
In this example the robot's pleasant emotion intensity C does not increase, so the situation improvement effect E of this transition is 0. The situation adjustment ability thus adds the preparatory action R3 to the robot's action patterns regardless of how large E is. When R3 has just been generated and its situation improvement expectation Ep(R3) is insufficient (because the transition probability S(R3) is insufficient), R3 is selected through the action search ability. Since R2 in particular is of no use, R3, which fits the same initial situation, is likely to be tried.

【００９９】このように、同じ目的状況に至るものであ
りながら、初期状況の異なる2つの連関情報が連関デー
タベース22に記憶されており、一方の遷移確率が十分高
く、それに比べて他方の遷移確率が十分低いとき、状況
を調節するための新たな連関情報（状況調節行動連関情
報）が生成される機能が状況調整能力である。In short, the situation adjustment ability is the function that generates new association information for adjusting the situation (situation adjustment action association information) when the association database 22 holds two pieces of association information that lead to the same target situation from different initial situations, and the transition probability of one is sufficiently high while that of the other is sufficiently low by comparison.
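The trigger condition for the situation adjustment ability can be sketched as follows. The type and function names, the thresholds `s_high` and `s_gap`, and the use of string equality to compare situations are all simplifying assumptions; the text compares situations by similarity and gives no concrete threshold values. How the preparatory action Ac3 is derived is not described in this passage, so it is passed in as a parameter.

```python
from typing import NamedTuple, Optional


class Assoc(NamedTuple):
    ji: str   # initial situation
    ac: str   # action
    jd: str   # target situation
    s: float  # transition probability S(R)


def make_adjustment(r1: Assoc, r2: Assoc, ac3: str,
                    s_high: float, s_gap: float) -> Optional[dict]:
    """Return R3 = [Ji2, Ac3, Ji1] with Si(R3) = [0, 0, 0], or None.

    R3 is generated when R1 and R2 share a target situation, start from
    different initial situations, S(R1) >= s_high, and S(R1) exceeds
    S(R2) by at least s_gap (both thresholds are hypothetical).
    """
    same_goal = r1.jd == r2.jd and r1.ji != r2.ji
    if same_goal and r1.s >= s_high and r1.s - r2.s >= s_gap:
        return {"assoc": (r2.ji, ac3, r1.ji), "si": [0, 0, 0.0]}
    return None


r1 = Assoc("alone on the sofa", "look for the user", "user found", 0.6)
r2 = Assoc("alone behind the sofa", "look for the user", "user found", 0.01)
r3 = make_adjustment(r1, r2, "climb onto the sofa", s_high=0.5, s_gap=0.3)
```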

[0100] The actions the robot can acquire as preparatory actions in this way are limited to those it can execute as combinations of the action segments stored in the memory 17. In addition, when a success or failure factor includes a condition the robot cannot control, such as the presence or behavior of a third party, the robot cannot generate an action that brings about or eliminates that condition. Despite these restrictions, the situation adjustment ability functions as an experiential ability by which the robot learns the preparatory actions that set up, by itself, the conditions for a given action to succeed.
(Action planning ability)
The action planning ability, executed by the chain action learning unit 29 described later, is the ability to find in the association database unit 22 a chain of association information (actions) whose situations connect consecutively (the target situation of one is substantially the same as the initial situation of the next) and, if such a chain can improve the situation, to generate new association information representing the chain (chain action association information) and add it to the association database unit 22.

[0101] Suppose the association database unit 22 contains association information in which the target situation Jd1 of association information R1 = [Ji1, Ac1, Jd1] is substantially the same as the initial situation Ji2 of association information R2 = [Ji2, Ac2, Jd2] (the situation similarity Sj(Jd1, Ji2) is at or above a predetermined threshold), and that the chain is found to continue up to Rn: for example, the initial situation Ji3 of R3 is substantially the same as the target situation Jd2 of R2, the target situation Jd3 of R3 is substantially the same as the initial situation Ji4 of R4, and so on. If the pleasant emotion intensity C(Jdp) in the target situation Jdp of some association information Rp (k < p ≦ n) further along the chain is greater than the pleasant emotion intensity C(Jik) in the initial situation Jik of the association information Rk (1 ≦ k < n) at the start or in the middle of the chain, then the transition that executes Rk, ..., Rp in succession (Jik + Ack + ... + Acp → Jdp) can improve the situation. The robot therefore generates new association information (chain action association information) Rx = [Jik, Ack + ... + Acp, Jdp] that chains Rk through Rp, and adds it to the association database unit 22. This widens the range of action patterns with which the robot can improve the situation. The transition probability S(Rx) of the chain action association information Rx is given by the product of the transition probabilities of the chained association information Rk, ..., Rp. All situation-improving chains made up of multiple pieces of association information are extracted from the chain R1, ..., Rn and added to the association database unit 22.
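
The extraction of situation-improving sub-chains can be sketched as follows. This is an illustrative sketch only: the `Association` class and the `pleasure` lookup table (standing in for the pleasant emotion intensity C) are hypothetical, and the input list is assumed to be already ordered so that each target situation is substantially the same as the next initial situation.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Association:
    initial: str   # initial situation Ji
    action: str    # situation transition action Ac
    target: str    # target situation Jd
    s: float       # transition probability S


def chain_associations(chain, pleasure):
    """Return a chained association for every sub-chain Rk..Rp (k < p) that improves C."""
    results = []
    for k in range(len(chain)):
        for p in range(k + 1, len(chain)):
            first, last = chain[k], chain[p]
            # Keep the sub-chain only if C(Jdp) > C(Jik).
            if pleasure[last.target] > pleasure[first.initial]:
                links = chain[k:p + 1]
                results.append(Association(
                    first.initial,
                    " + ".join(r.action for r in links),
                    last.target,
                    prod(r.s for r in links),  # S(Rx) is the product of the link probabilities
                ))
    return results


r3 = Association("behind sofa", "climb onto sofa", "on sofa", 0.5)
r1 = Association("on sofa", "search for user", "found user", 0.6)
pleasure = {"behind sofa": 0.1, "on sofa": 0.1, "found user": 0.9}
chained = chain_associations([r3, r1], pleasure)
```

Intermediate situations are not checked, so a sub-chain qualifies even if the situation worsens partway through, as long as its end point is better than its start.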

[0102] For example, R3 acquired through the situation adjustment ability described above becomes more likely to be selected thanks to the action planning ability. In this example, R3 was generated as a preparatory action by comparing R1 and R2 below. However, R3 has no situation improvement effect, and as it stands it is likely to be executed only very rarely.

[0103]
Association information R1 = [Ji1: the robot is alone on the sofa and lonely; Ac1: search for the user on the spot; Jd: happy to have found the user]
Association information R2 = [Ji2: the robot is alone behind the sofa and lonely; Ac2: search for the user on the spot; Jd: happy to have found the user]
Association information R3 = [Ji2: the robot is alone behind the sofa and lonely; Ac3: climb onto the sofa; Ji1: the robot is alone on the sofa and lonely]
By combining R3 and R1, the action planning ability can generate the new association information (chain action association information) R4 = [Ji2, Ac3 + Ac1, Jd] shown below. This chain action association information R4 clearly has a situation improvement effect, and once its transition probability rises over several trials, it is likely to be selected thereafter as the solution association information by the normal operation of the action search unit 23.

[0104] Chain action association information R4 = [Ji2: the robot is alone behind the sofa and lonely; Ac4: climb onto the sofa (Ac3), then search for the user there (Ac1); Jd: happy to have found the user]
As is clear from the above, the action planning ability may chain any number of pieces of association information, and the situation may even worsen partway through. In short, it suffices to extract from the chain a portion in which the pleasant emotion intensity C(Jdp) in the target situation Jdp of the last association information Rp is greater than the pleasant emotion intensity C(Jik) in the initial situation Jik of the first association information Rk.

[0105] As a result of the action planning ability, the association database unit 22 may come to store multiple pieces of association information describing transitions whose initial situations are substantially the same and whose target situations are also substantially the same. In that case, the normal processing of the action search unit 23 preferentially selects the one containing fewer action segments, that is, the one with the higher ease of execution W. Of course, if executing it causes its transition probability S to fall, a more complex action with more action segments will come to appear instead. For this reason it is important to create a large amount of association information, even if it seems wasteful at the time.

[0106] Chain action association information whose initial and target situations are substantially the same (forming a loop) may also be found. However, as described above, chains that do not improve the situation are not generated, so association information whose initial situation equals its target situation is never generated.

[0107] The reliability learning unit 25, non-target action learning unit 26, other-party action learning unit 27, situation adjustment action learning unit 28, and chain action learning unit 29, which operate in correspondence with the four abilities described above, are now described in detail.

[0108] (A) Reliability learning unit 25
The reliability learning unit 25 evaluates the result of an action the robot has performed and increases or decreases the transition probability of the association information describing that action. That is, the reliability learning unit 25 is one of the means for realizing the action search ability described above.

[0109] When the robot has selected association information R = [Ji, Ac, Jd] in situation Jic, and the action Ac has completed and led to situation Jdc, the reliability learning unit 25 computes the situation similarity Sj(Jdc, Jd) between the target situation Jd stored in R and the actual situation Jdc after completion, that is, the transition achievement degree B(Jdc, R). If this is at or above a predetermined threshold, the transition is regarded as successful (association information R held). The unit then increments by 1 both the execution count Nt(R) and the success count Ns(R) in the transition probability information Si(R) = [Ns(R), Nt(R), S(R)] of R, and stores S(R) = Ns(R)/Nt(R) in the memory 17 as the new transition probability.

[0110] Conversely, if the transition achievement degree B(Jdc, R) is below the predetermined threshold, the transition is regarded as failed (association information R did not hold). The unit then increments by 1 only the execution count Nt(R) in the transition probability information Si(R) = [Ns(R), Nt(R), S(R)], and stores S(R) = Ns(R)/Nt(R) in the memory 17 as the new transition probability.
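
The success and failure updates of the transition probability information Si(R) = [Ns(R), Nt(R), S(R)] can be sketched as follows. The class name `TransitionStats` and the value of `ACHIEVEMENT_THRESHOLD` are assumptions made for illustration; the source only says the threshold is predetermined.

```python
from dataclasses import dataclass


@dataclass
class TransitionStats:
    ns: int = 0      # success count Ns(R)
    nt: int = 0      # execution count Nt(R)
    s: float = 0.0   # transition probability S(R)


ACHIEVEMENT_THRESHOLD = 0.8  # hypothetical value of the predetermined threshold


def update_reliability(stats, achievement):
    """Update Si(R) from the transition achievement degree B(Jdc, R)."""
    stats.nt += 1                      # every execution is counted
    if achievement >= ACHIEVEMENT_THRESHOLD:
        stats.ns += 1                  # only successful transitions are counted here
    stats.s = stats.ns / stats.nt      # S(R) = Ns(R) / Nt(R)
    return stats


stats = TransitionStats()
update_reliability(stats, 0.9)   # success: [1, 1, 1.0]
update_reliability(stats, 0.2)   # failure: [1, 2, 0.5]
```

Note that the update depends only on whether the predicted target situation was reached, not on how much the pleasant emotion intensity changed.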

[0111] After the transition probability has been updated (learned) in this way, when the robot selects applicable association information in a situation similar to Jic, association information R whose transition probability S has been raised becomes more likely to be selected than before, although this also depends on the situation improvement effect E. Conversely, association information R whose transition probability S has been lowered becomes less likely to be selected and, as a result, stops appearing as an action.

[0112] Thus, by selecting association information, acting on it, and updating the transition probability S according to whether the transition succeeded, the robot gradually comes to select association information with a high transition probability.

[0113] Note that the reliability learning unit 25 evaluates an action not by whether the pleasant emotion intensity C increased but by whether the association information was correct. This differs from the reinforcement learning performed by conventional robots. Given that the aim of the robot's behavior is to increase the pleasant emotion intensity C, reinforcement learning raises or lowers the frequency with which an action appears according to the magnitude of the pleasant emotion intensity C actually experienced after the transition (called the reward signal). Reinforcement learning knows only the action Ac, the initial situation Ji in which to select it, and the action's frequency of appearance S; it is, so to speak, a scheme of the form R = [Ji, Ac], and it does not evaluate the target situation Jd before acting.

[0114] Consequently, in reinforcement learning, even if the transition described by the association information does not occur, the frequency of appearance is raised as long as the robot happens to reach a situation in which the pleasant emotion intensity C increases, even when that transition was unplanned. As a result, an action that in fact leads to an entirely unintended situation keeps being reinforced despite the error.

[0115] The present invention likewise has the robot act with the aim of increasing the pleasant emotion intensity C (this mechanism is handled by the action search unit 23), but it raises or lowers the transition probability S of the association information by evaluating whether the attempted action really produced the intended result. The robot can therefore select actions with a grasp of how correct the association information itself is, which makes fine-grained control of the situation through action possible. According to the present invention, as long as correct association information R = [Ji, Ac, Jd] exists, the robot can bring about the targeted situation Jd. By comparison, reinforcement learning is like throwing stones with one's eyes closed.

[0116] (B) Non-target action learning unit 26
The non-target action learning unit 26 evaluates the result of an action the robot has performed and, when the association information describing that action proved wrong, generates new association information representing the correct transition (non-target action association information) and adds it to the association database unit 22. That is, the non-target action learning unit 26 is one of the means for realizing the action search ability described above.

[0117] When the robot has selected association information R = [Ji, Ac, Jd] in situation Jic, and the action Ac has completed and led to situation Jdc, the non-target action learning unit 26 computes the situation similarity Sj(Jdc, Jd) between the target situation Jd stored in R and the actual situation Jdc after completion, that is, the transition achievement degree B(Jdc, R). If this is below a predetermined threshold, the transition is regarded as failed (association information R did not hold), and the unit generates new association information (non-target action association information) R1 = [Jic, Ac, Jdc] representing the transition that actually occurred (Jic + Ac → Jdc) and adds it to the association database unit 22. If association information similar to the generated R1 is already stored in the association database unit 22, R1 is not added. This prevents the association database unit 22 from overflowing with many similar pieces of association information.

[0118] Whether association information similar to the generated non-target action association information R1 is already stored in the association database unit 22 is determined by the following search.

[0119] First, the situation transition action Ac in the association information R1 = [Jic, Ac, Jdc] to be added is compared with the situation transition action Acn in each piece of association information Rn = [Jin, Acn, Jdn] stored in the association database unit 22, and each Rn whose action is identical is detected in turn. This step picks out only the association information that has the same action as the action Ac of R1.

[0120] Next, for the situations Jic and Jdc in R1 = [Jic, Ac, Jdc] and the situations Jin and Jdn in each detected Rn = [Jin, Acn, Jdn], the situation similarity Sj(Jic, Jin) between the initial situations and the situation similarity Sj(Jdc, Jdn) between the target situations are computed. If both similarities are at or above a predetermined threshold, association information with the same action and sufficiently similar initial and target situations is judged to exist already, and R1 is not added to the association database unit 22.
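
The generation of non-target action association information and the duplicate search can be sketched together as follows. This is a sketch under assumptions: situations are modeled as sets of feature strings, the Jaccard index stands in for the situation similarity Sj (the source does not define Sj in this section), and `SIMILARITY_THRESHOLD` is a hypothetical value used for both the success test and the duplicate test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    initial: frozenset   # initial situation Ji, as a set of features
    action: str          # situation transition action Ac
    target: frozenset    # target situation Jd, as a set of features


SIMILARITY_THRESHOLD = 0.7  # hypothetical predetermined threshold


def similarity(a, b):
    """Jaccard index, used here only as a stand-in for Sj."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def learn_non_target(db, selected, actual_initial, actual_target):
    """Add R1 = [Jic, Ac, Jdc] to db when the transition failed and R1 is novel.

    Returns True if R1 was added.
    """
    # Transition achievement degree B(Jdc, R) = Sj(Jdc, Jd).
    if similarity(actual_target, selected.target) >= SIMILARITY_THRESHOLD:
        return False  # the transition succeeded, nothing to learn here
    candidate = Association(actual_initial, selected.action, actual_target)
    for rn in db:
        # Same action, and both initial and target situations sufficiently similar.
        if (rn.action == candidate.action
                and similarity(candidate.initial, rn.initial) >= SIMILARITY_THRESHOLD
                and similarity(candidate.target, rn.target) >= SIMILARITY_THRESHOLD):
            return False  # a similar association is already stored
    db.append(candidate)
    return True


behind = frozenset({"behind sofa", "alone", "lonely"})
found = frozenset({"user found", "happy"})
r2 = Association(behind, "search for user", found)
db = [r2]
first = learn_non_target(db, r2, behind, behind)    # search failed, situation unchanged
second = learn_non_target(db, r2, behind, behind)   # same failure observed again
```

The second call adds nothing, which is the behavior that keeps the database from filling with near-identical entries.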

[0121] The difference between the present invention and conventional reinforcement learning is now explained.

[0122] Conventional reinforcement learning adjusted the frequency of appearance of an action according to the success or failure (increase or decrease of the pleasant emotion intensity C) of the robot's action after executing the selected action pattern. In other words, the transition that actually occurred as a result of the action was evaluated only by the pleasant emotion intensity C (the reward signal), and only the frequency of appearance was learned.

[0123] In contrast, the non-target action learning unit 26 of the present invention can learn the transition that actually occurred (Jic + Ac → Jdc) as added association information R1 = [Jic, Ac, Jdc]. That is, the initial situation Jic in which the action Ac is performed and the target situation Jdc that results can be turned into knowledge that matches reality.

[0124] The present invention and conventional reinforcement learning are alike in raising the frequency of appearance of actions that improve the situation strongly and lowering that of actions that improve it weakly. Reinforcement learning, however, does not learn the actual initial situation. In the knowledge "in this case (Ji), this action (Ac) should be taken", the "in this case (Ji)" part is fixed, and on that fixed basis the learner acquires, by learning the frequency of appearance S, only the fact that "in this case, taking this action is more likely to earn a reward". That is, R = [Ji, Ac] is fixed and is never learned. Still less is the target situation Jd turned into knowledge, so an action pattern that copes with subtle discrepancies in the situation cannot be learned.

[0125] The present invention differs from reinforcement learning in that it can learn action patterns that cope with subtle discrepancies in the situation. This is necessary for building a robot that can accumulate and apply knowledge about what situation an action brings about. When a piece of knowledge holds (the transition occurs as the association information describes), continuing to apply it causes no problem. When it does not hold (the transition does not occur as described), new knowledge that differs from it and can hold must be constructed. The failure may stem from a subtle difference in the initial situation, or from an uncertain factor that could not be observed as situation information. In such cases the non-target action learning unit 26 constructs additional knowledge that includes the situations before and after the action, allowing the present invention to exhibit an ability to adapt to situations beyond that of reinforcement learning, which does not turn situations into knowledge.

[0126] Thus, in addition to the reliability learning unit 25 learning how easily the targeted post-action situation is reached, the non-target action learning unit 26 learns that transitions to situations other than the targeted one exist. Unlike reinforcement learning, this mechanism can accumulate applicable situations and reachable situations in subdivided form (as association information), giving the system a higher capacity for learning and adaptation than before.

[0127] The non-target action learning unit 26 turns into knowledge previously unknown initial situations to which a known action is applicable and previously unknown target situations to which a known action can lead. As a result, the system can apply actions it already knows in expectation of new effects; that is, it can acquire new action patterns ("this action can be used in this case"), but it cannot increase the number of actions it knows. In other words, the non-target action learning unit 26 operates to enrich the situation information in the association information. The learning performed here does not acquire new action information.

[0128] Through the cooperation of the action search unit 23, the reliability learning unit 25, and the non-target action learning unit 26, the robot can exhibit the "action search ability" described above. That is, even when the robot has not stored an action with a sufficient situation improvement effect, the action search unit 23 lets it try some action. Through the reliability learning unit 25 and the non-target action learning unit 26, the result of such a trial is accumulated as knowledge that influences future actions, whether or not the situation was improved.

[0129] In particular, when the transition does not occur as the association information describes, new association information is constructed. Even when the robot reaches a situation that differs from its knowledge, that outcome is turned into knowledge as new association information, and the robot's ability to adapt to situations improves.

[0130] For the robot to exhibit the action search ability described above, some seed action patterns (association information) must first be stored in the association database unit 22. The richer (more numerous) these stored action patterns are, the more situations the robot can handle or can come to handle.

[0131] However, if no action improves the situation, the robot merely goes on performing useless actions indefinitely. This is the limit of the situation-handling ability of a robot that cannot expand its set of actions. For the robot to handle a wider range of situations, a learning function that increases the situation transition actions themselves is needed.

[0132] (C) Other-party action learning unit 27
The other-party action learning unit 27 stores an action performed by another party, such as the user, in the association database unit 22 as new association information (other-party action association information), making it one of the robot's action patterns. That is, the other-party action learning unit 27 is the means for realizing the action duplication ability described above.

[0133] More specifically, the situation input unit 21 recognizes the other party's voice, actions (attitude), facial expressions, and so on using the microphone and imaging camera provided on the robot, and estimates what the other party's situation was before and after the action. The other party's situation includes, besides external state parameters that can be detected immediately by observation, internal state parameters such as emotions that must be estimated. The internal state parameters are estimated from the other party's input voice, actions (attitude), and facial expressions, and dictionary information for recognizing these is stored in the memory 17 in advance. This dictionary information is used to detect facial expressions, actions (attitudes) such as gestures, tone of voice, and the like from captured images and recorded audio, and to identify the internal state parameters that can be read from them. For example, if an image of an angry expression is observed to change to an image of a smiling expression as the result of some action, the other party's emotional state is estimated to have changed from anger to joy. The estimated result is stored in the situation information sequence.

[0134] When the other-party action learning unit 27 detects another party's action Ac in the situation information sequence produced by the situation input unit 21, it operates in one of the following two ways.

[0135] (1) When the action Ac is directed at the robot
Normally, the situation input unit 21 treats the other party's emotional state and actions as the robot's external state and the robot's own emotional state and actions as its internal state, and compiles both into the situation information. Since the action Ac was performed by the other party toward the robot, reversing the roles of the two should allow it to be reinterpreted as an action performed by the robot toward another party.

[0136] The external state parameters observed as the robot's surroundings can be divided into two parts: (a) the part concerning the other party who is the actor, and (b) the environment excluding the other party. Part (a) consists of the facial expressions, actions, and attitudes of the acting party, together with that party's emotions and desires as the robot was able to estimate them. Part (b) is the part of the environment common to both the robot and the other party, such as surrounding trees and objects. The robot's internal state parameters form (c), the parameters representing the robot's emotions and desires.

[0137] The external state parameters as seen from the other party are then synthesized from (b) and (c), together with symbol data in which the robot that is actually present is replaced by the presence of some other party. The other party's internal state parameters correspond to (a).

[0138] The other-party action learning unit 27 therefore decomposes each of the situation information sequences Ji and Jd, which describe the situations before and after the action Ac, into (a), (b), and (c) and reorganizes them, yielding other-party situation information sequences Ji' and Jd' that capture the situations before and after Ac from the other party's viewpoint. It generates new association information (other-party action association information) R = [Ji', Ac, Jd'] with Ji' as the initial situation, Jd' as the target situation, and Ac as the situation transition action, and adds it to the association database unit 22. If association information similar to the generated R is already stored in the association database unit 22, R is not added. The search for whether similar association information exists in the association database unit 22 is the same as the search performed by the non-target action learning unit 26.
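
The decomposition into (a), (b), (c) and the role reversal can be sketched as follows. The `Situation` class, its field names, and the `"party": "some other party"` marker are hypothetical stand-ins for the situation information sequence and the replacement symbol data; the source does not specify a data layout.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    other: dict         # (a) observed and estimated state of the acting party
    environment: dict   # (b) environment shared by both parties
    internal: dict      # (c) the robot's own emotions and desires


def swap_perspective(observed):
    """Re-express a situation from the acting party's viewpoint."""
    return Situation(
        # The robot's state (c) becomes an external "some other party" for the actor.
        other={"party": "some other party", **observed.internal},
        # The shared environment (b) is carried over unchanged.
        environment=dict(observed.environment),
        # The actor's state (a) becomes the internal state.
        internal=dict(observed.other),
    )


def other_party_association(ji, action, jd):
    """Build R = [Ji', Ac, Jd'] from the situations before and after the action."""
    return (swap_perspective(ji), action, swap_perspective(jd))


ji = Situation({"emotion": "calm"}, {"place": "living room"}, {"emotion": "lonely"})
jd = Situation({"emotion": "joy"}, {"place": "living room"}, {"emotion": "happy"})
r = other_party_association(ji, "stroke head", jd)
```

Here the action `"stroke head"` is an invented example; after the swap, the resulting association reads as something the robot itself could do toward another party.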

【０１３９】(2)行動Acが他者間で行われた場合状況入力部21によって、他者Xと他者Yの感情状態や行動
はロボットの外部状態として扱われ、状況情報にまとめ
られている。行動Acが他者Xを対象に他者Yによって行わ
れたものであるとすると、ロボットの外部状態パラメー
タは、(a)行為者たる他者Yに関する部分と、(b)対象者
たる他者Xに関する部分と、(c)他者XとYに関する部分を
除く環境部分の3つに分解できる。(a)は行為者Yの表情
や行動や態度及び、それらからロボットが推定すること
のできたこの行為者Yの感情や欲求である。(b)は対象者
Xの表情や行動や態度及び、それらからロボットが推定
することのできたこの対象者Xの感情や欲求である。ま
た、(c)は例えば周囲の樹木や物体など、他者XとYの双
方にとって共通の環境部分である。(2) When the action Ac is performed between other persons: The situation input unit 21 treats the emotional states and actions of other person X and other person Y as external states of the robot and compiles them into the situation information. Assuming that the action Ac was performed by other person Y on other person X, the external state parameters of the robot can be decomposed into three parts: (a) a part relating to other person Y, the actor; (b) a part relating to other person X, the target person; and (c) an environment part excluding the parts relating to X and Y. (a) comprises the facial expressions, actions, and attitude of actor Y, together with the emotions and desires of actor Y that the robot was able to estimate from them. (b) comprises the facial expressions, actions, and attitude of target person X, together with the emotions and desires of target person X that the robot was able to estimate from them. (c) is the part of the environment common to both X and Y, such as surrounding trees and objects.

【０１４０】このとき、行為者Yから見た外部状態パラ
メータは、前記(b)、(c)に加えて、実際には対象者Xが存
在するのだが、これを何らかの他者がいるというように
置き換えた記号データから合成される。また、行為者Y
の内部状態パラメータは、前記(a)ということになる。At this time, the external state parameters as seen from actor Y are synthesized from (b) and (c) above, together with symbol data in which target person X, who is actually present, is replaced by an indication that some other person is present. The internal state parameter of actor Y is (a) above.

【０１４１】そこで、他者行動学習部27は、行動Acの前
後の状況を示す状況情報列JiとJdの各々を上記(a)、(b)、
(c)に分解して再編成することにより、行動Acの前後の
状況を他者の視点から捉えた他者状況情報列Ji'とJd'と
し、Ji'を初期状況、Jd'を目的状況、Acを状況遷移行動
とする新たな連関情報（他者行動連関情報）R=［Ji',A
c,Jd'］を生成して連関データベース部22に追加する。
なお、生成された他者行動連関情報Rに類似する連関情
報が既に連関データベース部22に記憶されていれば追加
しない。なお、この他者行動連関情報Rに類似する連関
情報が連関データベース部22に存在するか否かの検索
は、目標外行動学習部26で行われた検索と同様である。Therefore, the other-person behavior learning unit 27 decomposes each of the situation information sequences Ji and Jd, which indicate the situations before and after the action Ac, into (a), (b), and (c) above and reorganizes them, yielding other-person situation information sequences Ji' and Jd' that capture the situations before and after the action Ac from the other person's viewpoint. It then generates new association information (other-person behavior association information) R = [Ji', Ac, Jd'], with Ji' as the initial situation, Jd' as the target situation, and Ac as the situation transition action, and adds it to the association database unit 22.
If association information similar to the generated other-person behavior association information R is already stored in the association database unit 22, R is not added. The search for whether association information similar to R exists in the association database unit 22 is the same as the search performed by the non-target behavior learning unit 26.

【０１４２】なお、生成された他者行動連関情報Rに類
似する連関情報が既に連関データベース部22に記憶され
ていれば追加しない。なお、この他者行動連関情報Rに
類似する連関情報が連関データベース部22に存在するか
否かの検索は、目標外行動学習部26で行われた検索と同
様である。Note that if association information similar to the generated other-person behavior association information R is already stored in the association database unit 22, R is not added. The search for whether association information similar to R exists in the association database unit 22 is the same as the search performed by the non-target behavior learning unit 26.

【０１４３】この他者行動学習部27によって、ロボット
は「他人から学ぶ能力」、すなわち「行動複製能力」を
発揮することができる。[0143] The other-action learning unit 27 allows the robot to exhibit "the ability to learn from others", that is, the "action-copying ability".

【０１４４】例えば、泣いている使用者Xの背中を使用
者Yがさすることで、泣いていた使用者Xが泣きやみ笑顔
を見せた現場を観察することで、ロボットは使用者Yの
このような対人行動を学習することができる。そして、
学習したこの行動は別の機会に、泣いている使用者Zを
検出したときなどに発現する。For example, by observing a scene in which user Y rubs the back of a crying user X and user X then stops crying and smiles, the robot can learn this interpersonal behavior of user Y. The learned behavior is then expressed on another occasion, for example when the robot detects a crying user Z.

【０１４５】このとき、このような対人行動が選択され
るには、その行動がロボットの快感情強度Cを増加させ
るものである必要がある。これには、ロボットが他者に
対して抱く好悪感情が関係する。At this time, for such an interpersonal behavior to be selected, the behavior must increase the pleasant emotion intensity C of the robot. This depends on the feelings of like or dislike that the robot holds toward the other person.

【０１４６】例えば、ロボットがある人物に好感を抱い
ている時、その人物の窮状は状況入力部21において検知
され、このロボットの快感情強度Cを損なうように作用
する。そのためロボットは自己の快感情強度Cを改善す
るために、この人物の窮状を回復する行動を要求され、
その結果先に学習された他者の状況改善に効果のあった
対人行動が発現する。For example, when the robot has a favorable feeling toward a person, that person's plight is detected by the situation input unit 21 and acts to reduce the pleasant emotion intensity C of the robot. To improve its own pleasant emotion intensity C, the robot therefore needs an action that relieves the person's plight, and as a result the previously learned interpersonal behavior that was effective in improving another person's situation is expressed.

【０１４７】逆に、ロボットがある人物に好感を抱いて
いない時、その人物の窮状はロボットの快感情強度Cを
損なわないか、増加させるように作用するため、相手の
窮状を回復する行動は発現しない、もしくはさらに窮状
を増すような行動が発現する。Conversely, when the robot does not have a favorable feeling toward a person, that person's plight either does not reduce the pleasant emotion intensity C of the robot or acts to increase it. Consequently, no action to relieve the person's plight is expressed, or an action that further worsens the plight is expressed.

【０１４８】他者に対する好悪感情は、ロボットとその
他者とのこれまでの関わり合いを通じて形成される。ロ
ボットを可愛がってくれる人物に対しては、その人物を
検知するための辞書情報の中に「この人は良い人」とい
うラベル（情報）が付けられる。逆にロボットをいじめ
る人物に対しては、その人物を検知するための辞書情報
の中に「この人は悪い人」というラベルが付けられる。
人物が検知されるたびに、この良い人か悪い人かの判断
がなされ、その結果に応じて観測される外部状態がロボ
ットの感情状態に与える影響が変化する。なお、これら
の処理は状況入力部21にて行われる。Feelings of like and dislike toward another person are formed through the robot's past interactions with that person. For a person who treats the robot affectionately, a label (information) saying "this person is a good person" is attached to the dictionary information used to detect that person. Conversely, for a person who bullies the robot, a label saying "this person is a bad person" is attached to the dictionary information used to detect that person.
Each time a person is detected, it is determined whether the person is good or bad, and the influence of the observed external state on the emotional state of the robot changes according to the result. These processes are performed by the situation input unit 21.

【０１４９】このように、本発明では、他者行動学習部
27が、他人の行動を見真似することによって特に対人行
動をロボットに獲得させることに貢献し、状況入力部21
が検出される他人に対する好悪判断に基づいてロボット
の快感情強度Cを変化させ、行動検索部23がその快感情
強度Cに基づいて学習された対人行動を発現させる。As described above, in the present invention, the other-person behavior learning unit 27 helps the robot acquire interpersonal behaviors in particular by imitating the actions of others; the situation input unit 21 changes the pleasant emotion intensity C of the robot based on its like-or-dislike judgment of the detected other person; and the action search unit 23 causes the learned interpersonal behavior to be expressed based on that pleasant emotion intensity C.

【０１５０】これまでに述べた構成により、ロボットは
「行動探索能力」と「行動複製能力」を発揮することが
できる。具体的には、(I)行動の発現頻度の適応化、(I
I)行動の発現条件たる初期状況の新規獲得、(III)行動
の効果たる目的状況の新規獲得、(IIII)行動の内容たる
状況遷移行動自体の新規獲得、が可能になる。特に、(I
I)〜(IIII)は従来のロボットでは獲得できない知識であ
る。With the configuration described so far, the robot can exhibit an "action search ability" and an "action copying ability". Specifically, it becomes possible to achieve (I) adaptation of the frequency with which an action is expressed, (II) new acquisition of an initial situation, which is the condition for expressing an action, (III) new acquisition of a target situation, which is the effect of an action, and (IIII) new acquisition of a situation transition action itself, which is the content of an action. In particular, (II) to (IIII) are knowledge that conventional robots cannot acquire.

【０１５１】しかしながら、このような能力だけではロ
ボットは、個々の行動の状況改善効果に縛られてしか行
動できないため、一度は状況を悪化させながらも複数の
行動を組み合わせることで最終的により改善された状況
に到達するという計画性を発揮することができない恐れ
がある。この点に対しては、以下で説明する状況調節行
動学習部28と連鎖行動学習部29とが対応することで、計
画性を持った行動をロボットに行わせることができる。However, with these abilities alone the robot can act only within the situation-improving effect of each individual action. It may therefore be unable to plan, that is, to reach a more improved situation in the end by combining several actions even though the situation temporarily worsens along the way. The situation adjustment behavior learning unit 28 and the chain behavior learning unit 29 described below address this point, allowing the robot to act in a planned manner.

【０１５２】(ニ)状況調節行動学習部28 状況調節行動学習部28は、ある初期状況から、ある目的
状況に遷移できないときに、他の初期状況からその目的
状況へ遷移できた他の連関情報を適用できるようにする
ために、その初期状況を作り出すための遷移を新たな連
関情報（状況調節行動連関情報）として連関データベー
ス部22に記憶させ、ロボットの行動パタンの一つとする
動作をする。すなわち、状況調節行動学習部28は、前述
の状況調節能力を実現する手段である。(D) Situation adjustment behavior learning unit 28. When a transition from a certain initial situation to a certain target situation cannot be made, the situation adjustment behavior learning unit 28 makes it possible to apply other association information that did reach that target situation from a different initial situation. To do so, it stores the transition that creates that other initial situation in the association database unit 22 as new association information (situation adjustment behavior association information), making it one of the behavior patterns of the robot. In other words, the situation adjustment behavior learning unit 28 is the means for realizing the situation adjustment ability described above.

【０１５３】状況調節行動学習部28は、連関情報が連関
データベース部22に追加される毎に起動し、追加された
連関情報R1=［Ji1,Ac1,Jd1］の目的状況Jd1に類似した
目的状況Jd2を持つ連関情報R2=［Ji2,Ac2,Jd2］（状況
類似度Sj(Jd1,Jd2)が所定閾値以上）を連関データベー
ス部22から検索する。検索の結果、連関情報R2が検出さ
れると、状況調節行動学習部28は両者の遷移確率S(R1)
とS(R2)を評価して、連関情報R1の遷移確率S(R1)が所定
閾値以上あり、連関情報R2の遷移確率S(R2)がS(R1)より
所定閾値以上低ければ、R2の初期状況Ji2からR1の初期
状況Ji1に至る新たな連関情報（状況調節行動連関情
報）R3を生成して連関データベース部22に記憶する動作
を行う。あるいは、逆に連関情報R2の遷移確率S(R2)が
所定閾値以上あり、連関情報R1の遷移確率S(R1)がS(R2)
より所定閾値以上低ければ、R1の初期状況Ji1からR2の
初期状況Ji2に至る新たな連関情報（状況調節行動連関
情報）R3を生成して連関データベース部22に記憶する動
作を行う。The situation adjustment behavior learning unit 28 is started each time association information is added to the association database unit 22. It searches the association database unit 22 for association information R2 = [Ji2, Ac2, Jd2] whose target situation Jd2 is similar to the target situation Jd1 of the added association information R1 = [Ji1, Ac1, Jd1] (that is, the situation similarity Sj(Jd1, Jd2) is equal to or greater than a predetermined threshold). When such association information R2 is found, the unit evaluates the transition probabilities S(R1) and S(R2). If S(R1) is equal to or greater than a predetermined threshold and S(R2) is lower than S(R1) by a predetermined threshold or more, the unit generates new association information (situation adjustment behavior association information) R3 leading from the initial situation Ji2 of R2 to the initial situation Ji1 of R1 and stores it in the association database unit 22. Conversely, if S(R2) is equal to or greater than the predetermined threshold and S(R1) is lower than S(R2) by the predetermined threshold or more, the unit generates new association information (situation adjustment behavior association information) R3 leading from the initial situation Ji1 of R1 to the initial situation Ji2 of R2 and stores it in the association database unit 22.
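The comparison of transition probabilities described above can be illustrated with a short sketch. It assumes the caller has already confirmed that the two target situations are similar, and the names `Association`, `propose_adjustment`, `min_probability`, and `min_gap`, as well as the default threshold values, are hypothetical choices made only for this example.

```python
from typing import NamedTuple, Optional, Tuple


class Association(NamedTuple):
    initial: str                   # initial situation Ji
    action: str                    # situation transition action Ac
    target: str                    # target situation Jd
    transition_probability: float  # S(R)


def propose_adjustment(
    r1: Association,
    r2: Association,
    min_probability: float = 0.6,  # assumed threshold on the reliable record
    min_gap: float = 0.3,          # assumed threshold on the probability gap
) -> Optional[Tuple[str, str]]:
    """Return (start, goal) for a new record R3, or None if neither case applies.

    R3 should lead from the initial situation of the unreliable record to
    the initial situation of the reliable one.
    """
    s1, s2 = r1.transition_probability, r2.transition_probability
    if s1 >= min_probability and s1 - s2 >= min_gap:
        return (r2.initial, r1.initial)
    if s2 >= min_probability and s2 - s1 >= min_gap:
        return (r1.initial, r2.initial)
    return None
```

The function only decides the direction of R3; finding the action that actually performs the transition is a separate step.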

【０１５４】なお、生成された状況調節行動連関情報R3
に類似する連関情報が既に連関データベース部22に記憶
されていれば追加しない。また、この状況調節行動連関
情報R3に類似する連関情報が連関データベース部22に存
在するか否かの検索は、目標外行動学習部26で行われた
検索と同様である。Note that if association information similar to the generated situation adjustment behavior association information R3 is already stored in the association database unit 22, R3 is not added. The search for whether association information similar to R3 exists in the association database unit 22 is the same as the search performed by the non-target behavior learning unit 26.

【０１５５】より詳しくは、目的状況Jdがほぼ同一であ
る複数(例えば2つ)の連関情報R1=［Ji1,Ac1,Jd］とR2=
［Ji2,Ac2,Jd］（いずれか一方が追加された連関情報）
のうち、一方の連関情報R1は頻度が高く、他方の連関情
報R2は頻度が低いことが、両者の遷移確率を見ることで
判断できる。一方が目的状況Jdに到達できるが、他方が
到達できないといった原因は、出発点たる初期状況とそ
のとき行いえる行動の違いにある。そこで、遷移に成功
する連関情報R1と成功しない連関情報R2それぞれの初期
条件Ji1とJi2を、2つの連関情報に共通に含まれる付帯
条件Jkと、前者（成功）に特有の付帯条件Jsと、後者
（失敗）に特有の付帯条件Jfに分離する。このJsが成功
要因であり、Jfが失敗要因である。Ji1=Jk+Jsであり、J
i2=Jk+Jfである。More specifically, of a plurality of (for example, two) pieces of association information R1 = [Ji1, Ac1, Jd] and R2 = [Ji2, Ac2, Jd] whose target situations Jd are substantially the same (one of the two being the newly added association information), it can be determined from their transition probabilities that R1 succeeds frequently and R2 succeeds infrequently. The reason one can reach the target situation Jd while the other cannot lies in the difference between the initial situations that serve as starting points and the actions that can be performed in them. Therefore, the initial conditions Ji1 and Ji2 of the association information R1, which succeeds in the transition, and R2, which does not, are separated into an incidental condition Jk common to both, an incidental condition Js specific to the former (success), and an incidental condition Jf specific to the latter (failure). Js is the success factor and Jf is the failure factor, so that Ji1 = Jk + Js and Ji2 = Jk + Jf.

【０１５６】状況調節行動は、初期状況Ji2=(Jk+Jf)か
ら目的状況Ji1=(Jk+Js)へと状況を遷移させる行動であ
る。The situation adjustment action is an action for changing the situation from the initial situation Ji2 = (Jk + Jf) to the target situation Ji1 = (Jk + Js).

【０１５７】例えば、同じ目的状況を持つ下記の2つの
連関情報を考える。For example, consider the following two pieces of association information having the same target situation.

【０１５８】連関情報R1= ［Ji1:ロボットがソファの上に一人でいて寂しい（第1
のパラメータ値） Ac1:その場で使用者を探す Jd:使用者を見つけて嬉しい］連関情報R2= ［Ji2:ロボットがソファの後ろに一人でいて寂しい（第
2のハ゜ラメータ値） Ac2:その場で使用者を探す Jd:使用者を見つけて嬉しい］このとき、「一人でいて寂しい」ことが共通条件Jk、
「使用者を見つけて嬉しい」が目標状況、成功要因Jsが
「ソファの上にいる」、失敗要因Jfが「ソファの後ろに
いる」となる。状況調節行動学習部28が行う処理は、Jf
をJsにする行動、具体的には「ソファの上に登る」行動
Ac3を生成して、この行動Ac3を含んだ新しい連関情報
（状況調節行動連関情報）R3=［Ji2=Jk+Jf,Ac3, Ji1=Jk
+Js］を組み立てることである。Association information R1 = [Ji1: the robot is alone on the sofa and lonely (first parameter value); Ac1: look for the user from where it is; Jd: happy to have found the user]. Association information R2 = [Ji2: the robot is alone behind the sofa and lonely (second parameter value); Ac2: look for the user from where it is; Jd: happy to have found the user]. Here, "being alone and lonely" is the common condition Jk, "happy to have found the user" is the target situation, the success factor Js is "being on the sofa", and the failure factor Jf is "being behind the sofa". The processing performed by the situation adjustment behavior learning unit 28 is to generate an action that turns Jf into Js, specifically the action Ac3 of "climbing onto the sofa", and to assemble new association information (situation adjustment behavior association information) R3 = [Ji2 = Jk + Jf, Ac3, Ji1 = Jk + Js] containing this action Ac3.
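The separation into Jk, Js, and Jf can be illustrated with ordinary set operations. Modelling a situation as a set of symbolic conditions is an assumption made for this sketch only, as are the function name `split_conditions` and the condition labels.

```python
from typing import FrozenSet, Tuple


def split_conditions(
    ji1: FrozenSet[str], ji2: FrozenSet[str]
) -> Tuple[FrozenSet[str], FrozenSet[str], FrozenSet[str]]:
    """Split two initial situations into (Jk, Js, Jf).

    ji1 belongs to the record that succeeds, ji2 to the one that fails.
    """
    jk = ji1 & ji2  # conditions common to both
    js = ji1 - ji2  # success factor: only in the successful record
    jf = ji2 - ji1  # failure factor: only in the failing record
    return jk, js, jf


# The sofa example from the text, with hypothetical symbol names.
ji1 = frozenset({"alone", "lonely", "on_sofa"})
ji2 = frozenset({"alone", "lonely", "behind_sofa"})
jk, js, jf = split_conditions(ji1, ji2)

# R3 = [Ji2 = Jk + Jf, Ac3, Ji1 = Jk + Js]
r3 = (jk | jf, "climb_onto_sofa", jk | js)
```

With this representation Jk + Js is simply the union of the two sets, so the initial and target situations of R3 reproduce Ji2 and Ji1 exactly.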

【０１５９】状況調節行動連関情報R3= ［Ji2: ロボットがソファの後ろに一人でいて寂しい Ac3: ソファの上に登る Ji1: ロボットがソファの上に一人でいて寂しい］この新たな状況調節行動連関情報R3に従って、ロボット
は「ソファの後ろに一人でいて寂しいとき、ソファの上
に登る」という行動を行う。なお、ロボットが何かに登
ることは、ロボットの行動素片としてメモリ17に記憶さ
れていなければならない。ただ、ソファに登るという行
動の持ち合わせがロボットにはなく、無論、「ソファの
後ろに一人でいて寂しいとき、ソファの上に登る」とい
う行動パタンもこのとき初めて獲得されるのである。Situation adjustment behavior association information R3 = [Ji2: the robot is alone behind the sofa and lonely; Ac3: climb onto the sofa; Ji1: the robot is alone on the sofa and lonely]. In accordance with this new situation adjustment behavior association information R3, the robot performs the behavior "when alone and lonely behind the sofa, climb onto the sofa". Note that the act of climbing onto something must already be stored in the memory 17 as an action segment of the robot. However, the robot did not previously possess the specific action of climbing onto the sofa, and of course the behavior pattern "when alone and lonely behind the sofa, climb onto the sofa" is also acquired for the first time at this point.

【０１６０】(ホ)連鎖行動学習部29 連鎖行動学習部29は、個々の連関情報（行動）では状況
を改善できないとき、複数の連関情報（行動）を連続的
に行って状況を改善できる連関情報の連鎖を見つけ、そ
の連鎖を記述した新たな連関情報（連鎖行動連関情報）
を生成して連関データベース部22に記憶させ、ロボット
の行動パタンの一つとする動作をする。すなわち、連鎖
行動学習部29は前述の行動計画能力を実現する手段であ
る。(E) Chain behavior learning unit 29. When the situation cannot be improved by individual pieces of association information (actions), the chain behavior learning unit 29 finds a chain of association information that can improve the situation when several pieces of association information (actions) are performed in succession, generates new association information (chain behavior association information) describing that chain, and stores it in the association database unit 22 as one of the behavior patterns of the robot. In other words, the chain behavior learning unit 29 is the means for realizing the action planning ability described above.

【０１６１】連鎖行動学習部29は、目標外行動学習部26
や他者行動学習部27や状況調節行動学習部28の働きによ
り新たな連関情報が連関データベース部22に追加される
毎に起動し、連関データベース部22に記憶される連関情
報の内容を調査し、一方の連関情報の目的状況が他方の
連関情報の初期状況に類似する、すなわち3つ以上の状
況を2つ以上の行動の連鎖で連続的に遷移させる複数の
連関情報R1=［Ji1,Ac1,Jd1］、R2=［Ji2,Ac2,Jd2］（状
況類似度Sj(Jd1,Ji2)が所定閾値以上）、R3=［Ji3,Ac3,
Jd3］（状況類似度Sj(Jd2,Ji3)が所定閾値以上）…Rn=
［Jin,Acn,Jdn］から成る連鎖を全て検索する（ただし
ｎは4以上の自然数）。ここで連関情報R1は第1の連関情
報であり、R2は第2の連関情報であり、R3は第3の連関情
報であり、この場合には2連鎖となる。連鎖は、それを構
成する要素たる連関情報が一つでも異なれば異なる連鎖
であり、また要素たる連関情報の順序が一個所でも異な
れば異なる連鎖である。なお、この検索は、追加された
連関情報が上記R1〜Rnのいずれかとなる連鎖の検索に限
定される。検索の結果、連鎖が検出されると、連鎖行動
学習部29は、この抽出された連鎖の中で、連関情報Rk
（1≦k＜n）の初期状況Jikの快感情強度C(Jik)よりも、
連関情報Rp（k＜p≦n）の目的状況Jdpの快感情強度C(Jd
p)が改善されている部分連鎖（前記連鎖そのものも含
む）を全て抽出し、その各々を新たな連関情報(連鎖行
動連関情報)Rx=［Jik,Ack+…+Acp,Jdp］として連関デー
タベース部22に記憶する。The chain behavior learning unit 29 is started each time new association information is added to the association database unit 22 through the operation of the non-target behavior learning unit 26, the other-person behavior learning unit 27, or the situation adjustment behavior learning unit 28. It examines the association information stored in the association database unit 22 and searches for every chain in which the target situation of one piece of association information is similar to the initial situation of the next, that is, every chain of association information R1 = [Ji1, Ac1, Jd1], R2 = [Ji2, Ac2, Jd2] (situation similarity Sj(Jd1, Ji2) equal to or greater than a predetermined threshold), R3 = [Ji3, Ac3, Jd3] (situation similarity Sj(Jd2, Ji3) equal to or greater than a predetermined threshold), ..., Rn = [Jin, Acn, Jdn] that moves through three or more situations in succession by a chain of two or more actions (where n is a natural number of 4 or more). Here R1 is the first association information, R2 is the second association information, and R3 is the third association information, and this case constitutes a two-link chain. Two chains are different if even one of their constituent pieces of association information differs, or if the order of those pieces differs in even one place. The search is limited to chains in which the added association information is one of R1 to Rn.
When chains are found, the chain behavior learning unit 29 extracts from each of them every partial chain (including the chain itself) in which the pleasant emotion intensity C(Jdp) of the target situation Jdp of association information Rp (k < p ≦ n) is improved over the pleasant emotion intensity C(Jik) of the initial situation Jik of association information Rk (1 ≦ k < n), and stores each as new association information (chain behavior association information) Rx = [Jik, Ack + ... + Acp, Jdp] in the association database unit 22.
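The extraction of improving partial chains can be sketched as below. The sketch assumes a chain has already been found (each link's target situation is similar to the next link's initial situation) and that the pleasant emotion intensity C of each situation is available as a number; the `Link` type and the function name are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Link(NamedTuple):
    initial: str      # initial situation Ji
    action: str       # action Ac
    target: str       # target situation Jd
    c_initial: float  # pleasant emotion intensity C(Ji)
    c_target: float   # pleasant emotion intensity C(Jd)


def improving_subchains(chain: List[Link]) -> List[Tuple[str, Tuple[str, ...], str]]:
    """Return [Jik, Ack + ... + Acp, Jdp] for every k < p with C(Jdp) > C(Jik)."""
    found = []
    for k in range(len(chain)):
        for p in range(k + 1, len(chain)):
            if chain[p].c_target > chain[k].c_initial:
                actions = tuple(link.action for link in chain[k:p + 1])
                found.append((chain[k].initial, actions, chain[p].target))
    return found
```

Only the two endpoints are compared, so a partial chain qualifies even when the intermediate situations have a lower intensity than its starting situation.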

【０１６２】なお、生成された連鎖行動連関情報Rxに類
似する連関情報が既に連関データベース部22に記憶され
ていれば追加しない。また、この連鎖行動連関情報Rxに
類似する連関情報が連関データベース部22に存在するか
否かの検索は、目標外行動学習部26で行われた検索と同
様である。Note that if association information similar to the generated chain behavior association information Rx is already stored in the association database unit 22, Rx is not added. The search for whether association information similar to Rx exists in the association database unit 22 is the same as the search performed by the non-target behavior learning unit 26.

【０１６３】さらに、この連鎖行動連関情報から、上述
した手順によって更に新たな連鎖行動連関情報を作成し
記憶していくことも可能である。Further, it is also possible to create and store new chained action linkage information from the chained action linkage information by the above-described procedure.

【０１６４】また、上記処理により、ある連関情報R10
の目的状況Jd10と類似する初期状況Ji11を有した連関情
報R11と、この連関情報R11の目的状況Jd11と類似する初
期状況Ji12を有する連関情報R12と、この連関情報R12の
目的状況Jd12と類似する初期状況Ji13を有する連関情報
R13とが存在する場合、連関情報R10の初期状況Ji10に対
してR11,R12の目的状況Jd11,Jd12の快感情強度Cが低く
ても、連関情報R13の目的状況Jd13の快感情強度Cが大き
ければ、この連関情報R10〜R13は新たな連鎖行動連関情
報として連関データベース部22に記憶される。この場合
の連鎖は2（ｎ=2）であり、連関情報は3個（ｎ+1）であ
る。Further, through the above processing, suppose there exist association information R11 having an initial situation Ji11 similar to the target situation Jd10 of certain association information R10, association information R12 having an initial situation Ji12 similar to the target situation Jd11 of R11, and association information R13 having an initial situation Ji13 similar to the target situation Jd12 of R12. In that case, even if the pleasant emotion intensity C of the target situations Jd11 and Jd12 of R11 and R12 is low relative to the initial situation Ji10 of R10, the association information R10 to R13 is stored in the association database unit 22 as new chain behavior association information provided that the pleasant emotion intensity C of the target situation Jd13 of R13 is high. In this case, the number of chain links is 2 (n = 2) and the number of pieces of association information is 3 (n + 1).

【０１６５】このように、複数の連関情報を組み合わせ
て連鎖行動連関情報を生成する場合には、途中の行動を
実行したときには快感情強度Cが一時的に低下しても最
終的に快感情強度Cが増加していれば良いものとする。As described above, when chain behavior association information is generated by combining a plurality of pieces of association information, it suffices that the pleasant emotion intensity C has ultimately increased, even if it temporarily decreases when an intermediate action is executed.

【０１６６】このような構成により、ロボットは、状況
調整能力と行動計画能力を発揮する。具体的には、直接
的に状況を改善する行動だけでなく、直接状況を改善し
なくとも間接的に状況の改善に結びつく準備行動の獲得
と、途中で状況を悪化させても最終的には状況改善に結
びつける一連の行動を記した連関情報の生成・獲得が可
能になる。With such a configuration, the robot exhibits a situation adjustment ability and an action planning ability. Specifically, it becomes possible to acquire not only actions that directly improve the situation but also preparatory actions that lead to improvement indirectly without improving the situation directly, and to generate and acquire association information describing a series of actions that ultimately leads to improvement even if the situation worsens along the way.

【０１６７】なお、既に述べた通り、状況調節行動学習
部28と連鎖行動学習部29は、連関情報データベース部22
に記憶された連関情報の更新状況を監視しており、シス
テム起動時に起動されるほかに、新たな連関情報の追加
を検出する毎に起動される。また、自身によって連関情
報の追加が一通り行われた直後にも再び起動されて、追
加すべき連関情報が見出せなくなるまで再帰的に起動さ
れる。As already described, the situation adjustment behavior learning unit 28 and the chain behavior learning unit 29 monitor the update status of the association information stored in the association database unit 22. In addition to being started at system startup, they are started each time the addition of new association information is detected. They are also started again immediately after they themselves have finished adding association information, and are started recursively until no further association information to be added can be found.
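The repeated restarting until nothing more can be added amounts to iterating to a fixed point. A minimal sketch follows, assuming association records can be held in a Python set and each learning unit is modelled as a function that proposes records from the current database; `run_until_stable` and `compose` are hypothetical names, and `compose` is only a toy stand-in for a learning unit.

```python
from typing import Callable, Iterable, Set, Tuple

Record = Tuple[int, int]  # toy record: (initial situation, target situation)


def run_until_stable(
    database: Set[Record],
    learners: Iterable[Callable[[Set[Record]], Set[Record]]],
) -> int:
    """Run every learner repeatedly until none proposes a new record.

    Returns the number of rounds executed, including the final round
    in which nothing new was found. The database is updated in place.
    """
    learners = list(learners)
    rounds = 0
    while True:
        rounds += 1
        new_records: Set[Record] = set()
        for learn in learners:
            new_records |= learn(database) - database
        if not new_records:
            return rounds
        database |= new_records


def compose(db: Set[Record]) -> Set[Record]:
    """Toy learner: join records whose target equals another's initial situation."""
    return {(a, d) for (a, b) in db for (c, d) in db if b == c}
```

Because records already present are filtered out before the emptiness check, the loop stops as soon as the learners can only reproduce existing records.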

【０１６８】以上述べた様な構成と、各構成要素が実行
する能力について説明してきたが、続いて、図4の行動
形成方法のフローチャートを参照して行動形成方法につ
いて説明する。なお、連関データベース部22には、ロボ
ット運用時点で、個々の連関情報との他に、その状況調
節行動連関情報と連鎖行動連関情報とが既に記憶されて
いるものとする。もし、そうでなければ、ロボット運用
以前の段階で、状況調節行動学習部28と連鎖行動学習部
29とが起動され、その働きにより全ての連関情報につい
ての状況調節行動連関情報と連鎖行動連関情報が追加さ
れるものとする。The configuration and the capabilities of each component have been described above. Next, the action forming method will be described with reference to the flowchart of FIG. 4. It is assumed that, at the time the robot is put into operation, the association database unit 22 already stores, in addition to the individual pieces of association information, the corresponding situation adjustment behavior association information and chain behavior association information. If it does not, the situation adjustment behavior learning unit 28 and the chain behavior learning unit 29 are started before the robot is put into operation, and through their operation the situation adjustment behavior association information and chain behavior association information for all the association information are added.

【０１６９】(1)状況入力部21がロボット周囲の状況情
報を取得する(S1)。なお状況情報は、距離センサ11、感
圧センサ14、撮像カメラ10、マイク12により検知され
た、例えば被検出体(使用者等)の有無、被検出体の表情
や状態、被検出体等が発する音、被検出体による行動で
ある。検知された状況情報はそれまでに得られていた状
況情報列に編集され、行動発現ループ、学習ループ1、
学習ループ2、学習ループ3のそれぞれに出力される。(1) The situation input unit 21 acquires situation information about the surroundings of the robot (S1). The situation information is what is detected by the distance sensor 11, the pressure-sensitive sensor 14, the imaging camera 10, and the microphone 12: for example, the presence or absence of a detected object (such as a user), the facial expression and state of the detected object, sounds emitted by the detected object, and actions by the detected object. The detected situation information is merged into the situation information sequence obtained so far, which is output to each of the behavior expression loop, learning loop 1, learning loop 2, and learning loop 3.

【０１７０】まず、行動発現ループについて説明する。First, the behavior expression loop will be described.

【０１７１】(2)行動検索部23が、最新の状況情報列
（現在状況Jic）に適合する初期状況を持つ連関情報
（候補連関情報）をメモリ17内に記憶される連関データ
ベース部22から抽出する(S2)。この抽出の際には、各連
関情報の状況適合度Aが計算されて評価され、所定閾値
以上を獲得した連関情報のみが候補連関情報として抽出
される。このとき、候補連関情報の状況改善効果E、遷
移確率S、実行容易度Wを含むスコア情報Sc(R)も一緒に
抽出される。抽出された候補連関情報はスコア情報とと
もに一時的にメモリ17に記憶される。(2) The action search unit 23 extracts, from the association database unit 22 stored in the memory 17, association information (candidate association information) whose initial situation matches the latest situation information sequence (current situation Jic) (S2). In this extraction, the situation matching degree A of each piece of association information is calculated and evaluated, and only association information whose matching degree is equal to or greater than a predetermined threshold is extracted as candidate association information. The score information Sc(R) of each candidate, which includes the situation improvement effect E, the transition probability S, and the ease of execution W, is extracted with it. The extracted candidate association information is temporarily stored in the memory 17 together with its score information.

【０１７２】(3)次に、入力された状況情報Jicに適合す
る連関情報（候補連関情報）が抽出されたか否かがCPU1
6にて判断される(S3)。(3) Next, the CPU 16 determines whether association information (candidate association information) matching the input situation information Jic has been extracted (S3).

【０１７３】候補連関情報が少なくとも一つ以上抽出さ
れた場合にはS4へ進み、候補連関情報が全く抽出されな
かった場合には、行動探索を行うために連関データベー
ス部22からランダムに一つの連関情報を解連関情報とし
て選択してS5に進む。If at least one piece of candidate association information has been extracted, the process proceeds to S4. If none has been extracted, one piece of association information is selected at random from the association database unit 22 as the solution association information in order to perform an action search, and the process proceeds to S5.

【０１７４】(4)候補連関情報が少なくとも一つ以上抽
出された場合には、抽出された各候補連関情報から解連
関情報が決定される(S4)。(4) If at least one candidate association information has been extracted, solution association information is determined from each extracted candidate association information (S4).

【０１７５】各候補連関情報の状況改善期待度Ep=状況
改善効果E×遷移確率S×実行容易度Wが計算され、所定
閾値以上の値を有するか否かがCPU16にて判断される。
判断の結果、状況改善期待度が所定閾値に満たない候補
連関情報はメモリ17の一時記憶から消去される。そし
て、残った候補連関情報の中で最大の状況改善期待度を
持つものを解連関情報としてS5に進む。もし、残った候
補連関情報がない場合には、行動探索を行うために連関
データベース部22からランダムに一つの連関情報を解連
関情報として選択してS5に進む。The situation improvement expectation Ep = situation improvement effect E × transition probability S × ease of execution W is calculated for each piece of candidate association information, and the CPU 16 determines whether it is equal to or greater than a predetermined threshold.
Candidate association information whose situation improvement expectation is below the threshold is deleted from temporary storage in the memory 17. The remaining candidate with the highest situation improvement expectation is then taken as the solution association information, and the process proceeds to S5. If no candidate association information remains, one piece of association information is selected at random from the association database unit 22 as the solution association information in order to perform an action search, and the process proceeds to S5.
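Step S4 can be illustrated with a short sketch that computes Ep = E × S × W, discards candidates below the threshold, and falls back to a random record when none remain. The `Candidate` type, the function names, and the injectable `rng` argument are hypothetical conveniences for the example, not part of the source.

```python
import random
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    effect: float       # situation improvement effect E
    probability: float  # transition probability S
    ease: float         # ease of execution W


def expectation(c: Candidate) -> float:
    """Situation improvement expectation Ep = E * S * W."""
    return c.effect * c.probability * c.ease


def choose_solution(
    candidates: List[Candidate],
    database: List[Candidate],
    threshold: float,
    rng=random,
) -> Candidate:
    """Pick the candidate with the highest Ep at or above the threshold.

    If no candidate qualifies, select one record from the whole database
    at random, which corresponds to the action search described in the text.
    """
    kept = [c for c in candidates if expectation(c) >= threshold]
    if kept:
        return max(kept, key=expectation)
    return rng.choice(database)
```

Passing an empty candidate list reproduces the branch from S3 in which no candidate was extracted at all.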

【０１７６】(5)次に、CPU16は解連関情報の行動情報に
記述されるタイミング情報に従って、行動出力部24に行
動情報に記述される行動素片情報を出力する(S5)。(5) Next, the CPU 16 outputs the action segment information described in the action information to the action output unit 24 in accordance with the timing information described in the action information of the solution association information (S5).

【０１７７】行動出力部24は、受け取った行動素片情報
に従って動作を開始する。なお動作とは、ロボットが駆
動されることであり、例えばLED（9）が点滅したり、ス
ピーカ13から音が発せられたり、ロボット1自体が移動
したり変形したりする等である。The action output unit 24 starts operating in accordance with the received action segment information. Here, operating means that the robot is driven: for example, the LED 9 blinks, sound is emitted from the speaker 13, or the robot 1 itself moves or changes shape.

【０１７８】解連関情報が実行された後は、S1に戻る。After the solution association information has been executed, the process returns to S1.

【０１７９】続いて、学習ループ1(信頼性学習部25、目
標外行動学習部26)について説明する。Next, the learning loop 1 (reliability learning section 25, non-target action learning section 26) will be described.

【０１８０】(6)まず、行動検索部23から行動出力部24
に出力された解連関情報に従って動作を行っているか否
かがCPU16にて判断される(S6)。(6) First, the CPU 16 determines whether the robot is performing an operation in accordance with the solution association information output from the action search unit 23 to the action output unit 24 (S6).

【０１８１】行っている場合にはS7に進み、行っていな
い場合にはS1へ戻る。S1へ戻る場合には、ロボットが何
ら動作を行っていない状態である。If the operation has been performed, the process proceeds to S7. If the operation has not been performed, the process returns to S1. When returning to S1, the robot is not performing any operation.

【０１８２】(7)ロボットが動作を行っている場合に
は、解連関情報に従った動作が終了しているか否かがCP
U16にて判断される(S7)。(7) If the robot is performing an operation, the CPU 16 determines whether the operation according to the solution association information has finished (S7).

【０１８３】終了している場合にはS8へ進み、終了して
いない場合にはS1へ戻る。S1へ戻る場合には、ロボット
が動作中である。If the processing has been completed, the flow proceeds to S8, and if not, the processing returns to S1. When returning to S1, the robot is operating.

【０１８４】(8) If the operation has finished, the CPU 16 calculates the transition achievement degree B(Jdc, Jd) between the target situation Jd described in the solution association information and the actual situation Jdc after the operation, and determines whether the target situation Jd has been reached (S8). If B(Jdc, Jd) is equal to or greater than a predetermined threshold, the transition by the selected solution association information is regarded as successful; if it is less than the threshold, the transition is regarded as failed. The content of the transition probability information Si is updated accordingly. Through this operation, reliability learning is performed on the solution association information.
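
The reliability learning of step S8 can be sketched as follows. This is only an illustration under stated assumptions: the field names, the stand-in for B(Jdc, Jd), the threshold of 0.8, and the success-count rule for the transition probability are all hypothetical, since the text says only that the transition probability information Si is updated as appropriate.

```python
from dataclasses import dataclass


@dataclass
class Association:
    """One piece of association information (field names are hypothetical)."""
    initial: tuple      # initial situation Ji
    action: str         # situation transition action Ac
    target: tuple       # target situation Jd
    successes: int = 0  # assumed bookkeeping behind the transition probability Si
    trials: int = 0

    @property
    def transition_probability(self) -> float:
        return self.successes / self.trials if self.trials else 0.0


def achievement(actual: tuple, target: tuple) -> float:
    """Assumed stand-in for B(Jdc, Jd): 1.0 when identical, lower as values diverge."""
    if not target:
        return 1.0
    diff = sum(abs(a - t) for a, t in zip(actual, target)) / len(target)
    return max(0.0, 1.0 - diff)


def reliability_learning(assoc: Association, actual: tuple, threshold: float = 0.8) -> bool:
    """Step S8: judge success against the threshold and update the statistics."""
    success = achievement(actual, assoc.target) >= threshold
    assoc.trials += 1
    if success:
        assoc.successes += 1
    return success
```

In this sketch, situations are tuples of values normalized to the range 0 to 1, which is also an assumption made for illustration.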

【０１８５】(9) Next, if the transition is regarded as having failed in S8, the CPU 16 assumes that a transition to a situation different from the target situation of the solution association information has occurred, and generates new association information (non-target action association information) in which the initial situation and the target situation of the solution association information are replaced with the actual situations (including the score information Sc) (S9). The generated association information is stored unless similar association information already exists in the association database unit 22.
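
Step S9 can be sketched as follows. The data layout and function name are hypothetical, and the test for "similar" association information is simplified here to exact equality, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One piece of association information (field names are hypothetical)."""
    initial: tuple  # situation before the action
    action: str     # action that was executed
    target: tuple   # situation after the action


def learn_non_target(db: list, failed: Association,
                     actual_initial: tuple, actual_result: tuple) -> bool:
    """Step S9: record what the failed action actually did.

    The action is kept, while the initial and target situations are replaced
    with the situations actually observed. Returns True if a new entry was stored.
    """
    candidate = Association(actual_initial, failed.action, actual_result)
    if candidate in db:  # assumption: "similar" means identical
        return False
    db.append(candidate)
    return True
```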

【０１８６】The process then returns to S1.

【０１８７】Next, learning loop 2 (others' action learning unit 27) will be described.

【０１８８】(10) First, the situation input unit 21 detects whether a detected object (in particular, the user) is performing some action (S10).

【０１８９】If an action of the detected object is detected, the process proceeds to S11; if not, the process proceeds to S1. Proceeding to S1 here means one of the following: the detected object is not within the detection range, the detected object is not performing an action, or the detected object is performing an action that the robot cannot recognize.

【０１９０】(11) If an action is detected, the CPU 16 determines, based on the situation information sequence detected by the situation input unit 21, whether the action of the detected object has finished (S11).

【０１９１】If the action has finished, the process proceeds to S12; if not, the process proceeds to S1. In the latter case, the robot continues to detect the action of the detected object until that action finishes.

【０１９２】(12) When the action has finished, the CPU 16 extracts from the situation information sequence the initial situation Ji of the detected object, the situation transition action Ac performed subsequently, and the target situation Jd achieved by the situation transition action Ac, and edits them into one set of association information (others' action association information) (S12). The edited association information is stored unless similar association information already exists in the association database unit 22.
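
The extraction in step S12 can be sketched as follows. The representation of the situation information sequence as time-ordered (situation, action) pairs is an assumption made for illustration, as are the function name and the example labels.

```python
def extract_observed_association(frames):
    """Extract (Ji, Ac, Jd) from an observed situation information sequence.

    frames: time-ordered list of (situation, action_or_None) pairs, where the
    action is None while the detected object is not acting (hypothetical format).
    Returns None if no completed action is found.
    """
    # Find the first frame in which the detected object is acting.
    start = next((i for i, (_, act) in enumerate(frames) if act is not None), None)
    if start is None or start == 0:
        return None  # no action, or no situation observed before it
    # Advance to the first frame after the action ends.
    end = start
    while end < len(frames) and frames[end][1] is not None:
        end += 1
    if end == len(frames):
        return None  # action not finished yet (S11 would go back to S1)
    ji = frames[start - 1][0]  # initial situation Ji
    ac = frames[start][1]      # situation transition action Ac
    jd = frames[end][0]        # target situation Jd reached by the action
    return (ji, ac, jd)
```

For example, a sequence in which the user is first far away, then approaches, then is near would yield the triple ("far", "approach", "near").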

【０１９３】Through learning loop 2, the robot can take in the actions of the detected object as its own new motion patterns. The process then proceeds to S1.

【０１９４】Next, learning loop 3 (situation adjustment action learning unit 28, chain action learning unit 29) will be described.

【０１９５】(13) First, the CPU 16 checks whether any association information has been newly added to the association database unit 22 (S13).

【０１９６】If new association information exists, the process proceeds to S14; if not, the process proceeds to S1.

【０１９７】New association information can be detected in either of two ways. In the first, the number of pieces of association information at the time of the previous check is stored in the memory 17 and compared with the number at the time of the current check. In the second, the indexes (identification symbols such as R0001) of all association information at the time of the previous check are stored in the memory 17 and compared one by one with the indexes at the time of the current check.
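
The two detection methods can be sketched as follows. The function names and the in-memory representation are hypothetical; only the index format (R0001 and so on) comes from the text.

```python
def has_new_by_count(previous_count: int, current_indexes: list) -> bool:
    """Method 1: compare the stored count with the current count."""
    return len(current_indexes) > previous_count


def find_new_by_index(previous_indexes: set, current_indexes: list) -> list:
    """Method 2: compare indexes one by one and return those not seen before."""
    return [idx for idx in current_indexes if idx not in previous_indexes]
```

The count comparison needs to store only one number but reports only that something was added, whereas the index comparison also identifies which entries are new.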

【０１９８】(14) At system startup, or when new association information has been added, the situation adjustment action learning unit 28 generates new association information (situation adjustment action association information) based on the newly added association information, and stores it unless similar association information already exists in the association database unit 22 (S14).

【０１９９】(15) Next, the chain action learning unit 29 generates new association information (chain action association information) based on the newly added association information, and stores it unless similar association information already exists in the association database unit 22 (S15). The process then proceeds to S1.
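
Step S15 can be sketched as follows, assuming the chaining criterion stated in the claims: two pieces of association information are chained when the situation after the first action substantially matches the situation before the second, and the pleasant emotion before the first action is lower than the pleasant emotion after the second. The data layout is hypothetical, "substantially matches" is simplified to equality, and the pleasant emotion is assumed to be the last element of each situation tuple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One piece of association information (field names are hypothetical)."""
    initial: tuple  # situation before the action(s); last element = pleasantness
    actions: tuple  # one or more actions, in execution order
    target: tuple   # situation after the action(s); last element = pleasantness


def pleasantness(situation: tuple) -> float:
    return situation[-1]


def chain_candidates(db: list) -> list:
    """Return new chained association information not already in the database."""
    out = []
    for first in db:
        for second in db:
            if first is second:
                continue
            links = first.target == second.initial  # assumption: exact match
            improves = pleasantness(first.initial) < pleasantness(second.target)
            if links and improves:
                chained = Association(first.initial,
                                      first.actions + second.actions,
                                      second.target)
                if chained not in db and chained not in out:
                    out.append(chained)
    return out
```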

【０２００】Through steps (1) to (15) above, the robot's behavior is formed and executed.

【０２０１】Specifically, the robot increases its motion patterns by performing the following four kinds of learning almost in parallel, using the abilities described above: 1. with the action search ability, when the robot does not know an effective action, it tries several actions it knows and discovers and learns their new effects; 2. with the action replication ability, it discovers and learns new actions by observing the actions of others; 3. with the situation adjustment ability, it discovers and learns actions that change a situation it cannot handle into one it can handle; and 4. with the action planning ability, it discovers and learns the new effects of executing in succession multiple actions that it previously knew only as separate actions.

【０２０２】In the first embodiment described above, the robot itself autonomously learns and increases its motion patterns and the ways of using them over time, so that it continues to be used without the user losing interest. In addition, the actions the robot performs gradually become more accurate and sophisticated, so the user can feel that the robot is growing.

【０２０３】Also, unlike the conventional art, the robot does not perform a substantially single operation in response to a specific input.

【０２０４】Also, the user does not need to newly input and store motion patterns, which improves operability and allows the robot to become ever closer to a living creature as time passes.

【０２０５】Also, by learning action patterns from a detected object, the robot can learn new actions beyond the range of its own action experience and adopt them as its own action patterns.

【０２０６】Also, because the success factors and failure factors of actions are extracted and association information is formed that removes the failure factors and incorporates the success factors, the robot can cope with situations at a higher success rate as it gains experience.

【０２０７】Also, by combining multiple action patterns into a single series of action patterns, the robot can reach situations that could not be achieved by individual actions.

【０２０８】Also, the robot's capacity for learning and growth in behavior is realized at an unprecedentedly high level, providing a robot that is rewarding for the user to use (raise) and that the user is less likely to tire of.

【０２０９】Next, the configuration of a second embodiment, in which the simulated creature system according to the present invention is implemented as a robot, will be described with reference to FIG. 5.

【０２１０】In the following embodiments, identical components are denoted by the same reference numerals, and redundant description is omitted.

【０２１１】The second embodiment is characterized by a method of forming robot behavior using a computer-readable storage medium that stores a program for operating the robot.

【０２１２】FIG. 5 is a block diagram for explaining the second embodiment. The storage medium 60 stores a program that implements the first embodiment.

【０２１３】(1) The program in the storage medium 60 is read by a personal computer 61, which is an electronic device, and stored in the memory 62. The read program is then transmitted to the robot 1 by wireless communication and stored in the memory 17 in the control unit 15 of the robot 1.

【０２１４】In this case, the personal computer 61 is provided with an insertion unit 63 into which the storage medium 60 is inserted, a CPU 64 that governs reading, storage, and arithmetic operations, a memory 62 that stores the read program, a transmission unit 65 that transmits the read program to the robot 1, and an input unit 66 such as a keyboard and a mouse. The robot 1 is provided, on part of its exterior, with a reception unit 67 that receives the transmitted operation signal. The operation signal received by the reception unit 67 is stored in the memory 17 in the control unit 15, and the robot starts operating.

【０２１５】(2) Alternatively, the storage medium can be inserted into a program reading unit 68 provided in the robot 1, so that the program is stored directly in the memory 17 in the control unit 15 of the robot 1 and the robot is operated.

【０２１６】After the program has been stored in the memory 17 of the robot 1 by (1) or (2) in this way, the robot operates in the same manner as in the first embodiment described above.

【０２１７】In the second embodiment, the type of program stored in a single robot can be changed. For example, a program for an active robot makes the robot actively perform lively behaviors such as crying often and running often, while a program for an affectionate robot makes the robot actively perform gestures and cries that seek the user's affection, so that a robot matching the user's preference can be realized. The user therefore needs to purchase only one robot and can obtain multiple kinds of enjoyment simply by changing the type of program.

【０２１８】The present invention is not limited to the embodiments described above, and it goes without saying that various modifications can be made without departing from its spirit. For example, a temperature sensor that measures the temperature around the robot may be provided on a movable part or the main body of the robot. This allows the robot to handle ambient temperature as part of the external situation.

【０２１９】The storage medium in the present invention may use any storage format and any type of physical medium, such as a magnetic disk, floppy (registered trademark) disk, hard disk, optical disk (CD-ROM, CD-R, DVD, etc.), magneto-optical disk (MO, etc.), or semiconductor memory, as long as it can store a computer program and is computer-readable.

【０２２０】Also, an OS (operating system) running on the computer, or MW (middleware) such as database management software or network software, may execute part of each process for realizing the present embodiment, based on instructions of the computer program installed in the computer from the storage medium.

【０２２１】Furthermore, the storage medium in the present invention is not limited to a medium independent of the computer. It also includes a storage medium that stores or temporarily stores a computer program for the purpose of transmission over a LAN, the Internet, or the like, and a storage medium that stores or temporarily stores a downloaded computer program transmitted over a LAN, the Internet, or the like.

【０２２２】Also, the storage medium is not limited to one. The case where the processing in the present embodiment is executed from multiple media is also included in the storage medium of the present invention, and the media may have any configuration.

【０２２３】The computer in the present invention executes each process in the present embodiment based on a program stored in the storage medium, and may have any configuration, such as a single device like a personal computer or a system in which multiple devices are connected over a network.

【０２２４】The computer in the present invention is not limited to a personal computer. It also includes an arithmetic processing unit, a microcomputer, and the like included in information processing equipment, and is a generic term for equipment and devices capable of realizing the functions of the present invention by means of a program.

【０２２５】The action forming method is stored as a computer program.

【０２２６】The simulated creature system according to the present invention includes, in addition to a robot as a physical entity, a software application that animates an anthropomorphic or creature-like character using technology such as CG (Computer Graphics) and virtually embodies it on a display screen of a computer or the like. In this case, instead of controlling the motors of the robot (1) to move each joint to a desired angle, the action output unit (24) displays on the display screen an animation in which each joint of the character is moved to the desired angle by coordinate transformation. Because the system is embodied in a computer or the like rather than a robot, the obtainable situation information and executable actions are somewhat restricted compared with an actual robot, but the action forming method according to the present invention can still be executed.

【０２２７】The information input to the non-target action learning unit may be either situation information input directly from the situation input unit or information (unsuccessful association information) input from the reliability learning unit.

【０２２８】[0228]

[Effect of the Invention] According to the present invention as described above, the simulated creature itself creates and adds new motion patterns, resulting in a simulated creature that the user does not tire of.

[Brief description of the drawings]

FIG. 1 is a perspective view of a first embodiment in which the simulated creature according to the present invention is implemented as a robot.

FIG. 2 is a block diagram of the first embodiment in which the simulated creature according to the present invention is implemented as a robot.

FIG. 3 is a block diagram for explaining the operation of the first embodiment in which the simulated creature according to the present invention is implemented as a robot.

FIG. 4 is a flowchart for explaining the action forming method of the first embodiment in which the simulated creature according to the present invention is implemented as a robot.

FIG. 5 is a block diagram for explaining a second embodiment in which the simulated creature according to the present invention is implemented as a robot.

[Explanation of symbols]

1 Robot
2 Head
3 Right arm
4 Left arm
5 Right foot
6 Left foot
7 Tail
8 Body
9 LED
10 Imaging camera
11 Capacitive distance sensor
12 Microphone
13 Speaker
14 Pressure sensor
15 Control unit
16 CPU
17 Memory
21 Situation input unit
22 Association database unit
23 Action search unit
24 Action output unit
25 Reliability learning unit
26 Non-target action learning unit
27 Others' action learning unit
28 Situation adjustment action learning unit
29 Chain action learning unit

Claims

[Claims]

1. A simulated creature device that realizes a simulated creature that moves according to a desired motion pattern, comprising: external situation input means for detecting the situation around the simulated creature as external parameter values; internal state holding means for holding a pseudo emotion of the simulated creature as internal parameter values; association database means for storing a plurality of pieces of association information, each being one set consisting of an action that causes the simulated creature to perform the motion pattern, the external parameter values and internal parameter values of the simulated creature before the action, the external parameter values and internal parameter values of the simulated creature after the action, and a transition probability of the action; action search means for selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature; and action output means for moving the simulated creature based on the selected association information, wherein the simulated creature device further comprises reliability learning means, provided inside the simulated creature device, for calculating the similarity between the external parameter values detected and the internal parameter values held after the selected association information is executed and the external parameter values and internal parameter values of the selected association information, increasing the transition probability of the selected association information when the similarity is equal to or greater than a predetermined value, and decreasing the transition probability of the selected association information when the similarity is less than the predetermined value.

2. A simulated creature device that realizes a simulated creature that moves according to a desired motion pattern, comprising: external situation input means for detecting the situation around the simulated creature as external parameter values; internal state holding means for holding a pseudo emotion of the simulated creature as internal parameter values; association database means for storing a plurality of pieces of association information, each being one set consisting of an action that causes the simulated creature to perform the motion pattern, the external parameter values and internal parameter values of the simulated creature before the action, the external parameter values and internal parameter values of the simulated creature after the action, and a transition probability of the action; action search means for selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature; and action output means for moving the simulated creature based on the selected association information, wherein the simulated creature device further comprises non-target action learning means, provided inside the simulated creature device, for calculating the similarity between the external parameter values detected and the internal parameter values held after the selected association information is executed and the external parameter values and internal parameter values of the selected association information, and, when the similarity is less than a predetermined value, generating as one set of association information the external parameter values and internal parameter values before the selected association information was executed, the action of the selected association information, and the external parameter values and internal parameter values of the simulated creature after the action, and storing the generated association information as association information if it is not stored in the association database means.

3. A simulated creature device that realizes a simulated creature that moves according to a desired motion pattern, comprising: external situation input means for detecting the situation around the simulated creature as external parameter values; internal state holding means for holding a pseudo emotion of the simulated creature as internal parameter values; association database means for storing a plurality of pieces of association information, each being one set consisting of an action that causes the simulated creature to perform the motion pattern, the external parameter values and internal parameter values of the simulated creature before the action, the external parameter values and internal parameter values of the simulated creature after the action, and a transition probability of the action; action search means for selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature; and action output means for moving the simulated creature based on the selected association information, wherein the simulated creature device further comprises others' action learning means for, when a detected object is detected by the external situation input means and the detected object performs an action, generating as one set of association information the action performed by the detected object, the external parameter values of the detected object extracted from the external parameter values detected before and after the action, and the internal parameter values of the detected object estimated from the external parameter values detected before and after the action, and storing the generated association information if it is not stored in the association database means.

4. A simulated creature device that realizes a simulated creature that moves according to a desired motion pattern, comprising: external situation input means for detecting the situation around the simulated creature as external parameter values; internal state holding means for holding a pseudo emotion of the simulated creature as internal parameter values; association database means for storing a plurality of pieces of association information, each being one set consisting of an action that causes the simulated creature to perform the motion pattern, the external parameter values and internal parameter values of the simulated creature before the action, the external parameter values and internal parameter values of the simulated creature after the action, and a transition probability of the action; action search means for selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature; and action output means for moving the simulated creature based on the selected association information, wherein the simulated creature device further comprises situation adjustment action learning means for extracting, from the association information stored in the association database means, a plurality of pieces of association information whose external parameter values and internal parameter values after the action are substantially the same as each other; taking, among the extracted pieces of association information, the external parameter values and internal parameter values before the action of the association information whose transition probability is lower than that of the other extracted association information as a first parameter value, and the external parameter values and internal parameter values before the action of the other association information as a second parameter value; generating as one set of association information the first parameter value, an action that changes the first parameter value to the second parameter value, and the external parameter values and internal parameter values after the action of the other association information; and storing the generated association information if it is not stored in the association database means.

5. A simulated creature device that realizes a simulated creature that moves according to a desired motion pattern, comprising: external situation input means for detecting the situation around the simulated creature as external parameter values; internal state holding means for holding, as internal parameter values, a pseudo emotion of the simulated creature that includes at least a pleasant emotion; association database means for storing a plurality of pieces of association information, each being one set consisting of an action that causes the simulated creature to perform the motion pattern, the external parameter values and internal parameter values of the simulated creature before the action, the external parameter values and internal parameter values of the simulated creature after the action, and a transition probability of the action; action search means for selecting, from the association information in the association database means, the motion pattern that corresponds to the external parameter values and the internal parameter values and is to be performed by the simulated creature; and action output means for moving the simulated creature based on the selected association information, wherein the simulated creature device further comprises chain action learning means for extracting, from the association information stored in the association database means, first association information and second association information whose external parameter values and internal parameter values of the simulated creature before the action substantially match the external parameter values and internal parameter values of the simulated creature after the action of the first association information; and, when the pleasant emotion in the external parameter values and internal parameter values of the simulated creature before the action of the first association information is lower than the pleasant emotion in the external parameter values and internal parameter values of the simulated creature after the action of the second association information, generating as one set of association information the external parameter values and internal parameter values of the simulated creature before the action of the first association information, the action from the first association information to the second association information, and the external parameter values and internal parameter values of the simulated creature after the action of the second association information, and storing the generated association information as association information if it is not stored in the association database means.

6. A simulated creature device for realizing a simulated creature that moves according to a desired motion pattern, comprising: external situation input means for detecting the situation around the simulated creature as external parameter values; internal state holding means for holding, as internal parameter values, a simulated emotion of the simulated creature that includes at least a pleasant feeling; association database means for storing, as one set of association information, an action that causes the simulated creature to perform the motion pattern, the external and internal parameter values of the simulated creature before the action, the external and internal parameter values of the simulated creature after the action, and a transition probability of the action; action search means for selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; and action output means for moving the simulated creature based on the selected association information; the device further comprising chain action learning means for: extracting, from the association information stored in the association database means, first association information, second association information whose external and internal parameter values of the simulated creature before the action substantially match the external and internal parameter values of the simulated creature after the action of the first association information, and third association information whose external and internal parameter values of the simulated creature before the action substantially match the external and internal parameter values of the simulated creature after the action of the second association information; and, when the pleasant feeling in the external and internal parameter values of the simulated creature before the action of the first association information is lower than the pleasant feeling in the external and internal parameter values of the simulated creature after the action of the third association information, generating, as one set of association information, the external and internal parameter values of the simulated creature before the action of the first association information, the actions running from the first association information through the second association information to the third association information, and the external and internal parameter values of the simulated creature after the action of the third association information, and storing the generated association information in the association database means if it is not already stored there.
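
The association record and the action search recited in this claim can be illustrated with a minimal sketch. The claim does not say how a "corresponding" record is chosen, so the matching tolerance `tol` and the rule of preferring the highest transition probability are assumptions made only for illustration; the names `Association` and `search_action` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One set of association information (field names are hypothetical)."""
    before: tuple        # external + internal parameter values before the action
    action: str          # action that makes the creature perform a motion pattern
    after: tuple         # external + internal parameter values after the action
    probability: float   # transition probability of the action


def search_action(db, current, tol=0.1):
    """Return the most probable record whose before-state is within tol of
    the current state, or None when nothing matches (assumed selection rule)."""
    candidates = [
        rec for rec in db
        if len(rec.before) == len(current)
        and all(abs(x - y) <= tol for x, y in zip(rec.before, current))
    ]
    return max(candidates, key=lambda rec: rec.probability, default=None)


db = [
    Association((0.0, 0.5), "sit", (0.0, 0.6), 0.4),
    Association((0.0, 0.5), "bark", (0.0, 0.8), 0.7),
    Association((1.0, 0.5), "run", (2.0, 0.5), 0.9),
]
chosen = search_action(db, (0.05, 0.5))
```

In this sketch the last element of each state tuple stands in for the pleasant feeling; that layout is also an assumption.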

7. A simulated creature device for realizing a simulated creature that moves according to a desired motion pattern, comprising: external situation input means for detecting the situation around the simulated creature as external parameter values; internal state holding means for holding, as internal parameter values, a simulated emotion of the simulated creature that includes at least a pleasant feeling; association database means for storing, as one set of association information, an action that causes the simulated creature to perform the motion pattern, the external and internal parameter values of the simulated creature before the action, the external and internal parameter values of the simulated creature after the action, and a transition probability of the action; action search means for selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; and action output means for moving the simulated creature based on the selected association information; the device further comprising chain action learning means for: extracting, from the association information stored in the association database means, (n + 1) pieces of association information forming n chains (where n is a natural number of 2 or more), one chain being a pair of first association information and second association information whose external and internal parameter values of the simulated creature before the action substantially match the external and internal parameter values of the simulated creature after the action of the first association information; and, when the pleasant feeling in the external and internal parameter values of the simulated creature before the action of the first association information is lower than the pleasant feeling in the external and internal parameter values of the simulated creature after the action of the second association information of the n-th chain, generating, as one set of association information, the external and internal parameter values of the simulated creature before the action of the first association information, the (n + 1) actions from the first association information to the (n + 1)-th association information, performed in the order in which the (n + 1) pieces of association information are arranged, and the external and internal parameter values of the simulated creature after the action of the (n + 1)-th association information, and storing the generated association information in the association database means if it is not already stored there.

8. A method for forming simulated creature behavior in a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the method comprising: a step of detecting the situation around the simulated creature as external parameter values; a step of holding a simulated emotion of the simulated creature as internal parameter values; an action search step of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; an action output step of moving the simulated creature based on the selected association information; and a reliability learning step of calculating a similarity between, on the one hand, the external parameter values detected and the internal parameter values held after the selected association information is executed and, on the other hand, the external parameter values and internal parameter values of the selected association information, increasing the transition probability of the selected association information if the similarity is equal to or greater than a predetermined value, and decreasing the transition probability of the selected association information if the similarity is less than the predetermined value.
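
The reliability learning step can be sketched as follows. The claim fixes neither the similarity measure nor the size of the probability adjustment, so the measure `1 / (1 + Euclidean distance)`, the threshold `0.8`, and the step `0.1` are assumptions chosen only for illustration. The sketch also assumes that the comparison is made against the stored post-action values of the selected record.

```python
from dataclasses import dataclass


@dataclass
class Association:
    """One set of association information (field names are hypothetical)."""
    before: tuple        # external + internal parameter values before the action
    action: str
    after: tuple         # external + internal parameter values stored for after the action
    probability: float   # transition probability


def similarity(a, b):
    """Assumed similarity in (0, 1]: 1 / (1 + Euclidean distance)."""
    dist = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return 1.0 / (1.0 + dist)


def reliability_update(rec, observed, threshold=0.8, step=0.1):
    """Raise the transition probability when the observed state resembles the
    stored outcome, otherwise lower it; the result is clamped to [0, 1]."""
    if similarity(observed, rec.after) >= threshold:
        rec.probability = min(1.0, rec.probability + step)
    else:
        rec.probability = max(0.0, rec.probability - step)
    return rec.probability


rec = Association((0.0, 0.0), "sit", (1.0, 0.5), 0.5)
raised = reliability_update(rec, (1.0, 0.5))    # outcome as stored: probability goes up
lowered = reliability_update(rec, (5.0, 5.0))   # outcome far off: probability goes down
```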

9. A method for forming simulated creature behavior in a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the method comprising: a step of detecting the situation around the simulated creature as external parameter values; a step of holding a simulated emotion of the simulated creature as internal parameter values; an action search step of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; an action output step of moving the simulated creature based on the selected association information; and a non-target behavior learning step of calculating a similarity between, on the one hand, the external parameter values detected and the internal parameter values held after the selected association information is executed and, on the other hand, the external parameter values and internal parameter values of the selected association information, and, if the similarity is less than a predetermined value, generating, as one set of association information, the external and internal parameter values before the selected association information was executed, the action of the selected association information, and the external and internal parameter values of the simulated creature after the action, and storing the generated association information in the association database means if it is not already stored there.
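
A minimal sketch of the non-target behavior learning step follows. Records are plain tuples `(before, action, after, probability)`. The similarity measure, the threshold, and the initial transition probability given to a newly stored record are assumptions, since the claim does not specify them.

```python
def similarity(a, b):
    """Assumed similarity in (0, 1]: 1 / (1 + Euclidean distance)."""
    dist = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return 1.0 / (1.0 + dist)


def learn_unintended(db, selected, before, observed,
                     threshold=0.8, initial_probability=0.5):
    """If executing `selected` led somewhere unexpected, store the outcome
    actually observed as a new record, unless an equal record already exists."""
    _, action, expected, _ = selected
    if similarity(observed, expected) >= threshold:
        return None                      # outcome matched the stored one
    candidate = (before, action, observed)
    if any(rec[:3] == candidate for rec in db):
        return None                      # already stored
    new_rec = candidate + (initial_probability,)
    db.append(new_rec)
    return new_rec


db = [((0.0,), "push", (1.0,), 0.9)]
first = learn_unintended(db, db[0], (0.0,), (5.0,))    # unexpected outcome is stored
second = learn_unintended(db, db[0], (0.0,), (5.0,))   # duplicate is rejected
third = learn_unintended(db, db[0], (0.0,), (1.0,))    # expected outcome, nothing learned
```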

10. A method for forming simulated creature behavior in a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the method comprising: a step of detecting the situation around the simulated creature as external parameter values; a step of holding a simulated emotion of the simulated creature as internal parameter values; an action search step of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; an action output step of moving the simulated creature based on the selected association information; and a step of, when an object is detected by the external situation input means and the detected object performs an action, generating, as one set of association information, the action performed by the detected object, external parameter values of the detected object extracted from the external parameter values detected before and after the action, and internal parameter values of the detected object estimated from the external parameter values detected before and after the action, and storing the generated association information in the association database means if it is not already stored there.
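
The learning from another object's action recited above can be sketched as follows. The claim does not say how the object's internal parameter values are estimated from what is observed externally, so the estimator is passed in as a caller-supplied function; `estimate_internal`, the record layout, and the initial probability are all hypothetical.

```python
def learn_from_other(db, action, ext_before, ext_after,
                     estimate_internal, initial_probability=0.5):
    """Build a record from an action observed in another object and store it
    if no equal record exists. Each state is a pair
    (external values, estimated internal values)."""
    before = (ext_before, estimate_internal(ext_before))
    after = (ext_after, estimate_internal(ext_after))
    candidate = (before, action, after)
    if any(rec[:3] == candidate for rec in db):
        return None                      # already stored
    new_rec = candidate + (initial_probability,)
    db.append(new_rec)
    return new_rec


def guess_pleasure(ext):
    """Toy estimator (assumption): the object looks pleased when its first
    observed external value is positive."""
    return (1.0 if ext[0] > 0 else 0.0,)


db = []
learned = learn_from_other(db, "wave", (0.0,), (2.0,), guess_pleasure)
again = learn_from_other(db, "wave", (0.0,), (2.0,), guess_pleasure)
```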

11. A method for forming simulated creature behavior in a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the method comprising: a step of detecting the situation around the simulated creature as external parameter values; a step of holding a simulated emotion of the simulated creature as internal parameter values; an action search step of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; an action output step of moving the simulated creature based on the selected association information; and a situation adjusting action learning step of: extracting, from the association information stored in the association database means, a plurality of pieces of association information whose external and internal parameter values after the action are substantially the same; taking, among the extracted association information, the external and internal parameter values before the action of the association information whose transition probability is lower than that of the other extracted association information as first parameter values, and the external and internal parameter values before the action of the other association information as second parameter values; generating, as one set of association information, the first parameter values, an action that changes the first parameter values into the second parameter values, and the external and internal parameter values after the action of the other association information; and storing the generated association information in the association database means if it is not already stored there.

12. A method for forming simulated creature behavior in a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the method comprising: a step of detecting the situation around the simulated creature as external parameter values; a step of holding, as internal parameter values, a simulated emotion of the simulated creature that includes at least a pleasant feeling; an action search step of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; an action output step of moving the simulated creature based on the selected association information; and a chain action learning step of: extracting, from the association information stored in the association database means, first association information and second association information whose external and internal parameter values of the simulated creature before the action substantially match the external and internal parameter values of the simulated creature after the action of the first association information; and, when the pleasant feeling in the external and internal parameter values of the simulated creature before the action of the first association information is lower than the pleasant feeling in the external and internal parameter values of the simulated creature after the action of the second association information, generating, as one set of association information, the external and internal parameter values of the simulated creature before the action of the first association information, the actions from the first association information to the second association information, and the external and internal parameter values of the simulated creature after the action of the second association information, and storing the generated association information in the association database means if it is not already stored there.
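
The two-record chain action learning step can be sketched as follows. Records are tuples `(before, actions, after, probability)` in which `actions` is a tuple of action names. Three things are assumed for illustration only: "substantially match" is modeled as an element-wise tolerance `tol`, the pleasant feeling is stored as the last element of each state, and a new record starts with an arbitrary initial probability.

```python
def close(a, b, tol):
    """Assumed test for 'substantially match': element-wise within tol."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


def learn_chains(db, tol=0.1, initial_probability=0.5):
    """Join pairs of records whose states connect and whose end state is more
    pleasant than the start state; store each joined record once."""
    known = {rec[:3] for rec in db}
    created = []
    for b1, acts1, a1, _ in list(db):
        for b2, acts2, a2, _ in list(db):
            if not close(a1, b2, tol):
                continue                 # second record does not follow the first
            if b1[-1] >= a2[-1]:
                continue                 # chain does not increase the pleasant feeling
            key = (b1, acts1 + acts2, a2)
            if key not in known:
                known.add(key)
                db.append(key + (initial_probability,))
                created.append(db[-1])
    return created


db = [
    ((0.0, 0.2), ("walk",), (1.0, 0.2), 0.9),
    ((1.0, 0.2), ("eat",), (1.0, 0.9), 0.8),
]
created = learn_chains(db)     # joins "walk" then "eat"
repeated = learn_chains(db)    # nothing new on a second pass
```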

13. A method for forming simulated creature behavior in a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the method comprising: a step of detecting the situation around the simulated creature as external parameter values; a step of holding, as internal parameter values, a simulated emotion of the simulated creature that includes at least a pleasant feeling; an action search step of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; an action output step of moving the simulated creature based on the selected association information; and a chain action learning step of: extracting, from the association information stored in the association database means, first association information, second association information whose external and internal parameter values of the simulated creature before the action substantially match the external and internal parameter values of the simulated creature after the action of the first association information, and third association information whose external and internal parameter values of the simulated creature before the action substantially match the external and internal parameter values of the simulated creature after the action of the second association information; and, when the pleasant feeling in the external and internal parameter values of the simulated creature before the action of the first association information is lower than the pleasant feeling in the external and internal parameter values of the simulated creature after the action of the third association information, generating, as one set of association information, the external and internal parameter values of the simulated creature before the action of the first association information, the actions running from the first association information through the second association information to the third association information, and the external and internal parameter values of the simulated creature after the action of the third association information, and storing the generated association information in the association database means if it is not already stored there.

14. A method for forming simulated creature behavior in a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the method comprising: a step of detecting the situation around the simulated creature as external parameter values; a step of holding, as internal parameter values, a simulated emotion of the simulated creature that includes at least a pleasant feeling; an action search step of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; an action output step of moving the simulated creature based on the selected association information; and a chain action learning step of: extracting, from the association information stored in the association database means, (n + 1) pieces of association information forming n chains (where n is a natural number of 2 or more), one chain being a pair of first association information and second association information whose external and internal parameter values of the simulated creature before the action substantially match the external and internal parameter values of the simulated creature after the action of the first association information; and, when the pleasant feeling in the external and internal parameter values of the simulated creature before the action of the first association information is lower than the pleasant feeling in the external and internal parameter values of the simulated creature after the action of the second association information of the n-th chain, generating, as one set of association information, the external and internal parameter values of the simulated creature before the action of the first association information, the (n + 1) actions from the first association information to the (n + 1)-th association information, performed in the order in which the (n + 1) pieces of association information are arranged, and the external and internal parameter values of the simulated creature after the action of the (n + 1)-th association information, and storing the generated association information in the association database means if it is not already stored there.
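
The n-chain variant can be sketched with a depth-first search that links (n + 1) records. As before, records are tuples `(before, actions, after, probability)`, the tolerance test for "substantially match", the position of the pleasant feeling as the last state element, and the initial probability are assumptions. The sketch additionally assumes that a record is not reused within one chain, which the claim does not state.

```python
def close(a, b, tol):
    """Assumed test for 'substantially match': element-wise within tol."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


def find_chains(db, n, tol):
    """Yield every list of n + 1 records in which each record's after-state
    connects to the next record's before-state."""
    def extend(chain):
        if len(chain) == n + 1:
            yield chain
            return
        for rec in db:
            if rec not in chain and close(chain[-1][2], rec[0], tol):
                yield from extend(chain + [rec])
    for first in db:
        yield from extend([first])


def learn_n_chains(db, n, tol=0.1, initial_probability=0.5):
    """Store one combined record per chain that ends more pleasant than it starts."""
    known = {rec[:3] for rec in db}
    created = []
    for chain in list(find_chains(list(db), n, tol)):
        start, end = chain[0][0], chain[-1][2]
        if start[-1] >= end[-1]:
            continue                     # no gain in pleasant feeling
        actions = tuple(a for rec in chain for a in rec[1])
        key = (start, actions, end)
        if key not in known:
            known.add(key)
            db.append(key + (initial_probability,))
            created.append(db[-1])
    return created


db = [
    ((0.0, 0.1), ("walk",), (1.0, 0.1), 0.9),
    ((1.0, 0.1), ("sit",), (1.0, 0.3), 0.8),
    ((1.0, 0.3), ("eat",), (1.0, 0.9), 0.7),
]
created = learn_n_chains(db, n=2)    # three records, two chains
repeated = learn_n_chains(db, n=2)   # nothing new on a second pass
```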

15. A computer-readable storage medium storing a program for operating a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the program causing a computer to: detect the situation around the simulated creature as external parameter values; hold a simulated emotion of the simulated creature as internal parameter values; select, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; move the simulated creature based on the selected association information; calculate a similarity between, on the one hand, the external parameter values detected and the internal parameter values held after the selected association information is executed and, on the other hand, the external parameter values and internal parameter values of the selected association information; increase the transition probability of the selected association information if the similarity is equal to or greater than a predetermined value; and decrease the transition probability of the selected association information if the similarity is less than the predetermined value.

16. A computer-readable storage medium storing a program for operating a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the program causing a computer to: detect the situation around the simulated creature as external parameter values; hold a simulated emotion of the simulated creature as internal parameter values; select, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; move the simulated creature based on the selected association information; calculate a similarity between, on the one hand, the external parameter values detected and the internal parameter values held after the selected association information is executed and, on the other hand, the external parameter values and internal parameter values of the selected association information; if the similarity is less than a predetermined value, generate, as one set of association information, the external and internal parameter values before the selected association information was executed, the action of the selected association information, and the external and internal parameter values of the simulated creature after the action; and store the generated association information in the association database means if it is not already stored there.

17. A computer-readable storage medium storing a program for operating a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the program causing a computer to: detect the situation around the simulated creature as external parameter values; hold a simulated emotion of the simulated creature as internal parameter values; select, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; move the simulated creature based on the selected association information; when an object is detected by the external situation input means and the detected object performs an action, generate, as one set of association information, the action performed by the detected object, external parameter values of the detected object extracted from the external parameter values detected before and after the action, and internal parameter values of the detected object estimated from the external parameter values detected before and after the action; and store the generated association information in the association database means if it is not already stored there.

18. A computer-readable storage medium storing a program for operating a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the program causing a computer to: detect the situation around the simulated creature as external parameter values; hold a simulated emotion of the simulated creature as internal parameter values; select, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; move the simulated creature based on the selected association information; extract, from the association information stored in the association database means, a plurality of pieces of association information whose external and internal parameter values after the action are substantially the same; take, among the extracted association information, the external and internal parameter values before the action of the association information whose transition probability is lower than that of the other extracted association information as first parameter values, and the external and internal parameter values before the action of the other association information as second parameter values; generate, as one set of association information, the first parameter values, an action that changes the first parameter values into the second parameter values, and the external and internal parameter values after the action of the other association information; and store the generated association information in the association database means if it is not already stored there.

19. A computer-readable storage medium storing a program for operating a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the program causing a computer to: detect the situation around the simulated creature as external parameter values; hold, as internal parameter values, a simulated emotion of the simulated creature that includes at least a pleasant feeling; select, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; move the simulated creature based on the selected association information; extract, from the association information stored in the association database means, first association information and second association information whose external and internal parameter values of the simulated creature before the action substantially match the external and internal parameter values of the simulated creature after the action of the first association information; when the pleasant feeling in the external and internal parameter values of the simulated creature before the action of the first association information is lower than the pleasant feeling in the external and internal parameter values of the simulated creature after the action of the second association information, generate, as one set of association information, the external and internal parameter values of the simulated creature before the action of the first association information, the actions from the first association information to the second association information, and the external and internal parameter values of the simulated creature after the action of the second association information; and store the generated association information in the association database means if it is not already stored there.

20. A computer-readable storage medium storing a program for operating a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the program causing a computer to: detect the situation around the simulated creature as external parameter values; hold, as internal parameter values, a simulated emotion of the simulated creature that includes at least a pleasant feeling; select, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; move the simulated creature based on the selected association information; extract, from the association information stored in the association database means, first association information, second association information whose external and internal parameter values of the simulated creature before the action substantially match the external and internal parameter values of the simulated creature after the action of the first association information, and third association information whose external and internal parameter values of the simulated creature before the action substantially match the external and internal parameter values of the simulated creature after the action of the second association information; when the pleasant feeling in the external and internal parameter values of the simulated creature before the action of the first association information is lower than the pleasant feeling in the external and internal parameter values of the simulated creature after the action of the third association information, generate, as one set of association information, the external and internal parameter values of the simulated creature before the action of the first association information, the actions running from the first association information through the second association information to the third association information, and the external and internal parameter values of the simulated creature after the action of the third association information; and store the generated association information in the association database means if it is not already stored there.

21. A computer-readable storage medium storing a program for operating a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the program causing a computer to: detect the situation around the simulated creature as external parameter values; hold, as internal parameter values, a simulated emotion of the simulated creature that includes at least a pleasant feeling; select, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; move the simulated creature based on the selected association information; extract, from the association information stored in the association database means, (n + 1) pieces of association information forming n chains (where n is a natural number of 2 or more), one chain being a pair of first association information and second association information whose external and internal parameter values of the simulated creature before the action substantially match the external and internal parameter values of the simulated creature after the action of the first association information; when the pleasant feeling in the external and internal parameter values of the simulated creature before the action of the first association information is lower than the pleasant feeling in the external and internal parameter values of the simulated creature after the action of the second association information of the n-th chain, generate, as one set of association information, the external and internal parameter values of the simulated creature before the action of the first association information, the (n + 1) actions from the first association information to the (n + 1)-th association information, performed in the order in which the (n + 1) pieces of association information are arranged, and the external and internal parameter values of the simulated creature after the action of the (n + 1)-th association information; and store the generated association information in the association database means if it is not already stored there.

22. A computer program for operating a simulated creature device that realizes a simulated creature moving according to a desired motion pattern and has association database means for storing the motion pattern as association information including a plurality of parameters, the program causing a computer to realize: a function of detecting the situation around the simulated creature as external parameter values; a function of holding a simulated emotion of the simulated creature as internal parameter values; a function of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter values and the internal parameter values; a function of moving the simulated creature based on the selected association information; a function of calculating a similarity between, on the one hand, the external parameter values detected and the internal parameter values held after the selected association information is executed and, on the other hand, the external parameter values and internal parameter values of the selected association information; and a function of increasing the transition probability of the selected association information if the similarity is equal to or greater than a predetermined value and decreasing the transition probability of the selected association information if the similarity is less than the predetermined value.

23. A computer program for operating a simulated creature that moves according to a desired motion pattern, and for operating a simulated creature device having association database means for storing the motion pattern as association information including a plurality of parameters, the computer program realizing: a function of detecting the situation around the simulated creature as an external parameter value; a function of holding a simulated emotion of the simulated creature as an internal parameter value; a function of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter value and the internal parameter value; a function of moving based on the selected association information; a function of calculating a similarity between the external parameter value detected and the internal parameter value held after the selected association information is executed, on the one hand, and the external parameter value and the internal parameter value of the selected association information, on the other; and a function of, when the similarity is less than a predetermined value, generating as one set of association information the external parameter value and the internal parameter value before the selected association information was executed, the action of the selected association information, and the external parameter value and the internal parameter value after the action of the simulated creature, and storing the generated association information as association information if it is not already stored in the association database means.
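As a non-limiting sketch of the function recited in claim 23, the following Python fragment records a new piece of association information when the observed outcome differs from the expected one. The similarity measure, the `threshold` default, the tuple layout of `Association`, and the function name `learn_unexpected_outcome` are assumptions introduced for illustration and do not appear in the claim.

```python
import math
from typing import List, NamedTuple, Optional, Tuple


class Association(NamedTuple):
    """One piece of association information (hypothetical layout)."""
    before: Tuple[float, ...]   # parameter values before the action
    action: str                 # identifier of the motion pattern
    after: Tuple[float, ...]    # parameter values after the action


def similarity(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    """Assumed measure: 1 / (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.dist(a, b))


def learn_unexpected_outcome(db: List[Association],
                             selected: Association,
                             before: Tuple[float, ...],
                             observed_after: Tuple[float, ...],
                             threshold: float = 0.8) -> Optional[Association]:
    """Store (before, action, observed_after) if the outcome was unexpected and new."""
    if similarity(observed_after, selected.after) >= threshold:
        return None  # outcome was as expected; nothing to learn
    candidate = Association(tuple(before), selected.action, tuple(observed_after))
    if candidate in db:
        return None  # already stored in the database
    db.append(candidate)
    return candidate
```

Here the check for an already stored entry uses exact equality; an implementation could equally use an approximate match.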

24. A computer program for operating a simulated creature that moves according to a desired motion pattern, and for operating a simulated creature device having association database means for storing the motion pattern as association information including a plurality of parameters, the computer program realizing: a function of detecting the situation around the simulated creature as an external parameter value; a function of holding a simulated emotion of the simulated creature as an internal parameter value; a function of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter value and the internal parameter value; a function of moving based on the selected association information; and a function of, when a detection target detected by the external situation input means performs an action, generating as one set of association information the action performed by the detection target, the external parameter value of the detection target extracted from the external parameter values detected before and after the action, and the internal parameter value of the detection target estimated from the external parameter values detected before and after the action, and storing the generated association information if it is not already stored in the association database means.

25. A computer program for operating a simulated creature that moves according to a desired motion pattern, and for operating a simulated creature device having association database means for storing the motion pattern as association information including a plurality of parameters, the computer program realizing: a function of detecting the situation around the simulated creature as an external parameter value; a function of holding a simulated emotion of the simulated creature as an internal parameter value; a function of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter value and the internal parameter value; a function of moving based on the selected association information; and a function of extracting, from the association information stored in the association database means, a plurality of pieces of association information having substantially the same external parameter value and internal parameter value, taking as a first parameter value the external parameter value and the internal parameter value of the extracted association information whose transition probability is lower than that of the other extracted association information, taking as a second parameter value the external parameter value and the internal parameter value before the action of the other association information, generating as one set of association information the first parameter value, an action by which the first parameter value becomes the second parameter value, and the external parameter value and the internal parameter value after the action of the other association information, and storing the generated association information as association information if it is not already stored in the association database means.

26. A computer program for operating a simulated creature that moves according to a desired motion pattern, and for operating a simulated creature device having association database means for storing the motion pattern as association information including a plurality of parameters, the computer program realizing: a function of detecting the situation around the simulated creature as an external parameter value; a function of holding a simulated emotion of the simulated creature, including at least a pleasant emotion, as an internal parameter value; a function of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter value and the internal parameter value; a function of moving based on the selected association information; and a function of extracting, from the association information stored in the association database means, first association information and second association information whose external parameter value and internal parameter value of the simulated creature before the action substantially match the external parameter value and the internal parameter value of the simulated creature after the action of the first association information, and, when the pleasant emotion in the external parameter value and the internal parameter value of the simulated creature before the action of the first association information is lower than the pleasant emotion in the external parameter value and the internal parameter value of the simulated creature after the action of the second association information, generating as one set of association information the external parameter value and the internal parameter value of the simulated creature before the action of the first association information, the action from the first association information to the second association information, and the external parameter value and the internal parameter value of the simulated creature after the action of the second association information, and storing the generated association information as association information if it is not already stored in the association database means.
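The two-step chaining recited in claim 26 might be sketched, purely for illustration, as the Python fragment below. The representation of a state as external values plus a single `pleasantness` value, the tolerance `tol` used for "substantially match", the concatenation of action names into one tuple, and all identifiers are assumptions of this sketch rather than features taken from the claim.

```python
from typing import List, NamedTuple, Tuple


class State(NamedTuple):
    external: Tuple[float, ...]  # external parameter values
    pleasantness: float          # pleasant emotion (one internal parameter)


class Association(NamedTuple):
    before: State
    action: Tuple[str, ...]      # one or more motion-pattern identifiers
    after: State


def roughly_equal(a: State, b: State, tol: float = 0.1) -> bool:
    """Assumed test for 'substantially match': every component within tol."""
    return (len(a.external) == len(b.external)
            and all(abs(x - y) <= tol for x, y in zip(a.external, b.external))
            and abs(a.pleasantness - b.pleasantness) <= tol)


def learn_two_step_chains(db: List[Association], tol: float = 0.1) -> List[Association]:
    """Join pairs first -> second when the chain ends more pleasant than it began."""
    snapshot = list(db)          # do not re-chain entries added during this pass
    learned = []
    for first in snapshot:
        for second in snapshot:
            if first is second:
                continue
            if not roughly_equal(first.after, second.before, tol):
                continue
            if first.before.pleasantness >= second.after.pleasantness:
                continue         # chain does not increase the pleasant emotion
            chained = Association(first.before,
                                  first.action + second.action,
                                  second.after)
            if chained not in db:
                db.append(chained)
                learned.append(chained)
    return learned
```

With one entry leading from a low-pleasantness state to a neutral one and a second entry leading from that neutral state to a pleasant one, the function stores a single combined entry spanning both actions.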

27. A computer program for operating a simulated creature that moves by operating a movable part according to a desired motion pattern, and for operating a simulated creature device having association database means for storing the motion pattern for operating the movable part as association information including a plurality of parameters, the computer program realizing: a function of detecting the situation around the simulated creature as an external parameter value; a function of holding a simulated emotion of the simulated creature, including at least a pleasant emotion, as an internal parameter value; a function of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter value and the internal parameter value; a function of moving based on the selected association information; and a function of extracting, from the association information stored in the association database means, first association information, second association information whose external parameter value and internal parameter value of the simulated creature before the action substantially match the external parameter value and the internal parameter value of the simulated creature after the action of the first association information, and third association information whose external parameter value and internal parameter value of the simulated creature before the action substantially match the external parameter value and the internal parameter value of the simulated creature after the action of the second association information, and, when the pleasant emotion in the external parameter value and the internal parameter value of the simulated creature before the action of the first association information is lower than the pleasant emotion in the external parameter value and the internal parameter value of the simulated creature after the action of the third association information, generating as one set of association information the external parameter value and the internal parameter value of the simulated creature before the action of the first association information, the action from the first association information to the second association information and the action from the second association information to the third association information, and the external parameter value and the internal parameter value of the simulated creature after the action of the third association information, and storing the generated association information as association information if it is not already stored in the association database means.

28. A computer program for operating a simulated creature that moves according to a desired motion pattern, and for operating a simulated creature device having association database means for storing the motion pattern as association information including a plurality of parameters, the computer program realizing: a function of detecting the situation around the simulated creature as an external parameter value; a function of holding a simulated emotion of the simulated creature, including at least a pleasant emotion, as an internal parameter value; a function of selecting, from the association information in the association database means, the motion pattern to be performed by the simulated creature in accordance with the external parameter value and the internal parameter value; a function of moving based on the selected association information; and a function of extracting, from the association information stored in the association database means, first association information and, chained n times (where n is an integer) up to (n + 1)-th association information, second association information whose external parameter value and internal parameter value of the simulated creature before the action substantially match the external parameter value and the internal parameter value of the simulated creature after the action of the preceding association information, and, when the pleasant emotion in the external parameter value and the internal parameter value of the simulated creature before the action of the first association information is lower than the pleasant emotion in the external parameter value and the internal parameter value of the simulated creature after the action of the second association information of the n-th chain, generating as one set of association information the external parameter value and the internal parameter value of the simulated creature before the action of the first association information, an action in which the (n + 1) actions from the first association information to the (n + 1)-th association information are performed in the order in which they are arranged, and the external parameter value and the internal parameter value of the simulated creature after the action of the (n + 1)-th association information, and storing the generated association information as association information if it is not already stored in the association database means.
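The general case recited in claim 28, in which (n + 1) pieces of association information are chained, could be sketched in Python along the following lines. This is only an illustration under stated assumptions: the state layout, the tolerance `tol`, the recursive search, the rule that an entry is not reused within one chain, and every identifier are hypothetical and are not specified in the claim.

```python
from typing import Iterator, List, NamedTuple, Tuple


class State(NamedTuple):
    external: Tuple[float, ...]  # external parameter values
    pleasantness: float          # pleasant emotion (one internal parameter)


class Association(NamedTuple):
    before: State
    action: Tuple[str, ...]      # motion-pattern identifiers, in execution order
    after: State


def roughly_equal(a: State, b: State, tol: float = 0.1) -> bool:
    """Assumed test for 'substantially match': every component within tol."""
    return (len(a.external) == len(b.external)
            and all(abs(x - y) <= tol for x, y in zip(a.external, b.external))
            and abs(a.pleasantness - b.pleasantness) <= tol)


def find_chains(db: List[Association], length: int,
                tol: float = 0.1) -> Iterator[List[Association]]:
    """Yield every sequence of `length` entries whose states link end to start."""
    def extend(chain: List[Association]) -> Iterator[List[Association]]:
        if len(chain) == length:
            yield chain
            return
        for nxt in db:
            # Assumption: an entry is used at most once per chain (avoids cycles).
            if nxt not in chain and roughly_equal(chain[-1].after, nxt.before, tol):
                yield from extend(chain + [nxt])
    for start in db:
        yield from extend([start])


def learn_n_chains(db: List[Association], n: int,
                   tol: float = 0.1) -> List[Association]:
    """Store one combined entry per chain of (n + 1) entries that ends more pleasant."""
    learned = []
    for chain in find_chains(list(db), n + 1, tol):
        first, last = chain[0], chain[-1]
        if first.before.pleasantness >= last.after.pleasantness:
            continue
        actions = tuple(a for link in chain for a in link.action)
        combined = Association(first.before, actions, last.after)
        if combined not in db:
            db.append(combined)
            learned.append(combined)
    return learned
```

Calling `learn_n_chains(db, 1)` reduces to the pairwise case of two chained entries, and `learn_n_chains(db, 2)` to the case of three chained entries.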