JP2006110707A - Robot device

Info

Publication number: JP2006110707A
Application number: JP2005128642A
Authority: JP
Inventors: Takeshi Takagi; 剛高木; Masahiro Fujita; 雅博藤田; Yukiko Yoshiike; 由紀子吉池; Profio Ugo Di; ディプロフィオウゴ; Takayuki Sakamoto; 隆之坂本; Tsutomu Sawada; 務澤田
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2004-09-14
Filing date: 2005-04-26
Publication date: 2006-04-27

Abstract

PROBLEM TO BE SOLVED: To realize social interaction by an entertainment robot while harmonizing behavior control software composed of a plurality of layers.

SOLUTION: The situation-dependent behavior layer selects a behavior according to the situation, such as external stimuli and changes in the internal state, and the reflexive behavior layer acts reflexively in response to external stimuli. The situation-dependent behavior layer also controls the reflexive behavior of the robot apparatus toward external stimuli: when a reflexive behavior does not match the intention of the behavior chosen for the situation, it suppresses execution of the action commands issued by the reflexive behavior layer. Conversely, when a reflexive behavior is to be expressed, it excites the reflexive behavior layer.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to a robot apparatus that operates autonomously and realizes realistic communication with a user. In particular, it relates to a behavior control system and a behavior control method for a robot, and to a robot apparatus, that select an appropriate behavior by comprehensively judging the situation in which the robot is placed, such as recognition results of the external environment (vision, hearing, and so on) and internal states (instinct, emotion, and so on).

A mechanical device that uses electrical or magnetic action to perform movements resembling those of a human is called a "robot". The word is said to derive from the Slavic word "ROBOTA" (slave machine). In Japan, robots began to spread at the end of the 1960s, but most of them were industrial robots, such as manipulators and transfer robots, intended to automate and unman production work in factories.

Recently, research and development has advanced on the structure and stable walking control of legged mobile robots. These include pet-type robots that imitate the body mechanisms and movements of quadrupedal animals such as dogs, cats, and bears, and "humanoid" robots that imitate the body mechanisms and movements of animals that walk upright on two legs, such as humans and monkeys. Expectations for their practical use are also rising. Legged mobile robots are less stable than crawler-type robots, which makes posture control and walking control more difficult, but they excel in being able to perform flexible walking and running motions such as going up and down stairs and stepping over obstacles.

One use of legged mobile robots is to take over various kinds of difficult work in industrial and production activities. Examples are dangerous or difficult tasks such as maintenance work in nuclear power plants, thermal power plants, and petrochemical plants; transporting and assembling parts in manufacturing plants; cleaning high-rise buildings; and rescue at fire sites and elsewhere.

Another use of legged mobile robots is less about the work support described above and more about close involvement in daily life, that is, "symbiosis" with humans or "entertainment". A robot of this kind faithfully reproduces the movement mechanisms of relatively intelligent legged animals such as humans, dogs (pets), and bears, and their rich emotional expression using the four limbs. It is also required not merely to execute pre-entered motion patterns faithfully, but to produce lively responses that adapt dynamically to the words and attitudes received from the user (or another robot), such as "praising", "scolding", or "hitting".

In conventional toy machines, the relationship between user operation and response is fixed, and the toy's behavior cannot be changed to suit the user's preference. As a result, the user eventually tires of a toy that only repeats the same motions. An intelligent robot, by contrast, autonomously selects behaviors consisting of dialogue, body motion, and so on, and can therefore realize realistic communication at a higher intellectual level. As a result, the user feels deep attachment and familiarity toward the robot.

In robots and other realistic dialogue systems, behaviors are generally selected sequentially in response to changes in the external environment, such as visual and auditory input. Another kind of behavior selection mechanism models emotions such as instinct and feeling to manage the internal state of the system, and selects a behavior according to changes in that internal state. Of course, the internal state of the system changes with the external environment, and it also changes when the selected behavior is expressed.

However, there are few examples of situation-dependent behavior control, in which a behavior is selected by comprehensively judging the situation in which the robot is placed, including the external environment and the internal state.

Here, the internal state consists of, for example, instinct-like elements that in a living body correspond to access to the limbic system; elements captured by ethological models, such as intrinsic desires and social desires corresponding to access to the cerebral neocortex; and elements called emotions, such as joy, sadness, anger, and surprise.

Conventional intelligent robots and other autonomous interactive robots manage the internal state one-dimensionally, lumping all internal states arising from various factors such as instinct and emotion together as "emotion". That is, the elements making up the internal state exist in parallel, and behaviors are selected from the external situation and the internal state alone, without any clear selection criterion.

In conventional systems, all behaviors exist in a single dimension for selection and expression, and the system decides which one of them to select. Consequently, as the number of behaviors grows, selection becomes more complicated, and it becomes harder to select a behavior that reflects the current situation and internal state.

Recently, systems have been proposed that model emotions such as instinct and feeling to manage the internal state of the system and select a behavior according to changes in the internal state (see, for example, Non-Patent Document 1). However, the behaviors selected for a given internal state and external stimulus are mostly fixed, and it is difficult to change them through interaction with the user or the environment.

If a robot apparatus could be equipped with a function that predicts and performs the next behavior and motion best suited to the current situation, and a function that changes the next behavior and motion based on past experience (hereinafter also called "situation-dependent behavior control"), it would give the user a greater sense of closeness and satisfaction, further improve the amusement value of the robot apparatus, and conveniently allow smooth interaction with the user.

More recently, an entertainment robot has been proposed that models emotions such as instinct and feeling to manage the internal state of the system, and that can spontaneously select a situation-dependent behavior based on two kinds of input: this internal state, and external stimuli, that is, recognition results obtained from sensor input and the like (touch-sensor sequence input, color-ball recognition, face recognition, speech recognition, and so on) (see, for example, Patent Document 1).

On the other hand, the present inventors consider that, in addition to spontaneous situation-dependent behavior, reflexive behavior triggered by the sensor input itself is also important for an entertainment robot. Examples are reflexive behaviors in response to external stimuli such as a shoulder touch sensor being pressed, an object suddenly appearing in front of the eyes, or a loud sound occurring. Reflexive behavior of this kind is basically performed almost independently of the internal state. In such a case, it is difficult to arbitrate between the two kinds of behavior, situation-dependent behavior and reflexive behavior.

Patent Document 1: Japanese Patent No. 3,558,222
Non-Patent Document 1: Tetsuya Ogata and Shigeki Sugano, "Robot Behavior Generation Based on Self-Preservation: Methodology and Realization of a Machine Model", Journal of the Robotics Society of Japan, 1997, Vol. 15, No. 5, pp. 710-721

An object of the present invention is to provide an excellent robot apparatus that can operate autonomously and realize realistic communication.

A further object of the present invention is to provide an excellent robot apparatus that can select a behavior by comprehensively judging the situation in which the robot is placed, such as recognition results of the external environment (vision, hearing, and so on) and internal states (instinct, emotion, and so on).

A further object of the present invention is to provide an excellent robot apparatus that makes the role of emotions clearer and can suitably select and execute behaviors according to external stimuli and internal states under a fixed order.

A further object of the present invention is to provide an excellent robot apparatus that can realize social interaction by an entertainment robot while harmonizing behavior control software composed of a plurality of layers.

A further object of the present invention is to provide an excellent robot apparatus that can suitably express two different kinds of behavior: spontaneous situation-dependent behavior according to external stimuli and internal states, and reflexive behavior that reacts directly to external stimuli.

A further object of the present invention is to provide an excellent robot apparatus that can suitably arbitrate between two different kinds of behavior, spontaneous situation-dependent behavior and reflexive behavior.

The present invention has been made in view of the above problems, and is a robot apparatus that generates behavior based on an internal state or an external input, comprising:
a plurality of behavior control layers that control behavior based on the internal state or the external input; and
a resource management unit that manages the resources of the robot apparatus and resolves conflicts among operation commands, issued from the respective behavior control layers, that relate to driving the robot apparatus.

Here, each behavior control layer is composed of one or more behavior modules that determine the behavior of the robot. Each behavior module comprises behavior evaluation means, which outputs an evaluation of a behavior of the robot apparatus according to the internal state or external input, and behavior command output means, which outputs a behavior command for the robot apparatus. Each behavior control layer generates the operation commands that control the behavior of the robot apparatus based on the evaluations made by the behavior evaluation means of its behavior modules.

Within each behavior control layer, the behavior modules can be organized into a tree structure according to their implementation level in the robot apparatus. The tree contains multiple branches, such as behavior models that express ethological situation-dependent behaviors as formulas, and branches for executing emotional expression. For example, in the level directly below the root behavior module, the behavior modules "Investigate", "Ingestive" (eating), and "Play" are arranged. Below "Investigate" are behavior modules describing more specific investigative behaviors: "InvestigativeLocomotion", "HeadinAirSniffing", and "InvestigativeSniffing". Likewise, below "Ingestive" are behavior modules describing more specific eating and drinking behaviors such as "Eat" and "Drink", and below "Play" are behavior modules describing more specific play behaviors such as "PlayBowing", "PlayGreeting", and "PlayPawing".

Accordingly, each behavior control layer can select a behavior module based on the behavior evaluations output from the lower-level behavior modules to the upper-level behavior modules of this tree, and thereby control the behavior of the robot apparatus.

The behavior command output means of each behavior module outputs the resources of the robot apparatus that are used when the behavior command is executed on the robot apparatus. Each behavior control layer then controls the behavior of the robot apparatus based on the behavior evaluations and the resources to be used, both of which are output from the lower-level behavior modules to the upper-level behavior modules of the tree.

In this case, the behavior evaluation unit can evaluate a plurality of behavior modules concurrently, from the top of the tree structure toward the bottom. In addition, in response to external input or changes in the internal state of the robot apparatus itself, the behavior evaluation unit evaluates each behavior module and passes execution permission, as the evaluation result, down the tree from top to bottom, so that appropriate behaviors can be selectively executed in accordance with changes in the external environment and the internal state. That is, situation-dependent behaviors can be evaluated and executed concurrently.
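
The top-down selection over the behavior tree can be illustrated with a minimal sketch. The module names come from the description above, but everything else is an assumption made for illustration: the `BehaviorModule` class, the `need` helper, the internal-state variable names, and the rule that a parent reports the best rating among its children are all hypothetical. The sketch also runs sequentially, whereas the description calls for concurrent evaluation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

State = Dict[str, float]  # hypothetical internal-state variables in [0, 1]


@dataclass
class BehaviorModule:
    name: str
    score: Optional[Callable[[State], float]] = None  # leaf-only rating function
    children: List["BehaviorModule"] = field(default_factory=list)

    def evaluate(self, state: State) -> float:
        """Leaves rate themselves; a parent reports its best child's rating upward."""
        if self.children:
            return max(child.evaluate(state) for child in self.children)
        return self.score(state) if self.score else 0.0

    def select(self, state: State) -> "BehaviorModule":
        """Hand execution permission down the tree to the best-rated branch."""
        if not self.children:
            return self
        best = max(self.children, key=lambda child: child.evaluate(state))
        return best.select(state)


def need(key: str) -> Callable[[State], float]:
    """Assumed rating rule: the rating equals one internal-state variable."""
    return lambda state: state.get(key, 0.0)


root = BehaviorModule("Root", children=[
    BehaviorModule("Investigate", children=[
        BehaviorModule("InvestigativeLocomotion", need("curiosity")),
        BehaviorModule("InvestigativeSniffing", need("smell_interest")),
    ]),
    BehaviorModule("Ingestive", children=[
        BehaviorModule("Eat", need("hunger")),
        BehaviorModule("Drink", need("thirst")),
    ]),
    BehaviorModule("Play", children=[
        BehaviorModule("PlayGreeting", need("sociability")),
    ]),
])

chosen = root.select({"hunger": 0.9, "thirst": 0.2, "curiosity": 0.4})
print(chosen.name)  # prints "Eat"
```

Reporting the maximum child rating upward is only one possible aggregation rule; the description does not say how an upper-level module combines the evaluations it receives.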

Here, the robot apparatus according to the present invention has, as the plurality of behavior control layers, a three-layer behavior control hierarchy consisting of, for example: a reflexive behavior control layer that directly receives the recognition results of external input and directly determines the output behavior; a situation-dependent behavior layer that controls behavior responsive to the situation in which the robot apparatus is placed, based on external input and the internal state of the robot apparatus; and a deliberative behavior control layer that infers future situations and plans the behavior of the robot apparatus over a relatively long term.

Because behavior selection in the reflexive behavior layer and in the situation-dependent behavior layer is performed independently, hardware resources may conflict when the behavior modules that each layer has selected are executed on the robot apparatus. Commands issued from the reflexive behavior layer are generated as reflexive behaviors, whereas commands issued from the situation-dependent behavior layer are generated as situation-dependent behaviors. Since these layers are created as separate processes or separate threads, it is difficult to arbitrate between them in advance.

Therefore, in the present invention, the resource management unit arbitrates conflicts between operation commands from the reflexive behavior layer and the situation-dependent behavior layer.

The resource management unit comprises command conflict resolution means, which resolves conflicts among commands from the reflexive behavior layer and the deliberative behavior layer, and content management means, which manages commands for each hardware resource of the robot apparatus. The command conflict resolution means holds each command, the hardware resources of the robot apparatus that the command uses, and the activity level of the command. When a new command is sent from any behavior control layer, it determines whether the hardware resources used by the currently executing command conflict with those used by the new command, and if they do, it resolves the conflict by comparing the activity levels attached to the two commands.

Accordingly, if the activity level of the currently executing command is lower than that of the new command, the executing command is canceled and the new command is executed. If, on the other hand, the activity level of the executing command is higher than that of the new command, the new command either waits for the executing command to finish and is then executed, or is canceled. Furthermore, if the activity level of the new command is higher than that of a command waiting in the resource management unit, the waiting command is canceled.
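
The conflict rules above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the `Command` and `ResourceManager` classes and the resource names are hypothetical, a tie in activity level is assumed to favor the command already running, a losing command is assumed to be queued rather than canceled, and only waiting commands that need the same resources are canceled. Promoting queued commands when a running command finishes is left out.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Command:
    name: str
    resources: FrozenSet[str]  # hardware resources the command needs
    activity: float            # activity level attached to the command


class ResourceManager:
    def __init__(self) -> None:
        self.running: List[Command] = []
        self.waiting: List[Command] = []

    def submit(self, new: Command) -> str:
        """Return "executed" or "queued" for the new command."""
        # Cancel weaker waiting commands that want the same resources.
        self.waiting = [
            w for w in self.waiting
            if not (w.resources & new.resources and w.activity < new.activity)
        ]
        rivals = [c for c in self.running if c.resources & new.resources]
        # Assumption: a tie favors the command that is already running.
        if any(c.activity >= new.activity for c in rivals):
            self.waiting.append(new)
            return "queued"
        # Every conflicting running command is weaker, so cancel them all.
        self.running = [c for c in self.running if c not in rivals]
        self.running.append(new)
        return "executed"


manager = ResourceManager()
manager.submit(Command("Walk", frozenset({"legs"}), 40.0))        # no conflict
manager.submit(Command("LookAtBall", frozenset({"head"}), 30.0))  # no conflict
manager.submit(Command("Nod", frozenset({"head"}), 20.0))         # loses, waits
manager.submit(Command("Startle", frozenset({"head", "legs"}), 70.0))
print([c.name for c in manager.running])  # prints ['Startle']
```

In this run the strong "Startle" command cancels both running commands and also removes the weaker waiting "Nod" command, which matches the three cases described above.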

The behavior evaluation means in each behavior module comprises: behavior-inducing evaluation value calculation means, which obtains, as the activity level, an evaluation value that induces the behavior command means to output a behavior command; bias calculation means, which obtains, as the intention level, a bias applied to the activity level; and used-resource calculation means, which identifies the resources of the robot apparatus that are used when the behavior command is executed. By indicating an intention level to a lower behavior control layer, an upper behavior control layer can suppress the lower layer from acting against the upper layer's intention or, conversely, excite it to act in line with that intention.

For example, the situation-dependent behavior layer outputs to the reflexive behavior layer a behavior intention signal that suppresses or excites a reflexive behavior according to whether that reflexive behavior matches the intention of the situation-dependent behavior. The reflexive behavior layer comprises behavior intention management means that manages the behavior intention signal, and it manipulates the intention level of each behavior module based on the behavior intention of the situation-dependent behavior layer. This makes it possible to suppress reflexive behaviors that run counter to the intention of the situation-dependent behavior, or to excite reflexive behaviors that match it.

Because the reflexive behavior control layer is driven directly by external stimuli, the situation-dependent behavior layer cannot sufficiently suppress reflexive behaviors if it relies on arbitration by the resource management unit alone. By notifying the reflexive behavior control layer of the intention level in advance, however, the situation-dependent behavior layer can suitably suppress reflexive behaviors that the situation-dependent behavior does not intend.
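
One way to picture the intention level is as a bias that the upper layer writes into each reflex module before any stimulus arrives. The sketch below assumes an additive bias and a fixed firing threshold; neither is specified in the description, and the `ReflexModule` and `ReflexLayer` classes, the stimulus names, and the numeric values are hypothetical.

```python
from typing import Dict, List, Optional, Set


class ReflexModule:
    def __init__(self, name: str, trigger: str, activity: float) -> None:
        self.name = name
        self.trigger = trigger      # external stimulus that fires this reflex
        self.activity = activity    # activity level when the trigger is present
        self.intention_level = 0.0  # bias written by the layer above

    def level(self, stimuli: Set[str]) -> float:
        """Biased activity level; zero when the trigger is absent."""
        if self.trigger not in stimuli:
            return 0.0
        return self.activity + self.intention_level


class ReflexLayer:
    def __init__(self, modules: List[ReflexModule], threshold: float = 50.0) -> None:
        self.modules = {m.name: m for m in modules}
        self.threshold = threshold  # assumed firing threshold

    def set_intention(self, signal: Dict[str, float]) -> None:
        """Store the behavior intention signal: negative suppresses, positive excites."""
        for name, bias in signal.items():
            self.modules[name].intention_level = bias

    def react(self, stimuli: Set[str]) -> Optional[str]:
        """Return the reflex that fires for these stimuli, or None."""
        best = max(self.modules.values(), key=lambda m: m.level(stimuli))
        return best.name if best.level(stimuli) > self.threshold else None


layer = ReflexLayer([
    ReflexModule("TurnToSound", "loud_sound", 60.0),
    ReflexModule("Flinch", "shoulder_touch", 60.0),
])
print(layer.react({"loud_sound"}))           # prints "TurnToSound"
layer.set_intention({"TurnToSound": -30.0})  # upper layer suppresses this reflex
print(layer.react({"loud_sound"}))           # prints "None"
```

Because the bias is already in place when the stimulus arrives, the suppressed reflex never issues a command, so nothing reaches the resource management unit for arbitration.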

Likewise, the deliberative behavior layer outputs to the reflexive behavior layer a behavior intention signal that suppresses or excites a situation-dependent behavior according to whether that situation-dependent behavior matches the intention of the deliberative behavior. The situation-dependent behavior layer comprises behavior intention management means that manages the behavior intention signal, and it manipulates the intention level of each behavior module based on the behavior intention of the deliberative behavior layer. This makes it possible to suppress behavior modules that run counter to the intention of the deliberative behavior, or to excite behavior modules that match it.

That is, by notifying the situation-dependent behavior control layer of the intention level in advance, the deliberative behavior layer can suitably suppress situation-dependent behaviors that the deliberative behavior does not intend.

Furthermore, suppose that all the element behaviors making up the situation-dependent behavior layer are behaviors for keeping an internal state within a certain range, that is, "homeostasis behaviors". When all internal states are sufficiently satisfied, the desire value of each element behavior becomes small. The action value (the activation level of a behavior module, used to resolve conflicts among behavior modules within a behavior control layer) therefore also becomes small, and situation-dependent behaviors are expressed less often. In that case, if there is no external stimulus either, no reflexive behavior occurs, so the robot apparatus does nothing and its entertainment value suffers. Therefore, "idle behaviors", which have no homeostatic purpose, may additionally be incorporated as components of the situation-dependent behaviors that express the spontaneous behavior of the robot apparatus.

When idle behaviors are introduced as spontaneous behaviors in this way, the situation-dependent behavior layer must arbitrate internally whether a homeostasis behavior or an idle behavior should be output. In addition, between the situation-dependent behavior layer and the reflexive behavior layer, it must be arbitrated which of the spontaneous behavior and the reflexive behavior output by the respective layers should be performed.

For example, the situation-dependent behavior layer is composed of a plurality of element behaviors that output the respective homeostasis behaviors and idle behaviors, and the reflexive behavior layer is composed of a plurality of element behaviors that output the respective reflexive behaviors. An element behavior corresponds to the behavior module described above. Each element behavior has action value calculation means that calculates an action value indicating the execution priority of that behavior. The action value calculation means calculates a desire value from the internal state of the robot apparatus, calculates an expected satisfaction value from the internal state and the recognition results, and calculates the action value of a homeostasis element behavior from the desire value and the expected satisfaction value. An idle element behavior is given a constant action value.

In this case, the situation-dependent behavior layer can be provided with behavior selection means that selects, based on the action values, the element behavior to be output as the spontaneous behavior, thereby arbitrating between homeostasis behaviors and idle behaviors.

The action value of a homeostasis behavior is calculated from the desire value and the expected satisfaction value, whereas an idle behavior is given a constant action value. Accordingly, when the desire value rises, the action value of the homeostasis behavior rises and the homeostasis behavior is selected as the situation-dependent behavior. Conversely, when all internal states are sufficiently satisfied, the action values of the homeostasis behaviors fall, and once they drop below the action value of an idle behavior, the idle behavior is selected.
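
The arbitration between homeostasis and idle behaviors can be sketched as a comparison of action values. The description does not give the formulas, so the ones below are assumptions chosen only to show the mechanism: desire is taken as the shortfall of an internal-state variable, expected satisfaction as the part of that shortfall the recognized stimulus could fill, and the action value as their average. The behavior names, variable names, and the constant `IDLE_ACTION_VALUE` are hypothetical.

```python
from typing import Dict

IDLE_ACTION_VALUE = 0.2  # assumed constant action value of the idle behavior

# Hypothetical homeostasis behaviors and the internal-state variable each restores.
HOMEOSTASIS = {"Eat": "nourishment", "Sleep": "rest"}


def action_value(level: float, supply: float) -> float:
    """Assumed formula: average of desire and expected satisfaction.

    level:  current internal-state value in [0, 1], where 1 means fully satisfied
    supply: how much the recognized stimulus could restore, in [0, 1]
    """
    desire = 1.0 - level
    expected_satisfaction = min(desire, supply)
    return 0.5 * desire + 0.5 * expected_satisfaction


def select_behavior(internal: Dict[str, float], recognized: Dict[str, float]) -> str:
    """Pick the element behavior with the highest action value."""
    values = {"Idle": IDLE_ACTION_VALUE}
    for behavior, variable in HOMEOSTASIS.items():
        values[behavior] = action_value(internal[variable], recognized.get(variable, 0.0))
    return max(values, key=values.get)


# Hungry robot that sees food: the homeostasis behavior wins.
print(select_behavior({"nourishment": 0.2, "rest": 0.9}, {"nourishment": 0.8}))  # prints "Eat"
# All internal states satisfied: the constant idle value is now the largest.
print(select_behavior({"nourishment": 0.95, "rest": 0.9}, {"nourishment": 0.8}))  # prints "Idle"
```

Whatever formula is actually used, the key property is the one shown here: the idle value does not move with the internal state, so it acts as a floor below which homeostasis behaviors stop being selected.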

Because an idle behavior has no homeostatic purpose, that is, it is unrelated to the internal state, its element behavior is given a constant action value. It is also necessary to avoid or mitigate conflicts between idle behaviors and reflexive behaviors.

Therefore, the robot apparatus according to the present invention further comprises a reflex event management unit that manages the reflex event density, that is, the density at which external input events that cause reflexive behaviors occur. The situation-dependent behavior layer obtains, from the reflex event density, the probability with which each idle element behavior is selected, and selects an element behavior according to that probability. When the reflex event density is high, idle behaviors with a small amount of motion, or reflexive behaviors, become more likely to be selected. When the reflex event density is low, idle behaviors with a large amount of motion become more likely to be selected, or reflexive behaviors become less likely to be selected. As a result, the timing at which idle behaviors are spontaneously output, and the kind of motion used, can be arbitrated according to the type and frequency of the events that cause reflexive behaviors.
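
A minimal sketch of this density-driven selection follows. The description does not define how the density or the probabilities are computed, so the sliding time window, the `max_density` normalization, the linear weighting, and the per-behavior motion amounts are all assumptions, as are the names `ReflexEventManager`, `idle_probabilities`, and `choose_idle`.

```python
import random
from collections import deque
from typing import Deque, Dict


class ReflexEventManager:
    """Tracks reflex-causing events in a sliding time window (assumed definition)."""

    def __init__(self, window: float = 10.0) -> None:
        self.window = window
        self.events: Deque[float] = deque()

    def record(self, timestamp: float) -> None:
        self.events.append(timestamp)

    def density(self, now: float) -> float:
        """Events per second over the most recent window."""
        while self.events and now - self.events[0] > self.window:
            self.events.popleft()
        return len(self.events) / self.window


def idle_probabilities(density: float, motion_amount: Dict[str, float],
                       max_density: float = 0.5) -> Dict[str, float]:
    """High density favors small motions; low density favors large motions."""
    d = min(1.0, density / max_density)  # normalized density in [0, 1]
    weights = {
        name: d * (1.0 - amount) + (1.0 - d) * amount + 1e-9
        for name, amount in motion_amount.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def choose_idle(probabilities: Dict[str, float], rng: random.Random) -> str:
    """Draw one idle behavior according to the computed probabilities."""
    names = list(probabilities)
    return rng.choices(names, weights=[probabilities[n] for n in names])[0]


manager = ReflexEventManager(window=10.0)
for t in (0.0, 1.0, 2.0, 3.0):  # the user keeps touching the robot
    manager.record(t)
motions = {"Blink": 0.1, "Stretch": 0.9}  # hypothetical motion amounts in [0, 1]
busy = idle_probabilities(manager.density(4.0), motions)
print(max(busy, key=busy.get))  # prints "Blink"
```

With many recent events the small "Blink" motion dominates, which leaves the body free for reflexes; once the events age out of the window, the larger "Stretch" motion becomes the likely choice.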

そして、本発明に係るロボット装置は、前記状況依存行動階層から出力される自発行動の要素運動と前記反射行動階層より出力される反射行動の要素運動とを調停する資源管理部をさらに備えている。前記状況依存行動階層及び前記反射行動階層はそれぞれ出力する要素行動にコマンドの強さ（資源管理部でコマンドの競合解決を行なうためのコマンドの活動度レベル）を与える。そして、前記資源管理部は、コマンドの強さに基づいて、ホメオスタシス行動と反射行動、又はアイドル行動と反射行動の間で調停を行なうことにより、２種類の異なった行動を好適に調停するようにしている。 The robot apparatus according to the present invention further includes a resource management unit that arbitrates between the elemental motions of spontaneous behavior output from the situation-dependent behavior hierarchy and the elemental motions of reflex behavior output from the reflex behavior hierarchy. The situation-dependent behavior hierarchy and the reflex behavior hierarchy each assign a command strength (an activity level that the resource management unit uses to resolve conflicts between commands) to the elemental actions they output. Based on the command strength, the resource management unit arbitrates between homeostasis behavior and reflex behavior, or between idle behavior and reflex behavior, so that the two different kinds of behavior are suitably arbitrated.

ここで、前記状況依存行動階層は、ホメオスタシス行動の要素行動に対して行動価値に基づくコマンドの強さを与え、アイドル行動の各要素行動に対してそれぞれ固有のコマンドの強さを与えるようにする。また、前記反射行動階層は、すべての要素行動にコマンドの強さとして一定値を与えるようにする。 Here, the situation-dependent behavior hierarchy assigns a command strength based on the action value to each elemental action of homeostasis behavior, and assigns its own specific command strength to each elemental action of idle behavior. The reflex behavior hierarchy assigns a constant value as the command strength to all of its elemental actions.

このような場合、ホメオスタシス行動の行動価値（欲求）が低い場合には、ロボット装置としては反射行動のトリガに反応し易くなるが、ホメオスタシス行動の行動価値が高い場合には、ロボット装置としては反射行動のトリガに反応しない（その行動に集中しているように見える）ことになる。 In this case, when the action value (desire) of the homeostasis behavior is low, the robot apparatus readily responds to triggers for reflex behavior. When the action value of the homeostasis behavior is high, the robot apparatus does not respond to triggers for reflex behavior (it appears to be concentrating on that behavior).
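
The arbitration by command strength can be sketched as below. This is a minimal sketch under assumptions: the `Command` record, the constant `REFLEX_COMMAND_STRENGTH`, the helper names, and the tie-breaking rule in favor of the spontaneous command are all hypothetical; the text states only that homeostasis commands carry a strength based on the action value and that reflex commands carry a constant strength.

```python
from dataclasses import dataclass

REFLEX_COMMAND_STRENGTH = 0.5  # hypothetical constant shared by all reflex actions


@dataclass
class Command:
    source: str      # "homeostasis", "idle", or "reflex"
    action: str      # name of the elemental action
    strength: float  # activity level used for conflict resolution


def homeostasis_command(action, action_value):
    # The strength of a homeostasis command follows its action value.
    return Command("homeostasis", action, action_value)


def reflex_command(action):
    # Every reflex command carries the same constant strength.
    return Command("reflex", action, REFLEX_COMMAND_STRENGTH)


def arbitrate(spontaneous, reflex):
    """Return the command that is granted the resources.

    Ties go to the spontaneous command (an assumption of this sketch).
    """
    if reflex is None:
        return spontaneous
    if spontaneous is None:
        return reflex
    return reflex if reflex.strength > spontaneous.strength else spontaneous
```

With a low action value such as 0.2 the reflex command wins, so the robot reacts to the trigger; with a high value such as 0.9 the homeostasis command wins, so the robot appears to concentrate on its current behavior.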

また、欲求が低い場合にはホメオスタシス的でないアイドル行動が自発的行動として選択されるのは既に述べた通りであり、この場合、資源管理部はアイドル行動と反射行動とを調停することになる。アイドル行動は、反射行動の要因となるイベント（外部入力）の種類や頻度に応じてモーションの種類が決定される。そして、アイドル行動は、ユーザからの反射イベントが多く反射イベント密度が高い場合は、ロボット装置は反射行動のトリガに反応するようになる。他方、ユーザから放置されて反射イベント密度が低い場合には、ロボット装置は反射行動のトリガに反応しない（その行動に集中しているように見える）ことになる。 As already described, when desire is low, idle behavior, which is not homeostatic, is selected as the spontaneous behavior, and in that case the resource management unit arbitrates between idle behavior and reflex behavior. For idle behavior, the type of motion is determined according to the type and frequency of the events (external inputs) that cause reflex behavior. When the user produces many reflex events and the reflex event density is high, the robot apparatus responds to triggers for reflex behavior. Conversely, when the robot is left alone by the user and the reflex event density is low, the robot apparatus does not respond to triggers for reflex behavior (it appears to be concentrating on its idle behavior).

本発明によれば、自律的な動作を行ないユーザとのリアリスティックなコミュニケーションを実現することができる、優れたロボット装置を提供することができる。 According to the present invention, it is possible to provide an excellent robot apparatus that operates autonomously and can realize realistic communication with a user.

また、本発明によれば、視覚や聴覚などの外部環境の認識結果や本能や感情などの内部状態などのロボットが置かれている状況を統合的に判断して行動を選択することができる、優れたロボット装置を提供することができる。 Further, according to the present invention, it is possible to select an action by comprehensively judging the situation where the robot is placed such as the recognition result of the external environment such as vision and hearing and the internal state such as instinct and emotion, An excellent robot apparatus can be provided.

また、本発明によれば、情動についての存在意義をより明確にして、一定の秩序の下で外部刺激や内部状態に応じた行動を好適に選択し実行することができる、優れたロボット装置を提供することができる。 Further, according to the present invention, it is possible to provide an excellent robot apparatus that clarifies the significance of emotion and can suitably select and execute actions according to external stimuli and internal states under a certain order.

また、本発明によれば、複数の階層からなる行動制御ソフトウェア間の調和をとりながら、エンタテインメント・ロボットの社会的なインタラクションを実現することができる、優れたロボット装置を提供することができる。 Furthermore, according to the present invention, it is possible to provide an excellent robot apparatus capable of realizing social interaction of entertainment robots while harmonizing behavior control software composed of a plurality of layers.

また、本発明によれば、反射行動と状況依存行動と熟考行動が別のプロセスとし、そのための制御方法を構築することで、反射行動の反応時間が状況依存行動や熟考行動によって遅延することを避けられるため、反応の遅延を気にせず視覚や聴覚などの外部環境の認識結果や本能や感情などの内部状態などのロボットが置かれている状況の記述を充実させることができ、優れたロボット装置を提供することができる。 Further, according to the present invention, reflex behavior, situation-dependent behavior, and contemplation behavior are handled as separate processes, and a control method is built for that arrangement, so that the reaction time of reflex behavior is not delayed by situation-dependent behavior or contemplation behavior. The description of the situation in which the robot is placed, such as recognition results for the external environment (vision, hearing) and internal states (instinct, emotion), can therefore be enriched without concern for reaction delay, and an excellent robot apparatus can be provided.

本発明に係るロボット装置は、自発行動の行動価値で反射行動の出力を制御することにより、ホメオスタシス行動の集中度合いを表現することができる。 The robot apparatus according to the present invention can express the degree of concentration on homeostasis behavior by controlling the output of reflex behavior through the action value of spontaneous behavior.

また、本発明に係るロボット装置は、反射イベント密度に応じてアイドル行動のコマンドの強さによって反射行動の出力を制御することにより、アイドル行動の集中度合いを表現することができる。 In addition, the robot apparatus according to the present invention can express the degree of concentration on idle behavior by controlling the output of reflex behavior through the command strength of idle behavior, which depends on the reflex event density.

また、本発明に係るロボット装置は、反射イベント密度に応じてアイドル行動の種類を制御することにより、アイドル行動の集中度合いを表現することができる。 In addition, the robot apparatus according to the present invention can express the degree of concentration on idle behavior by controlling the type of idle behavior according to the reflex event density.

また、本発明に係るロボット装置は、反射イベントが多数あるときは、なるべく動作量の大きいモーションを出さないようにすることにより、動力学的に反射行動を出力し易くすることができる。 In addition, when there are many reflex events, the robot apparatus according to the present invention avoids motions with a large amount of movement as far as possible, which makes it dynamically easier to output reflex behavior.

また、本発明に係るロボット装置は、ユーザからより明示的に反射イベントがある場合は、反射イベント密度の増加分を増やすことができる。 In addition, when the user provides a more explicit reflex event, the robot apparatus according to the present invention can apply a larger increment to the reflex event density.

本発明のさらに他の目的、特徴や利点は、後述する本発明の実施形態や添付する図面に基づくより詳細な説明によって明らかになるであろう。 Other objects, features, and advantages of the present invention will become apparent from more detailed description based on embodiments of the present invention described later and the accompanying drawings.

以下、図面を参照しながら本発明の実施形態について詳解する。 Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

Ａ．ロボット装置の構成 A. Configuration of the Robot Apparatus
図１には、本発明に実施に供されるロボット装置１の機能構成を模式的に示している。同図に示すように、ロボット装置１は、全体の動作の統括的制御やその他のデータ処理を行なう制御ユニット２０と、入出力部４０と、駆動部５０と、電源部６０とで構成される。以下、各部について説明する。 FIG. 1 schematically shows the functional configuration of a robot apparatus 1 used to carry out the present invention. As shown in the figure, the robot apparatus 1 includes a control unit 20 that performs overall control of the entire operation and other data processing, an input/output unit 40, a drive unit 50, and a power supply unit 60. Each unit is described below.

入出力部４０は、入力部としてロボット装置１の目に相当するＣＣＤ（ＣｈａｒｇｅＣｏｕｐｌｅｄＤｅｖｉｃｅ：電荷結合素子）カメラ１５や、耳に相当するマイクロフォン１６、頭部や背中などの部位に配設されてユーザの接触を感知するタッチセンサ１８、あるいは五感に相当するその他の各種のセンサを含む。また、出力部として、口に相当するスピーカ１７、あるいは点滅の組み合わせや点灯のタイミングにより顔の表情を形成するＬＥＤインジケータ（目ランプ）１９などを装備している。これら出力部は、音声やランプの点滅など、脚などによる機械運動パターン以外の形式でもロボット装置１からのユーザ・フィードバックを表現することができる。 The input / output unit 40 is disposed as an input unit at a part such as a CCD (Charge Coupled Device) camera 15 corresponding to the eyes of the robot apparatus 1, a microphone 16 corresponding to the ear, a head or a back. It includes a touch sensor 18 that senses a user's contact, or other various sensors corresponding to the five senses. Further, as an output unit, a speaker 17 corresponding to the mouth or an LED indicator (eye lamp) 19 for forming a facial expression by a combination of blinking and lighting timing is provided. These output units can express user feedback from the robot apparatus 1 in a format other than a mechanical motion pattern such as a leg or the like, such as sound or blinking of a lamp.

駆動部５０は、制御部２０が指令する所定の運動パターンに従ってロボット装置１の機体動作を実現する機能ブロックであり、行動制御による制御対象である。駆動部５０は、ロボット装置１の各関節における自由度を実現するための機能モジュールであり、それぞれの関節におけるロール、ピッチ、ヨーなど各軸毎に設けられた複数の駆動ユニットで構成される。各駆動ユニットは、所定軸回りの回転動作を行なうモータ５１と、モータ５１の回転位置を検出するエンコーダ５２と、エンコーダ５２の出力に基づいてモータ５１の回転位置や回転速度を適応的に制御するドライバ５３の組み合わせで構成される。 The drive unit 50 is a functional block that realizes the body motion of the robot apparatus 1 according to a predetermined motion pattern commanded by the control unit 20, and is the target of behavior control. The drive unit 50 is a functional module for realizing the degrees of freedom at each joint of the robot apparatus 1, and consists of a plurality of drive units provided for each axis, such as roll, pitch, and yaw, at each joint. Each drive unit is a combination of a motor 51 that rotates about a predetermined axis, an encoder 52 that detects the rotational position of the motor 51, and a driver 53 that adaptively controls the rotational position and rotational speed of the motor 51 based on the output of the encoder 52.

駆動ユニットの組み合わせ方によって、ロボット装置１を例えば２足歩行又は４足歩行などの脚式移動ロボットとして構成することができる。 Depending on how the drive units are combined, the robot apparatus 1 can be configured as a legged mobile robot such as a bipedal walking or a quadrupedal walking.

電源部６０は、その字義通り、ロボット装置１内の各電気回路などに対して給電を行なう機能モジュールである。本実施形態に係るロボット装置１は、バッテリを用いた自律駆動式であり、電源部６０は、充電バッテリ６１と、充電バッテリ６１の充放電状態を管理する充放電制御部６２とで構成される。充電バッテリ６１は、例えば、複数本のリチウムイオン２次電池セルをカートリッジ式にパッケージ化した「バッテリ・パック」の形態で構成される。また、充放電制御部６２は、バッテリ６１の端子電圧や充電／放電電流量、バッテリ６１の周囲温度などを測定することでバッテリ６１の残存容量を把握し、充電の開始時期や終了時期などを決定する。充放電制御部６２が決定する充電の開始及び終了時期は制御ユニット２０に通知され、ロボット装置１が充電オペレーションを開始及び終了するためのトリガとなる。 The power supply unit 60 is, as its name suggests, a functional module that supplies power to the electrical circuits in the robot apparatus 1. The robot apparatus 1 according to the present embodiment is an autonomously driven, battery-powered type, and the power supply unit 60 consists of a rechargeable battery 61 and a charge/discharge control unit 62 that manages the charge/discharge state of the rechargeable battery 61. The rechargeable battery 61 is configured, for example, as a “battery pack” in which a plurality of lithium-ion secondary battery cells are packaged in cartridge form. The charge/discharge control unit 62 determines the remaining capacity of the battery 61 by measuring its terminal voltage, its charge/discharge current, its ambient temperature, and so on, and decides when charging should start and end. The start and end times decided by the charge/discharge control unit 62 are reported to the control unit 20 and serve as triggers for the robot apparatus 1 to start and end the charging operation.

制御ユニット２０は、「頭脳」に相当し、例えばロボット装置１の機体頭部あるいは胴体部に搭載されている。 The control unit 20 corresponds to a “brain”, and is mounted on, for example, the body head or the trunk of the robot apparatus 1.

図２には、制御ユニット２０の構成をさらに詳細に図解している。同図に示すように、制御ユニット２０は、メイン・コントローラとしてのＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）２１が、メモリやその他の各回路コンポーネントや周辺機器とバス接続された構成となっている。バス２８は、データ・バス、アドレス・バス、コントロール・バスなどを含む共通信号伝送路である。バス２８上の各装置にはそれぞれに固有のアドレス（メモリ・アドレス又はＩ／Ｏアドレス）が割り当てられている。ＣＰＵ２１は、アドレスを指定することによってバス２８上の特定の装置と通信することができる。 FIG. 2 illustrates the configuration of the control unit 20 in more detail. As shown in the figure, the control unit 20 has a configuration in which a CPU (Central Processing Unit) 21 serving as the main controller is connected by a bus to memory, other circuit components, and peripheral devices. The bus 28 is a common signal transmission path that includes a data bus, an address bus, a control bus, and the like. Each device on the bus 28 is assigned its own unique address (a memory address or an I/O address). The CPU 21 can communicate with a specific device on the bus 28 by specifying its address.

ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）２２は、ＤＲＡＭ（ＤｙｎａｍｉｃＲＡＭ）などの揮発性メモリで構成された書き込み可能メモリであり、ＣＰＵ２１が実行するプログラム・コードをロードしたり、実行プログラムによる作業データの一時的な保存したりするために使用される。また、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）２３は、プログラムやデータを恒久的に格納する読み出し専用メモリである。ＲＯＭ２３に格納されるプログラム・コードには、ロボット装置１の電源投入時に実行する自己診断テスト・プログラムや、ロボット装置１の動作を規定する動作制御プログラムなどが挙げられる。 A RAM (Random Access Memory) 22 is a writable memory composed of a volatile memory such as a DRAM (Dynamic RAM), and loads a program code executed by the CPU 21 or temporarily stores work data by the execution program. Used to save. A ROM (Read Only Memory) 23 is a read-only memory that permanently stores programs and data. Examples of the program code stored in the ROM 23 include a self-diagnosis test program that is executed when the robot apparatus 1 is powered on, and an operation control program that defines the operation of the robot apparatus 1.

ロボット装置１の制御プログラムには、カメラ１５やマイクロフォン１６などのセンサ入力を処理してシンボルとして認識する「センサ入力・認識処理プログラム」、短期記憶や長期記憶などの記憶動作（後述）を司りながらセンサ入力と所定の行動制御モデルとに基づいてロボット装置１の行動を制御する「行動制御プログラム」、行動制御モデルに従って各関節モータの駆動やスピーカ１７の音声出力などを制御する「駆動制御プログラム」などが含まれる。 The control program for the robot apparatus 1 is a “sensor input / recognition processing program” that processes sensor inputs such as the camera 15 and the microphone 16 and recognizes them as symbols, and performs storage operations (described later) such as short-term memory and long-term memory. A “behavior control program” for controlling the behavior of the robot apparatus 1 based on the sensor input and a predetermined behavior control model, and a “drive control program” for controlling the driving of each joint motor and the sound output of the speaker 17 according to the behavior control model. Etc. are included.

不揮発性メモリ２４は、例えばＥＥＰＲＯＭ（ＥｌｅｃｔｒｉｃａｌｌｙＥｒａｓａｂｌｅａｎｄＰｒｏｇｒａｍｍａｂｌｅＲＯＭ）のように電気的に消去再書き込みが可能なメモリ素子で構成され、逐次更新すべきデータを不揮発的に保持するために使用される。逐次更新すべきデータには、暗号鍵やその他のセキュリティ情報、出荷後にインストールすべき装置制御プログラムなどが挙げられる。 The nonvolatile memory 24 is composed of a memory element that can be electrically erased and rewritten, such as an EEPROM (Electrically Erasable and Programmable ROM), and is used to hold data to be sequentially updated in a nonvolatile manner. Data to be updated sequentially includes an encryption key and other security information, a device control program to be installed after shipment, and the like.

インターフェース２５は、制御ユニット２０外の機器と相互接続し、データ交換を可能にするための装置である。インターフェース２５は、例えば、カメラ１５やマイクロフォン１６、スピーカ１７との間でデータ入出力を行なう。また、インターフェース２５は、駆動部５０内の各ドライバ５３−１…との間でデータやコマンドの入出力を行なう。 The interface 25 is a device for interconnecting with devices outside the control unit 20 and enabling data exchange. The interface 25 performs data input / output with the camera 15, the microphone 16, and the speaker 17, for example. The interface 25 inputs and outputs data and commands to and from the drivers 53-1.

また、インターフェース２５は、ＲＳ（ＲｅｃｏｍｍｅｎｄｅｄＳｔａｎｄａｒｄ）−２３２Ｃなどのシリアル・インターフェース、ＩＥＥＥ（ＩｎｓｔｉｔｕｔｅｏｆＥｌｅｃｔｒｉｃａｌａｎｄＥｌｅｃｔｒｏｎｉｃｓＥｎｇｉｎｅｅｒｓ）１２８４などのパラレル・インターフェース、ＵＳＢ（ＵｎｉｖｅｒｓａｌＳｅｒｉａｌＢｕｓ）インターフェース、ｉ−Ｌｉｎｋ（ＩＥＥＥ１３９４）インターフェース、ＳＣＳＩ（ＳｍａｌｌＣｏｍｐｕｔｅｒＳｙｓｔｅｍＩｎｔｅｒｆａｃｅ）インターフェース、ＰＣカードやメモリ・スティックを受容するメモリ・カード・インターフェース（カード・スロット）などのような、コンピュータの周辺機器接続用の汎用インターフェースを備え、ローカル接続された外部機器との間でプログラムやデータの移動を行なうようにしてもよい。 The interface 25 may also include general-purpose interfaces for connecting computer peripherals, such as a serial interface such as RS (Recommended Standard)-232C, a parallel interface such as IEEE (Institute of Electrical and Electronics Engineers) 1284, a USB (Universal Serial Bus) interface, an i-Link (IEEE 1394) interface, a SCSI (Small Computer System Interface) interface, or a memory card interface (card slot) that accepts a PC card or a Memory Stick, so that programs and data can be transferred to and from locally connected external devices.

また、インターフェース２５の他の例として、赤外線通信（ＩｒＤＡ）インターフェースを備え、外部機器と無線通信を行なうようにしてもよい。さらに、制御ユニット２０は、無線通信インターフェース２６やネットワーク・インターフェース・カード（ＮＩＣ）２７などを含み、Ｂｌｕｅｔｏｏｔｈのような近接無線データ通信や、ＩＥＥＥ８０２．１１ｂのような無線ネットワーク、あるいはインターネットなどの広域ネットワークを経由して、外部のさまざまなホスト・コンピュータとデータ通信を行なうことができる。 As another example of the interface 25, an infrared communication (IrDA) interface may be provided to perform wireless communication with an external device. Furthermore, the control unit 20 includes a wireless communication interface 26, a network interface card (NIC) 27, and the like, and is used for close proximity wireless data communication such as Bluetooth, a wireless network such as IEEE 802.11b, or a wide area such as the Internet. Data communication can be performed with various external host computers via the network.

このようなロボット装置１とホスト・コンピュータ間におけるデータ通信により、遠隔のコンピュータ資源を用いて、ロボット装置１の複雑な動作制御を演算したり、リモート・コントロールしたりすることができる。 By such data communication between the robot apparatus 1 and the host computer, complex operation control of the robot apparatus 1 can be calculated or remotely controlled using remote computer resources.

Ｂ．ロボット装置の行動制御システム B. Behavior Control System of the Robot Apparatus
図３には、本発明の実施形態に係るロボット装置１の行動制御システム１００の機能構成を模式的に示している。ロボット装置１は、外部刺激の認識結果や内部状態の変化に応じて行動制御を行なうことができる。さらには、長期記憶機能を備え、外部刺激から内部状態の変化を連想記憶することにより、外部刺激の認識結果や内部状態の変化に応じて行動制御を行なうことができる。 FIG. 3 schematically shows the functional configuration of a behavior control system 100 of the robot apparatus 1 according to an embodiment of the present invention. The robot apparatus 1 can perform behavior control according to the recognition results of external stimuli and changes in its internal state. Furthermore, it has a long-term memory function and associatively stores changes in the internal state caused by external stimuli, so that it can perform behavior control according to the recognition results of external stimuli and changes in the internal state.

図示の行動制御システム１００にはオブジェクト指向プログラミングを採り入れて実装することができる。この場合、各ソフトウェアは、データとそのデータに対する処理手続きとを一体化させた「オブジェクト」というモジュール単位で扱われる。また、各オブジェクトは、メッセージ通信と共有メモリを使ったオブジェクト間通信方法によりデータの受け渡しとＩｎｖｏｋｅを行なうことができる。 The illustrated behavior control system 100 can adopt and implement object-oriented programming. In this case, each software is handled in units of modules called “objects” in which data and processing procedures for the data are integrated. In addition, each object can perform data transfer and invoke using message communication and an inter-object communication method using a shared memory.

行動制御システム１００は、外部環境（Ｅｎｖｉｒｏｎｍｅｎｔｓ）を認識するために、視覚認識機能部１０１と、聴覚認識機能部１０２と、接触認識機能部１０３を備えている。 The behavior control system 100 includes a visual recognition function unit 101, an auditory recognition function unit 102, and a contact recognition function unit 103 in order to recognize an external environment (Environments).

視覚認識機能部（Ｖｉｄｅｏ）１０１は、例えば、ＣＣＤカメラ１５のような画像入力装置を介して入力された撮影画像を基に、顔認識や色認識などの画像認識処理や特徴抽出を行なう。視覚認識機能部１０１は、後述する“ＭｕｌｔｉＣｏｌｏｒＴｒａｃｋｅｒ”、“ＦａｃｅＤｅｔｅｃｔｏｒ”、“ＦａｃｅＩｄｅｎｔｉｆｙ”といった複数のオブジェクトで構成される。 The visual recognition function unit (Video) 101 performs image recognition processing, such as face recognition and color recognition, and feature extraction based on captured images input through an image input device such as the CCD camera 15. The visual recognition function unit 101 consists of a plurality of objects, such as “MultiColorTracker”, “FaceDetector”, and “FaceIdentify”, which are described later.

聴覚認識機能部（Ａｕｄｉｏ）１０２は、マイクなどの音声入力装置を介して入力される音声データを音声認識して、特徴抽出したり、単語セット（テキスト）認識を行ったりする。聴覚認識機能部１０２は、後述する“ＡｕｄｉｏＲｅｃｏｇ”や“ＡｕｔｈｕｒＤｅｃｏｄｅｒ”といった複数のオブジェクトで構成される。 The auditory recognition function unit (Audio) 102 performs speech recognition on audio data input through an audio input device such as a microphone, and carries out feature extraction and word set (text) recognition. The auditory recognition function unit 102 consists of a plurality of objects, such as “AudioRecog” and “AuthurDecoder”, which are described later.

接触認識機能部（Ｔａｃｔｉｌｅ）１０３は、例えば機体の頭部などに内蔵された接触センサによるセンサ信号を認識して、「なでられた」とか「叩かれた」という外部刺激を認識する。 The contact recognition function unit (Tactile) 103 recognizes sensor signals from a contact sensor built into, for example, the head of the robot body, and thereby recognizes external stimuli such as “stroked” or “hit”.

内部状態管理部（ＩＳＭ：ＩｎｔｅｒｎａｌＳｔａｔｕｓＭａｎａｇｅｒ）１０４は、本能や感情といった数種類の情動を数式モデル化して管理しており、上述の視覚認識機能部１０１と、聴覚認識機能部１０２と、接触認識機能部１０３によって認識された外部刺激（ＥＳ：ＥｘｔｅｒｎａｌＳｔｉｍｕｌａ）に応じてロボット装置１の本能や情動といった内部状態を管理する。 An internal status management unit (ISM: Internal Status Manager) 104 manages several types of emotions such as instinct and emotion by modeling them, and includes the above-described visual recognition function unit 101, auditory recognition function unit 102, and contact recognition function. The internal state such as instinct and emotion of the robot apparatus 1 is managed according to an external stimulus (ES: External Stimula) recognized by the unit 103.

感情モデルと本能モデルは、それぞれ認識結果と行動履歴を入力に持ち、感情値と本能値を管理している。行動モデルは、これら感情値や本能値を参照することができる。 The emotion model and the instinct model have the recognition result and the action history as inputs, respectively, and manage the emotion value and the instinct value. The behavior model can refer to these emotion values and instinct values.

本実施形態では、情動についてその存在意義による複数階層で構成され、それぞれの階層で動作する。決定された複数の動作から、そのときの外部環境や内部状態によってどの動作を行なうかを決定するようになっている（後述）。また、それぞれの階層で行動は選択されるが、より低次の行動から優先的に動作を発現していくことにより、反射などの本能的行動や、記憶を用いた動作選択などの高次の行動を１つの個体上で矛盾なく発現することができる。 In the present embodiment, the emotion is composed of a plurality of hierarchies depending on the significance of existence, and operates in each hierarchy. From a plurality of determined operations, which operation is to be performed is determined according to the external environment and internal state at that time (described later). In addition, actions are selected at each level, but by expressing actions preferentially from lower-order actions, higher-order actions such as instinct actions such as reflexes and action selection using memory Behavior can be expressed consistently on one individual.

本実施形態に係るロボット装置１は、外部刺激の認識結果や内部状態の変化に応じて行動制御を行なうために、時間の経過とともに失われる短期的な記憶を行なう短期記憶部１０５と、情報を比較的長期間保持するための長期記憶部１０６を備えている。短期記憶と長期記憶という記憶メカニズムの分類は神経心理学に依拠する。 The robot apparatus 1 according to the present embodiment includes a short-term storage unit 105 that performs short-term storage that is lost over time, and information in order to perform behavior control according to the recognition result of the external stimulus and the change in the internal state. A long-term storage unit 106 for holding for a relatively long time is provided. The classification of memory mechanisms, short-term memory and long-term memory, relies on neuropsychology.

短期記憶部（ＳｈｏｒｔＴｅｒｍＭｅｍｏｒｙ）１０５は、上述の視覚認識機能部１０１と、聴覚認識機能部１０２と、接触認識機能部１０３によって外部環境から認識されたターゲットやイベントを短期間保持する機能モジュールである。例えば、カメラ１５からの入力画像を約１５秒程度の短い期間だけ記憶する。 A short-term memory unit (ShortTermMemory) 105 is a functional module that holds targets and events recognized from the external environment by the visual recognition function unit 101, the auditory recognition function unit 102, and the contact recognition function unit 103 for a short period of time. For example, the input image from the camera 15 is stored for a short period of about 15 seconds.
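
The idea of holding recognized items only for a short period can be sketched as a store with a retention window. This is an assumption-laden illustration: the class below, its method names, and the expiry-on-recall policy are hypothetical, and only the figure of roughly 15 seconds comes from the text.

```python
class ShortTermMemory:
    """Keeps items for a fixed retention period and forgets older ones."""

    def __init__(self, retention_s=15.0):
        self.retention_s = retention_s  # about 15 seconds, as in the example above
        self._items = []                # list of (timestamp, item) pairs

    def store(self, item, now):
        self._items.append((now, item))

    def recall(self, now):
        # Drop everything older than the retention period, then return the rest.
        self._items = [
            (t, item) for t, item in self._items if now - t <= self.retention_s
        ]
        return [item for _, item in self._items]
```

Passing the current time explicitly keeps the sketch deterministic; a real system would more likely read a clock internally.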

長期記憶部（ＬｏｎｇＴｅｒｍＭｅｍｏｒｙ）１０６は、物の名前など学習により得られた情報を長期間保持するために使用される。長期記憶部１０６は、例えば、ある行動モジュールにおいて外部刺激から内部状態の変化を連想記憶することができる。 A long term memory unit (LongTermMemory) 106 is used to hold information obtained by learning such as the name of an object for a long period of time. For example, the long-term storage unit 106 can associatively store a change in the internal state from an external stimulus in a certain behavior module.

また、本実施形態に係るロボット装置１の行動制御は、反射行動部１０９によって実現される「反射行動」と、状況依存行動階層１０８によって実現される「状況依存行動」と、熟考行動階層１０７によって実現される「熟考行動」に大別される。 Also, the behavior control of the robot apparatus 1 according to the present embodiment is performed by the “reflection behavior” realized by the reflection behavior unit 109, the “situation dependence behavior” realized by the situation-dependent behavior hierarchy 108, and the contemplation behavior hierarchy 107. It can be roughly divided into "contemplation behavior" to be realized.

反射的行動部（ＲｅｆｌｅｘｉｖｅＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒ）１０９は、上述の視覚認識機能部１０１と、聴覚認識機能部１０２と、接触認識機能部１０３によって認識された外部刺激に応じて反射的な機体動作を実現する機能モジュールである。 A reflexive behavior unit (Reflective Situated Behaviors Layer) 109 is a functional module that realizes a reflective body operation in response to an external stimulus recognized by the visual recognition function unit 101, the auditory recognition function unit 102, and the contact recognition function unit 103 described above. It is.

反射行動とは、基本的に、センサ入力そのものをトリガとして行なう反射的な行動のことであり、すなわち、センサ入力された外部情報の認識結果を直接受けて、これを分類して、出力行動を直接決定する。例えば、人間の顔を追いかけたり、うなずいたりといった振る舞いは反射行動として実装することが好ましい。反射行動部１０９は、状況依存行動階層１０８に比べ、十分な速さで制御サイクルが実行される。 The reflex behavior is basically a reflex behavior that is triggered by the sensor input itself, that is, it receives the recognition result of the external information input from the sensor directly, classifies it, and outputs the behavior. Decide directly. For example, a behavior such as chasing a human face or nodding is preferably implemented as a reflex behavior. The reflex action unit 109 executes the control cycle at a sufficient speed as compared with the situation-dependent action layer 108.

状況依存行動階層（ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒ）１０８は、短期記憶部１０５並びに長期記憶部１０６の記憶内容や、内部状態管理部１０４によって管理される内部状態を基に、ロボット装置１が現在置かれている状況に即応した行動を制御する。 The situation-dependent behavior hierarchy (Situated Behaviors Layer) 108 is based on the storage contents of the short-term storage unit 105 and the long-term storage unit 106 and the internal state managed by the internal state management unit 104. Control responsive behavior.

状況依存行動階層１０８は、行動毎にステートマシンを用意しており、それ以前の行動や状況に依存して、センサ入力された外部情報の認識結果を分類して、行動を機体上で発現する。また、状況依存行動階層１０８は、内部状態をある範囲に保つための行動（「ホメオスタシス行動」とも呼ぶ）も実現し、内部状態が指定した範囲内を越えた場合には、その内部状態を当該範囲内に戻すための行動が出現し易くなるようにその行動を活性化させる（実際には、内部状態と外部環境の両方を考慮した形で行動が選択される）。状況依存行動は、反射行動に比し、反応時間が遅い。 The situation-dependent behavior hierarchy 108 provides a state machine for each behavior, classifies the recognition results of external information input from the sensors depending on the preceding behavior and situation, and expresses the behavior on the robot body. The situation-dependent behavior hierarchy 108 also realizes behavior for keeping the internal state within a certain range (also called “homeostasis behavior”). When the internal state goes outside the specified range, it activates the behavior that returns the internal state to that range so that the behavior is more likely to appear (in practice, behavior is selected in a way that takes both the internal state and the external environment into account). Situation-dependent behavior has a slower reaction time than reflex behavior.
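
The activation of a homeostasis behavior when an internal state leaves its range can be pictured with the small sketch below. The function name, the linear form, and the `gain` parameter are assumptions for illustration; the text says only that the behavior becomes more likely to appear once the internal state is outside the specified range, and that the external environment is also taken into account in practice.

```python
def homeostasis_activation(level, low, high, gain=1.0):
    """Return an activation that grows with the distance outside [low, high].

    Inside the range the activation is zero, so the corresponding
    behavior receives no extra push toward being selected.
    """
    if level < low:
        return gain * (low - level)
    if level > high:
        return gain * (level - high)
    return 0.0
```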

熟考行動階層（ＤｅｌｉｂｅｒａｔｉｖｅＬａｙｅｒ）１０７は、短期記憶部１０５並びに長期記憶部１０６の記憶内容に基づいて、ロボット装置１の比較的長期にわたる行動計画などを行なう。一般に、ロボット装置が状況に即応した行動に適用すればするほど、全体としては刹那的な行動の集合に陥り易くなってしまう。本実施形態では、その状況に即応した行動を発現できるだけでなく、熟考行動階層１０７によって、その先の状況を先読み（すなわち推論）して行動の計画を立てるという熟考行動を制御する。熟考行動階層１０７の詳細については後述に譲る。 The contemplation behavior hierarchy (Deliberative Layer) 107 performs relatively long-term behavior planning and the like for the robot apparatus 1 based on the contents of the short-term memory unit 105 and the long-term memory unit 106. In general, the more a robot apparatus adapts its behavior to the immediate situation, the more its behavior as a whole tends to become a collection of momentary actions. In the present embodiment, the robot can not only express behavior that responds immediately to the situation. The contemplation behavior hierarchy 107 also controls contemplation behavior, in which the robot looks ahead at (that is, infers) the coming situation and plans its behavior. Details of the contemplation behavior hierarchy 107 are given later.

ここで言う熟考行動とは、与えられた状況あるいは人間からの命令により、推論やそれを実現するための計画を立てて行なわれる行動のことである。例えば、ロボットの位置と目標の位置から経路を探索することは熟考行動に相当する。このような推論や計画は、ロボット装置１がインタラクションを保つための反応時間よりも処理時間や計算負荷を要する（すなわち処理時間がかかる）可能性があるので、上記の反射行動や状況依存行動がリアルタイムで反応を返しながら、熟考行動は推論や計画を行なう。 Contemplation behavior here means behavior that is carried out by making inferences, and plans for realizing them, in response to a given situation or a command from a human. For example, searching for a route from the position of the robot to the position of a target corresponds to contemplation behavior. Such inference and planning may require more processing time and computational load than the reaction time the robot apparatus 1 needs to maintain interaction, so the reflex behavior and situation-dependent behavior described above return responses in real time while contemplation behavior carries out the inference and planning.
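
The route search mentioned as an example of contemplation behavior can be illustrated with a breadth-first search on a grid. This is a generic textbook technique chosen for illustration; the grid encoding, the function name `find_route`, and the four-neighbor movement model are assumptions, and the patent text does not say which search method is used.

```python
from collections import deque


def find_route(grid, start, goal):
    """Breadth-first search for a shortest route on a grid.

    grid is a list of equal-length strings in which '#' marks a blocked
    cell. start and goal are (row, col) tuples. Returns the list of cells
    from start to goal, or None when the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            route = []
            while cell is not None:  # walk back through the parents
                route.append(cell)
                cell = parents[cell]
            return route[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    return None
```

A search like this can take longer than a reflex reaction on a large map, which is the reason the text gives for running planning alongside the faster layers rather than in their control loop.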

熟考行動階層１０７や状況依存行動階層１０８、反射行動部１０９は、ロボット装置１のハードウェア構成に非依存の上位のアプリケーション・プログラムとして記述することができる。これに対し、ハードウェア依存層制御部（ＣｏｎｆｉｇｕｒａｔｉｏｎＤｅｐｅｎｄｅｎｔＡｃｔｉｏｎｓＡｎｄＲｅａｃｔｉｏｎｓ）１１０は、これら上位アプリケーション（「スキーマ」と呼ばれる行動モジュール）からの命令に応じて、関節アクチュエータの駆動などの機体のハードウェア（外部環境）を直接操作する。 The contemplation behavior layer 107, the situation-dependent behavior layer 108, and the reflex behavior unit 109 can be described as higher-level application programs that are independent of the hardware configuration of the robot apparatus 1. On the other hand, the hardware dependent layer control unit (ConfigurationDependentActionsAndReactions) 110 performs hardware (external environment) such as joint actuator driving in accordance with commands from these higher-level applications (behavior modules called “schema”). Operate directly.

Ｃ．ロボット装置の記憶メカニズム C. Memory Mechanism of the Robot Apparatus
上述したように、本実施形態に係るロボット装置１は、短期記憶部１０５と長期記憶部１０６を備えているが、このような記憶メカニズムは、神経心理学に依拠する。 As described above, the robot apparatus 1 according to the present embodiment includes the short-term memory unit 105 and the long-term memory unit 106, and this memory mechanism draws on neuropsychology.

短期記憶は、字義通り短期的な記憶であり、時間の経過とともに失われる。短期記憶は、例えば、視覚や聴覚、接触など、外部環境から認識されたターゲットやイベントを短期間保持するために使用することができる。 Short-term memory is literally short-term memory and is lost over time. Short-term memory can be used to hold targets and events recognized from the external environment, such as vision, hearing, and touch, for a short period of time.

短期記憶は、さらに、感覚情報（すなわちセンサからの出力）をそのままの信号で１秒程度保持する「感覚記憶」と、感覚記憶をエンコードして限られた容量で短期的に記憶する「直接記憶」と、状況変化や文脈を数時間に渡って記憶する「作業記憶」に分類することができる。直接記憶は、神経心理学的な研究によれば７±２チャンクであると言われている。また、作業記憶は、短期記憶と長期記憶との対比で、「中間記憶」とも呼ばれる。 Short-term memory can be further classified into “sensory memory”, which holds sensory information (that is, sensor output) as a raw signal for about one second; “direct memory”, which encodes sensory memory and stores it for a short time in a limited capacity; and “working memory”, which stores changes in the situation and context over several hours. According to neuropsychological research, the capacity of direct memory is said to be 7 ± 2 chunks. Working memory is also called “intermediate memory”, in contrast to short-term memory and long-term memory.

また、長期記憶は、物の名前など学習により得られた情報を長期間保持するために使用される。同じパターンを統計的に処理して、ロバストな記憶にすることができる。 Long-term memory is used to hold information obtained by learning such as the name of an object for a long period of time. The same pattern can be statistically processed for robust storage.

長期記憶はさらに「宣言的知識記憶」と「手続的知識記憶」に分類される。宣言的知識記憶は、場面（例えば教えられたときのシーン）に関する記憶である「エピソード記憶」と、言葉の意味や常識といった記憶からなる「意味記憶」からなる。また、手続的知識記憶は、宣言的知識記憶をどのように使うかといった手順記憶であり、入力パターンに対する動作の獲得に用いることができる。 Long-term memory is further classified into “declarative knowledge memory” and “procedural knowledge memory”. The declarative knowledge memory is composed of “episode memory” that is a memory related to a scene (for example, a scene when taught) and “semantic memory” that includes memories such as the meaning and common sense of words. The procedural knowledge memory is a procedural memory such as how to use a declarative knowledge memory, and can be used to acquire an operation for an input pattern.

Ｃ−１．短期記憶部 C-1. Short-Term Memory Unit
短期記憶部１０５は、自分の周りに存在する物体、あるいはイベントを表現、記憶し、それに基づいてロボットが行動することを目的とした機能モジュールである。視覚や聴覚などのセンサ情報を基に物体やイベントの位置を自己中心座標系上に配置していくが、視野外の物体などを記憶し、それに対する行動などを生じさせることができる。 The short-term memory unit 105 is a functional module whose purpose is to represent and store the objects and events that exist around the robot so that the robot can act on them. The positions of objects and events are placed in a self-centered coordinate system based on sensor information such as vision and hearing, so that objects outside the field of view can be remembered and behavior toward them can be generated.

For example, suppose the robot is talking with person A when another person B calls out to it. The short-term memory function is needed to hold A's position and the content of the conversation while talking with B, and then to return to the conversation with A afterwards. The integration itself is deliberately kept simple: rather than using elaborate processing, sensor information that is close in time and space is regarded as coming from the same object.

In addition, objects that cannot be identified by pattern recognition are also placed on the egocentric coordinate system, using techniques such as stereo vision, so that their positions can be stored. Used together with floor detection, this can serve, for example, to store obstacle positions probabilistically.

In the present embodiment, the short-term memory unit 105 integrates external stimuli, which consist of the results of multiple recognizers such as the visual recognition function unit 101, the auditory recognition function unit 102, and the contact recognition function unit 103 described above, so that they remain temporally and spatially consistent. It provides the perception of each object in the external environment as short-term memory to behavior control modules such as the situation-dependent behavior layer (SBL) 108.

Accordingly, the behavior control modules, which are configured as upper-level modules, can treat multiple recognition results from the outside world as integrated, meaningful symbol information and perform sophisticated behavior control. Furthermore, by using more complex recognition results, such as the correspondence with previously observed recognition results, it is possible to resolve which skin-colored region is a face and which person it belongs to, or whose voice a given utterance is.

Moreover, because the short-term memory unit 105 retains information about recognized observations as memory, upper-level modules such as the application that controls the robot's behavior can always see an object as being perceived at its location, even if observations are temporarily missing during autonomous operation. For example, information outside the sensor's field of view is retained rather than forgotten immediately, so even if the robot loses sight of an object it can find it again later. As a result, the system becomes robust against recognizer errors and sensor noise, and a stable system that does not depend on the timing of recognizer notifications can be realized. In addition, information that is insufficient from the viewpoint of a single recognizer can sometimes be supplemented by other recognition results, so the recognition performance of the system as a whole improves.

Also, because related recognition results are linked to one another, upper-level modules such as applications can use the related information to decide on actions. For example, the robot apparatus can retrieve a person's name from the voice that called out to it. It can therefore react by, for instance, answering a greeting with “Hello, XXX-san.”

FIG. 4 illustrates the mechanism of situation-dependent behavior control in response to external stimuli in the behavior control system 100 shown in FIG. 3. External stimuli are taken into the system by the recognition function modules 101 to 103 and are passed to the situation-dependent behavior layer (SBL) 108 via the short-term memory unit (STM) 105. As illustrated, the recognition function modules 101 to 103, the short-term memory unit (STM) 105, and the situation-dependent behavior layer (SBL) 108 are each configured as objects.

In the figure, the circles represent entities called “objects” or “processes”. The system as a whole operates through asynchronous communication among the objects. Each object passes data and performs Invoke through an inter-object communication method that uses message passing and shared memory. The function of each object is described below.

AudioRecog:
An object that receives audio data from an audio input device such as a microphone and performs feature extraction and speech-segment detection. If the microphone is stereo, it can also estimate the sound source direction in the horizontal plane. When a segment is judged to be speech, the features of the audio data for that segment and the sound source direction are sent to ArtherDecoder (described later).

SpeechRecog:
An object that performs speech recognition using the speech features received from AudioRecog together with a speech dictionary and a syntax dictionary. The set of recognized words is sent to the short-term memory unit (ShortTermMemory) 105.

MultiColorTracker:
An object that performs color recognition. It receives image data from an image input device such as a camera, extracts color regions based on multiple color models held in advance, and divides them into contiguous regions. Information such as the position, size, and features of each region is output and sent to the short-term memory unit (ShortTermMemory) 105.

FaceDetector:
An object that detects face regions in an image frame. It receives image data from an image input device such as a camera and reduces it into scale images at nine levels. It searches all of these images for rectangular regions corresponding to faces. After overlapping candidate regions are pruned, information such as the position, size, and features of the regions finally judged to be faces is output and sent to FaceIdentify (described later).

FaceIdentify:
An object that identifies a detected face image. It receives from FaceDetector a rectangular image indicating the face region and identifies the person by comparing the face image against the person dictionary it holds. It then outputs the person's ID information together with the position and size of the face image region.

ShortTermMemory (short-term memory unit):
An object that holds information about the external environment of the robot 1 for a relatively short time. It receives speech recognition results (words, sound source direction, confidence) from SpeechRecog, the position and size of skin-colored regions and of face regions from MultiColorTracker, and person ID information and the like from FaceIdentify. It also receives the direction of the robot's neck (joint angles) from the sensors on the body of the robot 1. By using these recognition results and sensor outputs in an integrated manner, it stores information about which person is currently where, which person spoke which words, and what dialogue has taken place with that person so far. The physical information about such objects, that is, targets, and the events (history) viewed along the time axis are output to upper-level modules such as the situation-dependent behavior layer (SBL).

SituatedBehaviorLayer (situation-dependent behavior layer):
An object that determines the behavior of the robot 1 (situation-dependent behavior) based on information from ShortTermMemory (the short-term memory unit) described above. It can evaluate and execute multiple behaviors simultaneously. It can also switch behaviors, putting the robot body into a sleep state and then activating another behavior.

ResourceManager:
An object that arbitrates among the hardware resources of the robot 1 for output commands. In the example shown in FIG. 4, it arbitrates resources between the object that controls the speaker for audio output and the object that controls neck motion. In the present embodiment, ResourceManager also arbitrates between the motion commands issued by ReflexiveSituatedBehaviorLayer (the reflexive behavior layer) and SituatedBehaviorLayer (the situation-dependent behavior layer); this point is described in detail later.

SoundPerformerTTS:
An object for audio output. It performs speech synthesis according to text commands given by SituatedBehaviorLayer via ResourceManager and outputs the speech from the speaker on the body of the robot 1.

HeadMotionGenerator:
An object that calculates the neck joint angles in response to a command to move the neck received from SituatedBehaviorLayer via ResourceManager. When it receives a “track” command, it calculates and outputs the neck joint angles that face the direction of the object, based on the object position information received from ShortTermMemory.
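As a rough illustration of the “track” calculation described for HeadMotionGenerator, the following Python sketch derives two neck angles from a target position. The function name, the two-joint (pan and tilt) neck, and the axis convention (x forward, y left, z up, origin at the neck) are assumptions made for this example and are not taken from the specification.

```python
import math


def neck_angles_toward(x, y, z):
    """Return (pan, tilt) in radians that point the head at (x, y, z).

    Assumed frame (hypothetical): x forward, y left, z up, origin at the neck.
    """
    pan = math.atan2(y, x)                  # turn left/right about the z axis
    tilt = math.atan2(z, math.hypot(x, y))  # look up/down toward the target
    return pan, tilt
```

For a target one metre ahead and one metre to the left at neck height, this gives a pan of 45 degrees and a tilt of zero.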

The short-term memory unit 105 consists of two kinds of memory objects: a target memory and an event memory.

The target memory integrates the information from the recognition function units 101 to 103 and holds information about the objects currently being perceived, that is, targets. Accordingly, as objects disappear or appear, the corresponding targets are deleted from the storage area (GarbageCollector) or newly created. A single target can also be represented by multiple recognition attributes (TargetAssociate); for example, an object that is skin-colored, has a face pattern, and emits a voice (a human face).

The position and orientation of an object (target) held in the target memory are expressed not in the sensor coordinate systems used by the individual recognition function units 101 to 103, but in a world coordinate system in which a specific part of the body, such as the trunk of the robot 1, is fixed at a predetermined location. To this end, the short-term memory unit (STM) 105 constantly monitors the current value (sensor output) of each joint of the robot 1 and converts from the sensor coordinate system to this fixed coordinate system. This makes it possible to integrate the information from the recognition function units 101 to 103. For example, even if the robot 1 moves its neck and the orientation of a sensor changes, the position of the object as seen from a behavior control module such as the situation-dependent behavior layer (SBL) remains the same, which makes targets easier to handle.
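The conversion from a sensor coordinate system to the body-fixed coordinate system can be pictured with the following Python sketch. It is a simplification under stated assumptions: the neck is reduced to a pan joint and a tilt joint, the offset between the camera and the trunk is ignored, and the axes are x forward, y left, z up. The function name and parameters are hypothetical.

```python
import math


def sensor_to_body(point_sensor, pan, tilt):
    """Rotate a point from the camera frame into the body-fixed frame.

    pan:  neck rotation about the z axis (positive = turn left), radians
    tilt: neck rotation about the y axis (positive = look up), radians
    Translation between camera and trunk is ignored in this sketch.
    """
    x, y, z = point_sensor
    # Undo the tilt first: a point straight ahead of a raised head lies higher.
    ct, st = math.cos(tilt), math.sin(tilt)
    x1, y1, z1 = ct * x - st * z, y, st * x + ct * z
    # Then undo the pan: a point straight ahead of a turned head lies sideways.
    cp, sp = math.cos(pan), math.sin(pan)
    return (cp * x1 - sp * y1, sp * x1 + cp * y1, z1)
```

With the joint angles read at the moment of observation, an object seen straight ahead of a head turned 90 degrees to the left maps to a point on the robot's left, regardless of where the head points afterwards.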

The event memory is an object that stores, in chronological order, the events that have occurred in the external environment from the past to the present. Events handled by the event memory include the appearance and disappearance of targets, recognized spoken words, and information about changes in the situation, such as changes in the robot's own behavior or posture.

Some events involve a change of state of a particular target. By including the ID of that target in the event information, more detailed information about the event can be retrieved from the target memory described above.

FIGS. 5 and 6 show the flow of information into the target memory and the event memory, respectively, in the short-term memory unit 105, based on the recognition results of the recognition function units 101 to 103.

As shown in FIG. 5, the short-term memory unit 105 (STM object) contains a target detector that detects targets from the external environment. Based on the results from the recognition function units 101 to 103, such as voice recognition, face recognition, and color recognition results, the target detector adds new targets or updates existing targets to reflect the recognition results. Detected targets are held in the target memory.

The target memory also provides functions such as a garbage collector (GarbageCollector), which finds and deletes targets that are no longer observed, and a target associate (TargetAssociate), which determines the relationship among multiple targets and links them to the same target. The garbage collector is implemented by decrementing the confidence of each target as time passes and deleting any target whose confidence falls below a predetermined value. The target associate can identify targets as the same when targets with similar features of the same attribute (recognition type) are also close to each other in space and time.
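The garbage-collection rule described above (decrement the confidence over time, delete below a threshold) can be sketched in Python as follows. The class and field names, the decay step, and the threshold are illustrative assumptions; the specification does not give concrete values.

```python
from dataclasses import dataclass


@dataclass
class Target:
    target_id: int
    position: tuple          # (x, y, z) in the body-fixed frame
    confidence: float = 1.0  # refreshed elsewhere whenever the target is observed


def collect_garbage(targets, decay=0.1, threshold=0.2):
    """Decrement every confidence by `decay` and drop targets below `threshold`.

    Both numeric defaults are hypothetical values chosen for this example.
    """
    survivors = []
    for target in targets:
        target.confidence -= decay
        if target.confidence >= threshold:
            survivors.append(target)
    return survivors
```

Calling `collect_garbage` once per update cycle means a target that is never re-observed fades out after a fixed number of cycles, while a target whose confidence is refreshed by new observations stays in memory.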

The situation-dependent behavior layer (SBL) described above is an object that acts as a client of the short-term memory unit 105 (an STM client), and it periodically receives notifications (Notify) of information about each target from the target memory. In the present embodiment, an STM proxy class copies the targets into a client-local work area that is independent of the short-term memory unit 105 (STM object) and keeps the latest information there at all times. A desired target is then read from the local target list (TargetofInterest) as an external stimulus, and a schema, that is, a behavior module, is determined (described later).

As shown in FIG. 6, the short-term memory unit 105 (STM object) also contains an event detector that detects events occurring in the external environment. The event detector detects, as events, the creation of a target by the target detector and the deletion of a target by the garbage collector. When the recognition result from the recognition function units 101 to 103 is a speech recognition result, the content of the utterance becomes an event. Events are stored in the event memory as an event list in the order in which they occurred.

The situation-dependent behavior layer (SBL) is an object that acts as a client of the short-term memory unit 105 (an STM client), and it receives event notifications (Notify) from the event memory from moment to moment. In the present embodiment, the STM proxy class copies the event list into a client-local work area that is independent of the short-term memory unit 105 (STM object). A desired event is then read from the local event list as an external stimulus, and a schema, that is, a behavior module, is determined (described later). The behavior module that is executed is in turn detected by the event detector as a new event. Old events are discarded from the event list one after another, for example in FIFO (First In, First Out) order.
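A time-ordered event list with FIFO discarding can be sketched in Python with a bounded deque. The class name, the event fields, and the capacity are assumptions for illustration only; the target ID field reflects the earlier remark that an event may carry the ID of the target it concerns.

```python
from collections import deque


class EventMemory:
    """Keeps the most recent events in order of occurrence (hypothetical sketch)."""

    def __init__(self, capacity=100):
        # A deque with maxlen drops the oldest entry when a new one is appended.
        self._events = deque(maxlen=capacity)

    def record(self, timestamp, kind, target_id=None, payload=None):
        self._events.append(
            {"time": timestamp, "kind": kind, "target_id": target_id, "payload": payload}
        )

    def events_for(self, target_id):
        """Return the stored events that refer to the given target ID."""
        return [e for e in self._events if e["target_id"] == target_id]

    def __len__(self):
        return len(self._events)
```

A client-side copy of this list, as kept by the STM proxy class, could then be searched for the events of interest without touching the STM object itself.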

With the short-term memory mechanism according to the present embodiment, the robot 1 integrates the results of multiple recognizers concerning external stimuli so that they remain temporally and spatially consistent, and handles them as meaningful symbol information. This makes it possible, by using more complex recognition results such as the correspondence with previously observed recognition results, to resolve which skin-colored region is a face and which person it belongs to, or whose voice a given utterance is.

The dialogue of the robot 1 with users A and B is described below with reference to FIGS. 7 to 9.

First, as shown in FIG. 7, when user A calls “Masahiro-kun!” (Masahiro being the robot's name), the recognition function units 101 to 103 perform sound direction detection, speech recognition, and face identification, and the robot carries out situation-dependent behavior: it turns toward the direction of the call, tracks user A's face, and starts a dialogue with user A.

Next, as shown in FIG. 8, when user B calls “Masahiro-kun!”, the recognition function units 101 to 103 again perform sound direction detection, speech recognition, and face identification. The robot suspends the dialogue with user A (while saving the conversation context), turns toward the direction of the call, tracks user B's face, and starts a dialogue with user B. This is the Preemption function (described later) of the situation-dependent behavior layer 108.

Then, as shown in FIG. 9, when user A shouts “Hey!” to urge the robot to continue their conversation, the robot suspends the dialogue with user B (again saving the conversation context), turns toward the direction of the call, tracks user A's face, and resumes the dialogue with user A based on the saved context. Here, thanks to the Reentrant function (described later) of the situation-dependent behavior layer 108, the content of the dialogue with user B is not destroyed by the dialogue with user A, and a dialogue can be resumed exactly from the point at which it was interrupted.
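The effect of the scenario in FIGS. 7 to 9 (suspend one dialogue, keep its context, resume it later without disturbing the other) can be mimicked with the small Python sketch below. It only illustrates the idea of keeping a separate context per partner; it is not the Preemption or Reentrant mechanism of the situation-dependent behavior layer, which the specification describes later. All names are hypothetical.

```python
class DialogueContexts:
    """Keeps one independent conversation context per dialogue partner."""

    def __init__(self):
        self._contexts = {}  # partner name -> list of utterances so far
        self.active = None   # partner currently being attended to

    def attend(self, partner):
        """Switch attention; contexts of other partners are left untouched."""
        self._contexts.setdefault(partner, [])
        self.active = partner

    def say(self, utterance):
        if self.active is None:
            raise RuntimeError("no active dialogue partner")
        self._contexts[self.active].append(utterance)

    def history(self, partner):
        return list(self._contexts.get(partner, []))
```

Because each partner has a separate list, returning to user A picks up A's history where it stopped, and nothing said to A is written into B's context.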

C-2. Long-term memory unit
Long-term memory is used to retain information acquired through learning, such as the names of objects, over a long period. By statistically processing repeated occurrences of the same pattern, the memory can be made robust.

Episodic memory is a kind of declarative knowledge memory (also called statement memory) within long-term memory. Taking riding a bicycle as an example, remembering the scene (time, place, and so on) in which one first rode a bicycle corresponds to episodic memory. As time passes, the memory of that episode fades, while what retains its meaning is semantic memory. One also comes to remember the procedure for riding a bicycle, which corresponds to procedural knowledge memory. In general, acquiring procedural knowledge takes time. Whereas declarative knowledge memory allows one to “say” what is known, procedural knowledge memory is implicit and manifests itself in the execution of actions.

The long-term memory unit 106 according to the present embodiment comprises: an associative memory that stores sensor information about an object, such as visual and auditory information, together with the change in internal state that results from actions performed on that object; a frame memory for each such object; map information constructed from the surrounding scenery or supplied as data; and rules that relate a causal situation, the action taken in response, and its result.

C-2-1. Associative memory
Associative memory refers to a mechanism in which input patterns consisting of multiple symbols are stored in advance as memory patterns, and the stored pattern that a given input resembles is recalled. The associative memory according to the present embodiment is realized by a model using a competitive neural network. With such a mechanism, when a partially defective pattern is input, the closest of the stored patterns can be output. This is because, even when only an external stimulus consisting of incomplete data is given, the firing of the corresponding neuron allows the meaning of an object, for example, to be recalled.

Associative memory is broadly divided into “auto-associative memory” (self-recall type) and “hetero-associative memory” (mutual-recall type). The auto-associative type is a model in which a stored pattern is retrieved directly by a key pattern, and the hetero-associative type is a model in which input patterns and output patterns are linked by some kind of association. The present embodiment adopts auto-associative memory, which, compared with conventional memory models such as the Hopfield network and the Associatron, has the advantages that additional learning is easy and that input patterns can be stored statistically.

With additional learning, storing a new pattern does not overwrite and erase past memories. With statistical learning, the more often something is seen, the more strongly it remains in memory, and the more often something is repeated, the harder it is to forget. In this case, even if a complete pattern is not input every time during the memorization process, repetition causes the memory to converge on the pattern that is presented most often.

C-2-2. Semantic memory by associative memory
The patterns that the robot apparatus 1 memorizes consist, for example, of combinations of external stimuli to the robot apparatus 1 and internal states.

Here, an external stimulus is perceptual information obtained when the robot apparatus 1 recognizes sensor input, for example color information, shape information, or face information obtained by processing an image input from the camera 15. More specifically, it consists of components such as color, shape, face, 3D general object, hand gesture, motion, voice, touch, smell, and taste.

An internal state refers, for example, to emotions such as instincts and feelings grounded in the robot's body. The instinctive elements are, for example, at least one of fatigue, heat or body temperature (temperature), pain, appetite or hunger, thirst, affection, curiosity, elimination, and sexual desire (sexual). The emotional elements are at least one of happiness, sadness, anger, surprise, disgust, fear, frustration, boredom, somnolence, gregariousness, patience, tension (tense), relaxation (relaxed), alertness, guilt, spite, loyalty, submission, and jealousy.

In the associative memory mechanism using the competitive neural network according to the present embodiment, an input channel is assigned to each element that makes up these external stimuli and internal states. Each perceptual function module, such as the visual recognition function unit 101 or the auditory recognition function unit 102, does not send the raw sensor signal. Instead, it symbolizes the result of recognizing the sensor output and sends ID information corresponding to the symbol (for example, a color prototype ID, shape prototype ID, or voice prototype ID) to the corresponding channel.

For example, each object segmented by the color segmentation module is input to the associative memory system with a color prototype ID attached. The ID of a face recognized by the face recognition module is input to the associative memory system, as is the ID of an object recognized by the object recognition module. From the speech recognition module, the prototype ID of a word uttered by the user is input. The phoneme sequence (PhonemeSequence) of the utterance is input at the same time, so the robot apparatus 1 can be made to speak the word through the memorization and association process. Instincts can be handled as analog values (described later); for example, if an instinct delta value is stored as 80, the analog value 80 can be obtained by association.

Accordingly, the associative memory system according to the present embodiment can store external stimuli such as color, shape, and voice, together with internal states, as input patterns consisting of a combination of symbolized IDs, one per channel. That is, what the associative memory system stores is a combination of the form:

[color ID, shape ID, face ID, voice ID, ..., instinct ID (value), emotion ID]

Associative memory involves a memorization process and a recall process. FIG. 10 shows the concept of the memorization process.

連想記憶システムに入力される記憶パターンは、外部刺激や内部状態の各要素毎に割り当てられている複数のチャンネルで構成される（図示の例では入力１〜入力８の８チャンネルからなる）。そして、各チャンネルには、対応する外部刺激の認識結果や内部状態をシンボル化したＩＤ情報が送られてくる。図示の例では、各チャンネルの濃淡がＩＤ情報を表しているものとする。例えば、記憶パターン中のｋ番目のカラムが顔のチャンネルに割り当てられている場合、その色により顔のプロトタイプＩＤを表している。 A memory pattern input to the associative memory system is composed of a plurality of channels assigned to each element of the external stimulus and the internal state (in the illustrated example, it consists of 8 channels of input 1 to input 8). Each channel receives ID information obtained by symbolizing the recognition result of the corresponding external stimulus and the internal state. In the example shown in the figure, the shading of each channel represents ID information. For example, when the kth column in the storage pattern is assigned to the face channel, the face prototype ID is represented by the color.

図１０に示す例では、連想記憶システムは既に１〜ｎの合計ｎ個の記憶パターンを記憶しているものとする。ここで、２つの記憶パターン間での対応するチャンネルの色の相違は、同じチャンネル上で記憶している外部刺激又は内部状態のシンボルすなわちＩＤが当該記憶パターン間で異なることを意味する。 In the example shown in FIG. 10, it is assumed that the associative memory system has already stored a total of n storage patterns 1 to n. Here, the difference in the color of the corresponding channel between the two storage patterns means that the external stimulus or the internal state symbol or ID stored on the same channel is different between the storage patterns.

また、図１１には、連想記憶の想起過程の概念を示している。上述したように、記憶過程で蓄えた入力パターンに似たパターンが入力されると、欠落していた情報を補うように完全な記憶パターンが出力される。 FIG. 11 shows the concept of the associative memory recall process. As described above, when a pattern similar to the input pattern stored in the storage process is input, a complete storage pattern is output so as to compensate for the missing information.

In the example of FIG. 11, a pattern in which IDs are given for only the top three of the eight channels is input as the key pattern. The associative memory system then finds, among the storage patterns already held, the one whose top three channels are closest to the key (storage pattern 1 in the illustrated example) and outputs it as the recalled pattern. In other words, the closest storage pattern is output so as to fill in the missing information for channels 4 to 8.
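The recall behavior described for FIG. 11 can be illustrated with a small sketch. It is not the neural-network implementation described later; it simply counts, for each stored pattern, how many of the known key channels agree, and returns the best match. The function name `recall_nearest`, the ID values, and the use of `None` for a missing channel are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def recall_nearest(key_pattern: Sequence[Optional[int]],
                   stored: Sequence[List[int]]) -> List[int]:
    """Return the stored pattern that agrees with the key on the most known channels."""
    def agreement(pattern: List[int]) -> int:
        # Channels whose key entry is None are missing and are ignored.
        return sum(1 for k, p in zip(key_pattern, pattern)
                   if k is not None and k == p)
    return max(stored, key=agreement)


# Three hypothetical 8-channel storage patterns (one symbol ID per channel).
stored_patterns = [
    [3, 1, 7, 2, 5, 0, 4, 6],
    [3, 2, 1, 9, 9, 1, 0, 2],
    [8, 1, 7, 0, 3, 3, 1, 1],
]
# Key pattern: only the top three channels carry an ID.
key = [3, 1, 7, None, None, None, None, None]
recalled = recall_nearest(key, stored_patterns)
```

Here the first stored pattern matches all three known channels, so it is returned in full, including channels 4 to 8 that were missing from the key.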

The associative memory system can therefore associate a voice ID, that is, a name, from a face ID alone, or recall "tasty" or "not tasty" from the name of a food alone. With a long-term memory architecture based on a competitive neural network, semantic memory concerning the meanings of words, common sense, and the like can be realized with the same engineering model as other long-term memory.

C-2. Associative learning by a competitive neural network
FIG. 12 schematically shows a configuration example of an associative memory system to which a competitive neural network is applied. As shown in the figure, this competitive neural network is a hierarchical neural network consisting of two layers: an input layer and a competitive layer.

This competitive neural network has two operation modes, a memory mode and a recall (association) mode. In the memory mode it stores input patterns competitively; in the recall mode it recalls a complete storage pattern from a partially missing input pattern.

The input layer consists of a plurality of input neurons. Each input neuron receives, from the channel assigned to the corresponding element of the external stimuli or internal states, the symbol (ID information) corresponding to the recognition result. The input layer must therefore provide as many neurons as the number of color IDs + the number of shape IDs + the number of voice IDs + the number of instinct types, and so on.

The competitive layer consists of a plurality of competitive neurons. Each competitive neuron is connected to every input neuron with a certain connection weight. Each competitive neuron corresponds to one symbol to be stored; in other words, the number of competitive neurons equals the number of symbols that can be stored.

Suppose an input pattern is given to the input layer. The input pattern consists of channels representing the elements of the external stimuli and internal states, and each input neuron whose ID is sent on its channel fires.

Each competitive neuron receives the outputs of the input neurons, weighted by the synaptic connections, and computes the sum of these inputs. Learning proceeds by selecting the competitive neuron with the largest input sum in the competitive layer and strengthening the connections between that winner and the input neurons. For an input pattern with missing parts, selecting the neuron that wins in the competitive layer recalls the symbol corresponding to the input pattern.

Memory mode:
The connection weights between the input layer and the competitive layer take values between 0 and 1. The initial connection weights are determined randomly.

Storage in the competitive neural network is performed by first selecting the competitive neuron that wins in the competitive layer for the input pattern to be stored, and then strengthening the connections between that neuron and the input neurons.

Here, in the input pattern vector [x_1, x_2, ..., x_n], neuron x_1 corresponds, for example, to color prototype ID1 and fires when ID1 is recognized; the neurons for shape, voice, and so on fire in the same way. A neuron that fires takes the value 1, and a neuron that does not fire takes the value -1.
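As an illustration of this encoding, the following sketch builds a +1/-1 input vector with one neuron per prototype ID, so that the vector length equals the sum of the ID counts over all channels. The channel names, the ID counts in `CHANNEL_SIZES`, and the function name `encode` are hypothetical and chosen only for the example.

```python
from typing import Dict, List, Optional

# Hypothetical layout: number of prototype IDs available on each channel.
CHANNEL_SIZES: Dict[str, int] = {"color": 3, "shape": 2, "voice": 4}


def encode(ids: Dict[str, Optional[int]]) -> List[int]:
    """Map at most one recognized ID per channel to a +1/-1 input vector."""
    vector: List[int] = []
    for channel, size in CHANNEL_SIZES.items():
        active = ids.get(channel)  # None means nothing was recognized
        # The neuron for the recognized ID fires (+1); all others stay at -1.
        vector.extend(1 if i == active else -1 for i in range(size))
    return vector


# Color ID 0 and shape ID 1 were recognized; no voice ID was recognized.
x = encode({"color": 0, "shape": 1, "voice": None})
```

With these assumed sizes the input layer has 3 + 2 + 4 = 9 neurons, of which exactly two fire in the example.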

Let w_ij be the connection strength between the i-th input neuron and the j-th competitive neuron. The value of the competitive neuron y_j for the inputs x_i is given by the following equation.

The neuron that wins the competition can therefore be obtained by the following equation.

Storage is performed by strengthening the connections between the neuron that won in the competitive layer (the winner neuron) and the input neurons. The connections between the winner neuron and the input neurons are updated according to Kohonen's update rule as follows.

The weights are then normalized by the L2 norm.
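The equations referred to in the preceding paragraphs are not reproduced in this text. The following is a reconstruction from the surrounding definitions only: the weighted sum and the winner selection follow directly from the description, while the update and normalization are written in the commonly used form of Kohonen's rule and should be read as an assumption rather than as the exact expressions of the original. The symbol j* for the winner index is introduced here for convenience.

```latex
\begin{align}
  y_j &= \sum_{i} w_{ij}\, x_i
      && \text{(value of competitive neuron } j\text{)} \\
  j^{*} &= \arg\max_{j}\; y_j
      && \text{(winner neuron)} \\
  \Delta w_{ij^{*}} &= \alpha \left( x_i - w_{ij^{*}}^{\mathrm{old}} \right),
  \qquad
  w_{ij^{*}}^{\mathrm{new}} = w_{ij^{*}}^{\mathrm{old}} + \Delta w_{ij^{*}}
      && \text{(Kohonen update, learning rate } \alpha\text{)} \\
  w_{ij^{*}}^{\mathrm{new}} &\leftarrow
  \frac{w_{ij^{*}}^{\mathrm{new}}}
       {\sqrt{\sum_{i'} \left( w_{i'j^{*}}^{\mathrm{new}} \right)^{2}}}
      && \text{(L2 normalization)}
\end{align}
```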

This connection strength represents the strength of the memory. The learning rate α is a parameter that relates the number of presentations to how well a pattern is stored: the larger α is, the more the weights change in a single storage step. For example, with α = 0.5, a pattern stored once is not forgotten, and presenting a similar pattern later will almost certainly recall the stored pattern.

The more often a pattern is presented and stored, the larger the connection values (weights) of the network become. That is, the memory grows stronger as the same pattern is input repeatedly. This enables statistical learning and realizes long-term memory that is little affected by noise in a real environment.

When a new pattern is input and stored, a new neuron in the competitive layer fires, so the connections to that new neuron are strengthened while the connections to neurons holding earlier memories are not weakened. In other words, associative memory based on a competitive neural network supports incremental learning and is free of the "forgetting" problem.
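A minimal sketch of one memory-mode step is given below. It assumes that the value of competitive neuron j is the weighted sum of the inputs, that the winner is the neuron with the largest sum, and that the winner's weights are moved toward the input by w + α(x - w) and then divided by their L2 norm. These are the commonly used forms and are assumptions here; the function names `winner_of` and `memorize` and the network sizes are hypothetical.

```python
import math
import random
from typing import List

Weights = List[List[float]]  # weights[j][i]: input neuron i -> competitive neuron j


def winner_of(weights: Weights, x: List[float]) -> int:
    """Index of the competitive neuron with the largest weighted input sum."""
    sums = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return max(range(len(sums)), key=sums.__getitem__)


def memorize(weights: Weights, x: List[float], alpha: float = 0.5) -> int:
    """One storage step: select the winner, pull its weights toward x, normalize."""
    j = winner_of(weights, x)
    updated = [w + alpha * (xi - w) for w, xi in zip(weights[j], x)]
    norm = math.sqrt(sum(w * w for w in updated)) or 1.0  # guard against a zero vector
    weights[j] = [w / norm for w in updated]
    return j


random.seed(0)
n_inputs, n_competitive = 6, 4
# Initial weights are random values between 0 and 1, as stated in the text.
weights = [[random.random() for _ in range(n_inputs)] for _ in range(n_competitive)]
pattern = [1, -1, -1, 1, -1, -1]
stored_in = memorize(weights, pattern)
```

After the step, presenting the same pattern again selects the same competitive neuron. Note that in this sketch the updated weights can become negative because non-firing inputs are coded as -1; how the original keeps weights in the range 0 to 1 is not stated in this passage.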

Recall mode:
Suppose that an input pattern vector such as the following is presented to the associative memory system shown in FIG. 12. The input pattern need not be complete; part of it may be missing.

The entries of the input vector may be prototype IDs, or likelihoods or probabilities for those prototype IDs. The value of the output neuron y_j is calculated from the inputs x_i by the following equation.

The above equation can also be regarded as expressing the likelihood of the firing value of each competitive neuron given the likelihoods on the channels. The important point is that likelihood inputs from a plurality of channels can be combined to obtain an overall likelihood. In this embodiment, only one item is associated, namely the one with the maximum likelihood, and the neuron that wins the competition is obtained by the following equation.

Because the index of the winning competitive neuron Y corresponds to the index of the stored symbol, the input pattern X can be recalled by the inverse matrix operation of W, as in the following equation.
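The recall step can be sketched as follows, under stated assumptions. The value of each competitive neuron is taken to be the weighted sum of the inputs, missing entries of the key are coded as 0 so that they contribute nothing, and the stored pattern is read back from the sign of the winner's weights. That last step is a simple stand-in for the inverse matrix operation mentioned in the text, whose exact form is not given here. The weights and the name `recall` are hypothetical.

```python
from typing import List


def recall(weights: List[List[float]], key: List[float]) -> List[int]:
    """Select the winning competitive neuron for a partial key and read back its pattern."""
    sums = [sum(w * k for w, k in zip(row, key)) for row in weights]
    winner = max(range(len(sums)), key=sums.__getitem__)
    # Read the stored +1/-1 pattern back from the winner's weights.
    return [1 if w > 0 else -1 for w in weights[winner]]


# Hypothetical trained weights: each row is a stored +1/-1 pattern scaled to unit length.
stored = [
    [1, -1, -1, 1, -1, 1],
    [-1, 1, -1, -1, 1, 1],
    [-1, -1, 1, 1, 1, -1],
]
weights = [[v / 6 ** 0.5 for v in row] for row in stored]

# Partial key: the first three entries are known, the rest are missing (0).
key = [1, -1, -1, 0, 0, 0]
recalled = recall(weights, key)
```

Only the first stored row agrees with all three known entries, so it wins the competition and its full pattern is returned.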

Furthermore, by assigning symbols such as episodes and motion IDs to the input-layer neurons of the competitive neural network shown in FIG. 12, declarative knowledge memory and procedural knowledge memory can be realized with the associative memory architecture.

D. Situation-dependent behavior control
This embodiment includes a situation-dependent behavior layer 108 that autonomously selects behavior depending on the internal state (or its changes) and external stimuli, a reflexive behavior unit 109 that executes reflexive, direct body motions in response to recognized external stimuli, and a deliberative behavior layer 107 that plans behavior over a relatively long term based on stored content. The robot's behavior is selected by this hierarchical behavior control mechanism.

The situation-dependent behavior layer (SituatedBehaviorsLayer) 108 is a behavior control layer that controls spontaneous behavior suited to the situation in which the robot apparatus 1 currently finds itself, and consists of a plurality of elemental behaviors. An elemental behavior is described as a behavior module, that is, an object, called a "schema". Each elemental behavior periodically calculates a behavior value (hereinafter also called the "AL value") from the internal state and from recognition results stored in the short-term memory unit 105 and the long-term memory unit 106. Based on these values, the layer selects which elemental behavior to express and outputs the behavior.

D-1. Configuration of the situation-dependent behavior layer
In this embodiment, the situation-dependent behavior layer 108 provides a state machine for each behavior module, classifies the recognition results of external information input from the sensors depending on preceding behavior and circumstances, and expresses behavior on the robot body. A behavior module is described as a schema that has a monitor function, which judges the situation according to external stimuli and changes in the internal state, and an action function, which implements the state transitions (state machine) that accompany execution of the behavior. The situation-dependent behavior layer 108 is configured as a tree structure in which a plurality of schemas are connected hierarchically (described later).

The situation-dependent behavior layer 108 also realizes behavior for keeping the internal state within a certain range (also called "homeostatic behavior"). When the internal state goes outside the specified range, the layer activates the behavior that returns the internal state to that range so that it is more likely to be selected (in practice, behavior is selected in consideration of both the internal state and the external environment).

Each functional module of the behavior control system 100 of the robot 1 shown in FIG. 3 is configured as an object. The objects can exchange data and invoke one another by inter-object communication using message passing and shared memory. FIG. 13 schematically shows the object configuration of the behavior control system 100 according to this embodiment.

The visual recognition function unit 101 consists of three objects: "FaceDetector", "MultiColorTracker", and "FaceIdentify".

FaceDetector is an object that detects face regions in an image frame and outputs the detection results to FaceIdentify. MultiColorTracker is an object that performs color recognition and outputs the recognition results to FaceIdentify and to ShortTermMemory (the object that constitutes the short-term memory unit 105). FaceIdentify identifies a person, for example by looking up the detected face image in the person dictionary it holds, and outputs the person's ID information to ShortTermMemory together with the position and size of the face image region.

The auditory recognition function unit 102 consists of two objects, "AudioRecog" and "SpeechRecog". AudioRecog is an object that receives audio data from an audio input device such as a microphone and performs feature extraction and speech-interval detection; it outputs the features of the audio data in the speech interval and the sound source direction to SpeechRecog and ShortTermMemory. SpeechRecog is an object that performs speech recognition using the audio features received from AudioRecog together with a speech dictionary and a syntax dictionary, and outputs the set of recognized words to ShortTermMemory.

The tactile recognition storage unit 103 consists of an object called "TactileSensor", which recognizes sensor input from the contact sensors and outputs the recognition results to ShortTermMemory and to InternalStateModel (ISM), the object that manages the internal state.

ShortTermMemory (STM) is the object that constitutes the short-term memory unit 105. It is a functional module that holds, for a short period, the targets and events recognized in the external environment by the recognition objects described above (for example, it stores an input image from the camera 15 for only about 15 seconds), and it periodically notifies (Notify) SituatedBehaviorsLayer, an STM client, of external stimuli.

LongTermMemory (LTM) is the object that constitutes the long-term memory unit 106 and is used to hold information obtained by learning, such as the names of objects, over a long period. For example, LongTermMemory can associatively store, for a given behavior module, the change in internal state that follows from an external stimulus.

InternalStatusManager (ISM) is the object that constitutes the internal state management unit 104. It manages several kinds of emotions, such as instincts and feelings, as mathematical models, and manages the internal state of the robot apparatus 1, such as instinct and emotion, according to the external stimuli (ES: ExternalStimula) recognized by the recognition objects described above.

SituatedBehaviorsLayer (SBL) is the object that constitutes the situation-dependent behavior layer 108. The SBL is a client of ShortTermMemory (an STM client): it periodically receives notifications (Notify) of information on external stimuli (targets and events) from ShortTermMemory and then determines the schema, that is, the behavior module to be executed (described later).

ReflexiveSituatedBehaviorsLayer is the object that constitutes the reflexive behavior unit 109, and executes reflexive, direct body motions in response to external stimuli recognized by the recognition objects described above. Examples include following a human face, nodding, and instantly dodging when an obstacle is detected (described later).

SituatedBehaviorsLayer (the situation-dependent behavior layer) selects behavior according to the situation, such as external stimuli and changes in the internal state, whereas ReflexiveSituatedBehaviorsLayer acts reflexively in response to external stimuli. SituatedBehaviorsLayer also manages the reflexive behavior of the robot apparatus, treating it as an external stimulus: when a reflexive behavior does not fit the intention of the situation-dependent behavior, it suppresses execution of the motion commands issued by ReflexiveSituatedBehaviorsLayer (the reflexive behavior layer) (described later). Conversely, when it wants a reflexive behavior to appear, it excites ReflexiveSituatedBehaviorsLayer.

Because the two objects ReflexiveSituatedBehaviorsLayer and SituatedBehaviorsLayer select behavior independently, the behavior modules (schemas) they select may compete for the hardware resources of the robot 1 and be impossible to execute together on the robot body. The ResourceManager object arbitrates such hardware conflicts when SituatedBehaviorsLayer and ReflexiveSituatedBehaviorsLayer select behavior (described later). The robot body is then driven by notifying the objects that realize body motion of the arbitration result.
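The arbitration rule itself is not described at this point in the text. Purely as an illustration of what resolving a hardware conflict can look like, the sketch below grants commands in descending priority and rejects any command that needs a resource already granted to another. The `Command` fields, the priority scheme, and the resource names are all hypothetical and are not taken from the original.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Command:
    source: str                 # object that issued the command
    name: str                   # hypothetical command name
    resources: FrozenSet[str]   # hardware resources the command needs
    priority: int               # higher value wins (assumed scheme)


def arbitrate(commands: List[Command]) -> List[Command]:
    """Grant commands in priority order, skipping those whose resources are taken."""
    granted: List[Command] = []
    in_use: Set[str] = set()
    for cmd in sorted(commands, key=lambda c: c.priority, reverse=True):
        if in_use.isdisjoint(cmd.resources):
            granted.append(cmd)
            in_use |= cmd.resources
    return granted


commands = [
    Command("SituatedBehaviorsLayer", "walk", frozenset({"legs"}), 2),
    Command("ReflexiveSituatedBehaviorsLayer", "track_face", frozenset({"head"}), 3),
    Command("SituatedBehaviorsLayer", "look_around", frozenset({"head"}), 1),
]
granted = arbitrate(commands)
```

In this example the two commands that need the head conflict, so only the higher-priority one is granted, while the walking command proceeds because it uses a different resource.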

SoundPerformer, MotionController, and LedController are the objects that realize body motion. SoundPerformer is an object for audio output; it synthesizes speech according to text commands given by SituatedBehaviorsLayer via ResourceManager and outputs the audio from the speaker on the body of the robot 1. MotionController is an object for operating the joint actuators of the body; on receiving a command from SituatedBehaviorsLayer via ResourceManager to move a hand, leg, or the like, it calculates the corresponding joint angles. LedController is an object for blinking the LED 19; on receiving a command from SituatedBehaviorsLayer via ResourceManager, it drives the LED 19 to blink.

FIG. 14 schematically shows the form of situation-dependent behavior control by the situation-dependent behavior layer (SBL) 108 (here including the reflexive behavior unit 109). The results of recognizing the external environment by the recognition systems 101 to 103 are given to the situation-dependent behavior layer 108 (including the reflexive behavior unit 109) as external stimuli. Changes in the internal state that result from those recognition results are also given to the situation-dependent behavior layer 108. The layer can then judge the situation according to the external stimuli and the changes in the internal state, and select behavior.

FIG. 15 shows a basic operation example of behavior control by the situation-dependent behavior layer 108 shown in FIG. 14. As shown in the figure, the situation-dependent behavior layer 108 (SBL) calculates the activation level of each behavior module (schema) from external stimuli and changes in the internal state, selects a schema according to its activation level, and executes the behavior. By using a library, for example, the activation level can be calculated in a unified way for all schemas (the same applies below). The layer may, for example, select the schema with the highest activation level, or select two or more schemas whose activation levels exceed a predetermined threshold and execute them in parallel (parallel execution assumes that the schemas do not compete for hardware resources).
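The two selection policies mentioned for FIG. 15 can be sketched as follows. The activation levels are given as plain numbers here; how they are computed is outside the scope of the sketch. The function names and the example values are hypothetical, and the parallel variant assumes, as the text does, that the selected schemas do not compete for hardware resources.

```python
from typing import Dict, List


def select_highest(levels: Dict[str, float]) -> str:
    """Policy 1: pick the single schema with the highest activation level."""
    return max(levels, key=lambda name: levels[name])


def select_above(levels: Dict[str, float], threshold: float) -> List[str]:
    """Policy 2: pick every schema whose activation level exceeds the threshold.

    Assumes the chosen schemas can run in parallel without resource conflicts.
    """
    chosen = [name for name, al in levels.items() if al > threshold]
    return sorted(chosen, key=lambda name: levels[name], reverse=True)


# Hypothetical activation levels for three schemas.
levels = {"Eat": 0.8, "PlayGreeting": 0.6, "InvestigativeSniffing": 0.3}
best = select_highest(levels)
parallel = select_above(levels, threshold=0.5)
```

With these values the first policy selects "Eat" alone, while the second selects "Eat" and "PlayGreeting" for parallel execution.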

FIG. 16 shows an operation example in which the situation-dependent behavior layer 108 shown in FIG. 14 performs reflexive behavior. In this case, as shown in the figure, the reflexive behavior unit 109 (ReflexiveSBL) included in the situation-dependent behavior layer 108 calculates activation levels using the external stimuli recognized by the recognition objects as direct input, selects a schema according to its activation level, and executes the behavior. Changes in the internal state are not used in calculating the activation level.

FIG. 17 shows an operation example in which the situation-dependent behavior layer 108 shown in FIG. 14 expresses emotion. The internal state management unit 104 manages emotions such as instincts and feelings as mathematical models and, when the state value of an emotion parameter reaches a predetermined value, notifies (Notify) the situation-dependent behavior layer 108 of the change in internal state. The situation-dependent behavior layer 108 calculates activation levels using the change in internal state as input, selects a schema according to its activation level, and executes the behavior. In this case, the external stimuli recognized by the recognition objects are used by the internal state management unit 104 (ISM) to manage and update the internal state, but not to calculate the activation levels of the schemas.

D-2. Schema
The situation-dependent behavior layer 108 provides a state machine for each behavior module, classifies the recognition results of external information input from the sensors depending on preceding behavior and circumstances, and expresses behavior on the robot body. A behavior module is described as a schema that has an Action function, which describes the body motion and implements the state transitions (state machine) that accompany execution of the behavior, and a Monitor function, which evaluates execution of the behavior described in the Action function according to external stimuli and the internal state and judges the situation.

FIG. 18 schematically shows how the situation-dependent behavior layer 108 is made up of a plurality of schemas. FIGS. 13 and 18 also show how conflicts between motion commands from SituatedBehaviorsLayer and ReflexiveSituatedBehaviorsLayer are resolved by the ResourceManagement (resource management) module of the robot apparatus 1 (described later).

The situation-dependent behavior layer 108 (more precisely, the part of it that controls ordinary situation-dependent behavior) is configured as a tree structure in which a plurality of schemas are connected hierarchically, and controls behavior by judging, in an integrated manner, the most suitable schema according to external stimuli and changes in the internal state. The tree contains a plurality of subtrees (or branches), such as a behavior model that formulates ethological situation-dependent behavior mathematically and a subtree for expressing emotion.

図１９には、状況依存行動階層１０８におけるスキーマのツリー構造を模式的に示している。同図に示すように、状況依存行動階層１０８は、短期記憶部１０５から外部刺激の通知（Ｎｏｔｉｆｙ）を受けるルート・スキーマを先頭に、抽象的な行動カテゴリから具体的な行動カテゴリに向かうように、各階層毎にスキーマが配設されている。例えば、ルート・スキーマの直近下位の階層では、「探索する（Ｉｎｖｅｓｔｉｇａｔｅ）」、「食べる（Ｉｎｇｅｓｔｉｖｅ）」、「遊ぶ（Ｐｌａｙ）」というスキーマが配設される。そして、「探索する（Ｉｎｖｅｓｔｉｇａｔｅ）」の下位には、「ＩｎｖｅｓｔｉｇａｔｉｖｅＬｏｃｏｍｏｔｉｏｎ」、「ＨｅａｄｉｎＡｉｒＳｎｉｆｆｉｎｇ」、「ＩｎｖｅｓｔｉｇａｔｉｖｅＳｎｉｆｆｉｎｇ」というより具体的な探索行動を記述したスキーマが配設されている。同様に、スキーマ「食べる（Ｉｎｇｅｓｔｉｖｅ）」の下位には「Ｅａｔ」や「Ｄｒｉｎｋ」などのより具体的な飲食行動を記述したスキーマが配設され、スキーマ「遊ぶ（Ｐｌａｙ）」の下位には「ＰｌａｙＢｏｗｉｎｇ」、「ＰｌａｙＧｒｅｅｔｉｎｇ」、「ＰｌａｙＰａｗｉｎｇ」などのより具体的な遊ぶ行動を記述したスキーマが配設されている。 FIG. 19 schematically shows a schema tree structure in the situation-dependent action hierarchy 108. As shown in the figure, the situation-dependent action hierarchy 108 is directed from the abstract action category to the specific action category, starting with the root schema that receives the notification (Notify) of the external stimulus from the short-term storage unit 105. A schema is arranged for each hierarchy. For example, in the hierarchy immediately below the root schema, schemas “Search”, “Insert”, and “Play” are arranged. Then, below “Investigate”, a schema describing more specific search behaviors such as “Investigative Location”, “HeadinAirSniffing”, and “InvestigativeSniffing” is arranged. Similarly, a schema describing more specific eating and drinking behavior such as “Eat” and “Drink” is arranged below the schema “Ingestive”, and “Schema” is placed below the “Play”. Schemas describing more specific playing behaviors such as “PlayBowing”, “PlayGreeting”, and “PlayPawing” are arranged.

図示の通り、各スキーマは外部刺激と内部状態を入力している。また、各スキーマは、少なくともＭｏｎｉｔｏｒ関数とＡｃｔｉｏｎ関数を備えている。 As shown, each schema inputs an external stimulus and an internal state. Each schema has at least a Monitor function and an Action function.

図２０には、スキーマの内部構成を模式的に示している。同図に示すように、スキーマは、状態遷移（ステートマシン）の形式で機体動作を記述したＡｃｔｉｏｎ関数と、外部刺激や内部状態に応じてＡｃｔｉｏｎ関数の各状態を評価して活動度レベル値として返すＭｏｎｉｔｏｒ関数と、Ａｃｔｉｏｎ関数のステートマシンをＲＥＡＤＹ（準備完了）、ＡＣＴＩＶＥ（活動中），ＳＬＥＥＰ（待機中）いずれかの状態としてスキーマの状態を記憶管理する状態管理部で構成されている。 FIG. 20 schematically shows the internal configuration of a schema. As shown in the figure, a schema consists of an Action function that describes body motion in the form of state transitions (a state machine); a Monitor function that evaluates each state of the Action function according to external stimuli and the internal state and returns the result as an activity level value; and a state management unit that stores and manages the state of the schema, treating the state machine of the Action function as being in one of the states READY (ready), ACTIVE (active), or SLEEP (standby).

Ｍｏｎｉｔｏｒ関数は、外部刺激と内部状態に応じて当該スキーマの活動度レベル（ＡｃｔｉｖａｔｉｏｎＬｅｖｅｌ：ＡＬ値）を算出する関数である。図１９に示すようなツリー構造を構成する場合、上位（親）のスキーマは外部刺激と内部状態を引数として下位（子供）のスキーマのＭｏｎｉｔｏｒ関数をコールすることができ、子供のスキーマはＡＬ値を返り値とする。また、スキーマは自分のＡＬ値を算出するために、さらに子供のスキーマのＭｏｎｉｔｏｒ関数をコールすることができる。そして、ルートのスキーマには各サブツリーからのＡＬ値が返されるので、外部刺激と内部状態の変化に応じた最適なスキーマすなわち行動を統合的に判断することができる。 The Monitor function is a function that calculates an activity level (Activation Level: AL value) of the schema in accordance with an external stimulus and an internal state. When the tree structure as shown in FIG. 19 is configured, the upper (parent) schema can call the Monitor function of the lower (child) schema with the external stimulus and the internal state as arguments, and the child schema has an AL value. Is the return value. The schema can also call the child's schema Monitor function to calculate its AL value. Since the AL value from each sub-tree is returned to the root schema, the optimum schema corresponding to the external stimulus and the change of the internal state, that is, the behavior can be determined in an integrated manner.

例えばＡＬ値が最も高いスキーマを選択したり、ＡＬ値が所定の閾値を越えた２以上のスキーマを選択して並列的に行動実行したりするようにしてもよい（但し、並列実行するときは各スキーマ同士でハードウェア・リソースの競合がないことを前提とする）。 For example, the schema with the highest AL value may be selected, or two or more schemas whose AL values exceed a predetermined threshold may be selected and executed in parallel (parallel execution, however, presupposes that the schemas do not compete for hardware resources).
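The two selection rules just described can be sketched in C++ as follows. This is only an illustrative sketch: the `MonitorResult` record, the bitmask encoding of hardware resources, and the function names `selectHighest` and `selectParallel` are assumptions made for this example and do not appear in the specification.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record of what one schema's Monitor call returned.
struct MonitorResult {
    std::string schema;      // schema name
    double al;               // activation level (AL value)
    std::uint32_t resources; // assumed bitmask of hardware resources the action needs
};

// Rule 1: pick the single schema with the highest AL value ("" if there is none).
inline std::string selectHighest(const std::vector<MonitorResult>& results) {
    auto it = std::max_element(results.begin(), results.end(),
        [](const MonitorResult& a, const MonitorResult& b) { return a.al < b.al; });
    return it == results.end() ? std::string() : it->schema;
}

// Rule 2: pick every schema whose AL exceeds the threshold, in descending AL
// order, skipping any schema whose resources overlap those already claimed.
inline std::vector<std::string> selectParallel(std::vector<MonitorResult> results,
                                               double threshold) {
    std::sort(results.begin(), results.end(),
        [](const MonitorResult& a, const MonitorResult& b) { return a.al > b.al; });
    std::vector<std::string> chosen;
    std::uint32_t used = 0;
    for (const MonitorResult& r : results) {
        if (r.al <= threshold) break;      // the remaining entries are lower still
        if (r.resources & used) continue;  // hardware resource conflict
        used |= r.resources;
        chosen.push_back(r.schema);
    }
    return chosen;
}
```

With the results Eat (AL 0.9, resource 1), Drink (AL 0.8, resource 1), and PlayPawing (AL 0.7, resource 2) and a threshold of 0.5, the second rule selects Eat and PlayPawing, because Drink competes with Eat for the same resource.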

図２１には、Ｍｏｎｉｔｏｒ関数の内部構成を模式的に示している。同図に示すように、Ｍｏｎｉｔｏｒ関数は、当該スキーマで記述されている行動を誘発する評価値を活動度レベルとして算出する行動誘発評価値演算器と、活動度レベルにバイアスを意図レベル（ＩｎｔｅｎｔｉｏｎＬｅｖｅｌ：ＩＶ値）として算出するバイアス演算器と、使用する期待リソースを特定する使用リソース演算器を備えている。 FIG. 21 schematically shows the internal configuration of the Monitor function. As shown in the figure, the Monitor function includes an action induction evaluation value calculator that calculates, as the activity level, an evaluation value for inducing the action described in the schema; a bias calculator that calculates a bias on the activity level as an intention level (IntentionLevel: IV value); and a used-resource calculator that identifies the resources expected to be used.

バイアス演算器で算出されるＩＶ値は、例えば、状況依存行動を実行中に、ロボット装置１が状況依存行動の意図に反して動作するのを抑制する場合や、状況依存の意図に適うように励起する場合に用いられる。このＩＶ値は、熟考行動階層（ＤｅｌｉｂｅｒａｔｉｖｅＬａｙｅｒ）が状況依存行動の発現計画を立て、状況依存行動の発現する順序を管理する場合にも用いる。 The IV value calculated by the bias calculator is used, for example, to suppress the robot apparatus 1 from acting against the intention of a situation-dependent action while that action is being executed, or to excite actions that suit the situation-dependent intention. The IV value is also used when the deliberative action hierarchy (DeliberativeLayer) plans the expression of situation-dependent actions and manages the order in which they are expressed.

図２０に示した例では、Ｍｏｎｉｔｏｒ関数は、スキーマすなわち行動モジュールの管理を行なう行動状態制御部（仮称）からコールされると、Ａｃｔｉｏｎ関数のステートマシンを仮想実行して、活動度レベルと使用リソースを演算し、これを返すようになっている。 In the example shown in FIG. 20, when the Monitor function is called from a behavior state control unit (tentative name) that manages a schema, that is, a behavior module, it virtually executes the state machine of the Action function, and the activity level and the resource used Is calculated and returned.

また、Ａｃｔｉｏｎ関数は、スキーマ自身が持つ行動を記述したステートマシン（後述）を備えている。図１９に示すようなツリー構造を構成する場合、親スキーマは、Ａｃｔｉｏｎ関数をコールして、子供スキーマの実行を開始したり中断させたりすることができる。本実施形態では、ＡｃｔｉｏｎのステートマシンはＲｅａｄｙにならないと初期化されない。言い換えれば、中断しても状態はリセットされず、スキーマが実行中の作業データを保存することから、中断再実行が可能である（後述）。 Further, the Action function includes a state machine (described later) that describes the behavior of the schema itself. When the tree structure shown in FIG. 19 is configured, the parent schema can call the Action function to start or interrupt the execution of the child schema. In the present embodiment, the action state machine is not initialized unless it becomes Ready. In other words, even if it is interrupted, the state is not reset, and the work data being executed by the schema is saved, so that it can be interrupted and reexecuted (described later).

図２０で示す例では、スキーマすなわち行動モジュールの管理を行なう行動状態制御部（仮称）は、Ｍｏｎｉｔｏｒ関数からの戻り値に基づいて、実行すべき行動を選択し、該当するスキーマのＡｃｔｉｏｎ関数をコールし、あるいは状態管理部に記憶されているスキーマの状態の移行を指示する。例えば行動誘発評価値としての活動度レベルが最も高いスキーマを選択したり、リソースが競合しないように優先順位に従って複数のスキーマを選択したりする。また、行動状態制御部は、より優先順位の高いスキーマが起動し、リソースの競合が生じた場合、優先順位が下位のスキーマの状態をＡＣＴＩＶＥからＳＬＥＥＰに退避させ、競合状態が解かれるとＡＣＴＩＶＥに回復するなど、スキーマの状態を制御する。 In the example shown in FIG. 20, a behavior state control unit (tentative name) that manages a schema, that is, a behavior module, selects an action to be executed based on a return value from the Monitor function, and calls an action function of the corresponding schema. Alternatively, the transition of the schema state stored in the state management unit is instructed. For example, the schema having the highest activity level as the action induction evaluation value is selected, or a plurality of schemas are selected according to the priority order so that resources do not compete. In addition, when a schema having a higher priority is activated and resource conflict occurs, the behavior state control unit saves the state of the schema having a lower priority from ACTIVE to SLEEP, and when the conflict state is solved, the behavior state control unit changes to ACTIVE. Control schema state, such as recovery.

行動状態制御部は、図２２に示すように、状況依存行動階層１０８において１つだけ配設して、同階層１０８を構成するすべてのスキーマを一元的に集中管理するようにしてもよい。 As shown in FIG. 22, only one behavior state control unit may be provided in the situation-dependent behavior hierarchy 108 so that all schemas constituting the hierarchy 108 are centrally managed.

図示の例では、行動状態制御部は、行動評価部と、行動選択部と、行動実行部を備えている。行動評価部は、例えば所定の制御周期で各スキーマのＭｏｎｉｔｏｒ関数をコールして、各々の活動度レベルと使用リソースを取得する。行動選択部は、各スキーマによる行動制御と機体リソースの管理を行なう。例えば、集計された活動度レベルの高い順にスキーマを選択するとともに、使用リソースが競合しないように２以上のスキーマを同時に選択する。行動実行部は、選択されたスキーマのＡｃｔｉｏｎ関数に行動実行命令を発行したり、スキーマの状態（ＲＥＡＤＹ、ＡＣＴＩＶＥ，ＳＬＥＥＰ）を管理して、スキーマの実行を制御する。例えば、より優先順位の高いスキーマが起動し、リソースの競合が生じた場合、優先順位が下位のスキーマの状態をＡＣＴＩＶＥからＳＬＥＥＰに退避させ、競合状態が解かれるとＡＣＴＩＶＥに回復する。 In the illustrated example, the behavior state control unit includes a behavior evaluation unit, a behavior selection unit, and a behavior execution unit. The behavior evaluation unit calls the Monitor function of each schema, for example at a predetermined control cycle, and obtains each schema's activity level and used resources. The behavior selection unit performs behavior control by the schemas and manages the resources of the robot body. For example, it selects schemas in descending order of aggregated activity level, and selects two or more schemas simultaneously as long as their used resources do not conflict. The behavior execution unit issues behavior execution commands to the Action function of each selected schema and manages the schema states (READY, ACTIVE, SLEEP) to control schema execution. For example, when a schema with a higher priority is activated and a resource conflict occurs, the state of the lower-priority schema is moved from ACTIVE to SLEEP, and it is restored to ACTIVE once the conflict is resolved.
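The preemption behavior of the behavior execution unit can be illustrated with a small C++ sketch of one control cycle. The `SchemaSlot` record, the integer priority, the `wantsToRun` flag standing in for the outcome of evaluation, and the function `arbitrate` are all hypothetical names introduced for this example; the specification does not define them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

enum class SchemaState { READY, ACTIVE, SLEEP };

// Hypothetical bookkeeping record for one schema.
struct SchemaSlot {
    int priority;            // assumption: larger value means higher priority
    std::uint32_t resources; // assumed bitmask of body resources the action uses
    bool wantsToRun;         // assumed outcome of evaluation (e.g. AL high enough)
    SchemaState state;
};

// One control cycle: grant resources in descending priority order.
// An ACTIVE schema that loses its resources is moved to SLEEP, and a SLEEP
// schema is restored to ACTIVE as soon as its resources are free again.
inline void arbitrate(std::vector<SchemaSlot>& slots) {
    std::vector<std::size_t> order(slots.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
        [&](std::size_t a, std::size_t b) { return slots[a].priority > slots[b].priority; });

    std::uint32_t used = 0;
    for (std::size_t i : order) {
        SchemaSlot& s = slots[i];
        const bool conflict = (s.resources & used) != 0;
        if (s.wantsToRun && !conflict) {
            used |= s.resources;
            s.state = SchemaState::ACTIVE;  // start, continue, or resume
        } else if (s.state == SchemaState::ACTIVE) {
            // Preempted schemas sleep; schemas that no longer want to run go back to READY.
            s.state = s.wantsToRun ? SchemaState::SLEEP : SchemaState::READY;
        }
    }
}
```

Calling `arbitrate` once per control cycle reproduces the sequence in the text: a low-priority schema runs, is moved to SLEEP when a higher-priority schema claims the same resource, and returns to ACTIVE when that schema stops.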

あるいは、このような行動状態制御部の機能を、状況依存行動階層１０８内のスキーマ毎に配置するようにしてもよい。例えば、図１９に示したように，スキーマがツリー構造を形成している場合（図２３を参照のこと）、上位（親）のスキーマの行動状態制御は、外部刺激と内部状態を引数として下位（子供）のスキーマのＭｏｎｉｔｏｒ関数をコールし、子供のスキーマから活動度レベルと使用リソースを返り値として受け取る。また、子供のスキーマは、自分の活動度レベルと使用リソースを算出するために、さらに子供のスキーマのＭｏｎｉｔｏｒ関数をコールする。そして、ルートのスキーマの行動状態制御部には、各サブツリーからの活動度レベルと使用リソースが返されるので、外部刺激と内部状態の変化に応じた最適なスキーマすなわち行動を統合的に判断して、Ａｃｔｉｏｎ関数をコールして、子供スキーマの実行を開始したり中断させたりする。 Alternatively, the function of such a behavior state control unit may be placed in each schema within the situation-dependent action hierarchy 108. For example, when the schemas form a tree structure as shown in FIG. 19 (see FIG. 23), the behavior state control of an upper (parent) schema calls the Monitor function of a lower (child) schema with the external stimulus and the internal state as arguments, and receives the activity level and the used resources from the child schema as return values. The child schema, in turn, calls the Monitor functions of its own child schemas in order to calculate its activity level and used resources. The activity levels and used resources from each subtree are thus returned to the behavior state control unit of the root schema, which determines in an integrated manner the optimum schema, that is, the action, for the external stimulus and the change in internal state, and calls the Action function to start or interrupt execution of the child schemas.

図２４には、状況依存行動階層１０８において通常の状況依存行動を制御するためのメカニズムを模式的に示している。 FIG. 24 schematically shows a mechanism for controlling a normal situation-dependent action in the situation-dependent action hierarchy 108.

同図に示すように、状況依存行動階層１０８には、短期記憶部１０５から外部刺激が入力（Ｎｏｔｉｆｙ）されるとともに、内部状態管理部１０９から内部状態の変化が入力される。状況依存行動階層１０８は、例えば動物行動学的（Ｅｔｈｏｌｏｇｉｃａｌ）な状況依存行動を数式化した行動モデルや、感情表現を実行するためのサブツリーなど、複数のサブツリーで構成されており、ルート・スキーマは、外部刺激の通知（Ｎｏｔｉｆｙ）に応答して、各サブツリーのｍｏｎｉｔｏｒ関数をコールし、その返り値としての活動度レベル（ＡＬ値）を参照して、統合的な行動選択を行ない、選択された行動を実現するサブツリーに対してａｃｔｉｏｎ関数をコールする。また、状況依存行動階層１０８において決定された状況依存行動は、リソース・マネージャにより反射行動部１０９による反射的行動とのハードウェア・リソースの競合の調停を経て、機体動作（ＭｏｔｉｏｎＣｏｎｔｒｏｌｌｅｒ）に適用される。 As shown in the figure, external stimuli are input (Notify) to the situation-dependent action hierarchy 108 from the short-term storage unit 105, and changes in the internal state are input from the internal state management unit 109. The situation-dependent action hierarchy 108 is composed of a plurality of subtrees, such as a behavior model that formulates ethological situation-dependent behavior and a subtree for executing emotional expression. In response to the notification (Notify) of an external stimulus, the root schema calls the monitor function of each subtree, refers to the activity level (AL value) returned, makes an integrated action selection, and calls the action function on the subtree that realizes the selected action. The situation-dependent action determined in the situation-dependent action hierarchy 108 is applied to the body motion (MotionController) after the resource manager arbitrates hardware resource conflicts with the reflexive behaviors of the reflex action unit 109.

また、状況依存行動層１０８のうち、反射的行動部１０９は、上述した認識系の各オブジェクトによって認識された外部刺激に応じて反射的・直接的な機体動作を実行する（例えば、障害物の検出により咄嗟に避ける）。このため、通常の状況依存行動を制御する場合（図１９）とは相違し、認識系の各オブジェクトからの信号を直接入力する複数のスキーマが、階層化されずに並列的に配置されている。 Within the situation-dependent action hierarchy 108, the reflex action unit 109 executes reflexive, direct body motions in response to external stimuli recognized by the objects of the recognition system described above (for example, instantly dodging when an obstacle is detected). For this reason, unlike the case of controlling normal situation-dependent actions (FIG. 19), a plurality of schemas that directly receive signals from the objects of the recognition system are arranged in parallel rather than hierarchically.

図２５には、反射行動部１０９におけるスキーマの構成を模式的に示している。同図に示すように、反射行動部１０９には、聴覚系の認識結果に応答して動作するスキーマとして「ＡｖｏｉｄＢｉｇＳｏｕｎｄ」、「ＦａｃｅｔｏＢｉｇＳｏｕｎｄ」及び「ＮｏｄｄｉｎｇＳｏｕｎｄ」、視覚系の認識結果に応答して動作するスキーマとして「ＦａｃｅｔｏＭｏｖｉｎｇＯｂｊｅｃｔ」及び「ＡｖｏｉｄＭｏｖｉｎｇＯｂｊｅｃｔ」、並びに、触覚系の認識結果に応答して動作するスキーマとして「手を引っ込める」が、それぞれ対等な立場で（並列的に）配設されている。 FIG. 25 schematically shows the configuration of schemas in the reflex action unit 109. As shown in the figure, the reflex action unit 109 contains “AvoidBigSound”, “FacetoBigSound”, and “NoddingSound” as schemas that operate in response to recognition results of the auditory system; “FacetoMovingObject” and “AvoidMovingObject” as schemas that operate in response to recognition results of the visual system; and “pull back the hand” as a schema that operates in response to recognition results of the tactile system. These schemas are arranged on an equal footing (in parallel).

図示の通り、反射的行動を行なう各スキーマは外部刺激を入力に持つ。また、各スキーマは、少なくともｍｏｎｉｔｏｒ関数とａｃｔｉｏｎ関数を備えている。ｍｏｎｉｔｏｒ関数は、外部刺激に応じて当該スキーマのＡＬ値を算出して、これに応じて該当する反射的行動を発現すべきかどうかが判断される。また、ａｃｔｉｏｎ関数は、スキーマ自身が持つ反射的行動を記述したステートマシン（後述）を備えており、コールされることにより、該当する反射的行動を発現するとともにａｃｔｉｏｎの状態を遷移させていく。 As shown, each schema that performs reflex behavior has an external stimulus as input. Each schema has at least a monitor function and an action function. The monitor function calculates the AL value of the schema in accordance with the external stimulus, and determines whether or not the corresponding reflex behavior should be expressed according to the calculated AL value. The action function includes a state machine (described later) describing the reflexive behavior of the schema itself. When called, the action function expresses the corresponding reflexive behavior and changes the state of the action.

図２６には、反射行動部１０９において反射的行動を制御するためのメカニズムを模式的に示している。 FIG. 26 schematically shows a mechanism for controlling the reflex behavior in the reflex behavior unit 109.

図２５にも示したように、反射行動部１０９内には、反応行動を記述したスキーマや、即時的な応答行動を記述したスキーマが並列的に存在している。認識系のオブジェクトから認識結果が入力されると、対応する反射行動スキーマがｍｏｎｉｔｏｒ関数によりＡＬ値を算出し、その値に応じてａｃｔｉｏｎを起動すべきかどうかが判断される。そして、反射行動部１０９において起動が決定された反射的行動は、リソース・マネージャにより反射行動部１０９による反射的行動とのハードウェア・リソースの競合の調停を経て、機体動作（ＭｏｔｉｏｎＣｏｎｔｒｏｌｌｅｒ）に適用される。 As also shown in FIG. 25, schemas describing reactive behaviors and schemas describing immediate response behaviors exist in parallel in the reflex action unit 109. When a recognition result is input from an object of the recognition system, the corresponding reflex action schema calculates an AL value with its monitor function, and whether the action should be activated is determined according to that value. A reflexive behavior whose activation has been decided in the reflex action unit 109 is then applied to the body motion (MotionController) after the resource manager arbitrates hardware resource conflicts with the reflexive behaviors of the reflex action unit 109.

状況依存行動階層１０８（反射行動部１０９を含む）を構成するスキーマは、例えばＣ⁺⁺言語ベースで記述される「クラス・オブジェクト」として記述することができる。図２７には、状況依存行動階層１０８において使用されるスキーマのクラス定義を模式的に示している。同図に示されている各ブロックはそれぞれ１つのクラス・オブジェクトに相当する。 The schema constituting the situation-dependent action hierarchy 108 (including the reflex action part 109) can be described as, for example, a “class object” described in a C ⁺⁺ language base. FIG. 27 schematically shows a schema class definition used in the situation-dependent behavior hierarchy 108. Each block shown in the figure corresponds to one class object.

図示の通り、状況依存行動階層（ＳＢＬ）１０８は、１以上のスキーマと、ＳＢＬの入出力イベントに対してＩＤを割り振るＥｖｅｎｔＤａｔａＨａｎｄｌｅｒ（ＥＤＨ）と、ＳＢＬ内のスキーマを管理するＳｃｈｅｍａＨａｎｄｌｅｒ（ＳＨ）と、外部オブジェクト（ＳＴＭやＬＴＭ、リソース・マネージャ、認識系の各オブジェクトなど）からデータを受信する１以上のＲｅｃｅｉｖｅＤａｔａＨａｎｄｌｅｒ（ＲＤＨ）と、外部オブジェクトにデータを送信する１以上のＳｅｎｄＤａｔａＨａｎｄｌｅｒ（ＳＤＨ）とを備えている。 As shown in the figure, the situation-dependent behavior hierarchy (SBL) 108 includes one or more schemas, an EventDataHandler (EDH) that allocates an ID to an SBL input / output event, a SchemaHandler (SH) that manages the schema in the SBL, 1 or more ReceiveDataHandler (RDH) which receives data from external objects (STM, LTM, resource manager, each object of recognition system, etc.) and 1 or more SendDataHandler (SDH) which transmits data to an external object Yes.

ＥｖｅｎｔＤａｔａＨａｎｄｌｅｒ（ＥＤＨ）は、ＳＢＬの入出力イベントに対してＩＤを割り振るためのクラス・オブジェクトであり、ＲＤＨやＳＤＨから入出力イベントの通知を受ける。 EventDataHandler (EDH) is a class object for allocating IDs for SBL input / output events, and receives notification of input / output events from RDH and SDH.

ＳｃｈｅｍａＨａｎｄｌｅｒは、状況依存行動階層（ＳＢＬ）１０８や反射行動部１０９を構成する各スキーマやツリー構造などの情報（ＳＢＬのコンフィギュレーション情報）をファイルとして保管している。例えばシステムの起動時などに、ＳｃｈｅｍａＨａｎｄｌｅｒは、このコンフィギュレーション情報ファイルを読み込んで、図１９に示したような状況依存行動階層１０８のスキーマ構成を構築（再現）して、メモリ空間上に各スキーマのエンティティをマッピングする。 SchemaHandler stores information (SBL configuration information) such as each schema and tree structure constituting the situation-dependent action hierarchy (SBL) 108 and the reflex action part 109. For example, when the system is started up, SchemaHandler reads this configuration information file, constructs (reproduces) the schema structure of the situation-dependent action hierarchy 108 as shown in FIG. 19, and stores each schema in the memory space. Map entities.

各スキーマは、スキーマのベースとして位置付けられるＯｐｅｎＲ_Ｇｕｅｓｔを備えている。ＯｐｅｎＲ_Ｇｕｅｓｔは、スキーマが外部にデータを送信するためのＤｓｕｂｊｅｃｔ、並びに、スキーマが外部からデータを受信するためのＤＯｂｊｅｃｔというクラス・オブジェクトをそれぞれ１以上備えている。例えば、スキーマが、ＳＢＬの外部オブジェクト（ＳＴＭやＬＴＭ、認識系の各オブジェクトなど）にデータを送るときには、ＤｓｕｂｊｅｃｔはＳｅｎｄＤａｔａＨａｎｄｌｅｒに送信データを書き込む。また、ＤＯｂｊｅｃｔは、ＳＢＬの外部オブジェクトから受信したデータをＲｅｃｅｉｖｅＤａｔａＨａｎｄｌｅｒから読み取ることができる。 Each schema has OpenR_Guest, which is positioned as the base of the schema. OpenR_Guest has one or more each of the class objects Dsubject, through which the schema sends data to the outside, and DObject, through which the schema receives data from the outside. For example, when a schema sends data to an object external to the SBL (STM, LTM, the objects of the recognition system, and so on), Dsubject writes the transmission data to SendDataHandler. DObject, in turn, can read data received from objects external to the SBL from ReceiveDataHandler.

ＳｃｈｅｍａＭａｎａｇｅｒ及びＳｃｈｅｍａＢａｓｅは、ともにＯｐｅｎＲ_Ｇｕｅｓｔを継承したクラス・オブジェクトである。クラス継承は、元のクラスの定義を受け継ぐことであり、この場合、ＯｐｅｎＲ_Ｇｕｅｓｔで定義されているＤｓｕｂｊｅｃｔやＤＯｂｊｅｃｔなどのクラス・オブジェクトをＳｃｈｅｍａＭａｎａｇｅｒやＳｃｈｅｍａＢａｓｅも備えていることを意味する（以下、同様）。例えば図１９に示すように複数のスキーマがツリー構造になっている場合、ＳｃｈｅｍａＭａｎａｇｅｒは、子供のスキーマのリストを管理するクラス・オブジェクトＳｃｈｅｍａＬｉｓｔを持ち（子供のスキーマへのポインタを持ち）、子供スキーマの関数をコールすることができる。また、ＳｃｈｅｍａＢａｓｅは、親スキーマへのポインタを持ち、親スキーマからコールされた関数の返り値を戻すことができる。 SchemaManager and SchemaBase are both class objects that inherit OpenR_Guest. Class inheritance is inheriting the definition of the original class. In this case, it means that SchemaManager and SchemaBase are also provided with class objects such as Dsubject and DOObject defined in OpenR_Guest (the same applies hereinafter). For example, as shown in FIG. 19, when a plurality of schemas have a tree structure, SchemaManager has a class object SchemaList (which has a pointer to a child schema) that manages a list of child schemas, and has a child schema. A function can be called. SchemaBase has a pointer to the parent schema and can return a return value of a function called from the parent schema.

ＳｃｈｅｍａＢａｓｅは、ＳｔａｔｅＭａｃｈｉｎｅ及びＰｒｏｎｏｍｅという２つのクラス・オブジェクトを持つ。ＳｔａｔｅＭａｃｈｉｎｅは当該スキーマの行動（Ａｃｔｉｏｎ関数）についてのステートマシンを管理している。図２８には、スキーマの行動（Ａｃｔｉｏｎ関数）についてのステートマシンを図解している。このステートマシンの状態間の遷移にそれぞれ行動（Ａｃｔｉｏｎ）が紐付けされている SchemaBase has two class objects, StateMachine and Pronome. StateMachine manages a state machine for behavior (Action function) of the schema. FIG. 28 illustrates a state machine for schema behavior (Action function). Actions are associated with the transitions between the states of the state machine.

親スキーマは子供スキーマのＡｃｔｉｏｎ関数のステートマシンを切り替える（状態遷移させる）ことができる。また、Ｐｒｏｎｏｍｅには、当該スキーマが行動（Ａｃｔｉｏｎ関数）を実行又は適用するターゲットを代入する。後述するように、スキーマはＰｒｏｎｏｍｅに代入されたターゲットによって占有され、行動が終了（完結、異常終了など）するまでスキーマは解放されない。新規のターゲットのために同じ行動を実行するためには同じクラス定義のスキーマをメモリ空間上に生成する。この結果、同じスキーマをターゲット毎に独立して実行することができ（個々のスキーマの作業データが干渉し合うことはなく）、行動のＲｅｅｎｔｒａｎｃｅ性（後述）が確保される。 A parent schema can switch the state machine of a child schema's Action function (cause a state transition). The target on which the schema executes, or to which it applies, its action (Action function) is assigned to Pronome. As described later, a schema is occupied by the target assigned to its Pronome and is not released until the action ends (completion, abnormal termination, and so on). To execute the same action for a new target, a schema of the same class definition is generated in the memory space. As a result, the same schema can be executed independently for each target (the working data of the individual schemas do not interfere with one another), which ensures the Reentrance property of actions (described later).

ＰａｒｅｎｔＳｃｈｅｍａＢａｓｅは、ＳｃｈｅｍａＭａｎａｇｅｒ及びＳｃｈｅｍａＢａｓｅを多重継承するクラス・オブジェクトであり、スキーマのツリー構造において、当該スキーマ自身についての親スキーマ及び子供スキーマすなわち親子関係を管理する。 ParentSchemaBase is a class object that inherits multiple of SchemaManager and SchemaBase, and manages a parent schema and a child schema, that is, a parent-child relationship for the schema itself in the tree structure of the schema.

ＩｎｔｅｒｍｅｄｉａＰａｒｅｎｔＳｃｈｅｍａＢａｓｅは、ＰａｒｅｎｔＳｃｈｅｍａＢａｓｅを継承するクラス・オブジェクトであり、各クラスのためのインターフェース変換を実現する。また、ＩｎｔｅｒｍｅｄｉａＰａｒｅｎｔＳｃｈｅｍａＢａｓｅは、ＳｃｈｅｍａＳｔａｔｕｓＩｎｆｏを持つ。このＳｃｈｅｍａＳｔａｔｕｓＩｎｆｏは、当該スキーマ自身のステートマシンを管理するクラス・オブジェクトである。 IntermediaParentSchemaBase is a class object that inherits ParentSchemaBase and implements interface conversion for each class. In addition, IntermediaParentSchemaBase has SchemaStatusInfo. This SchemaStatusInfo is a class object that manages the state machine of the schema itself.

親スキーマは、子供スキーマのＡｃｔｉｏｎ関数をコールすることによってそのステートマシンの状態を切り換えることができる。また、子供スキーマのＭｏｎｉｔｏｒ関数をコールしてそのステートマシンの状態に応じたＡＬ値を問うことができる。但し、スキーマのステートマシンは、前述したＡｃｔｉｏｎ関数のステートマシンとは異なるということを留意されたい。 A parent schema can switch the state of a child schema's state machine by calling the child's Action function. It can also call the child schema's Monitor function to query the AL value corresponding to the state of that state machine. Note, however, that the state machine of the schema is different from the state machine of the Action function described above.

図２９には、スキーマ自身すなわちＡｃｔｉｏｎ関数によって記述されている行動についてのステートマシンを図解している。既に述べたように、スキーマ自身のステートマシンは、Ａｃｔｉｏｎ関数に寄って記述されている行動について、ＲＥＡＤＹ（準備完了）、ＡＣＴＩＶＥ（活動中），ＳＬＥＥＰ（待機中）という３つの状態を規定している。より優先順位の高いスキーマが起動し、リソースの競合が生じた場合、優先順位が下位のスキーマの状態をＡＣＴＩＶＥからＳＬＥＥＰに退避させ、競合状態が解かれるとＡＣＴＩＶＥに回復する。 FIG. 29 illustrates a state machine for actions described by a schema itself, that is, an Action function. As already mentioned, the state machine of the schema itself defines three states of READY (ready), ACTIVE (active), and SLEEP (standby) for actions described by the Action function. Yes. When a schema with a higher priority is activated and resource conflict occurs, the state of the schema with the lower priority is saved from ACTIVE to SLEEP, and when the conflict state is solved, it is restored to ACTIVE.

図２９に示すように、ＡＣＴＩＶＥからＳＬＥＥＰへの状態遷移にＡＣＴＩＶＥ_ＴＯ_ＳＬＥＥＰが、ＳＬＥＥＰからＡＣＴＩＶＥへの状態遷移にＳＬＥＥＰ_ＴＯ_ＡＣＴＩＶＥがそれぞれ規定されている。本実施形態において特徴的なのは、 As shown in FIG. 29, ACTIVE_TO_SLEEP is defined for the state transition from ACTIVE to SLEEP, and SLEEP_TO_ACTIVE is defined for the state transition from SLEEP to ACTIVE. What is characteristic in this embodiment is

（１）ＡＣＴＩＶＥ_ＴＯ_ＳＬＥＥＰに、後にＡＣＴＩＶＥに遷移して再開するために必要なデータ（コンテキスト）を保存するための処理と、ＳＬＥＥＰするために必要な行動が紐付けされている。
（２）ＳＬＥＥＰ_ＴＯ_ＡＣＴＩＶＥに、保存しておいたデータ（コンテキスト）を復元するための処理と、ＡＣＴＩＶＥに戻るために必要な行動が紐付けされている。
(1) ACTIVE_TO_SLEEP is associated with a process for saving the data (context) needed to resume after a later transition back to ACTIVE, and with the action necessary for going to SLEEP.
(2) SLEEP_TO_ACTIVE is associated with a process for restoring the saved data (context), and with the action necessary for returning to ACTIVE.

という点である。ＳＬＥＥＰするために必要な行動とは、例えば、話し相手に休止を告げる「ちょっと待っててね」などのセリフを言う行動（その他、身振り手振りが加わっていてもよい）である。また、ＡＣＴＩＶＥに戻るために必要な行動とは、例えば、話し相手に謝意を表わす「お待たせ」などのセリフを言う行動（その他、身振り手振りが加わっていてもよい）である。 These two points are what characterize the present embodiment. The action necessary for going to SLEEP is, for example, speaking a line such as “Please wait a moment” that tells the conversation partner about the pause (gestures may accompany it). The action necessary for returning to ACTIVE is, for example, speaking a line such as “Thanks for waiting” that expresses gratitude to the conversation partner (gestures may accompany it).
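The pairing of context handling with a transitional action on ACTIVE_TO_SLEEP and SLEEP_TO_ACTIVE can be sketched as below. The class `TalkSchema`, its single progress counter standing in for the working data (context), and the recorded utterances are hypothetical and serve only to illustrate the idea; the specification does not prescribe this structure.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class SchemaState { READY, ACTIVE, SLEEP };

// Hypothetical conversational schema whose working data is one progress counter.
class TalkSchema {
public:
    void start() {                       // READY -> ACTIVE
        assert(state_ == SchemaState::READY);
        progress_ = 0;
        state_ = SchemaState::ACTIVE;
    }
    void step() {                        // the action advances only while ACTIVE
        if (state_ == SchemaState::ACTIVE) ++progress_;
    }
    void activeToSleep() {               // ACTIVE_TO_SLEEP
        assert(state_ == SchemaState::ACTIVE);
        savedProgress_ = progress_;                      // save the context
        utterances_.push_back("Please wait a moment.");  // action needed to SLEEP
        progress_ = 0;                                   // working data is given up
        state_ = SchemaState::SLEEP;
    }
    void sleepToActive() {               // SLEEP_TO_ACTIVE
        assert(state_ == SchemaState::SLEEP);
        progress_ = savedProgress_;                      // restore the context
        utterances_.push_back("Thanks for waiting.");    // action needed to resume
        state_ = SchemaState::ACTIVE;
    }
    SchemaState state() const { return state_; }
    int progress() const { return progress_; }
    const std::vector<std::string>& utterances() const { return utterances_; }

private:
    SchemaState state_ = SchemaState::READY;
    int progress_ = 0;
    int savedProgress_ = 0;
    std::vector<std::string> utterances_;
};
```

Because the context is saved on the way into SLEEP and restored on the way out, the schema continues from the step at which it was interrupted rather than starting over.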

ＡｎｄＰａｒｅｎｔＳｃｈｅｍａ、ＮｕｍＯｒＰａｒｅｎｔＳｃｈｅｍａ、ＯｒＰａｒｅｎｔＳｃｈｅｍａは、ＩｎｔｅｒｍｅｄｉａＰａｒｅｎｔＳｃｈｅｍａＢａｓｅを継承するクラス・オブジェクトである。ＡｎｄＰａｒｅｎｔＳｃｈｅｍａは、同時実行する複数の子供スキーマへのポインタを持つ。ＯｒＰａｒｅｎｔＳｃｈｅｍａは、いずれか択一的に実行する複数の子供スキーマへのポインタを持つ。また、ＮｕｍＯｒＰａｒｅｎｔＳｃｈｅｍａは、所定数のみを同時実行する複数の子供スキーマへのポインタを持つ。 AndParentSchema, NumOrParentSchema, and OrParentSchema are class objects that inherit IntermediaParentSchemaBase. AndParentSchema has pointers to a plurality of child schemas that are executed simultaneously. OrParentSchema has pointers to a plurality of child schemas of which exactly one is executed. NumOrParentSchema has pointers to a plurality of child schemas of which only a predetermined number are executed simultaneously.
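The difference between the three parent types can be illustrated with the following C++ sketch, which returns the indices of the children that a parent of each kind would execute. The function `selectChildren`, the `ParentKind` enumeration, and in particular the use of AL values to rank the children are assumptions of this example; the text above states only how many children each parent type executes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

enum class ParentKind { And, Or, NumOr };

// Returns the indices of the children to execute.
//   And   : all children, in their original order
//   Or    : exactly one child (assumed here: the one with the highest AL)
//   NumOr : at most `limit` children (assumed here: those with the highest AL)
inline std::vector<std::size_t> selectChildren(ParentKind kind,
                                               const std::vector<double>& childAl,
                                               std::size_t limit = 1) {
    std::vector<std::size_t> idx(childAl.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    if (kind == ParentKind::And) return idx;

    // Rank the children by descending AL; ties keep their original order.
    std::stable_sort(idx.begin(), idx.end(),
        [&](std::size_t a, std::size_t b) { return childAl[a] > childAl[b]; });
    const std::size_t keep = (kind == ParentKind::Or) ? 1 : limit;
    if (idx.size() > keep) idx.resize(keep);
    return idx;
}
```

For child AL values {0.2, 0.9, 0.5}, the And kind yields all three children, the Or kind yields only the second child, and the NumOr kind with a limit of 2 yields the second and third children.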

ＰａｒｅｎｔＳｃｈｅｍａは、これらＡｎｄＰａｒｅｎｔＳｃｈｅｍａ、ＮｕｍＯｒＰａｒｅｎｔＳｃｈｅｍａ、ＯｒＰａｒｅｎｔＳｃｈｅｍａを多重継承するクラス・オブジェクトである。 ParentSchema is a class object that inherits these AndParentSchema, NumOrParentSchema, and OrParentSchema multiple times.

図３０には、状況依存行動階層（ＳＢＬ）１０８内のクラスの機能的構成を模式的に示している。 FIG. 30 schematically shows a functional configuration of classes in the situation-dependent action hierarchy (SBL) 108.

状況依存行動階層（ＳＢＬ）１０８は、ＳＴＭやＬＴＭ、リソース・マネージャ、認識系の各オブジェクトなど外部オブジェクトからデータを受信する１以上のＲｅｃｅｉｖｅＤａｔａＨａｎｄｌｅｒ（ＲＤＨ）と、外部オブジェクトにデータを送信する１以上のＳｅｎｄＤａｔａＨａｎｄｌｅｒ（ＳＤＨ）とを備えている。 The Situation Dependent Action Hierarchy (SBL) 108 includes one or more Receive Data Handlers (RDH) that receive data from external objects such as STM, LTM, resource manager, and recognition system objects, and one or more that transmit data to external objects. SendDataHandler (SDH).

ＳｃｈｅｍａＨａｎｄｌｅｒは、スキーマを管理するためのクラス・オブジェクトであり、ＳＢＬを構成するスキーマのコンフィギュレーション情報をファイルとして保管している。例えばシステムの起動時などに、ＳｃｈｅｍａＨａｎｄｌｅｒは、このコンフィギュレーション情報ファイルを読み込んで、ＳＢＬ内のスキーマ構成を構築する。 The SchemaHandler is a class object for managing the schema, and stores configuration information of the schema constituting the SBL as a file. For example, when the system is started up, SchemaHandler reads this configuration information file and constructs a schema configuration in the SBL.

各スキーマは、図２７に示したクラス定義に従って生成され、メモリ空間上にエンティティがマッピングされる。各スキーマは、ＯｐｅｎＲ_Ｇｕｅｓｔをベースのクラス・オブジェクトとし、外部にデータ・アクセスするためのＤＳｕｂｊｅｃｔやＤＯｂｊｅｃｔなどのクラス・オブジェクトを備えている。 Each schema is generated according to the class definition shown in FIG. 27, and entities are mapped on the memory space. Each schema uses OpenR_Guest as a base class object, and includes class objects such as DSubject and DOObject for data access to the outside.

スキーマが主に持つ関数とステートマシンを以下に示しておく。 The functions and state machines that the schema has are shown below.

ＡｃｔｉｖａｔｉｏｎＭｏｎｉｔｏｒ（）：スキーマがＲｅａｄｙ時にＡｃｔｉｖｅになるための評価関数。
Ａｃｔｉｏｎｓ（）：Ａｃｔｉｖｅ時の実行用ステートマシン。
Ｇｏａｌ（）：Ａｃｔｉｖｅ時にスキーマがＧｏａｌに達したかを評価する関数。
Ｇｏａｌ（）：Ａｃｔｉｖｅ時にスキーマがｆａｉｌ状態かを判定する関数。
ＳｌｅｅｐＡｃｔｉｏｎｓ（）：Ｓｌｅｅｐ前に実行されるステートマシン。
ＳｌｅｅｐＭｏｎｉｔｏｒ（）：Ｓｌｅｅｐ時にＲｅｓｕｍｅするための評価関数。
ＲｅｓｕｍｅＡｃｔｉｏｎｓ（）：Ｒｅｓｕｍｅ前にＲｅｓｕｍｅするためのステートマシン。
ＤｅｓｔｒｏｙＭｏｎｉｔｏｒ（）：Ｓｌｅｅｐ時にスキーマがｆａｉｌ状態か判定する評価関数。
ＭａｋｅＰｒｏｎｏｍｅ（）：ツリー全体のターゲットを決定する関数である。
ActivationMonitor(): Evaluation function for a schema in the READY state to become ACTIVE.
Actions(): State machine executed while the schema is ACTIVE.
Goal(): Function that evaluates whether the schema has reached its goal while ACTIVE.
Goal(): Function that determines whether the schema is in a fail state while ACTIVE.
SleepActions(): State machine executed before entering SLEEP.
SleepMonitor(): Evaluation function for resuming while in SLEEP.
ResumeActions(): State machine executed before resuming, in order to resume.
DestroyMonitor(): Evaluation function that determines whether the schema is in a fail state while in SLEEP.
MakePronome(): Function that determines the target of the entire tree.

これらの関数は、ＳｃｈｅｍａＢａｓｅで記述されている。 These functions are described in SchemaBase.

図３１には、ＭａｋｅＰｒｏｎｏｍｅ関数を実行する処理手順をフローチャートの形式で示している。 FIG. 31 shows a processing procedure for executing the MakePronome function in the form of a flowchart.

スキーマのＭａｋｅＰｒｏｎｏｍｅ関数がコールされると、まず、スキーマ自身に子供スキーマが存在するかどうかを判別する（ステップＳ１）。 When the MakePronome function of a schema is called, it is first determined whether the schema itself has any child schemas (step S1).

子供スキーマが存在する場合には、同様にすべての子供スキーマのＭａｋｅＰｒｏｎｏｍｅ関数を再帰的にコールする（ステップＳ２）。 If child schemas exist, the MakePronome function of every child schema is likewise called recursively (step S2).

そして、スキーマ自身のＭａｋｅＰｒｏｎｏｍｅを実行して、Ｐｒｏｎｏｍｅオブジェクトにターゲットが代入される（ステップＳ３）。 Then, MakePronome of the schema itself is executed, and the target is assigned to the Pronome object (step S3).

この結果、自分以下のすべてのスキーマのＰｒｏｎｏｍｅに対して同じターゲットが代入され、行動が終了（完結、異常終了など）するまでスキーマは解放されない。新規のターゲットのために同じ行動を実行するためには同じクラス定義のスキーマをメモリ空間上に生成する。 As a result, the same target is assigned to the Pronome of the schema and of every schema below it, and the schemas are not released until the action ends (completion, abnormal termination, and so on). To execute the same action for a new target, a schema of the same class definition is generated in the memory space.
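Steps S1 to S3 and the generation of a fresh schema for a new target can be sketched in C++ as follows. The `Schema` class shown here, its string-valued `pronome` member, and the `clone` helper are hypothetical simplifications introduced for illustration; they are not the class definitions of FIG. 27.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified schema node holding only a name, a target, and children.
class Schema {
public:
    explicit Schema(const std::string& name) : name_(name) {}

    Schema& addChild(const std::string& name) {
        children_.push_back(std::unique_ptr<Schema>(new Schema(name)));
        return *children_.back();
    }

    // S1/S2: if there are children, call MakePronome on each of them recursively.
    // S3: then assign the target to this schema's own Pronome.
    void makePronome(const std::string& target) {
        for (auto& child : children_) child->makePronome(target);
        pronome_ = target;
    }

    // A new target gets its own instance of the same definition (empty Pronome),
    // so the working data of the two instances cannot interfere.
    std::unique_ptr<Schema> clone() const {
        std::unique_ptr<Schema> copy(new Schema(name_));
        for (const auto& child : children_) copy->children_.push_back(child->clone());
        return copy;
    }

    const std::string& pronome() const { return pronome_; }
    const Schema& child(std::size_t i) const { return *children_[i]; }

private:
    std::string name_;
    std::string pronome_;  // target occupying this schema ("" while unoccupied)
    std::vector<std::unique_ptr<Schema>> children_;
};
```

In this sketch, calling `makePronome("ball")` on a "Play" schema assigns the ball to every schema beneath it, while a clone of the tree can be given a different target without disturbing the first instance.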

図３２には、Ｍｏｎｉｔｏｒ関数を実行する処理手順をフローチャートの形式で示している。 FIG. 32 shows a processing procedure for executing the Monitor function in the form of a flowchart.

まず、評価フラグ（ＡｓｓｅｓｓｓｍｅｎｔＦｌａｇ）をオンに設定して（ステップＳ１１）、スキーマ自身のＡｃｔｉｏｎを実行する（ステップＳ１２）。このとき、子供スキーマの選定も行なう。そして、評価フラグをオフに戻す（ステップＳ１３）。 First, the evaluation flag (AssessmentFlag) is set to ON (step S11), and the action of the schema itself is executed (step S12). At this time, the child schema is also selected. Then, the evaluation flag is turned off (step S13).

子供スキーマが存在する場合には（ステップＳ１４）、ステップＳ１２において選択した子供スキーマのＭｏｎｉｔｏｒ関数を再帰的にコールする（ステップＳ１５）。 If a child schema exists (step S14), the Monitor function of the child schema selected in step S12 is recursively called (step S15).

次いで、スキーマ自身のＭｏｎｉｔｏｒ関数を実行して（ステップＳ１６）、活動度レベルと行動実行に使用するリソースを算出して（ステップＳ１７）、関数の戻り値とする。 Next, the Monitor function of the schema itself is executed (step S16), the activity level and the resource used for action execution are calculated (step S17), and set as the return value of the function.
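The flow of steps S11 to S17 can be sketched in C++ as below. This is a rough sketch under several assumptions that the specification does not state: the own activity level is computed by a placeholder product `gain * stimulus * internalState`, the virtual execution of the action in step S12 simply selects every child, and a parent combines its own value with the best child value by taking the maximum AL and the union of resources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

struct MonitorValue {
    double al;               // activity level
    std::uint32_t resources; // assumed bitmask of resources the action would use
};

class Schema {
public:
    Schema(double gain, std::uint32_t resources) : gain_(gain), resources_(resources) {}

    Schema& addChild(double gain, std::uint32_t resources) {
        children_.push_back(std::unique_ptr<Schema>(new Schema(gain, resources)));
        return *children_.back();
    }

    MonitorValue monitor(double stimulus, double internalState) {
        assessmentFlag_ = true;                     // S11: enter assessment mode
        std::vector<Schema*> selected = actions();  // S12: virtual run, pick children
        assessmentFlag_ = false;                    // S13: leave assessment mode

        MonitorValue best{0.0, 0u};
        for (Schema* child : selected) {            // S14/S15: recurse into children
            MonitorValue v = child->monitor(stimulus, internalState);
            if (v.al > best.al) best = v;
        }
        // S16/S17: own evaluation; the result becomes the return value.
        const double ownAl = gain_ * stimulus * internalState;  // placeholder formula
        return MonitorValue{std::max(ownAl, best.al), resources_ | best.resources};
    }

    bool assessing() const { return assessmentFlag_; }

private:
    // While the assessment flag is on, the action is only evaluated, not performed.
    std::vector<Schema*> actions() {
        assert(assessmentFlag_);
        std::vector<Schema*> picked;
        for (auto& child : children_) picked.push_back(child.get());
        return picked;
    }

    double gain_;
    std::uint32_t resources_;
    bool assessmentFlag_ = false;
    std::vector<std::unique_ptr<Schema>> children_;
};
```

The assessment flag is what separates this virtual execution from a real call of the action: while it is set, the state machine is walked only to find out which children and resources would be involved.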

図３３及び図３４には、Ａｃｔｉｏｎｓ関数を実行する処理手順をフローチャートの形式で示している。 FIG. 33 and FIG. 34 show the processing procedure for executing the Actions function in the form of a flowchart.

まず、スキーマがＳＴＯＰＰＩＮＧ状態かどうかをチェックし（ステップＳ２１）、次いで、ＳＴＯＰＰＩＮＧすべき状態かどうかをチェックする（ステップＳ２２）。 First, it is checked whether the schema is in the STOPPING state (step S21), and then whether it should be put into the STOPPING state (step S22).

If the schema is in a state that should be STOPPING, it further checks whether it has child schemas (step S23). If it does, it shifts them to the GO_TO_STOP state (step S24) and then turns on HaveToStopFlag (step S25).

If the schema is not in a state that should be STOPPING, it checks whether it is in the RUNNING state (step S26).

If it is not in the RUNNING state, it further checks whether it has child schemas (step S27). If it does, it turns on HaveToStopFlag (step S28).

Next, the schema determines its own next state from the current system state, HaveToRunFlag, HaveToStopFlag, and the operating states of its child schemas (step S29).

Next, the schema executes its own Action function (step S30).

After that, the schema checks whether it is itself in the GO_TO_STOP state (step S31). If it is not, it further checks whether it has child schemas (step S32). If it does, it checks whether any child schema is in the GO_TO_STOP state (step S33).

If there are child schemas in the GO_TO_STOP state, their Action functions are executed (step S34).

Next, the schema checks whether any child schema is RUNNING (step S35). If none is, it checks whether any child schema is stopped (step S36) and executes the Action function of each stopped child schema (step S37).

Next, the schema checks whether any child schema is in the GO_TO_RUN state (step S38). If none is, it checks whether any child schema is in the GO_TO_STOP state (step S39) and, if so, executes the Action function of that child schema (step S40).

Finally, the schema determines its own next state from the current system state, HaveToRunFlag, HaveToStopFlag, and the operating states of its children, and the whole processing routine ends (step S41).

D-3. Functions of the situation-dependent behavior layer
The situation-dependent behavior layer (SituatedBehaviorsLayer) 108 controls behavior that responds promptly to the situation in which the robot apparatus 1 is currently placed, based on the contents of the short-term storage unit 105 and the long-term storage unit 106 and on the internal state managed by the internal state management unit 104. It also takes the IV value into account so that a situation-dependent behavior consistent with the intention of the deliberative behavior can be expressed.

As described in the previous section, the situation-dependent behavior layer 108 according to the present embodiment is configured as a tree structure of schemas (see FIG. 19). Each schema remains independent while knowing about its own children and parent. Owing to this schema configuration, the situation-dependent behavior layer 108 has four main features: Concurrent evaluation, Concurrent execution, Preemption, and Reentrant. These features are described in detail below.

(1) Concurrent evaluation:
As already described, a schema as a behavior module has a Monitor capability that judges the situation according to external stimuli and changes in the internal state. This capability is implemented by the schema having a Monitor function in the class object SchemaBase. The Monitor function calculates the activity level (ActivationLevel: AL value) of the schema according to the external stimuli, the internal state, and the IV value.

When a tree structure such as that shown in FIG. 19 is configured, an upper (parent) schema can call the Monitor function of a lower (child) schema with the external stimuli and the internal state as arguments, and the child schema returns its AL value. To calculate its own AL value, a schema can in turn call the Monitor functions of its own child schemas. Because the AL values from each subtree are returned to the root schema, the optimal schema, that is, the behavior, for the external stimuli and the changes in the internal state can be determined in an integrated manner.

Because of this tree structure, the evaluation of each schema against the external stimuli and the changes in the internal state is first performed Concurrently from the bottom of the tree toward the top. As also shown in the flowchart of FIG. 32, if a schema has child schemas (step S14), it calls the Monitor function of the selected child (step S15) and then executes its own Monitor function.

Next, execution permission, as the result of the evaluation, is passed from the top of the tree toward the bottom. Evaluation and execution are carried out while resolving conflicts over the resources that the behaviors use.

The situation-dependent behavior layer 108 according to the present embodiment can evaluate behaviors in parallel using the tree structure of schemas, and is therefore adaptable to situations such as external stimuli and internal states. In addition, the whole tree is evaluated at evaluation time, and the tree is changed according to the activity level (AL) values calculated then, so the schemas, that is, the behaviors to be executed, can be dynamically prioritized.
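
The bottom-up evaluation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `score` callback, the dictionary encoding of stimuli and internal state, and the rule that a parent takes the maximum of its own score and its children's AL values are all hypothetical, and the IV value is omitted for brevity. The specification does not give the actual AL formula.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Stimuli = Dict[str, float]   # hypothetical encoding of external stimuli
State = Dict[str, float]     # hypothetical encoding of the internal state


@dataclass
class Schema:
    name: str
    score: Callable[[Stimuli, State], float]  # hypothetical scoring rule
    children: List["Schema"] = field(default_factory=list)
    al: float = 0.0

    def monitor(self, stimuli: Stimuli, state: State) -> float:
        """Evaluate the children first, then this schema; return the AL value."""
        child_als = [child.monitor(stimuli, state) for child in self.children]
        own = self.score(stimuli, state)
        # Assumption: a parent is at least as active as its most active child.
        self.al = max([own] + child_als)
        return self.al


# Usage: the root receives the AL values of its subtrees.
root = Schema("Root", lambda es, st: 0.0, [
    Schema("Eat", lambda es, st: st["hunger"]),
    Schema("Dialog", lambda es, st: es["face"]),
])
root_al = root.monitor({"face": 0.4}, {"hunger": 0.7})
```

After the call, every schema in the tree holds its own `al`, so a later top-down pass can hand out execution permission.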

(2) Concurrent execution:
Because the AL values from each subtree are returned to the root schema, the optimal schema, that is, the behavior, for the external stimuli and the changes in the internal state can be determined in an integrated manner. For example, the schema with the highest AL value may be selected, or two or more schemas whose AL values exceed a predetermined threshold may be selected and executed in parallel (parallel execution presupposes that the schemas do not conflict over hardware resources).
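
One possible reading of the threshold-based selection is sketched below. The tuple layout, the resource names, and the greedy highest-AL-first policy are assumptions made for illustration; the specification only requires that schemas run in parallel do not conflict over hardware resources.

```python
from typing import FrozenSet, List, Set, Tuple

# (schema name, AL value, hardware resources it needs); hypothetical layout.
Candidate = Tuple[str, float, FrozenSet[str]]


def select_schemas(candidates: List[Candidate], threshold: float) -> List[str]:
    """Pick schemas whose AL exceeds the threshold, highest AL first,
    skipping any schema that needs a resource already claimed."""
    chosen: List[str] = []
    used: Set[str] = set()
    for name, al, resources in sorted(candidates, key=lambda c: c[1], reverse=True):
        if al <= threshold:
            break  # remaining candidates are all at or below the threshold
        if used.isdisjoint(resources):
            chosen.append(name)
            used |= resources
    return chosen


candidates = [
    ("Dialog", 0.8, frozenset({"head", "speaker"})),
    ("Walk", 0.6, frozenset({"legs"})),
    ("Kick", 0.7, frozenset({"legs"})),
    ("Look", 0.2, frozenset({"head"})),
]
selected = select_schemas(candidates, threshold=0.5)
```

Here `Walk` is skipped because `Kick`, which has the higher AL value, already holds the legs, and `Look` is below the threshold.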

A schema that has received execution permission is executed. That is, the schema actually observes the external stimuli and the changes in the internal state in more detail and executes commands. Execution proceeds sequentially, that is, Concurrently, from the top of the tree toward the bottom. As also shown in the flowcharts of FIGS. 33 and 34, if a schema has child schemas, it executes the children's Actions functions.

The Action function has a state machine (described later) that describes the behavior of the schema itself. When a tree structure such as that shown in FIG. 19 is configured, a parent schema can call the Action function to start or suspend the execution of a child schema.

Using the tree structure of schemas, the situation-dependent behavior layer 108 according to the present embodiment can, when resources do not conflict, simultaneously execute other schemas that use the remaining resources. However, unless the resources used up to the Goal are restricted, incoherent behavior may emerge. The situation-dependent behavior determined in the situation-dependent behavior layer 108 is applied to body motion (MotionController) after the resource manager arbitrates conflicts over hardware resources with the reflexive behavior from the reflexive behavior unit 109.

(3) Preemption:
Even after a schema has started executing, if a more important (higher-priority) behavior arises, the schema must be suspended and the execution right handed over to that behavior. When the more important behavior ends (by completing or being aborted, for example), the original schema must also be resumed so that it continues executing.

Executing tasks according to priority in this way resembles the function called Preemption in an OS (operating system) in the computer world. The policy of an OS is to execute tasks in descending order of priority at the points where scheduling is considered.

In contrast, the behavior control system 100 of the robot 1 according to the present embodiment spans a plurality of objects, so arbitration between objects is required. For example, ReflexiveSBL, the object that controls reflexive behavior, must avoid obstacles or keep its balance without regard to the behavior evaluation of SBL, the object that controls the higher-level situation-dependent behavior. In doing so it actually seizes the execution right and executes, but it notifies the higher-level behavior module (SBL) that the execution right has been seized, and the higher level handles that notification, which preserves the Preemptive capability.

Suppose also that, within the situation-dependent behavior layer 108, a certain schema has been granted execution permission as a result of AL evaluation based on external stimuli and changes in the internal state, and that a later AL evaluation based on subsequent external stimuli and changes in the internal state makes another schema more important. In such a case, the behavior can be switched Preemptively by using the Actions function of the running schema to put it into the Sleep state and suspend it.

The state of Actions() of the running schema is saved, and Actions() of the other schema is executed. After Actions() of the other schema has finished, Actions() of the suspended schema can be executed again.

Also, SleepActions() is executed after Actions() of the running schema is suspended and before the execution right moves to the other schema. For example, if the robot 1 finds a soccer ball during a conversation, it can say "Wait a moment" and then play soccer.
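
The suspend, SleepActions(), and resume sequence can be illustrated with a minimal sketch. The class below is hypothetical: it models the saved Actions() state as a simple step index, and the method names `actions` and `sleep_actions` merely echo Actions() and SleepActions() in the text rather than reproduce the real interface.

```python
from typing import List


class Schema:
    def __init__(self, name: str, steps: List[str], sleep_line: str = ""):
        self.name = name
        self.steps = steps            # hypothetical list of commands
        self.sleep_line = sleep_line  # what to do just before yielding
        self.position = 0             # saved Actions() state

    def actions(self, log: List[str]) -> bool:
        """Execute one step from the saved position; True when finished."""
        if self.position < len(self.steps):
            log.append(f"{self.name}: {self.steps[self.position]}")
            self.position += 1
        return self.position >= len(self.steps)

    def sleep_actions(self, log: List[str]) -> None:
        """Run just before the execution right moves to another schema."""
        if self.sleep_line:
            log.append(f"{self.name}: {self.sleep_line}")


def run_with_preemption(first: Schema, second: Schema,
                        preempt_after: int, log: List[str]) -> None:
    for _ in range(preempt_after):
        if first.actions(log):
            return                    # finished before being preempted
    first.sleep_actions(log)          # e.g. say "Wait a moment"
    while not second.actions(log):    # the higher-priority schema runs
        pass
    while not first.actions(log):     # resume from the saved position
        pass


log: List[str] = []
dialog = Schema("Dialog", ["greet", "ask name", "say goodbye"], "Wait a moment")
soccer = Schema("Soccer", ["approach ball", "kick"])
run_with_preemption(dialog, soccer, preempt_after=1, log=log)
```

Because the position is kept inside each schema object, the dialog picks up at "ask name" rather than starting over.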

(4) Reentrant:
Each schema constituting the situation-dependent behavior layer 108 is a kind of subroutine. When a schema is called from a plurality of parents, it needs a storage space for each parent in order to store its internal state.

This resembles the Reentrant property of an OS in the computer world and is called the Reentrant property of a schema in this specification. As described with reference to FIG. 30, a schema is composed of class objects, and the Reentrant property is realized by generating an entity, that is, an instance, of the class object for each target (Pronome).

The Reentrant property of schemas is described more specifically with reference to FIG. 35.

SchemaHandler is a class object for managing schemas and keeps the configuration information of the schemas constituting the SBL as a file. At system startup, SchemaHandler reads this configuration information file and builds the schema configuration in the SBL. In the example shown in FIG. 31, entities of schemas that define behaviors such as Eat and Dialog are assumed to be mapped onto the memory space.

Suppose here that, through evaluation of the activity level based on external stimuli and changes in the internal state, a target (Pronome) A is set for the schema Dialog, and Dialog comes to carry out a conversation with person A.

Suppose that person B then interrupts the conversation between the robot 1 and person A and that, as a result of evaluating the activity level based on external stimuli and changes in the internal state, the schema that converses with B comes to have a higher priority.

In such a case, SchemaHandler maps onto the memory space another Dialog entity (instance), derived by class inheritance, for conversing with B. Because the conversation with B is carried out using this separate Dialog entity, independently of the earlier one, the contents of the conversation with A are not destroyed. Dialog A can therefore maintain data consistency, and when the conversation with B ends, the conversation with A can be resumed from the point at which it was suspended.

A schema in the Ready list is evaluated, that is, its AL value is calculated, according to its object (external stimulus), and the execution right is handed to it. After that, an instance of the schema that has moved into the Ready list is generated and evaluated against other objects. In this way, the same schema can be put into the active or sleep state.
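
The per-target instancing that gives schemas their Reentrant property might look like the sketch below. The names SchemaHandler and Dialog come from the text, but the `instance_for` method, the `history` attribute, and the dictionary keyed by (class, target) are assumptions for illustration only.

```python
from typing import Dict, List, Tuple


class Dialog:
    """One instance per target (Pronome); state is never shared."""

    def __init__(self, target: str):
        self.target = target
        self.history: List[str] = []  # hypothetical per-target dialog state

    def actions(self, utterance: str) -> None:
        self.history.append(utterance)


class SchemaHandler:
    def __init__(self) -> None:
        self._instances: Dict[Tuple[type, str], object] = {}

    def instance_for(self, schema_cls: type, target: str):
        """Return the instance bound to this target, creating it on first use."""
        key = (schema_cls, target)
        if key not in self._instances:
            self._instances[key] = schema_cls(target)
        return self._instances[key]


handler = SchemaHandler()
dialog_a = handler.instance_for(Dialog, "A")
dialog_a.actions("hello A")
dialog_b = handler.instance_for(Dialog, "B")   # person B interrupts
dialog_b.actions("hello B")
resumed = handler.instance_for(Dialog, "A")    # back to person A
```

Looking up target A again returns the original object, so the conversation with A resumes with its history intact.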

E. Internal state management of the robot
In the robot behavior control system 100 according to the present embodiment, the situation-dependent behavior layer 108 determines behavior according to the internal state and the external environment.

The internal state of the robot apparatus 1 consists of several kinds of emotions, such as instincts and feelings, and is handled as a mathematical model. The internal state management unit (ISM: InternalStatusManager) 104 manages the internal state based on the external stimuli (ES: ExternalStimula) recognized by the recognition function units 101 to 103 described above and on the passage of time.

E-1. Hierarchization of emotions
In the present embodiment, emotions are organized into a plurality of layers according to their significance for existence, and each layer operates on its own. From the plurality of actions thus determined, which action to perform is decided according to the external environment and the internal state at that time (described later). A behavior is selected in each layer, but by expressing actions preferentially from the lower-order behaviors, instinctive behaviors such as reflexes and higher-order behaviors such as action selection using memory can be expressed on a single individual without contradiction.

FIG. 36 schematically shows the hierarchical configuration of the internal state management unit 104 according to the present embodiment.

As illustrated, the internal state management unit 104 broadly divides internal information such as emotions into primary emotions, such as instincts and desires, which are necessary for the survival of the individual, and secondary emotions, which change according to the degree of satisfaction (excess or deficiency) of the primary emotions. The primary emotions are further subdivided hierarchically, from the more physiological ones to those reached by association.

In the illustrated example, the primary emotions are divided, from lower order to higher order, into lower primary emotions, upper primary emotions, and primary emotions by association. The lower primary emotions correspond to access to the limbic system; they arise so that homeostasis (maintenance of the individual) is preserved, and they take priority when homeostasis is threatened. The upper primary emotions correspond to access to the cerebral neocortex and relate to preservation of the species, such as intrinsic desires and social desires. The degree of satisfaction of the upper primary emotions changes with learning and the environment (they are satisfied through learning and communication).

Each layer of the primary emotions outputs the amount of change ΔI in the primary emotion (instinct) level caused by executing the schema selected for behavior.

The secondary emotions correspond to so-called feelings (Emotion) and consist of elements such as joy (Joy), sadness (Sad), anger (Anger), surprise (Surprise), disgust (Disgust), and fear (Fear). The amount of change (degree of satisfaction) ΔE of the secondary emotions is determined according to the amount of change ΔI of the primary emotions.

The situation-dependent behavior layer 108 selects behavior mainly on the basis of the primary emotions, but when a secondary emotion is strong, it can also select behavior based on the secondary emotion. It is also possible to modulate a behavior selected on the basis of the primary emotions using parameters generated by the secondary emotions.

In the emotion layers for individual survival, behavior by innate reflex is selected first. Next, a behavior that satisfies a lower primary emotion is selected. Then come the generation of behavior that satisfies an upper primary emotion and the generation of behavior that satisfies a primary emotion by association, so that behavior is realized starting from the more primitive preservation of the individual.

Here, the primary emotions of each layer can apply pressure to the adjacent layer. When the index for selecting the behavior that a layer has determined for itself is strong, the layer can suppress the behavior determined in the adjacent layer and express its own behavior.

As also described in section D above, the situation-dependent behavior layer 108 is composed of a plurality of schemas, each having a target action (see FIGS. 18 and 19, for example). The situation-dependent behavior layer 108 selects a schema, that is, a behavior, using the activity level of each schema as an index. The activity level of the schema as a whole is determined by the activity level for the internal state and the activity level for the external situation. A schema holds an activity level for each intermediate stage of executing its target action. "Generating a behavior that satisfies XX" corresponds to executing a schema whose final goal is a behavior that satisfies XX.

The activity level for the internal state is determined by the sum of the changes ΔE in the degree of satisfaction of the secondary emotions, which are based on the per-layer amounts of change ΔI in the primary emotions when the schema is executed. Suppose that the primary emotions consist of three layers L1, L2, and L3, and let ΔE_L1, ΔE_L2, and ΔE_L3 be the changes in the secondary emotions derived from the respective layers of the primary emotions at schema selection; the activity level is then calculated by multiplying them by the weight factors w₁, w₂, and w₃, respectively. Making the weight factor for the lower primary emotions larger makes behaviors that satisfy the lower primary emotions more likely to be selected. Adjusting these weight factors also yields the effect whereby the primary emotions of each layer apply pressure to the adjacent layer (Concentration: behavior suppression).
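
Written out, the weighted sum described above could take the following form. The symbol AL_int for the internal-state activity level is introduced here only for illustration; the specification states that the per-layer changes are multiplied by their weight factors and summed, but does not name the result or give the formula explicitly.

```latex
AL_{\mathrm{int}} = w_1\,\Delta E_{L1} + w_2\,\Delta E_{L2} + w_3\,\Delta E_{L3}
```

Under this reading, assigning the largest weight to the layer of the lower primary emotions makes a schema that satisfies them score higher for the same change ΔE.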

An example of behavior selection using the hierarchical structure of emotions is now described. In the following, Sleep (sleepiness) is treated as a lower primary emotion and Curiosity (curiosity) as an upper primary emotion.

(1) Suppose that Sleep, a lower primary emotion, becomes deficient and the activity level of the schema that satisfies Sleep rises. If the activity levels of the other schemas do not rise, the schema that satisfies Sleep keeps executing until Sleep is satisfied.

(2) Suppose that Curiosity, an upper primary emotion, becomes deficient before Sleep is satisfied. Because Sleep is more directly tied to maintenance of the individual, the schema that satisfies Sleep continues to execute until the activity level of Sleep falls to or below a certain value. Once Sleep has been satisfied to some extent, the schema that satisfies Curiosity can be executed.

(3) Suppose that a hand is thrust toward the robot's face while the schema that satisfies Curiosity is executing. In response, the robot recognizes through color recognition and size recognition that a skin-colored object has suddenly approached and, as an innate reflexive behavior, reflexively moves its face away from the hand, that is, pulls its head back. This reflexive action corresponds to the spinal reflex of an animal. Because the reflex is the lowest-order schema, the reflex schema is executed first.

After the spinal reflex, an accompanying change in emotion occurs, and whether to go on to execute an emotion-expression schema is decided from the size of that change and the activity levels of the other schemas. If the emotion-expression schema is not executed, the schema that satisfies Curiosity continues.

(4) A schema below a given schema is normally more likely to be selected than that schema itself, but only when its own activity level is extremely high can the schema suppress the lower schema (Concentration) and execute itself until a certain value is reached. When the deficiency of Sleep is severe, the schema that satisfies Sleep is executed preferentially until Sleep recovers to a certain value, even when the reflexive behavior schema would otherwise act.
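
Cases (2) to (4) above can be mimicked with a small rule-based sketch. Note the assumptions: the specification obtains this effect by adjusting weight factors, whereas the code below replaces that with two explicit thresholds, and the layer numbering, the 0.0 to 1.0 activity scale, and both constants are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    layer: int        # 0 = reflex, 1 = lower primary, 2 = upper primary
    activity: float   # activity level on a hypothetical 0.0 .. 1.0 scale


RELEASE_LEVEL = 0.3        # hypothetical: at or below this a schema yields
CONCENTRATION_LEVEL = 0.9  # hypothetical: "extremely high" activity


def select(candidates: List[Candidate]) -> Optional[str]:
    eligible = [c for c in candidates if c.activity > RELEASE_LEVEL]
    if not eligible:
        return None
    # Concentration: an extremely active schema suppresses lower layers.
    concentrated = [c for c in eligible if c.activity >= CONCENTRATION_LEVEL]
    pool = concentrated or eligible
    # Otherwise the lowest layer wins; ties go to the higher activity.
    return min(pool, key=lambda c: (c.layer, -c.activity)).name


case2 = select([Candidate("Sleep", 1, 0.6), Candidate("Curiosity", 2, 0.7)])
case2_later = select([Candidate("Sleep", 1, 0.2), Candidate("Curiosity", 2, 0.7)])
case3 = select([Candidate("Reflex", 0, 0.5), Candidate("Curiosity", 2, 0.7)])
case4 = select([Candidate("Reflex", 0, 0.5), Candidate("Sleep", 1, 0.95)])
```

The four calls correspond to Sleep holding on against Curiosity, Curiosity taking over once Sleep is satisfied, the reflex interrupting Curiosity, and a severe Sleep deficiency suppressing the reflex.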

E-2. Cooperation with other functional modules
FIG. 37 schematically shows the communication paths between the internal state management unit 104 and other functional modules.

The short-term storage unit 105 outputs the recognition results from the recognition function units 101 to 103, which recognize changes in the external environment, to the internal state management unit 104 and the situation-dependent behavior layer 108.

The internal state management unit 104 notifies the situation-dependent behavior layer 108 of the internal state. In return, the situation-dependent behavior layer 108 returns information on the instincts and feelings that it has associated or determined.

The situation-dependent behavior layer 108 also selects a behavior based on the activity level calculated from the internal state and the external environment, and notifies the internal state management unit 104, via the short-term storage unit 105, of the execution and completion of the selected behavior.

The internal state management unit 104 outputs the internal state to the long-term storage unit 106 for each behavior. In return, the long-term storage unit 106 returns stored information.

The biorhythm management unit supplies biorhythm information to the internal state management unit 104.

E-3. Change of the internal state with the passage of time
The indices of the internal state change with the passage of time. For example, the primary emotions, that is, the instincts Hunger (hunger), Fatigue (fatigue), and Sleep (sleepiness), each change over time as follows.

Hunger: the robot gets hungry (a virtual value or the remaining battery level)
Fatigue: fatigue builds up
Sleep: sleepiness builds up

In the present embodiment, Pleasantness (degree of satisfaction), Activation (degree of activity), and Certainty (degree of confidence) are defined as the elements of the robot's secondary emotions, that is, its feelings (Emotion), and each changes over time as follows.

Pleasantness: changes toward Neutral
Activation: depends on the biorhythm and Sleep (sleepiness)
Certainty: depends on Attention

FIG. 38 shows the mechanism by which the internal state management unit 104 changes the internal state with the passage of time.

As illustrated, the biorhythm management unit issues biorhythm information at a fixed period. In response, the internal state management unit 104 changes the value of each element of the primary emotions according to the biorhythm and varies Activation (degree of activity), a secondary emotion. Each time there is a notification from the biorhythm management unit, the situation-dependent behavior layer 108 receives the index values of the internal state, such as instincts and feelings, from the internal state management unit 104, so it can select a situation-dependent behavior (schema) by calculating the activity level of each schema on the basis of the internal state.
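
A time-driven update of this kind could be sketched as below. Everything numeric here is an assumption: the specification says only which direction each index moves and what it depends on, so the rates, the value ranges, and the formula for Activation are hypothetical, and Certainty is left out because it depends on Attention rather than on the biorhythm.

```python
from dataclasses import dataclass


@dataclass
class InternalState:
    hunger: float = 0.0
    fatigue: float = 0.0
    sleep: float = 0.0
    pleasantness: float = 0.0  # 0.0 is Neutral (hypothetical scale)
    activation: float = 0.0


def clamp(value: float, low: float = 0.0, high: float = 1.0) -> float:
    return max(low, min(high, value))


def on_biorhythm(state: InternalState, biorhythm: float) -> InternalState:
    """Apply one periodic notification; all rates are hypothetical."""
    # Primary emotions accumulate with the passage of time.
    state.hunger = clamp(state.hunger + 0.01)
    state.fatigue = clamp(state.fatigue + 0.01)
    state.sleep = clamp(state.sleep + 0.01)
    # Pleasantness drifts toward Neutral.
    state.pleasantness += (0.0 - state.pleasantness) * 0.1
    # Activation follows the biorhythm and is damped by sleepiness.
    state.activation = clamp(biorhythm * (1.0 - state.sleep))
    return state


state = on_biorhythm(InternalState(sleep=0.5, pleasantness=0.5), biorhythm=0.8)
```

After each such update the behavior layer would recompute the activity level of every schema from the new index values.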

E-4. Change of the internal state by executing an action
The internal state also changes when the robot executes an action.

For example, a schema that performs the behavior "sleep" has as its final goal a behavior that satisfies Sleep (sleepiness), a lower primary emotion. The situation-dependent behavior layer 108 calculates and compares the activity levels of the schemas on the basis of Sleep as a primary emotion and Activation as a secondary emotion, and selects the "sleep" schema; as a result, the behavior of sleeping is realized.

Meanwhile, the situation-dependent behavior layer 108 reports completion of the sleeping behavior to the internal state management unit 104 via the short-term storage unit 105. In response, the internal state management unit 104 changes the index value of Sleep, a primary emotion, as a result of the "sleep" behavior having been executed.

The situation-dependent behavior layer 108 then recalculates and compares the activity levels of the schemas on the basis of the degree to which Sleep has been satisfied and Activation as a secondary emotion. As a result, another schema whose priority has become higher is selected, and the robot leaves the "sleep" schema.

FIG. 39 shows the mechanism by which the internal state management unit 104 changes the internal state as the robot executes actions.

The situation-dependent behavior layer 108 notifies the internal state management unit 104, via the short-term storage unit 105, of the start and end of execution of the behavior selected in a situation-dependent manner, together with Attention information.

When notified that execution of the selected behavior has completed, the internal state management unit 104 checks the external environment obtained from the short-term storage unit 105 in accordance with the Attention information, changes the index value of the instinct (Sleep) as a primary emotion, and changes the feelings as secondary emotions accordingly. It then outputs the updated internal state data to the situation-dependent behavior layer 108 and the long-term storage unit 106.

The situation-dependent behavior layer 108 calculates the activity level of each schema on the basis of the newly received index values of the internal state and selects the next situation-dependent behavior (schema).

The long-term storage unit 106 updates its stored information on the basis of the updated internal state data and notifies the internal state management unit 104 of what was updated. The internal state management unit 104 determines the confidence (Certainty) as a secondary emotion from the confidence regarding the external environment and the confidence of the long-term storage unit 106.

Ｅ−５．センサ情報による内部状態の変化 E-5. Changes in internal state due to sensor information
ロボットが動作を実行したときのその動作程度は、各認識機能部１０１〜１０３によって認識され、短期記憶部１０５経由で内部状態管理部１０４に通知される。内部状態管理部１０４は、この動作程度を例えばＦａｔｉｇｕｅ（疲労）として１次情動の変化に反映させることができる。また、この１次情動の変化に応答して、２次情動も変化させることができる。 When the robot executes a motion, the degree of that motion is recognized by the recognition function units 101 to 103 and reported to the internal state management unit 104 via the short-term storage unit 105. The internal state management unit 104 can reflect this degree of motion in a change of the primary emotion, for example as Fatigue. It can also change the secondary emotion in response to this change in the primary emotion.

図４０には、内部状態管理部１０４が外部環境の認識結果により内部状態を変化させるための仕組みを示している。 FIG. 40 shows a mechanism for the internal state management unit 104 to change the internal state based on the recognition result of the external environment.

内部状態管理部１０４は、短期記憶部１０５経由で各認識機能部１０１〜１０３による認識結果を受け取ると、１次情動の指標値を変更するとともに、これに伴って２次情動としての感情も変更する。そして、これら内部状態の更新データを、状況依存行動階層１０８に出力する。 When the internal state management unit 104 receives the recognition results of the recognition function units 101 to 103 via the short-term storage unit 105, it changes the index value of the primary emotion and accordingly also changes the emotion as a secondary emotion. It then outputs these internal state updates to the situation-dependent behavior hierarchy 108.

状況依存行動階層１０８では、新たに受け取った内部状態の指標値を基に、各スキーマの活動度レベルを算出して、状況に依存した次の行動（スキーマ）を選択することができる。 In the situation dependent action hierarchy 108, the activity level of each schema can be calculated based on the newly received index value of the internal state, and the next action (schema) depending on the situation can be selected.

Ｅ−６．連想による内部状態の変化 E-6. Changes in internal state due to association
既に述べたように、本実施形態に係るロボットは、長期記憶部１０６において連想記憶機能を備えている。この連想記憶は、あらかじめ複数のシンボルからなる入力パターンを記憶パターンとして記憶しておき、その中のある１つのパターンに類似したパターンが想起される仕組みのことであり、外部刺激から内部状態の変化を連想記憶することができる。 As described above, the robot according to this embodiment has an associative memory function in the long-term storage unit 106. Associative memory is a mechanism in which input patterns consisting of multiple symbols are stored in advance as memory patterns, and presenting a pattern similar to one of them causes that stored pattern to be recalled. It allows a change in internal state to be associated with, and recalled from, an external stimulus.

例えば、りんごが見えた場合に「嬉しい」という情動の変化を起こす場合について考察してみる。 For example, consider the case where an emotional change of “happy” occurs when an apple is seen.

りんごが視覚認識機能部１０１において認識されると、短期記憶部１０５を経由して状況依存行動階層１０８に外部環境の変化として通知される。 When the apple is recognized by the visual recognition function unit 101, the situation-dependent behavior hierarchy 108 is notified as a change in the external environment via the short-term storage unit 105.

長期記憶部１０６では、「りんご」に関する連想記憶により、「（りんごを）食べる」という行動と、食べることにより１次情動（空腹感）が指標値で３０だけ満たされるという内部状態の変化を想起することができる。 Through its associative memory of “apple”, the long-term storage unit 106 can recall the behavior “eat (the apple)” and the associated change in internal state, namely that eating satisfies the primary emotion (hunger) by an index value of 30.
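The recall step in the apple example can be illustrated with a minimal sketch. The patent text does not say how similarity between patterns is measured, so the Jaccard overlap used here is an assumption, as are the names `recall` and `memory` and the second stored pattern; only the apple entry (behavior “eat”, hunger satisfied by 30) comes from the text.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Stored pattern (set of symbols) -> (behavior, primary emotion, change ΔI).
# The "apple" entry follows the example in the text; the "ball" entry is
# purely hypothetical and exists only to give the recall something to reject.
Memory = Dict[FrozenSet[str], Tuple[str, str, float]]

memory: Memory = {
    frozenset({"apple", "red", "round"}): ("eat", "hunger", 30.0),
    frozenset({"ball", "blue"}): ("kick", "affection", 5.0),
}

def recall(cue: Set[str], memory: Memory) -> Tuple[str, str, float]:
    """Return the entry whose stored pattern is most similar to the cue.

    Similarity is Jaccard overlap (an assumption, not the patent's method).
    The cue is assumed to be non-empty.
    """
    def jaccard(a: FrozenSet[str], b: Set[str]) -> float:
        return len(a & b) / len(a | b)

    best = max(memory, key=lambda pattern: jaccard(pattern, cue))
    return memory[best]

# A partial observation ("red", "round") still recalls the apple memory.
behavior, emotion, delta_i = recall({"red", "round"}, memory)
```

The point of the sketch is only that a partial or noisy cue retrieves a stored pattern together with its associated behavior and ΔI, which is then passed on to the internal state management unit 104.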

状況依存行動階層１０８は、長期記憶部１０６から記憶情報を受け取ると、内部状態の変化ΔＩ＝３０を、内部状態管理部１０４に通知する。 When receiving the storage information from the long-term storage unit 106, the situation-dependent behavior hierarchy 108 notifies the internal state management unit 104 of the change ΔI = 30 in the internal state.

内部状態管理部１０４では、通知されたΔＩを基に、２次情動の変化量ΔＥを算出して、りんごを食べることによる２次情動Ｅの指標値を得ることができる。 The internal state management unit 104 can calculate the change amount ΔE of the secondary emotion based on the notified ΔI, and obtain an index value of the secondary emotion E by eating an apple.

図４１には、内部状態管理部１０４が連想記憶により内部状態を変化させるための仕組みを示している。 FIG. 41 shows a mechanism for the internal state management unit 104 to change the internal state by associative memory.

外部環境が短期記憶部１０５を経由して状況依存行動階層１０８に通知される。長期記憶部１０６の連想記憶機能により、外部環境に応じた行動と、１次情動の変化ΔＩを想起することができる。 The external environment is notified to the situation dependent action hierarchy 108 via the short-term storage unit 105. With the associative memory function of the long-term storage unit 106, it is possible to recall an action according to the external environment and a change ΔI of the primary emotion.

状況依存行動階層１０８は、この連想記憶により得られた記憶情報を基に行動を選択するとともに、１次情動の変化ΔＩを内部状態管理部１０４に通知する。 The situation-dependent action hierarchy 108 selects an action based on the storage information obtained by the associative memory, and notifies the internal state management unit 104 of the change ΔI of the primary emotion.

内部状態管理部１０４では、通知を受けた１次情動の変化ΔＩと、自身で管理している１次情動の指標値とを基に、２次情動の変化ΔＥを算出して、２次情動を変化させる。そして、新たに生成された１次情動及び２次情動を、内部状態更新データとして状況依存行動階層１０８に出力する。 The internal state management unit 104 calculates the secondary emotion change ΔE from the notified primary emotion change ΔI and the primary emotion index value that it manages itself, and changes the secondary emotion accordingly. It then outputs the newly generated primary and secondary emotions to the situation-dependent behavior hierarchy 108 as internal state update data.

Ｅ−７．生得的な行動による内部状態の変化 E-7. Changes in internal state due to innate behavior
本実施形態に係るロボットが動作実行により内部状態を変化させることは既に述べた通りである（図３９を参照のこと）。この場合、１次情動と２次情動からなる内部状態の指標値を基に行動が選択されるとともに、行動の実行完了により情動が満たされる。他方、本実施形態に係るロボットは、情動に依存しない、生得的な反射行動も規定されている。この場合、外部環境の変化に応じて反射行動が直接選択されることになり、通常の動作実行による内部変化とは異なる仕組みとなる。 As already described, the robot according to this embodiment changes its internal state by executing behaviors (see FIG. 39). In that case, a behavior is selected based on the index values of the internal state, which consists of primary and secondary emotions, and the emotion is satisfied when execution of the behavior completes. The robot according to this embodiment also has innate reflex behaviors that do not depend on emotion. In that case, a reflex behavior is selected directly in response to a change in the external environment, so the mechanism differs from the internal state change caused by ordinary behavior execution.

例えば、大きなものが突然現れたときに生得的な反射行動をとる場合について考察してみる。 For example, consider the case of taking an innate reflex behavior when a large thing suddenly appears.

このような場合、例えば視覚的認識機能部１０１による「大きいもの」という認識結果（センサ情報）は、短期記憶部１０５を介さず、状況依存行動階層１０８に直接入力される。 In such a case, for example, the recognition result (sensor information) of “large” by the visual recognition function unit 101 is directly input to the situation-dependent action hierarchy 108 without passing through the short-term storage unit 105.

状況依存行動階層１０８では、「大きいもの」という外部刺激により各スキーマの活動度レベルを算出して、適当な行動を選択する（図１５、図２５及び図２６を参照のこと）。この場合、状況依存行動階層１０８では、「よける」という脊髄反射的行動を選択するとともに、「驚く」という２次情動を決定して、これを内部状態管理部１０４に通知する。 In the situation-dependent behavior hierarchy 108, the activity level of each schema is calculated from the external stimulus “large object”, and an appropriate behavior is selected (see FIGS. 15, 25, and 26). In this case, the situation-dependent behavior hierarchy 108 selects the spinal-reflex-like behavior “dodge”, determines the secondary emotion “surprise”, and notifies the internal state management unit 104 of it.

内部状態管理部１０４では、状況依存行動階層１０８から送られてきた２次情動を自身の感情として出力する。 The internal state management unit 104 outputs the secondary emotion sent from the situation-dependent action hierarchy 108 as its own emotion.

図４２には、内部状態管理部１０４が生得的反射行動により内部状態を変化させるための仕組みを示している。 FIG. 42 shows a mechanism for the internal state management unit 104 to change the internal state by innate reflection behavior.

生得的な反射行動を行なう場合、各認識機能部１０１〜１０３によるセンサ情報は、短期記憶部１０５を介さず、状況依存行動階層１０８に直接入力される。 When an innate reflex behavior is performed, the sensor information from the recognition function units 101 to 103 is input directly to the situation-dependent behavior hierarchy 108 without passing through the short-term storage unit 105.

状況依存行動階層１０８では、センサ情報として得た外部刺激により各スキーマの活動度レベルを算出して、適当な行動を選択するとともに、２次情動を決定して、これを内部状態管理部１０４に通知する。 In the situation-dependent behavior hierarchy 108, the activity level of each schema is calculated from the external stimulus obtained as sensor information, an appropriate behavior is selected, and a secondary emotion is determined and reported to the internal state management unit 104.

内部状態管理部１０４では、状況依存行動階層１０８から送られてきた２次情動を自身の感情として出力する。また、状況依存行動階層１０８からのＡｃｔｉｖａｔｉｏｎに対して、バイオリズムの高低によって最終的なＡｃｔｉｖａｔｉｏｎを決定する。 The internal state management unit 104 outputs the secondary emotion sent from the situation-dependent action hierarchy 108 as its own emotion. Further, the final activation is determined based on the level of biorhythm with respect to the activation from the situation-dependent behavior hierarchy 108.

Ｅ−８．スキーマと内部状態管理部との関係 E-8. Relationship between schemas and the internal state management unit
状況依存行動階層１０８は、複数のスキーマで構成され、各スキーマ毎に外部刺激や内部状態の変化によって活動度レベルを算出して、活動度レベルの度合いに応じてスキーマを選択して行動を実行する（図１８、図１９、図２５を参照のこと）。 The situation-dependent behavior hierarchy 108 is composed of a plurality of schemas. For each schema, an activity level is calculated from external stimuli and changes in the internal state, and a schema is selected according to its activity level and its behavior is executed (see FIGS. 18, 19, and 25).

図４３には、スキーマと内部状態管理部との関係を模式的に示している。スキーマは、ＤＳｕｂｊｅｃｔやＤＯｂｊｅｃｔなどのプロキシを介して、短期記憶部１０５、長期記憶部１０６、内部状態管理部１０４などの外部オブジェクトと通信することができる（図３０を参照のこと）。 FIG. 43 schematically shows the relationship between a schema and the internal state management unit. A schema can communicate with external objects such as the short-term storage unit 105, the long-term storage unit 106, and the internal state management unit 104 via proxies such as DSubject and DObject (see FIG. 30).

スキーマは、外部刺激や内部状態の変化によって活動度レベルを算出するクラス・オブジェクトを備えている。ＲＭ（ＲｅｓｏｕｒｃｅＭａｎａｇｅｍｅｎｔ）オブジェクトは、プロキシを介して短期記憶部１０５に通信して、外部環境を取得して、外部環境に基づく活動度レベルを算出する。また、Ｍｏｔｉｖａｔｉｏｎ算出クラス・オブジェクトは、プロキシを介して長期記憶部１０６並びに内部状態管理部１０４と通信して、内部状態の変化量を取得して、内部状態に基づく活動度レベルすなわちＭｏｔｉｖａｔｉｏｎを算出する。Ｍｏｔｉｖａｔｉｏｎの算出方法に関しては後に詳解する。 A schema includes class objects that calculate the activity level from external stimuli and changes in the internal state. The RM (ResourceManagement) object communicates with the short-term storage unit 105 via a proxy, acquires the external environment, and calculates the activity level based on the external environment. The Motivation calculation class object communicates with the long-term storage unit 106 and the internal state management unit 104 via proxies, acquires the amount of change in the internal state, and calculates the activity level based on the internal state, that is, the Motivation. The method of calculating the Motivation is described in detail later.

内部状態管理部１０４は、既に述べたように、１次情動と２次情動とに段階的に階層化されている。また、１次情動に関しては、生得的反応による１次情動階層と、ホメオスタシスによる１次情動と、連想による１次情動とに次元的に階層化されている（図３６を参照のこと）。また、２次情動としての感情は、Ｐ（Ｐｌｅａｓａｎｔｎｅｓｓ）、Ａ（Ａｃｔｉｖｉｔｙ）、Ｃ（Ｃｏｎｃｅｎｔｒａｔｉｏｎ）の３要素にマッピングされている。 As described above, the internal state management unit 104 is layered in stages into primary emotions and secondary emotions. The primary emotions are dimensionally hierarchized into primary emotion layers based on innate reactions, primary emotions based on homeostasis, and primary emotions based on association (see FIG. 36). The emotion as the secondary emotion is mapped to three elements of P (Pleasantness), A (Activity), and C (Concentration).

１次情動の各階層における変化ΔＩはすべて２次情動に入力されて、Ｐｌｅａｓａｎｔｎｅｓｓの変化ΔＰの算出に利用される。 All changes ΔI in each layer of the primary emotion are input to the secondary emotion and used to calculate the Pleasantness change ΔP.

Ａｃｔｉｖｉｔｙは、センサ入力、動作時間、バイオリズムなどの情報から統合的に判断される。 Activity is comprehensively determined from information such as sensor input, operation time, and biorhythm.

また、選択されたスキーマの確信度を、実際の２次情動階層における確信度として使用する。 Further, the certainty factor of the selected schema is used as the certainty factor in the actual secondary emotion hierarchy.

図４４には、Ｍｏｔｉｖａｔｉｏｎ算出クラス・オブジェクトによるＭｏｔｉｖａｔｉｏｎ算出経路を模式的に示している。 FIG. 44 schematically shows the Motivation calculation path by the Motivation calculation class object.

ＲＭクラス・オブジェクトは、プロキシ経由で短期記憶部１０５にアクセスして、センサ情報を取得し、認識された対象物の距離や大きさなどの刺激の強さに基づいて外部刺激による活動度レベルを評価する。 The RM class object accesses the short-term storage unit 105 via a proxy, acquires sensor information, and determines the activity level by an external stimulus based on the strength of the stimulus such as the distance and size of the recognized object. evaluate.

一方、Ｍｏｔｉｖａｔｉｏｎ算出クラス・オブジェクトは、プロキシ経由で短期記憶部１０５にアクセスして、対象物に関する特徴を取得して、さらにプロキシ経由で長期記憶部１０６の対象物の特徴を問い合わせて内部状態の変化を取得する。そして、プロキシ経由で内部状態管理部１０４にアクセスして、ロボット内部にある内部評価値を算出する。したがって、Ｍｏｔｉｖａｔｉｏｎの算出は、外部刺激の強さには無関係である。 The Motivation calculation class object, on the other hand, accesses the short-term storage unit 105 via a proxy to acquire the features of the object, and then queries the long-term storage unit 106 via a proxy with those features to obtain the corresponding change in internal state. It then accesses the internal state management unit 104 via a proxy and calculates an internal evaluation value held inside the robot. The calculation of the Motivation is therefore independent of the strength of the external stimulus.

本実施形態に係るロボットの行動制御システムが連想記憶を用いて外部刺激から内部状態の変化を想起することにより、２次情動を算出して行動選択を行なう、ということは既に述べた（図４１を参照のこと）。さらに、連想記憶を用いることにより、対象物毎に異なる内部状態の変化を想起させることができる。これによって、同じ状況でもその行動の発現し易さを異ならせることができる。すなわち、外部の刺激や物理的状況、現在の内部状態に加え、ロボットの対象物ごとの記憶を考慮して行動を選択することができ、より多彩で多様化した対応を実現することができる。 As already described, the behavior control system of the robot according to this embodiment uses associative memory to recall a change in internal state from an external stimulus, and thereby calculates the secondary emotion and selects a behavior (see FIG. 41). Associative memory also makes it possible to recall a different change in internal state for each object. As a result, how readily a behavior is expressed can differ even in the same situation. In other words, a behavior can be selected by taking into account the robot's memory of each object in addition to external stimuli, the physical situation, and the current internal state, which allows more varied and diverse responses.

例えば、「○○が見えているから××する」とか、「現在○○が不足だから（何に対しても）××する」などの外部環境又は内部状態によって決まった行動をするのではなく、「○○が見えても△△なので□□する」とか、「○○が見えているけど××なので■■する」など、対象物に関する内部状態の変化記憶を用いることにより、行動にバリエーションをつけることができる。 For example, instead of behaving in a way fixed by the external environment or the internal state, such as “do ×× because ○○ is visible” or “do ×× (toward anything) because ○○ is currently lacking”, the robot can vary its behavior by using its memory of internal state changes associated with an object, as in “even though ○○ is visible, do □□ because of △△” or “○○ is visible, but do ■■ because of ××”.

図４５には、対象物が存在するときのＭｏｔｉｖａｔｉｏｎ算出処理のメカニズムを模式的に示している。 FIG. 45 schematically shows the mechanism of the Motivation calculation process when an object is present.

まず、プロキシ経由で短期記憶部１０５にアクセスして、認識機能部１０１〜１０３により認識されたターゲットの特徴を尋ねる。 First, the short-term storage unit 105 is accessed via a proxy, and the target features recognized by the recognition function units 101 to 103 are inquired.

次いで、取り出した特徴を用いて、今度はプロキシ経由で長期記憶部１０６にアクセスして、その特徴の対象物がスキーマに関係した欲求をどのように変化させるか、すなわち１次情動の変化ΔＩを獲得する。 Next, using the extracted features, the long-term storage unit 106 is accessed via a proxy to obtain how an object with those features changes the desire related to the schema, that is, the primary emotion change ΔI.

次いで、プロキシ経由で内部状態管理部１０４にアクセスして、欲求の変化により快不快の値がどのように変化するか、すなわち２次情動の変化ΔＰｌｅａｓａｎｔを引き出す。 Next, the internal state management unit 104 is accessed via a proxy to obtain how the pleasantness value changes as a result of the change in desire, that is, the secondary emotion change ΔPleasant.

そして、２次情動の変化ΔＰｌｅａｓａｎｔと対象物の確信度を引数とする以下のＭｏｔｉｖａｔｉｏｎ算出関数ｇ_target-iにより、ｉ番目のＭｏｔｉｖａｔｉｏｎを算出する。 Then, the i-th Motivation is calculated by the following Motivation calculation function g_target-i, which takes the secondary emotion change ΔPleasant and the certainty of the object as arguments.

また、図４６には、対象物が存在しないときのＭｏｔｉｖａｔｉｏｎ算出処理のメカニズムを模式的に示している。 FIG. 46 schematically shows the mechanism of the Motivation calculation process when there is no object.

この場合、まず、行動に対する記憶に対して、その行動による欲求の変化ΔＩを尋ねる。 In this case, first, a change ΔI in desire due to the action is asked with respect to the memory for the action.

次いで、取得したΔＩを用いて、内部状態管理部１０４により１次情動がΔＩだけ変化したときの２次情動の変化ΔＰｌｅａｓａｎｔを引き出す。そして、この場合は、２次情動の変化ΔＰｌｅａｓａｎｔを引数とする以下のＭｏｔｉｖａｔｉｏｎ算出関数ｇ_nottarget-iにより、ｉ番目のＭｏｔｉｖａｔｉｏｎを算出する。 Next, using the acquired ΔI, the internal state management unit 104 is queried for the secondary emotion change ΔPleasant that results when the primary emotion changes by ΔI. In this case, the i-th Motivation is calculated by the following Motivation calculation function g_nottarget-i, which takes the secondary emotion change ΔPleasant as its argument.
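The two calculation paths (with and without a target object) can be summarized in a short sketch. The concrete forms of g_target-i and g_nottarget-i are not reproduced in this text, so they are passed in as parameters here, and the lambdas in the usage lines are placeholders invented for illustration. The dictionary standing in for the long-term storage unit 106 and the callable standing in for the internal state management unit 104 are likewise assumptions.

```python
from typing import Callable, Dict, Optional

def motivation(
    target_feature: Optional[str],
    behavior: str,
    ltm_delta_i: Dict[str, float],               # stand-in for long-term memory: key -> ΔI
    delta_pleasant: Callable[[float], float],    # stand-in for internal state manager: ΔI -> ΔPleasant
    g_target: Callable[[float, float], float],   # (ΔPleasant, certainty) -> Motivation
    g_nottarget: Callable[[float], float],       # ΔPleasant -> Motivation
    certainty: float = 1.0,
) -> float:
    """Follow the query order described for FIGS. 45 and 46."""
    if target_feature is not None:
        # Target present: look up ΔI by the object's features, then use g_target.
        d_i = ltm_delta_i[target_feature]
        return g_target(delta_pleasant(d_i), certainty)
    # No target: look up ΔI by the behavior itself, then use g_nottarget.
    d_i = ltm_delta_i[behavior]
    return g_nottarget(delta_pleasant(d_i))

# Placeholder functions, NOT the patent's formulas.
ltm = {"apple": 30.0, "eat": 10.0}
with_target = motivation("apple", "eat", ltm, lambda d: 0.5 * d,
                         lambda dp, c: dp * c, lambda dp: dp, certainty=0.5)
without_target = motivation(None, "eat", ltm, lambda d: 0.5 * d,
                            lambda dp, c: dp * c, lambda dp: dp)
```

Note that neither path takes the stimulus strength as an input, which matches the statement that the Motivation is independent of the strength of the external stimulus.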

Ｅ−９．２次情動の各要素の変更方法 E-9. Method for changing each element of the secondary emotion
図４７には、２次情動のうちのＰｌｅａｓａｎｔｎｅｓｓを変更するためのメカニズムを図解している。 FIG. 47 illustrates the mechanism for changing Pleasantness, one element of the secondary emotion.

長期記憶部１０６は、記憶の量による１次情動の変化を内部状態管理部１０４に入力する。また、短期記憶部１０５は、認識機能部１０１〜１０３からのセンサ入力による１次情動の変化を内部状態管理部１０４に入力する。 The long-term storage unit 106 inputs changes in the primary emotion according to the amount of storage to the internal state management unit 104. In addition, the short-term storage unit 105 inputs changes in the primary emotion caused by sensor inputs from the recognition function units 101 to 103 to the internal state management unit 104.

また、スキーマは、スキーマ実行による１次情動の変化（Ｎｏｕｒｉｓｈｍｅｎｔ，Ｍｏｉｓｔｕｒｅ，Ｓｌｅｅｐ）や、スキーマの内容による１次情動の変化（Ａｆｆｅｃｔｉｏｎ）を内部状態管理部１０４に入力する。 In addition, the schema inputs the primary emotion change (Nourishment, Moisture, Sleep) due to the schema execution and the primary emotion change (Affection) depending on the schema contents to the internal state management unit 104.

Ｐｌｅａｓａｎｔｎｅｓｓは、１次情動の過不足の変化に応じて決定される。 Pleasantness is determined according to changes in excess or deficiency of the primary emotion.

また、図４８には、２次情動のうちのＡｃｔｉｖｉｔｙを変更するためのメカニズムを図解している。 FIG. 48 illustrates a mechanism for changing the activity of the secondary emotion.

Ａｃｔｉｖｉｔｙは、スキーマのＳｌｅｅｐ以外の時間の総和と、バイオリズムと、センサ入力を基に、統合的に判断される。 Activity is determined in an integrated manner based on the sum of times other than Sleep in the schema, biorhythm, and sensor input.

また、図４９には、２次情動のうちのＣｅｒｔａｉｎｔｙを変更するためのメカニズムを図解している。 FIG. 49 illustrates the mechanism for changing Certainty, one element of the secondary emotion.

長期記憶部１０６に対して対象物を尋ねると、Ｃｅｒｔａｉｎｔｙが返される。どの１次情動に着目するかは、そのスキーマの目標とする行動に依存する。そして、引き出されたＣｅｒｔａｉｎｔｙがそのまま内部状態管理部１０４の２次情動におけるＣｅｒｔａｉｎｔｙとなる。 When the long-term storage unit 106 is queried about an object, a Certainty value is returned. Which primary emotion is considered depends on the target behavior of the schema. The Certainty obtained in this way is used as is for the Certainty of the secondary emotion in the internal state management unit 104.

図５０には、Ｃｅｒｔａｉｎｔｙを求めるためのメカニズムを模式的に示している。 FIG. 50 schematically shows the mechanism for obtaining the Certainty.

長期記憶部１０６では、対象物に関する認識結果や情動などの各項目の確からしさを、スキーマ毎に記憶している。 The long-term storage unit 106 stores the probability of each item such as the recognition result and emotion related to the object for each schema.

スキーマは、長期記憶部１０６に対して、スキーマと関係する記憶に対する確からしさの値を尋ねる。これに対し、長期記憶部１０６は、スキーマと関係する記憶の確からしさを対象物の確からしさとして与える。 The schema asks the long-term storage unit 106 for the certainty value of the memory related to the schema. In response, the long-term storage unit 106 returns the certainty of that memory as the certainty of the object.

Ｆ．ＲｅｓｏｕｒｃｅＭａｎａｇｅｒでのコマンド管理 F. Command management in the ResourceManager
ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓｌａｙｅｒとＲｅｆｌｅｘｉｖｅＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒから、ロボット装置１の資源管理（ＲｅｓｏｕｒｃｅＭａｎａｇｅｒ）モジュールに対し、ロボット装置１を動作させるためのコマンドが送信される。（図１３並びに図１８を参照のこと。） Commands for operating the robot apparatus 1 are sent from the SituatedBehaviorslayer and the ReflexiveSituatedBehaviorsLayer to the resource management (ResourceManager) module of the robot apparatus 1 (see FIGS. 13 and 18).

ＲｅｆｌｅｘｉｖｅＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒから発行されるコマンドは、反射行動として生成するコマンドである。また、ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓｌａｙｅｒから発行されるコマンドは、状況依存行動として生成するコマンドである。これらは別プロセスとして、若しくは別スレッドとして生成するため、あらかじめ調停しておくことが困難であり、あらかじめ発現しないようにするメカニズムである意図レベルによる制御は困難である（後述）。 Commands issued by the ReflexiveSituatedBehaviorsLayer are generated as reflex behaviors, and commands issued by the SituatedBehaviorslayer are generated as situation-dependent behaviors. Because the two are generated in separate processes or separate threads, it is difficult to arbitrate between them in advance, and control by the intention level, a mechanism that prevents a behavior from being expressed in the first place, is also difficult (described later).

このため、ＲｅｓｏｕｒｃｅＭａｎａｇｅｒでは、この２つのコマンドをコマンドに付随する活動度レベルで調停する。図５１には、ＲｅｓｏｕｒｃｅＭａｎａｇｅｒの構成を示している。図示のＲｅｓｏｕｒｃｅＭａｎａｇｅｒは、ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓｌａｙｅｒ若しくはＲｅｆｌｅｘｉｖｅＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒから送信されたコマンドの競合解決を行なうコマンド競合解決器と、ハードウェア・リソース毎のコマンドの管理を行なうコンテンツ管理器とで構成される。 For this reason, the ResourceManager arbitrates between the two kinds of commands using the activity level attached to each command. FIG. 51 shows the configuration of the ResourceManager. The illustrated ResourceManager consists of a command conflict resolver, which resolves conflicts between commands sent from the SituatedBehaviorslayer or the ReflexiveSituatedBehaviorsLayer, and a content manager, which manages commands for each hardware resource.

コマンド競合解決器では、コマンドと、コマンドが使用するロボット装置１のハードウェアのリソースの情報と、コマンドの活動度レベルを保持する。そして、新しいコマンドがＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓｌａｙｅｒ若しくはＲｅｆｌｅｘｉｖｅＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒから送信された場合には、現在実行中のコマンドが使用しているロボット装置１のハードウェアのリソース情報と新しいコマンドのハードウェアのリソース情報と競合しているかを判定する。そして、競合している場合には、コマンドに付随する活動度レベルの比較を行ない、コマンドの競合解決を行なう。 The command conflict resolver holds each command, information on the hardware resources of the robot apparatus 1 that the command uses, and the activity level of the command. When a new command arrives from the SituatedBehaviorslayer or the ReflexiveSituatedBehaviorsLayer, the resolver determines whether the hardware resources of the robot apparatus 1 used by the currently executing command conflict with those required by the new command. If they conflict, it compares the activity levels attached to the commands and resolves the conflict.

つまり、現在実行中のコマンドの活動度レベルが新しいコマンドの活動度レベルよりも低い場合には、実行中のコマンドはキャンセルされ、新しいコマンドが実行される。 That is, when the activity level of the command currently being executed is lower than the activity level of the new command, the command being executed is canceled and the new command is executed.

他方、現在実行中のコマンドの活動度レベルが新しいコマンドの活動度レベルよりも高い場合には、新しいコマンドは実行中のコマンドの終了を待って実行されるか、若しくは新しいコマンドがキャンセル処理される。 On the other hand, if the activity level of the currently executing command is higher than the activity level of the new command, the new command is executed waiting for the end of the executing command, or the new command is canceled. .

また、新しいコマンドの活動度レベルがＲｅｓｏｕｒｃｅＭａｎａｇｅｒで待ち状態にあるコマンドの活動度レベルよりも大きい場合には、ＲｅｓｏｕｒｃｅＭａｎａｇｅｒで待ち状態にあるコマンドは、キャンセルされる。 If the activity level of a new command is higher than the activity level of a command waiting in ResourceManager, the command waiting in ResourceManager is cancelled.
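The arbitration rules above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class and field names are hypothetical, equal activity levels are treated as a win for the running command (the text does not cover ties), and a losing new command is always queued even though the text also allows cancelling it.

```python
from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class Command:
    name: str
    resources: Set[str]      # hardware resources the command needs, e.g. {"neck"}
    activity_level: float    # activity level attached to the command

@dataclass
class CommandConflictResolver:
    running: List[Command] = field(default_factory=list)
    waiting: List[Command] = field(default_factory=list)
    cancelled: List[str] = field(default_factory=list)

    def submit(self, new: Command) -> str:
        # Waiting commands with a lower activity level than the new one are cancelled.
        still_waiting = []
        for cmd in self.waiting:
            if new.activity_level > cmd.activity_level:
                self.cancelled.append(cmd.name)
            else:
                still_waiting.append(cmd)
        self.waiting = still_waiting

        # A conflict exists only if the hardware resources overlap.
        conflicts = [c for c in self.running if c.resources & new.resources]
        if not conflicts:
            self.running.append(new)
            return "started"

        # The new command wins only if it outranks every conflicting running command.
        if all(c.activity_level < new.activity_level for c in conflicts):
            for c in conflicts:
                self.running.remove(c)
                self.cancelled.append(c.name)
            self.running.append(new)
            return "preempted"

        # Otherwise the new command waits for the running command to finish.
        self.waiting.append(new)
        return "queued"

resolver = CommandConflictResolver()
results = [
    resolver.submit(Command("track_face", {"neck"}, 50.0)),
    resolver.submit(Command("wag_tail", {"tail"}, 10.0)),
    resolver.submit(Command("nod", {"neck"}, 20.0)),
    resolver.submit(Command("flinch", {"neck"}, 80.0)),
]
```

In this run the reflex-like `flinch` command cancels both the waiting `nod` and the running `track_face`, which is exactly the situation that section G addresses with intention levels.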

Ｇ．ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｌａｙｅｒとＤｅｌｉｂｅｒａｔｉｖｅＬａｙｅｒにおける意図レベルを用いた下位行動の制御 G. Control of lower-level behaviors using intention levels in the SituatedBehaviorlayer and the DeliberativeLayer
Ｇ−１．ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｌａｙｅｒからＲｅｆｌｅｘｉｖｅＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒへの意図レベル制御 G-1. Intention level control from the SituatedBehaviorlayer to the ReflexiveSituatedBehaviorsLayer
ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｌａｙｅｒの行動は、その行動中に反射行動が発生することを許容できない場合がある。例えば、ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｌａｙｅｒに存在する対話行動は、ロボット装置１に顔をトラッキングさせたとする。 Some behaviors of the SituatedBehaviorlayer cannot tolerate a reflex behavior occurring while they are in progress. For example, suppose that a dialogue behavior in the SituatedBehaviorlayer has the robot apparatus 1 track a face.

このときロボット装置１の動作スピードに比べ、人間の動作がすばやければ、顔が突然大きく観測されることがある。このとき、大きなものが突然現れたと考え、生得的な行動として、ＲｅｆｌｅｘｉｖｅＢｅｈａｖｉｏｒｌａｙｅｒの行動が発現し、ロボット装置１は首をのけぞらせる（びっくりする）。これにより、顔のトラッキングのコマンドが上書きされ、ロボットのトラッキングは停止してしまう。これは対話中にはふさわしくない行動である。 If the person moves quickly relative to the operating speed of the robot apparatus 1, the face may suddenly be observed as large. The robot then interprets this as a large object appearing suddenly, a behavior of the ReflexiveBehaviorlayer is expressed as an innate behavior, and the robot apparatus 1 jerks its head back (is startled). This overwrites the face tracking command, and the robot stops tracking. Such behavior is inappropriate during a dialogue.

このような場合、ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｌａｙｅｒの対話行動（スキーマ）は、Ａｃｔｉｏｎ関数の中で、意図レベルを反映させるＲｅｆｌｅｘｉｖｅＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒの首をのけぞらせるスキーマと意図レベル情報をコマンドとしてＲｅｆｌｅｘｉｖｅＢｅｈａｖｉｏｒｌａｙｅｒに送信する。 In such a case, the dialogue behavior (schema) of the SituatedBehaviorlayer, within its Action function, sends a command to the ReflexiveBehaviorlayer that names the schema of the ReflexiveSituatedBehaviorsLayer to which the intention level is to be applied (the head-jerk schema) together with the intention level information.

図５２には、ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒＬａｙｅｒからＲｅｆｌｅｘｉｖｅＢｅｈａｖｉｏｒｌａｙｅｒへの制御を模式的に示している。ＲｅｆｌｅｘｉｖｅＢｅｈａｖｉｏｒｌａｙｅｒの行動状態制御部の行動意図管理器は、意図レベル情報コマンドを管理し、意図レベルを反映させるスキーマ（首をのけぞらせる行動）のバイアス演算器に送信する（バイアス演算器については、図２１を参照のこと）。バイアス演算器は、ＲｅｆｌｅｘｉｖｅＢｅｈａｖｉｏｒｌａｙｅｒの意図レベルを反映させる。スキーマは、図２１のようにＭｏｎｉｔｏｒ関数での活動度レベルに反映させる。 FIG. 52 schematically shows the control from the SituatedBehaviorLayer to the ReflexiveBehaviorlayer. The behavior intention manager in the behavior state control unit of the ReflexiveBehaviorlayer manages intention level information commands and forwards them to the bias calculator of the schema to which the intention level is to be applied (the head-jerk behavior); see FIG. 21 for the bias calculator. The bias calculator applies the intention level within the ReflexiveBehaviorlayer, and the schema reflects it in the activity level computed by its Monitor function, as shown in FIG. 21.
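The flow of an intention level command can be sketched as below. How the bias calculator combines the intention level with the activity level is defined with reference to FIG. 21 and is not reproduced here, so the additive bias with a floor at zero is an assumption, and the class, method, and schema names (`ReflexSchema`, `BehaviorIntentionManager`, `flinch`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class ReflexSchema:
    name: str
    intention_bias: float = 0.0   # 0 means no intention level has been applied

    def monitor(self, stimulus_strength: float) -> float:
        """Activity level: stimulus response plus intention bias, floored at 0.

        The additive form is an assumption made for this sketch.
        """
        return max(0.0, stimulus_strength + self.intention_bias)

class BehaviorIntentionManager:
    """Routes intention level commands to the bias of the targeted schema."""

    def __init__(self, schemas: Dict[str, ReflexSchema]) -> None:
        self.schemas = schemas

    def on_intention_command(self, target_schema: str, intention_level: float) -> None:
        self.schemas[target_schema].intention_bias = intention_level

flinch = ReflexSchema("flinch")
manager = BehaviorIntentionManager({"flinch": flinch})

before = flinch.monitor(80.0)                   # reflex would fire strongly
manager.on_intention_command("flinch", -100.0)  # dialogue schema suppresses it
after = flinch.monitor(80.0)                    # reflex is now held at zero
```

A negative intention level suppresses the targeted schema, and a positive one excites it. The excitation case corresponds to the DeliberativeLayer control described in G-2, which uses the same mechanism.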

Ｇ−２．ＤｅｌｉｂｅｒａｔｉｖｅＬａｙｅｒからＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｌａｙｅｒへの意図レベル制御 G-2. Intention level control from the DeliberativeLayer to the SituatedBehaviorlayer
ＤｅｌｉｂｅｒａｔｉｖｅＬａｙｅｒは、本発明では、スキーマとして実装される。長期記憶や短期記憶を用いて、行動計画を行なう。このときの計画される行動は、ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｌａｙｅｒのスキーマのシーケンスとして構成する。そのため、Ｇ−１の意図レベルと同じメカニズムで制御する。 In the present invention, the DeliberativeLayer is implemented as a schema. It plans behaviors using long-term and short-term memory. The planned behavior is constructed as a sequence of SituatedBehaviorlayer schemas, so it is controlled by the same intention level mechanism as in G-1.

図５３には、ＤｅｌｉｂｅｒａｔｉｖｅＬａｙｅｒからＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒＬａｙｅｒへの制御を模式的に示している。つまり、ＤｅｌｉｂｅｒａｔｉｖｅＬａｙｅｒでは、計画されたシーケンスのスキーマを対象スキーマとして、意図レベル情報をコマンドとしてＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒＬａｙｅｒに送信する。 FIG. 53 schematically shows the control from the DeliberativeLayer to the SituatedBehaviorLayer. That is, the DeliberativeLayer sends intention level information as a command to the SituatedBehaviorLayer, with the schemas of the planned sequence as the target schemas.

ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒＬａｙｅｒの行動状態制御部の行動意図管理部は、意図レベル情報コマンドを管理し、意図レベルを反映させるスキーマのバイアス演算器に送信する。（バイアス演算器は、図２１を参照のこと）ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒＬａｙｅｒの意図レベルを反映させるスキーマは、図２１のようにＭｏｎｉｔｏｒ関数での活動度レベルに反映させ、励起されるため、状況依存ではなく、ＤｅｌｉｂｅｒａｔｉｖｅＬａｙｅｒからの支持で発現する。発現が終了すると、行動状態制御部の行動意図管理部は、終了情報をＤｅｌｉｂｅｒａｔｉｖｅＬａｙｅｒに送信し、意図レベル情報をコマンドとして送信したスキーマに返信される。 The behavior intention management unit in the behavior state control unit of the SituatedBehaviorLayer manages intention level information commands and forwards them to the bias calculator of the schema to which the intention level is to be applied (see FIG. 21 for the bias calculator). That schema reflects the intention level in the activity level computed by its Monitor function, as shown in FIG. 21, and is thereby excited, so it is expressed not because of the situation but because it is driven by the DeliberativeLayer. When the expression ends, the behavior intention management unit of the behavior state control unit sends termination information to the DeliberativeLayer, where it is returned to the schema that sent the intention level information as a command.

Ｈ．行動制御システムの他の構成例 H. Another configuration example of the behavior control system
Ｄ項では、状況依存行動階層を構成する要素行動はすべて、内部状態をある範囲に保つための行動すなわち「ホメオスタシス行動」である場合を前提にして場合について説明した。このような場合、すべての内部状態が十分満たされているときには各要素行動の欲求値は小さくなるため、行動価値も小さく、状況依存行動が発現する機会は低下する。加えて、外部刺激もなければ反射行動を起こさなくなるので、ロボット装置は何もしなくなる。しかしながら、自律的に何もしないロボットというのは、エンタテイメント性の点で問題があると考えられる。そこで、本発明者らは、ロボット装置の自発的な行動を発現する状況依存行動の構成要素として、ホメオスタシス的な目的を持たない「アイドル行動」をさらに組み込むことにした。この項では、ホメオスタシス行動とアイドル行動を要素行動として構成される状況依存行動階層のメカニズム、並びにこの場合のロボット装置の行動制御システム、自発行動と反射行動との調停方法について詳解する。 Section D assumed that all element behaviors making up the situation-dependent behavior hierarchy are behaviors for keeping the internal state within a certain range, that is, “homeostasis behaviors”. In that case, when all internal states are sufficiently satisfied, the desire value of each element behavior becomes small, so the action value is also small and situation-dependent behaviors are less likely to be expressed. In addition, without external stimuli no reflex behavior occurs either, so the robot apparatus does nothing. A robot that does nothing autonomously, however, is considered problematic from the standpoint of entertainment value. The present inventors therefore decided to additionally incorporate “idle behaviors”, which have no homeostatic purpose, as components of the situation-dependent behavior that expresses spontaneous behavior of the robot apparatus. This section describes in detail the mechanism of a situation-dependent behavior hierarchy whose element behaviors comprise homeostasis behaviors and idle behaviors, the behavior control system of the robot apparatus in this case, and the method of arbitrating between spontaneous behaviors and reflex behaviors.

Ｈ−１．行動制御システム H-1. Behavior control system
図５４には、ロボット装置１の行動制御システムの他の構成例を示している。但し、図３と同一の構成要素については同一の参照番号を付している。 FIG. 54 shows another configuration example of the behavior control system of the robot apparatus 1. Components identical to those in FIG. 3 are given the same reference numbers.

当該システム１００´は、視覚認識機能部１０１、聴覚認識機能部１０２、接触認識機能部１０３を通して外界の情報すなわち外部刺激を獲得する。これら認識機能部による認識結果は、短期記憶部１０５並びに長期記憶部１０６を通して状況依存行動階層１０８と反射行動部１０９、並びに反射イベント管理部１１０に伝達される。 The system 100 ′ acquires external information, that is, external stimulus through the visual recognition function unit 101, the auditory recognition function unit 102, and the contact recognition function unit 103. The recognition results by these recognition function units are transmitted to the situation-dependent behavior hierarchy 108, the reflex behavior unit 109, and the reflex event management unit 110 through the short-term storage unit 105 and the long-term storage unit 106.

状況依存行動階層１０８と反射行動部１０９はそれぞれ、「スキーマ」と呼ばれるオブジェクトとして記述される複数の行動モジュールすなわち要素行動で構成されている。状況依存行動階層１０８と反射行動部１０９はそれぞれ、認識結果すなわち外部刺激に基づいて行動を決定し、モーション・コマンドの強さが付加されたモーション・コマンドを資源管理部１２０に出力する。モーション・コマンドは、上述した行動価値すなわちＡＬ値に相当し、スカラ値として表される。 Each of the situation-dependent action hierarchy 108 and the reflex action part 109 includes a plurality of action modules, that is, element actions described as objects called “schema”. Each of the situation-dependent action hierarchy 108 and the reflex action unit 109 determines an action based on a recognition result, that is, an external stimulus, and outputs a motion command to which the strength of the motion command is added to the resource management unit 120. The motion command corresponds to the action value, that is, the AL value described above, and is expressed as a scalar value.

The reflex event management unit 110 calculates a reflex event density from the events input to the reflex behavior unit 109. The reflex event density quantifies the degree to which events that trigger reflex behavior, that is, external stimuli, are input. When the robot apparatus is not behaving homeostatically, the occurrence probability of idle behavior is controlled based on the reflex event density (described later).

The resource management unit 120 compares the strengths of the motion commands for the situation-dependent behavior and the reflex behavior received from the situation-dependent behavior layer 108 and the reflex behavior unit 109, respectively. It outputs the motion with the larger value to the external environment, so that the robot apparatus 1 actually performs it.
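The comparison performed by the resource management unit 120 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `MotionCommand` type, the function name `arbitrate`, and the tie-breaking rule are assumptions introduced here, since the text only says that the command with the larger strength is output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MotionCommand:
    name: str        # hypothetical motion label, e.g. "look_toward_sound"
    strength: float  # scalar command strength (the AL value for layer 108)


def arbitrate(situated: Optional[MotionCommand],
              reflex: Optional[MotionCommand]) -> Optional[MotionCommand]:
    """Return the command with the larger strength, or None if there is none.

    `situated` is the command from the situation-dependent behavior layer,
    `reflex` the command from the reflex behavior unit.
    """
    if situated is None:
        return reflex
    if reflex is None:
        return situated
    # Assumption: on a tie the situation-dependent command wins;
    # the text does not specify tie handling.
    return reflex if reflex.strength > situated.strength else situated
```

Either argument may be `None` to model a cycle in which one of the two sources issued no command, which is also an assumption of this sketch.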

Note that the reflex behavior unit 109 executes its control cycle sufficiently faster than the situation-dependent behavior layer 108.

H-2. Situation-Dependent Behavior Layer
As already described, the situation-dependent behavior layer 108 consists of a plurality of element behaviors. Each element behavior periodically calculates its behavior value from the internal state and from the recognition results stored in the short-term storage unit 105 and the long-term storage unit 106, and outputs its behavior when it is selected. In the configuration example detailed in this section, the situation-dependent behavior layer 108 includes as element behaviors, in addition to behaviors for keeping an internal state within a certain range ("homeostasis behaviors"), an "idle behavior" that has no homeostatic purpose.

FIG. 55 shows an example of the internal configuration of the situation-dependent behavior layer 108. In the layer shown in the figure, the behavior value calculation unit 220 in each element behavior calculates an AL value as the behavior value, based on the emotion data input from the internal state management unit 104 and the recognition results of external stimuli input from the short-term storage unit 105 and the long-term storage unit 106. When the behavior value calculation unit 220 calculates the behavior value using a behavior value calculation database, a learning unit 240 may also be provided that learns from the outcome after a behavior is expressed and updates the behavior value calculation database in the element behavior. A learning method for the behavior value calculation database is described in, for example, the specification of Japanese Patent Application No. 2004-68133, already assigned to the present applicant.

The behavior selection unit 230 then selects the element behavior to be expressed based on the AL values of the element behaviors and outputs its motion command, with the command strength attached, to the resource management unit 120. The resource management unit 120 outputs the motion command of either the spontaneous behavior or the reflex behavior according to the command strengths. The selected element behavior outputs its own behavior.

In the example shown in FIG. 55, the situation-dependent behavior layer 108 includes homeostasis behavior A, homeostasis behavior B, ..., and an idle behavior as element behaviors. Each element behavior is described as an object called a schema. In the illustrated example the behavior selection unit 230 selects among the element behaviors centrally, but the element behaviors may of course have a hierarchical structure as shown in FIG. 19.

A homeostasis element behavior is output spontaneously as a behavior for satisfying an internal state (for example, charging because the battery is low, or resting because the robot is tired). These element behaviors calculate a desire value from the internal state alone. At the same time, they calculate an expected satisfaction value from the internal state and the recognition results, and the behavior value of each element behavior is calculated from the desire value and the expected satisfaction value. The behavior value is calculated within each element behavior. When the desire value rises, the behavior value of the homeostasis behavior rises, so the behavior selection unit 230 selects the homeostasis behavior and outputs its motion command to the resource management unit 120, attaching the calculated behavior value as the strength of the motion command.

When all internal states are sufficiently satisfied, the desire value of each element behavior is small, so its behavior value is also small. In that case the idle behavior, which has no homeostatic purpose, outputs a motion command (for example, doing nothing, tilting the head, practicing golf putts, or lying down and relaxing). The behavior value of the idle element behavior is a constant. Consequently, when all internal states are sufficiently satisfied, that is, when the robot is not in a homeostatic condition, the behavior values of the homeostasis behaviors fall, and once they drop below the behavior value of the idle behavior, the behavior selection unit 230 selects the idle behavior.

In this embodiment, to avoid or mitigate conflict between idle behavior and reflex behavior, idle behavior is expressed with an occurrence probability that depends on the reflex event density. The reflex event density quantifies the degree to which events that trigger reflex behavior, that is, external stimuli, are input. The selection of the idle motion command is detailed later.

H-3. Behavior Value of Homeostasis Behavior
A homeostasis element behavior calculates a desire value from the internal state alone. At the same time, it calculates an expected satisfaction value from the internal state and the recognition results, and the behavior value of each element behavior is calculated from the desire value and the expected satisfaction value. The resource management unit 120 selects a motion using the strength of the motion command as its index, and for a homeostasis behavior the strength of the motion command is the behavior value of that element behavior. This subsection details how the behavior value of a homeostasis behavior is calculated.

As shown in FIG. 55, each homeostasis element behavior has predetermined internal states and external stimuli defined for it according to the behavior it describes. External stimuli are handled as properties of the relevant object. For example, element behavior A, whose behavior output is "eat", handles the type of the object (OBJECT_ID), the size of the object (OBJECT_SIZE), the distance to the object (OBJECT_DISTANCE), and so on as external stimuli, and handles "NOURISHMENT" (nutritional state), "FATIGUE" (tiredness), and so on as internal states. In this way, the kinds of external stimuli and internal states handled are defined for each element behavior, and the behavior value is calculated for the behavior (element behavior) corresponding to those external stimuli and internal states. A single internal state or external stimulus may of course be associated with several element behaviors rather than only one.

The internal state management unit 104 takes as input the external stimuli together with information such as the robot's own remaining battery level and motor rotation angles, and calculates and manages the values of the internal states (the internal state vector IntV) corresponding to the plurality of internal states described above. Specifically, for example, the internal state "nutritional state" can be determined from the remaining battery level, and the internal state "fatigue" can be determined from the power consumption.

The behavior value AL of a homeostasis behavior indicates how much the robot apparatus wants to perform that element behavior (its execution priority). The behavior value calculation unit 220 refers to the behavior value calculation database 221, in which each input external stimulus is associated with the internal state change expected after the behavior is expressed, and calculates the behavior value AL of each of the element behaviors A to D at a given time from the external stimuli and internal states at that time. In the example shown in FIG. 55 a behavior value calculation unit 220 is provided for each element behavior, but a single behavior value calculation unit may instead calculate the behavior values of all element behaviors.

The behavior value AL of each homeostasis element behavior is calculated from three quantities: the desire value for the behavior corresponding to each current internal state, the degree of satisfaction based on each current internal state, and the expected change in satisfaction. The last of these is based on the expected internal state change, that is, the amount by which the internal state is expected to change as a result of an external stimulus being input and the behavior being expressed.

FIG. 56 shows the flow of processing in which the behavior value calculation unit 220 calculates the behavior value AL from the internal state and the external stimuli. In this embodiment, an internal state vector IntV (Internal Variable) having one or more internal state values as components is defined for each element behavior, and the internal state vector IntV for each element behavior is obtained from the internal state management unit 104. Each component of the internal state vector IntV is the value of one internal state (an internal state parameter) representing, for example, one of the emotions described above, and is used to calculate the behavior value of the element behavior associated with that component. Specifically, for element behavior A with the behavior output "eat", the internal state vector IntV {IntV_NOURISHMENT "nutritional state", IntV_FATIGUE "fatigue"} is defined, for example.

In addition, an external stimulus vector ExStml (External Stimulus) having one or more external stimulus values as components is defined for each internal state, and the external stimulus vector ExStml for each element behavior is obtained from the storage units 105 and 106. Each component of the external stimulus vector ExStml is an item of recognition information such as the size of the object, the type of the object, or the distance to the object, and is used to calculate the internal state value associated with that component. Specifically, for the internal state IntV_NOURISHMENT "nutritional state", the external stimulus vector ExStml {OBJECT_ID "type of object", OBJECT_SIZE "size of object"} is defined, for example, and for the internal state IntV_FATIGUE "fatigue", the external stimulus vector ExStml {OBJECT_DISTANCE "distance to object"} is defined, for example.
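The association described above, from an element behavior to its internal states and from each internal state to its external stimuli, can be pictured as two lookup tables. The sketch below uses the identifiers given in the text for the "eat" example; the table names `INT_V` and `EX_STML`, the behavior key `"eat"`, and the helper `stimuli_for` are hypothetical and only illustrate the structure.

```python
# Internal-state components (IntV) defined per element behavior.
INT_V = {
    "eat": ["IntV_NOURISHMENT", "IntV_FATIGUE"],
}

# External-stimulus components (ExStml) defined per internal state.
EX_STML = {
    "IntV_NOURISHMENT": ["OBJECT_ID", "OBJECT_SIZE"],
    "IntV_FATIGUE": ["OBJECT_DISTANCE"],
}


def stimuli_for(behavior: str) -> dict[str, list[str]]:
    """Map each internal state of `behavior` to the stimuli defined for it."""
    return {state: EX_STML[state] for state in INT_V[behavior]}
```

Because the text allows one internal state or stimulus to be shared by several element behaviors, keeping the two tables separate avoids repeating the stimulus list for every behavior that uses the same internal state.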

The behavior value calculation unit 220 takes the internal state vector IntV and the external stimulus vector ExStml as input and calculates the behavior value AL. Specifically, the behavior value calculation unit 220 has a first calculation unit MV, which obtains from the internal state vector IntV a motivation vector (Motivation Vector) indicating how much the robot wants to perform the element behavior, and a second calculation unit RV, which obtains from the internal state vector IntV and the external stimulus vector ExStml a releasing vector (Releasing Vector) indicating whether the element behavior can be performed. It calculates the behavior value AL from these two vectors.

For details of how the behavior value AL of a homeostasis behavior is calculated, see, for example, the specification of Japanese Patent Application No. 2004-68133, already assigned to the present applicant.

H-4. Motion Command Selection for Idle Behavior and Its Strength
Idle behavior has no homeostatic purpose, that is, it is unrelated to the internal state, so its element behavior is given a constant behavior value. To avoid or mitigate conflict with reflex behavior, the strength of its motion command is determined from the reflex event density, so that idle behavior is selected with an occurrence probability that depends on the reflex event density. This subsection details how the motion command of the idle behavior and its strength are determined.

The reflex event management unit 110 calculates the reflex event density from the events input to the reflex behavior unit 109. The idle element behavior uses this value to decide which kind of motion command to output and with what strength.

In this embodiment, the reflex event density RED (Reflexive Event Density) is governed by two expressions: one increases RED each time a reflex event is input, and the other attenuates it over time. Here IR (Increase Ratio) is the rate of increase of the reflex event density RED per reflex event (where 0 < IR < 1), and RED_max = 1.0. dT is the elapsed time, and τ is the half-life of the time decay.

The increase ratio IR may be set separately for each kind of reflex event. For example, it can be set large for reflex events that the user causes explicitly (such as pressing the shoulder touch sensor) and small for reflex events that are easily misrecognized because of noise (such as a clap sound). FIG. 57 illustrates how the reflex event density RED changes as reflex events occur.
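The two update expressions for RED do not appear in this text, so the following is only a sketch under assumed forms that are consistent with the stated constraints (0 < IR < 1, RED_max = 1.0, half-life τ): each event closes a fraction IR of the gap to RED_max, and RED halves every τ of elapsed time. The function names and the per-event ratios in `IR_BY_EVENT` are hypothetical.

```python
RED_MAX = 1.0

# Hypothetical per-event increase ratios: explicit user input counts more
# than a clap sound, which is easily misrecognized because of noise.
IR_BY_EVENT = {"shoulder_touch": 0.3, "clap": 0.1}


def on_reflex_event(red: float, ir: float) -> float:
    """Assumed increase rule: RED moves a fraction `ir` toward RED_MAX."""
    if not 0.0 < ir < 1.0:
        raise ValueError("IR must satisfy 0 < IR < 1")
    return red + ir * (RED_MAX - red)


def decay(red: float, dt: float, tau: float) -> float:
    """Assumed decay rule: RED halves every `tau` units of elapsed time."""
    if tau <= 0.0:
        raise ValueError("tau must be positive")
    return red * 0.5 ** (dt / tau)
```

With these forms RED approaches but never exceeds RED_max under repeated events and falls back toward zero when the robot is left alone, which matches the qualitative behavior the text attributes to FIG. 57.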

Idle behavior should preferably not interfere with reflex behavior, which has to respond directly to external stimuli. For this reason, in this embodiment the motion command of the idle behavior is determined according to the reflex event density.

For example, when the user inputs many reflex events and the reflex event density is high, motions with a small amount of movement, such as "do nothing" or "tilt the head", are output with high probability so that the robot can respond more reactively. Conversely, when the robot is left alone by the user, few reflex events are input, and the reflex event density is low, motions with a large amount of movement, such as "practice golf putts" or "lie down and relax", are output with high probability so that the robot avoids standing still as far as possible.

Which motion (element behavior) is chosen for a given reflex event density is decided probabilistically by the following mechanism.

Let RED_mi be the reflex event density at which a motion m_i is selected most often. With the distance from the current reflex event density RED defined as D_mi = |RED_mi − RED|, the probability P_mi that the i-th motion m_i is selected is calculated from D_mi.

T is the Boltzmann temperature; setting it to a large value makes the behavior selection more random. Conversely, setting it to a small value removes the probabilistic element from the selection, and the motion is determined uniquely by the reflex event density RED (the motion whose RED_mi is closest to RED). This selection is not performed while a motion is being executed.
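The selection expression itself is not reproduced in this text. A standard Boltzmann (softmax) form that matches the description, with P_mi proportional to exp(−D_mi / T), is sketched below as an assumption. The motion labels and the function names are hypothetical, and the preferred densities 80, 60, 40, 20 are borrowed from the worked example in H-5, which appears to use a 0 to 100 scale rather than RED_max = 1.0.

```python
import math
import random

# Preferred reflex event density RED_mi per idle motion (assumed 0-100 scale).
PREFERRED = {
    "do_nothing": 80.0,
    "tilt_head": 60.0,
    "practice_golf_putt": 40.0,
    "lie_down_and_relax": 20.0,
}


def selection_probabilities(red, preferred, temperature):
    """Assumed Boltzmann rule: P_i is proportional to exp(-|RED_mi - RED| / T)."""
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    dists = {name: abs(p - red) for name, p in preferred.items()}
    d_min = min(dists.values())  # shift so the largest weight is exactly 1.0
    weights = {n: math.exp(-(d - d_min) / temperature) for n, d in dists.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def choose_motion(red, preferred, temperature, rng=random):
    """Draw one motion according to the probabilities above."""
    probs = selection_probabilities(red, preferred, temperature)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]
```

Subtracting the smallest distance before exponentiating leaves the probabilities unchanged but keeps at least one weight at 1.0, so a very small T reduces the rule to picking the nearest RED_mi, as the text describes, without dividing by zero.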

FIG. 58 shows an example of the probability P_mi with which m_i is selected, as a function of the reflex event density RED. In the illustrated example, four kinds of motion are defined as idle behavior, "do nothing", "tilt the head", "practice golf putts", and "lie down and relax", and a most-often-selected reflex event density RED_m0 to RED_m3 is set for each of them. As described above, to suppress interference of idle behavior with reflex behavior while enhancing entertainment value, a low reflex event density is set for motions with a large amount of movement and a higher reflex event density is set for motions with a small amount of movement. For example, the value is set large for motions with a small amount of movement, such as "do nothing" and "tilt the head", and small for motions with a large amount of movement, such as "practice golf putts" and "lie down and relax".

The occurrence probability of each motion obtained in this way is then used as the strength of the motion command for that motion. As a result, in the arbitration of spontaneous behavior in the resource management unit 120, a high reflex event density makes idle behavior with a small amount of movement, or reflex behavior, more likely to be chosen, while a low reflex event density makes idle behavior with a large amount of movement more likely to be chosen and reflex behavior less likely. The timing at which idle behavior is output spontaneously and the kind of motion can therefore be arbitrated according to the kind and frequency of the events that trigger reflex behavior.

In the situation-dependent behavior layer 108, as described above, a behavior value is calculated for each homeostasis element behavior, and an idle element behavior is selected according to the kind and frequency of the events that trigger reflex behavior. The behavior selection unit 230 then compares the behavior values of the homeostasis and idle element behaviors and makes the final decision on which to select as the spontaneous behavior.

The behavior value of a homeostasis behavior is calculated from the desire value and the expected satisfaction value, whereas idle behavior is given a constant behavior value. Accordingly, when the desire value rises, the behavior value of the homeostasis behavior rises and the behavior selection unit 230 selects the homeostasis behavior as the situation-dependent behavior. Conversely, when all internal states are sufficiently satisfied, that is, when the robot is not in a homeostatic condition, the behavior values of the homeostasis behaviors fall, and once they drop below the behavior value of the idle behavior, the behavior selection unit 230 selects the idle behavior as the situation-dependent behavior.

H-5. Arbitration Between Spontaneous Behavior and Reflex Behavior
The resource management unit 120 arbitrates between spontaneous behavior and reflex behavior. Specifically, it compares the strengths of the motion commands for the situation-dependent behavior and the reflex behavior received from the situation-dependent behavior layer 108 and the reflex behavior unit 109, respectively, outputs the motion with the larger value to the external environment, and thereby has the robot apparatus 1 actually perform it.

Reflex behavior is basically reflexive behavior triggered by the sensor input itself, but like the situation-dependent behavior layer it consists of a plurality of element behaviors. The element behaviors of the reflex behavior unit 109 receive input directly from the visual recognition function unit 101, the auditory recognition function unit 102, and the contact recognition function unit 103 each time an event occurs, so an immediate response (reflex) is possible. Each element behavior outputs a specific motion command (look toward the sound, flinch backward, look toward the pressed touch sensor) to the resource management unit in response to the input it watches for (a clap occurred, an object suddenly appeared in front of the robot, the shoulder touch sensor was pressed, and so on). A constant value is attached as the strength of the motion command.

The following details the method of arbitrating between spontaneous behavior from the situation-dependent behavior layer 108 and reflex behavior from the reflex behavior unit 109.

As a concrete example of a homeostasis behavior, consider a battery charge behavior. The battery charge behavior watches the remaining battery level as its internal state, and its behavior value rises as the remaining battery level falls. The command strength is taken to be the behavior value, which ranges from 0 to 100.

As idle behavior, consider four kinds: "do nothing", "tilt the head", "practice golf putts", and "lie down and relax". A constant behavior value of 20 is given uniformly to these element behaviors. The pair {command strength, reflex event density at which the motion is selected most often} is {70, 80} for "do nothing", {75, 60} for "tilt the head", {85, 40} for "practice golf putts", and {90, 20} for "lie down and relax".

As examples of reflex element behaviors, consider a clap reflex behavior, a moving object reflex behavior, and a touch reflex behavior. The clap reflex behavior takes the sound of hands being clapped as input and outputs a motion command to turn toward the sound source. The moving object reflex behavior takes the sudden approach of an object as input and outputs a motion command to flinch backward. The touch reflex behavior takes a touch sensor as input and outputs a command to look toward the pressed touch sensor. The behavior value of these reflex behaviors, that is, the strength of their motion commands, is a constant 80.

Examples of arbitration between spontaneous behavior (homeostasis behavior, idle behavior) and reflex behavior are described below.

Example of arbitration between homeostasis behavior and reflex behavior:
FIG. 59 shows an example of behavior arbitration when the remaining battery level is medium and a clap is recognized from the outside world.

In this case, in the situation-dependent behavior layer 108, the behavior selection unit 230 compares the behavior value 70 of the battery charge behavior, a homeostasis behavior, with the behavior value 20 of the idle behavior and selects the battery charge behavior, which has the larger behavior value. The motion command of the battery charge behavior, with the behavior value 70 attached as the command strength, is then output to the resource management unit 120.

Suppose that a clap is detected as an external stimulus at this time. The reflex behavior unit 109 selects the clap reflex behavior, which responds immediately to this stimulus, and outputs its motion command to the resource management unit 120 with the command strength 80 that is given uniformly to all reflex behaviors.

The resource management unit 120 then compares the strengths of the motion commands received from the situation-dependent behavior layer 108 and the reflex behavior unit 109 and outputs the motion with the larger value. In this case, therefore, the reflex behavior, which has the larger command strength, is selected.

FIG. 60 shows an example of behavior arbitration when the remaining battery level is low and a clap is recognized.

In this case, in the situation-dependent behavior layer 108, the behavior selection unit 230 compares the behavior value 90 of the battery charge behavior, a homeostasis behavior, with the behavior value 20 of the idle behavior and selects the battery charge behavior, which has the larger behavior value. The motion command of the battery charge behavior, with the behavior value 90 attached as the command strength, is then output to the resource management unit 120.

Suppose that a clap is detected as an external stimulus at this time. The reflex behavior unit 109 selects the clap reflex behavior, which responds immediately to this stimulus, and outputs its motion command to the resource management unit 120 with the command strength 80 that is given uniformly to all reflex behaviors.

The resource management unit 110 then compares the strengths of the motion commands transmitted from the situation-dependent behavior layer 108 and the reflex behavior unit 109, and as a result the battery charge behavior, which has the larger command strength, is selected.

In other words, when the action value (desire) of the homeostasis behavior is low, the robot apparatus readily responds to reflex behavior triggers, whereas when the action value of the homeostasis behavior is high, the robot apparatus does not respond to reflex behavior triggers (it appears to be concentrating on that behavior).
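The arbitration between homeostasis behavior and reflex behavior described for FIGS. 59 and 60 can be sketched as a comparison of command strengths. The following Python sketch is illustrative only: the class and function names are hypothetical, the tie-breaking rule (the situation-dependent command wins when the strengths are equal) is an assumption that this description does not state, and the action value of 50 used for the medium-battery case is an invented value chosen only because it is below 80.

```python
from dataclasses import dataclass
from typing import Optional

# Strength given uniformly to every reflex behavior in the examples above.
REFLEX_STRENGTH = 80


@dataclass(frozen=True)
class MotionCommand:
    source: str      # "situated" or "reflex" (hypothetical labels)
    behavior: str
    strength: float


def select_situated(action_values: dict) -> MotionCommand:
    """Pick the element behavior with the largest action value and use that
    value as the command strength (the role of action selection unit 230)."""
    behavior = max(action_values, key=action_values.get)
    return MotionCommand("situated", behavior, action_values[behavior])


def arbitrate(situated: MotionCommand,
              reflex: Optional[MotionCommand]) -> MotionCommand:
    """Output the command with the larger strength (the role of the resource
    management unit). Ties go to the situated command; this is an assumption."""
    if reflex is not None and reflex.strength > situated.strength:
        return reflex
    return situated


clap = MotionCommand("reflex", "clap_reflex", REFLEX_STRENGTH)

# FIG. 60: low battery. Battery charge (90) beats idle (20) and the reflex (80).
low_battery = select_situated({"battery_charge": 90, "idle": 20})
print(arbitrate(low_battery, clap).behavior)     # battery_charge

# Hypothetical medium-battery case: an assumed action value of 50 loses to 80.
medium_battery = select_situated({"battery_charge": 50, "idle": 20})
print(arbitrate(medium_battery, clap).behavior)  # clap_reflex
```

Because the reflex strength is fixed, the only quantity that changes the outcome in this sketch is the action value of the homeostasis behavior, which matches the observation that a strongly motivated robot appears to ignore reflex triggers.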

Example of arbitration between idle behavior and reflex behavior:
FIGS. 61 and 62 show examples of behavior arbitration when the remaining battery level is sufficient and a clap is recognized from the outside world.

In this case, since the internal state is sufficiently satisfied, the desire value for the battery charge behavior as a homeostasis behavior becomes small, that is, its action value decreases, so the situation-dependent behavior layer 108 outputs an element behavior of the idle behavior as a motion command. The type of idle behavior that is output is determined by the reflex event density RED at that time (as described above).

FIG. 61 shows an example of behavior arbitration when the user calls to the robot apparatus by clapping repeatedly and the reflex event density RED rises from some value to about 60.

At this time, if the element behavior of the idle behavior output from the situation-dependent behavior layer 108 is "tilting the head," the strength of its motion command is 75. The command strength of the reflex behavior, at the constant value of 80, is larger, so the resource management unit 120 selects the reflex behavior, which has the larger command strength.

FIG. 62 shows an example of behavior arbitration when the user calls to the robot apparatus by clapping after the robot apparatus has been left alone for a while and the reflex event density has consequently dropped to about 40.

At this time, if the element behavior of the idle behavior output from the situation-dependent behavior layer 108 is "practicing golf putting," the strength of its motion command is 85. The command strength of the reflex behavior, at the constant value of 80, is smaller, so the resource management unit 120 selects the idle behavior, which has the larger command strength.

If the user then keeps calling to the robot apparatus by clapping, the reflex event density RED rises, and as a result the resource management unit 120 comes to output the reflex behavior, as shown in FIG. 61.
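How the reflex event density might fall while the robot apparatus is left alone and rise again under repeated clapping can be sketched as follows. The update rule used here (a fixed increment per reflex event plus exponential decay per time step, clamped to the range 0 to 100) and all of its constants are assumptions made for illustration; this passage does not specify the actual formula.

```python
def update_red(red: float, event: bool,
               gain: float = 5.0, decay: float = 0.95) -> float:
    """One time step of a hypothetical RED update.

    red         : current reflex event density (assumed range 0..100)
    event       : True if a reflex-triggering input (e.g. a clap) occurred
    gain, decay : illustrative constants, not taken from the source
    """
    red = red * decay + (gain if event else 0.0)
    return max(0.0, min(100.0, red))


def simulate(red: float, events: list) -> list:
    """Return the RED trajectory for a sequence of event flags."""
    trajectory = []
    for event in events:
        red = update_red(red, event)
        trajectory.append(red)
    return trajectory


# Left alone for ten steps, then clapped at for ten steps.
history = simulate(60.0, [False] * 10 + [True] * 10)
print(round(history[9], 1), round(history[-1], 1))
```

With these assumed constants the density decays during the quiet period and recovers once clapping resumes, which is the qualitative behavior the two figures rely on.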

In other words, during idle behavior, when reflex events from the user are frequent and the reflex event density is high, the robot apparatus responds to reflex behavior triggers. On the other hand, when the robot apparatus has been left alone by the user and the reflex event density is low, it does not respond to reflex behavior triggers (it appears to be concentrating on its current behavior).
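The link between the reflex event density and the robot's responsiveness can be illustrated with a small sketch. The command strengths 75, 85, and 80 come from the examples above; everything else is an assumption: the function names are hypothetical, only two idle element behaviors are modeled, and the linear mapping from RED to selection probability merely stands in for whatever probability curve is actually used.

```python
import random

REFLEX_STRENGTH = 80

# Command strengths of the idle element behaviors in FIGS. 61 and 62.
IDLE_STRENGTH = {"tilt_head": 75, "golf_putt_practice": 85}


def idle_probabilities(red: float) -> dict:
    """Hypothetical linear model: a higher RED favors the weaker idle behavior."""
    p = max(0.0, min(1.0, red / 100.0))
    return {"tilt_head": p, "golf_putt_practice": 1.0 - p}


def choose_idle(red: float, rng: random.Random) -> str:
    """Draw one idle element behavior according to the RED-dependent weights."""
    probs = idle_probabilities(red)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


def reaction_probability(red: float) -> float:
    """Probability that a clap reflex (strength 80) beats the idle behavior."""
    probs = idle_probabilities(red)
    return sum(p for name, p in probs.items()
               if IDLE_STRENGTH[name] < REFLEX_STRENGTH)


print(reaction_probability(60))  # 0.6
print(reaction_probability(40))  # 0.4
```

Under this assumed model, the reflex wins whenever the selected idle element behavior has a strength below 80, so the chance of reacting to a clap grows with RED without the reflex strength itself ever changing.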

The present invention has been described in detail above with reference to specific embodiments. However, it is obvious that those skilled in the art can modify or substitute the embodiments without departing from the gist of the present invention.

The gist of the present invention is not necessarily limited to products called "robots." That is, the present invention can be applied equally to products belonging to other industrial fields, such as toys, as long as they are mechanical devices that use electrical or magnetic action to perform movements resembling human motion.

The behavior control mechanism according to the present invention can also be suitably applied not only to robot apparatuses but also to other autonomous agents that perform spontaneous behavior and reflex behavior.

In short, the present invention has been disclosed by way of example, and the contents of this specification should not be interpreted restrictively. The claims should be consulted in order to determine the gist of the present invention.

FIG. 1 is a diagram schematically showing the functional configuration of the robot apparatus 1 used in carrying out the present invention.
FIG. 2 is a diagram showing the configuration of the control unit 20 in more detail.
FIG. 3 is a diagram schematically showing the functional configuration of the behavior control system 100 of the robot apparatus 1 according to the embodiment of the present invention.
FIG. 4 is a diagram showing the flow of operation of the objects that make up the behavior control system 100 shown in FIG. 3.
FIG. 5 is a diagram showing the flow of information entering the target memory in the short-term memory unit 105 based on the recognition results of the recognition function units 101 to 103.
FIG. 6 is a diagram showing the flow of information entering the event memory in the short-term memory unit 105 based on the recognition results of the recognition function units 101 to 103.
FIG. 7 is a diagram for explaining dialogue processing between the robot 1 and users A and B.
FIG. 8 is a diagram for explaining dialogue processing between the robot 1 and users A and B.
FIG. 9 is a diagram for explaining dialogue processing between the robot 1 and users A and B.
FIG. 10 is a diagram conceptually showing the storage process of the associative memory according to one embodiment of the present invention.
FIG. 11 is a diagram conceptually showing the recall process of the associative memory according to one embodiment of the present invention.
FIG. 12 is a diagram schematically showing a configuration example of an associative memory system to which a competitive neural network is applied.
FIG. 13 is a diagram schematically showing the object configuration of the behavior control system 100 according to the embodiment of the present invention.
FIG. 14 is a diagram schematically showing the form of situation-dependent behavior control by the situation-dependent behavior layer 108.
FIG. 15 is a diagram showing a basic operation example of behavior control by the situation-dependent behavior layer 108 shown in FIG. 14.
FIG. 16 is a diagram showing an operation example when reflex behavior is performed by the situation-dependent behavior layer 108 shown in FIG. 14.
FIG. 17 is a diagram showing an operation example when emotion is expressed by the situation-dependent behavior layer 108 shown in FIG. 14.
FIG. 18 is a diagram schematically showing how the situation-dependent behavior layer 108 is composed of a plurality of schemas.
FIG. 19 is a diagram schematically showing the tree structure of schemas in the situation-dependent behavior layer 108.
FIG. 20 is a diagram schematically showing the internal configuration of a schema.
FIG. 21 is a diagram schematically showing the internal configuration of the Monitor function.
FIG. 22 is a diagram schematically showing a configuration example of the behavior state control unit.
FIG. 23 is a diagram schematically showing another configuration example of the behavior state control unit.
FIG. 24 is a diagram schematically showing the mechanism for controlling normal situation-dependent behavior in the situation-dependent behavior layer 108.
FIG. 25 is a diagram schematically showing the configuration of schemas in the reflex behavior unit 109.
FIG. 26 is a diagram schematically showing the mechanism by which the reflex behavior unit 109 controls reflexive behavior.
FIG. 27 is a diagram schematically showing the class definition of the schemas used in the situation-dependent behavior layer 108.
FIG. 28 is a diagram showing the state machine of the action function of a schema.
FIG. 29 is a diagram showing the state machine of a schema.
FIG. 30 is a diagram schematically showing the functional configuration of the classes in the situation-dependent behavior layer 108.
FIG. 31 is a flowchart showing the processing procedure for executing the MakePronome function.
FIG. 32 is a flowchart showing the processing procedure for executing the Monitor function.
FIG. 33 is a flowchart showing the processing procedure for executing the Actions function.
FIG. 34 is a flowchart showing the processing procedure for executing the Actions function.
FIG. 35 is a diagram for explaining the reentrant property of a schema.
FIG. 36 is a diagram schematically showing the hierarchical configuration of the internal state management unit 104 according to the present embodiment.
FIG. 37 is a diagram schematically showing the communication paths between the internal state management unit 104 and other functional modules.
FIG. 38 is a diagram showing the mechanism by which the internal state management unit 104 changes the internal state with the passage of time.
FIG. 39 is a diagram showing the mechanism by which the internal state management unit 104 changes the internal state as the robot executes operations.
FIG. 40 is a diagram showing the mechanism by which the internal state management unit 104 changes the internal state according to recognition results of the external environment.
FIG. 41 is a diagram showing the mechanism by which the internal state management unit 104 changes the internal state through associative memory.
FIG. 42 is a diagram showing the mechanism by which the internal state management unit 104 changes the internal state through innate reflex behavior.
FIG. 43 is a diagram schematically showing the relationship between schemas and the internal state management unit.
FIG. 44 is a diagram schematically showing the Motivation calculation path of a Motivation calculation class object.
FIG. 45 is a diagram schematically showing the mechanism of the Motivation calculation process when a target object exists.
FIG. 46 is a diagram schematically showing the mechanism of the Motivation calculation process when no target object exists.
FIG. 47 is a diagram showing how Pleasantness is changed.
FIG. 48 is a diagram showing how Activity is changed.
FIG. 49 is a diagram showing how Certainty is changed.
FIG. 50 is a diagram showing the mechanism for obtaining Certainty.
FIG. 51 is a diagram showing the configuration of the ResourceManager.
FIG. 52 is a diagram schematically showing control from the SituatedBehaviorLayer to the ReflexiveBehaviorLayer.
FIG. 53 is a diagram schematically showing control from the DeliberativeLayer to the SituatedBehaviorLayer.
FIG. 54 is a diagram showing another configuration example of the behavior control system of the robot apparatus 1.
FIG. 55 is a diagram showing an example of the internal configuration of the situation-dependent behavior layer 108.
FIG. 56 is a diagram showing the flow of processing in which the action value calculation unit 220 calculates the action value AL from the internal state and the external stimulus.
FIG. 57 is a diagram illustrating how the reflex event density RED changes as reflex events occur.
FIG. 58 is a diagram showing an example of the probability Pm_i with which m_i is selected using the reflex event density RED.
FIG. 59 is a diagram showing an example of behavior arbitration when the remaining battery level is medium and a clap is recognized from the outside world.
FIG. 60 is a diagram showing an example of behavior arbitration when the remaining battery level is low and a clap is recognized.
FIG. 61 is a diagram showing an example of behavior arbitration when the remaining battery level is sufficient and a clap is recognized from the outside world.
FIG. 62 is a diagram showing an example of behavior arbitration when the remaining battery level is sufficient and a clap is recognized from the outside world.

Explanation of symbols

1 ... Robot apparatus
15 ... CCD camera
16 ... Microphone
17 ... Speaker
18 ... Touch sensor
19 ... LED indicator
20 ... Control unit
21 ... CPU
22 ... RAM
23 ... ROM
24 ... Non-volatile memory
25 ... Interface
26 ... Wireless communication interface
27 ... Network interface card
28 ... Bus
29 ... Keyboard
40 ... Input/output unit
50 ... Drive unit
51 ... Motor
52 ... Encoder
53 ... Driver
100 ... Behavior control system
101 ... Visual recognition function unit
102 ... Auditory recognition function unit
103 ... Contact recognition function unit
105 ... Short-term memory unit
106 ... Long-term memory unit
107 ... Deliberative behavior layer
108 ... Situation-dependent behavior layer
109 ... Reflex behavior unit
220 ... Action value calculation unit
230 ... Action selection unit
240 ... Learning unit

Claims

A robot apparatus that generates behavior based on an internal state or an external input, comprising:
a plurality of behavior control layers that control behavior based on the internal state or the external input; and
a resource management unit that manages resources of the robot apparatus and resolves conflicts among operation commands, relating to driving of the robot apparatus, from the respective behavior control layers.

The robot apparatus according to claim 1, wherein:
each of the behavior control layers is composed of one or more behavior modules that determine the behavior of the robot apparatus;
each behavior module includes behavior evaluation means that outputs a behavior evaluation of the robot apparatus according to the internal state or the external input, and behavior command output means that outputs a behavior command for the robot apparatus; and
each behavior control layer generates a behavior command for controlling the behavior of the robot apparatus based on the behavior evaluations from the behavior evaluation means of the behavior modules and on the resources used by the robot apparatus.

The robot apparatus according to claim 2, wherein, in each of the behavior control layers, the plurality of behavior modules form a tree-like hierarchical structure, and a behavior module is selected and the behavior of the robot apparatus is controlled based on the behavior evaluations and the resources used by the robot apparatus that are output from behavior modules in lower layers to behavior modules in upper layers.

The robot apparatus according to claim 2, wherein:
the behavior command output means of each behavior module outputs the resources of the robot apparatus that are used when the behavior command is executed on the robot apparatus; and
each behavior control layer controls the behavior of the robot apparatus based on the behavior evaluations and the resources used by the robot apparatus that are output from behavior modules in lower layers of the hierarchical structure to behavior modules in upper layers.

The robot apparatus according to claim 2, wherein the behavior control layers include a reflex behavior layer that directly receives recognition results of external inputs and directly determines output behavior, a situation-dependent behavior layer that controls behavior responding promptly to the situation in which the robot apparatus is placed, based on external inputs and the internal state of the robot apparatus, and a deliberative behavior layer that infers future situations and plans the behavior of the robot apparatus over a relatively long period of time.

The robot apparatus according to claim 5, wherein the resource management unit arbitrates conflicts between operation commands from the reflex behavior layer and the situation-dependent behavior layer.

The robot apparatus according to claim 5, wherein the resource management unit includes command conflict resolution means that resolves conflicts between commands from the reflex behavior layer and the deliberative behavior layer, and content management means that manages commands for each hardware resource of the robot apparatus.

The robot apparatus according to claim 7, wherein the command conflict resolution means:
holds each command together with the hardware resources of the robot apparatus used by the command and the activity level of the command; and
when a new command is transmitted from any of the behavior control layers, determines whether the hardware resources of the robot apparatus used by the currently executing command conflict with those used by the new command and, if they do, resolves the command conflict by comparing the activity levels associated with the two commands.

The robot apparatus according to claim 5, wherein the behavior evaluation means in each behavior module includes behavior induction evaluation value calculation means that calculates, as an activity level, an evaluation value that induces output of a behavior command by the behavior command output means, bias calculation means that calculates, as an intention level, a bias on the activity level, and used-resource calculation means that specifies the resources of the robot apparatus to be used when the behavior command from the behavior command output means is executed.

The robot apparatus according to claim 9, wherein:
the situation-dependent behavior layer outputs to the reflex behavior layer a behavior intention signal that suppresses or excites a reflex behavior according to whether the reflex behavior matches the intention of the situation-dependent behavior; and
the reflex behavior layer includes behavior intention management means that manages the behavior intention signal and, by manipulating the intention level in each behavior module based on the behavior intention of the situation-dependent behavior layer, suppresses behavior modules contrary to the intention of the situation-dependent behavior or excites behavior modules that match that intention.

The robot apparatus according to claim 9, wherein:
the deliberative behavior layer outputs to the reflex behavior layer a behavior intention signal that suppresses or excites a situation-dependent behavior according to whether the situation-dependent behavior matches the intention of the deliberative behavior; and
the situation-dependent behavior layer includes behavior intention management means that manages the behavior intention signal and, by manipulating the intention level in each behavior module based on the behavior intention of the deliberative behavior layer, suppresses behavior modules contrary to the intention of the deliberative behavior or excites behavior modules that match that intention.

The robot apparatus according to claim 1, wherein:
the behavior control layers include a situation-dependent behavior layer that spontaneously generates homeostasis behavior for keeping the internal state within a certain range and idle behavior that has no homeostatic purpose, and a reflex behavior layer that directly receives recognition results of external inputs and directly determines output behavior; and
the resource management unit arbitrates between the spontaneous behavior output from the situation-dependent behavior layer and the reflex behavior output from the reflex behavior layer.

The robot apparatus according to claim 12, wherein:
the situation-dependent behavior layer is composed of a plurality of element behaviors that output the respective homeostasis behaviors and idle behaviors; and
the reflex behavior layer is composed of a plurality of element behaviors that output the respective reflex behaviors.

The robot apparatus according to claim 13, wherein:
each element behavior has action value calculation means that calculates an action value indicating the execution priority of that element behavior; and
the situation-dependent behavior layer further includes action selection means that selects, based on the action values, the element behavior to be output as spontaneous behavior.

The robot apparatus according to claim 14, wherein the action value calculation means:
calculates a desire value from the internal state of the robot apparatus, calculates a predicted satisfaction value from the internal state and the recognition results, and calculates the action value of an element behavior of homeostasis behavior from the desire value and the predicted satisfaction value; and
gives a constant value as the action value of an element behavior of idle behavior.

The robot apparatus according to claim 15, wherein the resource management unit arbitrates between the element behavior of spontaneous behavior output from the situation-dependent behavior layer and the element behavior of reflex behavior output from the reflex behavior layer.

The robot apparatus according to claim 16, wherein:
the situation-dependent behavior layer and the reflex behavior layer each assign a command strength to the element behavior to be output; and
the resource management unit arbitrates between homeostasis behavior and reflex behavior, or between idle behavior and reflex behavior, based on the command strengths.

The robot apparatus according to claim 17, wherein the situation-dependent behavior layer assigns to an element behavior of homeostasis behavior a command strength based on its action value.

The robot apparatus according to claim 17, wherein the situation-dependent behavior layer assigns a unique command strength to each element behavior of idle behavior.

The robot apparatus according to claim 17, wherein the reflex behavior layer assigns a constant value as the command strength to all of its element behaviors.

The robot apparatus according to claim 13, further comprising a reflex event management unit that manages a reflex event density, which is the density at which external input events that trigger reflex behavior occur, wherein the situation-dependent behavior layer obtains, based on the reflex event density, the probability with which each element behavior of idle behavior is selected, and selects an element behavior according to that probability.