JP3558222B2 - Robot behavior control system and behavior control method, and robot device

Info

Publication number: JP3558222B2
Application number: JP2003072844A
Authority: JP
Inventors: 雅博藤田; 剛高木; 里香堀中; 伸弥大谷
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2002-03-15
Filing date: 2003-03-17
Publication date: 2004-08-25
Anticipated expiration: 2023-03-17
Also published as: JP2003334785A

Description

【０００１】
【発明の属する技術分野】
本発明は、自律的な動作を行ないユーザとのリアリスティックなコミュニケーションを実現するロボットの行動制御システム及び行動制御方法、並びにロボット装置に係り、特に、視覚や聴覚などの外部環境の認識結果や本能や感情などの内部状態などのロボットが置かれている状況を統合的に判断して適当な行動を選択する状況依存行動型のロボットのための行動制御システム及び行動制御方法、並びにロボット装置に関する。
【０００２】
【従来の技術】
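A mechanical apparatus that uses electrical or magnetic action to perform movements resembling those of a human is called a "robot". The word is said to derive from the Slavic "ROBOTA" (slave machine). In Japan, robots began to spread at the end of the 1960s, but most of them were industrial robots, such as manipulators and transport robots, intended to automate and unman production work in factories.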
【０００３】
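More recently, research and development has advanced on the structure and stable walking control of legged mobile robots, such as pet robots that imitate the body mechanisms and movements of four-legged animals like dogs, cats, and bears, and "human-shaped" or "human-type" robots (humanoid robots) that imitate the body mechanisms and movements of animals that walk upright on two legs, such as humans and monkeys. Expectations for their practical use are also rising. Compared with crawler robots, legged mobile robots are less stable and harder to control in posture and gait, but they excel at flexible walking and running, such as climbing stairs and stepping over obstacles.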
【０００４】
脚式移動ロボットの用途の１つとして、産業活動・生産活動等における各種の難作業の代行が挙げられる。例えば、原子力発電プラントや火力発電プラント、石油化学プラントにおけるメンテナンス作業、製造工場における部品の搬送・組立作業、高層ビルにおける清掃、火災現場その他における救助といったような危険作業・難作業の代行などである。
【０００５】
また、脚式移動ロボットの他の用途として、上述の作業支援というよりも、生活密着型、すなわち人間との「共生」あるいは「エンターティンメント」という用途が挙げられる。この種のロボットは、ヒトあるいはイヌ（ペット）、クマなどの比較的知性の高い脚式歩行動物の動作メカニズムや四肢を利用した豊かな感情表現を忠実に再現する。また、あらかじめ入力された動作パターンを単に忠実に実行するだけではなく、ユーザ（あるいは他のロボット）から受ける言葉や態度（「褒める」とか「叱る」、「叩く」など）に対して動的に対応した、生き生きとした応答表現を実現することも要求される。
【０００６】
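In a conventional toy machine, the relationship between user operations and response actions is fixed, and the toy's behavior cannot be changed to suit the user's preferences. As a result, the user eventually tires of a toy that only repeats the same actions. An intelligent robot, by contrast, autonomously selects behaviors consisting of dialogue and body motion, and can therefore achieve realistic communication at a higher intellectual level. As a result, the user feels deep attachment and affinity toward the robot.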
【０００７】
ロボットあるいはその他のリアリスティックな対話システムでは、視覚や聴覚など外部環境の変化に応じて逐次的に行動を選択していくのが一般的である。また、行動選択メカニズムの他の例として、本能や感情といった情動をモデル化してシステムの内部状態を管理して、内部状態の変化に応じて行動を選択するものを挙げることができる。勿論、システムの内部状態は、外部環境の変化によっても変化するし、選択された行動を発現することによっても変化する。
【０００８】
しかしながら、これら外部環境や内部状態などのロボットが置かれている状況を統合的に判断して行動を選択するという、状況依存型の行動制御に関しては例は少ない。
【０００９】
ここで、内部状態には、例えば生体で言えば大脳辺縁系へのアクセスに相当する本能のような要素や、大脳新皮質へのアクセスに相当する内発的欲求や社会的欲求などのように動物行動学的モデルで捉えられる要素、さらには喜びや悲しみ、怒り、驚きなどのような感情と呼ばれる要素などで構成される。
【００１０】
従来のインテリジェント・ロボットやその他の自律対話型ロボットにおいては、本能や感情などさまざまな要因からなる内部状態をすべて「情動」としてまとめて１次元的に内部状態を管理していた。すなわち、内部状態を構成する各要素はそれぞれ並列に存在しており、明確な選択基準のないまま外界の状況や内部状態のみで行動が選択されていた。
【００１１】
従来のシステムでは、その動作の選択及び発現は１次元の中にすべての行動が存在し、どれを選択するかを決定していた。このため、動作が多くなるにつれてその選択は煩雑になり、そのときの状況や内部状態を反映した行動選択を行なうことがより難しくなる。
【００１２】
【発明が解決しようとする課題】
本発明の目的は、自律的な動作を行ないリアリスティックなコミュニケーションを実現することができる、優れたロボットの行動制御システム及び行動制御方法、並びにロボット装置を提供することにある。
【００１３】
本発明のさらなる目的は、視覚や聴覚などの外部環境の認識結果や本能や感情などの内部状態などのロボットが置かれている状況を統合的に判断して行動を選択することができる、優れたロボットの行動制御システム及び行動制御方法、並びにロボット装置を提供することにある。
【００１４】
本発明のさらなる目的は、情動についての存在意義をより明確にして、一定の秩序の下で外部刺激や内部状態に応じた行動を好適に選択し実行することができる、優れたロボットの行動制御システム及び行動制御方法、並びにロボット装置を提供することにある。
【００１５】
本発明のさらなる目的は、視覚や聴覚などの外部環境の認識結果や本能や感情などの内部状態などのロボットが置かれている状況を統合的に判断して行動を選択することができる、優れたロボットの行動制御システム及び行動制御方法、並びにロボット装置を提供することにある。
【００１６】
【課題を解決するための手段及び作用】
本発明は、上記課題を参酌してなされたものであり、その第１の側面は、自律的に動作するロボットのための行動制御システムであって、
ロボットの機体動作を記述する複数の行動記述部と、
機体の外部環境を認識する外部環境認識部と、
認識された外部環境及び／又は行動の実行結果に応じたロボットの内部状態を管理する内部状態管理部と、
外部環境及び／又は内部状態に応じて前記の各行動記述部に記述された行動の実行を評価する行動評価部と、
を具備することを特徴とするロボットの行動制御システムである。
【００１７】
但し、ここで言う「システム」とは、複数の装置（又は特定の機能を実現する機能モジュール）が論理的に集合した物のことを言い、各装置や機能モジュールが単一の筐体内にあるか否かは特に問わない。
【００１８】
前記外部環境認識部は、外部の視覚認識、外部で発生する音声認識、外部から印加された接触認識のうち少なくとも１つを行なう。また、前記内部状態管理部は、ロボットの本能モデル及び／又は感情モデルを管理する。
【００１９】
前記行動記述部は、複数の行動記述部が機体動作の実現レベルに応じた木構造形式に構成することができる。この木構造は、動物行動学的（Ｅｔｈｏｌｏｇｉｃａｌ）な状況依存行動を数式化した行動モデルや、感情表現を実行するための枝など、複数の枝を含んでいる。例えば、ルート行動記述部の直近下位の階層では、「探索する（Ｉｎｖｅｓｔｉｇａｔｅ）」、「食べる（Ｉｎｇｅｓｔｉｖｅ）」、「遊ぶ（Ｐｌａｙ）」という行動記述部が配設される。そして、「探索する（Ｉｎｖｅｓｔｉｇａｔｅ）」の下位には、「ＩｎｖｅｓｔｉｇａｔｉｖｅＬｏｃｏｍｏｔｉｏｎ」、「ＨｅａｄｉｎＡｉｒＳｎｉｆｆｉｎｇ」、「ＩｎｖｅｓｔｉｇａｔｉｖｅＳｎｉｆｆｉｎｇ」というより具体的な探索行動を記述した行動記述部が配設されている。同様に、行動記述部「食べる（Ｉｎｇｅｓｔｉｖｅ）」の下位には「Ｅａｔ」や「Ｄｒｉｎｋ」などのより具体的な飲食行動を記述した行動記述部が配設され、行動記述部「遊ぶ（Ｐｌａｙ）」の下位には「ＰｌａｙＢｏｗｉｎｇ」、「ＰｌａｙＧｒｅｅｔｉｎｇ」、「ＰｌａｙＰａｗｉｎｇ」などのより具体的な遊ぶ行動を記述した行動記述部が配設されている。
【００２０】
このような場合、前記行動評価部は該木構造の上から下に向かって複数の行動記述部を同時並行的に評価することができる。また、前記外部環境認識部による新規認識及び／又は前記内部状態管理部による内部状態の変化に応答して、前記行動評価部による前記の各行動記述部の評価を実行して、木構造を上から下に向かって評価結果としての実行許可を渡していくことにより、外部環境や内部状態の変化に応じた適当な行動を選択的に実行することができる。すなわち、状況依存の行動の評価並びに実行をＣｏｎｃｕｒｒｅｎｔに行なうことができる。
【００２１】
また、複数の行動記述部に記述された行動を同時実行するときの機体上の資源の競合を管理する資源管理部をさらに備えていてもよい。このような場合、前記行動選択部は、資源の競合が調停されていることを前提に、２以上の行動記述部を同時に選択することができる。
【００２２】
また、前記外部環境認識部による新規認識により前記行動評価部による前記の各行動記述部の評価を実行した結果、現在実行中の行動よりも高い評価値を得た行動記述部が出現した場合、前記行動選択部は、現在実行中の行動を停止して、評価値がより高い行動記述部に記述された行動を優先的に実行するようにしてもよい。したがって、反射行動のようにより重要度や緊急性の高い行動を、既に実行中の状況依存行動に割り込んで、優先的に実行することができる。このような場合、該優先的に実行した行動が終了した後、一旦停止された行動を再開させることが好ましい。
【００２３】
また、前記行動選択部は、異なる外部環境の変化に応じて同一の行動記述部を逐次選択するようにしてもよい。このような場合、前記行動記述部に記述された行動を実行する度に外部環境毎に個別の作業空間を割り当てるようにする。
【００２４】
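For example, suppose that while the robot is executing the behavior of conversing with person A, person B cuts into the conversation. If evaluating the activation levels from the external stimuli and the change in internal state gives the behavior of conversing with B a higher priority, the dialogue with B interrupts the dialogue with A.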
【００２５】
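In this case, the dialogues with A and B both follow the same behavior description unit, but a work space is allocated for the behavior of conversing with B separately from the one for conversing with A, which prevents the two dialogues from interfering. Because the dialogue with B does not destroy the content of the dialogue with A, the dialogue with A can be resumed from the point of interruption once the dialogue with B ends.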
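The separate work spaces can be illustrated with a minimal sketch. The class names `DialogueContext` and `DialogueBehavior`, and the choice to key work spaces by partner name, are assumptions made for illustration and do not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueContext:
    """Work space for one partner: everything needed to resume later."""
    partner: str
    turn: int = 0
    history: list = field(default_factory=list)


class DialogueBehavior:
    """One behavior description shared by all partners (hypothetical)."""

    def __init__(self):
        self._contexts = {}  # one work space per external target

    def step(self, partner, utterance):
        # Look up (or allocate) the work space for this partner only.
        ctx = self._contexts.setdefault(partner, DialogueContext(partner))
        ctx.turn += 1
        ctx.history.append(utterance)
        return ctx.turn


behavior = DialogueBehavior()
behavior.step("A", "hello")
behavior.step("A", "what is this?")
behavior.step("B", "hey robot")            # B interrupts; A's work space is untouched
resumed_turn = behavior.step("A", "as I was saying")
```

Because the same behavior code runs against a different work space for each partner, the dialogue with A continues at its third turn after B's interruption.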
【００２６】
また、本発明の第２の側面は、内部状態に応じて自律的に動作するロボットの行動制御システム又は行動制御方法であって、
内部状態の指標である情動を複数の階層構造にして管理する内部状態管理部又はステップと、
各階層の情動を満たす行動を選択的に実行する行動選択部又はステップと、
を特徴とするロボットの行動制御システム又は行動制御方法である。
【００２７】
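Here, the internal state management unit or step may organize emotions into stages, namely primary emotions that are necessary for the survival of the individual and secondary emotions that change according to the excess or deficiency of the primary emotions, and may further stratify the primary emotions by dimension, from innate reflexes and physiological levels up to association.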
【００２８】
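The behavior selection unit or step may then preferentially select a behavior that satisfies a lower-order primary emotion. Alternatively, when a higher-order primary emotion is markedly more deficient than a lower-order one, the behavior selection unit or step may suppress the selection of behaviors that satisfy the lower-order primary emotion.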
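One way to read this priority rule is sketched below. The numeric deficit scale, the `margin` parameter that stands in for "markedly more deficient", and the function name `select_emotion` are all assumptions for illustration, since the specification gives no concrete values.

```python
def select_emotion(deficits, margin=0.5):
    """Pick the primary emotion whose satisfying behavior should run.

    deficits: list of (level, name, deficit); level 0 is the lowest order.
    The lowest-order lacking emotion wins, unless a higher-order one is
    more deficient by more than `margin` (an assumed threshold).
    """
    lacking = sorted((d for d in deficits if d[2] > 0), key=lambda d: d[0])
    if not lacking:
        return None
    chosen = lacking[0]
    for candidate in lacking[1:]:
        if candidate[2] - chosen[2] > margin:
            chosen = candidate  # the lower-order choice is suppressed
    return chosen[1]


usual = select_emotion([(0, "fatigue", 0.2), (1, "curiosity", 0.4)])
suppressed = select_emotion([(0, "fatigue", 0.1), (1, "curiosity", 0.9)])
```

In the first call the lower-order emotion is served first; in the second the higher-order deficit is large enough that the lower-order behavior is held back.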
【００２９】
本発明の第２の側面に係るロボットの行動制御システム又は行動制御方法によれば、情動についてその存在意義による複数階層化を行ない、それぞれの階層で動作を決定する。決定された複数の動作から、そのときの外部刺激や内部状態によってどの動作を行なうかを決定する。それぞれの階層で行動は選択されるが、その実施される順番はロボットの内部状態の優先順位に基づくので、より低次の行動から優先的に動作を発現していくことにより、反射などの本能的行動や、記憶を用いた動作選択などの高次の行動を１つの個体上で矛盾なく発現することができる。また、行動をカテゴライズして、スキーマとして作成する際も明確な指標となる。
【００３０】
本発明の第２の側面に係るロボットの行動制御システム又は行動制御方法は、ロボットの外部環境の変化を認識する外部環境認識部をさらに備えていてもよい。このような場合、前記行動選択部又はステップは、内部状態の指標に加え、外部環境の指標を基に行動を選択することができる。
【００３１】
The internal state management unit or step may also change the internal state indices with the passage of time, for example by using a biorhythm.
【００３２】
また、前記内部状態管理部又はステップは、行動選択部において選択された行動の実行に応じて、すなわち動作の程度に応じて内部状態の指標を変更するようにしてもよい。
【００３３】
また、前記内部状態管理部又はステップは、外部環境の変化に応じて内部状態の指標を変更するようにしてもよい。
【００３４】
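The robot behavior control system or method according to the second aspect of the present invention may further include an associative memory unit or step that associatively stores changes in internal state arising from the external environment. In that case, the internal state management unit or step may change the internal state indices on the basis of the change in internal state that the associative memory unit or step recalls from the external environment. The associative memory unit or step may also associatively store the change in internal state for each object recognized in the external environment.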
【００３５】
従来のロボットにおける動作の選択や発現は、基本的には、対象物までの物理的距離や、そのときのロボットの内部状態によって決定されており、言い換えれば、対象物の相違によりどのような行動をとるか、といった行動選択は行なわれていない。
【００３６】
これに対し、本発明の第２の側面に係るロボットの行動制御システム又は行動制御方法によれば、連想記憶を用いることにより、対象物毎に異なる内部状態の変化を想起することができるので、同じ状況でもその行動の発現し易さを異ならせることができる。すなわち、外部の刺激や物理的状況、現在の内部状態に加え、ロボットの対象物ごとの記憶を考慮して行動を選択することができ、より多彩で多様化した対応を実現することができる。
【００３７】
例えば、「○○が見えているから××する」とか、「現在○○が不足だから（何に対しても）××する」などの外部環境又は内部状態によって決まった行動をするのではなく、「○○が見えても△△なので□□する」とか、「○○が見えているけど××なので■■する」など、対象物に関する内部状態の変化記憶を用いることにより、行動にバリエーションを付けることができる。
【００３８】
本発明のさらに他の目的、特徴や利点は、後述する本発明の実施形態や添付する図面に基づくより詳細な説明によって明らかになるであろう。
【００３９】
【発明の実施の形態】
以下、図面を参照しながら本発明の実施形態について詳解する。
【００４０】
Ａ．ロボット装置の構成
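Fig. 1 schematically shows the functional configuration of a robot apparatus 1 used to carry out the present invention. As shown in the figure, the robot apparatus 1 comprises a control unit 20 that performs overall control of the whole operation and other data processing, an input/output unit 40, a drive unit 50, and a power supply unit 60. Each part is described below.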
【００４１】
入出力部４０は、入力部としてロボット装置１の目に相当するＣＣＤカメラ１５や、耳に相当するマイクロフォン１６、頭部や背中などの部位に配設されてユーザの接触を感知するタッチ・センサ１８、あるいは五感に相当するその他の各種のセンサを含む。また、出力部として、口に相当するスピーカ１７、あるいは点滅の組み合わせや点灯のタイミングにより顔の表情を形成するＬＥＤインジケータ（目ランプ）１９などを装備している。これら出力部は、音声やランプの点滅など、脚などによる機械運動パターン以外の形式でもロボット装置１からのユーザ・フィードバックを表現することができる。
【００４２】
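The drive unit 50 is a functional block that realizes the body motion of the robot apparatus 1 according to a predetermined motion pattern commanded by the control unit 20, and is the object controlled by behavior control. The drive unit 50 is a functional module for realizing the degrees of freedom at each joint of the robot apparatus 1, and consists of a plurality of drive units provided for each axis, such as roll, pitch, and yaw, at each joint. Each drive unit is a combination of a motor 51 that rotates about a predetermined axis, an encoder 52 that detects the rotational position of the motor 51, and a driver 53 that adaptively controls the rotational position and rotational speed of the motor 51 on the basis of the output of the encoder 52.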
【００４３】
駆動ユニットの組み合わせ方によって、ロボット装置１を例えば２足歩行又は４足歩行などの脚式移動ロボットとして構成することができる。
【００４４】
電源部６０は、その字義通り、ロボット装置１内の各電気回路などに対して給電を行なう機能モジュールである。本実施形態に係るロボット装置１は、バッテリを用いた自律駆動式であり、電源部６０は、充電バッテリ６１と、充電バッテリ６１の充放電状態を管理する充放電制御部６２とで構成される。
【００４５】
充電バッテリ６１は、例えば、複数本のリチウムイオン２次電池セルをカートリッジ式にパッケージ化した「バッテリ・パック」の形態で構成される。
【００４６】
また、充放電制御部６２は、バッテリ６１の端子電圧や充電／放電電流量、バッテリ６１の周囲温度などを測定することでバッテリ６１の残存容量を把握し、充電の開始時期や終了時期などを決定する。充放電制御部６２が決定する充電の開始及び終了時期は制御ユニット２０に通知され、ロボット装置１が充電オペレーションを開始及び終了するためのトリガとなる。
【００４７】
制御ユニット２０は、「頭脳」に相当し、例えばロボット装置１の機体頭部あるいは胴体部に搭載されている。
【００４８】
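Fig. 2 illustrates the configuration of the control unit 20 in more detail. As shown in the figure, the control unit 20 has a CPU (Central Processing Unit) 21 as the main controller, connected by a bus to memory, other circuit components, and peripheral devices. The bus 28 is a common signal transmission path that includes a data bus, an address bus, a control bus, and so on. Each device on the bus 28 is assigned a unique address (a memory address or an I/O address). By specifying an address, the CPU 21 can communicate with a particular device on the bus 28.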
【００４９】
ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）２２は、ＤＲＡＭ（ＤｙｎａｍｉｃＲＡＭ）などの揮発性メモリで構成された書き込み可能メモリであり、ＣＰＵ２１が実行するプログラム・コードをロードしたり、実行プログラムによる作業データの一時的な保存のために使用される。
【００５０】
ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）２３は、プログラムやデータを恒久的に格納する読み出し専用メモリである。ＲＯＭ２３に格納されるプログラム・コードには、ロボット装置１の電源投入時に実行する自己診断テスト・プログラムや、ロボット装置１の動作を規定する動作制御プログラムなどが挙げられる。
【００５１】
ロボット装置１の制御プログラムには、カメラ１５やマイクロフォン１６などのセンサ入力を処理してシンボルとして認識する「センサ入力・認識処理プログラム」、短期記憶や長期記憶などの記憶動作（後述）を司りながらセンサ入力と所定の行動制御モデルとに基づいてロボット装置１の行動を制御する「行動制御プログラム」、行動制御モデルに従って各関節モータの駆動やスピーカ１７の音声出力などを制御する「駆動制御プログラム」などが含まれる。
【００５２】
不揮発性メモリ２４は、例えばＥＥＰＲＯＭ（ＥｌｅｃｔｒｉｃａｌｌｙＥｒａｓａｂｌｅａｎｄＰｒｏｇｒａｍｍａｂｌｅＲＯＭ）のように電気的に消去再書き込みが可能なメモリ素子で構成され、逐次更新すべきデータを不揮発的に保持するために使用される。逐次更新すべきデータには、暗号鍵やその他のセキュリティ情報、出荷後にインストールすべき装置制御プログラムなどが挙げられる。
【００５３】
インターフェース２５は、制御ユニット２０外の機器と相互接続し、データ交換を可能にするための装置である。インターフェース２５は、例えば、カメラ１５やマイクロフォン１６、スピーカ１７との間でデータ入出力を行なう。また、インターフェース２５は、駆動部５０内の各ドライバ５３−１…との間でデータやコマンドの入出力を行なう。
【００５４】
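The interface 25 may also include general-purpose interfaces for connecting computer peripherals, such as a serial interface like RS (Recommended Standard)-232C, a parallel interface like IEEE (Institute of Electrical and Electronics Engineers) 1284, a USB (Universal Serial Bus) interface, an i-Link (IEEE 1394) interface, a SCSI (Small Computer System Interface) interface, and a memory card interface (card slot) that accepts a PC card or a Memory Stick, so that programs and data can be moved to and from locally connected external devices.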
【００５５】
また、インターフェース２５の他の例として、赤外線通信（ＩｒＤＡ）インターフェースを備え、外部機器と無線通信を行なうようにしてもよい。
さらに、制御ユニット２０は、無線通信インターフェース２６やネットワーク・インターフェース・カード（ＮＩＣ）２７などを含み、Ｂｌｕｅｔｏｏｔｈのような近接無線データ通信や、ＩＥＥＥ８０２．１１ｂのような無線ネットワーク、あるいはインターネットなどの広域ネットワークを経由して、外部のさまざまなホスト・コンピュータとデータ通信を行なうことができる。
【００５６】
このようなロボット装置１とホスト・コンピュータ間におけるデータ通信により、遠隔のコンピュータ資源を用いて、ロボット装置１の複雑な動作制御を演算したり、リモート・コントロールすることができる。
【００５７】
Ｂ．ロボット装置の行動制御システム
図３には、本発明の実施形態に係るロボット装置１の行動制御システム１００の機能構成を模式的に示している。ロボット装置１は、外部刺激の認識結果や内部状態の変化に応じて行動制御を行なうことができる。さらには、長期記憶機能を備え、外部刺激から内部状態の変化を連想記憶することにより、外部刺激の認識結果や内部状態の変化に応じて行動制御を行なうことができる。
【００５８】
図示の行動制御システム１００にはオブジェクト指向プログラミングを採り入れて実装することができる。この場合、各ソフトウェアは、データとそのデータに対する処理手続きとを一体化させた「オブジェクト」というモジュール単位で扱われる。また、各オブジェクトは、メッセージ通信と共有メモリを使ったオブジェクト間通信方法によりデータの受け渡しとＩｎｖｏｋｅを行なうことができる。
【００５９】
行動制御システム１００は、外部環境（Ｅｎｖｉｒｏｎｍｅｎｔｓ）を認識するために、視覚認識機能部１０１と、聴覚認識機能部１０２と、接触認識機能部１０３を備えている。
【００６０】
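The visual recognition function unit (Video) 101 performs image recognition processing, such as face recognition and color recognition, and feature extraction on the basis of captured images input through an image input device such as a CCD (Charge Coupled Device) camera. The visual recognition function unit 101 is composed of a plurality of objects described later, such as "MultiColorTracker", "FaceDetector", and "FaceIdentify".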
【００６１】
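The auditory recognition function unit (Audio) 102 performs speech recognition on audio data input through an audio input device such as a microphone, extracting features and recognizing word sets (text). The auditory recognition function unit 102 is composed of a plurality of objects described later, such as "AudioRecog" and "AuthurDecoder".
The contact recognition function unit (Tactile) 103 recognizes sensor signals from a contact sensor built into, for example, the head of the body, and recognizes external stimuli such as "stroked" or "hit".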
【００６２】
内部状態管理部（ＩＳＭ：ＩｎｔｅｒｎａｌＳｔａｔｕｓＭａｎａｇｅｒ）１０４は、本能や感情といった数種類の情動を数式モデル化して管理しており、上述の視覚認識機能部１０１と、聴覚認識機能部１０２と、接触認識機能部１０３によって認識された外部刺激（ＥＳ：ＥｘｔｅｒｎａｌＳｔｉｍｕｌａ）に応じてロボット装置１の本能や情動といった内部状態を管理する。
【００６３】
感情モデルと本能モデルは、それぞれ認識結果と行動履歴を入力に持ち、感情値と本能値を管理している。行動モデルは、これら感情値や本能値を参照することができる。
【００６４】
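In this embodiment, emotions are organized into multiple layers according to their significance, and an action is determined in each layer. From the multiple actions so determined, the one to perform is chosen according to the external environment and internal state at that time (described later). Although a behavior is selected in each layer, actions are expressed preferentially from the lower-order behaviors, so that instinctive behaviors such as reflexes and higher-order behaviors such as action selection using memory can be expressed on a single individual without contradiction.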
【００６５】
本実施形態に係るロボット装置１は、外部刺激の認識結果や内部状態の変化に応じて行動制御を行なうために、時間の経過とともに失われる短期的な記憶を行なう短期記憶部１０５と、情報を比較的長期間保持するための長期記憶部１０６を備えている。短期記憶と長期記憶という記憶メカニズムの分類は神経心理学に依拠する。
【００６６】
短期記憶部（ＳｈｏｒｔＴｅｒｍＭｅｍｏｒｙ）１０５は、上述の視覚認識機能部１０１と、聴覚認識機能部１０２と、接触認識機能部１０３によって外部環境から認識されたターゲットやイベントを短期間保持する機能モジュールである。例えば、カメラ１５からの入力画像を約１５秒程度の短い期間だけ記憶する。
【００６７】
長期記憶部（ＬｏｎｇＴｅｒｍＭｅｍｏｒｙ）１０６は、物の名前など学習により得られた情報を長期間保持するために使用される。長期記憶部１０６は、例えば、ある行動モジュールにおいて外部刺激から内部状態の変化を連想記憶することができる。
【００６８】
また、本実施形態に係るロボット装置１の行動制御は、反射行動部１０９によって実現される「反射行動」と、状況依存行動階層１０８によって実現される「状況依存行動」と、熟考行動階層１０７によって実現される「熟考行動」に大別される。
【００６９】
反射的行動部（ＲｅｆｌｅｘｉｖｅＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒ）１０９は、上述の視覚認識機能部１０１と、聴覚認識機能部１０２と、接触認識機能部１０３によって認識された外部刺激に応じて反射的な機体動作を実現する機能モジュールである。
【００７０】
反射行動とは、基本的に、センサ入力された外部情報の認識結果を直接受けて、これを分類して、出力行動を直接決定する行動のことである。例えば、人間の顔を追いかけたり、うなずくといった振る舞いは反射行動として実装することが好ましい。
【００７１】
状況依存行動階層（ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒ）１０８は、短期記憶部１０５並びに長期記憶部１０６の記憶内容や、内部状態管理部１０４によって管理される内部状態を基に、ロボット装置１が現在置かれている状況に即応した行動を制御する。
【００７２】
状況依存行動階層１０８は、各行動毎にステートマシンを用意しており、それ以前の行動や状況に依存して、センサ入力された外部情報の認識結果を分類して、行動を機体上で発現する。また、状況依存行動階層１０８は、内部状態をある範囲に保つための行動（「ホメオスタシス行動」とも呼ぶ）も実現し、内部状態が指定した範囲内を越えた場合には、その内部状態を当該範囲内に戻すための行動が出現し易くなるようにその行動を活性化させる（実際には、内部状態と外部環境の両方を考慮した形で行動が選択される）。状況依存行動は、反射行動に比し、反応時間が遅い。
【００７３】
熟考行動階層（ＤｅｌｉｂｅｒａｔｉｖｅＬａｙｅｒ）１０７は、短期記憶部１０５並びに長期記憶部１０６の記憶内容に基づいて、ロボット装置１の比較的長期にわたる行動計画などを行なう。
【００７４】
熟考行動とは、与えられた状況あるいは人間からの命令により、推論やそれを実現するための計画を立てて行なわれる行動のことである。例えば、ロボットの位置と目標の位置から経路を探索することは熟考行動に相当する。このような推論や計画は、ロボット装置１がインタラクションを保つための反応時間よりも処理時間や計算負荷を要する（すなわち処理時間がかかる）可能性があるので、上記の反射行動や状況依存行動がリアルタイムで反応を返しながら、熟考行動は推論や計画を行なう。
【００７５】
熟考行動階層１０７や状況依存行動階層１０８、反射行動部１０９は、ロボット装置１のハードウェア構成に非依存の上位のアプリケーション・プログラムとして記述することができる。これに対し、ハードウェア依存層制御部（ＣｏｎｆｉｇｕｒａｔｉｏｎＤｅｐｅｎｄｅｎｔＡｃｔｉｏｎｓＡｎｄＲｅａｃｔｉｏｎｓ）１１０は、これら上位アプリケーション（「スキーマ」と呼ばれる行動モジュール）からの命令に応じて、関節アクチュエータの駆動などの機体のハードウェア（外部環境）を直接操作する。
【００７６】
Ｃ．ロボット装置の記憶メカニズム
上述したように、本実施形態に係るロボット装置１は、短期記憶部１０５と長期記憶部１０６を備えているが、このような記憶メカニズムは、神経心理学に依拠する。
【００７７】
短期記憶は、字義通り短期的な記憶であり、時間の経過とともに失われる。短期記憶は、例えば、視覚や聴覚、接触など、外部環境から認識されたターゲットやイベントを短期間保持するために使用することができる。
【００７８】
短期記憶は、さらに、感覚情報（すなわちセンサからの出力）をそのままの信号で１秒程度保持する「感覚記憶」と、感覚記憶をエンコードして限られた容量で短期的に記憶する「直接記憶」と、状況変化や文脈を数時間に渡って記憶する「作業記憶」に分類することができる。直接記憶は、神経心理学的な研究によれば７±２チャンクであると言われている。また、作業記憶は、短期記憶と長期記憶との対比で、「中間記憶」とも呼ばれる。
【００７９】
また、長期記憶は、物の名前など学習により得られた情報を長期間保持するために使用される。同じパターンを統計的に処理して、ロバストな記憶にすることができる。
【００８０】
長期記憶はさらに「宣言的知識記憶」と「手続的知識記憶」に分類される。宣言的知識記憶は、場面（例えば教えられたときのシーン）に関する記憶である「エピソード記憶」と、言葉の意味や常識といった記憶からなる「意味記憶」からなる。また、手続的知識記憶は、宣言的知識記憶をどのように使うかといった手順記憶であり、入力パターンに対する動作の獲得に用いることができる。
【００８１】
Ｃ−１．短期記憶部
短期記憶部１０５は、自分の周りに存在する物体、あるいはイベントを表現、記憶し、それに基づいてロボットが行動することを目的とした機能モジュールである。視覚や聴覚などのセンサ情報を基に物体やイベントの位置を自己中心座標系上に配置していくが、視野外の物体などを記憶し、それに対する行動などを生じさせることができる。
【００８２】
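For example, the short-term memory function is needed when the robot is talking with a person A, is spoken to by another person B, converses with B while retaining A's position and the content of the conversation, and then returns to the conversation with A afterward. However, integration does not rely on very complex processing. It is based on simple spatio-temporal proximity, treating sensor information that is close in time and space as signals from the same object.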
【００８３】
また、ステレオ視覚などの技術を用いてパターン認識で判別可能な物体以外の物体の位置を記憶するために、自己中心座標系上に配置する。床面検出とともに利用して、障害物の位置を確率的に記憶するなどに利用することができる。
【００８４】
本実施形態では、短期記憶部１０５は、上述した視覚認識機能部１０１、聴覚認識機能部１０２、接触認識機能部１０３などの複数の認識器の結果からなる外部刺激を時間的及び空間的に整合性を保つように統合して、外部環境下の各物体に関する知覚を短期間の記憶として状況依存行動階層（ＳＢＬ）１０８などの行動制御モジュールに提供する。
【００８５】
したがって、上位モジュールとして構成される行動制御モジュール側では、外界からの複数の認識結果を統合して意味を持ったシンボル情報として扱い、高度な行動制御を行なうことができる。また、以前に観測された認識結果との対応問題などより複雑な認識結果を利用して、どの肌色領域が顔でどの人物に対応しているかや、この声がどの人物の声なのかなどを解くことができる。
【００８６】
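Because the short-term memory unit 105 holds information about recognized observations as memory, even if observations temporarily fail to arrive during autonomous operation, higher-level modules such as the application that controls the body's behavior can always see the object as still being perceived there. For example, information outside the sensor's field of view is retained rather than forgotten immediately, so even if the robot loses sight of an object it can find it again later. This makes the system robust to recognizer errors and sensor noise and gives a stable system that does not depend on the timing of recognizer notifications. In addition, even when information is insufficient from the viewpoint of a single recognizer, other recognition results can sometimes compensate, so the recognition performance of the system as a whole improves.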
【００８７】
また、関連する認識結果が結び付けられているので、アプリケーションなどの上位モジュールで関連する情報を使って行動判断することが可能である。例えば、ロボット装置は、呼び掛けられた声を基に、その人物の名前を引き出すことができる。この結果、挨拶の応答に「こんにちは、ＸＸＸさん。」のように答えるなどのリアクションが可能である。
【００８８】
図４には、図３に示した行動制御システム１００における外部刺激に応じた状況依存行動制御のメカニズムを図解している。外部刺激は、認識系の機能モジュール１０１〜１０３によってシステムに取り込まれるとともに、短期記憶部（ＳＴＭ）１０５を介して状況依存行動階層（ＳＢＬ）１０８に与えられる。図示の通り、認識系の各機能モジュール１０１〜１０３や、短期記憶部（ＳＴＭ）１０５、状況依存行動階層（ＳＢＬ）１０８はオブジェクトとして構成されている。
【００８９】
同図において、丸で表されているのが、「オブジェクト」又は「プロセス」と呼ばれるエンティティである。オブジェクト同士が非同期に通信し合うことで、システム全体が動作する。各オブジェクトはメッセージ通信と共有メモリを使ったオブジェクト間通信方法によりデータの受け渡しとＩｎｖｏｋｅを行なっている。以下に、各オブジェクトの機能について説明する。
【００９０】
ＡｕｄｉｏＲｅｃｏｇ：
マイクなどの音声入力装置からの音声データを受け取って、特徴抽出と音声区間検出を行なうオブジェクトである。また、マイクがステレオである場合には、水平方向の音源方向推定を行なうことができる。音声区間であると判断されると、その区間の音声データの特徴量及び音源方向がＡｒｔｈｅｒＤｅｃｏｄｅｒ（後述）に送られる。
【００９１】
ＳｐｅｅｃｈＲｅｃｏｇ：
ＡｕｄｉｏＲｅｃｏｇから受け取った音声特徴量と音声辞書及び構文辞書を使って音声認識を行なうオブジェクトである。認識された単語のセットは短期記憶部（ＳｈｏｒｔＴｅｒｍＭｅｍｏｒｙ）１０５に送られる。
【００９２】
ＭｕｌｔｉＣｏｌｏｒＴｒａｃｋｅｒ：
色認識を行なうオブジェクトであり、カメラなどの画像入力装置から画像データを受け取り、あらかじめ持っている複数のカラー・モデルに基づいて色領域を抽出し、連続した領域に分割する。分割された各領域の位置や大きさ、特徴量などの情報を出力して、短期記憶部（ＳｈｏｒｔＴｅｒｍＭｅｍｏｒｙ）１０５へ送る。
【００９３】
ＦａｃｅＤｅｔｅｃｔｏｒ：
画像フレーム中から顔領域を検出するオブジェクトであり、カメラなどの画像入力装置から画像データを受け取り、それを９段階のスケール画像に縮小変換する。このすべての画像の中から顔に相当する矩形領域を探索する。重なりあった候補領域を削減して最終的に顔と判断された領域に関する位置や大きさ、特徴量などの情報を出力して、ＦａｃｅＩｄｅｎｔｉｆｙ（後述）へ送る。
【００９４】
ＦａｃｅＩｄｅｎｔｉｆｙ：
検出された顔画像を識別するオブジェクトであり、顔の領域を示す矩形領域画像をＦａｃｅＤｅｔｅｃｔｏｒから受け取り、この顔画像が手持ちの人物辞書のうちでどの人物に相当するかを比較して人物の識別を行なう。この場合、顔検出から顔画像を受け取り、顔画像領域の位置、大きさ情報とともに人物のＩＤ情報を出力する。
【００９５】
ＳｈｏｒｔＴｅｒｍＭｅｍｏｒｙ（短期記憶部）：
ロボット１の外部環境に関する情報を比較的短い時間だけ保持するオブジェクトであり、ＳｐｅｅｃｈＲｅｃｏｇから音声認識結果（単語、音源方向、確信度）を受け取り、ＭｕｌｔｉＣｏｌｏｒＴｒａｃｋｅｒから肌色の領域の位置、大きさと顔領域の位置、大きさを受け取り、ＦａｃｅＩｄｅｎｔｉｆｙから人物のＩＤ情報等を受け取る。また、ロボット１の機体上の各センサからロボットの首の方向（関節角）を受け取る。そして、これらの認識結果やセンサ出力を統合的に使って、現在どこにどの人物がいて、しゃべった言葉がどの人物のものであり、その人物とはこれまでにどんな対話を行なったのかという情報を保存する。こうした物体すなわちターゲットに関する物理情報と時間方向でみたイベント（履歴）を出力として、状況依存行動階層（ＳＢＬ）などの上位モジュールに渡す。
【００９６】
ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒＬａｙｅｒ（状況依存行動階層）：
上述のＳｈｏｒｔＴｅｒｍＭｅｍｏｒｙ（短期記憶部）からの情報を基にロボット１の行動（状況に依存した行動）を決定するオブジェクトである。複数の行動を同時に評価したり、実行したりすることができる。また、行動を切り替えて機体をスリープ状態にしておき、別の行動を起動することができる。
【００９７】
ＲｅｓｏｕｒｃｅＭａｎａｇｅｒ：
出力用のコマンドに対してロボット１の各ハードウェアのリソース調停を行なうオブジェクトである。図４に示す例では、音声出力用のスピーカをコントロールするオブジェクトと首のモーション・コントロールするオブジェクトのリソース調停を行なう。
【００９８】
ＳｏｕｎｄＰｅｒｆｏｒｍｅｒＴＴＳ：
音声出力を行なうためのオブジェクトであり、ＲｅｓｏｕｒｃｅＭａｎａｇｅｒ経由でＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒＬａｙｅｒから与えられたテキスト・コマンドに応じて音声合成を行ない、ロボット１の機体上のスピーカから音声出力を行なう。
【００９９】
ＨｅａｄＭｏｔｉｏｎＧｅｎｅｒａｔｏｒ：
ＲｅｓｏｕｒｃｅＭａｎａｇｅｒ経由でＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒＬａｙｅｒから首を動かすコマンドを受けたことに応答して、首の関節角を計算するオブジェクトである。「追跡」のコマンドを受けたときには、ＳｈｏｒｔＴｅｒｍＭｅｍｏｒｙから受け取った物体の位置情報を基に、その物体が存在する方向を向く首の関節角を計算して出力する。
【０１００】
短期記憶部１０５は、ターゲット・メモリとイベント・メモリという２種類のメモリ・オブジェクトで構成される。
【０１０１】
ターゲット・メモリは、各認識機能部１０１〜１０３からの情報を統合して、現在知覚している物体に関する情報すなわちターゲットを保持している。このため、対象物体がいなくなったり現れたりすることで、該当するターゲットを記憶領域から削除したり（ＧａｒｂａｇｅＣｏｌｌｅｃｔｏｒ）、新たに生成したりする。また、１つのターゲットを複数の認識属性で表現することができる（ＴａｒｇｅｔＡｓｓｏｃｉａｔｅ）。例えば、肌色で顔のパターンで声を発する物体（人間の顔）などである。
【０１０２】
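The position and posture information of an object (target) held in the target memory is expressed not in the sensor coordinate system used by each of the recognition function units 101 to 103, but in a world coordinate system in which a specific part of the body, such as the trunk of the robot 1, is fixed at a predetermined location. For this purpose, the short-term memory unit (STM) 105 constantly monitors the current value (sensor output) of each joint of the robot 1 and converts from the sensor coordinate system to this fixed coordinate system. This makes it possible to integrate the information from the recognition function units 101 to 103. For example, even if the robot 1 moves its neck or the like and the posture of a sensor changes, the position of the object as seen from a behavior control module such as the situated behavior layer (SBL) stays the same, which makes targets easier to handle.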
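The conversion from the sensor coordinate system to the body-fixed coordinate system can be sketched for the simplest case. The example below assumes a planar (2D) model with a single neck yaw joint. The function name `sensor_to_body` and the `sensor_origin` parameter are hypothetical, and a real robot would chain full 3D transforms over every joint between sensor and trunk.

```python
import math


def sensor_to_body(point_xy, neck_yaw, sensor_origin=(0.0, 0.0)):
    """Rotate a point from the sensor frame into the body-fixed frame.

    neck_yaw is the current joint angle in radians (counterclockwise),
    as read from the joint sensor; sensor_origin is the sensor position
    in the body frame (assumed constant here).
    """
    x, y = point_xy
    c, s = math.cos(neck_yaw), math.sin(neck_yaw)
    ox, oy = sensor_origin
    return (c * x - s * y + ox, s * x + c * y + oy)


# The same object, observed before and after the neck turns 90 degrees left.
seen_before = (1.0, 0.0)    # straight ahead of the camera
seen_after = (0.0, -1.0)    # now to the camera's right
body_before = sensor_to_body(seen_before, 0.0)
body_after = sensor_to_body(seen_after, math.pi / 2)
```

Both observations map to approximately the same body-frame position, which is what lets the behavior modules treat the target as stationary while the head moves.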
【０１０３】
また、イベント・メモリは、外部環境下で発生した過去から現在までのイベントを時系列的に格納するオブジェクトである。イベント・メモリにおいて扱われるイベントとして、ターゲットの出現と消失、音声認識単語、自己の行動や姿勢の変化などの外界の状況の変化に関する情報を挙げることができる。
【０１０４】
イベントの中には、あるターゲットに関する状態変化が含まれている。このため、イベント情報として該当するターゲットのＩＤを含めることで、発生したイベントに関するより詳しい情報を、上述のターゲット・メモリから検索することも可能である。
【０１０５】
図５及び図６には、各認識機能部１０１〜１０３における認識結果に基づいて、短期記憶部１０５内のターゲット・メモリ及びイベント・メモリに入る情報の流れをそれぞれ示している。
【０１０６】
図５に示すように、短期記憶部１０５（ＳＴＭオブジェクト）内には、外部環境からターゲットを検出するターゲット検出器が設けられている。このターゲット検出器は、声認識結果や顔認識結果、色認識結果などの各認識機能部１０１〜１０３による認識結果を基に、新規ターゲットを追加したり、既存のターゲットを認識結果に反映するように更新したりする。検出されたターゲットは、ターゲット・メモリ内に保持される。
【０１０７】
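The target memory also has functions such as a garbage collector (GarbageCollector), which finds and erases targets that are no longer observed, and target association (TargetAssociate), which determines the relationship among multiple targets and links them to the same target. The garbage collector works by decrementing each target's confidence as time passes and deleting any target whose confidence falls below a predetermined value. Target association identifies targets as the same one when targets whose features of the same attribute (recognition type) are similar are also close in space and time.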
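The confidence-decay behavior of the garbage collector can be sketched as follows. The class name `TargetMemory`, the linear decay per tick, and the specific `decay` and `threshold` values are assumptions for illustration, since the specification only states that confidence is decremented over time and compared with a predetermined value.

```python
class TargetMemory:
    """Holds targets with a confidence that decays until re-observed."""

    def __init__(self, decay=0.25, threshold=0.3):
        self.decay = decay            # assumed decrement per time step
        self.threshold = threshold    # assumed deletion threshold
        self.confidence = {}

    def observe(self, target_id):
        # A fresh recognition result restores full confidence.
        self.confidence[target_id] = 1.0

    def tick(self):
        """Advance one time step; return the ids that were deleted."""
        removed = []
        for target_id in list(self.confidence):
            self.confidence[target_id] -= self.decay
            if self.confidence[target_id] < self.threshold:
                del self.confidence[target_id]
                removed.append(target_id)
        return removed


memory = TargetMemory()
memory.observe("face-1")
memory.observe("ball-1")
first = memory.tick()      # both targets drop to 0.75
second = memory.tick()     # both targets drop to 0.5
memory.observe("face-1")   # the face is seen again
third = memory.tick()      # the ball drops to 0.25 and is deleted
```

A target that keeps being observed survives indefinitely, while one that leaves the field of view is retained for a few steps before it is dropped.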
【０１０８】
前述した状況依存型行動階層（ＳＢＬ）は、短期記憶部１０５のクライアント（ＳＴＭクライアント）となるオブジェクトであり、ターゲット・メモリからは定期的に各ターゲットに関する情報の通知（Ｎｏｔｉｆｙ）を受け取る。本実施形態では、ＳＴＭプロキシ・クラスが、短期記憶部１０５（ＳＴＭオブジェクト）とは独立したクライアント・ローカルな作業領域にターゲットをコピーして、常に最新の情報を保持しておく。そして、ローカルなターゲット・リスト（ＴａｒｇｅｔｏｆＩｎｔｅｒｅｓｔ）の中から所望のターゲットを外部刺激として読み出して、スキーマ（ｓｃｈｅｍａ）すなわち行動モジュールを決定する（後述）。
【０１０９】
また、図６に示すように、短期記憶部１０５（ＳＴＭオブジェクト）内には、外部環境において発生するイベントを検出するイベント検出器が設けられている。このイベント検出器は、ターゲット検出器によるターゲットの生成や、ガーベッジ・コレクタによるターゲットの削除をイベントとして検出する。また、認識機能部１０１〜１０３による認識結果が音声認識である場合には、その発話内容がイベントになる。発生したイベントは、発生した時間順にイベント・メモリ内でイベント・リストとして格納される。
【０１１０】
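The situated behavior layer (SBL) is an object that acts as a client of the short-term memory unit 105 (an STM client), and receives notifications (Notify) of events from the event memory from moment to moment. In this embodiment, an STM proxy class copies the event list into a client-local work area that is independent of the short-term memory unit 105 (the STM object). A desired event is then read from the local event list as an external stimulus, and a schema, that is, a behavior module, is determined (described later). An executed behavior module is detected by the event detector as a new event. Old events are discarded from the event list one after another, for example in FIFO (First In First Out) order.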
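A time-ordered event list with FIFO discarding can be sketched with a bounded queue. The class name `EventMemory`, the event field names, and the fixed capacity are assumptions for illustration. The specification does not say how the list is bounded.

```python
from collections import deque


class EventMemory:
    """Time-ordered event list that discards the oldest entries first."""

    def __init__(self, capacity=100):
        # A deque with maxlen drops items from the left when full (FIFO).
        self._events = deque(maxlen=capacity)

    def add(self, time, kind, target_id=None):
        # Carrying the target id lets a client look up details in target memory.
        self._events.append({"time": time, "kind": kind, "target_id": target_id})

    def snapshot(self):
        """Copy for a client-local work area, as an STM proxy might do."""
        return list(self._events)


memory = EventMemory(capacity=3)
memory.add(0.0, "target_appeared", target_id=1)
memory.add(1.5, "speech", target_id=1)
memory.add(2.0, "target_appeared", target_id=2)
memory.add(9.0, "target_lost", target_id=1)   # evicts the oldest event
kinds = [event["kind"] for event in memory.snapshot()]
```

Returning a copy from `snapshot` mirrors the idea that the client works on its own local list rather than on the STM object's storage.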
【０１１１】
本実施形態に係る短期記憶メカニズムによれば、ロボット１は、外部刺激に関する複数の認識器の結果を時間的及び空間的に整合性を保つように統合して、意味を持ったシンボル情報として扱うようになっている。これによって、以前に観測された認識結果との対応問題などより複雑な認識結果を利用して、どの肌色領域が顔でどの人物に対応しているかや、この声がどの人物の声なのかなどを解くことを可能にしている。
【０１１２】
以下では、図７〜図９を参照しながら、ロボット１によるユーザＡ及びＢとの対話処理について説明する。
【０１１３】
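First, as shown in Fig. 7, when user A calls "Masahiro-kun!" (Masahiro being the robot's name), the recognition function units 101 to 103 perform sound direction detection, speech recognition, and face identification, and the robot carries out situated behavior: it turns toward the direction from which it was called, tracks user A's face, and starts a dialogue with user A.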
【０１１４】
次いで、図８に示すように、今度はユーザＢが「まさひろ（ロボットの名前）くん！」と呼ぶと、各認識機能部１０１〜１０３により音方向検出、音声認識、及び顔識別が行なわれ、ユーザＡとの対話を中断した後（但し、会話のコンテキストを保存する）、呼ばれた方向を向いて、ユーザＢの顔をトラッキングしたり、ユーザＢとの対話を開始するという状況依存の行動が行なわれる。これは、状況依存行動階層１０８が持つＰｒｅｅｍｐｔｉｏｎ機能（後述）である。
【０１１５】
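Next, as shown in Fig. 9, when user A shouts "Hey!" to urge the robot to continue their conversation, the robot this time suspends the dialogue with user B (again saving the conversation context), turns toward the direction from which it was called, tracks user A's face, and resumes the dialogue with user A on the basis of the saved context. Here, thanks to the Reentrant function of the situated behavior layer 108 (described later), the content of the dialogue with user A has not been destroyed by the dialogue with user B, so the dialogue can be resumed exactly from the point at which it was interrupted.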
【０１１６】
Ｃ−２．長期記憶部
長期記憶は、物の名前など学習により得られた情報を長期間保持するために使用される。同じパターンを統計的に処理して、ロバストな記憶にすることができる。
【０１１７】
長期記憶はさらに「宣言的知識記憶」と「手続的知識記憶」に分類される。宣言的知識記憶は、場面（例えば教えられたときのシーン）に関する記憶である「エピソード記憶」と、言葉の意味や常識といった記憶からなる「意味記憶」からなる。また、手続的知識記憶は、宣言的知識記憶をどのように使うかといった手順記憶であり、入力パターンに対する動作の獲得に用いることができる。
【０１１８】
エピソード記憶は、長期記憶の中でも、宣言的知識記憶（言明記憶とも言う）の一種である。例えば、自転車に乗ることを考えると、初めて自転車に乗った場面（時間・場所など）を覚えていることがエピソード記憶に相当する。その後、時間の経過によりそのエピソードに関する記憶が薄れる一方、その意味を記憶するのが意味記憶である。また、自転車の乗り方の手順を記憶するようになるが、これが手続的知識記憶に相当する。一般的に、手続的知識の記憶には時間を要する。宣言的知識記憶によって「言う」ことができるのに対して、手続的知識記憶は潜在的であり、動作の実行という形で表れる。
【０１１９】
本実施形態に係る長期記憶部１０６は、視覚情報、聴覚情報などの物体に関するセンサ情報、及びその物体に対して行なった行動に対する結果としての内部状態が変化した結果などを記憶する連想記憶と、その１つの物体に関するフレーム記憶と、周囲の情景から構築されるマップ情報、あるいはデータとして与えられる地図情報、原因となる状況とそれに対する行動とその結果といったルールで構成される。
【０１２０】
Ｃ−２−１．連想記憶
連想記憶とは、あらかじめ複数のシンボルからなる入力パターンを記憶パターンとして記憶しておき、その中のある１つのパターンに類似したパターンが想起される仕組みのことを言う。本実施形態に係る連想記憶は、競合型ニューラル・ネットワークを用いたモデルにより実現される。このような連想記憶メカニズムによれば、一部欠陥のあるパターンが入力されたとき、記憶されている複数のパターンの中で最も近い記憶パターンを出力することができる。これは、不完全なデータからなる外部刺激しか与えられなかったときであっても、該当するニューロンの発火によりあるオブジェクトの意味などを想起することができるからである。
【０１２１】
連想記憶は、「自己想起型連想記憶」と「相互想起型連想記憶」に大別される。自己想起型とは記憶したパターンを直接キー・パターンで引き出すモデルであり、また、相互想起型とは入力パターンと出力パターンがある種の連合関係で結ばれているモデルである。本実施形態では、自己想起型連想記憶を採用するが、これは、従来のホップフィールドやアソシアトロン（前述）などの記憶モデルに比し、追加学習が容易である、入力パターンの統計的な記憶が可能である、などのメリットがある。
【０１２２】
追加学習によれば、新しいパターンを新たに記憶しても、過去の記憶が上書きされて消されることはない。また、統計的な学習によれば、同じものを多く見ればそれだけ記憶に残るし、また同じことを繰り返し実行すれば、忘れにくくなる。この場合、記憶過程において、毎回完全なパターンが入力されなくとも、繰り返し実行により、多く提示されたパターンに収束していく。
【０１２３】
Ｃ−２−２．連想記憶による意味記憶
ロボット装置１が覚えるパターンは、例えばロボット装置１への外部刺激と内部状態の組み合わせで構成される。
【０１２４】
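Here, an external stimulus is perceptual information obtained when the robot apparatus 1 recognizes sensor input, for example color information, shape information, and face information obtained by processing an image input from the camera 15. More specifically, it consists of components such as color, shape, face, general 3D objects, hand gestures, motion, voice, contact, smell, and taste.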
【０１２５】
また、内的状態とは、例えば、ロボットの身体に基づいた本能や感情などの情動を指す。本能的要素は、例えば、疲れ（ｆａｔｉｇｕｅ）、熱あるいは体内温度（ｔｅｍｐｅｒａｔｕｒｅ）、痛み（ｐａｉｎ）、食欲あるいは飢え（ｈｕｎｇｅｒ）、乾き（ｔｈｉｒｓｔ）、愛情（ａｆｆｅｃｔｉｏｎ）、好奇心（ｃｕｒｉｏｓｉｔｙ）、排泄（ｅｌｉｍｉｎａｔｉｏｎ）又は性欲（ｓｅｘｕａｌ）のうちの少なくとも１つである。また、情動的要素は、幸せ（ｈａｐｐｉｎｅｓｓ）、悲しみ（ｓａｄｎｅｓｓ）、怒り（ａｎｇｅｒ）、驚き（ｓｕｒｐｒｉｓｅ）、嫌悪（ｄｉｓｇｕｓｔ）、恐れ（ｆｅａｒ）、苛立ち（ｆｒｕｓｔｒａｔｉｏｎ）、退屈（ｂｏｒｅｄｏｍ）、睡眠（ｓｏｍｎｏｌｅｎｃｅ）、社交性（ｇｒｅｇａｒｉｏｕｓｎｅｓｓ）、根気（ｐａｔｉｅｎｃｅ）、緊張（ｔｅｎｓｅ）、リラックス（ｒｅｌａｘｅｄ）、警戒（ａｌｅｒｔｎｅｓｓ）、罪（ｇｕｉｌｔ）、悪意（ｓｐｉｔｅ）、誠実さ（ｌｏｙａｌｔｙ）、服従性（ｓｕｂｍｉｓｓｉｏｎ）又は嫉妬（ｊｅａｌｏｕｓｙ）のうちの少なくとも１つである。
【０１２６】
本実施形態に係る競合型ニューラル・ネットワークを適用した連想記憶メカニズムでは、これら外部刺激や内部状態を構成する各要素に対して入力チャンネルを割り当てている。また、視覚認識機能部１０１や聴覚認識機能部１０２などの各知覚機能モジュールは、センサ出力となる生の信号を送るのではなく、センサ出力を認識した結果をシンボル化して、シンボルに相当するＩＤ情報（例えば、色プロトタイプＩＤ、形プロトタイプＩＤ、音声プロトタイプＩＤなど）を該当するチャンネルに送るようになっている。
【０１２７】
例えば、カラー・セグメンテーション・モジュールによりセグメンテーションされた各オブジェクトは、色プロトタイプＩＤを付加されて連想記憶システムに入力される。また、顔認識モジュールにより認識された顔のＩＤが連想記憶システムに入力される。また、物体認識モジュールにより認識された物体のＩＤが連想システムに入力される。また、音声認識モジュールからは、ユーザの発話により単語のプロトタイプＩＤが入力される。このとき、発話の音素記号列（ＰｈｏｎｅｍｅＳｅｑｕｅｎｃｅ）も入力されるので、記憶・連想の処理で、ロボット装置１に発話させることが可能となる。また、本能に関しては、アナログ値を扱えるようになっており（後述）、例えば、本能のデルタ値を８０で記憶しておけば、連想により８０というアナログ値を得ることが可能である。
【０１２８】
したがって、本実施形態に係る連想記憶システムは、色、形、音声…などの外部刺激や内部状態を、各チャンネル毎のシンボル化されたＩＤの組み合わせからなる入力パターンとして記憶することができる。すなわち、連想記憶システムが記憶するのは、
【０１２９】
［色ＩＤ形ＩＤ顔ＩＤ音声ＩＤ…本能ＩＤ（値）情動ＩＤ］
【０１３０】
の組み合わせである。
【０１３１】
連想記憶には、記憶過程と想起過程がある。図１０には、連想記憶の記憶過程の概念を示している。
【０１３２】
連想記憶システムに入力される記憶パターンは、外部刺激や内部状態の各要素毎に割り当てられている複数のチャンネルで構成される（図示の例では入力１〜入力８の８チャンネルからなる）。そして、各チャンネルには、対応する外部刺激の認識結果や内部状態をシンボル化したＩＤ情報が送られてくる。図示の例では、各チャンネルの濃淡がＩＤ情報を表しているものとする。例えば、記憶パターン中のｋ番目のカラムが顔のチャンネルに割り当てられている場合、その色により顔のプロトタイプＩＤを表している。
【０１３３】
図１０に示す例では、連想記憶システムは既に１〜ｎの合計ｎ個の記憶パターンを記憶しているものとする。ここで、２つの記憶パターン間での対応するチャンネルの色の相違は、同じチャンネル上で記憶している外部刺激又は内部状態のシンボルすなわちＩＤが当該記憶パターン間で異なることを意味する。
【０１３４】
また、図１１には、連想記憶の想起過程の概念を示している。上述したように、記憶過程で蓄えた入力パターンに似たパターンが入力されると、欠落していた情報を補うように完全な記憶パターンが出力される。
【０１３５】
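In the example shown in Fig. 11, a pattern in which IDs are given for only the top three of the eight channels is input as the key pattern. In such a case, the associative memory system finds, among the memory patterns already stored, the one whose top three channels are closest (memory pattern 1 in the illustrated example) and outputs it as the recalled pattern. In other words, the closest memory pattern is output so as to fill in the missing information of channels 4 to 8.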
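The recall step can be illustrated at the level of symbolic patterns, before any neural network is involved. The sketch below assumes that "closest" means the largest number of matching channels among those present in the key. The function name `recall` and the integer IDs are hypothetical.

```python
def recall(stored_patterns, key):
    """Return the stored pattern that best matches a partial key.

    key uses None for channels with no ID; closeness is counted as the
    number of given channels whose IDs match (an assumed measure).
    """
    def score(pattern):
        return sum(1 for k, p in zip(key, pattern) if k is not None and k == p)

    return max(stored_patterns, key=score)


# Eight channels per pattern, each holding a symbolized prototype ID.
stored = [
    (1, 2, 3, 4, 5, 6, 7, 8),
    (1, 9, 9, 4, 0, 0, 0, 0),
    (5, 5, 5, 5, 5, 5, 5, 5),
]
key = (1, 2, 3, None, None, None, None, None)   # only the top three channels
recalled = recall(stored, key)
```

The returned tuple supplies the IDs for channels 4 to 8 that the key lacked.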
【０１３６】
したがって、連想記憶システムによれば、顔のＩＤのみから音声ＩＤ、つまり名前を連想したり、食べ物の名前だけから、“おいしい”や“おいしくない”などを想起することができる。競合型ニューラル・ネットワークによる長期記憶アーキテクチャによれば、言葉の意味や常識などに関する意味記憶を、他の長期記憶と同じ工学モデルで実現することができる。
【０１３７】
Ｃ−３．競合型ニューラル・ネットワークによる連想学習
図１２には、競合型ニューラル・ネットワークを適用した連想記憶システムの構成例を模式的に示している。同図に示すように、この競合型ニューラル・ネットワークは、入力層（ｉｎｐｕｔｌａｙｅｒ）と競合層（ｃｏｍｐｅｔｉｔｉｖｅｌａｙｅｒ）の２層からなる階層型ニューラル・ネットワークである。
【０１３８】
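This competitive neural network has two operating modes, a memory mode and a recall mode: in memory mode it stores input patterns competitively, and in recall mode it recalls a complete memory pattern from a partially missing input pattern.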
【０１３９】
入力層は、複数の入力ニューロンで構成される。各入力ニューロンには、外部刺激や内部状態を表す各要素に対して割り当てられたチャンネルから、外部刺激や内部状態の認識結果に相当するシンボルすなわちＩＤ情報が入力される。入力層では、色ＩＤの個数＋形ＩＤの個数＋音声ＩＤの個数＋本能の種類…に相当する個数のニューロンを用意する必要がある。
【０１４０】
また、競合層は、複数の競合ニューロンで構成される。各競合ニューロンは、入力層側の各入力ニューロンとは、ある結合重みを持って結合されている。競合ニューロンは、それぞれのニューロンが記憶すべき１つのシンボルに相当する。言い換えれば、競合ニューロンの数は記憶可能なシンボルの個数に相当する。
【０１４１】
ある入力パターンが入力層に与えられたとする。このとき、入力パターンは外部刺激や内部状態の各要素を表すチャンネルで構成されており、チャンネルから該当するＩＤが送られてきた入力ニューロンは発火する。
【０１４２】
競合ニューロンは、各入力ニューロンからの出力をシナプスによる重み付けをして入力して、それら入力値の総和を計算する。そして、競合層で入力値の総和が最大となる競合ニューロンを選択して、勝ち抜いた競合ニューロンと入力ニューロンとの結合力を強めていくことで、学習を行なう。また、欠損のある入力パターンに対して、競合層で勝ち抜いた競合ニューロンを選択することにより、入力パターンに対応するシンボルを想起することができる。
【０１４３】
記憶モード：
入力層と競合層の結合重みは、０から１の間の値をとるものとする。但し、初期結合重みはランダムに決定する。
【０１４４】
競合型ニューラル・ネットワークにおける記憶は、まず、記憶したい入力パターンに対して競合層で勝ち抜いた競合ニューロンを選択して、その競合ニューロンと各入力ニューロンとの結合力を強めることで行なう。
【０１４５】
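Here, each element of the input pattern vector [x_1, x_2, ..., x_n] corresponds to one input neuron. For example, the neuron x_1 corresponds to color prototype ID 1 and fires when ID 1 is recognized, and the neurons for shape, voice, and so on fire in the same way. A neuron that fires takes the value 1, and a neuron that does not fire takes the value -1.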
【０１４６】
また、ｉ番目の入力ニューロンとｊ番目の競合ニューロンとの結合力をｗ_ｉｊとおくと、入力ｘ_ｉに対する競合ニューロンｙ_ｊの値は、下式のように表される。
【０１４７】
【数１】
$$y_j = \sum_{i} w_{ij}\, x_i$$
【０１４８】
したがって、競合に勝ち抜くニューロンは、下式により求めることができる。
【０１４９】
【数２】
$$\mathrm{winner} = \arg\max_{j} \sum_{i} w_{ij}\, x_i$$
【０１５０】
記憶は、競合層で勝ち抜いた競合ニューロン（ｗｉｎｎｅｒｎｅｕｒｏｎ）と各入力ニューロンとの結合力を強めることで行なう。勝ち抜いたニューロン（ｗｉｎｎｅｒｎｅｕｒｏｎ）と入力ニューロンとの結合の更新は、Ｋｏｈｏｎｅｎの更新規則により、以下のように行なわれる。
【０１５１】
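【数３】
$$\Delta w_{ij} = \alpha \left( x_i - w_{ij}^{\mathrm{old}} \right), \qquad w_{ij}^{\mathrm{new}} = w_{ij}^{\mathrm{old}} + \Delta w_{ij} \qquad (j = \mathrm{winner})$$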
【０１５２】
ここで、Ｌ２Ｎｏｒｍで正規化する。
【０１５３】
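【数４】
$$w_{ij}^{\mathrm{new}} \leftarrow \frac{w_{ij}^{\mathrm{new}}}{\sqrt{\sum_{i} \left( w_{ij}^{\mathrm{new}} \right)^{2}}}$$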
【０１５４】
この結合力がいわゆる記憶の強さを表し、記憶力になる。ここで、学習率αは、提示する回数と記憶の関係を表すパラメータである。学習率αが大きいほど、１回の記憶で重みを大きく変更する。例えば、α＝０．５を用いると、一度記憶させれば、忘却することはなく、次回同じようなパターンを提示すれば、ほぼ間違いなく記憶したパターンを連想することができる。
【０１５５】
また、提示して記憶させればさせるほど、ネットワークの結合値（重み）が大きくなっていく。これは、同じパターンが何度も入力されるうちに、記憶が強くなることを示し、統計的な学習が可能であり、実環境下におけるノイズの影響の少ない長期記憶を実現することができる。
【０１５６】
また、新たなパターンが入力され、記憶しようとすれば、新たな競合層のニューロンが発火するため、その新しいニューロンとの結合が強まり、以前の記憶によるニューロンとの結合が弱まる訳ではない。言い換えれば、競合型ニューラル・ネットワークによる連想記憶では、追加学習が可能なのであり、「忘却」の問題から解放される。
【０１５７】
想起モード：
いま、以下に示すような入力パターン・ベクトルが図１２に示す連想記憶システムに提示されたとする。入力パターンは、完全なものではなく一部が欠損していてもよい。
【０１５８】
【数５】
$$X = [x_1, x_2, \ldots, x_n]$$
【０１５９】
このとき、入力ベクトルは、プロトタイプＩＤであっても、あるいはそのプロトタイプＩＤに対する尤度、確率であってもよい。出力ニューロンｙ_ｊの値は、入力ｘ_ｉについて下式のように計算される。
【０１６０】
【数６】
$$y_j = \sum_{i} w_{ij}\, x_i$$
【０１６１】
上式は、各チャンネルの尤度に応じた競合ニューロンの発火値の尤度を表しているとも言える。ここで重要なことは、複数のチャンネルからの尤度入力に対して、それらをコネクションして全体的な尤度を求めることが可能である、という点である。本実施形態では、連想するものは唯一すなわち尤度が最大のものだけを選択することとし、競合に勝ち抜くニューロンを下式により求めることができる。
【０１６２】
【数７】
$$\mathrm{winner} = \arg\max_{j}\, y_j$$
【０１６３】
求めた競合ニューロンＹの番号が記憶したシンボルの番号に対応するので、下式のように、Ｗの逆行列演算により入力パターンＸを想起することができる。
【０１６４】
【数８】
$$Y = W \cdot X, \qquad X = W^{-1} \cdot Y$$
【０１６５】
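Furthermore, by assigning symbols such as episodes and action IDs to the input-layer neurons of the competitive neural network shown in Fig. 12, declarative knowledge memory and procedural knowledge memory can also be realized with the associative memory architecture.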
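The memory and recall modes described in this section can be sketched as a small two-layer competitive network. Several choices below are assumptions made for illustration rather than details taken from the specification: the weights start uniform instead of random so that the run is reproducible, missing channels in a key are encoded as 0, and the recalled pattern is read from the sign of the winner's weights in place of the inverse-matrix operation. The class name `CompetitiveMemory` is hypothetical.

```python
import math


class CompetitiveMemory:
    def __init__(self, n_inputs, n_neurons, alpha=0.5):
        # Assumption: uniform unit-norm start (the text uses random weights).
        w0 = 1.0 / math.sqrt(n_inputs)
        self.w = [[w0] * n_inputs for _ in range(n_neurons)]
        self.alpha = alpha  # learning rate

    def _winner(self, x):
        # Weighted input sum per competitive neuron; the largest sum wins.
        sums = [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]
        return max(range(len(sums)), key=sums.__getitem__)

    def memorize(self, x):
        j = self._winner(x)
        # Kohonen-style update of the winner only, then L2 normalization.
        row = [w + self.alpha * (xi - w) for w, xi in zip(self.w[j], x)]
        norm = math.sqrt(sum(w * w for w in row))
        self.w[j] = [w / norm for w in row]
        return j

    def recall(self, key):
        j = self._winner(key)
        # Assumption: threshold the winner's weights to rebuild the pattern.
        return [1 if w > 0 else -1 for w in self.w[j]]


# Three channels (color, shape, name) with two IDs each: 1 = fired, -1 = not.
pattern_1 = [1, -1, 1, -1, 1, -1]
pattern_2 = [-1, 1, -1, 1, -1, 1]
net = CompetitiveMemory(n_inputs=6, n_neurons=3)
neuron_1 = net.memorize(pattern_1)
neuron_2 = net.memorize(pattern_2)
recalled = net.recall([1, -1, 0, 0, 0, 0])   # only the color channel is known
```

The second pattern is captured by a different competitive neuron, so storing it leaves the weights for the first pattern untouched, which corresponds to the additional learning property described above.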
【０１６６】
Ｄ．状況依存行動制御
状況依存行動階層（ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒ）１０８は、短期記憶部１０５並びに長期記憶部１０６の記憶内容や、内部状態管理部１０４によって管理される内部状態を基に、ロボット装置１が現在置かれている状況に即応した行動を制御する。また、状況依存行動階層１０８の一部として、認識された外部刺激に応じて反射的・直接的な機体動作を実行する反射行動部１０９を含んでいる。
【０１６７】
Ｄ−１．状況依存行動階層の構成
本実施形態では、状況依存行動階層１０８は、各行動モジュール毎にステートマシンを用意しており、それ以前の行動や状況に依存して、センサ入力された外部情報の認識結果を分類して、行動を機体上で発現する。行動モジュールは、外部刺激や内部状態の変化に応じた状況判断を行なうｍｏｎｉｔｏｒ機能と、行動実行に伴う状態遷移（ステートマシン）を実現するａｃｔｉｏｎ機能とを備えたスキーマ（ｓｃｈｅｍａ）として記述される。状況依存行動階層１０８は、複数のスキーマが階層的に連結された木構造として構成されている（後述）。
【０１６８】
また、状況依存行動階層１０８は、内部状態をある範囲に保つための行動（「ホメオスタシス行動」とも呼ぶ）も実現し、内部状態が指定した範囲内を越えた場合には、その内部状態を当該範囲内に戻すための行動が出易くなるようにその行動を活性化させる（実際には、内部状態と外部環境の両方を考慮した形で行動が選択される）。
【０１６９】
図３に示したようなロボット１の行動制御システム１００における各機能モジュールは、オブジェクトとして構成される。各オブジェクトは、メッセージ通信と共有メモリを使ったオブジェクト間通信方法によりデータの受け渡しとＩｎｖｏｋｅを行なうことができる。図１３には、本実施形態に係る行動制御システム１００のオブジェクト構成を模式的に示している。
【０１７０】
The visual recognition function unit 101 is composed of three objects: "FaceDetector", "MultiColorTracker", and "FaceIdentify".
【０１７１】
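FaceDetector is an object that detects face regions in an image frame and outputs the detection results to FaceIdentify. MultiColorTracker is an object that performs color recognition and outputs the recognition results to FaceIdentify and to ShortTermMemory (the object constituting the short-term memory unit 105). FaceIdentify identifies a person, for example by looking up the detected face image in its person dictionary, and outputs the person's ID information to ShortTermMemory together with the position and size of the face image region.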
【０１７２】
聴覚認識機能部１０２は、”ＡｕｄｉｏＲｅｃｏｇ”と”ＳｐｅｅｃｈＲｅｃｏｇ”という２つのオブジェクトで構成される。ＡｕｄｉｏＲｅｃｏｇは、マイクなどの音声入力装置からの音声データを受け取って、特徴抽出と音声区間検出を行なうオブジェクトであり、音声区間の音声データの特徴量及び音源方向をＳｐｅｅｃｈＲｅｃｏｇやＳｈｏｒｔＴｅｒｍＭｅｍｏｒｙに出力する。ＳｐｅｅｃｈＲｅｃｏｇは、ＡｕｄｉｏＲｅｃｏｇから受け取った音声特徴量と音声辞書及び構文辞書を使って音声認識を行なうオブジェクトであり、認識された単語のセットをＳｈｏｒｔＴｅｒｍＭｅｍｏｒｙに出力する。
【０１７３】
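The contact recognition function unit 103 is composed of an object called "TactileSensor", which recognizes sensor input from the contact sensor, and outputs its recognition results to ShortTermMemory and to InternalStateModel (ISM), the object that manages the internal state.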
【０１７４】
ＳｈｏｒｔＴｅｒｍＭｅｍｏｒｙ（ＳＴＭ）は、短期記憶部１０５を構成するオブジェクトであり、上述の認識系の各オブジェクトによって外部環境から認識されたターゲットやイベントを短期間保持（例えばカメラ１５からの入力画像を約１５秒程度の短い期間だけ記憶する）する機能モジュールであり、ＳＴＭクライアントであるＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒに対して外部刺激の通知（Ｎｏｔｉｆｙ）を定期的に行なう。
【０１７５】
ＬｏｎｇＴｅｒｍＭｅｍｏｒｙ（ＬＴＭ）は、長期記憶部１０６を構成するオブジェクトであり、物の名前など学習により得られた情報を長期間保持するために使用される。ＬｏｎｇＴｅｒｍＭｅｍｏｒｙは、例えば、ある行動モジュールにおいて外部刺激から内部状態の変化を連想記憶することができる。
【０１７６】
ＩｎｔｅｒｎａｌＳｔａｔｕｓＭａｎａｇｅｒ（ＩＳＭ）は、内部状態管理部１０４を構成するオブジェクトであり、本能や感情といった数種類の情動を数式モデル化して管理しており、上述の認識系の各オブジェクトによって認識された外部刺激（ＥＳ：ＥｘｔｅｒｎａｌＳｔｉｍｕｌａ）に応じてロボット装置１の本能や情動といった内部状態を管理する。
【０１７７】
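SituatedBehaviorsLayer (SBL) is the object constituting the situated behavior layer 108. The SBL is a client of ShortTermMemory (an STM client). It periodically receives notifications (Notify) of information about external stimuli (targets and events) from ShortTermMemory and, on receiving one, determines the schema, that is, the behavior module to execute (described later).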
【０１７８】
ＲｅｆｌｅｘｉｖｅＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒは、反射的行動部１０９を構成するオブジェクトであり、上述した認識系の各オブジェクトによって認識された外部刺激に応じて反射的・直接的な機体動作を実行する。例えば、人間の顔を追いかけたり、うなずく、障害物の検出により咄嗟に避けるといった振る舞いを行なう（後述）。
【０１７９】
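SituatedBehaviorsLayer selects behaviors according to the situation, such as external stimuli and changes in internal state. ReflexiveSituatedBehaviorsLayer, by contrast, acts reflexively in response to external stimuli. Because these two objects select behaviors independently, the behavior modules (schemas) they select may compete for the hardware resources of the robot 1 when executed on the body, so that they cannot all be carried out. The ResourceManager object arbitrates hardware conflicts arising from behavior selection by SituatedBehaviorsLayer and ReflexiveSituatedBehaviorsLayer. The body is then driven by notifying the objects that realize body motion on the basis of the arbitration result.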
【０１８０】
ＳｏｕｎｄＰｅｒｆｏｒｍｅｒ、ＭｏｔｉｏｎＣｏｎｔｒｏｌｌｅｒ、ＬｅｄＣｏｎｔｒｏｌｌｅｒは、機体動作を実現するオブジェクトである。ＳｏｕｎｄＰｅｒｆｏｒｍｅｒは、音声出力を行なうためのオブジェクトであり、ＲｅｓｏｕｒｃｅＭａｎａｇｅｒ経由でＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒＬａｙｅｒから与えられたテキスト・コマンドに応じて音声合成を行ない、ロボット１の機体上のスピーカから音声出力を行なう。また、ＭｏｔｉｏｎＣｏｎｔｒｏｌｌｅｒは、機体上の各関節アクチュエータの動作を行なうためのオブジェクトであり、ＲｅｓｏｕｒｃｅＭａｎａｇｅｒ経由でＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒＬａｙｅｒから手や脚などを動かすコマンドを受けたことに応答して、該当する関節角を計算する。また、ＬｅｄＣｏｎｔｒｏｌｌｅｒは、ＬＥＤ１９の点滅動作を行なうためのオブジェクトであり、ＲｅｓｏｕｒｃｅＭａｎａｇｅｒ経由でＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒＬａｙｅｒからコマンドを受けたことに応答してＬＥＤ１９の点滅駆動を行なう。
【０１８１】
図１４には、状況依存行動階層（ＳＢＬ）１０８（但し、反射行動部１０９を含む）による状況依存行動制御の形態を模式的に示している。認識系１０１〜１０３による外部環境の認識結果は、外部刺激として状況依存行動階層１０８（反射行動部１０９を含む）に与えられる。また、認識系による外部環境の認識結果に応じた内部状態の変化も状況依存行動階層１０８に与えられる。そして、状況依存行動階層１０８では、外部刺激や内部状態の変化に応じて状況を判断して、行動選択を実現することができる。
【０１８２】
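Fig. 15 shows a basic example of behavior control by the situated behavior layer 108 shown in Fig. 14. As shown in the figure, the situated behavior layer 108 (SBL) calculates the activation level of each behavior module (schema) from external stimuli and changes in internal state, selects schemas according to their activation levels, and executes the behavior. By using a library, for example, the activation level can be calculated in a uniform way for all schemas (the same applies below). For example, the schema with the highest activation level may be selected, or two or more schemas whose activation levels exceed a predetermined threshold may be selected and executed in parallel (provided that, for parallel execution, the schemas do not compete for hardware resources).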
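Threshold-based selection with a resource check can be sketched as follows. The schema names, resource names, and the greedy highest-first policy for resolving conflicts are assumptions for illustration. The specification only requires that schemas executed in parallel do not compete for hardware resources.

```python
def select_schemas(candidates, threshold):
    """Choose schemas to run in parallel.

    candidates: list of (name, activation_level, resources), where
    resources is the set of body resources the schema would occupy.
    Schemas are considered from the highest activation level down; one is
    accepted if it exceeds the threshold and needs no resource already taken.
    """
    chosen, used = [], set()
    for name, level, resources in sorted(candidates, key=lambda c: c[1], reverse=True):
        if level > threshold and used.isdisjoint(resources):
            chosen.append(name)
            used |= resources
    return chosen


candidates = [
    ("Dialogue", 0.9, {"speaker"}),
    ("TrackFace", 0.7, {"neck"}),
    ("Sing", 0.8, {"speaker"}),     # conflicts with Dialogue over the speaker
    ("Kick", 0.1, {"legs"}),        # below the threshold
]
selected = select_schemas(candidates, threshold=0.5)
```

Selecting only the single highest schema corresponds to taking the first element of the sorted list.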
【０１８３】
また、図１６には、図１４に示した状況依存行動階層１０８により反射行動を行なう場合の動作例を示している。この場合、同図に示すように、状況依存行動階層１０８に含まれる反射行動部１０９（ＲｅｆｌｅｘｉｖｅＳＢＬ）は、認識系の各オブジェクトによって認識された外部刺激を直接入力として活動度レベルを算出して、活動度レベルの度合いに応じてスキーマを選択して行動を実行する。この場合、内部状態の変化は、活動度レベルの計算には使用されない。
【０１８４】
また、図１７には、図１４に示した状況依存行動階層１０８により感情表現を行なう場合の動作例を示している。内部状態管理部１０４では、本能や感情などの情動を数式モデルとして管理しており、情動パラメータの状態値が所定値に達したことに応答して、状況依存行動階層１０８に内部状態の変化を通知（Ｎｏｔｉｆｙ）する。状況依存行動階層１０８は、内部状態の変化を入力として活動度レベルを算出して、活動度レベルの度合いに応じてスキーマを選択して行動を実行する。この場合、認識系の各オブジェクトによって認識された外部刺激は、内部状態管理部１０４（ＩＳＭ）における内部状態の管理・更新に利用されるが、スキーマの活動度レベルの算出には使用されない。
【０１８５】
Ｄ−２．スキーマ
状況依存行動階層１０８は、各行動モジュール毎にステートマシンを用意しており、それ以前の行動や状況に依存して、センサ入力された外部情報の認識結果を分類して、行動を機体上で発現する。行動モジュールは、機体動作を記述し行動実行に伴う状態遷移（ステートマシン）を実現するＡｃｔｉｏｎ機能と、Ａｃｔｉｏｎ機能において記述された行動の実行を外部刺激や内部状態に応じて評価して状況判断を行なうＭｏｎｉｔｏｒ機能とを備えたスキーマ（ｓｃｈｅｍａ）として記述される。図１８には、状況依存行動階層１０８が複数のスキーマによって構成されている様子を模式的に示している。
【０１８６】
状況依存行動階層１０８（より厳密には、状況依存行動階層１０８のうち、通常の状況依存行動を制御する階層）は、複数のスキーマが階層的に連結されたツリー構造として構成され、外部刺激や内部状態の変化に応じてより最適なスキーマを統合的に判断して行動制御を行なうようになっている。ツリーは、例えば動物行動学的（Ｅｔｈｏｌｏｇｉｃａｌ）な状況依存行動を数式化した行動モデルや、感情表現を実行するためのサブツリーなど、複数のサブツリー（又は枝）を含んでいる。
【０１８７】
図１９には、状況依存行動階層１０８におけるスキーマのツリー構造を模式的に示している。同図に示すように、状況依存行動階層１０８は、短期記憶部１０５から外部刺激の通知（Ｎｏｔｉｆｙ）を受けるルート・スキーマを先頭に、抽象的な行動カテゴリから具体的な行動カテゴリに向かうように、各階層毎にスキーマが配設されている。例えば、ルート・スキーマの直近下位の階層では、「探索する（Ｉｎｖｅｓｔｉｇａｔｅ）」、「食べる（Ｉｎｇｅｓｔｉｖｅ）」、「遊ぶ（Ｐｌａｙ）」というスキーマが配設される。そして、「探索する（Ｉｎｖｅｓｔｉｇａｔｅ）」の下位には、「ＩｎｖｅｓｔｉｇａｔｉｖｅＬｏｃｏｍｏｔｉｏｎ」、「ＨｅａｄｉｎＡｉｒＳｎｉｆｆｉｎｇ」、「ＩｎｖｅｓｔｉｇａｔｉｖｅＳｎｉｆｆｉｎｇ」というより具体的な探索行動を記述したスキーマが配設されている。同様に、スキーマ「食べる（Ｉｎｇｅｓｔｉｖｅ）」の下位には「Ｅａｔ」や「Ｄｒｉｎｋ」などのより具体的な飲食行動を記述したスキーマが配設され、スキーマ「遊ぶ（Ｐｌａｙ）」の下位には「ＰｌａｙＢｏｗｉｎｇ」、「ＰｌａｙＧｒｅｅｔｉｎｇ」、「ＰｌａｙＰａｗｉｎｇ」などのより具体的な遊ぶ行動を記述したスキーマが配設されている。
【０１８８】
図示の通り、各スキーマは外部刺激と内部状態を入力している。また、各スキーマは、少なくともＭｏｎｉｔｏｒ関数とＡｃｔｉｏｎ関数を備えている。
【０１８９】
図２０には、スキーマの内部構成を模式的に示している。同図に示すように、スキーマは、状態遷移（ステートマシン）の形式で機体動作を記述したＡｃｔｉｏｎ関数と、外部刺激や内部状態に応じてＡｃｔｉｏｎ関数の各状態を評価して活動度レベル値として返すＭｏｎｉｔｏｒ関数と、Ａｃｔｉｏｎ関数のステートマシンをＲＥＡＤＹ（準備完了）、ＡＣＴＩＶＥ（活動中），ＳＬＥＥＰ（待機中）いずれかの状態としてスキーマの状態を記憶管理する状態管理部で構成されている。
【０１９０】
Ｍｏｎｉｔｏｒ関数は、外部刺激と内部状態に応じて当該スキーマの活動度レベル（ＡｃｔｉｖａｔｉｏｎＬｅｖｅｌ：ＡＬ値）を算出する関数である。図１９に示すようなツリー構造を構成する場合、上位（親）のスキーマは外部刺激と内部状態を引数として下位（子供）のスキーマのＭｏｎｉｔｏｒ関数をコールすることができ、子供のスキーマはＡＬ値を返り値とする。また、スキーマは自分のＡＬ値を算出するために、さらに子供のスキーマのＭｏｎｉｔｏｒ関数をコールすることができる。そして、ルートのスキーマには各サブツリーからのＡＬ値が返されるので、外部刺激と内部状態の変化に応じた最適なスキーマすなわち行動を統合的に判断することができる。
【０１９１】
例えばＡＬ値が最も高いスキーマを選択したり、ＡＬ値が所定の閾値を越えた２以上のスキーマを選択して並列的に行動実行するようにしてもよい（但し、並列実行するときは各スキーマ同士でハードウェア・リソースの競合がないことを前提とする）。
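The recursive Monitor call through the schema tree can be sketched as below. The sketch is illustrative only: the `Schema` class, the `own_al` callables, and in particular the rule that a parent reports the maximum of its own AL value and its children's are assumptions, since the text does not say how a parent combines the values returned by its children.

```python
class Schema:
    def __init__(self, name, own_al, children=()):
        self.name = name
        self.own_al = own_al            # callable(stimulus, internal_state) -> float
        self.children = list(children)

    def monitor(self, stimulus, internal_state):
        """Return (AL value, name of the best schema in this subtree)."""
        best_al = self.own_al(stimulus, internal_state)
        best_name = self.name
        # The parent passes the same arguments down to each child's monitor.
        for child in self.children:
            al, name = child.monitor(stimulus, internal_state)
            if al > best_al:
                best_al, best_name = al, name
        return best_al, best_name
```

Calling `monitor` on the root therefore returns one AL value per evaluation cycle together with the schema that produced it, which is what the root needs for an integrated decision.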
【０１９２】
図２１には、Ｍｏｎｉｔｏｒ関数の内部構成を模式的に示している。同図に示すように、Ｍｏｎｉｔｏｒ関数は、当該スキーマで記述されている行動を誘発する評価値を活動度レベルとして算出する行動誘発評価値演算器と、使用する機体リソースを特定する使用リソース演算器を備えている。図２０で示す例では、Ｍｏｎｉｔｏｒ関数は、スキーマすなわち行動モジュールの管理を行なう行動状態制御部（仮称）からコールされると、Ａｃｔｉｏｎ関数のステートマシンを仮想実行して、行動誘発評価値（すなわち活動度レベル）と使用リソースを演算して、これを返すようになっている。
【０１９３】
また、Ａｃｔｉｏｎ関数は、スキーマ自身が持つ行動を記述したステートマシン（後述）を備えている。図１９に示すようなツリー構造を構成する場合、親スキーマは、Ａｃｔｉｏｎ関数をコールして、子供スキーマの実行を開始したり中断させたりすることができる。本実施形態では、ＡｃｔｉｏｎのステートマシンはＲｅａｄｙにならないと初期化されない。言い換えれば、中断しても状態はリセットされず、スキーマが実行中の作業データを保存することから、中断再実行が可能である（後述）。
【０１９４】
図２０で示す例では、スキーマすなわち行動モジュールの管理を行なう行動状態制御部（仮称）は、Ｍｏｎｉｔｏｒ関数からの戻り値に基づいて、実行すべき行動を選択し、該当するスキーマのＡｃｔｉｏｎ関数をコールし、あるいは状態管理部に記憶されているスキーマの状態の移行を指示する。例えば行動誘発評価値としての活動度レベルが最も高いスキーマを選択したり、リソースが競合しないように優先順位に従って複数のスキーマを選択したりする。また、行動状態制御部は、より優先順位の高いスキーマが起動し、リソースの競合が生じた場合、優先順位が下位のスキーマの状態をＡＣＴＩＶＥからＳＬＥＥＰに退避させ、競合状態が解かれるとＡＣＴＩＶＥに回復するなど、スキーマの状態を制御する。
【０１９５】
行動状態制御部は、図２２に示すように、状況依存行動階層１０８において１つだけ配設して、同階層１０８を構成するすべてのスキーマを一元的に集中管理するようにしてもよい。
【０１９６】
図示の例では、行動状態制御部は、行動評価部と、行動選択部と、行動実行部を備えている。行動評価部は、例えば所定の制御周期で各スキーマのＭｏｎｉｔｏｒ関数をコールして、各々の活動度レベルと使用リソースを取得する。行動選択部は、各スキーマによる行動制御と機体リソースの管理を行なう。例えば、集計された活動度レベルの高い順にスキーマを選択するとともに、使用リソースが競合しないように２以上のスキーマを同時に選択する。行動実行部は、選択されたスキーマのＡｃｔｉｏｎ関数に行動実行命令を発行したり、スキーマの状態（ＲＥＡＤＹ、ＡＣＴＩＶＥ，ＳＬＥＥＰ）を管理して、スキーマの実行を制御する。例えば、より優先順位の高いスキーマが起動し、リソースの競合が生じた場合、優先順位が下位のスキーマの状態をＡＣＴＩＶＥからＳＬＥＥＰに退避させ、競合状態が解かれるとＡＣＴＩＶＥに回復する。
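One control cycle of such a centralized controller might look like the following sketch. It is a simplified illustration, assuming that a higher activation level means higher priority and that schemas are plain dictionaries; neither detail comes from the text.

```python
from enum import Enum


class State(Enum):
    READY = "READY"
    ACTIVE = "ACTIVE"
    SLEEP = "SLEEP"


def arbitrate(schemas):
    """Decide the next state of every schema for one control cycle.

    schemas: list of dicts with keys 'name', 'al', 'resources' (a set)
    and 'state' (a State). Returns a dict mapping name -> new State.
    """
    used = set()
    result = {}
    for s in sorted(schemas, key=lambda s: s["al"], reverse=True):
        if used.isdisjoint(s["resources"]):
            # No conflict: the schema runs (or resumes if it was asleep).
            used |= s["resources"]
            result[s["name"]] = State.ACTIVE
        elif s["state"] in (State.ACTIVE, State.SLEEP):
            # Conflict with a higher-priority schema: suspend, keep context.
            result[s["name"]] = State.SLEEP
        else:
            # Never started, so there is nothing to suspend.
            result[s["name"]] = State.READY
    return result
```

A schema that was put to SLEEP returns to ACTIVE on the first cycle in which its resources are free again.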
【０１９７】
あるいは、このような行動状態制御部の機能を、状況依存行動階層１０８内の各スキーマ毎に配置するようにしてもよい。例えば、図１９に示したように，スキーマがツリー構造を形成している場合（図２３を参照のこと）、上位（親）のスキーマの行動状態制御は、外部刺激と内部状態を引数として下位（子供）のスキーマのＭｏｎｉｔｏｒ関数をコールし、子供のスキーマから活動度レベルと使用リソースを返り値として受け取る。また、子供のスキーマは、自分の活動度レベルと使用リソースを算出するために、さらに子供のスキーマのＭｏｎｉｔｏｒ関数をコールする。そして、ルートのスキーマの行動状態制御部には、各サブツリーからの活動度レベルと使用リソースが返されるので、外部刺激と内部状態の変化に応じた最適なスキーマすなわち行動を統合的に判断して、Ａｃｔｉｏｎ関数をコールして、子供スキーマの実行を開始したり中断させたりする。
【０１９８】
図２４には、状況依存行動階層１０８において通常の状況依存行動を制御するためのメカニズムを模式的に示している。
【０１９９】
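As shown in the figure, external stimuli are input (Notify) to the situated behaviors layer 108 from the short-term memory unit 105, and changes in the internal state are input from the internal state management unit 104. The situated behaviors layer 108 is made up of a plurality of subtrees, for example a behavior model that formulates ethological situation-dependent behavior and a subtree for executing emotional expression. In response to a notification (Notify) of an external stimulus, the root schema calls the monitor function of each subtree, refers to the activation level (AL value) returned, makes an integrated behavior selection, and calls the action function of the subtree that realizes the selected behavior. The situation-dependent behavior determined in the situated behaviors layer 108 is applied to body motion (MotionController) after the resource manager has arbitrated hardware resource conflicts with the reflexive behavior of the reflexive behavior unit 109.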
【０２００】
また、状況依存行動層１０８のうち、反射的行動部１０９は、上述した認識系の各オブジェクトによって認識された外部刺激に応じて反射的・直接的な機体動作を実行する（例えば、障害物の検出により咄嗟に避ける）。このため、通常の状況依存行動を制御する場合（図１９）とは相違し、認識系の各オブジェクトからの信号を直接入力する複数のスキーマが、階層化されずに並列的に配置されている。
【０２０１】
図２５には、反射行動部１０９におけるスキーマの構成を模式的に示している。同図に示すように、反射行動部１０９には、聴覚系の認識結果に応答して動作するスキーマとして「ＡｖｏｉｄＢｉｇＳｏｕｎｄ」、「ＦａｃｅｔｏＢｉｇＳｏｕｎｄ」及び「ＮｏｄｄｉｎｇＳｏｕｎｄ」、視覚系の認識結果に応答して動作するスキーマとして「ＦａｃｅｔｏＭｏｖｉｎｇＯｂｊｅｃｔ」及び「ＡｖｏｉｄＭｏｖｉｎｇＯｂｊｅｃｔ」、並びに、触覚系の認識結果に応答して動作するスキーマとして「手を引っ込める」が、それぞれ対等な立場で（並列的に）配設されている。
【０２０２】
図示の通り、反射的行動を行なう各スキーマは外部刺激を入力に持つ。また、各スキーマは、少なくともｍｏｎｉｔｏｒ関数とａｃｔｉｏｎ関数を備えている。ｍｏｎｉｔｏｒ関数は、外部刺激に応じて当該スキーマのＡＬ値を算出して、これに応じて該当する反射的行動を発現すべきかどうかが判断される。また、ａｃｔｉｏｎ関数は、スキーマ自身が持つ反射的行動を記述したステートマシン（後述）を備えており、コールされることにより、該当する反射的行動を発現するとともにａｃｔｉｏｎの状態を遷移させていく。
【０２０３】
図２６には、反射行動部１０９において反射的行動を制御するためのメカニズムを模式的に示している。
【０２０４】
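As also shown in FIG. 25, schemas describing reactive behaviors and schemas describing immediate response behaviors exist in parallel in the reflexive behavior unit 109. When a recognition result is input from an object of the recognition system, the corresponding reflexive behavior schema calculates an AL value with its monitor function, and whether to start its action is decided according to that value. The reflexive behavior whose start has been decided in the reflexive behavior unit 109 is applied to body motion (MotionController) after the resource manager has arbitrated hardware resource conflicts with the situation-dependent behavior of the situated behaviors layer 108.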
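The flat, non-hierarchical arrangement of reflex schemas can be sketched as a simple lookup. The threshold rule and the scoring lambdas below are invented for illustration; the text only says that whether to start the action is decided according to the AL value. The schema names are taken from FIG. 25.

```python
def triggered_reflexes(reflexes, stimulus, threshold=0.5):
    """Return the names of reflex schemas whose AL value reaches the threshold.

    reflexes: dict mapping a schema name to its monitor callable. Each
    monitor takes only the external stimulus; no internal state is used.
    """
    return [name for name, monitor in reflexes.items()
            if monitor(stimulus) >= threshold]


# Hypothetical monitors; the scoring rules are assumptions for the example.
REFLEXES = {
    "AvoidBigSound": lambda s: 1.0 if s.get("sound_db", 0) > 90 else 0.0,
    "FacetoMovingObject": lambda s: 0.8 if s.get("motion", False) else 0.0,
}
```

Every schema is evaluated against the same stimulus on equal footing, which mirrors the parallel arrangement described above.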
【０２０５】
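The schemas that make up the situated behaviors layer 108 (including the reflexive behavior unit 109) can be described as "class objects" written, for example, on a C++ basis. FIG. 27 schematically shows the class definitions of the schemas used in the situated behaviors layer 108. Each block shown in the figure corresponds to one class object.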
【０２０６】
図示の通り、状況依存行動階層（ＳＢＬ）１０８は、１以上のスキーマと、ＳＢＬの入出力イベントに対してＩＤを割り振るＥｖｅｎｔＤａｔａＨａｎｄｌｅｒ（ＥＤＨ）と、ＳＢＬ内のスキーマを管理するＳｃｈｅｍａＨａｎｄｌｅｒ（ＳＨ）と、外部オブジェクト（ＳＴＭやＬＴＭ、リソース・マネージャ、認識系の各オブジェクトなど）からデータを受信する１以上のＲｅｃｅｉｖｅＤａｔａＨａｎｄｌｅｒ（ＲＤＨ）と、外部オブジェクトにデータを送信する１以上のＳｅｎｄＤａｔａＨａｎｄｌｅｒ（ＳＤＨ）とを備えている。
【０２０７】
ＥｖｅｎｔＤａｔａＨａｎｄｌｅｒ（ＥＤＨ）は、ＳＢＬの入出力イベントに対してＩＤを割り振るためのクラス・オブジェクトであり、ＲＤＨやＳＤＨから入出力イベントの通知を受ける。
【０２０８】
ＳｃｈｅｍａＨａｎｄｌｅｒは、状況依存行動階層（ＳＢＬ）１０８や反射行動部１０９を構成する各スキーマやツリー構造などの情報（ＳＢＬのコンフィギュレーション情報）をファイルとして保管している。例えばシステムの起動時などに、ＳｃｈｅｍａＨａｎｄｌｅｒは、このコンフィギュレーション情報ファイルを読み込んで、図１９に示したような状況依存行動階層１０８のスキーマ構成を構築（再現）して、メモリ空間上に各スキーマのエンティティをマッピングする。
【０２０９】
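Each schema has OpenR_Guest, which is positioned as the base of the schema. OpenR_Guest has one or more of each of two class objects: DSubject, with which the schema sends data to the outside, and DObject, with which the schema receives data from the outside. For example, when a schema sends data to an object outside the SBL (the STM, the LTM, an object of the recognition system, and so on), DSubject writes the transmission data to SendDataHandler. DObject can read data received from objects outside the SBL from ReceiveDataHandler.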
【０２１０】
ＳｃｈｅｍａＭａｎａｇｅｒ及びＳｃｈｅｍａＢａｓｅは、ともにＯｐｅｎＲ＿Ｇｕｅｓｔを継承したクラス・オブジェクトである。クラス継承は、元のクラスの定義を受け継ぐことであり、この場合、ＯｐｅｎＲ＿Ｇｕｅｓｔで定義されているＤｓｕｂｊｅｃｔやＤＯｂｊｅｃｔなどのクラスオブジェクトをＳｃｈｅｍａＭａｎａｇｅｒやＳｃｈｅｍａＢａｓｅも備えていることを意味する（以下、同様）。例えば図１９に示すように複数のスキーマがツリー構造になっている場合、ＳｃｈｅｍａＭａｎａｇｅｒは、子供のスキーマのリストを管理するクラス・オブジェクトＳｃｈｅｍａＬｉｓｔを持ち（子供のスキーマへのポインタを持ち）、子供スキーマの関数をコールすることができる。また、ＳｃｈｅｍａＢａｓｅは、親スキーマへのポインタを持ち、親スキーマからコールされた関数の返り値を戻すことができる。
【０２１１】
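SchemaBase has two class objects, StateMachine and Pronome. StateMachine manages the state machine for the behavior (Action function) of the schema. FIG. 28 illustrates the state machine for the behavior (Action function) of a schema. A behavior (Action) is attached to each transition between the states of this state machine.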
【０２１２】
親スキーマは子供スキーマのＡｃｔｉｏｎ関数のステートマシンを切り替える（状態遷移させる）ことができる。また、Ｐｒｏｎｏｍｅには、当該スキーマが行動（Ａｃｔｉｏｎ関数）を実行又は適用するターゲットを代入する。後述するように、スキーマはＰｒｏｎｏｍｅに代入されたターゲットによって占有され、行動が終了（完結、異常終了など）するまでスキーマは解放されない。新規のターゲットのために同じ行動を実行するためには同じクラス定義のスキーマをメモリ空間上に生成する。この結果、同じスキーマをターゲット毎に独立して実行することができ（個々のスキーマの作業データが干渉し合うことはなく）、行動のＲｅｅｎｔｒａｎｃｅ性（後述）が確保される。
【０２１３】
ＰａｒｅｎｔＳｃｈｅｍａＢａｓｅは、ＳｃｈｅｍａＭａｎａｇｅｒ及びＳｃｈｅｍａＢａｓｅを多重継承するクラス・オブジェクトであり、スキーマのツリー構造において、当該スキーマ自身についての親スキーマ及び子供スキーマすなわち親子関係を管理する。
【０２１４】
ＩｎｔｅｒｍｅｄｉａＰａｒｅｎｔＳｃｈｅｍａＢａｓｅは、ＰａｒｅｎｔＳｃｈｅｍａＢａｓｅを継承するクラス・オブジェクトであり、各クラスのためのインターフェース変換を実現する。また、ＩｎｔｅｒｍｅｄｉａＰａｒｅｎｔＳｃｈｅｍａＢａｓｅは、ＳｃｈｅｍａＳｔａｔｕｓＩｎｆｏを持つ。このＳｃｈｅｍａＳｔａｔｕｓＩｎｆｏは、当該スキーマ自身のステートマシンを管理するクラス・オブジェクトである。
【０２１５】
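A parent schema can switch the state of a child schema's state machine by calling the child's Action function. It can also call the child's Monitor function to ask for the AL value corresponding to the state of that state machine. Note, however, that the state machine of the schema is different from the state machine of the Action function described above.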
【０２１６】
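FIG. 29 illustrates the state machine for the schema itself, that is, for the behavior described by the Action function. As already mentioned, the state machine of the schema itself defines three states for the behavior described by the Action function: READY, ACTIVE, and SLEEP. When a schema with a higher priority starts and a resource conflict arises, the lower-priority schema is moved from ACTIVE to SLEEP, and it is returned to ACTIVE when the conflict is resolved.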
【０２１７】
図２９に示すように、ＡＣＴＩＶＥからＳＬＥＥＰへの状態遷移にＡＣＴＩＶＥ＿ＴＯ＿ＳＬＥＥＰが、ＳＬＥＥＰからＡＣＴＩＶＥへの状態遷移にＳＬＥＥＰ＿ＴＯ＿ＡＣＴＩＶＥがそれぞれ規定されている。本実施形態において特徴的なのは、
（１）ＡＣＴＩＶＥ＿ＴＯ＿ＳＬＥＥＰに、後にＡＣＴＩＶＥに遷移して再開するために必要なデータ（コンテキスト）を保存するための処理と、ＳＬＥＥＰするために必要な行動が紐付けされている。
（２）ＳＬＥＥＰ＿ＴＯ＿ＡＣＴＩＶＥに、保存しておいたデータ（コンテキスト）を復元するための処理と、ＡＣＴＩＶＥに戻るために必要な行動が紐付けされている。
という点である。ＳＬＥＥＰするために必要な行動とは、例えば、話し相手に休止を告げる「ちょっと待っててね」などのセリフを言う行動（その他、身振り手振りが加わっていてもよい）である。また、ＡＣＴＩＶＥに戻るために必要な行動とは、例えば、話し相手に謝意を表わす「お待たせ」などのセリフを言う行動（その他、身振り手振りが加わっていてもよい）である。
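The two transitions and the work attached to them can be sketched as follows. The class, its field names, and the spoken lines are hypothetical; the sketch only illustrates that ACTIVE_TO_SLEEP saves the context and performs a parting action, while SLEEP_TO_ACTIVE restores the context and performs a resuming action.

```python
import copy


class DialogSchema:
    """Illustrative schema whose working data survives a suspension."""

    def __init__(self):
        self.state = "READY"
        self.work = {"turn": 0}   # working data of the running action
        self._saved = None        # context stored while asleep
        self.spoken = []          # stands in for speech output

    def activate(self):
        self.state = "ACTIVE"

    def step(self):
        # The action only advances while the schema is ACTIVE.
        if self.state == "ACTIVE":
            self.work["turn"] += 1

    def active_to_sleep(self):
        self.spoken.append("Hold on a moment.")           # action needed to sleep
        self._saved = copy.deepcopy(self.work)            # save the context
        self.state = "SLEEP"

    def sleep_to_active(self):
        self.work = copy.deepcopy(self._saved)            # restore the context
        self.spoken.append("Sorry to keep you waiting.")  # action needed to resume
        self.state = "ACTIVE"
```

Because the context is copied rather than reset, the action resumes from the turn at which it was interrupted.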
【０２１８】
ＡｎｄＰａｒｅｎｔＳｃｈｅｍａ、ＮｕｍＯｒＰａｒｅｎｔＳｃｈｅｍａ、ＯｒＰａｒｅｎｔＳｃｈｅｍａは、ＩｎｔｅｒｍｅｄｉａＰａｒｅｎｔＳｃｈｅｍａＢａｓｅを継承するクラス・オブジェクトである。ＡｎｄＰａｒｅｎｔＳｃｈｅｍａは、同時実行する複数の子供スキーマへのポインタを持つ。ＯｒＰａｒｅｎｔＳｃｈｅｍａは、いずれか択一的に実行する複数の子供スキーマへのポインタを持つ。また、ＮｕｍＯｒＰａｒｅｎｔＳｃｈｅｍａは、所定数のみを同時実行する複数の子供スキーマへのポインタを持つ。
【０２１９】
ＰａｒｅｎｔＳｃｈｅｍａは、これらＡｎｄＰａｒｅｎｔＳｃｈｅｍａ、ＮｕｍＯｒＰａｒｅｎｔＳｃｈｅｍａ、ＯｒＰａｒｅｎｔＳｃｈｅｍａを多重継承するクラス・オブジェクトである。
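The difference between the three parent classes can be summarized in a small sketch. The function below is hypothetical and assumes that children are ranked by AL value when only some of them may run; the text states only how many children each parent type executes.

```python
def children_to_run(kind, child_als, limit=1):
    """Return the child names a parent of the given kind would execute.

    child_als: dict mapping child name -> AL value.
    """
    ranked = sorted(child_als, key=child_als.get, reverse=True)
    if kind == "and":      # AndParentSchema: all children run simultaneously
        return ranked
    if kind == "or":       # OrParentSchema: exactly one child is run
        return ranked[:1]
    if kind == "num_or":   # NumOrParentSchema: at most `limit` children run
        return ranked[:limit]
    raise ValueError(f"unknown parent kind: {kind}")
```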
【０２２０】
図３０には、状況依存行動階層（ＳＢＬ）１０８内のクラスの機能的構成を模式的に示している。
【０２２１】
状況依存行動階層（ＳＢＬ）１０８は、ＳＴＭやＬＴＭ、リソース・マネージャ、認識系の各オブジェクトなど外部オブジェクトからデータを受信する１以上のＲｅｃｅｉｖｅＤａｔａＨａｎｄｌｅｒ（ＲＤＨ）と、外部オブジェクトにデータを送信する１以上のＳｅｎｄＤａｔａＨａｎｄｌｅｒ（ＳＤＨ）とを備えている。
【０２２２】
ＥｖｅｎｔＤａｔａＨａｎｄｌｅｒ（ＥＤＨ）は、ＳＢＬの入出力イベントに対してＩＤを割り振るためのクラス・オブジェクトであり、ＲＤＨやＳＤＨから入出力イベントの通知を受ける。
【０２２３】
ＳｃｈｅｍａＨａｎｄｌｅｒは、スキーマを管理するためのクラス・オブジェクトであり、ＳＢＬを構成するスキーマのコンフィギュレーション情報をファイルとして保管している。例えばシステムの起動時などに、ＳｃｈｅｍａＨａｎｄｌｅｒは、このコンフィギュレーション情報ファイルを読み込んで、ＳＢＬ内のスキーマ構成を構築する。
【０２２４】
各スキーマは、図２７に示したクラス定義に従って生成され、メモリ空間上にエンティティがマッピングされる。各スキーマは、ＯｐｅｎＲ＿Ｇｕｅｓｔをベースのクラス・オブジェクトとし、外部にデータ・アクセスするためのＤＳｕｂｊｅｃｔやＤＯｂｊｅｃｔなどのクラス・オブジェクトを備えている。
【０２２５】
スキーマが主に持つ関数とステートマシンを以下に示しておく。
【０２２６】
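ActivationMonitor(): evaluation function for a schema in the Ready state to become Active.
Actions(): state machine executed while Active.
Goal(): function that evaluates whether the schema has reached its Goal while Active.
Fail(): function that determines whether the schema is in a fail state while Active.
SleepActions(): state machine executed before Sleep.
SleepMonitor(): evaluation function for resuming while in Sleep.
ResumeActions(): state machine executed before Resume in order to resume.
DestroyMonitor(): evaluation function that determines whether the schema is in a fail state while in Sleep.
MakePronome(): function that determines the target of the whole tree.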
【０２２７】
これらの関数は、ＳｃｈｅｍａＢａｓｅで記述されている。
【０２２８】
図３１には、ＭａｋｅＰｒｏｎｏｍｅ関数を実行する処理手順をフローチャートの形式で示している。
【０２２９】
When the MakePronome function of a schema is called, it is first determined whether the schema itself has any child schemas (step S1).
【０２３０】
子供スキーマが存在する場合には、同様にすべての子供スキーマのＭａｋｅＰｒｏｎｏｍｅ関数を再帰的にコールする（ステップＳ２）。
【０２３１】
そして、スキーマ自身のＭａｋｅＰｒｏｎｏｍｅを実行して、Ｐｒｏｎｏｍｅオブジェクトにターゲットが代入される（ステップＳ３）。
【０２３２】
この結果、自分以下のすべてのスキーマのＰｒｏｎｏｍｅに対して同じターゲットが代入され、行動が終了（完結、異常終了など）するまでスキーマは解放されない。新規のターゲットのために同じ行動を実行するためには同じクラス定義のスキーマをメモリ空間上に生成する。
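Steps S1 to S3 amount to a post-order traversal of the schema tree. A minimal sketch, assuming a hypothetical `Node` class and a target that is passed in as an argument (the text does not say how the target itself is chosen):

```python
class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.pronome = None   # target this schema is bound to

    def make_pronome(self, target):
        # S1: check for child schemas (an empty list skips the loop).
        # S2: call MakePronome recursively on every child.
        for child in self.children:
            child.make_pronome(target)
        # S3: assign the target to this schema's own Pronome.
        self.pronome = target
```

After the call returns, every schema at or below the starting node holds the same target.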
【０２３３】
図３２には、Ｍｏｎｉｔｏｒ関数を実行する処理手順をフローチャートの形式で示している。
【０２３４】
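First, the assessment flag (AssessmentFlag) is set to on (step S11), and the Action of the schema itself is executed (step S12). Child schemas are also selected at this time. The assessment flag is then set back to off (step S13).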
【０２３５】
子供スキーマが存在する場合には（ステップＳ１４）、ステップＳ１２において選択した子供スキーマのＭｏｎｉｔｏｒ関数を再帰的にコールする（ステップＳ１５）。
【０２３６】
次いで、スキーマ自身のＭｏｎｉｔｏｒ関数を実行して（ステップＳ１６）、活動度レベルと行動実行に使用するリソースを算出して（ステップＳ１７）、関数の戻り値とする。
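The flow of steps S11 to S17 can be sketched as follows. This is an illustration under assumptions: the class and attribute names are hypothetical, every child is treated as selected in step S12, and the parent's AL value is taken as the maximum over its own base value and the children's values, which the text does not specify.

```python
class MonitoredSchema:
    def __init__(self, name, base_al, resources, children=()):
        self.name = name
        self.base_al = base_al
        self.resources = set(resources)
        self.children = list(children)
        self.assessment_flag = False
        self.log = []

    def action(self):
        # While the flag is on, the action is only evaluated, not performed.
        self.log.append("assess" if self.assessment_flag else "execute")
        return self.children   # child selection; here every child is selected

    def monitor(self):
        self.assessment_flag = True                         # S11
        selected = self.action()                            # S12
        self.assessment_flag = False                        # S13
        results = [child.monitor() for child in selected]   # S14, S15
        # S16, S17: compute this schema's AL value and the resources it needs.
        al = max([self.base_al] + [r[0] for r in results])
        resources = self.resources.union(*(r[1] for r in results))
        return al, resources
```

Returning the resource set together with the AL value lets the caller check for hardware conflicts before granting execution.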
【０２３７】
図３３及び図３４には、Ａｃｔｉｏｎｓ関数を実行する処理手順をフローチャートの形式で示している。
【０２３８】
まず、スキーマがＳＴＯＰＰＩＮＧ状態かどうかをチェックし（ステップＳ２１）、次いで、ＳＴＯＰＰＩＮＧすべき状態かどうかをチェックする（ステップＳ２２）。
【０２３９】
ＳＴＯＰＰＩＮＧすべき状態である場合には、さらに子供スキーマがいるかどうかをチェックする（ステップＳ２３）。そして、子供スキーマがいる場合にはこれをＧＯ＿ＴＯ＿ＳＴＯＰ状態に移行させてから（ステップＳ２４）、ＨａｖｅＴｏＳｔｏｐＦｌａｇをオンにする（ステップＳ２５）。
【０２４０】
また、ＳＴＯＰＰＩＮＧすべき状態でない場合には、ＲＵＮＮＩＮＧ状態かどうかをチェックする（ステップＳ２６）。
【０２４１】
ＲＵＮＮＩＮＧ状態でない場合には、さらに子供スキーマがいるかどうかをチェックする（ステップＳ２７）。そして、子供スキーマがいる場合には、ＨａｖｅＴｏＳｔｏｐＦｌａｇをオンにする（ステップＳ２８）。
【０２４２】
次いで、現在のシステム状態とＨａｖｅＴｏＲｕｎＦｌａｇとＨａｖｅＴｏＳｔｏｐＦｌａｇと子供スキーマの動作状態から次の自分自身の状態を決定する（ステップＳ２９）。
【０２４３】
次いで、スキーマ自身のＡｃｔｉｏｎ関数を実行する（ステップＳ３０）。
【０２４４】
その後、スキーマ自身がＧＯ＿ＴＯ＿ＳＴＯＰ状態かどうかをチェックする（ステップＳ３１）。ＧＯ＿ＴＯ＿ＳＴＯＰ状態でない場合には、さらに子供スキーマがいるかどうかをチェックする（ステップＳ３２）。そして、子供スキーマがいる場合には、ＧＯ＿ＴＯ＿ＳＴＯＰ状態の子供スキーマがいるかどうかをチェックする（ステップＳ３３）。
【０２４５】
ＧＯ＿ＴＯ＿ＳＴＯＰ状態の子供スキーマがいる場合には、これらのスキーマのＡｃｔｉｏｎ関数を実行する（ステップＳ３４）。
【０２４６】
次いで、ＲＵＮＮＩＮＧ中の子供スキーマがいるかどうかをチェックする（ステップＳ３５）。ＲＵＮＮＩＮＧ中の子供スキーマがいない場合には、停止中の子供スキーマがいるかどうかをチェックして（ステップＳ３６）、停止中の子供スキーマのＡｃｔｉｏｎ関数を実行する（ステップＳ３７）。
【０２４７】
次いで、ＧＯ＿ＴＯ＿ＲＵＮ状態の子供スキーマがいるかどうかをチェックする（ステップＳ３８）。ＧＯ＿ＴＯ＿ＲＵＮ状態の子供スキーマがいない場合には、ＧＯ＿ＴＯ＿ＳＴＯＰ状態の子供スキーマがいるかどうかをチェックして（ステップＳ３９）、いればこの子供スキーマのＡｃｔｉｏｎ関数を実行する（ステップＳ４０）。
【０２４８】
最後に、現在のシステム状態とＨａｖｅＴｏＲｕｎＦｌａｇとＨａｖｅＴｏＳｔｏｐＦｌａｇと子供の動作状態から自分自身の次の状態を決定して，本処理ルーチン全体を終了する（ステップＳ４１）。
【０２４９】
Ｄ−３．状況依存行動階層の機能
状況依存行動階層（ＳｉｔｕａｔｅｄＢｅｈａｖｉｏｒｓＬａｙｅｒ）１０８は、短期記憶部１０５並びに長期記憶部１０６の記憶内容や、内部状態管理部１０４によって管理される内部状態を基に、ロボット装置１が現在置かれている状況に即応した行動を制御する。
【０２５０】
前項で述べたように、本実施形態に係る状況依存行動階層１０８は、スキーマのツリー構造（図１９を参照のこと）で構成されている。各スキーマは、自分の子供と親の情報を知っている状態で独立性を保っている。このようなスキーマ構成により、状況依存行動階層１０８は、Ｃｏｎｃｕｒｒｅｎｔな評価、Ｃｏｎｃｕｒｒｅｎｔな実行、Ｐｒｅｅｍｐｔｉｏｎ、Ｒｅｅｎｔｒａｎｔという主な特徴を持っている。以下、これらの特徴について詳解する。
【０２５１】
（１）Ｃｏｎｃｕｒｒｅｎｔな評価：
行動モジュールとしてのスキーマは外部刺激や内部状態の変化に応じた状況判断を行なうＭｏｎｉｔｏｒ機能を備えていることは既に述べた。Ｍｏｎｉｔｏｒ機能は、スキーマがクラス・オブジェクトＳｃｈｅｍａＢａｓｅでＭｏｎｉｔｏｒ関数を備えていることにより実装されている。Ｍｏｎｉｔｏｒ関数とは、外部刺激と内部状態に応じて当該スキーマの活動度レベル（ＡｃｔｉｖａｔｉｏｎＬｅｖｅｌ：ＡＬ値）を算出する関数である。
【０２５２】
図１９に示すようなツリー構造を構成する場合、上位（親）のスキーマは外部刺激と内部状態を引数として下位（子供）のスキーマのＭｏｎｉｔｏｒ関数をコールすることができ、子供のスキーマはＡＬ値を返り値とする。また、スキーマは自分のＡＬ値を算出するために、さらに子供のスキーマのＭｏｎｉｔｏｒ関数をコールすることができる。そして、ルートのスキーマには各サブツリーからのＡＬ値が返されるので、外部刺激と内部状態の変化に応じた最適なスキーマすなわち行動を統合的に判断することができる。
【０２５３】
このようにツリー構造になっていることから、外部刺激と内部状態の変化による各スキーマの評価は、まずツリー構造の下から上に向かってＣｏｎｃｕｒｒｅｎｔに行なわれる。図３２のフローチャートでも示したように、スキーマに子供スキーマがある場合には（ステップＳ１４）、選択した子供のＭｏｎｉｔｏｒ関数をコールしてから（ステップＳ１５）、自身のＭｏｎｉｔｏｒ関数を実行する。
【０２５４】
次いで、ツリー構造の上から下に向かって評価結果としての実行許可を渡していく。評価と実行は、その行動が用いるリソースの競合を解きながら行なわれる。
【０２５５】
本実施形態に係る状況依存行動階層１０８は、スキーマのツリー構造を利用して、並列的に行動の評価を行なうことができるので、外部刺激や内部状態などの状況に対しての適応性がある。また、評価時には、ツリー全体に関しての評価を行ない、このとき算出される活動度レベル（ＡＬ）値によりツリーが変更されるので、スキーマすなわち実行する行動を動的にプライオリタイズすることができる。
【０２５６】
（２）Ｃｏｎｃｕｒｒｅｎｔな実行：
ルートのスキーマには各サブツリーからのＡＬ値が返されるので、外部刺激と内部状態の変化に応じた最適なスキーマすなわち行動を統合的に判断することができる。例えばＡＬ値が最も高いスキーマを選択したり、ＡＬ値が所定の閾値を越えた２以上のスキーマを選択して並列的に行動実行するようにしてもよい（但し、並列実行するときは各スキーマ同士でハードウェア・リソースの競合がないことを前提とする）。
【０２５７】
実行許可をもらったスキーマは実行される。すなわち、実際にそのスキーマはさらに詳細の外部刺激や内部状態の変化を観測して、コマンドを実行する。実行に関しては、ツリー構造の上から下に向かって順次すなわちＣｏｎｃｕｒｒｅｎｔに行なわれる。図３３及び図３４のフローチャートでも示したように、スキーマに子供スキーマがある場合には、子供のＡｃｔｉｏｎｓ関数を実行する。
【０２５８】
Ａｃｔｉｏｎ関数は、スキーマ自身が持つ行動を記述したステートマシン（後述）を備えている。図１９に示すようなツリー構造を構成する場合、親スキーマは、Ａｃｔｉｏｎ関数をコールして、子供スキーマの実行を開始したり中断させたりすることができる。
【０２５９】
本実施形態に係る状況依存行動階層１０８は、スキーマのツリー構造を利用して、リソースが競合しない場合には、余ったリソースを使う他のスキーマを同時に実行することができる。但し、Ｇｏａｌまでに使用するリソースに対して制限を加えないと、ちぐはぐな行動出現が起きる可能性がある。状況依存行動階層１０８において決定された状況依存行動は、リソース・マネージャにより反射行動部１０９による反射的行動とのハードウェア・リソースの競合の調停を経て、機体動作（ＭｏｔｉｏｎＣｏｎｔｒｏｌｌｅｒ）に適用される。
【０２６０】
（３）Ｐｒｅｅｍｐｔｉｏｎ：
１度実行に移されたスキーマであっても、それよりも重要な（優先度の高い）行動があれば、スキーマを中断してそちらに実行権を渡さなければならない。また、より重要な行動が終了（完結又は実行中止など）したら、元のスキーマを再開して実行を続けることも必要である。
【０２６１】
このような優先度に応じたタスクの実行は、コンピュータの世界におけるＯＳ（オペレーティング・システム）のＰｒｅｅｍｐｔｉｏｎと呼ばれる機能に類似している。ＯＳでは、スケジュールを考慮するタイミングで優先度のより高いタスクを順に実行していくという方針である。
【０２６２】
これに対し、本実施形態に係るロボット１の行動制御システム１００は、複数のオブジェクトにまたがるため、オブジェクト間での調停が必要になる。例えば反射行動を制御するオブジェクトであるＲｅｆｌｅｘｉｖｅＳＢＬは、上位の状況依存行動を制御するオブジェクトであるＳＢＬの行動評価を気にせずに物を避けたり、バランスをとったりする必要がある。これは、実際に実行権を奪い取り実行を行なう訳であるが、上位の行動モジュール（ＳＢＬ）に、実行権利が奪い取られたことを通知して、上位はその処理を行なうことによってＰｒｅｅｍｐｔｉｖｅな能力を保持する。
【０２６３】
また、状況依存行動層１０８内において、外部刺激と内部状態の変化に基づくＡＬ値の評価の結果、あるスキーマに実行許可がなされたとする。さらに、その後の外部刺激と内部状態の変化に基づくＡＬ値の評価により、別のスキーマの重要度の方がより高くなったとする。このような場合、実行中のスキーマのＡｃｔｉｏｎｓ関数を利用してＳｌｅｅｐ状態にして中断することにより、Ｐｒｅｅｍｐｔｉｖｅな行動の切り替えを行なうことができる。
【０２６４】
実行中のスキーマのＡｃｔｉｏｎｓ（）の状態を保存して、異なるスキーマのＡｃｔｉｏｎｓ（）を実行する。また、異なるスキーマのＡｃｔｉｏｎｓ（）が終了した後、中断されたスキーマのＡｃｔｉｏｎｓ（）を再度実行することができる。
【０２６５】
また、実行中のスキーマのＡｃｔｉｏｎｓ（）を中断して、異なるスキーマに実行権が移動する前に、ＳｌｅｅｐＡｃｔｉｏｎｓ（）を実行する。例えば、ロボット１は、対話中にサッカーボールを見つけると、「ちょっと待ってね」と言って、サッカーすることができる。
【０２６６】
（４）Ｒｅｅｎｔｒａｎｔ：
状況依存行動階層１０８を構成する各スキーマは、一種のサブルーチンである。スキーマは、複数の親からコールされた場合には、その内部状態を記憶するために、それぞれの親に対応した記憶空間を持つ必要がある。
【０２６７】
これは、コンピュータの世界では、ＯＳが持つＲｅｅｎｔｒａｎｔ性に類似しており、本明細書ではスキーマのＲｅｅｎｔｒａｎｔ性と呼ぶ。図３０を参照しながら説明したように、スキーマはクラス・オブジェクトで構成されており、クラス・オブジェクトのエンティティすなわちインスタンスをターゲット（Ｐｒｏｎｏｍｅ）毎に生成することによりＲｅｅｎｔｒａｎｔ性が実現される。
【０２６８】
スキーマのＲｅｅｎｔｒａｎｔ性について、図３５を参照しながらより具体的に説明する。
【０２６９】
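SchemaHandler is a class object for managing schemas, and it keeps the configuration information of the schemas that make up the SBL as a file. At system startup, SchemaHandler reads this configuration information file and builds the schema structure in the SBL. In the example shown in FIG. 35, it is assumed that entities of schemas defining behaviors such as Eat and Dialog are mapped in the memory space.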
【０２７０】
ここで、外部刺激と内部状態の変化に基づく活動度レベルの評価により、スキーマＤｉａｌｏｇに対してＡというターゲット（Ｐｒｏｎｏｍｅ）が設定されて、Ｄｉａｌｏｇが人物Ａとの対話を実行するようになったとする。
【０２７１】
その後、人物Ｂがロボット１と人物Ａとの対話に割り込み、外部刺激と内部状態の変化に基づく活動度レベルの評価を行なった結果、Ｂとの対話を行なうスキーマの方がより優先度が高くなったとする。
【０２７２】
このような場合、ＳｃｈｅｍａＨａｎｄｌｅｒは、Ｂとの対話を行なうためのクラス継承した別のＤｉａｌｏｇエンティティ（インスタンス）をメモリ空間上にマッピングする。別のＤｉａｌｏｇエンティティを使用して、先のＤｉａｌｏｇエンティティとは独立して、Ｂとの対話を行なうことから、Ａとの対話内容は破壊されずに済む。したがって、ＤｉａｌｏｇＡはデータの一貫性を保持することができ、Ｂとの対話が終了すると、Ａとの対話を中断した時点から再開することができる。
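The one-instance-per-target idea can be sketched as below. `SchemaPool` is a hypothetical stand-in for the part of SchemaHandler that maps schema instances into memory, and `Dialog` is reduced to a list of utterances; both are assumptions for illustration.

```python
class Dialog:
    def __init__(self, target):
        self.pronome = target   # the person this instance talks to
        self.history = []       # working data private to this instance

    def say(self, text):
        self.history.append(text)


class SchemaPool:
    """Creates one schema instance per target and reuses it afterwards."""

    def __init__(self, schema_cls):
        self.schema_cls = schema_cls
        self.instances = {}

    def for_target(self, target):
        if target not in self.instances:
            self.instances[target] = self.schema_cls(target)
        return self.instances[target]
```

Since the instance for A is never touched while the robot talks to B, the dialogue with A can be resumed from where it stopped.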
【０２７３】
Ｒｅａｄｙリスト内のスキーマは、その対象物（外部刺激）に応じて評価すなわちＡＬ値の計算が行なわれ、実行権が引き渡される。その後、Ｒｅａｄｙリスト内に移動したスキーマのインスタンスを生成して、これ以外の対象物に対して評価を行なう。これにより、同一のスキーマをａｃｔｉｖｅ又はｓｌｅｅｐ状態にすることができる。
【０２７４】
Ｅ．ロボットの内部状態管理
本実施形態に係るロボットの行動制御システム１００では、状況依存行動階層１０８は内部状態と外部環境によって行動を決定する。
【０２７５】
ロボット装置１の内部状態は、本能や感情といった数種類の情動で構成され、数式モデル化して扱われる。内部状態管理部（ＩＳＭ：ＩｎｔｅｒｎａｌＳｔａｔｕｓＭａｎａｇｅｒ）１０４は、上述した各認識機能部１０１〜１０３によって認識された外部刺激（ＥＳ：ＥｘｔｅｒｎａｌＳｔｉｍｕｌａ）と、時間的経過に基づいて、内部状態を管理する。
【０２７６】
Ｅ−１．情動の階層化
本実施形態では、情動についてその存在意義による複数階層で構成され、それぞれの階層で動作する。決定された複数の動作から、そのときの外部環境や内部状態によってどの動作を行なうかを決定するようになっている（後述）。また、それぞれの階層で行動は選択されるが、より低次の行動から優先的に動作を発現していくことにより、反射などの本能的行動や、記憶を用いた動作選択などの高次の行動を１つの個体上で矛盾なく発現することができる。
【０２７７】
図３６には、本実施形態に係る内部状態管理部１０４の階層的構成を模式的に示している。
【０２７８】
図示の通り、内部状態管理部１０４は、情動などの内部情報を、情動を本能や欲求などの個体存続に必要な１次情動と、この１次情動の満足度（過不足）によって変化する２次情動に大別する。また、１次情動は、個体存続においてより生理的なものから連想に至るものまで階層的に細分化されている。
【０２７９】
図示の例では、１次情動は、低次から高次に向かって、下位の１次情動、上位の１次情動、連想による１次情動に区分される。下位の１次情動は、大脳辺縁系へのアクセスに相当し、ホメオスタシス（個体維持）が保たれるように情動発生するとともに、ホメオスタシスが脅かされる場合には優先される。また、上位の１次情動は、大脳新皮質へのアクセスに相当し、内発的欲求や社会的欲求などの種族維持に関わる。上位の１次情動は、学習や環境に依って満足度が変化する（学習やコミュニケーションにより満足される）。
【０２８０】
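Each layer of the primary emotions outputs the amount of change ΔI in the primary emotion (instinct) level that results from executing the selected schema.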
【０２８１】
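The secondary emotions correspond to what are called emotions (Emotion) and consist of elements such as joy (Joy), sadness (Sad), anger (Anger), surprise (Surprise), disgust (Disgust), and fear (Fear). The amount of change (satisfaction) ΔE of the secondary emotions is determined according to the amount of change ΔI of the primary emotions.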
【０２８２】
状況依存行動階層１０８では、主に１次情動を基に行動選択を行なうが、２次情動が強い場合には、２次情動に基づく行動選択を行なうこともできる。さらに、１次情動を基に選択された行動に対して２次情動により生成されたパラメータを使用してモジュレーションを行なうことも可能である。
【０２８３】
個体存続のための情動階層は、生得的反射による行動がまず選択される。次いで、下位の１次情動を満たす行動を選択する。そして、上位の１次情動を満たす行動発生、連想による１次情動を満たす行動発生と、よりプリミティブな個体保持から実現する。
【０２８４】
この際、各階層の１次情動は、直近の階層に対して圧力をかけることができる。自身で決定した行動を選択するための指標が強い場合、直近の階層で決定された行動を抑制して、自身の行動を発現することができる。
【０２８５】
前項Ｄでも述べたように、状況依存行動階層１０８は、目標とする動作を持った複数のスキーマによって構成されている（図１８、図１９などを参照のこと）。状況依存行動階層１０８では、各スキーマが持つ活動度レベルを指標にしてスキーマすなわち行動を選択する。内部状態の活動度レベルと外部状況の活動度レベルによりスキーマ全体の活動度レベルが決定する。スキーマは、目標とする動作を実行するための途中経過毎に、活動度レベルを保持する。○○を満たす行動発生とは、○○を満たす行動が最終目標であるスキーマを実行することに相当する。
【０２８６】
内部状態の活動度レベルは、スキーマを実行したときの１次情動における階層毎の変化量ΔＩに基づく２次情動の満足度の変化ΔＥの総和によって決定される。ここで、１次情動がＬ１，Ｌ２，Ｌ３の３階層からなり、スキーマ選択時の１次情動の各階層に由来する２次情動の変化をそれぞれΔＥ_Ｌ１，ΔＥ_Ｌ２，ΔＥ_Ｌ３，とすると、それぞれに重み因子ｗ_１，ｗ_２，ｗ_３を掛けて活動度レベルを算出する。下位の１次情動に対する重み因子をより大きくすることにより、下位の１次情動を満たす行動がより選択され易くなる。また、これら重み因子を調整することにより、各階層の１次情動が直近の階層に対して圧力をかける（Ｃｏｎｃｅｎｔｒａｔｉｏｎ：行動抑制）という作用を得ることができる。
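Read literally, the internal-state activation level is a weighted sum, w_1·ΔE_L1 + w_2·ΔE_L2 + w_3·ΔE_L3. A minimal sketch of that reading follows; the function name and the numeric values in the usage note are assumptions, not values from the text.

```python
def internal_activation(delta_e, weights):
    """Weighted sum of the secondary-emotion changes per primary-emotion layer.

    delta_e, weights: sequences ordered from the lowest layer L1 upward.
    """
    if len(delta_e) != len(weights):
        raise ValueError("one weight per layer is required")
    return sum(w * e for w, e in zip(weights, delta_e))
```

With hypothetical weights (3, 2, 1), a schema that yields ΔE = 10 on the lowest layer scores 30, while one that yields the same ΔE on the middle layer scores 20, so the behavior that satisfies the lower primary emotion is preferred.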
【０２８７】
ここで、情動の階層化構造を利用した行動選択の実施例について説明する。但し、以下では下位の１次情動としてＳｌｅｅｐ（眠気）を、上位の１次情動としてＣｕｒｉｏｓｉｔｙ（好奇心）を扱う。
【０２８８】
（１）下位の１次情動であるＳｌｅｅｐが不足してきて、Ｓｌｅｅｐを満たすスキーマの活動度レベルが高まってきたとする。このとき、他のスキーマの活動度レベルが上がらなければ、Ｓｌｅｅｐを満たすスキーマは、Ｓｌｅｅｐが満たされるまで自身を実行する。
【０２８９】
（２）Ｓｌｅｅｐが満たされる前に、上位の１次情動であるＣｕｒｉｏｓｉｔｙが不足してきたとする。しかし、Ｓｌｅｅｐのほうが個体維持に直結するため、Ｓｌｅｅｐの活動度レベルが一定値以下になるまでは、Ｓｌｅｅｐを満たすスキーマが実行し続ける。そして、Ｓｌｅｅｐがある程度満たされたら、Ｃｕｒｉｏｓｉｔｙを満たすスキーマを実行することができる。
【０２９０】
（３）Ｃｕｒｉｏｓｉｔｙを満たすスキーマ実行中に手を勢いよくロボットの顔面に近づけたとする。これに応答して、ロボットは色認識と大きさ認識による突然肌色が近づいてきたことが判り、生得的な反射行動として手から顔を避ける、すなわち後ろに頭を引くという動作を反射的に行なう。この反射的な動作は動物の脊髄反射に相当する。反射は、最も下位にあるスキーマなので、反射スキーマがまず実行される。
【０２９１】
脊髄反射の後、それに伴う情動変化が起き、その変化幅と他のスキーマの活動度レベルから、続いて情動表出スキーマを行なうかどうかを決定する。情動表出スキーマが行なわれていない場合は、Ｃｕｒｉｏｓｉｔｙを満たすスキーマが続行される。
【０２９２】
（４）あるスキーマ自身の下位にあるスキーマは通常自身より選択される可能性が高いが、自身の活動度レベルが極端に高いときに限り、下位のスキーマを抑制して（Ｃｏｎｃｅｎｔｒａｔｉｏｎ）、一定値まで自身を実行することが可能である。Ｓｌｅｅｐの不足が著しいときは、反射行動スキーマの行動を出したいときであっても、一定値に回復するまではＳｌｅｅｐを満たすスキーマが優先的に実行される。
【０２９３】
Ｅ−２．他の機能モジュールとの連携
図３７には、内部状態管理部１０４と他の機能モジュールとの通信経路を模式的に示している。
【０２９４】
短期記憶部１０５は、外部環境の変化を認識する各認識機能部１０１〜１０３からの認識結果を、内部状態管理部１０４と状況依存行動階層１０８に出力する。
内部状態管理部１０４は、状況依存行動階層１０８に内部状態を通知する。これに対し、状況依存行動階層１０８は、連想又は決定した本能や感情の情報を返す。
【０２９５】
また、状況依存行動階層１０８は、内部状態と外部環境から算出される活動度レベルを基に行動を選択するとともに、選択した行動の実行と完了を短期記憶部１０５経由で内部状態管理部１０４に通知する。
【０２９６】
内部状態管理部１０４は、行動毎に内部状態を長期記憶部１０６に出力する。これに対し、長期記憶部１０６は、記憶情報を返す。
【０２９７】
バイオリズム管理部は、バイオリズム情報を内部状態管理部１０４に供給する。
【０２９８】
Ｅ−３．時間経過による内部状態の変化
内部状態の指標は時間経過により変化する。例えば、１次情動すなわち本能であるＨｕｎｇｅｒ（空腹感）、Ｆａｔｉｇｕｅ（疲労）、Ｓｌｅｅｐ（眠気）は、時間経過によりそれぞれ以下のように変化する。
【０２９９】
Ｈｕｎｇｅｒ：おなかが減る（仮想値又はバッテリ残量）
Ｆａｔｉｇｕｅ：疲れがたまる
Ｓｌｅｅｐ：眠気がたまる
【０３００】
また、本実施形態では、ロボットの２次情動すなわち感情（Ｅｍｏｔｉｏｎ）の要素としてＰｌｅａｓａｎｔｎｅｓｓ（満足度），Ａｃｔｉｖａｔｉｏｎ（活動度），Ｃｅｒｔａｉｎｔｙ（確信度）を定義しているが、時間経過によりそれぞれ以下のように変化する。
【０３０１】
Ｐｌｅａｓａｎｔｎｅｓｓ：Ｎｅｕｔｒａｌ（中立）に向かって変化する
Ａｃｔｉｖａｔｉｏｎ：バイオリズムやＳｌｅｅｐ（眠気）に依存する
Ｃｅｒｔａｉｎｔｙ：Ａｔｔｅｎｔｉｏｎに依存する
【０３０２】
図３８には、内部状態管理部１０４が時間変化に伴って内部状態を変化させるための仕組みを示している。
【０３０３】
図示のように、バイオリズム管理部は、一定の周期でバイオリズム情報を通知する。これに対し、内部状態管理部１０４は、バイオリズムにより１次情動の各要素の値を変更するとともに、２次情動であるＡｃｔｉｖａｔｉｏｎ（活動度）を変動させる。そして、状況依存行動階層１０８は、バイオリズム管理部からの通知がある度に、内部状態管理部１０４から本能や感情など内部状態の指標値を受け取るので、内部状態を基に各スキーマの活動度レベルを算出することにより、状況に依存した行動（スキーマ）を選択することができる。
【０３０４】
Ｅ−４．動作実行による内部状態の変化
内部状態は、ロボットが動作を実行することによっても変化する。
【０３０５】
例えば、「眠る」という行動を行なうスキーマは、下位の１次情動としてのＳｌｅｅｐ（眠気）を満たす行動が最終目標としている。状況依存行動階層１０８では、１次情動としてのＳｌｅｅｐと２次情動としてのＡｃｔｉｖａｔｉｏｎを基に各スキーマの活動度レベルを算出・比較して、「眠る」スキーマを選択し、この結果、眠るという行動が実現される。
【０３０６】
一方、状況依存行動階層１０８は、眠るという行動の実行完了を短期記憶部１０５経由で内部状態管理部１０４に伝達する。これに対し、内部状態管理部１０４は、「眠る」行動の実行により、１次情動であるＳｌｅｅｐの指標値を変更する。
【０３０７】
そして、状況依存行動階層１０８では、Ｓｌｅｅｐが満たされた度合いと２次情動としてのＡｃｔｉｖａｔｉｏｎを基に各スキーマの活動度レベルを改めて算出・比較する。この結果、優先度が高くなった他のスキーマを選択し、眠るというスキーマから抜ける。
【０３０８】
図３９には、内部状態管理部１０４がロボットの動作実行により内部状態を変化させるための仕組みを示している。
【０３０９】
状況依存行動階層１０８は、状況依存型で選択された行動の実行開始及び実行終了、並びにＡｔｔｅｎｔｉｏｎ情報を、短期記憶部１０５経由で内部状態管理部１０４に通知する。
【０３１０】
内部状態管理部１０４は、選択された行動の実行完了情報が通知されると、Ａｔｔｅｎｔｉｏｎ情報に則って、短期記憶部１０５から得た外部環境を確認して、１次情動としての本能（Ｓｌｅｅｐ）の指標値を変更するとともに、これに伴って２次情動としての感情も変更する。そして、これら内部状態の更新データを、状況依存行動階層１０８並びに長期記憶部１０６に出力する。
状況依存行動階層１０８では、新たに受け取った内部状態の指標値を基に、各スキーマの活動度レベルを算出して、状況に依存した次の行動（スキーマ）を選択する。
【０３１１】
また、長期記憶部１０６は、内部状態の更新データを基に記憶情報を更新するとともに、更新内容を内部状態管理部１０４に通知する。内部状態管理部１０４では、外部環境に対する確信度と長期記憶部１０６の確信度により、２次情動としての確信度（Ｃｅｒｔａｉｎｔｙ）を決定する。
【０３１２】
Ｅ−５．センサ情報による内部状態の変化
ロボットが動作を実行したときのその動作程度は、各認識機能部１０１〜１０３によって認識され、短期記憶部１０５経由で内部状態管理部１０４に通知される。内部状態管理部１０４は、この動作程度を例えばＦａｔｉｇｕｅ（疲労）として１次情動の変化に反映させることができる。また、この１次情動の変化に応答して、２次情動も変化させることができる。
【０３１３】
図４０には、内部状態管理部１０４が外部環境の認識結果により内部状態を変化させるための仕組みを示している。
【０３１４】
内部状態管理部１０４は、短期記憶部１０５経由で各認識機能部１０１〜１０３による認識結果を受け取ると、１次情動の指標値を変更するとともに、これに伴って２次情動としての感情も変更する。そして、これら内部状態の更新データを、状況依存行動階層１０８に出力する。
【０３１５】
状況依存行動階層１０８では、新たに受け取った内部状態の指標値を基に、各スキーマの活動度レベルを算出して、状況に依存した次の行動（スキーマ）を選択することができる。
【０３１６】
Ｅ−６．連想による内部状態の変化
既に述べたように、本実施形態に係るロボットは、長期記憶部１０６において連想記憶機能を備えている。この連想記憶は、あらかじめ複数のシンボルからなる入力パターンを記憶パターンとして記憶しておき、その中のある１つのパターンに類似したパターンが想起される仕組みのことであり、外部刺激から内部状態の変化を連想記憶することができる。
【０３１７】
例えば、りんごが見えた場合に「嬉しい」という情動の変化を起こす場合について考察してみる。
【０３１８】
りんごが視覚認識機能部１０１において認識されると、短期記憶部１０５を経由して状況依存行動階層１０８に外部環境の変化として通知される。
【０３１９】
長期記憶部１０６では、「りんご」に関する連想記憶により、「（りんごを）食べる」という行動と、食べることにより１次情動（空腹感）が指標値で３０だけ満たされるという内部状態の変化を想起することができる。
【０３２０】
状況依存行動階層１０８は、長期記憶部１０６から記憶情報を受け取ると、内部状態の変化ΔＩ＝３０を、内部状態管理部１０４に通知する。
【０３２１】
内部状態管理部１０４では、通知されたΔＩを基に、２次情動の変化量ΔＥを算出して、りんごを食べることによる２次情動Ｅの指標値を得ることができる。
【０３２２】
図４１には、内部状態管理部１０４が連想記憶により内部状態を変化させるための仕組みを示している。
【０３２３】
外部環境が短期記憶部１０５を経由して状況依存行動階層１０８に通知される。長期記憶部１０６の連想記憶機能により、外部環境に応じた行動と、１次情動の変化ΔＩを想起することができる。
【０３２４】
状況依存行動階層１０８は、この連想記憶により得られた記憶情報を基に行動を選択するとともに、１次情動の変化ΔＩを内部状態管理部１０４に通知する。
【０３２５】
内部状態管理部１０４では、通知を受けた１次情動の変化ΔＩと、自身で管理している１次情動の指標値とを基に、２次情動の変化ΔＥを算出して、２次情動を変化させる。そして、新たに生成された１次情動及び２次情動を、内部状態更新データとして状況依存行動階層１０８に出力する。
【０３２６】
状況依存行動階層１０８では、新たに受け取った内部状態の指標値を基に、各スキーマの活動度レベルを算出して、状況に依存した次の行動（スキーマ）を選択することができる。
【０３２７】
Ｅ−７．生得的な行動による内部状態の変化
本実施形態に係るロボットが動作実行により内部状態を変化させることは既に述べた通りである（図３９を参照のこと）。この場合、１次情動と２次情動からなる内部状態の指標値を基に行動が選択されるとともに、行動の実行完了により情動が満たされる。他方、本実施形態に係るロボットは、情動に依存しない、生得的な反射行動も規定されている。この場合、外部環境の変化に応じて反射行動が直接選択されることになり、通常の動作実行による内部変化とは異なる仕組みとなる。
【０３２８】
例えば、大きなものが突然現れたときに生得的な反射行動をとる場合について考察してみる。
【０３２９】
このような場合、例えば視覚的認識機能部１０１による「大きいもの」という認識結果（センサ情報）は、短期記憶部１０５を介さず、状況依存行動階層１０８に直接入力される。
【０３３０】
状況依存行動階層１０８では、「大きいもの」という外部刺激により各スキーマの活動度レベルを算出して、適当な行動を選択する（図１５、図２５及び図２６を参照のこと）。この場合、状況依存行動階層１０８では、「よける」という脊髄反射的行動を選択するとともに、「驚く」という２次情動を決定して、これを内部状態管理部１０４に通知する。
【０３３１】
内部状態管理部１０４では、状況依存行動階層１０８から送られてきた２次情動を自身の感情として出力する。
【０３３２】
図４２には、内部状態管理部１０４が生得的反射行動により内部状態を変化させるための仕組みを示している。
【０３３３】
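When an innate reflexive behavior is performed, the sensor information from the recognition function units 101 to 103 is input directly to the situated behaviors layer 108 without passing through the short-term memory unit 105.
The situated behaviors layer 108 calculates the activation level of each schema from the external stimulus obtained as sensor information, selects an appropriate behavior, determines the secondary emotion, and notifies the internal state management unit 104 of it.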
【０３３４】
内部状態管理部１０４では、状況依存行動階層１０８から送られてきた２次情動を自身の感情として出力する。また、状況依存行動階層１０８からのＡｃｔｉｖａｔｉｏｎに対して、バイオリズムの高低によって最終的なＡｃｔｉｖａｔｉｏｎを決定する。
【０３３５】
状況依存行動階層１０８では、新たに受け取った内部状態の指標値を基に、各スキーマの活動度レベルを算出して、状況に依存した次の行動（スキーマ）を選択することができる。
【０３３６】
Ｅ−８．スキーマと内部状態管理部との関係
状況依存行動階層１０８は、複数のスキーマで構成され、各スキーマ毎に外部刺激や内部状態の変化によって活動度レベルを算出して、活動度レベルの度合いに応じてスキーマを選択して行動を実行する（図１８、図１９、図２５を参照のこと）。
【０３３７】
図４３には、スキーマと内部状態管理部との関係を模式的に示している。
スキーマは、ＤＳｕｂｊｅｃｔやＤＯｂｊｅｃｔなどのプロキシを介して、短期記憶部１０５、長期記憶部１０６、内部状態管理部１０４などの外部オブジェクトと通信することができる（図３０を参照のこと）。
【０３３８】
スキーマは、外部刺激や内部状態の変化によって活動度レベルを算出するクラス・オブジェクトを備えている。ＲＭ（ＲｅｓｏｕｒｃｅＭａｎａｇｅｍｅｎｔ）オブジェクトは、プロキシを介して短期記憶部１０５に通信して、外部環境を取得して、外部環境に基づく活動度レベルを算出する。また、Ｍｏｔｉｖａｔｉｏｎ算出クラス・オブジェクトは、プロキシを介して長期記憶部１０６並びに内部状態管理部１０４と通信して、内部状態の変化量を取得して、内部状態に基づく活動度レベルすなわちＭｏｔｉｖａｔｉｏｎを算出する。Ｍｏｔｉｖａｔｉｏｎの算出方法に関しては後に詳解する。
【０３３９】
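As already described, the internal state management unit 104 is layered in stages into primary emotions and secondary emotions. The primary emotions are further layered dimensionally into a primary emotion layer based on innate reactions, primary emotions based on homeostasis, and primary emotions based on association (see FIG. 36). The emotions serving as secondary emotions are mapped onto three elements: P (Pleasantness), A (Activity), and C (Certainty).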
【０３４０】
１次情動の各階層における変化ΔＩはすべて２次情動に入力されて、Ｐｌｅａｓａｎｔｎｅｓｓの変化ΔＰの算出に利用される。
【０３４１】
Ａｃｔｉｖｉｔｙは、センサ入力、動作時間、バイオリズムなどの情報から統合的に判断される。
【０３４２】
また、選択されたスキーマの確信度を、実際の２次情動階層における確信度として使用する。
【０３４３】
図４４には、Ｍｏｔｉｖａｔｉｏｎ算出クラス・オブジェクトによるＭｏｔｉｖａｔｉｏｎ算出経路を模式的に示している。
【０３４４】
ＲＭクラス・オブジェクトは、プロキシ経由で短期記憶部１０５にアクセスして、センサ情報を取得し、認識された対象物の距離や大きさなどの刺激の強さに基づいて外部刺激による活動度レベルを評価する。
【０３４５】
一方、Ｍｏｔｉｖａｔｉｏｎ算出クラス・オブジェクトは、プロキシ経由で短期記憶部１０５にアクセスして、対象物に関する特徴を取得して、さらにプロキシ経由で長期記憶部１０６の対象物の特徴を問い合わせて内部状態の変化を取得する。そして、プロキシ経由で内部状態管理部１０４にアクセスして、ロボット内部にある内部評価値を算出する。したがって、Ｍｏｔｉｖａｔｉｏｎの算出は、外部刺激の強さには無関係である。
【０３４６】
本実施形態に係るロボットの行動制御システムが連想記憶を用いて外部刺激から内部状態の変化を想起することにより、２次情動を算出して行動選択を行なう、ということは既に述べた（図４１を参照のこと）。さらに、連想記憶を用いることにより、対象物毎に異なる内部状態の変化を想起させることができる。これによって、同じ状況でもその行動の発現し易さを異ならせることができる。すなわち、外部の刺激や物理的状況、現在の内部状態に加え、ロボットの対象物ごとの記憶を考慮して行動を選択することができ、より多彩で多様化した対応を実現することができる。
【０３４７】
例えば、「○○が見えているから××する」とか、「現在○○が不足だから（何に対しても）××する」などの外部環境又は内部状態によって決まった行動をするのではなく、「○○が見えても△△なので□□する」とか、「○○が見えているけど××なので■■する」など、対象物に関する内部状態の変化記憶を用いることにより、行動にバリエーションをつけることができる。
【０３４８】
図４５には、対象物が存在するときのＭｏｔｉｖａｔｉｏｎ算出処理のメカニズムを模式的に示している。
【０３４９】
まず、プロキシ経由で短期記憶部１０５にアクセスして、認識機能部１０１〜１０３により認識されたターゲットの特徴を尋ねる。
【０３５０】
次いで、取り出した特徴を用いて、今度はプロキシ経由で長期記憶部１０６にアクセスして、その特徴の対象物がスキーマに関係した欲求をどのように変化させるか、すなわち１次情動の変化ΔＩを獲得する。
【０３５１】
次いで、プロキシ経由で内部状態管理部１０４にアクセスして、欲求の変化により快不快の値がどのように変化するか、すなわち２次情動の変化ΔＰｌｅａｓａｎｔを引き出す。
【０３５２】
そして、２次情動の変化ΔＰｌｅａｓａｎｔと対象物の確信度を引数とする以下のＭｏｔｉｖａｔｉｏｎ算出関数ｇ_{ｔａｒｇｅｔ−ｉ}により、ｉ番目のＭｏｔｉｖａｔｉｏｎを算出する。
【０３５３】
【数９】

【０３５４】
また、図４６には、対象物が存在しないときのＭｏｔｉｖａｔｉｏｎ算出処理のメカニズムを模式的に示している。
【０３５５】
この場合、まず、行動に対する記憶に対して、その行動による欲求の変化ΔＩを尋ねる。
【０３５６】
次いで、取得したΔＩを用いて、内部状態管理部１０４により１次情動がΔＩだけ変化したときの２次情動の変化ΔＰｌｅａｓａｎｔを引き出す。そして、この場合は、２次情動の変化ΔＰｌｅａｓａｎｔを引数とする以下のＭｏｔｉｖａｔｉｏｎ算出関数ｇ_{ｎｏｔｔａｒｇｅｔ−ｉ}により、ｉ番目のＭｏｔｉｖａｔｉｏｎを算出する。
【０３５７】
【数１０】

【０３５８】
Ｅ−９．２次情動の各要素の変更方法
図４７には、２次情動のうちのＰｌｅａｓａｎｔｎｅｓｓを変更するためのメカニズムを図解している。
【０３５９】
長期記憶部１０６は、記憶の量による１次情動の変化を内部状態管理部１０４に入力する。また、短期記憶部１０５は、認識機能部１０１〜１０３からのセンサ入力による１次情動の変化を内部状態管理部１０４に入力する。
【０３６０】
また、スキーマは、スキーマ実行による１次情動の変化（Ｎｏｕｒｉｓｈｍｅｎｔ，Ｍｏｉｓｔｕｒｅ，Ｓｌｅｅｐ）や、スキーマの内容による１次情動の変化（Ａｆｆｅｃｔｉｏｎ）を内部状態管理部１０４に入力する。
【０３６１】
Ｐｌｅａｓａｎｔｎｅｓｓは、１次情動の過不足の変化に応じて決定される。
【０３６２】
また、図４８には、２次情動のうちのＡｃｔｉｖｉｔｙを変更するためのメカニズムを図解している。
【０３６３】
Ａｃｔｉｖｉｔｙは、スキーマのＳｌｅｅｐ以外の時間の総和と、バイオリズムと、センサ入力を基に、統合的に判断される。
【０３６４】
また、図４９には、２次情動のうちのＣｅｒｔａｉｎｔｙを変更するためのメカニズムを図解している。
【０３６５】
長期記憶部１０６に対して対象物を尋ねると、Ｃｅｒｔａｉｎｔｙが返される。どの１次情動に着目するかは、そのスキーマの目標とする行動に依存する。そして、引き出されたＣｅｒｔａｉｎｔｙがそのまま内部状態管理部１０４の２次情動におけるＣｅｒｔａｉｎｔｙとなる。
【０３６６】
図５０には、Ｃｅｒｔａｉｎｔｙを求めるためのメカニズムを模式的に示している。
【０３６７】
長期記憶部１０６では、対象物に関する認識結果や情動などの各項目の確からしさを、スキーマ毎に記憶している。
【０３６８】
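The schema asks the long-term memory unit 106 for the certainty value of the memory related to the schema. In response, the long-term memory unit 106 gives the certainty of the memory related to the schema as the certainty of the object.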
【０３６９】
［追補］
以上、特定の実施形態を参照しながら、本発明について詳解してきた。しかしながら、本発明の要旨を逸脱しない範囲で当業者が該実施形態の修正や代用を成し得ることは自明である。
【０３７０】
本発明の要旨は、必ずしも「ロボット」と称される製品には限定されない。すなわち、電気的若しくは磁気的な作用を用いて人間の動作に似せた運動を行なう機械装置であるならば、例えば玩具等のような他の産業分野に属する製品であっても、同様に本発明を適用することができる。
【０３７１】
要するに、例示という形態で本発明を開示してきたのであり、本明細書の記載内容を限定的に解釈するべきではない。本発明の要旨を判断するためには、冒頭に記載した特許請求の範囲の欄を参酌すべきである。
【０３７２】
【発明の効果】
本発明によれば、自律的な動作を行ないユーザとのリアリスティックなコミュニケーションを実現することができる、優れたロボットの行動制御システム及び行動制御方法、並びにロボット装置を提供することができる。
【０３７３】
また、本発明によれば、視覚や聴覚などの外部環境の認識結果や本能や感情などの内部状態などのロボットが置かれている状況を統合的に判断して行動を選択することができる、優れたロボットの行動制御システム及び行動制御方法、並びにロボット装置を提供することができる。
【０３７４】
また、本発明によれば、視覚や聴覚などの外部環境の認識結果や本能や感情などの内部状態などのロボットが置かれている状況を統合的に判断して行動を選択することができる、優れたロボットの行動制御システム及び行動制御方法、並びにロボット装置を提供することができる。
【０３７５】
また、本発明によれば、情動についての存在意義をより明確にして、一定の秩序の下で外部刺激や内部状態に応じた行動を好適に選択し実行することができる、優れたロボットの行動制御システム及び行動制御方法、並びにロボット装置を提供することができる。
【０３７６】
本発明によれば、情動についてその存在意義による複数階層化を行ない、それぞれの階層で動作を決定する。決定された複数の動作から、そのときの外部刺激や内部状態によってどの動作を行なうかを決定する。それぞれの階層で行動は選択されるが、その実施される順番はロボットの内部状態の優先順位に基づくため、より低次の行動から優先的に動作を発現していくことにより、反射などの本能的行動や、記憶を用いた動作選択などの高次の行動を１つの個体上で矛盾なく発現することができる。また、行動をカテゴライズして、スキーマとして作成する際も明確な指標となる。
【０３７７】
また、本発明に係るロボットの行動制御システム又は行動制御方法によれば、連想記憶を用いることにより、対象物毎に異なる内部状態の変化を想起することができるので、同じ状況でもその行動の発現し易さを異ならせることができる。すなわち、外部の刺激や物理的状況、現在の内部状態に加え、ロボットの対象物ごとの記憶を考慮して行動を選択することができ、より多彩で多様化した対応を実現することができる。
【０３７８】
例えば、「○○が見えているから××する」とか、「現在○○が不足だから（何に対しても）××する」などの外部環境又は内部状態によって決まった行動をするのではなく、「○○が見えても△△なので□□する」とか、「○○が見えているけど××なので■■する」など、対象物に関する内部状態の変化記憶を用いることにより、行動にバリエーションを付けることができる。
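[Brief description of the drawings]
[FIG. 1] A diagram schematically showing the functional configuration of the robot apparatus 1 used to practice the present invention.
[FIG. 2] A diagram showing the configuration of the control unit 20 in more detail.
[FIG. 3] A diagram schematically showing the functional configuration of the behavior control system 100 of the robot apparatus 1 according to an embodiment of the present invention.
[FIG. 4] A diagram showing the flow of operations performed by the objects that make up the behavior control system 100 shown in FIG. 3.
[FIG. 5] A diagram showing the flow of information that enters the target memory in the short-term memory unit 105 on the basis of the recognition results of the recognition function units 101 to 103.
[FIG. 6] A diagram showing the flow of information that enters the event memory in the short-term memory unit 105 on the basis of the recognition results of the recognition function units 101 to 103.
[FIG. 7] A diagram for explaining dialogue processing between the robot 1 and users A and B.
[FIG. 8] A diagram for explaining dialogue processing between the robot 1 and users A and B.
[FIG. 9] A diagram for explaining dialogue processing between the robot 1 and users A and B.
[FIG. 10] A diagram conceptually showing the storage process of associative memory according to an embodiment of the present invention.
[FIG. 11] A diagram conceptually showing the recall process of associative memory according to an embodiment of the present invention.
[FIG. 12] A diagram schematically showing a configuration example of an associative memory system to which a competitive neural network is applied.
[FIG. 13] A diagram schematically showing the object configuration of the behavior control system 100 according to an embodiment of the present invention.
[FIG. 14] A diagram schematically showing the form of situation-dependent behavior control by the situated behaviors layer 108.
[FIG. 15] A diagram showing a basic operation example of behavior control by the situated behaviors layer 108 shown in FIG. 14.
[FIG. 16] A diagram showing an operation example in which the situated behaviors layer 108 shown in FIG. 14 performs reflexive behavior.
[FIG. 17] A diagram showing an operation example in which the situated behaviors layer 108 shown in FIG. 14 performs emotional expression.
[FIG. 18] A diagram schematically showing how the situated behaviors layer 108 is made up of a plurality of schemas.
[FIG. 19] A diagram schematically showing the tree structure of schemas in the situated behaviors layer 108.
[FIG. 20] A diagram schematically showing the internal configuration of a schema.
[FIG. 21] A diagram schematically showing the internal configuration of the Monitor function.
[FIG. 22] A diagram schematically showing a configuration example of the behavior state control unit.
[FIG. 23] A diagram schematically showing another configuration example of the behavior state control unit.
[FIG. 24] A diagram schematically showing the mechanism for controlling ordinary situation-dependent behavior in the situated behaviors layer 108.
[FIG. 25] A diagram schematically showing the configuration of schemas in the reflexive behavior unit 109.
[FIG. 26] A diagram schematically showing the mechanism by which the reflexive behavior unit 109 controls reflexive behavior.
[FIG. 27] A diagram schematically showing the class definitions of the schemas used in the situated behaviors layer 108.
[FIG. 28] A diagram showing the state machine of the action function of a schema.
[FIG. 29] A diagram showing the state machine of a schema.
[FIG. 30] A diagram schematically showing the functional configuration of the classes in the situated behaviors layer 108.
[FIG. 31] A flowchart showing the procedure for executing the MakePronome function.
[FIG. 32] A flowchart showing the procedure for executing the Monitor function.
[FIG. 33] A flowchart showing the procedure for executing the Actions function.
[FIG. 34] A flowchart showing the procedure for executing the Actions function.
[FIG. 35] A diagram for explaining the reentrancy of schemas.
[FIG. 36] A diagram schematically showing the hierarchical configuration of the internal state management unit 104 according to the embodiment.
[FIG. 37] A diagram schematically showing the communication paths between the internal state management unit 104 and other functional modules.
[FIG. 38] A diagram showing the mechanism by which the internal state management unit 104 changes the internal state over time.
[FIG. 39] A diagram showing the mechanism by which the internal state management unit 104 changes the internal state as the robot executes motions.
[FIG. 40] A diagram showing the mechanism by which the internal state management unit 104 changes the internal state according to the recognition results for the external environment.
[FIG. 41] A diagram showing the mechanism by which the internal state management unit 104 changes the internal state through associative memory.
[FIG. 42] A diagram showing the mechanism by which the internal state management unit 104 changes the internal state through innate reflexive behavior.
[FIG. 43] A diagram schematically showing the relationship between schemas and the internal state management unit.
[FIG. 44] A diagram schematically showing the Motivation calculation path of the Motivation calculation class object.
[FIG. 45] A diagram schematically showing the mechanism of Motivation calculation when an object is present.
[FIG. 46] A diagram schematically showing the mechanism of Motivation calculation when no object is present.
[FIG. 47] A diagram showing how Pleasantness is changed.
[FIG. 48] A diagram showing how Activity is changed.
[FIG. 49] A diagram showing how Certainty is changed.
[FIG. 50] A diagram showing the mechanism for obtaining Certainty.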
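[Explanation of reference numerals]
1: robot apparatus
15: CCD camera
16: microphone
17: speaker
18: touch sensor
19: LED indicator
20: control unit
21: CPU
22: RAM
23: ROM
24: nonvolatile memory
25: interface
26: wireless communication interface
27: network interface card
28: bus
29: keyboard
40: input/output unit
50: drive unit
51: motor
52: encoder
53: driver
100: behavior control system
101: visual recognition function unit
102: auditory recognition function unit
103: contact recognition function unit
105: short-term memory unit
106: long-term memory unit
107: deliberative behavior layer
108: situated behaviors layer
109: reflexive behavior unit
[0001]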
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a behavior control system and a behavior control method for a robot that operates autonomously and communicates realistically with a user, and to a robot apparatus. In particular, the present invention relates to a behavior control system, a behavior control method, and a robot apparatus for a situation-dependent behavior type robot that selects an appropriate behavior by comprehensively judging the situation in which the robot is placed, including the results of recognizing the external environment through vision, hearing, and the like, and internal states such as instinct and emotion.
[0002]
[Prior art]
A mechanical device that uses electrical or magnetic action to perform motions resembling those of a human is called a "robot". The word "robot" is said to derive from the Slavic word "ROBOTA" (slave machine). In Japan, robots began to spread at the end of the 1960s, but most of them were industrial robots, such as manipulators and transfer robots, intended to automate factory production work and make it unmanned.
[0003]
Recently, research and development has progressed on the structure and stable walking control of legged mobile robots, and expectations for their practical use are rising. Such robots include pet-type robots that imitate the body mechanisms and movements of four-legged animals such as dogs, cats, and bears, and "humanoid" robots that imitate the body mechanisms and movements of animals that walk upright on two legs, such as humans and monkeys. Legged mobile robots are less stable than crawler-type robots, which makes posture control and walking control more difficult, but they are superior in that they can walk and run flexibly, for example climbing up and down stairs and stepping over obstacles.
[0004]
One use of legged mobile robots is to perform various difficult tasks in industrial and production activities on behalf of humans. Examples include dangerous and difficult work such as maintenance work in nuclear power plants, thermal power plants, and petrochemical plants, parts transfer and assembly work in manufacturing plants, cleaning of high-rise buildings, and rescue at fire sites and elsewhere.
[0005]
Another use of legged mobile robots is not the work support described above but a life-oriented use, that is, "symbiosis" with humans or "entertainment". This type of robot faithfully reproduces the movement mechanisms of relatively intelligent legged animals, such as humans, dogs (pets), and bears, and rich emotional expressions using the limbs. In addition, it is required not simply to execute pre-input motion patterns faithfully, but also to realize lively response expressions that respond dynamically to words and attitudes received from the user or another robot (such as "praising", "scolding", and "hitting").
[0006]
In the conventional toy machine, the relationship between the user operation and the response operation is fixed, and the operation of the toy cannot be changed according to the user's preference. As a result, the user eventually gets tired of toys that repeat only the same operation. On the other hand, an intelligent robot autonomously selects an action including a dialogue and a body motion, so that it is possible to realize realistic communication at a higher intellectual level. As a result, the user feels a deep attachment and familiarity with the robot.
[0007]
In robots and other realistic dialogue systems, it is common to select actions sequentially in response to changes in the external environment perceived through vision, hearing, and the like. Another example of an action selection mechanism models emotions such as instinct and emotion to manage the internal state of the system and selects actions according to changes in that internal state. Of course, the internal state of the system changes with changes in the external environment and also changes when the selected action is expressed.
[0008]
However, there are few examples of situation-dependent behavior control in which an action is selected by comprehensively judging the situation in which the robot is placed, such as the external environment and the internal state.
[0009]
Here, the internal state is composed of, for example, instinctive elements corresponding to access to the limbic system in a living body, elements captured by ethological models such as intrinsic desires and social desires corresponding to access to the cerebral neocortex, and elements called emotions such as joy, sadness, anger, and surprise.
[0010]
In conventional intelligent robots and other autonomous interactive robots, internal states composed of various factors such as instinct and emotion are all lumped together as "emotion" and managed one-dimensionally. That is, the elements constituting the internal state exist in parallel with one another, and actions are selected based only on the external situation and the internal state, without a clear selection criterion.
[0011]
In conventional systems, all actions exist in a single dimension, and action selection and expression amount to deciding which of them to choose. For this reason, as the number of actions increases, the selection becomes more complicated, and it becomes more difficult to select an action that reflects the situation and internal state at that time.
[0012]
[Problems to be solved by the invention]
An object of the present invention is to provide an excellent behavior control system and behavior control method for a robot, and a robot apparatus that can realize a realistic communication by performing an autonomous operation.
[0013]
A further object of the present invention is to provide a robot behavior control system, behavior control method, and robot apparatus that can select an action by comprehensively judging the situation in which the robot is placed, such as recognition results of the external environment such as vision and hearing and internal states such as instinct and emotion.
[0014]
A further object of the present invention is to provide an excellent robot behavior control system, behavior control method, and robot apparatus that clarify the significance of the existence of emotions and can suitably select and execute actions according to external stimuli and internal states under a fixed order.
[0015]
A further object of the present invention is to provide a robot behavior control system, behavior control method, and robot apparatus that can select an action by comprehensively judging the situation in which the robot is placed, such as recognition results of the external environment such as vision and hearing and internal states such as instinct and emotion.
[0016]
Means and Action for Solving the Problems
The present invention has been made in view of the above problems, and a first aspect thereof is a behavior control system for a robot that operates autonomously,
A plurality of action description sections for describing the body motion of the robot,
An external environment recognition unit that recognizes the external environment of the robot body,
An internal state management unit that manages the internal state of the robot according to the recognized external environment and / or the execution result of the action;
An action evaluation unit that evaluates the execution of the action described in each of the action description units according to the external environment and / or the internal state;
A robot behavior control system characterized by comprising:
[0017]
Note that the term "system" as used herein refers to a logical collection of a plurality of devices (or functional modules that realize specific functions), and it does not particularly matter whether each device or functional module is in a single housing.
[0018]
The external environment recognition unit performs at least one of external visual recognition, externally generated voice recognition, and externally applied contact recognition. The internal state management unit manages an instinct model and / or an emotion model of the robot.
[0019]
The action description sections may be arranged in a tree structure in which the level of each section corresponds to the level at which the body motion is realized. This tree structure includes a plurality of branches, such as branches for executing behavior models in which ethological situation-dependent behaviors are formalized and branches for executing emotional expressions. For example, at the level immediately below the root action description section, action description sections "Search (Investigate)", "Eat (Ingestive)", and "Play (Play)" are provided. Below "Search (Investigate)", action description sections that describe more specific search behaviors, such as "InvestigativeLocomotion", "HeadinAirSniffing", and "InvestigativeSniffing", are provided. Similarly, below the action description section "Eat (Ingestive)", action description sections describing more specific eating and drinking behaviors, such as "Eat" and "Drink", are provided, and below the action description section "Play (Play)", action description sections that describe more specific playing behaviors, such as "Play Bowing", "Play Greeting", and "Play Paying", are provided.
[0020]
In such a case, the behavior evaluation section can evaluate a plurality of behavior description sections simultaneously and in parallel from the top to the bottom of the tree structure. Further, in response to a new recognition by the external environment recognition unit and / or a change in the internal state by the internal state management unit, the behavior evaluation unit evaluates each of the behavior description units and passes execution permission, as the evaluation result, from the top of the tree structure downward, so that an appropriate action can be selectively executed according to the change in the external environment or the internal state. That is, situation-dependent behaviors can be evaluated and executed concurrently.
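The top-down passing of execution permission described above can be illustrated with a minimal sketch. The class name ActionUnit, the helper functions, and the stimulus and internal-state keys ("hunger", "food", and so on) are all hypothetical names introduced for illustration, and the sketch evaluates sibling units sequentially rather than in parallel, which is a simplification of what the text describes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Monitor = Callable[[Dict[str, float], Dict[str, float]], float]


@dataclass
class ActionUnit:
    """Hypothetical stand-in for one action description unit."""
    name: str
    monitor: Monitor  # returns an evaluation value for this unit
    children: List["ActionUnit"] = field(default_factory=list)


def internal(key: str) -> Monitor:
    """Evaluation driven by an internal-state index (assumed key names)."""
    return lambda stimuli, state: state.get(key, 0.0)


def external(key: str) -> Monitor:
    """Evaluation driven by a recognized external stimulus (assumed key names)."""
    return lambda stimuli, state: stimuli.get(key, 0.0)


def select_path(root: ActionUnit, stimuli, state) -> List[str]:
    """Pass execution permission from the root down to a single leaf."""
    node, path = root, [root.name]
    while node.children:
        # Evaluate every child of the permitted unit, then hand the
        # permission to the child with the highest evaluation value.
        node = max(node.children, key=lambda c: c.monitor(stimuli, state))
        path.append(node.name)
    return path


tree = ActionUnit("Root", lambda stimuli, state: 1.0, [
    ActionUnit("Investigate", internal("curiosity"), [
        ActionUnit("InvestigativeLocomotion", external("open_space")),
        ActionUnit("InvestigativeSniffing", external("object_near")),
    ]),
    ActionUnit("Ingestive", internal("hunger"), [
        ActionUnit("Eat", external("food")),
    ]),
    ActionUnit("Play", internal("boredom")),
])

print(select_path(tree, {"food": 1.0}, {"hunger": 0.9, "curiosity": 0.2}))
```

In this sketch a re-evaluation is simply another call to select_path with the updated stimuli and internal state, so a new recognition result or a change in the internal state can move the execution permission to a different branch.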
[0021]
Further, the apparatus may further include a resource management unit that manages contention of resources on the machine when simultaneously executing the actions described in the plurality of action description units. In such a case, the behavior selection unit can simultaneously select two or more behavior description units on the assumption that resource competition is arbitrated.
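One possible way to arbitrate resource competition when several actions are selected at once is sketched below. The greedy policy (higher evaluation values claim resources first), the Candidate type, and the resource names are assumptions made for illustration; the text does not specify the arbitration rule.

```python
from typing import FrozenSet, List, NamedTuple


class Candidate(NamedTuple):
    name: str
    score: float               # evaluation value of the action
    resources: FrozenSet[str]  # body resources the action needs (assumed names)


def arbitrate(candidates: List[Candidate]) -> List[str]:
    """Greedy arbitration: actions with higher scores claim resources first."""
    in_use = set()
    selected = []
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        # An action is admitted only if none of its resources is already taken.
        if in_use.isdisjoint(cand.resources):
            selected.append(cand.name)
            in_use |= cand.resources
    return selected


candidates = [
    Candidate("Kick", 0.9, frozenset({"legs"})),
    Candidate("Talk", 0.6, frozenset({"speaker"})),
    Candidate("Walk", 0.5, frozenset({"legs"})),
]
print(arbitrate(candidates))
```

Here "Kick" and "Talk" use disjoint resources and can run together, while "Walk" loses the legs to the higher-scoring "Kick".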
[0022]
Also, when the action evaluation unit evaluates each action description unit in response to a new recognition by the external environment recognition unit and an action description unit appears whose evaluation value is higher than that of the currently executing action, the action selecting unit may stop the currently executing action and preferentially execute the action described in the action description unit with the higher evaluation value. Therefore, an action with higher importance or urgency, such as a reflex action, can interrupt a situation-dependent action that is already being executed and be executed with priority. In such a case, it is preferable to resume the temporarily stopped action after the preferentially executed action is completed.
[0023]
Further, the action selecting unit may select the same action description unit in turn in response to changes in different external environments. In such a case, each time the action described in that action description unit is executed, a separate work space is allocated for each external environment.
[0024]
For example, suppose that person B interrupts while the robot is in dialogue with person A. If, as a result of evaluating activity levels based on the external stimulus and the change in the internal state, the action of conversing with B has the higher priority, the dialogue with B cuts in on the dialogue with A.
[0025]
In such a case, the dialogue with either A or B is performed in accordance with the same action description part. However, apart from the action performing the dialogue with A, a work space for the action performing the dialogue with B is allocated. This prevents interference of the contents of the dialogue. That is, since the conversation with B does not destroy the contents of the conversation with A, when the conversation with B ends, the conversation with A can be resumed from the point of interruption.
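The allocation of a separate work space per conversation partner, and the resumption of an interrupted dialogue, can be sketched as follows. The class DialogueSchema and its methods are hypothetical; the sketch only shows that one shared action description can keep independent working data for A and B.

```python
class DialogueSchema:
    """One action description shared by every conversation partner (hypothetical)."""

    def __init__(self):
        self.workspaces = {}  # partner -> private working data
        self.suspended = []   # stack of interrupted partners
        self.active = None

    def start(self, partner):
        """Begin or switch to a dialogue, suspending the current one if any."""
        if self.active is not None and self.active != partner:
            self.suspended.append(self.active)
        # Each partner gets an individual work space on first use.
        self.workspaces.setdefault(partner, {"turns": []})
        self.active = partner

    def hear(self, utterance):
        """Record an utterance in the active partner's work space only."""
        self.workspaces[self.active]["turns"].append(utterance)

    def finish(self):
        """End the active dialogue and resume the most recently suspended one."""
        del self.workspaces[self.active]
        self.active = self.suspended.pop() if self.suspended else None


dialogue = DialogueSchema()
dialogue.start("A")
dialogue.hear("hello")
dialogue.start("B")          # B cuts in; A's work space is left untouched
dialogue.hear("excuse me")
dialogue.finish()            # the dialogue with B ends, A is resumed
dialogue.hear("as I was saying")
print(dialogue.active, dialogue.workspaces["A"]["turns"])
```

Because the turns for B are written to a different work space, the contents of the dialogue with A survive the interruption and the dialogue continues from where it stopped.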
[0026]
Further, a second aspect of the present invention is a behavior control system or behavior control method for a robot that operates autonomously according to an internal state,
An internal state management unit or step for managing emotions, which are indicators of the internal state, in a plurality of hierarchical structures,
An action selecting unit or step for selectively executing an action that satisfies the emotion of each hierarchy;
A robot behavior control system or behavior control method characterized by the following.
[0027]
Here, the internal state management unit or step may organize emotions hierarchically into primary emotions necessary for the survival of the individual and secondary emotions that change according to the excess or deficiency of the primary emotions, and may further stratify the primary emotions by dimension, from innate reflexive or physiological layers to associative layers.
[0028]
The action selecting unit or step may preferentially select an action that satisfies a lower-order primary emotion. Alternatively, the action selecting unit or step may suppress the selection of an action that satisfies a lower-order primary emotion when a higher-order primary emotion is significantly deficient compared to the lower-order primary emotion.
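The two selection rules in this paragraph can be combined in a small sketch. The layer names, the deficit scale from 0 to 1, and both threshold values are illustrative assumptions, not values taken from the specification.

```python
from typing import List, Optional, Tuple


def choose_layer(deficits: List[Tuple[str, float]],
                 need_threshold: float = 0.3,
                 override_margin: float = 0.5) -> Optional[str]:
    """Pick the emotion layer whose need should be satisfied next.

    deficits lists (layer name, deficit) pairs ordered from the lowest-order
    layer to the highest-order layer. Thresholds are assumed values.
    """
    unmet = [(name, d) for name, d in deficits if d >= need_threshold]
    if not unmet:
        return None
    # Default rule: the lowest-order unmet layer is served first.
    chosen_name, chosen_deficit = unmet[0]
    for name, deficit in unmet[1:]:
        # Exception: a higher-order layer that is far more deficient
        # suppresses the lower-order choice.
        if deficit - chosen_deficit >= override_margin:
            chosen_name, chosen_deficit = name, deficit
    return chosen_name


print(choose_layer([("physiological", 0.6), ("social", 0.7)]))
print(choose_layer([("physiological", 0.35), ("social", 0.95)]))
```

In the first call the lower-order layer wins because the higher-order deficit is only slightly larger; in the second call the higher-order layer is much more deficient and takes over.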
[0029]
According to the behavior control system or the behavior control method for a robot according to the second aspect of the present invention, emotions are divided into a plurality of layers according to the significance of their existence, and an operation is determined at each layer. Which of the determined operations to perform is decided according to the external stimulus and internal state at that time. Actions are selected at each layer, but the order in which they are performed follows the priority of the robot's internal state. Higher-order behaviors, such as objective behaviors and action selection using memory, can thus be expressed consistently in a single individual. The hierarchy also serves as a clear index when categorizing actions and creating them as schemas.
[0030]
The behavior control system or behavior control method for a robot according to the second aspect of the present invention may further include an external environment recognition unit that recognizes a change in the external environment of the robot. In such a case, the action selecting unit or step can select an action based on an index of the external environment in addition to the index of the internal state.
[0031]
Further, the internal state management unit or step may change the index of the internal state with the passage of time using a biorhythm or the like.
[0032]
Further, the internal state management unit or the step may change the index of the internal state according to the execution of the action selected by the action selection unit, that is, according to the degree of the operation.
[0033]
Further, the internal state management unit or the step may change an index of the internal state according to a change in an external environment.
[0034]
The robot behavior control system or method according to the second aspect of the present invention may further include an associative storage unit or step for associatively storing a change in the internal state from the external environment. In such a case, the internal state management unit or step may change the index of the internal state based on the change in the internal state recalled from the external environment by the associative storage unit or step. Further, the associative storage unit or step may associatively store a change in the internal state for each object recognized in the external environment.
[0035]
The selection and expression of actions in a conventional robot are basically determined by the physical distance to the object and the internal state of the robot at that time. In other words, no action selection is made as to what action to take depending on differences between objects.
[0036]
On the other hand, according to the robot behavior control system or behavior control method according to the second aspect of the present invention, the use of associative memory makes it possible to recall a change in the internal state that differs for each object, so that the ease with which an action is expressed can differ even in the same situation. That is, in addition to external stimuli, the physical situation, and the current internal state, an action can be selected in consideration of the robot's memory of each object, and richer and more diversified responses can be realized.
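How a per-object recollection might bias action selection can be sketched as follows. The class AssociativeMemory, the running-average update, the weighting in activation, and the index names are all assumptions for illustration; the specification does not give the storage or recall formula.

```python
from collections import defaultdict


class AssociativeMemory:
    """Stores, per object, the internal-state relief observed after acting on it."""

    def __init__(self, rate=0.5):
        self.rate = rate  # assumed learning rate for the running average
        self.relief = defaultdict(lambda: defaultdict(float))

    def store(self, obj, observed_relief):
        """Blend a newly observed relief into what is remembered for obj."""
        for index, value in observed_relief.items():
            old = self.relief[obj][index]
            self.relief[obj][index] = old + self.rate * (value - old)

    def recall(self, obj):
        """Return the remembered relief for obj, or nothing if obj is unknown."""
        return dict(self.relief[obj]) if obj in self.relief else {}


def activation(deficits, recalled):
    """Higher when the recalled relief matches what is currently lacking."""
    return sum(deficits.get(index, 0.0) * value for index, value in recalled.items())


memory = AssociativeMemory(rate=1.0)
memory.store("apple", {"hunger": 0.5})
memory.store("ball", {"boredom": 0.5})

deficits = {"hunger": 0.8, "boredom": 0.2}
apple_score = activation(deficits, memory.recall("apple"))
ball_score = activation(deficits, memory.recall("ball"))
print(apple_score > ball_score)
```

Both objects may be equally visible and equally close, yet the remembered change in internal state makes acting on the apple more likely while the robot is hungry.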
[0037]
For example, instead of taking an action fixed by the external environment or internal state, such as "do XX because XX is visible" or "do XX (toward anything) because XX is currently insufficient", the robot can add variation to its actions, such as "even though XX is visible, do □□" or "XX is visible, but do something different".
[0038]
Further objects, features, and advantages of the present invention will become apparent from more detailed descriptions based on embodiments of the present invention described below and the accompanying drawings.
[0039]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[0040]
A. Configuration of robot device
FIG. 1 schematically shows the functional configuration of the robot apparatus 1 used in the present invention. As shown in FIG. 1, the robot apparatus 1 includes a control unit 20 that performs overall control of the entire operation and other data processing, an input / output unit 40, a drive unit 50, and a power supply unit 60. Each unit will be described below.
[0041]
The input / output unit 40 includes, as input units, a CCD camera 15 corresponding to the eyes of the robot apparatus 1, a microphone 16 corresponding to the ears, touch sensors 18 disposed at parts such as the head and the back to detect contact by the user, and various other sensors corresponding to the five senses. As output units, it includes a speaker 17 corresponding to the mouth and an LED indicator (eye lamp) 19 that forms facial expressions through combinations of blinking and lighting timing. These output units can express feedback to the user from the robot apparatus 1 in forms other than mechanical movement patterns of the legs and the like, such as sound and blinking lamps.
[0042]
The drive unit 50 is a functional block that realizes body motions of the robot apparatus 1 according to predetermined movement patterns commanded by the control unit 20, and is the target of behavior control. The drive unit 50 is a functional module for realizing the degrees of freedom at each joint of the robot apparatus 1, and is composed of a plurality of drive units provided for each axis, such as roll, pitch, and yaw, at each joint. Each drive unit is composed of a combination of a motor 51 that rotates about a predetermined axis, an encoder 52 that detects the rotational position of the motor 51, and a driver 53 that adaptively controls the rotational position and rotational speed of the motor 51 based on the output of the encoder 52.
[0043]
Depending on how the drive units are combined, the robot apparatus 1 can be configured as a legged mobile robot that walks on, for example, two legs or four legs.
[0044]
The power supply unit 60 is, as the name implies, a functional module that supplies power to each electric circuit and the like in the robot apparatus 1. The robot apparatus 1 according to the present embodiment is of an autonomously driven type using a battery, and the power supply unit 60 includes a charging battery 61 and a charge / discharge control unit 62 that manages the charge / discharge state of the charging battery 61.
[0045]
The charging battery 61 is configured, for example, in the form of a “battery pack” in which a plurality of lithium ion secondary battery cells are packaged in a cartridge type.
[0046]
Further, the charge / discharge control unit 62 grasps the remaining capacity of the battery 61 by measuring the terminal voltage and the charge / discharge current of the battery 61, the ambient temperature of the battery 61, and the like, and determines the start and end times of charging. The start and end times of charging determined by the charge / discharge control unit 62 are notified to the control unit 20 and serve as triggers for the robot apparatus 1 to start and end charging operations.
[0047]
The control unit 20 corresponds to a “brain” and is mounted on, for example, the head or the body of the robot apparatus 1.
[0048]
FIG. 2 illustrates the configuration of the control unit 20 in more detail. As shown in the figure, the control unit 20 has a configuration in which a CPU (Central Processing Unit) 21 as a main controller is bus-connected to a memory, other circuit components, and peripheral devices. The bus 28 is a common signal transmission path including a data bus, an address bus, a control bus, and the like. Each device on the bus 28 is assigned a unique address (memory address or I / O address). The CPU 21 can communicate with a specific device on the bus 28 by specifying its address.
[0049]
A RAM (Random Access Memory) 22 is a writable memory composed of a volatile memory such as a DRAM (Dynamic RAM), and is used to load program code executed by the CPU 21 and to temporarily store work data of the running program.
[0050]
The ROM (Read Only Memory) 23 is a read-only memory that permanently stores programs and data. The program code stored in the ROM 23 includes a self-diagnosis test program executed when the power of the robot apparatus 1 is turned on, an operation control program for defining the operation of the robot apparatus 1, and the like.
[0051]
The control program of the robot apparatus 1 includes a "sensor input / recognition processing program" that processes sensor inputs from the camera 15, the microphone 16, and the like and recognizes them as symbols, an "action control program" that controls the action of the robot apparatus 1 based on the sensor inputs and a predetermined action control model while managing storage operations (described later) such as short-term memory and long-term memory, and a "drive control program" that controls the driving of each joint motor and the audio output of the speaker 17 according to the action control model.
[0052]
The non-volatile memory 24 is configured by a memory element that can be electrically erased and rewritten, such as an EEPROM (Electrically Erasable and Programmable ROM), and is used to hold data to be sequentially updated in a non-volatile manner. The data to be sequentially updated includes an encryption key and other security information, a device control program to be installed after shipment, and the like.
[0053]
The interface 25 is a device for interconnecting with devices outside the control unit 20 and enabling data exchange. The interface 25 performs data input / output with the camera 15, the microphone 16, and the speaker 17, for example. The interface 25 inputs and outputs data and commands to and from each of the drivers 53-1 in the drive unit 50.
[0054]
The interface 25 may include general-purpose interfaces for connecting computer peripherals, such as a serial interface such as RS (Recommended Standard)-232C, a parallel interface such as IEEE (Institute of Electrical and Electronics Engineers) 1284, a USB (Universal Serial Bus) interface, and a SCSI (Small Computer System Interface) interface, as well as a memory card interface (card slot) that accepts a PC card or a Memory Stick, so that programs and data can be moved to and from external devices.
[0055]
As another example of the interface 25, an infrared communication (IrDA) interface may be provided to perform wireless communication with an external device.
Further, the control unit 20 includes a wireless communication interface 26, a network interface card (NIC) 27, and the like, and can perform data communication with various external host computers via short-range wireless data communication such as Bluetooth, a wireless network such as IEEE 802.11b, or a wide area network such as the Internet.
[0056]
By such data communication between the robot device 1 and the host computer, complicated operation control of the robot device 1 can be calculated or remotely controlled using a remote computer resource.
[0057]
B. Behavior control system for robot device
FIG. 3 schematically shows a functional configuration of the behavior control system 100 of the robot device 1 according to the embodiment of the present invention. The robot apparatus 1 can perform behavior control according to the recognition result of the external stimulus and a change in the internal state. Furthermore, by providing a long-term memory function and associatively storing a change in an internal state from an external stimulus, behavior control can be performed according to a recognition result of the external stimulus or a change in the internal state.
[0058]
The illustrated behavior control system 100 can be implemented using object-oriented programming. In this case, each piece of software is handled in units of modules called "objects", each of which integrates data and the processing procedures for that data. Each object can transfer data and invoke other objects through inter-object communication methods using message communication and shared memory.
[0059]
In order to recognize the external environment (Environments), the behavior control system 100 includes a visual recognition function unit 101, an auditory recognition function unit 102, and a contact recognition function unit 103.
[0060]
The visual recognition function unit (Video) 101 performs image recognition processing, such as face recognition and color recognition, and feature extraction based on a captured image input via an image input device such as a CCD (Charge Coupled Device) camera. The visual recognition function unit 101 is composed of a plurality of objects such as "MultiColorTracker", "FaceDetector", and "FaceIdentify", which will be described later.
[0061]
The auditory recognition function unit (Audio) 102 performs voice recognition on voice data input via a voice input device such as a microphone, and performs feature extraction and word set (text) recognition. The auditory recognition function unit 102 is composed of a plurality of objects such as "AudioRecog" and "AuthurDecoder", which will be described later.
The contact recognition function unit (Tactile) 103 recognizes sensor signals from a contact sensor built into, for example, the head of the body, and recognizes external stimuli such as "being patted" or "being hit".
[0062]
An internal state management unit (ISM: Internal Status Manager) 104 manages several types of emotions, such as instinct and emotion, as mathematical models, and manages the internal state of the robot apparatus 1, such as instinct and emotion, according to the external stimuli (ES: External Stimula) recognized by the visual recognition function unit 101, the auditory recognition function unit 102, and the contact recognition function unit 103 described above.
[0063]
The emotion model and the instinct model have recognition results and action histories as inputs, respectively, and manage emotion values and instinct values. The behavior model can refer to these emotion values and instinct values.
[0064]
In the present embodiment, emotions are organized into a plurality of layers according to the significance of their existence, and an operation is determined at each layer. Which of the determined operations to perform is decided according to the external environment and internal state at that time (described later). Actions are selected at each layer, but by expressing lower-order actions preferentially, instinctive actions such as reflexes and higher-order actions such as action selection using memory can be expressed consistently in a single individual.
[0065]
To perform behavior control according to the recognition results of external stimuli and changes in the internal state, the robot apparatus 1 according to the present embodiment includes a short-term storage unit 105 that provides short-term memory, which is lost over time, and a long-term storage unit 106 that holds information for a relatively long time. The classification into short-term memory and long-term memory follows neuropsychology.
[0066]
The short-term storage unit (ShortTermMemory) 105 is a functional module that holds, for a short time, a target or an event recognized from the external environment by the above-described visual recognition function unit 101, auditory recognition function unit 102, and contact recognition function unit 103. For example, the input image from the camera 15 is stored for only a short period of about 15 seconds.
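A store whose entries are lost after a retention period can be sketched as below. The class and method names are hypothetical, the timestamp is passed in explicitly to keep the example deterministic, and the 15-second default simply mirrors the example given in the text.

```python
class ShortTermStore:
    """Minimal sketch of a memory whose entries are lost over time."""

    def __init__(self, retention_s=15.0):
        self.retention_s = retention_s
        self.items = {}  # target id -> (time of last observation, data)

    def observe(self, target, data, now):
        """Record or refresh a recognized target or event."""
        self.items[target] = (now, data)

    def recall(self, now):
        """Drop entries older than the retention period and return the rest."""
        self.items = {
            target: (seen_at, data)
            for target, (seen_at, data) in self.items.items()
            if now - seen_at <= self.retention_s
        }
        return {target: data for target, (seen_at, data) in self.items.items()}


stm = ShortTermStore()
stm.observe("ball", {"color": "pink"}, now=0.0)
stm.observe("face_A", {"direction_deg": 30}, now=10.0)
print(sorted(stm.recall(now=12.0)))
print(sorted(stm.recall(now=20.0)))
```

At 12 seconds both targets are still held; at 20 seconds the ball, last observed 20 seconds earlier, has been forgotten while the face remains.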
[0067]
The long-term storage unit (LongTermMemory) 106 is used to hold information obtained by learning, such as the name of an object, for a long time. For example, the long-term storage unit 106 can associate and store a change in an internal state from an external stimulus in a certain behavior module.
[0068]
The behavior control of the robot apparatus 1 according to the present embodiment is broadly divided into "reflex behavior" realized by the reflex behavior unit 109, "situation-dependent behavior" realized by the situation-dependent behavior layer 108, and "deliberative behavior" realized by the deliberative behavior layer 107.
[0069]
The reflex behavior unit (Reflexive Behaviors Layer) 109 is a functional module that realizes reflexive body motions in response to external stimuli recognized by the visual recognition function unit 101, the auditory recognition function unit 102, and the contact recognition function unit 103.
[0070]
A reflex action is basically an action that directly receives the recognition results of external information input from the sensors, classifies them, and directly determines the output action. For example, behaviors such as tracking a human face or nodding are preferably implemented as reflex actions.
[0071]
The situation-dependent behavior layer (Situated Behaviors Layer) 108 controls behavior suited to the situation in which the robot apparatus 1 is currently placed, based on the contents stored in the short-term storage unit 105 and the long-term storage unit 106 and the internal state managed by the internal state management unit 104.
[0072]
The situation-dependent behavior layer 108 prepares a state machine for each action, classifies the recognition results of external information input from the sensors depending on the preceding actions and situations, and expresses the action on the body. The situation-dependent behavior layer 108 also realizes actions for keeping the internal state within a certain range (also called "homeostasis behavior"). When the internal state exceeds the specified range, it raises the activation of actions that return the internal state to that range so that those actions are more likely to appear (in practice, an action is selected in consideration of both the internal state and the external environment). Situation-dependent behavior has a slower reaction time than reflex behavior.
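The homeostasis behavior described above can be sketched as an activation that grows with the distance of an internal-state value from its permitted range. The function names, the linear form, and the weights are assumptions for illustration; the text states only that both the internal state and the external environment are considered.

```python
def homeostatic_drive(value, low, high):
    """Distance of an internal-state value from the range [low, high]."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0  # inside the range: no corrective drive


def activation_level(value, low, high, stimulus, w_internal=1.0, w_external=1.0):
    """Combine the corrective drive with an external stimulus (assumed weights)."""
    return w_internal * homeostatic_drive(value, low, high) + w_external * stimulus


# Hypothetical "nourishment" index with a permitted range of 0.3 to 0.7.
inside = activation_level(0.5, 0.3, 0.7, stimulus=0.2)
outside = activation_level(0.1, 0.3, 0.7, stimulus=0.2)
print(outside > inside)
```

With the same external stimulus, the action that restores the index becomes more likely to be selected once the index has left its range.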
[0073]
The deliberative behavior layer 107 performs relatively long-term action planning for the robot apparatus 1 based on the contents stored in the short-term storage unit 105 and the long-term storage unit 106.
[0074]
Deliberative behavior is behavior performed by making inferences, or plans for realizing them, based on a given situation or a command from a human. For example, searching for a route from the position of the robot and the position of the target corresponds to deliberative behavior. Such inferences and plans may require more processing time and computational load than the reaction time needed for the robot apparatus 1 to maintain interaction. Therefore, deliberative behavior performs inference and planning while the reflex behavior and situation-dependent behavior described above return responses in real time.
[0075]
The deliberative behavior layer 107, the situation-dependent behavior layer 108, and the reflex behavior unit 109 can be described as higher-level application programs that are independent of the hardware configuration of the robot apparatus 1. On the other hand, the hardware-dependent layer control unit (ConfigurationDependentActionsAndReactions) 110 directly operates the hardware of the body (the external environment), for example by driving joint actuators, in response to commands from these higher-level applications (behavior modules called "schemas").
[0076]
C. Memory mechanism of robot device
As described above, the robot device 1 according to the present embodiment includes the short-term storage unit 105 and the long-term storage unit 106, and such a storage mechanism relies on neuropsychology.
[0077]
Short-term memory is literally short-term memory and is lost over time. Short-term memory can be used for short-term retention of targets and events recognized from the external environment, such as vision, hearing, and contact.
[0078]
Short-term memory is further divided into "sensory memory", which holds sensory information (that is, sensor output) as a raw signal for about one second, "direct memory", which encodes sensory memory and stores it for a short time with a limited capacity, and "working memory", which stores situation changes and context for several hours. According to neuropsychological studies, the capacity of direct memory is said to be 7 ± 2 chunks. Working memory is also referred to as "intermediate memory" in contrast with short-term memory and long-term memory.
[0079]
The long-term memory is used to retain information obtained by learning, such as the name of an object, for a long time. The same pattern can be statistically processed for robust storage.
[0080]
Long-term memory is further divided into "declarative knowledge memory" and "procedural knowledge memory." The declarative knowledge memory is composed of "episode memory" which is a memory relating to a scene (for example, a scene when taught) and "semantic memory" which is a memory including the meaning of words and common sense. The procedural knowledge memory is a procedural memory such as how to use the declarative knowledge memory, and can be used to acquire an operation for an input pattern.
[0081]
C-1. Short-term memory
The short-term storage unit 105 is a functional module for representing and storing objects and events existing around the robot itself, and for causing the robot to act on that basis. The positions of objects and events are placed on a self-centered coordinate system based on sensor information such as vision and hearing, so that an object outside the field of view can still be remembered and an action toward it can be generated.
[0082]
For example, when the robot is talking with a person A and another person B speaks to it, holding a conversation with B while retaining the position of A and the content of the conversation, and then returning to the conversation with A once it is finished, requires a short-term memory function. However, integration is not performed by overly complicated processing. Instead, it relies on simple spatiotemporal proximity, such that sensor information that is close in time and space is regarded as a signal from the same object.
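The proximity rule described above can be sketched as follows. This is only a minimal illustration: the function name `same_object`, the `(t, x, y)` observation layout, and both threshold values are hypothetical and are not given in this specification.

```python
import math

# Hypothetical thresholds; the specification gives no numeric values.
MAX_TIME_GAP_S = 1.0
MAX_DISTANCE_M = 0.5


def same_object(obs_a, obs_b, max_dt=MAX_TIME_GAP_S, max_dist=MAX_DISTANCE_M):
    """Return True if two observations (t, x, y) are close in both time and space."""
    dt = abs(obs_a[0] - obs_b[0])
    dist = math.hypot(obs_a[1] - obs_b[1], obs_a[2] - obs_b[2])
    return dt <= max_dt and dist <= max_dist


# A voice and a face observed 0.3 s apart at nearly the same place,
# and a second face observed far away.
voice = (10.0, 1.0, 0.2)
face = (10.3, 1.1, 0.1)
far_face = (10.3, 3.0, 2.0)
```

Under these assumed thresholds, `same_object(voice, face)` is true and `same_object(voice, far_face)` is false, so only the nearby face would be merged with the voice into one target.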
[0083]
In addition, in order to store the positions of objects other than those that can be distinguished by pattern recognition, objects detected with a technique such as stereo vision are placed on the self-centered coordinate system. Combined with floor detection, this can be used to store the positions of obstacles probabilistically.
[0084]
In the present embodiment, the short-term storage unit 105 integrates external stimuli obtained from the results of a plurality of recognizers, such as the visual recognition function unit 101, the auditory recognition function unit 102, and the contact recognition function unit 103 described above, so as to maintain temporal and spatial consistency. It then provides the perception of each object in the external environment as short-term memory to a behavior control module such as the situation-dependent behavior hierarchy (SBL) 108.
[0085]
Therefore, the behavior control module configured as a higher-level module can perform high-level behavior control by integrating a plurality of recognition results from the outside world and treating them as meaningful symbol information. In addition, by using more complex recognition results, such as the correspondence with previously observed recognition results, it becomes possible to determine which skin-color region is a face and which person it corresponds to, whose voice a given voice is, and so on.
[0086]
In addition, since information on recognized observation results is held as memory in the short-term storage unit 105, a higher-level module such as an application that controls the behavior of the body can always see an object as being perceived there, even if observation results temporarily fail to arrive during autonomous operation. For example, since information outside the field of view of the sensor is retained rather than forgotten immediately, the robot can find an object again later even after once losing sight of it. As a result, it is possible to realize a stable system that is robust against recognizer errors and sensor noise and does not depend on the notification timing of the recognizers. Further, even if the information from a single recognizer is insufficient, another recognition result may be able to supplement it, so that the recognition performance of the entire system is improved.
[0087]
In addition, since related recognition results are linked, a higher-level module such as an application can decide on an action using the related information. For example, the robot device can retrieve the name of a person from the voice that called it. As a result, it can respond to a greeting with a reply such as "Hello, XXX-san."
[0088]
FIG. 4 illustrates a mechanism of situation-dependent behavior control in response to an external stimulus in the behavior control system 100 shown in FIG. The external stimulus is taken into the system by the function modules 101 to 103 of the recognition system, and is given to the situation-dependent behavior hierarchy (SBL) 108 via the short-term storage unit (STM) 105. As shown in the figure, the function modules 101 to 103 of the recognition system, the short-term storage unit (STM) 105, and the situation-dependent behavior hierarchy (SBL) 108 are configured as objects.
[0089]
In the figure, circles represent entities called “objects” or “processes”. The whole system operates by the objects communicating asynchronously with each other. Each object performs data transfer and invocation (Invoke) through inter-object communication based on message passing and shared memory. Hereinafter, the function of each object will be described.
[0090]
AudioRecog:
An object that receives voice data from a voice input device such as a microphone and performs feature extraction and voice section detection. When the microphone is stereo, the sound source direction in the horizontal direction can be estimated. If it is determined that the section is a voice section, the feature amount and sound source direction of the voice data in that section are sent to ArterDecoder (described later).
[0091]
SpeechRecog:
This is an object that performs speech recognition using the speech feature amount received from AudioRecog, the speech dictionary, and the syntax dictionary. The set of recognized words is sent to the short term memory (ShortTermMemory) 105.
[0092]
MultiColorTracker:
An object for performing color recognition. The object receives image data from an image input device such as a camera, extracts a color region based on a plurality of color models stored in advance, and divides the color region into continuous regions. Information such as the position, size, and feature amount of each of the divided areas is output and sent to the short-term storage unit (ShortTermMemory) 105.
[0093]
FaceDetector:
An object for detecting a face area from an image frame. The object receives image data from an image input device such as a camera and converts it into reduced images at nine scale levels. A rectangular area corresponding to a face is searched for in all of these images. Overlapping candidate areas are then reduced, and information such as the position, size, and feature amount of each area finally determined to be a face is output and sent to FaceIdentify (described later).
[0094]
FaceIdentify:
An object for identifying a detected face image. It receives a rectangular area image indicating a face area from FaceDetector and identifies the person by comparing the face image against the persons in the personal dictionary that it holds. It then outputs the ID information of the person together with the position and size information of the face image area received from FaceDetector.
[0095]
ShortTermMemory (short term memory):
An object that holds information about the external environment of the robot 1 for a relatively short period of time. It receives a speech recognition result (word, sound source direction, certainty factor) from SpeechRecog, the position and size of skin-color regions and of face regions from MultiColorTracker, and the ID information of the person and the like from FaceIdentify. It also receives the direction (joint angle) of the robot's neck from the sensors on the body of the robot 1. By using these recognition results and sensor outputs in an integrated manner, it stores information such as which person is currently where, which person spoke which words, and what dialogue has been conducted with that person. The physical information about such an object, that is, a target, and the events (history) seen along the time axis are output to a higher-level module such as the situation-dependent behavior hierarchy (SBL).
[0096]
SituatedBehaviorLayer (Situation-Dependent Behavior Hierarchy):
An object that determines the behavior of the robot 1 (behavior depending on the situation) based on the information from the above-mentioned ShortTermMemory (short-term storage unit). Multiple behaviors can be evaluated and executed simultaneously. It can also switch behaviors, putting the body into a sleep state and activating another behavior.
[0097]
ResourceManager:
It is an object that performs resource arbitration of each hardware of the robot 1 in response to a command for output. In the example shown in FIG. 4, resource arbitration is performed between the object for controlling the speaker for audio output and the object for controlling the motion of the neck.
[0098]
SoundPerformerTTS:
An object for performing voice output. It performs voice synthesis in accordance with a text command given from the SituatedBehaviorLayer via the ResourceManager and outputs the voice from a speaker on the body of the robot 1.
[0099]
HeadMotionGenerator:
An object that calculates the joint angles of the neck in response to a command to move the neck received from the SituatedBehaviorLayer via the ResourceManager. When a “tracking” command is received, it calculates and outputs the joint angles of the neck that point in the direction of the object, based on the position information of the object received from ShortTermMemory.
[0100]
The short-term storage unit 105 is composed of two types of memory objects, a target memory and an event memory.
[0101]
The target memory integrates information from the respective recognition function units 101 to 103 and holds information on the objects currently being perceived, that is, targets. Therefore, as objects disappear or appear, the corresponding targets are deleted from the storage area (GarbageCollector) or newly generated. In addition, one target can be represented by a plurality of recognition attributes (TargetAssociate). For example, a target may be a skin-colored object with a face pattern that emits a voice (a human face).
[0102]
The position and orientation information of an object (target) held in the target memory is expressed not in the sensor coordinate systems used in the respective recognition function units 101 to 103, but in a world coordinate system in which a specific part of the body, such as the trunk of the robot 1, is fixed at a predetermined place. Therefore, the short-term storage unit (STM) 105 constantly monitors the current value (sensor output) of each joint of the robot 1 and performs conversion from the sensor coordinate system to this fixed coordinate system. This makes it possible to integrate the information of the recognition function units 101 to 103. For example, even if the posture of a sensor changes when the robot 1 moves its neck or the like, the position of the object as seen from a behavior control module such as the situation-dependent behavior hierarchy (SBL) remains the same, so that the target becomes easy to handle.
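The conversion from a sensor coordinate system to the body-fixed coordinate system can be sketched as follows. This is a minimal illustration under stated assumptions: the sensor sits on a neck with only pan and tilt joints, there is no translational offset between the frames, and the axis conventions (x forward, y left, z up, right-handed rotations) are chosen for the example. The function name `sensor_to_body` is hypothetical and does not appear in this specification.

```python
import math


def sensor_to_body(p_sensor, pan, tilt):
    """Rotate a point from an assumed head-sensor frame into a trunk-fixed frame.

    Assumed conventions: x forward, y left, z up; tilt is a rotation about
    the y axis, pan a rotation about the z axis, both in radians.
    """
    x, y, z = p_sensor
    # Undo the neck tilt (rotation about the y axis).
    xt = x * math.cos(tilt) + z * math.sin(tilt)
    zt = -x * math.sin(tilt) + z * math.cos(tilt)
    # Undo the neck pan (rotation about the z axis).
    xb = xt * math.cos(pan) - y * math.sin(pan)
    yb = xt * math.sin(pan) + y * math.cos(pan)
    return (xb, yb, zt)
```

With the head panned 90 degrees to the left, a point one unit straight ahead of the sensor, `sensor_to_body((1.0, 0.0, 0.0), math.pi / 2, 0.0)`, lands approximately at `(0, 1, 0)` in the trunk frame. An object that stays still therefore keeps the same trunk-frame position however the neck moves.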
[0103]
The event memory is an object that stores events occurring in the external environment from the past to the present in chronological order. Events handled in the event memory may include information on appearance and disappearance of targets, speech recognition words, and changes in external situations, such as changes in own actions and postures.
[0104]
The event includes a state change related to a certain target. Therefore, by including the ID of the target as the event information, more detailed information on the event that has occurred can be retrieved from the above-described target memory.
[0105]
FIGS. 5 and 6 show the flow of information entering the target memory and the event memory in the short-term storage unit 105 based on the recognition results of the respective recognition function units 101 to 103.
[0106]
As shown in FIG. 5, a target detector for detecting targets from the external environment is provided in the short-term storage unit 105 (STM object). Based on the recognition results of the respective recognition function units 101 to 103, such as voice recognition results, face recognition results, and color recognition results, the target detector either adds a new target or updates an existing target so that it reflects the recognition result. The detected target is held in the target memory.
[0107]
The target memory also has functions such as a garbage collector (GarbageCollector), which searches for and erases targets that are no longer observed, and a target associate (TargetAssociate), which determines the relevance of a plurality of targets and links them to the same target. The garbage collector is implemented by decrementing the certainty factor of each target as time passes and deleting any target whose certainty factor falls below a predetermined value. The target associate can identify targets as the same target when they have similar features of the same attribute (recognition type) and are spatially and temporally close to each other.
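The garbage collector described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the class and method names, the fixed decrement per tick, and the threshold value are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    target_id: int
    certainty: float = 1.0


@dataclass
class TargetMemory:
    decay_per_tick: float = 0.25  # assumed decrement per time step
    threshold: float = 0.5        # assumed deletion threshold
    targets: dict = field(default_factory=dict)

    def observe(self, target_id):
        """Add a new target, or refresh the certainty of an existing one."""
        self.targets[target_id] = Target(target_id, 1.0)

    def collect_garbage(self):
        """Advance one time step: decrement every certainty factor and
        delete the targets that fall below the threshold."""
        deleted = []
        for tid in list(self.targets):
            target = self.targets[tid]
            target.certainty -= self.decay_per_tick
            if target.certainty < self.threshold:
                del self.targets[tid]
                deleted.append(tid)
        return deleted
```

With these assumed values, a target that is observed once and never again survives two ticks and is deleted on the third, while a target that keeps being observed has its certainty reset and stays in memory.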
[0108]
The above-described situation-dependent behavior hierarchy (SBL) is an object serving as a client (STM client) of the short-term storage unit 105, and periodically receives notification (Notify) of information on each target from the target memory. In this embodiment, the STM proxy class copies the target to a client-local work area independent of the short-term storage unit 105 (STM object), and always holds the latest information. Then, a desired target is read out from the local target list (Target of Interest) as an external stimulus, and a schema, that is, an action module is determined (described later).
[0109]
As shown in FIG. 6, an event detector for detecting an event occurring in an external environment is provided in the short-term storage unit 105 (STM object). This event detector detects generation of a target by the target detector and deletion of the target by the garbage collector as events. When the recognition result by the recognition function units 101 to 103 is voice recognition, the content of the utterance becomes an event. The events that have occurred are stored as an event list in the event memory in the order in which they occurred.
[0110]
The situation-dependent behavior hierarchy (SBL) is an object serving as a client (STM client) of the short-term storage unit 105, and receives notification (Notify) of each event from the event memory as it occurs. In the present embodiment, the STM proxy class copies the event list to a client-local work area independent of the short-term storage unit 105 (STM object). Then, a desired event is read from the local event list as an external stimulus, and a schema, that is, a behavior module, is determined (described later). The executed behavior module is in turn detected by the event detector as a new event. Older events are sequentially discarded from the event list in, for example, a FIFO (First In First Out) manner.
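The FIFO handling of the event list can be sketched as follows. This is only an illustration: the `EventMemory` class, its capacity, and the event fields are hypothetical, and a bounded `collections.deque` is used as one simple way to discard the oldest events first.

```python
from collections import deque


class EventMemory:
    """Keeps the most recent events in order of occurrence (illustrative sketch)."""

    def __init__(self, capacity=3):
        # A bounded deque drops the oldest entry automatically when full.
        self._events = deque(maxlen=capacity)

    def add(self, kind, target_id=None, detail=None):
        """Record an event; target_id links it back to the target memory."""
        self._events.append({"kind": kind, "target_id": target_id, "detail": detail})

    def snapshot(self):
        """Return a copy of the event list, as a client-local work area would hold."""
        return list(self._events)
```

For example, after adding four events to an `EventMemory(capacity=3)`, `snapshot()` returns only the last three, oldest first.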
[0111]
According to the short-term memory mechanism of the present embodiment, the robot 1 integrates the results of a plurality of recognizers regarding external stimuli so as to maintain temporal and spatial consistency, and treats them as meaningful symbol information. This makes it possible to use more complex recognition results, such as the correspondence with previously observed recognition results, to determine which skin-color region is a face and which person it corresponds to, whose voice a given voice is, and so on.
[0112]
Hereinafter, the dialog processing with the users A and B by the robot 1 will be described with reference to FIGS. 7 to 9.
[0113]
First, as shown in FIG. 7, when the user A calls "Masahiro (robot name) -kun!", sound direction detection, voice recognition, and face identification are performed by the recognition function units 101 to 103. The robot then turns in the direction from which it was called and performs a situation-dependent behavior of tracking the face of the user A or starting a conversation with the user A.
[0114]
Next, as shown in FIG. 8, when the user B calls “Masahiro (robot name) -kun!”, sound direction detection, voice recognition, and face identification are performed by the recognition function units 101 to 103. The robot then suspends the conversation with the user A (while preserving the context of the conversation), turns in the direction from which it was called, and performs a situation-dependent behavior of tracking the face of the user B and starting a conversation with the user B. This is the Preemption function (described later) of the situation-dependent behavior hierarchy 108.
[0115]
Next, as shown in FIG. 9, when the user A calls out “Oh!” to prompt continuation of the conversation, the robot this time interrupts the conversation with the user B (while saving the context of that conversation), turns in the direction from which it was called, and performs a situation-dependent behavior of tracking the face of the user A or resuming the conversation with the user A based on the stored context. At this time, thanks to the Reentrant function (described later) of the situation-dependent behavior hierarchy 108, the contents of one conversation are not destroyed by the other, and the conversation can be resumed exactly from the point of interruption.
[0116]
C-2. Long-term memory
Long-term memory is used to retain information obtained by learning, such as the name of an object, for a long time. The same pattern can be statistically processed for robust storage.
[0117]
Long-term memory is further divided into "declarative knowledge memory" and "procedural knowledge memory." The declarative knowledge memory is composed of "episode memory" which is a memory relating to a scene (for example, a scene when taught) and "semantic memory" which is a memory including the meaning of words and common sense. The procedural knowledge memory is a procedural memory such as how to use the declarative knowledge memory, and can be used to acquire an operation for an input pattern.
[0118]
Episode memory is a kind of declarative knowledge memory (also called statement memory) among long-term memories. For example, in the case of riding a bicycle, remembering the scene (time, place, etc.) of riding a bicycle for the first time corresponds to episode memory. Thereafter, the memory of the episode fades with the passage of time, while its meaning is retained in semantic memory. The procedure for riding a bicycle is also stored, and this corresponds to procedural knowledge memory. Generally, storing procedural knowledge takes time. Whereas declarative knowledge memory can be stated in words, procedural knowledge memory is latent and manifests itself in the execution of actions.
[0119]
The long-term storage unit 106 according to the present embodiment is composed of an associative memory that stores sensor information about an object, such as visual information and auditory information, together with the change in internal state that results from an action performed on that object; a frame memory relating to that one object; map information constructed from the surrounding scene or given as data; and rules such as a causal situation, the action taken in response, and its result.
[0120]
C-2-1. Associative memory
Associative memory refers to a mechanism in which input patterns each consisting of a plurality of symbols are stored in advance as storage patterns, and a stored pattern similar to a given input is recalled. The associative memory according to the present embodiment is realized by a model using a competitive neural network. With such an associative memory mechanism, when a pattern with a partial defect is input, the closest of the stored patterns can be output. This means that even when only an external stimulus composed of incomplete data is given, the meaning of an object can be recalled through the firing of the corresponding neuron.
[0121]
Associative memories are broadly classified into the "self-recall type" and the "mutual-recall type". The self-recall type is a model in which a stored pattern is retrieved directly by a key pattern, and the mutual-recall type is a model in which an input pattern and an output pattern are linked by a certain association. In the present embodiment, a self-recall type associative memory is adopted. Compared with conventional memory models such as the Hopfield network or the Associatron (described above), it has the advantages that additional learning is easy and that statistical storage of input patterns is possible.
[0122]
With additional learning, storing a new pattern does not overwrite and erase past memories. With statistical learning, the more often the same thing is seen, the more firmly it remains in memory, and the more often the same thing is repeated, the harder it is to forget. In this case, a complete pattern need not be input every time during the storage process: through repeated presentation, the memory converges to the pattern that has been presented most often.
[0123]
C-2-2. Semantic memory by associative memory
The pattern that the robot device 1 learns is composed of, for example, a combination of an external stimulus to the robot device 1 and an internal state.
[0124]
Here, the external stimulus is perceptual information obtained by the robot apparatus 1 recognizing a sensor input, for example color information, shape information, and face information obtained by processing an image input from the camera 15. More specifically, it is composed of components such as color, shape, face, 3D general object, hand gesture, movement, voice, contact, smell, and taste.
[0125]
The internal state refers to, for example, instincts and emotions based on the body of the robot. The instinct elements are, for example, at least one of fatigue, heat or temperature, pain, appetite or hunger, thirst, affection, curiosity, elimination, and sexual desire. The emotional elements are at least one of happiness, sadness, anger, surprise, disgust, fear, frustration, boredom, somnolence, gregariousness, patience, tension, relaxation, alertness, guilt, spite, loyalty, submission, and jealousy.
[0126]
In the associative memory mechanism to which the competitive neural network according to the present embodiment is applied, an input channel is assigned to each element constituting the external stimulus and the internal state. In addition, each perceptual function module, such as the visual recognition function unit 101 and the auditory recognition function unit 102, does not send the raw sensor output. Instead, it converts the result of recognizing the sensor output into a symbol and sends the ID information corresponding to that symbol (for example, a color prototype ID, shape prototype ID, or voice prototype ID) to the corresponding channel.
[0127]
For example, each object segmented by the color segmentation module is input to the associative memory system with a color prototype ID attached. The face ID recognized by the face recognition module is input to the associative memory system. The ID of the object recognized by the object recognition module is also input to the associative memory system. In addition, a prototype ID of a word is input from the voice recognition module according to the user's utterance. At this time, since the phoneme symbol string of the utterance (Phoneme Sequence) is also input, the robot apparatus 1 can be made to utter the word through the processes of storage and association. The instinct can be handled as an analog value (described later). For example, if the instinct delta value is stored as 80, an analog value of 80 can be obtained by association.
[0128]
Therefore, the associative memory system according to the present embodiment can store external stimuli such as colors, shapes, and sounds, together with internal states, as input patterns composed of combinations of IDs symbolized for each channel. That is, what the associative memory system stores is a combination of the following form:
[0129]
[Color ID Shape ID Face ID Voice ID ... Instinct ID (value) Emotion ID]
[0130]
Each storage pattern is one such combination.
[0131]
Associative memory has a memory process and a recall process. FIG. 10 shows the concept of the storage process of the associative memory.
[0132]
The storage pattern input to the associative memory system is composed of a plurality of channels assigned to each element of the external stimulus and the internal state (in the example shown, eight channels of input 1 to input 8). Then, ID information that symbolizes the recognition result of the external stimulus and the internal state is sent to each channel. In the illustrated example, it is assumed that the shading of each channel represents ID information. For example, when the k-th column in the storage pattern is assigned to the face channel, the color indicates the prototype ID of the face.
[0133]
In the example illustrated in FIG. 10, it is assumed that the associative memory system has already stored a total of n storage patterns 1 to n. Here, the difference in the color of the corresponding channel between the two storage patterns means that the symbol of the external stimulus or the internal state stored on the same channel, that is, the ID, is different between the storage patterns.
[0134]
FIG. 11 shows the concept of the associative memory recall process. As described above, when a pattern similar to the input pattern stored in the storage process is input, a complete storage pattern is output so as to compensate for the missing information.
[0135]
In the example shown in FIG. 11, a pattern in which IDs are given only to the upper three channels of a storage pattern composed of eight channels is input as a key pattern. In such a case, the associative memory system can find, among the patterns already stored, the pattern closest to these upper three channels (storage pattern 1 in the illustrated example) and output it as the recalled pattern. That is, the closest storage pattern is output so as to supplement the information of the missing channels 4 to 8.
[0136]
Therefore, according to the associative memory system, it is possible to associate the voice ID, that is, the name, only from the face ID, or to recall “delicious” or “not delicious” only from the name of the food. According to the long-term memory architecture using a competitive neural network, semantic memory relating to the meaning of words and common sense can be realized with the same engineering model as other long-term memories.
[0137]
C-3. Associative learning by competitive neural network
FIG. 12 schematically shows a configuration example of an associative memory system to which a competitive neural network is applied. As shown in the figure, this competitive neural network is a hierarchical neural network composed of two layers, an input layer and a competitive layer.
[0138]
This competitive neural network has two operation modes, a memory mode and an associative (recall) mode. In the memory mode, input patterns are stored competitively. In the recall mode, a complete storage pattern is recalled from a partially missing input pattern.
[0139]
The input layer is composed of a plurality of input neurons. Each input neuron receives a symbol corresponding to the recognition result of the external stimulus or the internal state, that is, ID information, from the channel assigned to the corresponding element of the external stimulus or the internal state. The input layer must therefore contain a number of neurons equal to the number of color IDs + the number of shape IDs + the number of voice IDs + the number of instinct types, and so on.
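How the channel IDs map onto input neurons can be sketched as follows. This is a minimal illustration: the channel names and sizes in `CHANNEL_SIZES`, the function name `encode`, and the use of 0 for a channel with no information are assumptions made for the example. The values 1 for a firing neuron and -1 for a non-firing neuron follow the convention this description gives for the memory mode.

```python
# Hypothetical channels and prototype counts, for illustration only.
CHANNEL_SIZES = {"color": 3, "shape": 2, "voice": 2}


def encode(pattern, channel_sizes=CHANNEL_SIZES, missing=0):
    """Turn {channel: prototype ID} into one value per input neuron.

    Each channel owns as many input neurons as it has prototype IDs.
    The neuron for the given ID fires (+1), the others do not (-1).
    Channels absent from `pattern` are filled with `missing` (assumption).
    """
    vector = []
    for channel, size in channel_sizes.items():
        if channel in pattern:
            vector.extend(1 if i == pattern[channel] else -1 for i in range(size))
        else:
            vector.extend([missing] * size)
    return vector
```

Here the input layer has 3 + 2 + 2 = 7 neurons, and `encode({"color": 1, "shape": 0, "voice": 1})` yields `[-1, 1, -1, 1, -1, -1, 1]`.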
[0140]
The competitive layer is composed of a plurality of competitive neurons. Each competitive neuron is connected to each input neuron on the input layer side with a certain connection weight. Each competitive neuron corresponds to one symbol to be stored. In other words, the number of competitive neurons corresponds to the number of storable symbols.
[0141]
Suppose that an input pattern is given to the input layer. At this time, the input pattern is composed of channels representing each element of the external stimulus and the internal state, and the input neuron to which the corresponding ID has been sent from the channel fires.
[0142]
Each competitive neuron receives the output of every input neuron weighted by the synaptic connection, and calculates the sum of these input values. Learning is performed by selecting the competitive neuron with the maximum sum of input values in the competitive layer and strengthening the connections between this winning competitive neuron and the input neurons. Further, for an input pattern with a defect, the symbol corresponding to the input pattern can be recalled by selecting the competitive neuron that wins in the competitive layer.
[0143]
Memory mode:
Assume that the connection weights between the input layer and the competitive layer take values between 0 and 1. The initial connection weights are determined at random.
[0144]
Storage in the competitive neural network is performed by first selecting the competitive neuron that wins in the competitive layer for the input pattern to be stored, and then strengthening the connections between that competitive neuron and each input neuron.
[0145]
Here, in the input pattern vector $[x_1, x_2, \ldots, x_n]$, suppose for example that neuron $x_1$ corresponds to color prototype ID1. When ID1 is recognized, neuron $x_1$ fires, and the neurons for shape and voice fire in turn in the same way. A firing neuron takes a value of 1, and a non-firing neuron takes a value of -1.
[0146]
If the connection weight between the i-th input neuron and the j-th competitive neuron is denoted by $w_{ij}$, the value of the competitive neuron $y_j$ for the input $x_i$ is represented by the following equation.
[0147]
(Equation 1)

[0148]
Therefore, the neuron that wins the competition can be obtained by the following equation.
[0149]
(Equation 2)
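Equations 1 and 2 are not reproduced in this text. From the surrounding description (a weighted sum of the inputs, followed by selection of the competitive neuron with the maximum sum), they presumably take a form like the following. This is a reconstruction under that assumption, not the original notation; $n$ denotes the number of input neurons and $j^{*}$ the index of the winning neuron.

```latex
% Equation 1 (reconstructed): value of the j-th competitive neuron
y_j = \sum_{i=1}^{n} w_{ij}\, x_i

% Equation 2 (reconstructed): the neuron that wins the competition
j^{*} = \arg\max_{j}\; y_j
```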

[0150]
Storage is performed by strengthening the connections between the competitive neuron that has won in the competitive layer (the winner neuron) and each input neuron. The connections between the winner neuron and the input neurons are updated as follows according to the Kohonen update rule.
[0151]
(Equation 3)

[0152]
Here, normalization is performed using L2Norm.
[0153]
(Equation 4)
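Equations 3 and 4 are likewise not reproduced in this text. Assuming the standard form of the Kohonen update rule with learning rate $\alpha$, applied only to the weights of the winner neuron $j^{*}$ and followed by L2 normalization, they can be written as below. This is a reconstruction under that assumption rather than the original notation.

```latex
% Equation 3 (reconstructed): Kohonen-style update of the winner's weights
\Delta w_{ij^{*}} = \alpha \left( x_i - w_{ij^{*}} \right), \qquad
w_{ij^{*}}^{\mathrm{new}} = w_{ij^{*}}^{\mathrm{old}} + \Delta w_{ij^{*}}

% Equation 4 (reconstructed): L2 normalization of the updated weights
w_{ij^{*}}^{\mathrm{new}} \leftarrow
\frac{w_{ij^{*}}^{\mathrm{new}}}{\sqrt{\sum_{i=1}^{n} \left( w_{ij^{*}}^{\mathrm{new}} \right)^{2}}}
```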

[0154]
This connection weight represents the so-called strength of the memory. Here, the learning rate α is a parameter expressing the relationship between the number of presentations and the memory. The larger the learning rate α, the more the weights change in a single storage operation. For example, if α = 0.5 is used, a pattern is not forgotten once it has been stored, and the next time a similar pattern is presented, the stored pattern can be recalled almost without fail.
[0155]
Further, the more the data is presented and stored, the larger the network connection value (weight) becomes. This indicates that the memory becomes stronger as the same pattern is input many times, statistical learning is possible, and long-term storage with less influence of noise in a real environment can be realized.
[0156]
Also, when a new pattern is input and stored, a new neuron in the competitive layer fires, so that the connections with this new neuron are strengthened while the connections with neurons holding previous memories are not weakened. In other words, the associative memory based on the competitive neural network allows additional learning and is free from the problem of "forgetting".
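The memory mode described above can be sketched as follows. This is a minimal illustration under stated assumptions: the equations are not reproduced in this text, so the update is assumed to take the standard Kohonen form followed by L2 normalization. The function names are hypothetical, and the weights are held as `W[j][i]` (one row per competitive neuron), which is only a storage choice for the example. Because non-firing inputs are -1, weights in this sketch can leave the range 0 to 1 after an update.

```python
import math


def activations(W, x):
    """Weighted sum of the inputs for every competitive neuron (one row of W each)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def winner(W, x):
    """Index of the competitive neuron with the largest weighted sum."""
    ys = activations(W, x)
    return max(range(len(ys)), key=ys.__getitem__)


def store(W, x, alpha=0.5):
    """One storage step: pick the winner, pull its weights toward x
    (assumed Kohonen-style update), then L2-normalize that row.
    Only the winner's row is modified; returns the winner index."""
    j = winner(W, x)
    row = [w + alpha * (xi - w) for w, xi in zip(W[j], x)]
    norm = math.sqrt(sum(w * w for w in row)) or 1.0
    W[j] = [w / norm for w in row]
    return j
```

Because only the winner's row changes, storing a second pattern that is won by a different neuron leaves the weights of the first pattern untouched, which is the additional-learning property described above.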
[0157]
Recall mode:
Suppose that the following input pattern vector is presented to the associative memory system shown in FIG. The input pattern need not be complete; part of it may be missing.
[0158]
(Equation 5)

[0159]
Here, the input vector may consist of prototype IDs, or of likelihoods or probabilities for those prototype IDs. The value of output neuron y_j for input x_i is calculated as follows.
[0160]
(Equation 6)

[0161]
The above expression can be regarded as giving the likelihood of the firing value of each competitive neuron in accordance with the likelihood of each channel. The important point is that the likelihoods input from multiple channels can be combined to obtain an overall likelihood. In the present embodiment, only one item is recalled, namely the one with the highest likelihood, and the neuron that wins the competition is obtained by the following equation.
[0162]
(Equation 7)

[0163]
Since the index of the winning competitive neuron Y corresponds to the number of the stored symbol, the input pattern X can be recalled by the inverse matrix operation of W, as in the following equation.
[0164]
(Equation 8)
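Recall from an incomplete input can be sketched as follows. Equations 5 to 8 are not reproduced in this text, so this is a sketch under assumptions: a missing channel is represented by 0, the winner is the neuron with the largest weighted sum, and the stored pattern is read back from the winner's weight row (which is what multiplying a one-hot Y by the transpose of W yields). The function name and the example weights are hypothetical.

```python
def recall(W, x):
    """Return (winner index, recalled pattern) for a possibly incomplete input x."""
    y = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    j = max(range(len(y)), key=lambda k: y[k])
    # Reading the winner's weight row and taking signs restores the 1 / -1 pattern.
    return j, [1 if w > 0 else -1 for w in W[j]]

# Weights after two patterns have been stored (one row per competitive neuron).
W = [[0.5, -0.5, 0.5, -0.5],
     [-0.5, 0.5, -0.5, 0.5]]
partial = [1, 0, 0, -1]   # 0 marks a missing channel (assumption of this sketch)
j, pattern = recall(W, partial)
```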

[0165]
Further, by assigning symbols such as episodes and action IDs to the input layer neurons of the competitive neural network shown in FIG. 12, declarative knowledge storage and procedural knowledge storage can be realized by the associative memory architecture.
[0166]
D. Situation-dependent behavior control
The situation-dependent behavior hierarchy (SituatedBehaviorsLayer) 108 controls behavior suited to the situation in which the robot apparatus 1 is currently placed, based on the contents of the short-term storage unit 105 and the long-term storage unit 106 and on the internal state managed by the internal state management unit 104. The situation-dependent behavior hierarchy 108 also includes, as one of its parts, a reflex behavior unit 109 that executes reflexive, direct body motion in response to a recognized external stimulus.
[0167]
D-1. Structure of the situation-dependent behavior hierarchy
In the present embodiment, the situation-dependent behavior hierarchy 108 prepares a state machine for each behavior module, classifies the recognition results of external information input from the sensors depending on the preceding behavior and situation, and expresses behavior on the body. A behavior module is described as a schema having a Monitor function, which determines the situation according to external stimuli and changes in the internal state, and an Action function, which realizes the state transitions (state machine) accompanying execution of the behavior. The situation-dependent behavior hierarchy 108 is configured as a tree structure in which multiple schemas are hierarchically connected (described later).
[0168]
The situation-dependent behavior hierarchy 108 also realizes behavior that keeps the internal state within a certain range (also called "homeostasis behavior"). When the internal state leaves the specified range, behavior that returns the internal state to within that range is activated so that it is more readily expressed (in practice, behavior is selected taking both the internal state and the external environment into account).
[0169]
Each functional module in the behavior control system 100 of the robot 1 shown in FIG. 3 is configured as an object. Objects can transfer data and invoke one another by means of message communication and an inter-object communication method that uses shared memory. FIG. 13 schematically illustrates the object configuration of the behavior control system 100 according to the present embodiment.
[0170]
The visual recognition function unit 101 is composed of three objects, “FaceDetector”, “MultiColorTracker”, and “FaceIdentify”.
[0171]
FaceDetector is an object that detects a face area within an image frame and outputs the detection result to FaceIdentify. MultiColorTracker is an object that performs color recognition and outputs the recognition result to FaceIdentify and ShortTermMemory (the object constituting the short-term storage unit 105). FaceIdentify identifies a person, for example by searching a person dictionary that it holds for the detected face image, and outputs the person's ID information together with the position and size of the face image area to ShortTermMemory.
[0172]
The auditory recognition function unit 102 is composed of two objects “AudioRecog” and “SpeechRecog”. AudioRecog is an object that receives voice data from a voice input device such as a microphone and performs feature extraction and voice section detection, and outputs the feature amount and sound source direction of voice data in a voice section to SpeechRecog and ShortTermMemory. The SpeechRecog is an object that performs speech recognition using the speech feature quantity received from the AudioRecog, the speech dictionary, and the syntax dictionary, and outputs a set of recognized words to the ShortTermMemory.
[0173]
The tactile recognition storage unit 103 is configured by an object called “TactileSensor” that recognizes a sensor input from a contact sensor, and outputs a recognition result to a ShortTermMemory or an InternalStateModel (ISM) that is an object for managing an internal state.
[0174]
ShortTermMemory (STM) is an object constituting the short-term storage unit 105. It is a functional module that holds, for a short period, the targets and events recognized from the external environment by the recognition system objects described above (for example, it stores an input image from the camera 15 only for a short period of about 15 seconds), and it periodically notifies (Notify) its STM client, SituatedBehaviorsLayer, of external stimuli.
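As a minimal sketch of the retention and notification behavior just described, the following class keeps each recognized item for a fixed retention period and passes the surviving items to its clients on each periodic notification. The class name, method names, and the use of explicit timestamps are hypothetical and are not taken from the embodiment.

```python
class ShortTermMemorySketch:
    def __init__(self, retention=15.0):
        self.retention = retention   # seconds an item is kept
        self.items = []              # list of (timestamp, stimulus)
        self.clients = []            # callables that receive the current stimuli

    def store(self, stimulus, now):
        self.items.append((now, stimulus))

    def notify(self, now):
        """Drop expired items, then pass the remaining stimuli to every client."""
        self.items = [(t, s) for t, s in self.items if now - t <= self.retention]
        current = [s for _, s in self.items]
        for client in self.clients:
            client(current)
        return current

stm = ShortTermMemorySketch(retention=15.0)
received = []
stm.clients.append(received.append)   # stands in for the STM client
stm.store("face", now=0.0)
stm.store("ball", now=10.0)
first = stm.notify(now=12.0)    # both items are still held
second = stm.notify(now=20.0)   # "face" is older than 15 s and is dropped
```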
[0175]
The LongTermMemory (LTM) is an object constituting the long-term storage unit 106, and is used to hold information obtained by learning, such as the name of an object, for a long time. LongTermMemory can, for example, associatively store a change in an internal state from an external stimulus in a certain behavior module.
[0176]
InternalStatusManager (ISM) is an object constituting the internal state management unit 104. It manages several kinds of emotion, such as instincts and feelings, as mathematical models, and manages the internal state of the robot apparatus 1, such as its instincts and emotions, according to the external stimuli (ES: External Stimula) recognized by the recognition system objects described above.
[0177]
SituatedBehaviorsLayer (SBL) is an object constituting the situation-dependent behavior hierarchy 108. SBL is a client of ShortTermMemory (an STM client): on periodically receiving a notification (Notify) of information about external stimuli (targets and events) from ShortTermMemory, it determines the schema, that is, the behavior module, to be executed (described later).
[0178]
ReflexiveSituatedBehaviorsLayer is an object constituting the reflex behavior unit 109, and executes reflexive, direct body motion in response to external stimuli recognized by the recognition system objects described above. For example, it tracks a human face, nods, or immediately avoids a detected obstacle (described later).
[0179]
SituatedBehaviorsLayer selects behavior according to the situation, such as an external stimulus or a change in the internal state, whereas ReflexiveSituatedBehaviorsLayer acts reflexively in response to an external stimulus. Because these two objects select behaviors independently, the behavior modules (schemas) they select may compete for the hardware resources of the robot 1 when executed on the body, so that the behaviors cannot all be realized. The object called ResourceManager arbitrates such hardware conflicts when SituatedBehaviorsLayer and ReflexiveSituatedBehaviorsLayer select behaviors. The body is then driven by notifying the objects that realize body motion on the basis of the arbitration result.
[0180]
SoundPerformer, MotionController, and LedController are the objects that realize body motion. SoundPerformer is an object for voice output: it performs speech synthesis according to a text command given from SituatedBehaviorsLayer via ResourceManager and outputs the voice from a speaker on the body of the robot 1. MotionController is an object for operating the joint actuators on the body: on receiving a command to move a hand, a leg, or the like from SituatedBehaviorsLayer via ResourceManager, it calculates the corresponding joint angles. LedController is an object for blinking the LED 19: it drives the LED 19 to blink on receiving a command from SituatedBehaviorsLayer via ResourceManager.
[0181]
FIG. 14 schematically shows a form of the situation-dependent behavior control by the situation-dependent behavior hierarchy (SBL) 108 (including the reflex behavior unit 109). The recognition result of the external environment by the recognition systems 101 to 103 is given to the situation-dependent behavior hierarchy 108 (including the reflex behavior unit 109) as an external stimulus. Further, a change in the internal state according to the recognition result of the external environment by the recognition system is also given to the situation-dependent behavior hierarchy 108. Then, the situation-dependent behavior hierarchy 108 can determine a situation according to an external stimulus or a change in the internal state, and implement the behavior selection.
[0182]
FIG. 15 shows a basic operation example of behavior control by the situation-dependent behavior hierarchy 108 shown in FIG. As shown in the figure, the situation-dependent behavior hierarchy 108 (SBL) calculates the activity level of each behavior module (schema) from the external stimulus and the change in the internal state, selects a schema according to its activity level, and executes the behavior. The activity level can be calculated by a uniform procedure for all schemas, for example by using a library (the same applies below). For example, the schema with the highest activity level may be selected, or two or more schemas whose activity levels exceed a predetermined threshold may be selected and executed in parallel (provided that there is no hardware resource conflict between the schemas executed in parallel).
[0183]
FIG. 16 shows an operation example in which reflex behavior is performed by the situation-dependent behavior hierarchy 108 shown in FIG. In this case, as shown in the figure, the reflex behavior unit 109 (ReflexiveSBL) included in the situation-dependent behavior hierarchy 108 calculates the activity level using the external stimulus recognized by each recognition system object directly as input, selects a schema according to its activity level, and executes the behavior. In this case, changes in the internal state are not used to calculate the activity level.
[0184]
FIG. 17 shows an operation example in which emotion is expressed by the situation-dependent behavior hierarchy 108 shown in FIG. The internal state management unit 104 manages emotions such as instincts and feelings as mathematical models and, when the state value of an emotion parameter reaches a specified value, notifies (Notify) the situation-dependent behavior hierarchy 108 of the change in the internal state. The situation-dependent behavior hierarchy 108 calculates the activity level using the change in the internal state as input, selects a schema according to its activity level, and executes the behavior. In this case, the external stimuli recognized by the recognition system objects are used to manage and update the internal state in the internal state management unit 104 (ISM), but are not used to calculate the activity level of the schemas.
[0185]
D-2. Schema
The situation-dependent behavior hierarchy 108 prepares a state machine for each behavior module, classifies the recognition results of external information input from the sensors depending on the preceding behavior and situation, and expresses behavior on the body. A behavior module is described as a schema that has an Action function, which describes body motion and realizes the state transitions (state machine) accompanying execution of the behavior, and a Monitor function, which evaluates execution of the behavior described by the Action function according to the external stimulus and internal state and determines the situation. FIG. 18 schematically shows the situation-dependent behavior hierarchy 108 made up of multiple schemas.
[0186]
The situation-dependent behavior hierarchy 108 (more strictly, the layer within it that controls normal situation-dependent behavior) is configured as a tree structure in which multiple schemas are hierarchically connected, and it controls behavior by determining the most suitable schema in an integrated manner according to external stimuli and changes in the internal state. The tree includes multiple subtrees (or branches), such as behavior models that formalize ethological situation-dependent behavior and subtrees for executing emotional expression.
[0187]
FIG. 19 schematically shows the tree structure of the schemas in the situation-dependent behavior hierarchy 108. As shown in the figure, the situation-dependent behavior hierarchy 108 starts from a root schema that receives notifications (Notify) of external stimuli from the short-term storage unit 105, and schemas are arranged layer by layer from abstract behavior categories toward specific behavior categories. For example, the layer immediately below the root schema contains the schemas "Investigate" (searching), "Ingestive" (eating and drinking), and "Play". Below "Investigate" are schemas describing more specific searching behavior, such as "InvestigativeLocomotion", "HeadinAirSniffing", and "InvestigativeSniffing". Similarly, below "Ingestive" are schemas describing more specific eating and drinking behavior, such as "Eat" and "Drink", and below "Play" are schemas describing more specific playing behavior, such as "PlayBowing", "PlayGreeting", and "PlayPawing".
[0188]
As shown, each schema inputs an external stimulus and an internal state. Each schema has at least a Monitor function and an Action function.
[0189]
FIG. 20 schematically shows the internal structure of a schema. As shown in the figure, a schema comprises an Action function that describes body motion in the form of state transitions (a state machine); a Monitor function that evaluates each state of the Action function according to the external stimulus and internal state and returns the result as an activity level value; and a state management unit that stores and manages the state of the schema, treating the state machine of the Action function as being in one of the states READY (prepared), ACTIVE (active), and SLEEP (standby).
[0190]
The Monitor function calculates the activity level (Activation Level: AL value) of the schema according to the external stimulus and internal state. When the tree structure shown in FIG. 19 is configured, an upper (parent) schema can call the Monitor function of a lower (child) schema with the external stimulus and internal state as arguments, and the child schema returns its AL value. A schema can also call the Monitor functions of its child schemas in order to calculate its own AL value. Because the AL values from each subtree are returned to the root schema, the optimal schema, that is, the behavior, for the external stimulus and the change in the internal state can be determined in an integrated manner.
[0191]
For example, the schema with the highest AL value may be selected, or two or more schemas whose AL values exceed a predetermined threshold may be selected and executed in parallel (provided that there is no hardware resource conflict between the schemas executed in parallel).
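The recursive use of the Monitor function in a schema tree can be sketched as follows. This is an illustration under assumptions, not the embodiment's implementation: the class name, the leaf evaluation functions, the "Feeding" intermediate node, and the rule that a parent reports the highest AL value among its children are all hypothetical.

```python
class SchemaNode:
    def __init__(self, name, children=(), evaluate=None):
        self.name = name
        self.children = list(children)
        self.evaluate = evaluate   # leaf only: (stimulus, state) -> AL value

    def monitor(self, stimulus, state):
        """Return (AL value, name of the best leaf schema) for this subtree."""
        if not self.children:
            return self.evaluate(stimulus, state), self.name
        # Assumption: a parent reports the highest AL value among its children.
        return max((c.monitor(stimulus, state) for c in self.children),
                   key=lambda result: result[0])

eat = SchemaNode("Eat", evaluate=lambda s, st: st["hunger"] if "food" in s else 0.0)
play = SchemaNode("Play", evaluate=lambda s, st: st["boredom"])
root = SchemaNode("Root", [SchemaNode("Feeding", [eat]), play])

state = {"hunger": 0.8, "boredom": 0.3}
with_food = root.monitor({"food"}, state)   # external stimulus includes food
without_food = root.monitor(set(), state)   # no food is visible
```

The root receives one value per subtree and can therefore compare behaviors that sit in different branches.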
[0192]
FIG. 21 schematically illustrates the internal configuration of the Monitor function. As shown in the figure, the Monitor function includes a behavior induction evaluation value calculator, which calculates as the activity level an evaluation value for inducing the behavior described in the schema, and a used resource calculator, which identifies the body resources to be used. In the example shown in FIG. 20, when the Monitor function is called from the behavior state control unit (tentative name) that manages the schemas, that is, the behavior modules, it virtually executes the state machine of the Action function, calculates the behavior induction evaluation value (that is, the activity level) and the resources to be used, and returns them.
[0193]
The Action function includes a state machine (described later) that describes the behavior of the schema itself. When the tree structure shown in FIG. 19 is configured, a parent schema can call the Action function to start or interrupt execution of a child schema. In this embodiment, the state machine of the Action function is not initialized unless the schema becomes READY. In other words, the state is not reset when the schema is interrupted, and the schema keeps the work data of the execution in progress, so execution can be resumed after an interruption (described later).
[0194]
In the example shown in FIG. 20, the behavior state control unit (tentative name) that manages the schemas, that is, the behavior modules, selects the behavior to be executed based on the return values from the Monitor functions, and either calls the Action function of the corresponding schema or instructs a transition of the schema state stored in the state management unit. For example, it selects the schema with the highest activity level as the behavior induction evaluation value, or selects multiple schemas in priority order so that their resources do not conflict. The behavior state control unit also controls schema states: for example, when a higher-priority schema is activated and a resource conflict occurs, it moves the lower-priority schema from ACTIVE to SLEEP, and restores it to ACTIVE once the conflict is resolved.
[0195]
As shown in FIG. 22, a single behavior state control unit may be provided in the situation-dependent behavior hierarchy 108 to centrally manage all the schemas constituting the hierarchy 108.
[0196]
In the illustrated example, the behavior state control unit includes a behavior evaluation unit, a behavior selection unit, and a behavior execution unit. The behavior evaluation unit calls the Monitor function of each schema at a predetermined control cycle, for example, to acquire each activity level and used resources. The action selection unit performs action control and management of machine resources using each schema. For example, schemas are selected in descending order of the aggregated activity level, and two or more schemas are simultaneously selected so that resources used do not conflict. The action execution unit issues an action execution command to the Action function of the selected schema, and manages the state of the schema (READY, ACTIVE, SLEEP) to control the execution of the schema. For example, when a schema with a higher priority is activated and a resource conflict occurs, the state of a schema with a lower priority is saved from ACTIVE to SLEEP, and when the conflict is resolved, the schema is restored to ACTIVE.
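The selection and state handling described above can be sketched as follows, assuming a simple greedy pass: schemas are visited in descending order of activity level, a schema is selected only if its resources are still free, and an ACTIVE schema that loses its resources is moved to SLEEP until it is selected again. The function names, schema names, and resource names are hypothetical.

```python
def select(candidates):
    """candidates: list of (name, activity_level, resources). Greedy, highest level first."""
    chosen, used = [], set()
    for name, level, resources in sorted(candidates, key=lambda c: c[1], reverse=True):
        if used.isdisjoint(resources):   # no conflict with schemas already chosen
            chosen.append(name)
            used |= resources
    return chosen

def update_states(states, candidates):
    """Selected schemas become ACTIVE; displaced ACTIVE schemas go to SLEEP."""
    chosen = set(select(candidates))
    for name, _, _ in candidates:
        if name in chosen:
            states[name] = "ACTIVE"
        elif states.get(name) == "ACTIVE":
            states[name] = "SLEEP"
    return states

states = {"Talk": "ACTIVE", "Track": "READY", "Blink": "READY"}
cycle1 = [("Talk", 0.5, {"speaker", "head"}), ("Track", 0.9, {"head"}), ("Blink", 0.2, {"led"})]
update_states(states, cycle1)   # Track takes the head, so Talk is put to SLEEP
```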
[0197]
Alternatively, the functions of the behavior state control unit may be provided in each schema in the situation-dependent behavior hierarchy 108. For example, when the schemas form a tree structure as shown in FIG. 19 (see FIG. 23), the behavior state control unit of an upper (parent) schema calls the Monitor function of a lower (child) schema with the external stimulus and internal state as arguments, and receives the activity level and the resources to be used as return values from the child schema. The child schema in turn calls the Monitor functions of its own child schemas to calculate its activity level and resources. Because the activity levels and resources from each subtree are returned to the behavior state control unit of the root schema, the optimal schema, that is, the behavior, for the external stimulus and the change in the internal state can be determined in an integrated manner, and the Action function of that schema is called to start or interrupt execution of the child schema.
[0198]
FIG. 24 schematically shows the mechanism for controlling normal situation-dependent behavior in the situation-dependent behavior hierarchy 108.
[0199]
As shown in the figure, the situation-dependent behavior hierarchy 108 receives external stimuli as input (Notify) from the short-term storage unit 105 and changes in the internal state from the internal state management unit 104. The situation-dependent behavior hierarchy 108 consists of multiple subtrees, such as behavior models that formalize ethological situation-dependent behavior and subtrees for executing emotional expression. In response to the notification (Notify) of an external stimulus, the root schema calls the Monitor function of each subtree, refers to the activity levels (AL values) returned, makes an integrated behavior selection, and calls the Action function of the subtree that realizes the selected behavior. The situation-dependent behavior decided in the situation-dependent behavior hierarchy 108 is applied to the body motion (MotionController) after the resource manager arbitrates hardware resource conflicts with the reflex behavior from the reflex behavior unit 109.
[0200]
Within the situation-dependent behavior hierarchy 108, the reflex behavior unit 109 executes reflexive, direct body motion in response to external stimuli recognized by the recognition system objects described above (for example, immediately avoiding an obstacle on detecting it). For this reason, unlike the case of controlling normal situation-dependent behavior (FIG. 19), the schemas that directly take the signals from the recognition system objects as input are arranged in parallel rather than hierarchically.
[0201]
FIG. 25 schematically shows the configuration of the schemas in the reflex behavior unit 109. As shown in the figure, the reflex behavior unit 109 provides the following schemas in equivalent positions (in parallel): "AvoidBigSound", "FacetoBigSound", and "NoddingSound" as schemas that operate in response to recognition results of the auditory system; "FacetoMovingObject" and "AvoidMovingObject" as schemas that operate in response to recognition results of the visual system; and "Retract hand" as a schema that operates in response to recognition results of the tactile system.
[0202]
As shown, each schema that performs reflexive behavior has an external stimulus as input. Each schema has at least a monitor function and an action function. The monitor function calculates the AL value of the schema in response to an external stimulus, and determines whether or not the corresponding reflexive behavior should be expressed accordingly. The action function has a state machine (described later) that describes the reflexive behavior of the schema itself. When called, the action function expresses the reflexive behavior and transitions the state of the action.
[0203]
FIG. 26 schematically shows a mechanism for controlling the reflexive behavior in the reflexive behavior unit 109.
[0204]
As shown in FIG. 25, schemas describing reaction behavior and schemas describing immediate response behavior exist in parallel in the reflex behavior unit 109. When a recognition result is input from a recognition system object, the corresponding reflex behavior schema calculates an AL value with its Monitor function, and whether to activate the behavior is decided according to that value. The reflex behavior that the reflex behavior unit 109 decides to activate is applied to the body motion (MotionController) after the resource manager arbitrates hardware resource conflicts with the situation-dependent behavior from the situation-dependent behavior hierarchy 108.
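The flat, parallel arrangement of reflex schemas can be sketched as follows. Each schema is reduced to a Monitor-like function that maps an external stimulus directly to an AL value, with no tree and no internal state. The stimulus keys, the AL formulas, and the fixed threshold are assumptions made for illustration only.

```python
# One Monitor-like function per reflex schema, all at the same level.
REFLEXES = {
    "AvoidBigSound": lambda s: s.get("sound_level", 0.0),
    "AvoidMovingObject": lambda s: s.get("object_speed", 0.0),
    "Retract hand": lambda s: 1.0 if s.get("hand_touched") else 0.0,
}

def triggered(stimulus, threshold=0.7):
    """Names of the reflex schemas whose AL value reaches the threshold."""
    return [name for name, monitor in REFLEXES.items()
            if monitor(stimulus) >= threshold]

loud = triggered({"sound_level": 0.9})
busy = triggered({"object_speed": 0.8, "hand_touched": True})
```

When more than one reflex is triggered, as in the second call, the resulting requests would still have to pass through resource arbitration before reaching the body.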
[0205]
The schemas constituting the situation-dependent behavior hierarchy 108 (including the reflex behavior unit 109) can be described, for example, as "class objects" written on a C++ language basis. FIG. 27 schematically shows the class definition of the schemas used in the situation-dependent behavior hierarchy 108. Each block shown in the figure corresponds to one class object.
[0206]
As illustrated, the situation-dependent behavior hierarchy (SBL) 108 includes one or more schemas, an EventDataHandler (EDH) that allocates IDs to SBL input/output events, a SchemaHandler (SH) that manages the schemas in the SBL, one or more ReceiveDataHandler (RDH) objects that receive data from external objects (STM, LTM, the resource manager, the recognition system objects, and so on), and one or more SendDataHandler (SDH) objects that transmit data to external objects.
[0207]
EventDataHandler (EDH) is a class object for allocating IDs to SBL input / output events, and receives notification of input / output events from RDH and SDH.
[0208]
SchemaHandler stores, as a file, information (SBL configuration information) such as the schemas and tree structure constituting the situation-dependent behavior hierarchy (SBL) 108 and the reflex behavior unit 109. For example, at system startup, SchemaHandler reads this configuration information file, constructs (reproduces) the schema configuration of the situation-dependent behavior hierarchy 108 as shown in FIG. 19, and maps the entity of each schema onto the memory space.
[0209]
Each schema has an OpenR_Guest positioned as the base of the schema. OpenR_Guest includes one or more DSubject class objects, through which the schema transmits data to the outside, and one or more DObject class objects, through which the schema receives data from the outside. For example, when the schema sends data to an object outside the SBL (STM, LTM, the recognition system objects, and so on), DSubject writes the transmission data to SendDataHandler. DObject can read data received from an object outside the SBL from ReceiveDataHandler.
[0210]
SchemaManager and SchemaBase are both class objects that inherit from OpenR_Guest. Class inheritance means taking over the definition of the original class; here it means that class objects such as DSubject and DObject defined in OpenR_Guest are also available in SchemaManager and SchemaBase (the same applies below). For example, when multiple schemas form a tree structure as shown in FIG. 19, SchemaManager has a class object SchemaList that manages a list of child schemas (it holds pointers to the child schemas) and can call functions of the child schemas. SchemaBase holds a pointer to the parent schema and can return the return value of a function called from the parent schema.
[0211]
SchemaBase has two class objects: StateMachine and Pronome. StateMachine manages a state machine for the behavior (Action function) of the schema. FIG. 28 illustrates a state machine for the behavior (Action function) of the schema. An action is linked to each transition between the states of the state machine.
[0212]
A parent schema can switch (cause a state transition in) the state machine of the Action function of a child schema. The target on which the schema executes or to which it applies its behavior (Action function) is assigned to Pronome. As described later, the schema is occupied by the target assigned to Pronome and is not released until the behavior ends (completion, abnormal termination, and so on). To execute the same behavior for a new target, a schema with the same class definition is created in the memory space. As a result, the same schema can be executed independently for each target (the work data of the individual schemas do not interfere with one another), which ensures the reentrance property of the behavior (described later).
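The per-target instantiation described above can be sketched as follows: one instance of the same class is created for each target, so each instance keeps its own work data. The class name, the registry dictionary, and the helper function are hypothetical.

```python
class DialogSchema:
    """One instance per target; `pronome` holds the target it is occupied by."""
    def __init__(self, target):
        self.pronome = target
        self.work = []            # work data private to this instance

    def action(self, utterance):
        self.work.append(utterance)
        return f"{self.pronome}: {utterance}"

instances = {}

def schema_for(target):
    """Reuse the instance bound to a target, or create one for a new target."""
    if target not in instances:
        instances[target] = DialogSchema(target)
    return instances[target]

a = schema_for("person A")
b = schema_for("person B")        # same class definition, separate instance
line = a.action("hello")
```

Because the work data lives in the instance, interrupting the dialog with person A to serve person B leaves A's progress untouched.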
[0213]
ParentSchemaBase is a class object that multiply inherits from SchemaManager and SchemaBase, and manages the parent schema and child schemas of the schema itself, that is, the parent-child relationships in the schema tree structure.
[0214]
IntermediaParentSchemaBase is a class object that inherits ParentSchemaBase, and implements interface conversion for each class. Also, the IntermediaParentSchemaBase has SchemaStatusInfo. This SchemaStatusInfo is a class object that manages the state machine of the schema itself.
[0215]
A parent schema can switch the state of a child schema's state machine by calling the Action function of the child schema. It can also call the Monitor function of the child schema to query the AL value corresponding to the state of the state machine. Note, however, that the state machine of the schema is distinct from the state machine of the Action function described above.
[0216]
FIG. 29 illustrates the state machine of the schema itself with respect to the behavior described by the Action function. As described above, this state machine defines three states for that behavior: READY (ready), ACTIVE (active), and SLEEP (waiting). When a higher-priority schema is activated and a resource conflict occurs, the lower-priority schema is moved from ACTIVE to SLEEP, and it is restored to ACTIVE once the conflict is resolved.
[0217]
As shown in FIG. 29, ACTIVE_TO_SLEEP is defined for the state transition from ACTIVE to SLEEP, and SLEEP_TO_ACTIVE is defined for the state transition from SLEEP to ACTIVE. The features of this embodiment are the following two points:
(1) ACTIVE_TO_SLEEP is linked with processing that saves the data (context) needed to return to the ACTIVE state and resume later, and with the behavior needed in order to go to SLEEP.
(2) SLEEP_TO_ACTIVE is linked with processing that restores the saved data (context), and with the behavior needed in order to return to ACTIVE.
The behavior needed in order to go to SLEEP is, for example, uttering a phrase that tells the other party about the pause, such as "Please wait a moment" (a gesture may be added). The behavior needed in order to return to ACTIVE is, for example, uttering a phrase that expresses appreciation to the other party, such as "Thank you for waiting" (a gesture may be added).
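The two transitions can be sketched as a small state holder that saves its context on ACTIVE_TO_SLEEP and restores it on SLEEP_TO_ACTIVE, recording the phrase that accompanies each transition. The class name, method names, and the wording of the resume phrase are hypothetical.

```python
class SchemaStateSketch:
    def __init__(self):
        self.state = "READY"
        self.context = None    # work data of the behavior in progress
        self.saved = None      # context kept while the schema sleeps
        self.spoken = []       # phrases uttered at transitions

    def activate(self, context):
        self.state, self.context = "ACTIVE", context

    def active_to_sleep(self):
        assert self.state == "ACTIVE"
        self.saved, self.context = dict(self.context), None   # save the context
        self.spoken.append("Please wait a moment")            # behavior before sleeping
        self.state = "SLEEP"

    def sleep_to_active(self):
        assert self.state == "SLEEP"
        self.context, self.saved = self.saved, None           # restore the context
        self.spoken.append("Thank you for waiting")           # hypothetical resume phrase
        self.state = "ACTIVE"

s = SchemaStateSketch()
s.activate({"step": 3})
s.active_to_sleep()
s.sleep_to_active()
```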
[0218]
AndParentSchema, NumOrParentSchema, and OrParentSchema are class objects that inherit from IntermediaParentSchemaBase. AndParentSchema has pointers to multiple child schemas that are executed simultaneously. OrParentSchema has pointers to multiple child schemas of which one is executed alternatively. NumOrParentSchema has pointers to multiple child schemas of which only a predetermined number are executed simultaneously.
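The difference between the three parent types can be sketched with plain functions that decide which children run. Choosing children by activity level is an assumption of this sketch; the text above states only how many children each parent type executes.

```python
def and_parent(levels):
    """All children are executed simultaneously."""
    return sorted(levels)

def or_parent(levels):
    """Exactly one child is executed (here: the one with the highest level)."""
    return [max(levels, key=levels.get)]

def num_or_parent(levels, n):
    """Only a predetermined number n of children are executed simultaneously."""
    return sorted(levels, key=levels.get, reverse=True)[:n]

levels = {"Walk": 0.4, "Wave": 0.9, "Speak": 0.6}   # hypothetical child AL values
all_children = and_parent(levels)
one_child = or_parent(levels)
two_children = num_or_parent(levels, 2)
```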
[0219]
ParentSchema is a class object that multiply inherits from AndParentSchema, NumOrParentSchema, and OrParentSchema.
[0220]
FIG. 30 schematically shows the functional configuration of the classes in the situation-dependent behavior hierarchy (SBL) 108.
[0221]
The situation-dependent behavior hierarchy (SBL) 108 includes one or more ReceiveDataHandler (RDH) objects for receiving data from external objects such as STM, LTM, the resource manager, and the recognition system objects, and one or more SendDataHandler (SDH) objects for transmitting data to external objects.
[0222]
EventDataHandler (EDH) is a class object for allocating IDs to SBL input / output events, and receives notification of input / output events from RDH and SDH.
[0223]
SchemaHandler is a class object for managing schemas, and stores the configuration information of the schemas constituting the SBL as a file. For example, when the system is started, the SchemaHandler reads this configuration information file and constructs the schema configuration in the SBL.
[0224]
Each schema is generated according to the class definition shown in FIG. 27, and entities are mapped on the memory space. Each schema uses OpenR_Guest as a base class object, and includes class objects such as DObject and DOObject for external data access.
[0225]
The functions and state machines that the schema mainly has are shown below.
[0226]
ActivationMonitor (): an evaluation function for making the schema Active when the schema is Ready.
Actions (): a state machine executed while the schema is Active.
Goal (): a function for evaluating whether the schema has reached its Goal while Active.
Fail (): a function for determining whether the schema is in a fail state while Active.
SleepActions (): a state machine executed before entering Sleep.
SleepMonitor (): an evaluation function for resuming while the schema is in Sleep.
ResumeActions (): a state machine executed before resuming.
DestroyMonitor (): an evaluation function for determining whether the schema is in a fail state while in Sleep.
MakePronome (): a function that determines the target of the entire tree.
[0227]
These functions are described in SchemaBase.
[0228]
FIG. 31 shows a processing procedure for executing the MakePronome function in the form of a flowchart.
[0229]
When the MakePronome function of the schema is called, first, it is determined whether or not a child schema exists in the schema itself (step S1).
[0230]
If a child schema exists, the MakePronome function of all child schemas is similarly recursively called (step S2).
[0231]
Then, MakePronome of the schema itself is executed, and the target is assigned to the Pronome object (step S3).
[0232]
As a result, the same target is assigned to the Pronomes of all the schemas below the schema, and the schema is not released until the action ends (complete, abnormal termination, etc.). To execute the same action for a new target, the same class definition schema is created in the memory space.
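The recursion of steps S1 to S3 can be illustrated with a minimal Python sketch. The `Schema` class, the `make_pronome` method, and the idea of passing the target in as an argument are assumptions made for illustration; the specification only states that child schemas are called recursively before the schema assigns the target to its own Pronome.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Schema:
    """Hypothetical stand-in for a schema node (not the actual SchemaBase)."""
    name: str
    children: List["Schema"] = field(default_factory=list)
    pronome: Optional[str] = None  # target held by this schema

    def make_pronome(self, target: str) -> None:
        # Steps S1-S2: if child schemas exist, call them recursively first.
        for child in self.children:
            child.make_pronome(target)
        # Step S3: assign the target to this schema's own Pronome.
        self.pronome = target


def collect_pronomes(schema: Schema) -> Dict[str, Optional[str]]:
    """Gather the Pronome of every schema in the subtree (helper for the demo)."""
    result = {schema.name: schema.pronome}
    for child in schema.children:
        result.update(collect_pronomes(child))
    return result


dialog = Schema("Dialog", [Schema("Listen"), Schema("Speak", [Schema("Gesture")])])
dialog.make_pronome("person A")
pronomes = collect_pronomes(dialog)
```

After the call, every schema below `dialog` holds the same target, which matches the behavior described above.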
[0233]
FIG. 32 shows a processing procedure for executing the Monitor function in the form of a flowchart.
[0234]
First, the evaluation flag (AssessmentFlag) is set to ON (step S11), and the action of the schema itself is executed (step S12). At this time, a child schema is also selected. Then, the evaluation flag is turned off (step S13).
[0235]
If a child schema exists (step S14), the Monitor function of the child schema selected in step S12 is recursively called (step S15).
[0236]
Next, the Monitor function of the schema itself is executed (Step S16), and the activity level and the resource used for executing the action are calculated (Step S17), and the result is set as the return value of the function.
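As a rough sketch of steps S11 to S17, the following Python code runs the schema's action in assessment mode to pick a child, recurses into that child, and then returns an activity level together with the resources required. The way the activity level is combined (own base value times the stimulus, plus the child's value) and the rule for choosing the child are purely illustrative assumptions; the specification does not give these formulas.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Schema:
    name: str
    base_al: float                                   # hypothetical own contribution
    resources: Set[str] = field(default_factory=set)
    children: List["Schema"] = field(default_factory=list)
    assessment_flag: bool = False
    selected_child: Optional["Schema"] = None

    def actions(self, stimulus: float) -> None:
        # In assessment mode no command is issued; only a child is selected.
        # Choosing the child with the largest base_al is an assumption.
        if self.assessment_flag and self.children:
            self.selected_child = max(self.children, key=lambda c: c.base_al)

    def monitor(self, stimulus: float) -> Tuple[float, Set[str]]:
        self.assessment_flag = True                  # step S11
        self.actions(stimulus)                       # step S12 (selects a child)
        self.assessment_flag = False                 # step S13
        child_al: float = 0.0
        child_res: Set[str] = set()
        if self.selected_child is not None:          # step S14
            child_al, child_res = self.selected_child.monitor(stimulus)  # step S15
        # Steps S16-S17: own evaluation, then activity level and resources.
        al = self.base_al * stimulus + child_al
        return al, self.resources | child_res


root = Schema("Soccer", 0.5, {"head"}, [
    Schema("Search", 0.2, {"legs"}),
    Schema("Kick", 0.4, {"legs", "arm"}),
])
al, res = root.monitor(1.0)
```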
[0237]
FIGS. 33 and 34 show, in the form of a flowchart, a processing procedure for executing the Actions function.
[0238]
First, it is checked whether or not the schema itself is in the STOPPING state (step S21), and then whether or not the schema should be put into the STOPPING state (step S22).
[0239]
If it is in a state to be STOPPING, it is further checked whether or not there is a child schema (step S23). If there is a child schema, the child schema is shifted to the GO_TO_STOP state (step S24), and the HaveToStopFlag is turned on (step S25).
[0240]
If the state is not the STOPPING state, it is checked whether the state is the RUNNING state (step S26).
[0241]
If it is not in the RUNNING state, it is checked whether there is any child schema (step S27). Then, when there is a child schema, the HaveToStopFlag is turned on (step S28).
[0242]
Next, the next own state is determined from the current system state, the HaveToRunFlag, the HaveToStopFlag, and the operation state of the child schema (step S29).
[0243]
Next, the Action function of the schema itself is executed (Step S30).
[0244]
Thereafter, it is checked whether the schema itself is in the GO_TO_STOP state (step S31). If it is not in the GO_TO_STOP state, it is further checked whether or not there is a child schema (step S32). If there is a child schema, it is checked whether there is a child schema in the GO_TO_STOP state (step S33).
[0245]
If there are child schemas in the GO_TO_STOP state, the Action functions of these schemas are executed (step S34).
[0246]
Next, it is checked whether there is any child schema in RUNNING (step S35). If there is no child schema in RUNNING, it is checked whether there is a stopped child schema (step S36), and the Action function of the stopped child schema is executed (step S37).
[0247]
Next, it is checked whether there is a child schema in the GO_TO_RUN state (step S38). If there is no child schema in the GO_TO_RUN state, it is checked whether there is a child schema in the GO_TO_STOP state (step S39), and if so, the Action function of this child schema is executed (step S40).
[0248]
Finally, the next state of itself is determined from the current system state, HaveToRunFlag, HaveToStopFlag, and the operation state of the child, and the entire processing routine ends (step S41).
[0249]
D-3. Functions of the contextual behavior hierarchy
The situation-dependent behavior hierarchy (Situated Behaviors Layer) 108 controls behavior suited to the situation in which the robot apparatus 1 is currently placed, based on the contents stored in the short-term storage unit 105 and the long-term storage unit 106 and on the internal state managed by the internal state management unit 104.
[0250]
As described in the previous section, the situation-dependent action hierarchy 108 according to the present embodiment is configured as a tree structure of schemas (see FIG. 19). Each schema maintains its independence while knowing the information about its own children and parent. With this schema configuration, the situation-dependent behavior hierarchy 108 has four main features: Concurrent evaluation, Concurrent execution, Preemption, and Reentrant. These features are described in detail below.
[0251]
(1) Concurrent evaluation:
It has already been mentioned that the schema as an action module has a Monitor function for making a situation judgment in response to an external stimulus or a change in the internal state. The Monitor function is implemented by the schema having a Monitor function in the class object SchemaBase. The Monitor function is a function that calculates an activity level (Activation Level: AL value) of the schema according to the external stimulus and the internal state.
[0252]
When the tree structure shown in FIG. 19 is configured, the upper (parent) schema can call the Monitor function of the lower (child) schema with the external stimulus and the internal state as arguments, and the child schema returns its AL value. A schema can also call the Monitor functions of its child schemas in order to calculate its own AL value. Since the AL value from each subtree is returned to the root schema, the optimal schema, that is, the optimal action for the current external stimulus and change in internal state, can be determined in an integrated manner.
[0253]
Because of the tree structure, the evaluation of each schema based on the external stimulus and the change in the internal state is first performed concurrently from the bottom of the tree structure toward the top. As shown in the flowchart of FIG. 32, when a schema has child schemas (step S14), the Monitor function of the selected child is called (step S15), and then the schema's own Monitor function is executed.
[0254]
Next, an execution permission as an evaluation result is passed from top to bottom of the tree structure. Evaluation and execution are performed while resolving competition for resources used by the action.
[0255]
The situation-dependent behavior hierarchy 108 according to the present embodiment can evaluate behavior in parallel using the tree structure of schemas, and can therefore adapt to situations such as external stimuli and internal states. In addition, evaluation is performed on the entire tree, and the tree is changed according to the activity level (AL) values calculated at that time, so the schema, that is, the action to be executed, can be dynamically prioritized.
[0256]
(2) Concurrent execution:
Since an AL value from each subtree is returned to the root schema, an optimal schema, that is, an action corresponding to a change in an external stimulus and an internal state, can be determined in an integrated manner. For example, the schema having the highest AL value may be selected, or two or more schemas whose AL values exceed a predetermined threshold may be selected and executed in parallel (assuming there is no hardware resource contention between them).
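The selection rule just described can be sketched as follows. The function name `select_schemas`, the candidate tuples, and the greedy highest-AL-first ordering are assumptions for illustration; the specification only states that the highest-AL schema, or several schemas above a threshold without hardware resource contention, may be selected.

```python
from typing import Iterable, List, Set, Tuple

# (schema name, AL value, hardware resources the schema would use)
Candidate = Tuple[str, float, Set[str]]


def select_schemas(candidates: Iterable[Candidate], threshold: float) -> List[str]:
    """Pick the highest-AL schema, plus any others above the threshold
    whose resources do not conflict with those already granted."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    chosen: List[str] = []
    used: Set[str] = set()
    for index, (name, al, resources) in enumerate(ranked):
        if index > 0 and al <= threshold:
            break  # every remaining candidate has an even lower AL value
        if used.isdisjoint(resources):
            chosen.append(name)
            used |= resources
    return chosen


candidates = [
    ("Dialog", 0.8, {"mouth", "head"}),
    ("Walk", 0.6, {"legs"}),
    ("LookAround", 0.7, {"head"}),
    ("Sit", 0.2, {"legs"}),
]
selected = select_schemas(candidates, threshold=0.5)
```

In this example `LookAround` exceeds the threshold but is skipped because it competes with `Dialog` for the head, while `Walk` runs in parallel on the legs.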
[0257]
A schema for which execution permission has been granted is executed. That is, the schema observes the external stimulus and the change in internal state in more detail and actually executes commands. Execution proceeds concurrently from the top of the tree structure toward the bottom. As shown in the flowcharts of FIGS. 33 and 34, if the schema has child schemas, the children's Actions functions are executed.
[0258]
The Action function includes a state machine (described later) that describes the behavior of the schema itself. When configuring the tree structure as shown in FIG. 19, the parent schema can call the Action function to start or interrupt the execution of the child schema.
[0259]
The context-dependent behavior hierarchy 108 according to the present embodiment uses the tree structure of schemas so that, when resources do not conflict, another schema that uses the surplus resources can be executed at the same time. However, unless the resources used up to the Goal are restricted, incoherent behavior may occur. The context-dependent behavior determined in the context-dependent behavior hierarchy 108 is applied to the body operation (Motion Controller) after the resource manager arbitrates hardware resource competition with the reflex behavior of the reflex behavior unit 109.
[0260]
(3) Preemption:
Even if a schema has been executed once, if there is a more important (higher priority) action, the schema must be interrupted and the execution right must be given to it. Also, when more important actions are completed (completed or stopped), it is necessary to resume the original schema and continue execution.
[0261]
Executing a task according to such a priority is similar to a function called Preemption of an OS (Operating System) in the computer world. The OS has a policy of sequentially executing tasks with higher priorities at a timing when the schedule is considered.
[0262]
On the other hand, since the behavior control system 100 of the robot 1 according to the present embodiment extends over a plurality of objects, arbitration between the objects is required. For example, Reflexive SBL, the object that controls reflex behavior, must avoid obstacles and keep balance without regard to the behavior evaluation of SBL, the object that controls higher-level context-dependent behavior. In this case the execution right is actually taken away and the reflex is executed, but the upper-level behavior module (SBL) is notified that the execution right has been taken, and the upper-level module retains its preemptive capability by handling that notification.
[0263]
Also, assume that execution of a certain schema has been permitted as a result of evaluating the AL values based on the external stimulus and the change in internal state in the context-dependent behavior layer 108, and that a subsequent evaluation of the AL values gives another schema higher importance. In such a case, the running schema is put into the Sleep state by using its Actions function and is suspended, so that behavior can be switched preemptively.
[0264]
The state of Actions () of the running schema is saved, and Actions () of a different schema is executed. After Actions () of the different schema has completed, Actions () of the suspended schema can be executed again.
[0265]
Also, Actions () of the schema being executed is interrupted, and SleepActions () is executed before the execution right is transferred to a different schema. For example, if the robot 1 finds a soccer ball during a conversation, it can say, "Wait a minute" and play soccer.
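The suspend-and-resume sequence can be sketched in Python as below. The `Executor` class, the `progress` counter used as the saved context, and the log strings are hypothetical; only the ordering (SleepActions before the execution right moves, ResumeActions before the suspended schema continues) follows the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Schema:
    name: str
    state: str = "READY"
    progress: int = 0  # hypothetical context that survives Sleep

    def actions(self) -> str:
        self.state = "ACTIVE"
        self.progress += 1
        return f"{self.name}: step {self.progress}"

    def sleep_actions(self) -> str:
        # Executed before the execution right is handed over.
        self.state = "SLEEP"
        return f"{self.name}: 'Wait a minute'"

    def resume_actions(self) -> str:
        # Executed before the suspended schema continues.
        self.state = "ACTIVE"
        return f"{self.name}: resuming after step {self.progress}"


class Executor:
    """Hypothetical holder of the execution right."""

    def __init__(self) -> None:
        self.running: Optional[Schema] = None
        self.sleeping: List[Schema] = []
        self.log: List[str] = []

    def start(self, schema: Schema) -> None:
        if self.running is not None:
            self.log.append(self.running.sleep_actions())
            self.sleeping.append(self.running)
        self.running = schema
        self.log.append(schema.actions())

    def finish(self) -> None:
        if self.running is not None:
            self.running.state = "READY"
        self.running = self.sleeping.pop() if self.sleeping else None
        if self.running is not None:
            self.log.append(self.running.resume_actions())
            self.log.append(self.running.actions())


executor = Executor()
dialog, soccer = Schema("Dialog"), Schema("Soccer")
executor.start(dialog)   # conversation begins
executor.start(soccer)   # a ball is found: Dialog sleeps, Soccer runs
executor.finish()        # Soccer ends: Dialog resumes where it stopped
```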
[0266]
(4) Reentrant:
Each schema constituting the context-dependent behavior hierarchy 108 is a kind of subroutine. When a schema is called from a plurality of parents, it is necessary to have a storage space corresponding to each parent in order to store its internal state.
[0267]
This is similar to the reentrant property of the OS in the computer world, and is referred to as schema reentrant property in this specification. As described with reference to FIG. 30, the schema is composed of class objects, and the reentrant property is realized by generating an entity, that is, an instance of the class object for each target (Pronome).
[0268]
The reentrancy of the schema will be described more specifically with reference to FIG.
[0269]
SchemaHandler is a class object for managing a schema, and stores configuration information of a schema constituting an SBL as a file. When the system is started, the SchemaHandler reads this configuration information file and constructs a schema configuration in the SBL. In the example illustrated in FIG. 31, it is assumed that schema entities defining behavior such as Eat and Dialog are mapped in the memory space.
[0270]
Here, suppose that, as a result of evaluating the activity level based on the external stimulus and the change in internal state, the target (Pronome) A is set for the schema Dialog, and Dialog carries on a dialogue with person A.
[0271]
Suppose that person B then interrupts the dialogue between the robot 1 and person A, and that, as a result of evaluating the activity level based on the external stimulus and the change in internal state, the schema for the dialogue with person B is given a higher priority.
[0272]
In such a case, the SchemaHandler maps another Dialog entity (instance) inheriting the class for interacting with B onto the memory space. Since the conversation with B is performed using another Dialog entity independently of the previous Dialog entity, the contents of the conversation with A are not destroyed. Therefore, Dialog A can maintain data consistency, and when the conversation with B ends, the conversation with A can be resumed from the point at which it was interrupted.
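A minimal sketch of this per-target instantiation is shown below. `InstanceRegistry` is a hypothetical simplification of the role the SchemaHandler plays here, and the `history` list stands in for whatever conversation state a Dialog entity actually keeps.

```python
from typing import Dict, List, Tuple, Type


class Dialog:
    """Hypothetical Dialog schema that keeps its own state per target."""

    def __init__(self, pronome: str) -> None:
        self.pronome = pronome
        self.history: List[str] = []

    def say(self, utterance: str) -> None:
        self.history.append(utterance)


class InstanceRegistry:
    """Creates one schema instance per (class, target) pair."""

    def __init__(self) -> None:
        self._instances: Dict[Tuple[Type, str], object] = {}

    def instance_for(self, schema_cls: Type, pronome: str):
        key = (schema_cls, pronome)
        if key not in self._instances:
            self._instances[key] = schema_cls(pronome)
        return self._instances[key]


registry = InstanceRegistry()
dialog_a = registry.instance_for(Dialog, "A")
dialog_a.say("Hello, A")
dialog_b = registry.instance_for(Dialog, "B")   # B interrupts
dialog_b.say("Hello, B")
resumed = registry.instance_for(Dialog, "A")    # back to A
```

Because each target has its own instance, the conversation with B leaves the state held for A untouched.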
[0273]
The schema in the Ready list is evaluated according to the object (external stimulus), that is, the AL value is calculated, and the execution right is handed over. After that, an instance of the schema moved in the Ready list is generated, and the other objects are evaluated. Thereby, the same schema can be set to the active or sleep state.
[0274]
E. Robot internal state management
In the robot behavior control system 100 according to the present embodiment, the situation-dependent behavior hierarchy 108 determines behavior based on the internal state and the external environment.
[0275]
The internal state of the robot device 1 is composed of several kinds of emotions, such as instinct and emotion, and is treated as a mathematical model. The internal state management unit (ISM: Internal Status Manager) 104 manages the internal state based on the external stimuli (ES: External Stimuli) recognized by each of the above-described recognition function units 101 to 103 and on the passage of time.
[0276]
E-1. Layering emotions
In the present embodiment, emotions are organized into a plurality of layers according to the significance of their existence, and operate at each layer. Which of the determined candidate operations is actually performed depends on the external environment and internal state at that time (described later). Actions are selected at each layer, but by expressing lower-order actions preferentially, instinctive actions such as reflexes and higher-order actions such as action selection using memory can be expressed consistently in a single individual.
[0277]
FIG. 36 schematically illustrates a hierarchical configuration of the internal state management unit 104 according to the present embodiment.
[0278]
As shown in the figure, the internal states handled by the internal state management unit 104 are roughly divided into primary emotions, which are necessary for the survival of the individual, such as instinct and desire, and secondary emotions, which change according to the degree of satisfaction (excess or deficiency) of the primary emotions. The primary emotions are further subdivided hierarchically, ranging from those that are more physiological with respect to the survival of the individual up to those arising from association.
[0279]
In the illustrated example, the primary emotions are divided, in order from low to high, into lower-order primary emotions, higher-order primary emotions, and primary emotions by association. The lower-order primary emotions correspond to access to the limbic system; they are generated so as to maintain homeostasis (preservation of the individual) and take priority when homeostasis is threatened. The higher-order primary emotions correspond to access to the neocortex of the cerebrum and relate to preservation of the species, such as intrinsic needs and social needs. Their degree of satisfaction varies with learning and environment (they are satisfied by learning and communication).
[0280]
Each layer of the primary emotions outputs a change amount ΔI of the primary emotion (instinct) level when the schema selected as the action is executed.
[0281]
The secondary emotion corresponds to so-called emotion (Emotion) and includes elements such as joy (Joy), sadness (Sad), anger (Anger), surprise (Surprise), disgust (Disgust), and fear (Fear). The change amount (satisfaction) ΔE of the secondary emotion is determined according to the change amount ΔI of the primary emotion.
[0282]
In the situation-dependent action hierarchy 108, action selection is mainly performed based on primary emotions, but when secondary emotions are strong, action selection based on secondary emotions can also be performed. Further, it is also possible to perform a modulation on an action selected based on the primary emotion using a parameter generated by the secondary emotion.
[0283]
In the emotion hierarchy for the survival of the individual, behavior by innate reflex is selected first. Next, an action that satisfies the lower-order primary emotions is selected. Then actions that satisfy the higher-order primary emotions are generated, followed by actions that satisfy the primary emotions by association, so that preservation of the individual is realized starting from the more primitive levels.
[0284]
At this time, the primary emotions of each layer can put pressure on the adjacent layer. When a layer's own index for selecting an action is strong, it can suppress the action determined in the adjacent layer and express its own action.
[0285]
As described in the preceding section D, the situation-dependent behavior hierarchy 108 is configured by a plurality of schemas having target operations (see FIGS. 18 and 19). In the situation-dependent action hierarchy 108, a schema, that is, an action, is selected using the activity level of each schema as an index. The overall activity level of a schema is determined by the activity level derived from the internal state and the activity level derived from the external situation. The schema holds an activity level every time the target operation is performed. "Generating an action that satisfies XX" means executing a schema whose final goal is an action that satisfies XX.
[0286]
The activity level of the internal state is determined by the sum total of the changes ΔE in the satisfaction of the secondary emotion, which are based on the change amounts ΔI of each layer of the primary emotions when the schema is executed. Here, suppose the primary emotions consist of three layers L1, L2, and L3, and the changes in the secondary emotion derived from each layer when the schema is selected are denoted ΔE_L1, ΔE_L2, and ΔE_L3. These are multiplied by the weighting factors w₁, w₂, and w₃, respectively, and the products are summed to calculate the activity level. Making the weighting factor for the lower-order primary emotions larger makes it easier to select actions that satisfy them. Further, by adjusting these weighting factors, the effect of each layer's primary emotions applying pressure to the adjacent layer (Concentration: behavior suppression) can be obtained.
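The weighted sum described here can be written as a short Python sketch. The function name and all numeric values are hypothetical; only the form of the calculation (each layer's ΔE multiplied by its weight and summed) comes from the text.

```python
from typing import Sequence


def internal_activity_level(delta_e: Sequence[float],
                            weights: Sequence[float]) -> float:
    """Weighted sum of the secondary-emotion changes derived from layers L1..L3."""
    if len(delta_e) != len(weights):
        raise ValueError("one weight is required per primary-emotion layer")
    return sum(w * de for w, de in zip(weights, delta_e))


# Hypothetical weights: the lower-order layer L1 is weighted most heavily.
weights = (3.0, 2.0, 1.0)

# Hypothetical changes (dE_L1, dE_L2, dE_L3) expected from two schemas.
sleep_schema = internal_activity_level((10.0, 0.0, 0.0), weights)
curiosity_schema = internal_activity_level((0.0, 12.0, 0.0), weights)
```

With these illustrative numbers the schema acting on the lower-order layer scores 30.0 against 24.0, even though its raw ΔE is smaller, which is the effect of enlarging the lower-order weight.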
[0287]
Here, an example of action selection using the hierarchical structure of emotions will be described. In the following, Sleep (sleepiness) is treated as a lower-order primary emotion, and Curiosity (curiosity) as a higher-order primary emotion.
[0288]
(1) It is assumed that Sleep, which is a lower-order primary emotion, is insufficient, and the activity level of a schema that satisfies Sleep is increasing. At this time, if the activity level of the other schema does not increase, the schema that satisfies Sleep executes itself until Sleep is satisfied.
[0289]
(2) Suppose that, before Sleep is satisfied, the higher-order primary emotion Curiosity becomes insufficient. Since Sleep is directly related to the preservation of the individual, the schema that satisfies Sleep continues to be executed until the activity level of Sleep falls below a certain value. Once Sleep is satisfied to some extent, a schema that satisfies Curiosity can be executed.
[0290]
(3) Suppose that a hand rapidly approaches the robot's face during execution of the schema that satisfies Curiosity. The robot recognizes through color recognition and size recognition that a skin-colored region is suddenly approaching and, as an innate reflex action, reflexively moves its face away from the hand, that is, pulls its head back. This reflex movement corresponds to the spinal reflex of an animal. Since the reflex is the lowest-order schema, the reflex schema is executed first.
[0291]
After the spinal reflex, an accompanying emotional change occurs. Whether to go on to execute an emotion expression schema is determined from the magnitude of that change and the activity levels of the other schemas. If the emotion expression schema is not executed, the schema that satisfies Curiosity continues.
[0292]
(4) Schemas of lower order than a given schema are normally more likely to be selected than that schema. Only when the schema's own activity level is extremely high can it suppress the lower-order schemas (Concentration) and execute itself until a certain value is reached. For example, when the Sleep deficit is severe, the schema that satisfies Sleep is preferentially executed until Sleep recovers to a certain value, even in a situation where the reflex action schema would otherwise act.
[0293]
E-2. Linkage with other function modules
FIG. 37 schematically illustrates a communication path between the internal state management unit 104 and another functional module.
[0294]
The short-term storage unit 105 outputs a recognition result from each of the recognition function units 101 to 103 for recognizing a change in the external environment to the internal state management unit 104 and the situation-dependent behavior hierarchy 108.
The internal state management unit 104 notifies the context-dependent behavior hierarchy 108 of the internal state. In turn, the context-dependent behavior hierarchy 108 returns information on the instinct and emotion that it has determined or recalled by association.
[0295]
The context-dependent behavior hierarchy 108 selects an action based on the activity level calculated from the internal state and the external environment, and notifies the internal state management unit 104, via the short-term storage unit 105, of the execution and completion of the selected action.
[0296]
The internal state management unit 104 outputs the internal state to the long-term storage unit 106 for each action. On the other hand, the long-term storage unit 106 returns the stored information.
[0297]
The biorhythm management unit supplies the biorhythm information to the internal state management unit 104.
[0298]
E-3. Changes in internal state over time
The index values of the internal state change over time. For example, the primary emotions (instincts) Hunger (hunger), Fatique (fatigue), and Sleep (sleepiness) change as time passes as follows.
[0299]
Hunger: the robot becomes hungrier (virtual value or remaining battery charge)
Fatique: fatigue accumulates
Sleep: drowsiness accumulates
[0300]
Further, in the present embodiment, Pleasantness (satisfaction), Activation (activity), and Certainty (certainty) are defined as elements of the robot's secondary emotion, that is, emotion (Emotion). These change as time passes as follows.
[0301]
Pleasantness: changing towards Neutral
Activation: Depends on biorhythm and sleep
Certainty: Depends on Attention
[0302]
FIG. 38 shows a mechanism by which the internal state management unit 104 changes the internal state with time.
[0303]
As shown in the figure, the biorhythm management unit notifies biorhythm information at regular intervals. In response, the internal state management unit 104 changes the value of each element of the primary emotion according to the biorhythm and also changes Activation (activity) as a secondary emotion. Each time there is a notification from the biorhythm management unit, the situation-dependent behavior layer 108 receives the index values of the internal state, such as instinct and emotion, from the internal state management unit 104, and by calculating the activity level of each schema from those values it can select an action (schema) that depends on the situation.
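The time-driven update can be pictured with the following Python sketch, in which each periodic notification advances the primary emotions and relaxes Pleasantness toward Neutral. The linear growth, the relaxation factor, the neutral value of 0.0, and the class itself are assumptions for illustration; the specification gives only the direction of each change, and Activation and Certainty are omitted because their dependence on biorhythm and Attention is not quantified.

```python
from dataclasses import dataclass

NEUTRAL = 0.0  # hypothetical neutral point for Pleasantness


@dataclass
class InternalState:
    hunger: float = 0.0
    fatique: float = 0.0        # spelling follows the identifier in the text
    sleep: float = 0.0
    pleasantness: float = 0.0

    def tick(self, rate: float = 1.0, relax: float = 0.5) -> None:
        """Apply one periodic notification (hypothetical update rule)."""
        # Primary emotions grow as time passes.
        self.hunger += rate
        self.fatique += rate
        self.sleep += rate
        # Pleasantness moves part of the way back toward Neutral.
        self.pleasantness += (NEUTRAL - self.pleasantness) * relax


state = InternalState(pleasantness=8.0)
state.tick()
state.tick()
```

After two notifications the three primary emotions have each risen by 2.0 and Pleasantness has fallen from 8.0 to 2.0 under these assumed parameters.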
[0304]
E-4. Changes in internal state due to operation execution
The internal state also changes when the robot executes an operation.
[0305]
For example, a schema that performs the action of “sleeping” has as its final goal an action that satisfies Sleep (sleepiness), a lower-order primary emotion. In the context-dependent behavior hierarchy 108, the activity level of each schema is calculated and compared based on Sleep as the primary emotion and Activation as the secondary emotion, and the “sleeping” schema is selected. As a result, the action of sleeping is realized.
[0306]
The situation-dependent behavior hierarchy 108 then reports the completion of the sleeping action to the internal state management unit 104 via the short-term storage unit 105. In response, the internal state management unit 104 changes the index value of Sleep, the primary emotion, to reflect the execution of the “sleep” action.
[0307]
Then, in the context-dependent behavior hierarchy 108, the activity level of each schema is calculated and compared again based on the degree of the satisfaction of the sleep and the activation as the secondary emotion. As a result, another schema having a higher priority is selected, and the process exits from the schema of sleeping.
[0308]
FIG. 39 shows a mechanism by which the internal state management unit 104 changes the internal state by executing the operation of the robot.
[0309]
The situation-dependent action hierarchy 108 notifies the internal state management unit 104, via the short-term storage unit 105, of the start and end of execution of the action selected according to the situation, together with Attention information.
[0310]
When notified of the execution completion of the selected action, the internal state management unit 104 checks the external environment obtained from the short-term storage unit 105 according to the Attention information, changes the index value of the instinct (Sleep) as a primary emotion, and changes the emotion as the secondary emotion accordingly. Then, the updated data of the internal state is output to the situation-dependent behavior hierarchy 108 and the long-term storage unit 106.
The situation-dependent behavior hierarchy 108 calculates the activity level of each schema based on the newly received index value of the internal state, and selects the next behavior (schema) depending on the situation.
[0311]
The long-term storage unit 106 updates the storage information based on the internal state update data, and notifies the internal state management unit 104 of the update content. The internal state management unit 104 determines a certainty factor (Certainty) as a secondary emotion based on the certainty factor of the external environment and the certainty factor of the long-term storage unit 106.
[0312]
E-5. Changes in internal state due to sensor information
The degree of operation when the robot executes the operation is recognized by each of the recognition function units 101 to 103, and is notified to the internal state management unit 104 via the short-term storage unit 105. The internal state management unit 104 can reflect the degree of the operation as, for example, a “fatigue” in the change of the primary emotion. In addition, in response to the change of the primary emotion, the secondary emotion can also be changed.
[0313]
FIG. 40 shows a mechanism by which the internal state management unit 104 changes the internal state according to the recognition result of the external environment.
[0314]
When receiving the recognition results from the respective recognition function units 101 to 103 via the short-term storage unit 105, the internal state management unit 104 changes the index value of the primary emotion and, accordingly, also changes the emotion as the secondary emotion. Then, the update data of the internal state is output to the situation-dependent behavior hierarchy 108.
[0315]
In the situation-dependent behavior hierarchy 108, the activity level of each schema is calculated based on the newly received index value of the internal state, and the next behavior (schema) depending on the situation can be selected.
[0316]
E-6. Changes in internal state due to association
As described above, the robot according to the present embodiment has an associative memory function in the long-term storage unit 106. The associative memory is a mechanism in which input patterns composed of a plurality of symbols are stored in advance as storage patterns, and the stored pattern similar to a given input pattern is recalled. With it, changes in the internal state can be stored in association with the external environment.
[0317]
For example, consider the case where the appearance of an apple causes an emotional change of "happy".
[0318]
When the apple is recognized by the visual recognition function unit 101, the situation-dependent behavior hierarchy 108 is notified via the short-term storage unit 105 as a change in the external environment.
[0319]
In the long-term storage unit 106, the associative memory related to “apple” can recall the behavior “eat (the apple)” and the change in internal state whereby eating satisfies the primary emotion (hunger) by an index value of 30.
[0320]
Upon receiving the storage information from the long-term storage unit 106, the situation-dependent behavior hierarchy 108 notifies the internal state management unit 104 of the internal state change ΔI = 30.
[0321]
The internal state management unit 104 can calculate the change amount ΔE of the secondary emotion based on the notified ΔI and obtain an index value of the secondary emotion E caused by eating an apple.
[0322]
FIG. 41 shows a mechanism by which the internal state management unit 104 changes the internal state by associative memory.
[0323]
The external environment is notified to the situation-dependent behavior hierarchy 108 via the short-term storage unit 105. The associative memory function of the long-term storage unit 106 recalls the action corresponding to the external environment and the change ΔI of the primary emotion.
[0324]
The situation-dependent behavior hierarchy 108 selects an action based on the stored information obtained from the associative memory, and notifies the internal state management unit 104 of the change ΔI of the primary emotion.
[0325]
The internal state management unit 104 calculates the change ΔE of the secondary emotion based on the notified change ΔI of the primary emotion and the index value of the primary emotion that it manages, and changes the secondary emotion accordingly. The newly generated primary emotion and secondary emotion are then output to the situation-dependent behavior hierarchy 108 as internal state update data.
[0326]
In the situation-dependent behavior hierarchy 108, the activity level of each schema is calculated based on the newly received index value of the internal state, and the next behavior (schema) depending on the situation can be selected.
[0327]
E-7. Changes in internal state due to innate behavior
As described above (see FIG. 39), the robot according to the present embodiment changes its internal state by executing an action. In that case, the action is selected based on the index values of the internal state, consisting of primary and secondary emotions, and the emotion is satisfied when execution of the action is completed. The robot according to the present embodiment also defines innate reflex behaviors that do not depend on emotion. A reflex behavior is selected directly in response to a change in the external environment, by a mechanism different from the internal state change that accompanies execution of a normal action.
[0328]
For example, consider the case where an innate reflex occurs when a large object suddenly appears.
[0329]
In such a case, for example, the recognition result (sensor information) of “large thing” by the visual recognition function unit 101 is directly input to the situation-dependent behavior hierarchy 108 without passing through the short-term storage unit 105.
[0330]
In the situation-dependent behavior hierarchy 108, the activity level of each schema is calculated from the external stimulus “large object”, and an appropriate behavior is selected (see FIGS. 15, 25 and 26). In this case, the situation-dependent behavior hierarchy 108 selects the spinal reflex behavior “escape”, determines the secondary emotion “surprise”, and notifies the internal state management unit 104 of it.
[0331]
The internal state management unit 104 outputs the secondary emotion sent from the situation-dependent behavior hierarchy 108 as its own emotion.
[0332]
FIG. 42 shows a mechanism by which the internal state management unit 104 changes the internal state by an innate reflex action.
[0333]
When an innate reflex action is performed, the sensor information from each of the recognition function units 101 to 103 is input directly to the situation-dependent behavior hierarchy 108 without passing through the short-term storage unit 105.
In the situation-dependent behavior hierarchy 108, the activity level of each schema is calculated from the external stimulus obtained as the sensor information, an appropriate behavior is selected, a secondary emotion is determined, and the internal state management unit 104 is notified of it.
[0334]
The internal state management unit 104 outputs the secondary emotion sent from the situation-dependent behavior hierarchy 108 as its own emotion. Further, for the Activation from the situation-dependent behavior hierarchy 108, the final Activation is determined according to the level of the biorhythm.
[0335]
In the situation-dependent behavior hierarchy 108, the activity level of each schema is calculated based on the newly received index value of the internal state, and the next behavior (schema) depending on the situation can be selected.
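The reflex path described in this section can be illustrated with a short sketch. It assumes a simple lookup table from stimulus to reflex; the names `REFLEX_TABLE`, `InternalStateManager`, and `handle_sensor_info` are hypothetical and do not appear in the embodiment. The only entry in the table follows the "large object" example in the text.

```python
from typing import Optional

# Hypothetical reflex table: external stimulus -> (reflex action, secondary emotion).
# The single entry follows the "large object" example in the text.
REFLEX_TABLE = {
    "large_object": ("escape", "surprise"),
}


class InternalStateManager:
    """Toy stand-in for the internal state management unit 104."""

    def __init__(self) -> None:
        self.current_emotion: Optional[str] = None

    def notify_secondary_emotion(self, emotion: str) -> None:
        # The notified secondary emotion is adopted as-is as the unit's own emotion.
        self.current_emotion = emotion


def handle_sensor_info(stimulus: str, ism: InternalStateManager) -> Optional[str]:
    """Reflex path: sensor information is used directly, with no short-term memory lookup."""
    entry = REFLEX_TABLE.get(stimulus)
    if entry is None:
        return None  # no reflex defined; normal situation-dependent selection would apply
    action, emotion = entry
    ism.notify_secondary_emotion(emotion)
    return action


ism = InternalStateManager()
print(handle_sensor_info("large_object", ism), ism.current_emotion)
```

Unlike the associative path, the emotion here is set by the behavior side and merely recorded by the internal state manager, which is the distinction this section draws.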
[0336]
E-8. Relationship between schema and internal state management unit
The situation-dependent behavior hierarchy 108 is composed of a plurality of schemas. It calculates an activity level for each schema based on an external stimulus or a change in the internal state, selects a schema according to its activity level, and executes the corresponding action (see FIGS. 18, 19, and 25).
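As a rough illustration of selecting a schema by activity level, the following sketch gives each schema two evaluators and picks the schema with the highest total. The `Schema` class, the evaluator names, the example schemas, and the rule of simply summing the two contributions are assumptions made for this example; the text does not specify how the contributions are combined.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Observation = Dict[str, float]


@dataclass
class Schema:
    name: str
    from_external: Callable[[Observation], float]  # contribution of the external stimulus
    from_internal: Callable[[Observation], float]  # contribution of the internal state

    def activity_level(self, external: Observation, internal: Observation) -> float:
        # Assumed combination rule: plain sum of the two contributions.
        return self.from_external(external) + self.from_internal(internal)


def select_schema(schemas: List[Schema], external: Observation, internal: Observation) -> Schema:
    """Return the schema with the highest activity level."""
    return max(schemas, key=lambda s: s.activity_level(external, internal))


# Hypothetical schemas for illustration only.
SCHEMAS = [
    Schema("eat", lambda e: e.get("food_visible", 0.0), lambda i: i.get("hunger", 0.0)),
    Schema("sleep", lambda e: 0.0, lambda i: i.get("fatigue", 0.0)),
]

chosen = select_schema(SCHEMAS, {"food_visible": 0.8}, {"hunger": 0.6, "fatigue": 0.3})
print(chosen.name)
```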
[0337]
FIG. 43 schematically shows the relationship between the schema and the internal state management unit.
The schema can communicate with external objects such as the short-term storage unit 105, the long-term storage unit 106, and the internal state management unit 104 via a proxy such as DObject or DOObject (see FIG. 30).
[0338]
The schema includes class objects that calculate an activity level based on an external stimulus or a change in the internal state. An RM (Resource Management) object communicates with the short-term storage unit 105 via a proxy, acquires the external environment, and calculates an activity level based on the external environment. The Motivation calculation class object communicates with the long-term storage unit 106 and the internal state management unit 104 via the proxy, acquires the amount of change in the internal state, and calculates the activity level based on the internal state, that is, the Motivation. The method of calculating Motivation is described in detail later.
[0339]
As described above, the internal state management unit 104 is hierarchically divided into primary emotions and secondary emotions. The primary emotions are further divided by dimension into a layer based on innate reactions, a layer based on homeostasis, and a layer based on association (see FIG. 36). The emotion serving as the secondary emotion is mapped to three elements: P (Pleasantness), A (Activity), and C (Concentration).
[0340]
All the changes ΔI of the primary emotion in each hierarchy are input to the secondary emotion and used for calculating the change ΔP of Pleasantness.
[0341]
Activity is integrally determined from information such as sensor input, operation time, and biorhythm.
[0342]
Also, the certainty factor of the selected schema is used as the certainty factor in the actual secondary emotion hierarchy.
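The three update rules in paragraphs [0340] to [0342] can be summarized in a small sketch. The specification does not give the formulas, so the unweighted sum used for ΔP and the simple average used for Activity are placeholders chosen for illustration; the names `SecondaryEmotion` and `update_secondary` are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SecondaryEmotion:
    pleasantness: float = 0.0
    activity: float = 0.0
    certainty: float = 0.0


def update_secondary(
    emotion: SecondaryEmotion,
    delta_i_by_layer: Dict[str, float],
    sensor_input: float,
    operating_time: float,
    biorhythm: float,
    schema_certainty: float,
) -> float:
    """Update P, A, and C in place and return the change in Pleasantness."""
    # Assumption: delta_p is the unweighted sum of delta_i over all primary emotion layers.
    delta_p = sum(delta_i_by_layer.values())
    emotion.pleasantness += delta_p
    # Assumption: Activity integrates its three inputs as a simple average.
    emotion.activity = (sensor_input + operating_time + biorhythm) / 3.0
    # The certainty factor of the selected schema is used as-is.
    emotion.certainty = schema_certainty
    return delta_p


emotion = SecondaryEmotion()
delta_p = update_secondary(
    emotion,
    {"innate": 1.0, "homeostasis": 2.0, "association": -0.5},
    sensor_input=0.6,
    operating_time=0.3,
    biorhythm=0.9,
    schema_certainty=0.8,
)
print(delta_p, emotion)
```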
[0343]
FIG. 44 schematically shows a Motivation calculation path based on the Motivation calculation class object.
[0344]
The RM class object accesses the short-term storage unit 105 via the proxy, acquires sensor information, and evaluates the activity level due to the external stimulus based on the strength of the stimulus, such as the distance and size of the recognized object.
[0345]
On the other hand, the Motivation calculation class object accesses the short-term storage unit 105 via the proxy to acquire the characteristics of the object, and then queries the long-term storage unit 106 via the proxy with those characteristics to obtain the change in the internal state. It then accesses the internal state management unit 104 via the proxy to calculate an evaluation value internal to the robot. The calculation of Motivation is therefore independent of the strength of the external stimulus.
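The three-step Motivation path can be sketched as below. The dictionaries standing in for the short-term storage unit, the long-term storage unit, and the internal state management unit, the 0 to 100 index range, and the `internal_evaluation` rule are all assumptions for illustration. The point of the sketch is that the stimulus strength (here `distance_m`) never enters the result.

```python
from typing import Dict

# Hypothetical stand-ins for the three units reached through proxies.
SHORT_TERM: Dict[str, Dict[str, object]] = {
    "target-1": {"label": "apple", "distance_m": 5.0},  # recognized characteristics
}
LONG_TERM: Dict[str, float] = {"apple": 30.0}            # label -> recalled delta_i
PRIMARY_INDEX: Dict[str, float] = {"nourishment": 40.0}  # assumed range 0..100


def internal_evaluation(delta_i: float, current: float, upper: float = 100.0) -> float:
    """Assumed rule: the fraction of the remaining deficit that delta_i would fill."""
    deficit = upper - current
    if deficit <= 0.0:
        return 0.0
    return min(delta_i, deficit) / deficit


def motivation_for(target_id: str) -> float:
    features = SHORT_TERM[target_id]                  # step 1: characteristics of the object
    delta_i = LONG_TERM.get(str(features["label"]), 0.0)  # step 2: recalled internal state change
    return internal_evaluation(delta_i, PRIMARY_INDEX["nourishment"])  # step 3: internal evaluation


print(motivation_for("target-1"))
```

Moving the object closer changes what an RM-style evaluator would report, but leaves this value unchanged, which mirrors the separation between the two calculation paths.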
[0346]
As already described, the behavior control system of the robot according to the present embodiment uses associative memory to recall a change in the internal state from an external stimulus, and thereby calculates the secondary emotion and selects an action (see FIG. 41). Furthermore, associative memory makes it possible to recall a change in the internal state that differs for each object. As a result, the ease with which an action is expressed can differ even in the same situation. That is, in addition to the external stimulus, the physical state, and the current internal state, the robot's memory of each object is taken into account when selecting an action, so that more varied responses can be realized.
[0347]
For example, rather than taking an action fixed by the external environment or the internal state, such as "XX is visible, so do XX" or "XX is currently lacking, so do XX (toward anything)", the robot can vary its actions, as in "XX is visible, but because of XX, do □□" or "XX is visible, but because of XX, do ■■".
[0348]
FIG. 45 schematically illustrates the mechanism of the Motivation calculation process when an object exists.
[0349]
First, the short-term storage unit 105 is accessed via the proxy and queried for the characteristics of the target recognized by the recognition function units 101 to 103.
[0350]
Next, using the extracted characteristics, the long-term storage unit 106 is accessed via the proxy to obtain how an object with those characteristics changes the desire related to the schema, that is, the change ΔI of the primary emotion.
[0351]
Next, the internal state management unit 104 is accessed via the proxy to derive how the pleasantness value changes due to the change in the desire, that is, the change ΔPleasant of the secondary emotion.
[0352]
Then, the i-th Motivation is calculated by the following Motivation calculation function g_target-i, which takes the change ΔPleasant of the secondary emotion and the certainty factor of the object as arguments.
[0353]
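(Equation 9)
$$\mathrm{Motivation}_i = g_{\mathrm{target}\text{-}i}(\Delta\mathrm{Pleasant},\ \mathrm{Certainty})$$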
[0354]
FIG. 46 schematically illustrates the mechanism of the Motivation calculation process when no object exists.
[0355]
In this case, first, the memory for the action is queried for the change ΔI in desire caused by the action.
[0356]
Next, using the acquired ΔI, the change ΔPleasant of the secondary emotion that results when the primary emotion changes by ΔI is obtained from the internal state management unit 104. Then, the i-th Motivation is calculated by the following Motivation calculation function g_nottarget-i, which takes the change ΔPleasant of the secondary emotion as its argument.
[0357]
(Equation 10)
$$\mathrm{Motivation}_i = g_{\mathrm{nottarget}\text{-}i}(\Delta\mathrm{Pleasant})$$
[0358]
E-9. Method of changing each element of secondary emotion
FIG. 47 illustrates a mechanism for changing Pleasantness of the secondary emotion.
[0359]
The long-term storage unit 106 inputs a primary emotion change according to the amount of storage to the internal state management unit 104. In addition, the short-term storage unit 105 inputs changes in primary emotions due to sensor inputs from the recognition function units 101 to 103 to the internal state management unit 104.
[0360]
From the schema, a change in primary emotions (Nourishment, Moisture, Sleep) due to execution of the schema and a change in a primary emotion (Affection) due to the content of the schema are input to the internal state management unit 104.
[0361]
Pleasantness is determined according to a change in excess or deficiency of the primary emotion.
[0362]
FIG. 48 illustrates a mechanism for changing the Activity of the secondary emotion.
[0363]
Activity is determined in an integrated manner based on the sum of time other than sleep in the schema, biorhythm, and sensor input.
[0364]
FIG. 49 illustrates a mechanism for changing the Certainty of the secondary emotion.
[0365]
When the long-term storage unit 106 is queried about an object, it returns a Certainty. Which primary emotion to focus on depends on the target behavior of the schema. The returned Certainty is used as-is as the Certainty of the secondary emotion in the internal state management unit 104.
[0366]
FIG. 50 schematically shows a mechanism for obtaining Certainty.
[0367]
The long-term storage unit 106 stores the certainty of each item such as a recognition result and an emotion regarding the object for each schema.
[0368]
The schema queries the long-term storage unit 106 for the certainty value of the memory related to the schema. In response, the long-term storage unit 106 returns the certainty of the memory related to the schema as the certainty of the object.
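One simple way to picture a per-schema certainty store is a table keyed by schema and object. The layout below is an assumption for illustration only; `CERTAINTY_STORE`, `query_certainty`, and the sample values are hypothetical and are not taken from the embodiment.

```python
from typing import Dict, Tuple

# Hypothetical long-term store: (schema name, object label) -> certainty in 0..1.
CERTAINTY_STORE: Dict[Tuple[str, str], float] = {
    ("eat", "apple"): 0.9,
    ("play", "apple"): 0.2,
}


def query_certainty(schema: str, obj: str, default: float = 0.0) -> float:
    """Return the stored certainty for this schema and object, or a default if unknown."""
    return CERTAINTY_STORE.get((schema, obj), default)


print(query_certainty("eat", "apple"), query_certainty("play", "apple"))
```

Keying by schema as well as by object lets the same object carry a different certainty for each behavior, which matches the statement that certainty is stored for each schema.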
[0369]
[Supplement]
The present invention has been described in detail with reference to the specific embodiments. However, it is obvious that those skilled in the art can modify or substitute the embodiment without departing from the scope of the present invention.
[0370]
The gist of the present invention is not necessarily limited to products called “robots”. That is, the present invention can be applied equally to any mechanical device that uses electric or magnetic action to perform motions resembling those of a human, even if it is a product belonging to another industrial field, such as a toy.
[0371]
In short, the present invention has been disclosed by way of example, and the contents described in this specification should not be interpreted in a limited manner. In order to determine the gist of the present invention, the claims described at the beginning should be considered.
[0372]
[Effects of the invention]
According to the present invention, it is possible to provide an excellent robot behavior control system, behavior control method, and robot device capable of operating autonomously and achieving realistic communication with a user.
[0373]
Further, according to the present invention, it is possible to provide an excellent robot behavior control system, behavior control method, and robot device capable of selecting an action by comprehensively judging the situation in which the robot is placed, including recognition results of the external environment such as vision and hearing and internal states such as instinct and emotion.
[0374]
Further, according to the present invention, it is possible to provide an excellent robot behavior control system, behavior control method, and robot device capable of selecting an action by comprehensively judging the situation in which the robot is placed, including recognition results of the external environment such as vision and hearing and internal states such as instinct and emotion.
[0375]
Further, according to the present invention, it is possible to provide an excellent robot behavior control system, behavior control method, and robot device that clarify the significance of emotions and can thereby select and execute actions in response to external stimuli and internal states under a certain order.
[0376]
According to the present invention, emotions are classified into a plurality of layers according to their significance, and the operation is determined at each layer. From the determined plurality of operations, it is determined which operation is to be performed depending on the external stimulus or internal state at that time. Actions are selected at each level, but the order in which they are performed is based on the priority of the robot's internal state. Higher-level behaviors such as objective behaviors and motion selection using memory can be consistently expressed on one individual. It is also a clear index when categorizing actions and creating a schema.
[0377]
Further, according to the robot behavior control system or behavior control method of the present invention, associative memory makes it possible to recall a change in the internal state that differs for each object, so that the ease with which a behavior is expressed can differ even in the same situation. That is, in addition to the external stimulus, the physical state, and the current internal state, the robot's memory of each object is taken into account when selecting an action, so that more varied responses can be realized.
[0378]
For example, rather than taking an action fixed by the external environment or the internal state, such as "XX is visible, so do XX" or "XX is currently lacking, so do XX (toward anything)", the robot can vary its actions, as in "XX is visible, but because of XX, do □□" or "XX is visible, but because of XX, do XX".
[Brief description of the drawings]
FIG. 1 is a diagram schematically showing a functional configuration of a robot device 1 used in the present invention.
FIG. 2 is a diagram showing the configuration of a control unit 20 in further detail.
FIG. 3 is a diagram schematically showing a functional configuration of a behavior control system 100 of the robot device 1 according to the embodiment of the present invention.
FIG. 4 is a diagram showing a flow of an operation by each object constituting the behavior control system 100 shown in FIG.
FIG. 5 is a diagram showing a flow of information entering a target memory in a short-term storage unit 105 based on recognition results in each of the recognition function units 101 to 103.
FIG. 6 is a diagram showing a flow of information entering an event memory in a short-term storage unit 105 based on recognition results in each of the recognition function units 101 to 103.
FIG. 7 is a diagram for explaining a dialogue process between the robot 1 and users A and B.
FIG. 8 is a diagram for explaining a dialogue process between the robot 1 and users A and B.
FIG. 9 is a diagram for explaining a dialogue process between the robot 1 and users A and B.
FIG. 10 is a diagram conceptually showing a storage process of associative memory according to an embodiment of the present invention.
FIG. 11 is a diagram conceptually showing a recall process of associative memory according to an embodiment of the present invention.
FIG. 12 is a diagram schematically showing a configuration example of an associative memory system to which a competitive neural network is applied.
FIG. 13 is a diagram schematically illustrating an object configuration of the behavior control system 100 according to the embodiment of the present invention.
FIG. 14 is a diagram schematically showing a form of situation-dependent behavior control by a situation-dependent behavior hierarchy 108.
FIG. 15 is a diagram showing a basic operation example of behavior control by the situation-dependent behavior hierarchy shown in FIG.
FIG. 16 is a diagram showing an operation example when a reflex action is performed by the situation-dependent action hierarchy shown in FIG.
FIG. 17 is a diagram showing an operation example when emotion expression is performed by the situation-dependent behavior hierarchy shown in FIG.
FIG. 18 is a diagram schematically illustrating a situation in which the situation-dependent behavior hierarchy is composed of a plurality of schemas.
FIG. 19 is a diagram schematically showing a tree structure of a schema in the situation-dependent behavior hierarchy 108.
FIG. 20 schematically shows an internal configuration of a schema.
FIG. 21 is a diagram schematically showing an internal configuration of a Monitor function.
FIG. 22 is a diagram schematically illustrating a configuration example of an action state control unit.
FIG. 23 is a diagram schematically illustrating another configuration example of the behavior state control unit.
FIG. 24 is a diagram schematically showing a mechanism for controlling normal context-dependent behavior in the context-dependent behavior hierarchy 108.
FIG. 25 is a diagram schematically showing the configuration of a schema in the reflex action unit 109.
FIG. 26 is a diagram schematically showing a mechanism for controlling reflex behavior by the reflex behavior unit 109.
FIG. 27 is a diagram schematically illustrating a class definition of a schema used in the situation-dependent behavior hierarchy 108.
FIG. 28 is a diagram showing a state machine of an action function of a schema.
FIG. 29 is a diagram showing a state machine of a schema.
FIG. 30 is a diagram schematically showing a functional configuration of a class in the situation-dependent behavior hierarchy 108.
FIG. 31 is a flowchart showing a processing procedure for executing a MakePronome function.
FIG. 32 is a flowchart showing a processing procedure for executing a Monitor function.
FIG. 33 is a flowchart showing a processing procedure for executing an Actions function.
FIG. 34 is a flowchart showing a processing procedure for executing an Actions function.
FIG. 35 is a diagram for describing the reentrant property of a schema.
FIG. 36 is a diagram schematically illustrating a hierarchical configuration of the internal state management unit 104 according to the present embodiment.
FIG. 37 is a diagram schematically illustrating a communication path between the internal state management unit 104 and another functional module.
FIG. 38 is a diagram showing a mechanism by which the internal state management unit 104 changes the internal state with time.
FIG. 39 is a diagram showing a mechanism by which the internal state management unit 104 changes the internal state in accordance with the execution of the operation of the robot.
FIG. 40 is a diagram showing a mechanism by which the internal state management unit 104 changes the internal state according to the recognition result of the external environment.
FIG. 41 is a diagram showing a mechanism by which the internal state management unit 104 changes the internal state by using associative memory.
FIG. 42 is a diagram showing a mechanism by which the internal state management unit 104 changes the internal state by an innate reflex action.
FIG. 43 is a diagram schematically showing a relationship between a schema and an internal state management unit.
FIG. 44 is a diagram schematically showing a Motivation calculation path by a Motivation calculation class object.
FIG. 45 is a diagram schematically illustrating a mechanism of a Motivation calculation process when an object exists.
FIG. 46 is a diagram schematically showing a mechanism of a Motivation calculation process when an object does not exist.
FIG. 47 is a diagram showing a method of changing Pleasantness.
FIG. 48 is a diagram showing a method of changing Activity.
FIG. 49 is a diagram illustrating a method of changing Certainty.
FIG. 50 is a diagram showing a mechanism for obtaining Certainty.
[Explanation of symbols]
1 ... Robot device
15 ... CCD camera
16 ... Microphone
17 ... Speaker
18 ... Touch sensor
19 ... LED indicator
20 ... Control unit
21 ... CPU
22 ... RAM
23 ... ROM
24 ... Non-volatile memory
25 ... Interface
26 ... Wireless communication interface
27 ... Network interface card
28 ... Bus
29 ... Keyboard
40 ... Input/output unit
50 ... Drive unit
51 ... Motor
52 ... Encoder
53 ... Driver
100 ... Behavior control system
101 ... Visual recognition function unit
102 ... Auditory recognition function unit
103 ... Touch recognition function unit
105 ... Short-term storage unit
106 ... Long-term storage unit
107 ... Reflection behavior hierarchy
108 ... Situation-dependent behavior hierarchy
109 ... Reflex action unit

Claims

In a robot device that generates an action based on an internal state or an external input,
One or more action modules for determining the action of the robot;
Behavior state control means for managing the behavior module;
State storage means for storing a current state of the behavior module;
One or more action instruction output means for outputting an action command based on an external or internal input corresponding to the state stored in the state storage means,
When the output of the behavior command of the behavior command output unit is stopped and the output of the behavior command of the behavior command output unit is restarted thereafter, the behavior state control unit performs the operation based on the state stored in the state storage unit. After causing the robot to perform the predetermined action, the output of the action command is resumed,
A robot device characterized by the above-mentioned.

The state storage means and the action command output means are provided inside the action module,
The behavior state control means is stored in another behavior module located in a higher hierarchy,
The robot device according to claim 1, wherein:

In a robot device that generates an action based on an internal state or an external input,
A plurality of action modules each including: an action evaluation unit that outputs an action evaluation of the robot apparatus according to an internal state or an external input; and an action instruction output unit that outputs an action command of the robot apparatus.
Behavior state control means for controlling the behavior of the robot device based on the behavior evaluation in the behavior evaluation means of each behavior module,
A robot device comprising:

The plurality of action modules form a tree-like hierarchical structure,
The behavior state control means selects a behavior module based on a behavior evaluation output from a behavior module of a lower layer of the hierarchical structure to a behavior module of an upper layer, and controls behavior of the robot device.
The robot device according to claim 3, wherein:

The action command output means of each of the action modules outputs resources of the robot device used when executing the action command in the robot device,
The behavior state control unit controls the behavior of the robot device based on a behavior evaluation output from a behavior module of a lower layer of the hierarchical structure to a behavior module of an upper layer and resources used by the robot device.
The robot device according to claim 4, wherein:

In a robot device that generates an action based on an internal state or an external input,
Behavior state control means for controlling the behavior of the robot device,
Behavior evaluation means for outputting an evaluation of the behavior of the robot apparatus in response to an internal state or an external input, and an output of a behavior command for outputting a behavior command of the robot apparatus and resources of the robot apparatus used when executing the behavior command And a plurality of behavior modules each having a means,
The behavior state control means generates a behavior module different from the previous behavior module for each target, and controls the behavior of the robot device.
A robot device characterized by the above-mentioned.

The behavior state control means may generate a second behavior module having a higher priority in the behavior evaluation than the first behavior module during the generation of the first behavior module to be applied to the first object. Generating a second behavior module in response to the
The robot device according to claim 6, wherein:

The behavior state control means controls the behavior of the robot device by switching between a first behavior module and a second behavior module based on the behavior evaluation output from each of the behavior modules.
The robot device according to claim 7, wherein:

In a robot device that generates an action based on an internal state or an external input,
Behavior state control means for controlling the behavior of the robot device,
Behavior evaluation means for outputting an evaluation of the behavior of the robot apparatus in response to an internal state or an external input, and an output of a behavior command for outputting a behavior command of the robot apparatus and resources of the robot apparatus used when executing the behavior command And a plurality of behavior modules each having a means,
The plurality of action modules form a tree-like hierarchical structure,
The behavior state control means selects a behavior module based on a behavior evaluation output from a behavior module in a lower layer of the hierarchical structure to a behavior module in an upper layer and a resource used by the robot device, and determines a behavior of the robot device. Control,
A robot device characterized by the above-mentioned.

In an action control system for a robot that operates autonomously,
A state machine that describes the operation of the robot, and an activity evaluator that evaluates a current activity level of the operation state of the robot in the state machine and resources of the robot used when the operation state of the robot is activated. One or more behavior modules,
Instruct the action evaluator of each action module to calculate an activity level and a used resource, select an action module to be activated according to each activity level and a used resource, and select the selected action module. A behavior state control unit that controls the behavior state of each behavior module by instructing the state machine to execute,
A behavior control system for a robot, comprising:

The behavior evaluator evaluates an activity level of the state machine according to at least one of an external environment of the robot or an internal state of the robot,
The robot behavior control system according to claim 10, wherein:

The behavior module is configured in a tree structure format according to a realization level of the operation of the robot,
The behavior state control unit is mounted for each behavior module, and instructs a behavior module below the tree structure to evaluate an activity level and a used resource, to select a behavior module, and to execute a state machine,
The robot behavior control system according to claim 10, wherein:

The behavior state control unit transitions the behavior module having the decreased activity level from the active state to the standby state, and transitions the behavior module having the increased activity level from the standby state to the active state.
The robot behavior control system according to claim 10, wherein:

Means for initiating an action required to transition the active module to the standby state while saving data necessary for resuming the active module when transitioning from the active state to the standby state;
Means for initiating an action necessary for causing the behavior module to transition from the standby state to the active state, restoring the saved data and initializing the behavior module state, and transitioning to the active state;
The robot behavior control system according to claim 13, further comprising:

In an action control system for a robot that operates autonomously,
One or more action modules comprising a combination of a command for operating a robot and an action evaluator for evaluating resources of the robot required for executing the command;
Means for detecting released resources of the robot;
Means for selectively activating an action module executable by resources of the released robot according to a predetermined priority,
Two or more action modules whose resources do not compete can be executed in parallel.
A robot behavior control system characterized by the following.

In a robot behavior control system that operates autonomously according to the internal state,
An internal state management unit that manages emotions, which are indicators of the internal state, in a plurality of hierarchical structures,
An action selection unit that selectively executes an action that satisfies the emotions of each hierarchy;
A behavior control system for a robot, comprising:

The internal state management unit hierarchizes emotions into stages, namely the primary emotions necessary for the survival of the individual and the secondary emotions that change depending on the excess or deficiency of the primary emotions, and further hierarchizes the primary emotions by dimension, from the physiological layer to association,
The robot behavior control system according to claim 16, wherein:

The action selection unit preferentially selects an action that satisfies a lower primary emotion,
The robot behavior control system according to claim 17, wherein:

The action selecting unit suppresses selection of an action that satisfies the lower-order primary emotion when the higher-order primary emotion is significantly insufficient compared to the lower-order primary emotion,
The robot behavior control system according to claim 18, wherein:

It further includes an external environment recognition unit that recognizes changes in the external environment of the robot,
The action selection unit, in addition to the index of the internal state, selects an action based on the index of the external environment,
The robot behavior control system according to claim 16, wherein:

The internal state management unit changes the index of the internal state over time,
The robot behavior control system according to claim 16, wherein:

The internal state management unit changes the index of the internal state according to the execution of the action selected in the action selection unit,
The robot behavior control system according to claim 16, wherein:

Further comprising an external environment recognition unit that recognizes changes in the external environment of the robot,
The internal state management unit changes the index of the internal state according to a change in the external environment,
The robot behavior control system according to claim 16, wherein:

Further comprising an external environment recognition unit that recognizes changes in the external environment of the robot, and an associative storage unit that associatively stores changes in the internal state in connection with the external environment,
The internal state management unit changes the index of the internal state based on the change of the internal environment that the associative storage unit recalls from the external environment,
The robot behavior control system according to claim 16, wherein:

The associative storage unit associatively stores a change in the internal state for each recognized object,
The robot behavior control system according to claim 24, wherein:
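
As a rough illustration of storing and recalling a change in the internal state per recognized object, the sketch below uses a plain dictionary in place of the associative storage unit. The claims do not specify the storage mechanism, so the class `AssociativeMemory`, the function `apply_recall`, and the index names are all hypothetical.

```python
class AssociativeMemory:
    """Dictionary stand-in for an associative store (an assumption)."""

    def __init__(self):
        self._delta = {}  # object label -> {index name: remembered change}

    def store(self, obj, change):
        # Remember which change in the internal state accompanied this object.
        self._delta.setdefault(obj, {}).update(change)

    def recall(self, obj):
        # Unknown objects recall no change.
        return dict(self._delta.get(obj, {}))


def apply_recall(state, memory, obj):
    """Return a copy of the internal state updated by the recalled change."""
    new_state = dict(state)
    for name, delta in memory.recall(obj).items():
        new_state[name] = new_state.get(name, 0.0) + delta
    return new_state
```

After `memory.store("ball", {"joy": 0.25})`, recognizing the ball again raises the `joy` index by the remembered amount, while an object that was never stored leaves the state unchanged.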

In a behavior control method for a robot that operates autonomously according to an internal state,
An internal state management step of managing emotions, which are indices of the internal state, in a hierarchical structure having a plurality of levels,
An action selection step of selectively executing an action that satisfies the emotions of each level;
A behavior control method for a robot, comprising:

In the internal state management step, the emotions are divided into stages, namely the primary emotion necessary for the survival of the individual and the secondary emotion that changes depending on the excess or deficiency of the primary emotion, and the internal state is handled hierarchically by dimension of the primary emotion, from innate reflexes and the physiological level up to association,
The method of controlling a robot action according to claim 26, wherein:

In the action selection step, an action that satisfies a lower-order primary emotion is preferentially selected,
The method of controlling a robot action according to claim 26, wherein:

In the action selection step, when a higher-order primary emotion is significantly more deficient than a lower-order primary emotion, the selection of an action that satisfies the lower-order primary emotion is suppressed,
The method of controlling a robot action according to claim 26, wherein:

Further comprising an external environment recognition step of recognizing a change in the external environment of the robot,
In the action selection step, in addition to the index of the internal state, select an action based on the index of the external environment,
The method of controlling a robot action according to claim 26, wherein:

In the internal state management step, changing the index of the internal state according to the passage of time,
The method of controlling a robot action according to claim 26, wherein:

In the internal state management step, changing the index of the internal state according to the execution of the action selected in the action selection step,
The method of controlling a robot action according to claim 26, wherein:

Further comprising an external environment recognition step of recognizing a change in the external environment of the robot,
In the internal state management step, changing the index of the internal state according to the change of the external environment,
The method of controlling a robot action according to claim 26, wherein:

Further comprising an external environment recognition step of recognizing a change in the external environment of the robot, and an associative storage step of associatively storing a change in the internal state in connection with the external environment,
In the internal state management step, the index of the internal state is changed based on the change of the internal environment recalled from the external environment in the associative storage step,
The method of controlling a robot action according to claim 26, wherein:

In the associative storage step, the change of the internal state is associatively stored for each recognized object,
The method of controlling a robot action according to claim 34, wherein:

A robot device that generates an action based on an internal input or an external input,
One or more behavior modules for determining the action of the robot;
Behavior module management means for managing the behavior module,
State storage means for storing a current state of the behavior module;
One or more state machines corresponding to the states stored in the state storage means and outputting an action command based on an external or internal input,
When the output of action commands by the state machine has been stopped and the action output is subsequently restarted, the behavior module management means causes the robot to execute a predetermined action based on the state stored in the state storage means and then resumes the action output,
A robot device characterized by the above.
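
The stop-and-resume behavior recited above can be sketched as follows, under the assumption that the state machine is a simple transition table and that the predetermined action is looked up from the stored state. The class names, the `recovery_actions` table, and the example states are hypothetical.

```python
class BehaviorModule:
    """Holds the current state and a table-driven state machine."""

    def __init__(self, transitions, start):
        self.transitions = transitions  # (state, input) -> (next state, command)
        self.state = start              # plays the role of the state storage
        self.running = True

    def step(self, event):
        if not self.running:
            return None                 # output is stopped; the state is kept
        self.state, command = self.transitions.get(
            (self.state, event), (self.state, None)
        )
        return command


class ModuleManager:
    """Stops and restarts a module's action output."""

    def __init__(self, module, recovery_actions):
        self.module = module
        self.recovery_actions = recovery_actions  # stored state -> action to run first

    def stop(self):
        self.module.running = False

    def resume(self):
        # Run the action tied to the stored state, then re-enable output.
        action = self.recovery_actions.get(self.module.state)
        self.module.running = True
        return action
```

If a module is interrupted while in an `approach` state, `resume()` in this sketch first returns an assumed recovery action such as `look_for_ball`, and only afterwards does `step()` produce commands again from the stored state.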

The state storage means and the state machine are provided inside the behavior module,
The behavior module management means forms a hierarchical structure with the behavior module, and is contained in another behavior module located at a higher level of the hierarchy,
The robot apparatus according to claim 36, wherein:

In a robot device that operates autonomously,
A behavior module including means for calculating an activity level by determining a situation based on an external stimulus and an internal environment, and means for outputting an action corresponding to an input and a state based on a predetermined state machine;
State setting means for setting a state of the action module;
A robot device comprising:

The state setting means sets the behavior module to one of a ready state, an active state, and a standby state according to the activity level,
The robot device according to claim 38, wherein:
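
One possible way to map activity levels to the three states is sketched below. The claim only says that the state depends on the activity level; the rule used here (the highest level becomes active, others above a threshold become ready, the rest stand by) and the value of `ready_threshold` are assumptions made for illustration.

```python
def set_states(activity, ready_threshold=0.5):
    """Assign 'active', 'ready', or 'standby' to each behavior module.

    activity: dict mapping module name -> activity level.
    The mapping rule and the threshold are illustrative assumptions.
    """
    if not activity:
        return {}
    top = max(activity, key=activity.get)  # module with the highest level
    states = {}
    for name, level in activity.items():
        if name == top:
            states[name] = "active"
        elif level >= ready_threshold:
            states[name] = "ready"
        else:
            states[name] = "standby"
    return states
```

For activity levels `{"dance": 0.9, "talk": 0.6, "sleep": 0.1}`, this sketch marks `dance` as active, `talk` as ready, and `sleep` as standby.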

Two or more behavior modules are arranged in a hierarchical structure,
The state setting means is configured such that an upper-level behavior module of the hierarchy selects a lower-level behavior module,
The robot device according to claim 38, wherein: