JP2004291147A

JP2004291147A - Robot's behavior control system

Info

Publication number: JP2004291147A
Application number: JP2003086620A
Authority: JP
Inventors: Profio Ugo Di; ディプロフィオウゴ; Takeshi Takagi; 剛高木
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2003-03-26
Filing date: 2003-03-26
Publication date: 2004-10-21

Abstract

PROBLEM TO BE SOLVED: To execute situation-dependent behavior and reflexive behavior in parallel, cooperatively and organically, without the two interfering with each other.

SOLUTION: A reflexive behavior layer monitoring schema is placed immediately below the root schema of the situation-dependent behavior layer. The monitoring schema has a dummy Action function, and its Monitor function responds to schema execution in the reflexive behavior layer by maximizing its own activation level and returning all resources in use as its return value. As a result, the root schema always selects the reflexive behavior layer monitoring schema, so resource conflicts between the behavior layers are avoided.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、自律的な動作を行ないユーザとのリアリスティックなコミュニケーションを実現するロボットの行動制御システムに係り、特に、視覚や聴覚などの外部環境の認識結果や本能や感情などの内部状態などのロボットが置かれている状況を統合的に判断して適当な行動を選択する状況依存行動型のロボットのための行動制御システム及び行動制御方法、並びにロボット装置に関する。
【０００２】
さらに詳しくは、本発明は、ロボットが置かれている状況を統合的に判断して適当な行動を選択する状況依存型の行動とシステムに重大な反射的行動を適宜行なうロボットの行動制御システムに係り、特に、状況依存行動と反射的行動が不適当に干渉し合うことなく協働的且つ有機的に並列実行するロボットの行動制御システムに関する。
【０００３】
【従来の技術】
電気的若しくは磁気的な作用を用いて人間の動作に似せた運動を行なう機械装置のことを「ロボット」という。ロボットの語源は、スラブ語の”ＲＯＢＯＴＡ（奴隷機械）”に由来すると言われている。わが国では、ロボットが普及し始めたのは１９６０年代末からであるが、その多くは、工場における生産作業の自動化・無人化などを目的としたマニピュレータや搬送ロボットなどの産業用ロボット（ｉｎｄｕｓｔｒｉａｌｒｏｂｏｔ）であった。
【０００４】
最近では、イヌやネコ、クマのように４足歩行の動物の身体メカニズムやその動作を模したペット型ロボット、あるいは、ヒトやサルなどの２足直立歩行を行なう動物の身体メカニズムや動作を模した「人間形」若しくは「人間型」のロボット（ｈｕｍａｎｏｉｄｒｏｂｏｔ）など、脚式移動ロボットの構造やその安定歩行制御に関する研究開発が進展し、実用化への期待も高まってきている。これら脚式移動ロボットは、クローラ式ロボットに比し不安定で姿勢制御や歩行制御が難しくなるが、階段の昇降や障害物の乗り越えなど、柔軟な歩行・走行動作を実現できるという点で優れている。
【０００５】
脚式移動ロボットの用途の１つとして、産業活動・生産活動等における各種の難作業の代行が挙げられる。例えば、原子力発電プラントや火力発電プラント、石油化学プラントにおけるメンテナンス作業、製造工場における部品の搬送・組立作業、高層ビルにおける清掃、火災現場その他における救助といったような危険作業・難作業の代行などである。
【０００６】
また、脚式移動ロボットの他の用途として、上述の作業支援というよりも、生活密着型、すなわち人間との「共生」あるいは「エンターティンメント」という用途が挙げられる。この種のロボットは、ヒトあるいはイヌ（ペット）、クマなどの比較的知性の高い脚式歩行動物の動作メカニズムや四肢を利用した豊かな感情表現を忠実に再現する。また、あらかじめ入力された動作パターンを単に忠実に実行するだけではなく、ユーザ（あるいは他のロボット）から受ける言葉や態度（「褒める」とか「叱る」、「叩く」など）に対して動的に対応した、生き生きとした応答表現を実現することも要求される。
【０００７】
従来の玩具機械は、ユーザ操作と応答動作との関係が固定的であり、玩具の動作をユーザの好みに合わせて変更することはできない。この結果、ユーザは同じ動作しか繰り返さない玩具をやがては飽きてしまうことになる。これに対し、インテリジェントなロボットは、対話や機体動作などからなる行動を自律的に選択することから、より高度な知的レベルでリアリスティックなコミュニケーションを実現することが可能となる。この結果、ユーザはロボットに対して深い愛着や親しみを感じる。
【０００８】
ロボットあるいはその他のリアリスティックな対話システムでは、視覚や聴覚など外部環境の変化に応じて逐次的に行動を選択していくのが一般的である。また、行動選択メカニズムの他の例として、本能や感情といった情動をモデル化してシステムの内部状態を管理して、内部状態の変化に応じて行動を選択するものを挙げることができる。勿論、システムの内部状態は、外部環境の変化によっても変化するし、選択された行動を発現することによっても変化する。
【０００９】
しかしながら、これら外部環境や内部状態などのロボットが置かれている状況を統合的に判断して行動を選択するという、状況依存型の行動制御に関する例は少ない。
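[0010]
Here, the internal state consists of elements such as instinct, which in a living organism corresponds to access to the limbic system; elements captured by ethological models, such as intrinsic and social desires, which correspond to access to the neocortex; and elements called emotions, such as joy, sadness, anger, and surprise.
[0011]
In conventional intelligent robots and other autonomous interactive robots, the internal state, which arises from various factors such as instinct and emotion, was lumped together as "emotion" and managed one-dimensionally. That is, the elements making up the internal state existed side by side, and behavior was selected from the external situation and the internal state alone, without any clear selection criterion.
[0012]
In conventional systems, moreover, all behaviors existed in a single dimension, and the system decided which of them to select and express. As the number of behaviors grows, selection therefore becomes more cumbersome, and it becomes harder to select behavior that reflects the current situation and internal state.
[0013]
Meanwhile, for an intelligent robot to act autonomously, it is considered necessary to realize, in addition to situation-dependent behavior, reflexive body motions that are critical to the system.
[0014]
Reflexive behavior here means, for example, system boot and shutdown, handling when an error occurs, or behavior in which the recognition result of sensor-input external information is received directly, classified, and used to determine the output behavior directly. For example, behaviors such as tracking a human face or nodding are preferably implemented as reflexive behavior.
[0015]
Situation-dependent behavior and reflexive behavior can be implemented as different control layers in a robot's behavior control system and executed simultaneously or in parallel. In such a layered system architecture, however, there is no means of communication, that is, of exchanging information, between the behavior layers. If the layers independently express activities that interfere with one another, it becomes difficult for the system as a whole to perform consistently.
[0016]
[Problems to be solved by the invention]
An object of the present invention is to provide an excellent robot behavior control system that can appropriately perform both situation-dependent behavior, in which the situation the robot is in is judged in an integrated manner and an appropriate behavior is selected, and reflexive behavior, in which reflexive body motions are realized in response to recognized external stimuli.
[0017]
A further object of the present invention is to provide an excellent robot behavior control system that can execute situation-dependent behavior and reflexive behavior in parallel, cooperatively and organically, without inappropriate interference between them.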
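[0018]
[Means and operation for solving the problems]
The present invention has been made in view of the above problems, and is a behavior control system for an autonomously operating robot, comprising:
a first behavior layer that controls the behavior of the robot;
a second behavior layer that controls the behavior of the robot independently of the first behavior layer;
a log memory module provided for each behavior layer, which records the execution status of behaviors within that layer; and
a connection section that communicates recorded contents between the log memory modules of the behavior layers.
[0019]
Note that a "system" here means a logical collection of multiple devices (or functional modules that realize specific functions), regardless of whether the devices or functional modules are housed in a single enclosure.
[0020]
Here, the first behavior layer is, for example, a situation-dependent behavior layer that judges the robot's situation in an integrated manner and selects an appropriate behavior. The second behavior layer is, for example, a reflexive behavior layer that governs behaviors critical to the system, such as system boot, shutdown, and various kinds of error handling.
[0021]
In the first and second behavior layers, behavior control is basically performed independently. However, inappropriate interference between the layers (contention for resources) causes the system to malfunction. The first behavior layer therefore controls its own behaviors according to the recorded contents of the second behavior layer's log memory module, obtained through the connection section.
[0022]
For example, based on the recorded contents of the second behavior layer's log memory module obtained through the connection section, the first behavior layer enters a standby state while the second behavior layer is active.
[0023]
Alternatively, based on those recorded contents, the first behavior layer selects behaviors that avoid the body resources used by the second behavior layer.
[0024]
In other words, while system-critical behaviors such as boot, shutdown, and error avoidance are active, the activity of the situation-dependent behavior layer is avoided or restricted.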
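[0025]
Here, each behavior layer can be composed of:
one or more behavior modules, each comprising a state machine that describes body motions and a behavior evaluator that evaluates the activation level of the current body-motion state in the state machine and the body resources used when that state is started; and
a behavior state control section that instructs the behavior evaluator of each behavior module to calculate the activation level and the resources used, selects the behavior modules to activate according to those activation levels and resources, and instructs the state machines of the selected behavior modules to execute, thereby controlling the behavior state of each behavior module.
[0026]
The behavior evaluator evaluates the activation level of the state machine according to the external environment of the body and/or the internal state of the robot.
[0027]
The behavior state control section moves a behavior module whose activation level has fallen from the active state to the standby state, and moves a behavior module whose activation level has risen from the standby state to the active state. The behavior state control section can select two or more behavior modules whose resources do not conflict, in accordance with their activation levels.
[0028]
Each behavior layer can also be configured as a tree of behavior modules organized by the level at which body motions are realized. The behavior state control section placed in the top-level behavior module of the tree can then, working down the tree, instruct behavior modules to evaluate their activation levels and resources, select behavior modules, and instruct state machines to execute.
[0029]
In this case, a behavior layer monitoring module may be placed immediately below the root behavior module of the first behavior layer. This module monitors the activity log of the second behavior layer's behavior modules through the log memory module and detects the start and end of each behavior module in the second behavior layer.
[0030]
In response to the activity of behavior modules in the second behavior layer detected through the log memory module, the behavior layer monitoring module can raise its own activation level sufficiently high and request the appropriate resources.
[0031]
Because the behavior layer monitoring module sits immediately below the root behavior module, it can strongly influence the selection and execution of other schemas in the same behavior layer. That is, the root behavior module always selects the behavior layer monitoring module, so resource contention between behavior layers is avoided and system-critical schemas can be executed without hindrance.
[0032]
The behavior layer monitoring module can also lift the restrictions on behavior module selection and execution that it imposed on detecting activity, by releasing the resources it requested and lowering its activation level. This removes the resource contention and lets the other behavior modules continue their normal activity.
[0033]
Thus, through the mediation of the behavior layer monitoring module, resource contention among multiple behavior layers is suitably avoided, and the layers can execute in parallel, cooperatively and organically, without inappropriate interference.
[0034]
Each behavior layer can select and execute behavior modules according to its own policy. Resource management within each behavior layer is also simplified.
[0035]
Further objects, features, and advantages of the present invention will become apparent from the more detailed description based on the embodiments described below and the accompanying drawings.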
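[0036]
[Embodiments of the invention]
Embodiments of the present invention are described in detail below with reference to the drawings.
[0037]
A. Configuration of the robot apparatus
FIG. 1 schematically shows the functional configuration of a robot apparatus 1 used to implement the present invention. As shown in the figure, the robot apparatus 1 consists of a control unit 20, which performs overall control of operation and other data processing, an input/output section 40, a drive section 50, and a power supply section 60. Each is described below.
[0038]
The input/output section 40 includes, as inputs, a CCD camera 15 corresponding to the eyes of the robot apparatus 1, a microphone 16 corresponding to the ears, touch sensors 18 placed on parts such as the head and back to sense the user's touch, and various other sensors corresponding to the five senses. As outputs, it has a speaker 17 corresponding to the mouth and an LED indicator (eye lamp) 19 that forms facial expressions through combinations of blinking and lighting timing. These outputs allow the robot apparatus 1 to give feedback to the user in forms other than mechanical motion patterns of the legs and the like, such as sound and blinking lamps.
[0039]
The drive section 50 is a functional block that realizes the body motions of the robot apparatus 1 in accordance with predetermined motion patterns commanded by the control unit 20, and is the object controlled by behavior control. It is a functional module for realizing the degrees of freedom at each joint of the robot apparatus 1 and consists of multiple drive units, one for each axis (roll, pitch, yaw, and so on) at each joint. Each drive unit combines a motor 51 that rotates about a given axis, an encoder 52 that detects the rotational position of the motor 51, and a driver 53 that adaptively controls the rotational position and speed of the motor 51 based on the output of the encoder 52.
[0040]
Depending on how the drive units are combined, the robot apparatus 1 can be configured as a legged mobile robot, for example a biped or a quadruped.
[0041]
The power supply section 60 is, as its name suggests, a functional module that supplies power to the electric circuits in the robot apparatus 1. The robot apparatus 1 of this embodiment is autonomously driven by a battery, and the power supply section 60 consists of a rechargeable battery 61 and a charge/discharge control section 62 that manages the charge and discharge state of the rechargeable battery 61.
[0042]
The rechargeable battery 61 takes the form of, for example, a "battery pack" in which multiple lithium-ion secondary cells are packaged as a cartridge.
[0043]
The charge/discharge control section 62 determines the remaining capacity of the battery 61 by measuring its terminal voltage, charge and discharge current, ambient temperature, and so on, and decides when charging should start and end. The start and end times it decides are reported to the control unit 20 and serve as triggers for the robot apparatus 1 to start and end the charging operation.
[0044]
The control unit 20 corresponds to the "brain" and is mounted, for example, in the head or torso of the robot apparatus 1.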
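[0045]
FIG. 2 illustrates the configuration of the control unit 20 in more detail. As shown in the figure, the control unit 20 has a CPU (Central Processing Unit) 21 as the main controller, connected by a bus to memory, other circuit components, and peripheral devices. The bus 28 is a common signal transmission path that includes a data bus, an address bus, and a control bus. Each device on the bus 28 is assigned a unique address (memory address or I/O address), and the CPU 21 can communicate with a specific device on the bus 28 by specifying its address.
[0046]
The RAM (Random Access Memory) 22 is writable memory made of volatile memory such as DRAM (Dynamic RAM). It is used to load the program code executed by the CPU 21 and to temporarily store working data of running programs.
[0047]
The ROM (Read Only Memory) 23 is read-only memory that permanently stores programs and data. The program code stored in the ROM 23 includes a self-diagnostic test program executed when the robot apparatus 1 is powered on and an operation control program that defines the operation of the robot apparatus 1.
[0048]
The control programs of the robot apparatus 1 include a "sensor input and recognition processing program" that processes sensor input from the camera 15, the microphone 16, and so on and recognizes it as symbols; a "behavior control program" that controls the behavior of the robot apparatus 1 based on sensor input and a predetermined behavior control model while managing memory operations such as short-term and long-term memory (described later); and a "drive control program" that controls the driving of each joint motor, the audio output of the speaker 17, and so on in accordance with the behavior control model.
[0049]
The nonvolatile memory 24 consists of electrically erasable and rewritable memory elements such as EEPROM (Electrically Erasable and Programmable ROM) and is used to hold, in a nonvolatile manner, data that must be updated from time to time. Such data includes encryption keys and other security information, and device control programs to be installed after shipment.
[0050]
The interface 25 is a device for interconnecting with equipment outside the control unit 20 and enabling data exchange. For example, it inputs and outputs data to and from the camera 15, the microphone 16, and the speaker 17. It also inputs and outputs data and commands to and from each of the drivers 53-1… in the drive section 50.
[0051]
The interface 25 may also include general-purpose interfaces for connecting computer peripherals, such as a serial interface like RS (Recommended Standard)-232C, a parallel interface like IEEE (Institute of Electrical and Electronics Engineers) 1284, a USB (Universal Serial Bus) interface, an i-Link (IEEE 1394) interface, a SCSI (Small Computer System Interface) interface, or a memory card interface (card slot) that accepts a PC card or Memory Stick, so that programs and data can be moved to and from locally connected external devices.
[0052]
As another example, the interface 25 may include an infrared communication (IrDA) interface for wireless communication with external devices.
[0053]
Furthermore, the control unit 20 includes a wireless communication interface 26, a network interface card (NIC) 27, and the like, and can exchange data with various external host computers via short-range wireless data communication such as Bluetooth, a wireless network such as IEEE 802.11b, or a wide-area network such as the Internet.
[0054]
Through such data communication between the robot apparatus 1 and a host computer, remote computing resources can be used to compute complex motion control for the robot apparatus 1 or to control it remotely.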
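[0055]
B. Behavior control system of the robot apparatus
FIG. 3 schematically shows the functional configuration of the behavior control system 100 of the robot apparatus 1 according to an embodiment of the present invention. The robot apparatus 1 can control its behavior according to the recognition results of external stimuli and changes in its internal state. It also has a long-term memory function; by associatively storing the changes in internal state that follow from external stimuli, it can control its behavior according to the recognition results of external stimuli and changes in internal state.
[0056]
The illustrated behavior control system 100 can be implemented using object-oriented programming. In that case, each piece of software is handled in modular units called "objects", which combine data with the procedures that process the data. Objects can pass data and invoke one another through inter-object communication that uses message passing and shared memory.
[0057]
To recognize the external environment (Environments), the behavior control system 100 has a visual recognition function section 101, an auditory recognition function section 102, and a contact recognition function section 103.
[0058]
The visual recognition function section (Video) 101 performs image recognition, such as face recognition and color recognition, and feature extraction on captured images input through an image input device such as a CCD (Charge Coupled Device) camera. It consists of multiple objects described later, such as "MultiColorTracker", "FaceDetector", and "FaceIdentify".
[0059]
The auditory recognition function section (Audio) 102 performs speech recognition on audio data input through an audio input device such as a microphone, extracting features and recognizing word sets (text). It consists of multiple objects described later, such as "AudioRecog" and "AuthurDecoder".
[0060]
The contact recognition function section (Tactile) 103 recognizes sensor signals from contact sensors built into, for example, the head of the body, and recognizes external stimuli such as "stroked" or "hit".
[0061]
The internal state manager (ISM: InternalStatusManager) 104 manages several kinds of emotion, such as instincts and feelings, as mathematical models. It manages the internal state of the robot apparatus 1, such as instinct and emotion, according to the external stimuli (ES: ExternalStimula) recognized by the visual recognition function section 101, the auditory recognition function section 102, and the contact recognition function section 103.
[0062]
The emotion model and the instinct model each take recognition results and behavior history as input and manage emotion values and instinct values, respectively. The behavior model can refer to these values.
[0063]
In this embodiment, emotions are organized into multiple layers according to their significance, and each layer operates on its own. Which of the resulting candidate actions to perform is decided according to the external environment and internal state at that time. Although behavior is selected in each layer, actions are expressed preferentially from the lower-order behaviors upward, so that instinctive behaviors such as reflexes and higher-order behaviors such as memory-based action selection can be expressed on one individual without contradiction.
[0064]
To control its behavior according to the recognition results of external stimuli and changes in internal state, the robot apparatus 1 of this embodiment has a short-term memory section 105, which holds short-term memories that are lost over time, and a long-term memory section 106, which holds information for a relatively long time. The division of memory mechanisms into short-term and long-term memory follows neuropsychology.
[0065]
The short-term memory section (ShortTermMemory) 105 is a functional module that holds, for a short period, the targets and events recognized from the external environment by the visual recognition function section 101, the auditory recognition function section 102, and the contact recognition function section 103. For example, it stores the input image from the camera 15 for only a short period of about 15 seconds.
[0066]
The long-term memory section (LongTermMemory) 106 is used to hold, for a long period, information obtained by learning, such as the names of objects. For example, it can associatively store, in a given behavior module, the change in internal state that follows from an external stimulus.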
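[0067]
The behavior control of the robot apparatus 1 of this embodiment is broadly divided into "reflexive behavior" realized by the reflexive behavior section 109, "situation-dependent behavior" realized by the situation-dependent behavior layer 108, and "deliberative behavior" realized by the deliberative layer 107.
[0068]
The reflexive behavior section (ReflexiveSituatedBehaviorsLayer) 109 is the behavior layer that governs behaviors critical to the system. Such system-critical behaviors include the following.
[0069]
(1) System boot behavior
(2) System shutdown behavior
(3) Reflexive behavior that clears a specific error, or reflexive body motion in response to a recognized external stimulus
[0070]
Reflexive behavior includes, for example, behavior in which the recognition result of sensor-input external information is received directly, classified, and used to determine the output behavior directly. Behaviors such as tracking a human face or nodding are also included.
[0071]
Boot and shutdown behaviors need to use all body resources. Reflexive behavior, in contrast, uses only the body resources involved in the specific error or the like.
[0072]
The situation-dependent behavior layer (SituatedBehaviorsLayer) 108 controls behavior suited to the situation the robot apparatus 1 is currently in, based on the contents of the short-term memory section 105 and the long-term memory section 106 and on the internal state managed by the internal state manager 104.
[0073]
The situation-dependent behavior layer 108 provides a state machine (or state transition model) for each behavior, classifies the recognition results of sensor-input external information depending on preceding behaviors and situations, and expresses behavior on the body. It also realizes behavior for keeping the internal state within a certain range (also called "homeostasis behavior"): when the internal state goes outside the specified range, it activates the behavior that returns the internal state to that range so that the behavior is more likely to appear (in practice, behavior is selected taking both the internal state and the external environment into account). Situation-dependent behavior has a slower reaction time than reflexive behavior.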
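The homeostasis behavior described in paragraph 0073 can be illustrated with a small Python sketch. The function name `homeostasis_boost`, the linear rule, and the `gain` parameter are assumptions made for illustration; the source only says that a restoring behavior becomes more likely to appear when the internal state leaves its range.

```python
def homeostasis_boost(value, low, high, gain=1.0):
    """Extra activation for a behavior that restores `value` into [low, high].

    Hypothetical rule: the boost grows linearly with the distance
    by which the internal state lies outside the permitted range.
    """
    if value < low:
        return gain * (low - value)
    if value > high:
        return gain * (value - high)
    return 0.0  # inside the range: no extra activation


in_range = homeostasis_boost(0.5, low=0.3, high=0.7)
too_low = homeostasis_boost(0.1, low=0.3, high=0.7)
too_high = homeostasis_boost(0.9, low=0.3, high=0.7, gain=2.0)
```

In a fuller model this boost would be only one term of the activation level, since the source notes that the external environment is considered as well.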
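[0074]
The deliberative layer (DeliberativeLayer) 107 carries out relatively long-term behavior planning for the robot apparatus 1 based on the contents of the short-term memory section 105 and the long-term memory section 106.
[0075]
Deliberative behavior is behavior carried out by reasoning, and by making a plan to realize it, in response to a given situation or a command from a human. Searching for a route from the robot's position to a target position, for example, is deliberative behavior. Such reasoning and planning may require more processing time and computational load than the reaction time needed for the robot apparatus 1 to maintain interaction, so deliberative behavior performs reasoning and planning while the reflexive and situation-dependent behaviors respond in real time.
[0076]
The deliberative layer 107, the situation-dependent behavior layer 108, and the reflexive behavior section 109 can be written as upper-level application programs that do not depend on the hardware configuration of the robot apparatus 1. In contrast, the hardware-dependent layer control section (ConfigurationDependentActionsAndReactions) 110 directly operates the hardware of the body (the external environment), such as driving the joint actuators, in response to commands from these upper-level applications (behavior modules called "schemas").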
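[0077]
C. Situation-dependent behavior control
The situation-dependent behavior layer (SituatedBehaviorsLayer) 108 controls behavior suited to the situation the robot apparatus 1 is currently in, based on the contents of the short-term memory section 105 and the long-term memory section 106 and on the internal state managed by the internal state manager 104. As part of the situation-dependent behavior layer 108, it includes the reflexive behavior section 109, which executes reflexive, direct body motions in response to recognized external stimuli.
[0078]
C-1. Configuration of the situation-dependent behavior layer
In this embodiment, the situation-dependent behavior layer 108 provides a state machine (or state transition model) for each behavior module, classifies the recognition results of sensor-input external information depending on preceding behaviors and situations, and expresses behavior on the body. A behavior module is described as a schema that has a Monitor function, which judges the situation according to external stimuli and changes in internal state, and an Action function, which realizes the state transitions (state machine) that accompany behavior execution. The situation-dependent behavior layer 108 is configured as a tree in which multiple schemas are connected hierarchically (described later).
[0079]
The situation-dependent behavior layer 108 also realizes behavior for keeping the internal state within a certain range (also called "homeostasis behavior"): when the internal state goes outside the specified range, it activates the behavior that returns the internal state to that range so that the behavior is more likely to appear (in practice, behavior is selected taking both the internal state and the external environment into account).
[0080]
Each functional module in the behavior control system 100 of the robot 1 shown in FIG. 3 is configured as an object. Objects can pass data and invoke one another through inter-object communication using message passing and shared memory. FIG. 4 schematically shows the object configuration of the behavior control system 100 according to this embodiment.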
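[0081]
The visual recognition function section 101 consists of three objects: "FaceDetector", "MultiColorTracker", and "FaceIdentify".
[0082]
FaceDetector is an object that detects face regions in an image frame and outputs the detection result to FaceIdentify. MultiColorTracker is an object that performs color recognition and outputs the recognition result to FaceIdentify and ShortTermMemory (the object that constitutes the short-term memory section 105). FaceIdentify identifies a person by, for example, searching its person dictionary for the detected face image, and outputs the person's ID together with the position and size of the face image region to ShortTermMemory.
[0083]
The auditory recognition function section 102 consists of two objects, "AudioRecog" and "SpeechRecog". AudioRecog receives audio data from an audio input device such as a microphone and performs feature extraction and speech segment detection; it outputs the features of the audio data in the speech segment and the sound source direction to SpeechRecog and ShortTermMemory. SpeechRecog performs speech recognition using the audio features received from AudioRecog together with a speech dictionary and a syntax dictionary, and outputs the set of recognized words to ShortTermMemory.
[0084]
The contact recognition function section 103 consists of an object called "TactileSensor", which recognizes sensor input from the contact sensors, and outputs its recognition results to ShortTermMemory and to InternalStateModel (ISM), the object that manages the internal state.
[0085]
ShortTermMemory (STM) is the object that constitutes the short-term memory section 105. It is a functional module that holds, for a short period, the targets and events recognized from the external environment by the recognition objects described above (for example, it stores the input image from the camera 15 for only about 15 seconds), and it periodically notifies (Notify) its STM client, SituatedBehaviorsLayer, of external stimuli.
[0086]
LongTermMemory (LTM) is the object that constitutes the long-term memory section 106 and is used to hold, for a long period, information obtained by learning, such as the names of objects. For example, LongTermMemory can associatively store, in a given behavior module, the change in internal state that follows from an external stimulus.
[0087]
InternalStatusManager (ISM) is the object that constitutes the internal state manager 104. It manages several kinds of emotion, such as instincts and feelings, as mathematical models, and manages the internal state of the robot apparatus 1, such as instinct and emotion, according to the external stimuli (ES: ExternalStimula) recognized by the recognition objects described above.
[0088]
SituatedBehaviorsLayer (SBL) is the object that constitutes the situation-dependent behavior layer 108. SBL is a client of ShortTermMemory (an STM client); when it receives the periodic notification (Notify) of information on external stimuli (targets and events) from ShortTermMemory, it decides the schema, that is, the behavior module to execute (described later).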
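[0089]
ReflexiveSituatedBehaviorsLayer is the behavior layer that governs behaviors critical to the system. Such system-critical behaviors include the following.
[0090]
(1) System boot behavior
(2) System shutdown behavior
(3) Reflexive behavior that clears a specific error, or reflexive body motion in response to a recognized external stimulus
[0091]
Reflexive behavior includes, for example, behavior in which the recognition result of sensor-input external information is received directly, classified, and used to determine the output behavior directly. Behaviors such as tracking a human face or nodding are also included. Boot and shutdown behaviors need to use all body resources. Reflexive behavior, in contrast, uses only the body resources involved in the specific error or the like.
[0092]
SituatedBehaviorsLayer selects behavior according to the situation, such as external stimuli and changes in internal state. ReflexiveSituatedBehaviorsLayer, in contrast, acts reflexively in response to external stimuli. Because these two objects select behavior independently, the hardware resources of the robot 1 may conflict when the behavior modules (schemas, described later) they each select are executed on the body, making execution impossible.
[0093]
The ResourceManager object arbitrates hardware conflicts that arise when SituatedBehaviorsLayer and ReflexiveSituatedBehaviorsLayer select behaviors. The body is then driven by notifying the objects that realize body motions based on the arbitration result.
[0094]
SoundPerformer, MotionController, and LedController are the objects that realize body motions. SoundPerformer is the object for audio output: it performs speech synthesis according to text commands given by SituatedBehaviorsLayer via ResourceManager and outputs the audio from the speaker on the body of the robot 1. MotionController is the object for operating each joint actuator on the body: in response to a command from SituatedBehaviorsLayer via ResourceManager to move a hand, leg, or the like, it computes the corresponding joint angles. LedController is the object for blinking the LED 19: it drives the blinking of the LED 19 in response to a command received from SituatedBehaviorsLayer via ResourceManager.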
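[0095]
FIG. 5 schematically shows how the situation-dependent behavior layer (SBL) 108 (including the reflexive behavior section 109) performs situation-dependent behavior control. The recognition results of the external environment from the recognition sections 101 to 103 are given to the situation-dependent behavior layer 108 (including the reflexive behavior section 109) as external stimuli. Changes in internal state that follow from those recognition results are also given to the situation-dependent behavior layer 108. The layer can then judge the situation according to external stimuli and changes in internal state and select behavior.
[0096]
FIG. 6 shows a basic example of behavior control by the situation-dependent behavior layer 108 shown in FIG. 5. As shown in the figure, the situation-dependent behavior layer 108 (SBL) calculates the activation level of each behavior module (schema) from external stimuli and changes in internal state, then selects schemas according to their activation levels and executes the behavior. By using a library, for example, the activation level can be calculated in a uniform way for all schemas (the same applies below). For example, the schema with the highest activation level may be selected, or two or more schemas whose activation levels exceed a predetermined threshold may be selected and executed in parallel (provided that there is no hardware resource conflict among the schemas executed in parallel).
[0097]
FIG. 7 shows an example of operation when the situation-dependent behavior layer 108 shown in FIG. 5 performs reflexive behavior. In this case, as shown in the figure, the reflexive behavior section 109 (ReflexiveSBL) included in the situation-dependent behavior layer 108 calculates activation levels taking as direct input the external stimuli recognized by the recognition objects, then selects schemas according to their activation levels and executes the behavior. Changes in internal state are not used in calculating the activation level in this case.
[0098]
FIG. 8 shows an example of operation when the situation-dependent behavior layer 108 shown in FIG. 5 expresses emotion. The internal state manager 104 manages emotions such as instincts and feelings as mathematical models and, when the state value of an emotion parameter reaches a predetermined value, notifies (Notify) the situation-dependent behavior layer 108 of the change in internal state. The situation-dependent behavior layer 108 calculates activation levels with the change in internal state as input, then selects schemas according to their activation levels and executes the behavior. In this case, the external stimuli recognized by the recognition objects are used by the internal state manager 104 (ISM) to manage and update the internal state, but are not used in calculating the schemas' activation levels.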
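[0099]
C-2. Schemas (behavior modules)
The situation-dependent behavior layer 108 provides a state machine for each behavior module, classifies the recognition results of sensor-input external information depending on preceding behaviors and situations, and expresses behavior on the body. A behavior module is described as a schema that has an Action function, which describes body motions and realizes the state transitions (state machine) that accompany behavior execution, and a Monitor function, which evaluates execution of the behavior described in the Action function according to external stimuli and internal state and judges the situation. FIG. 9 schematically shows how the situation-dependent behavior layer 108 is made up of multiple schemas.
[0100]
The situation-dependent behavior layer 108 (more strictly, the part of it that controls ordinary situation-dependent behavior) is configured as a tree in which multiple schemas are connected hierarchically, and performs behavior control by judging in an integrated manner which schema is most suitable given external stimuli and changes in internal state. The tree contains multiple subtrees (or branches), such as a behavior model that expresses ethological situation-dependent behavior as formulas and a subtree for expressing emotion.
[0101]
FIG. 10 schematically shows the tree structure of schemas in the situation-dependent behavior layer 108. As shown in the figure, schemas are arranged layer by layer, from abstract behavior categories toward concrete ones, starting from the root schema, which receives notification (Notify) of external stimuli from the short-term memory section 105. For example, the layer immediately below the root schema contains the schemas "Investigate", "Ingestive", and "Play". Below "Investigate" are schemas that describe more concrete investigative behaviors: "InvestigativeLocomotion", "HeadinAirSniffing", and "InvestigativeSniffing". Likewise, below "Ingestive" are schemas that describe more concrete eating and drinking behaviors, such as "Eat" and "Drink", and below "Play" are schemas that describe more concrete play behaviors, such as "PlayBowing", "PlayGreeting", and "PlayPawing".
[0102]
As illustrated, each schema takes external stimuli and internal state as input. Each schema also has at least a Monitor function and an Action function.
[0103]
FIG. 11 schematically shows the internal structure of a schema. As shown in the figure, a schema consists of an Action function, which describes body motions in the form of a state transition model (state machine) whose state changes as predetermined events occur; a Monitor function, which evaluates each state of the Action function according to external stimuli and internal state and returns the result as an activation level; and a state management section, which stores and manages the state of the schema by treating the Action function's state machine as being in one of the states READY, ACTIVE, or SLEEP.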
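The schema structure described in paragraph 0103 can be pictured with a minimal Python sketch. The class and method names (`Schema`, `SchemaState`, `monitor`, `action`), the averaging rule inside `monitor`, and the example resources are assumptions made for illustration; the source does not specify an implementation.

```python
from dataclasses import dataclass
from enum import Enum


class SchemaState(Enum):
    READY = "READY"    # prepared, not yet started
    ACTIVE = "ACTIVE"  # currently executing
    SLEEP = "SLEEP"    # suspended, waiting


@dataclass
class Schema:
    name: str
    resources: frozenset          # body resources the action would use
    steps: list                   # states of the Action state machine
    state: SchemaState = SchemaState.READY
    step_index: int = 0           # kept across suspension so execution can resume

    def monitor(self, stimulus: float, internal: float) -> tuple:
        """Evaluate the schema: return (activation level, resources)."""
        # Hypothetical scoring rule: average of stimulus and internal state.
        return (stimulus + internal) / 2.0, self.resources

    def action(self) -> str:
        """Advance the state machine by one step and report the step run."""
        self.state = SchemaState.ACTIVE
        step = self.steps[self.step_index]
        self.step_index = (self.step_index + 1) % len(self.steps)
        return step


greet = Schema("PlayGreeting", frozenset({"head", "speaker"}), ["turn", "nod", "speak"])
level, used = greet.monitor(stimulus=0.8, internal=0.4)
first = greet.action()
greet.state = SchemaState.SLEEP     # suspended: step_index is not reset
greet_resumed = greet.action()      # resumes with the next step
```

Keeping `step_index` when the schema goes to SLEEP mirrors the statement in the source that suspension does not reset the state machine.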
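[0104]
The Monitor function calculates the activation level (Activation Level: AL value) of the schema according to external stimuli and internal state. In a tree structure like the one in FIG. 10, an upper (parent) schema can call the Monitor function of a lower (child) schema with the external stimuli and internal state as arguments, and the child schema returns its AL value. A schema can in turn call the Monitor functions of its own children to calculate its AL value. Because the AL values from each subtree are returned to the root schema, the root can judge in an integrated manner the optimal schema, that is, the optimal behavior, for the external stimuli and changes in internal state.
[0105]
For example, the root schema may select the schema with the highest AL value, or select two or more schemas whose AL values exceed a predetermined threshold and execute them in parallel (provided there is no hardware resource conflict among schemas executed in parallel).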
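The recursive Monitor calls and the root's selection described in paragraphs 0104 and 0105 might look like the following Python sketch. The `Node` class, the rule that a parent's AL value is the maximum over its children, and the leaf scoring lambdas are assumptions for illustration; the resource-conflict check required for parallel execution is omitted here.

```python
class Node:
    """A schema in the tree; leaves carry a scoring function (assumed)."""

    def __init__(self, name, score=None, children=()):
        self.name = name
        self.score = score            # leaf: callable(stimulus, internal) -> AL
        self.children = list(children)

    def monitor(self, stimulus, internal):
        """Return this schema's AL value, calling children's Monitor if any."""
        if not self.children:
            return self.score(stimulus, internal)
        # Assumption: a parent's AL is the best AL found in its subtree.
        return max(c.monitor(stimulus, internal) for c in self.children)


def select(root, stimulus, internal, threshold=None):
    """Root picks the child with the highest AL, or all children above a threshold."""
    levels = {c.name: c.monitor(stimulus, internal) for c in root.children}
    if threshold is None:
        return [max(levels, key=levels.get)]
    return sorted(n for n, al in levels.items() if al > threshold)


root = Node("Root", children=[
    Node("Investigate", children=[
        Node("InvestigativeLocomotion", lambda s, i: 0.2 * s),
        Node("InvestigativeSniffing", lambda s, i: 0.5 * s),
    ]),
    Node("Ingestive", children=[Node("Eat", lambda s, i: i)]),
    Node("Play", children=[Node("PlayGreeting", lambda s, i: 0.9 * s)]),
])

best = select(root, stimulus=1.0, internal=0.7)
parallel = select(root, stimulus=1.0, internal=0.7, threshold=0.6)
```

With these inputs the subtree levels are 0.5, 0.7, and 0.9, so the single-winner mode picks "Play" and the threshold mode picks "Ingestive" and "Play".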
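[0106]
FIG. 12 schematically shows the internal structure of the Monitor function. As shown in the figure, the Monitor function has a behavior-inducing evaluation value calculator, which calculates as the activation level an evaluation value for inducing the behavior described in the schema, and a resource calculator, which identifies the body resources to be used. In the example shown in FIG. 11, when the Monitor function is called by the behavior state control section (provisional name), which manages schemas (behavior modules), it virtually executes the state machine of the Action function, calculates the behavior-inducing evaluation value (the activation level) and the resources used, and returns them.
[0107]
The Action function has a state machine (or state transition model, described later) that describes the behavior the schema itself carries. In a tree structure like the one in FIG. 10, a parent schema can call the Action function to start or suspend execution of a child schema. In this embodiment, the Action state machine is not initialized until it becomes READY. In other words, suspension does not reset the state, and because the schema preserves its working data from execution, it can be suspended and resumed (described later).
[0108]
In the example shown in FIG. 11, the behavior state control section (provisional name), which manages schemas (behavior modules), selects the behavior to execute based on the return values of the Monitor functions, and either calls the Action function of the corresponding schema or instructs a transition of the schema state stored in the state management section. For example, it selects the schema with the highest activation level (behavior-inducing evaluation value), or selects multiple schemas in priority order so that resources do not conflict. The behavior state control section also controls schema states: when a higher-priority schema starts and a resource conflict arises, it moves the lower-priority schema from ACTIVE to SLEEP, and returns it to ACTIVE once the conflict is resolved.
[0109]
As shown in FIG. 13, a single behavior state control section may be provided in the situation-dependent behavior layer 108 to manage all the schemas in the layer centrally.
[0110]
In the illustrated example, the behavior state control section has a behavior evaluation section, a behavior selection section, and a behavior execution section. The behavior evaluation section calls the Monitor function of each schema, for example at a predetermined control cycle, and obtains each schema's activation level and resources used.
[0111]
The behavior selection section performs behavior control through the schemas and manages body resources. For example, it selects schemas in descending order of the collected activation levels and selects two or more schemas simultaneously such that the resources they use do not conflict (within the scope managed inside the situation-dependent behavior layer 108).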
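The selection rule in paragraph 0111 (highest activation level first, no shared resources) can be sketched as a greedy pass in Python. The function name `select_schemas`, the tuple layout, and the resource names are assumptions for illustration, not the method defined by the source.

```python
def select_schemas(candidates):
    """candidates: list of (name, activation_level, set_of_resources)."""
    selected, in_use = [], set()
    # Visit candidates from highest to lowest activation level.
    for name, level, resources in sorted(candidates, key=lambda c: c[1], reverse=True):
        if in_use.isdisjoint(resources):   # no shared body resource
            selected.append(name)
            in_use |= set(resources)
    return selected


chosen = select_schemas([
    ("Eat", 0.4, {"mouth", "head"}),
    ("PlayGreeting", 0.9, {"head", "speaker"}),
    ("InvestigativeLocomotion", 0.6, {"legs"}),
])
```

Here "Eat" is skipped because it needs the head, which the higher-ranked "PlayGreeting" already holds.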
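[0112]
The behavior execution section issues behavior execution commands to the Action functions of the selected schemas and controls schema execution by managing schema states (READY, ACTIVE, SLEEP). For example, when a higher-priority schema starts and a resource conflict arises, it moves the lower-priority schema from ACTIVE to SLEEP and returns it to ACTIVE once the conflict is resolved.
[0113]
Alternatively, the function of the behavior state control section may be placed in each schema in the situation-dependent behavior layer 108. For example, when the schemas form a tree as shown in FIG. 10 (see FIG. 14), the behavior state control section of an upper (parent) schema calls the Monitor function of a lower (child) schema with external stimuli and internal state as arguments and receives the activation level and resources used as return values. The child schema in turn calls the Monitor functions of its own children to calculate its activation level and resources. The activation levels and resources from each subtree are returned to the behavior state control section of the root schema, which judges in an integrated manner the optimal schema (behavior) for the external stimuli and changes in internal state and calls the Action function to start or suspend execution of child schemas.
[0114]
FIG. 15 schematically shows the mechanism for controlling ordinary situation-dependent behavior in the situation-dependent behavior layer 108.
[0115]
As shown in the figure, external stimuli are input (Notify) to the situation-dependent behavior layer 108 from the short-term memory section 105, and changes in internal state are input from the internal state manager 104. The layer consists of multiple subtrees, such as a behavior model that expresses ethological situation-dependent behavior as formulas and a subtree for expressing emotion. In response to notification (Notify) of an external stimulus, the root schema calls the Monitor function of each subtree, refers to the activation levels (AL values) returned, makes an integrated behavior selection, and calls the Action function of the subtree that realizes the selected behavior. The situation-dependent behavior decided in the layer 108 is applied to body motion (MotionController) after the resource manager arbitrates hardware resource conflicts with the reflexive behavior of the reflexive behavior section 109.
[0116]
Within the situation-dependent behavior layer 108, the reflexive behavior section 109 executes reflexive, direct body motions in response to external stimuli recognized by the recognition objects described above (for example, dodging instantly when an obstacle is detected). Unlike the case of controlling ordinary situation-dependent behavior (FIG. 10), multiple schemas that directly receive signals from the recognition objects are therefore arranged in parallel rather than hierarchically.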
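[0117]
C-3. Configuration of the reflexive behavior layer
The reflexive behavior layer 109 is the behavior layer that governs behaviors critical to the system, and provides a state machine (or state transition model) for each behavior module. A behavior module is described as a schema that has a Monitor function, which judges the situation according to external stimuli and changes in internal state, and an Action function, which realizes the state transitions (state machine) that accompany behavior execution (as above).
[0118]
FIG. 16 shows the internal structure of the reflexive behavior layer 109. As shown in the figure, it is made up of multiple schemas, each structured as described with reference to FIGS. 11 and 12.
[0119]
In the example shown in the figure, the reflexive behavior layer 109 has, below the root schema, a schema that boots the system, a schema that shuts the system down, and a group of schemas that perform reflexive behaviors clearing specific errors or reflexive body motions in response to recognized external stimuli.
[0120]
When an external stimulus is input from a recognition function section such as the visual recognition function section 101, the auditory recognition function section 102, or the contact recognition function section 103, the root schema calls the Monitor function of each lower schema, receives as return values the activation level and the resources used when the schema executes, and has the schema with the highest activation level execute.
[0121]
The boot and shutdown behavior modules need to use all body resources. The error handling/reflexive behavior modules, in contrast, use only the body resources involved in the specific error or the like.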
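[0122]
C-4. Cooperative operation of the situation-dependent behavior layer and the reflexive behavior layer
The situation-dependent behavior layer 108 can selectively execute behavior that depends on the situation, consisting of the external stimuli obtained from sensor input and the robot's own internal state, and on preceding behaviors and situations.
[0123]
The reflexive behavior layer 109, on the other hand, executes behaviors critical to the system, such as system boot, shutdown, and clearing specific errors.
[0124]
Situation-dependent behavior and reflexive behavior are implemented as different control layers in the robot's behavior control system (see FIG. 9) and can be executed simultaneously or in parallel. In such a layered system architecture, however, there is no means of communication, that is, of exchanging information, between the behavior layers. If the layers independently express activities that interfere with one another, the system malfunctions and it becomes difficult for the system as a whole to perform consistently.
[0125]
In this embodiment, therefore, a so-called interface function is provided for communicating log information describing the activity of schemas in one behavior layer to the other behavior layer, and the other layer monitors schema execution through this interface so that schemas are executed cooperatively or in a coordinated manner. As a result, system performance improves in terms of use of computer resources, user satisfaction, and so on.
[0126]
FIG. 17 illustrates the mechanism for realizing cooperative operation that allows simultaneous parallel execution between the situation-dependent behavior layer 108 and the reflexive behavior layer 109.
[0127]
As shown in the figure, the situation-dependent behavior layer 108 is provided with a log memory client (log memory module) 151 and the reflexive behavior layer 109 with a log memory server (log memory module) 152, and the server and client are connected through a communication section 153.
[0128]
In the situation-dependent behavior layer 108, a reflexive behavior layer monitoring schema 154 is placed immediately below the root schema of the tree of schemas. The reflexive behavior layer monitoring schema 154 has a dummy Action function, and its Monitor function responds to schema execution in the reflexive behavior layer 109 by maximizing its own activation level and returning all resources (or the resources being used in the reflexive behavior layer 109) as its return value. As a result, the root schema always selects the reflexive behavior layer monitoring schema 154, so resource contention between the situation-dependent behavior layer 108 and the reflexive behavior layer 109 is avoided and system-critical schemas can be executed without hindrance.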
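The behavior of the reflexive behavior layer monitoring schema 154 in paragraph 0128 can be sketched in Python as follows. The class name, the `reflex_active` flag, the `MAX_LEVEL` ceiling, the resource names, and the helper `root_select` are all assumptions made for illustration.

```python
ALL_RESOURCES = frozenset({"head", "arms", "legs", "speaker", "led"})  # assumed set
MAX_LEVEL = 100.0  # assumed ceiling for activation levels


class ReflexLayerMonitorSchema:
    """Sits directly under the root of the situation-dependent layer."""

    def __init__(self):
        self.reflex_active = False   # updated from the reflexive layer's log

    def monitor(self):
        """Return (activation level, resources) for the root's selection."""
        if self.reflex_active:
            return MAX_LEVEL, ALL_RESOURCES   # always wins, claims everything
        return 0.0, frozenset()               # otherwise stays out of the way

    def action(self):
        """Dummy Action: holds the resources but moves nothing."""
        return None


def root_select(candidates):
    """Pick the (name, level, resources) entry with the highest level."""
    return max(candidates, key=lambda c: c[1])[0]


watcher = ReflexLayerMonitorSchema()
play = ("Play", 0.9, frozenset({"head"}))

idle_level, idle_res = watcher.monitor()
before = root_select([play, ("ReflexMonitor", idle_level, idle_res)])

watcher.reflex_active = True            # e.g. a shutdown schema has started
busy_level, busy_res = watcher.monitor()
during = root_select([play, ("ReflexMonitor", busy_level, busy_res)])
```

Because the monitoring schema competes under the layer's ordinary selection rule, no special cross-layer arbitration code is needed inside the situation-dependent layer; that is the design point the source emphasizes.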
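[0129]
A mechanism for generating log information about schema execution is provided in each behavior layer. The mechanism for exchanging log information between behavior layers is implemented in client-server form. Each behavior layer also has a mechanism for retrieving log information from the log memory.
[0130]
While the system is running, schemas are selected and executed as appropriate according to the mechanisms described above. In each behavior layer, the logging mechanism generates an activity log containing information about the schemas being executed. The log information preferably includes data such as the following.
[0131]
(1) Schema identification information (layer ID, schema ID, schema name)
(2) Execution status (for example, READY, ACTIVE, SLEEP)
(3) Sub-execution status (for example, Start, End, Success, Fail)
(4) Execution target
(5) Execution time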
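The five log fields listed in paragraph 0131 map naturally onto a record type. The Python sketch below is one possible layout; the type names, the tuple form of the sub-status, the use of seconds for execution time, and the sample values are assumptions, since the source lists only the field categories.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    READY = "READY"
    ACTIVE = "ACTIVE"
    SLEEP = "SLEEP"


class SubStatus(Enum):
    START = "Start"
    END = "End"
    SUCCESS = "Success"
    FAIL = "Fail"


@dataclass(frozen=True)
class LogRecord:
    layer_id: int       # (1) identification: layer ID
    schema_id: int      # (1) identification: schema ID
    schema_name: str    # (1) identification: schema name
    status: Status      # (2) execution status
    sub_status: tuple   # (3) sub-execution status, e.g. (END, SUCCESS)
    target: str         # (4) execution target
    time: float         # (5) execution time (seconds is an assumption)


started = LogRecord(109, 2, "Shutdown", Status.ACTIVE, (SubStatus.START,), "body", 0.0)
finished = LogRecord(109, 2, "Shutdown", Status.ACTIVE,
                     (SubStatus.END, SubStatus.SUCCESS), "body", 12.5)
```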
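[0132]
FIG. 18 shows a configuration example of the log memory module, and FIG. 19 shows a setting example of the memory configuration.
[0133]
Log information generated in one behavior layer is transferred to the log memory of the other behavior layer using the inter-layer log exchange mechanism, that is, the communication section 153.
[0134]
In the example shown in FIG. 17, the log memory modules of the behavior layers operate as a client 151 and a server 152. A log memory module stores the log information generated by schema execution within its own behavior layer. When an external behavior layer is connected as a client, the log memory module forwards log information to that client. At the same time, a log memory module can connect as a client to the memory module of an external behavior layer, in which case it can retrieve and store the log information generated in that external layer.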
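The client-server relationship between log memory modules in paragraph 0134 can be sketched in Python as below. The class `LogMemoryModule`, its method names, and the in-process list append that stands in for the communication section 153 are assumptions for illustration; a real system would use some form of inter-process or inter-object messaging.

```python
class LogMemoryModule:
    """Stores a layer's log records and forwards them to connected clients."""

    def __init__(self, layer_name):
        self.layer_name = layer_name
        self.records = []        # local and received log records
        self.clients = []        # modules of other layers connected as clients

    def connect_to(self, server):
        """Act as a client: ask `server` to forward its future records here."""
        server.clients.append(self)

    def write(self, record):
        """Store a record produced in this layer and forward it to clients."""
        self.records.append(record)
        for client in self.clients:
            client.records.append(record)   # stands in for communication section 153


reflex_log = LogMemoryModule("reflexive")        # log memory server 152
situated_log = LogMemoryModule("situated")       # log memory client 151
situated_log.connect_to(reflex_log)

reflex_log.write(("reflexive", "Boot", "ACTIVE", "Start"))
situated_log.write(("situated", "Play", "ACTIVE", "Start"))
```

After these two writes the client holds records from both layers while the server holds only its own, which matches the one-way connection shown in the embodiment.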
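[0135]
Using the log retrieval mechanism, each schema can access log information from the log memory module in its own behavior layer. Because of the communication function between the internal log memory module and external behavior layers, the activity log contains information from both the schema's own layer and external layers.
[0136]
As already stated, inappropriate interference between schemas (contention for resources) causes the system to malfunction and must therefore be avoided or restricted.
[0137]
More specifically, the situation-dependent behavior layer 108 should not obstruct the execution of system-critical behaviors. For example, schemas of the situation-dependent behavior layer 108 should not be activated while, in the reflexive behavior layer 109, the system is booting (boot schema active), shutting down (shutdown schema active), or handling an error situation detected and handled during error avoidance or reflexive behavior (error handling/reflexive behavior schema).
[0138]
The reflexive behavior layer monitoring schema 154 is placed immediately below the root schema of the tree of schemas in the situation-dependent behavior layer 108. It monitors the activity log in the log memory client 151 and detects the start and end of each schema in the reflexive behavior layer 109, namely the boot schema, the shutdown schema, and the error avoidance schemas. When activity of a schema in the reflexive behavior layer 109 is detected, the monitoring schema 154 takes appropriate action, for example restricting the execution of other schemas in the situation-dependent behavior layer 108. When the activity of the schema in the reflexive behavior layer 109 ends, it lifts the restriction imposed on detection.
[0139]
Activity logging is enabled in the reflexive behavior layer 109. That is, log information is generated whenever a schema executes.
[0140]
The log memory module in the situation-dependent behavior layer 108 is connected as a client to the log memory module in the reflexive behavior layer 109. Log information generated in the reflexive behavior layer 109 is therefore transferred to the log memory module in the situation-dependent behavior layer 108, that is, the log memory client 151.
[0141]
The reflexive behavior layer monitoring schema 154 searches for the logs of relevant schemas using identification information such as layer ID, schema ID, and schema name. A specific schema can be found using both the layer ID and the schema ID. Using the layer ID alone retrieves the activity of all schemas in that layer.
[0142]
Execution status and sub-execution status are used to find the relevant activity of the selected schema. For example, the start and end of execution are found by looking for (ACTIVE, Start) and (ACTIVE, (End, Success)), respectively.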
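The log search in paragraphs 0141 and 0142 (filter by layer ID and schema ID, then look for the Start and End markers) can be sketched in Python. The dictionary layout, the function names `find` and `is_running`, and the ID values are assumptions for illustration.

```python
def find(log, layer_id, schema_id=None):
    """Records of one layer, optionally narrowed to a single schema."""
    return [r for r in log
            if r["layer_id"] == layer_id
            and (schema_id is None or r["schema_id"] == schema_id)]


def is_running(log, layer_id, schema_id):
    """True if the latest ACTIVE record of the schema is a Start, not an End."""
    running = False
    for r in find(log, layer_id, schema_id):
        if r["status"] != "ACTIVE":
            continue
        if "Start" in r["sub"]:
            running = True
        elif "End" in r["sub"]:
            running = False
    return running


REFLEX = 109  # layer ID chosen to match the reference numeral; an assumption
log = [
    {"layer_id": REFLEX, "schema_id": 1, "status": "ACTIVE", "sub": ("Start",)},
    {"layer_id": REFLEX, "schema_id": 2, "status": "ACTIVE", "sub": ("Start",)},
    {"layer_id": REFLEX, "schema_id": 1, "status": "ACTIVE", "sub": ("End", "Success")},
]

boot_running = is_running(log, REFLEX, 1)
shutdown_running = is_running(log, REFLEX, 2)
layer_records = find(log, REFLEX)
```

In this sample, schema 1 has both a Start and an End record and so is no longer running, while schema 2 has only a Start record.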
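[0143]
In this embodiment, schema execution in the situation-dependent behavior layer 108 is selected or restricted by the following criteria, which use resource conflicts and activation levels.
[0144]
(1) Execution rights are granted to all schemas whose resources do not conflict.
(2) When a conflict arises, execution rights are granted to the schema with the higher activation level, and execution of all other conflicting schemas already running is stopped.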
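The two criteria in paragraph 0144 can be expressed as a small arbitration function. This Python sketch is an illustration only: the name `arbitrate`, the data layout, and the choice that a tie or a lower level leaves the running schema in place are assumptions not stated in the source.

```python
def arbitrate(running, requester):
    """running: dict name -> (level, resources); requester: (name, level, resources).

    Returns (granted, stopped) following the two rules above.
    """
    name, level, resources = requester
    conflicts = [n for n, (_, res) in running.items() if res & resources]
    if not conflicts:
        return True, []                       # rule (1): no conflict
    if all(level > running[n][0] for n in conflicts):
        return True, conflicts                # rule (2): requester wins
    return False, []                          # a running schema has priority


running = {"Walk": (0.5, {"legs"}), "Speak": (0.3, {"speaker"})}
free = arbitrate(running, ("Blink", 0.1, {"led"}))
wins = arbitrate(running, ("Kick", 0.8, {"legs"}))
loses = arbitrate(running, ("Hop", 0.2, {"legs"}))
```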
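[0145]
This schema selection and execution policy applies only within a behavior layer and considers only the schemas in that layer. Consequently, resource conflicts between schemas executed in different behavior layers cannot be detected, and inappropriate interference between schema activities results.
[0146]
In this embodiment, the reflexive behavior layer monitoring schema 154 can restrict the activity of other schemas in the situation-dependent behavior layer 108 by using the schema selection and execution policy based on resource conflicts and activation levels. That is, the monitoring schema 154 requests the appropriate resources and raises its activation level sufficiently high. Through the resulting resource conflict, running schemas are stopped and other schemas are prevented from being selected and executed.
[0147]
The reflexive behavior layer monitoring schema 154 selects the appropriate resources according to the schema activity detected in the reflexive behavior layer 109. While the boot or shutdown schema is selected and running, it requests all body resources. While an error avoidance or reflexive behavior schema is selected and running, it requests only the body resources involved in the specific error or the like.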
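The resource choice in paragraph 0147 can be sketched as a lookup in Python. The resource names, the schema names in `REFLEX_RESOURCES`, and the conservative fallback for unknown schemas are hypothetical; the source states only that boot and shutdown claim all resources while error and reflex schemas claim the resources they involve.

```python
ALL_RESOURCES = frozenset({"head", "arms", "legs", "speaker", "led"})  # assumed set

# Assumed mapping from an error/reflex schema to the resources it touches.
REFLEX_RESOURCES = {
    "LegOverheatRecovery": frozenset({"legs"}),
    "FaceTracking": frozenset({"head"}),
}


def resources_to_request(active_reflex_schemas):
    """Union of resources the monitoring schema should claim."""
    claimed = set()
    for name in active_reflex_schemas:
        if name in ("Boot", "Shutdown"):
            return ALL_RESOURCES          # boot/shutdown need the whole body
        # Assumption: an unknown schema conservatively claims everything.
        claimed |= REFLEX_RESOURCES.get(name, ALL_RESOURCES)
    return frozenset(claimed)


none_active = resources_to_request([])
tracking = resources_to_request(["FaceTracking"])
shutting_down = resources_to_request(["FaceTracking", "Shutdown"])
```

Returning an empty set when nothing is active corresponds to the monitoring schema releasing its resources so that the other schemas can resume.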
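[0148]
Because the reflexive behavior layer monitoring schema 154 is placed immediately below the root schema in the situation-dependent behavior layer 108, it can strongly influence the selection and execution of other schemas in that layer.
[0149]
The Monitor function of the reflexive behavior layer monitoring schema 154 responds to schema execution in the reflexive behavior layer 109 by maximizing its own activation level and returning all resources (or the resources being used in the reflexive behavior layer 109) as its return value. As a result, the root schema always selects the monitoring schema 154, so resource contention between the situation-dependent behavior layer 108 and the reflexive behavior layer 109 is avoided and system-critical schemas can be executed without hindrance.
[0150]
The monitoring schema 154 can also lift the restrictions on schema selection and execution imposed on detecting activity by releasing the resources it requested and lowering its activation level. This removes the resource conflict and lets the other schemas continue their normal activity.
[0151]
Thus, through the mediation of the reflexive behavior layer monitoring schema 154, resource contention among multiple behavior layers is suitably avoided, and the layers can execute in parallel, cooperatively and organically, without inappropriate interference.
[0152]
Each behavior layer can select and execute schemas according to its own policy. Resource management within each layer is also simplified.
[0153]
In addition, a new behavior layer can be developed independently of the existing layers and requires no special modification when it is installed in the system.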
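[0154]
[Supplement]
The present invention has been described in detail above with reference to specific embodiments. It is self-evident, however, that those skilled in the art can modify or substitute the embodiments without departing from the gist of the invention.
[0155]
The gist of the present invention is not necessarily limited to products called "robots". That is, the invention can equally be applied to products belonging to other industrial fields, such as toys, provided they are mechanical devices that use electrical or magnetic action to perform movements resembling those of a human, other mobile machines, or information processing systems that control the behavior of mechanical devices.
[0156]
In short, the present invention has been disclosed by way of example, and the contents of this specification should not be interpreted restrictively. To determine the gist of the invention, the claims set out at the beginning should be taken into account.
[0157]
[Effects of the invention]
As described in detail above, the present invention provides an excellent robot behavior control system that can appropriately perform both situation-dependent behavior, in which the situation the robot is in is judged in an integrated manner and an appropriate behavior is selected, and reflexive behavior, in which reflexive body motions are realized in response to recognized external stimuli.
[0158]
The present invention also provides an excellent robot behavior control system that can execute situation-dependent behavior and reflexive behavior in parallel, cooperatively and organically, without inappropriate interference between them.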
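[Brief description of the drawings]
[FIG. 1] A diagram schematically showing the functional configuration of the robot apparatus 1 used to implement the present invention.
[FIG. 2] A diagram showing the configuration of the control unit 20 in more detail.
[FIG. 3] A diagram schematically showing the functional configuration of the behavior control system 100 of the robot apparatus 1 according to an embodiment of the present invention.
[FIG. 4] A diagram schematically showing the object configuration of the behavior control system 100 according to an embodiment of the present invention.
[FIG. 5] A diagram schematically showing the form of situation-dependent behavior control by the situation-dependent behavior layer 108.
[FIG. 6] A diagram showing a basic example of behavior control by the situation-dependent behavior layer 108 shown in FIG. 5.
[FIG. 7] A diagram showing an example of operation when the situation-dependent behavior layer 108 shown in FIG. 5 performs reflexive behavior.
[FIG. 8] A diagram showing an example of operation when the situation-dependent behavior layer 108 shown in FIG. 5 expresses emotion.
[FIG. 9] A diagram schematically showing how the situation-dependent behavior layer 108 is made up of multiple schemas.
[FIG. 10] A diagram schematically showing the tree structure of schemas in the situation-dependent behavior layer 108.
[FIG. 11] A diagram schematically showing the internal structure of a schema.
[FIG. 12] A diagram schematically showing the internal structure of the Monitor function.
[FIG. 13] A diagram schematically showing a configuration example of the behavior state control section.
[FIG. 14] A diagram schematically showing another configuration example of the behavior state control section.
[FIG. 15] A diagram schematically showing the mechanism for controlling ordinary situation-dependent behavior in the situation-dependent behavior layer 108.
[FIG. 16] A diagram showing the internal structure of the reflexive behavior layer 109.
[FIG. 17] A diagram explaining the mechanism of cooperative operation between the situation-dependent behavior layer 108 and the reflexive behavior layer 109.
[FIG. 18] A diagram showing a configuration example of the log memory module.
[FIG. 19] A diagram showing a setting example of the memory configuration.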
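[Explanation of reference numerals]
1 ... Robot apparatus
15 ... CCD camera
16 ... Microphone
17 ... Speaker
18 ... Touch sensor
19 ... LED indicator
20 ... Control unit
21 ... CPU
22 ... RAM
23 ... ROM
24 ... Nonvolatile memory
25 ... Interface
26 ... Wireless communication interface
27 ... Network interface card
28 ... Bus
29 ... Keyboard
40 ... Input/output section
50 ... Drive section
51 ... Motor
52 ... Encoder
53 ... Driver
100 ... Behavior control system
101 ... Visual recognition function section
102 ... Auditory recognition function section
103 ... Contact recognition function section
105 ... Short-term memory section
106 ... Long-term memory section
107 ... Deliberative layer
108 ... Situation-dependent behavior layer
109 ... Reflexive behavior section
[0001]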
[Technical field of the invention]
The present invention relates to a behavior control system for a robot that operates autonomously and realizes realistic communication with a user, and in particular to a behavior control system, a behavior control method, and a robot apparatus for a situation-dependent-behavior robot that judges in an integrated manner the situation in which it is placed, such as recognition results of the external environment (vision, hearing, and so on) and internal states (instinct, emotion, and so on), and selects an appropriate behavior.
[0002]
More specifically, the present invention relates to a robot behavior control system that appropriately performs both situation-dependent behavior, in which the situation the robot is in is judged in an integrated manner and an appropriate behavior is selected, and reflexive behavior that is critical to the system, and in particular to a robot behavior control system that executes situation-dependent behavior and reflexive behavior in parallel, cooperatively and organically, without inappropriate interference between them.
[0003]
[Prior art]
A mechanical device that uses electrical or magnetic action to perform movements resembling those of a human is called a "robot". The word is said to derive from the Slavic word "ROBOTA" (slave machine). In Japan, robots began to spread in the late 1960s, but most were industrial robots, such as manipulators and transfer robots, intended to automate factory production work and make it unmanned.
[0004]
Recently, research and development has advanced on the structure and stable walking control of legged mobile robots, such as pet robots that imitate the body mechanisms and movements of four-legged animals like dogs, cats, and bears, and "humanoid" robots that imitate the body mechanisms and movements of animals that walk upright on two legs, like humans and monkeys, and expectations for their practical use are growing. Legged mobile robots are less stable than crawler robots, which makes posture and walking control harder, but they excel at flexible walking and running, such as climbing stairs and stepping over obstacles.
[0005]
One use of legged mobile robots is to perform difficult tasks in industrial and production activities on behalf of people: for example, maintenance work in nuclear power plants, thermal power plants, and petrochemical plants; transport and assembly of parts in factories; cleaning of high-rise buildings; and rescue at fire sites and elsewhere, among other dangerous and difficult work.
[0006]
Another use of legged mobile robots is not the work support described above but a life-oriented use, that is, "symbiosis" with humans or "entertainment". A robot of this type faithfully reproduces the movement mechanisms of legged animals such as humans, dogs (pets), and bears, and rich emotional expression using the limbs. It is also required not merely to execute pre-input motion patterns faithfully, but to respond dynamically and in a lively way to the words and attitudes received from the user (or another robot), such as "praise", "scold", and "slap".
[0007]
In the conventional toy machine, the relationship between the user operation and the response operation is fixed, and the operation of the toy cannot be changed according to the user's preference. As a result, the user eventually gets tired of toys that repeat only the same operation. On the other hand, an intelligent robot autonomously selects an action including a dialogue and a body motion, so that it is possible to realize realistic communication at a higher intellectual level. As a result, the user feels a deep attachment and familiarity with the robot.
[0008]
In robots and other realistic dialogue systems, it is common to select actions sequentially in response to changes in the external environment perceived through vision, hearing, and the like. Another example of an action selection mechanism models feelings such as instinct and emotion to manage the internal state of the system, and selects an action according to changes in that internal state. Of course, the internal state of the system changes with changes in the external environment, and also changes as a result of the selected action.
[0009]
However, there are few examples related to situation-dependent behavior control, in which the situation where the robot is placed, such as the external environment and the internal state, is integrated and the behavior is selected.
[0010]
Here, the internal state includes, for example, an instinct element corresponding to access to the limbic system in a living body, and intrinsic and social needs corresponding to access to the cerebral neocortex. In addition, it is composed of elements captured by the ethological model, and elements called emotions such as joy, sadness, anger, and surprise.
[0011]
In conventional intelligent robots and other autonomous interactive robots, internal states composed of various factors such as instinct and emotion are all collected as “emotional” to manage the internal state one-dimensionally. That is, each element constituting the internal state exists in parallel with each other, and the action is selected only based on the external situation or the internal state without a clear selection criterion.
[0012]
Further, in the conventional system, action selection and expression hold all actions in a single dimension and decide which one to select. For this reason, as the number of actions increases, selection becomes more complicated, and it becomes more difficult to select an action that reflects the situation or internal state at that time.
[0013]
On the other hand, for an intelligent robot to achieve autonomous behavior, it is thought that, in addition to situation-dependent behavior, it must realize reflexive body motions that are critical to the system.
[0014]
Reflexive behavior here refers to, for example, system boot or shutdown, handling when an error occurs, or behavior that directly receives the recognition result of external information input from a sensor, classifies it, and directly determines the output behavior. For example, behaviors such as following a human face or nodding are preferably implemented as reflex actions.
[0015]
The situation-dependent action and the reflex action can be implemented as different control hierarchy levels in a robot action control system, and can be executed simultaneously or in parallel. However, in such a hierarchical system architecture, there is no way to communicate, that is, exchange information, between the behavioral layers. For this reason, if activities that interfere with each other in each behavioral hierarchy appear independently, it becomes difficult to perform consistent performance as a whole system.
[0016]
[Problems to be solved by the invention]
An object of the present invention is to provide an excellent robot behavior control system that can appropriately perform both reflexive behavior, which realizes a reflexive body motion in response to a recognized external stimulus, and situation-dependent behavior, in which the situation in which the robot is placed is judged in an integrated manner and an appropriate action is selected.
[0017]
It is a further object of the present invention to provide an excellent robot behavior control system capable of executing situation-dependent behavior and reflexive behavior in parallel, cooperatively and organically, without inappropriate interference between them.
[0018]
[Means and action for solving the problems]
The present invention has been made in view of the above problems, and is a behavior control system for a robot that operates autonomously, characterized by comprising:
A first behavior hierarchy for controlling the behavior of the robot;
A second behavior hierarchy that controls the behavior of the robot independently of the first behavior hierarchy;
A log memory module that is provided for each behavior hierarchy and records the execution status of the behaviors in that behavior hierarchy; and
A connection unit for communicating recorded contents between the log memory modules of the behavior hierarchies.
[0019]
Note that the term “system” as used herein refers to a logical collection of a plurality of devices (or functional modules that realize specific functions), regardless of whether each device or functional module is in a single housing.
[0020]
Here, the first behavior hierarchy is, for example, a situation-dependent behavior hierarchy in which a situation where the robot is placed is determined in an integrated manner and an appropriate behavior is selected. The second behavior hierarchy is a reflex behavior hierarchy that governs extremely important behaviors of the system, such as booting and shutting down the system and handling various errors.
[0021]
In the first and second behavior hierarchies, behavior control is basically performed independently. However, if inappropriate interference (contention for the resources in use) occurs between the behavior hierarchies, the system malfunctions. Therefore, the first behavior hierarchy controls its own behavior according to the recorded contents of the log memory module of the second behavior hierarchy obtained through the connection unit.
[0022]
For example, based on the recorded contents of the log memory module of the second behavior hierarchy obtained through the connection unit, the first behavior hierarchy stays in a standby state while the second behavior hierarchy is active.
[0023]
Alternatively, based on the recorded contents of the log memory module of the second behavior hierarchy obtained via the connection unit, the first behavior hierarchy selects actions that avoid the device resources used in the second behavior hierarchy.
[0024]
That is, when behaviors critical to the system are activated, such as at system boot, at system shutdown, or during error avoidance, the activity of the situation-dependent behavior hierarchy is avoided or restricted.
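The coordination described in the preceding paragraphs can be sketched as follows. This is a minimal illustration only, not the implementation of the invention: the class `LogMemory`, the function `selectable`, and the behavior and resource names are all hypothetical, and the connection unit is reduced to passing the reflex hierarchy's log object directly.

```python
from dataclasses import dataclass, field


@dataclass
class LogMemory:
    """Hypothetical log memory: running behaviors of one hierarchy and their resources."""
    active: dict = field(default_factory=dict)  # behavior name -> set of resource names

    def start(self, name, resources):
        self.active[name] = set(resources)

    def end(self, name):
        self.active.pop(name, None)

    def busy_resources(self):
        used = set()
        for resources in self.active.values():
            used |= resources
        return used


def selectable(candidates, reflex_log):
    """Return situation-dependent candidates that avoid resources held by the reflex hierarchy."""
    busy = reflex_log.busy_resources()
    return [name for name, res in candidates.items() if not (set(res) & busy)]


reflex_log = LogMemory()
candidates = {"dance": {"legs", "arms"}, "talk": {"speaker"}}

reflex_log.start("error_recovery", {"legs"})       # reflex hierarchy becomes active
during_reflex = selectable(candidates, reflex_log)  # "dance" is held back

reflex_log.end("error_recovery")                   # reflex behavior finishes
after_reflex = selectable(candidates, reflex_log)   # both candidates are available again
```

If the reflex behavior claimed every resource, as boot and shutdown do, the returned list would be empty, which corresponds to the standby case.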
[0025]
Here, each behavior hierarchy can be configured to include:
One or more behavior modules, each comprising a state machine that describes a body motion, and a behavior evaluator that evaluates the activity level of the current body motion state in the state machine and the body resources used when that body motion state is activated; and
A behavior state control unit that instructs the behavior evaluator of each behavior module to calculate its activity level and used resources, selects the behavior module to be activated according to each activity level and the used resources, and controls the behavior state of each behavior module by instructing the state machine of the selected behavior module to execute.
[0026]
The behavior evaluator evaluates the activity level of the state machine according to the external environment of the robot body and/or the internal state of the robot.
[0027]
Further, the behavior state control unit transitions the behavior module whose activity level has decreased from the active state to the standby state, and transitions the behavior module whose activity level has increased from the standby state to the active state. The behavior state control unit can select two or more behavior modules whose resources do not compete with each other according to the activity level.
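One way to select two or more behavior modules whose resources do not compete is a greedy pass in descending order of activity level. The sketch below assumes that policy; the source states only that non-competing modules can be selected according to activity level, and the function name `select_modules` and the sample data are hypothetical.

```python
def select_modules(evaluations):
    """Greedy selection: highest activity level first, skipping resource conflicts.

    evaluations: list of (name, activity_level, resources) tuples.
    """
    chosen, claimed = [], set()
    for name, level, resources in sorted(evaluations, key=lambda e: e[1], reverse=True):
        if level > 0 and not (set(resources) & claimed):
            chosen.append(name)          # this module moves to (or stays in) the active state
            claimed |= set(resources)    # its resources are no longer available
    return chosen


evaluations = [
    ("walk", 0.6, {"legs"}),
    ("kick_ball", 0.9, {"legs"}),
    ("speak", 0.5, {"speaker"}),
]
selected = select_modules(evaluations)  # "walk" loses the legs to "kick_ball"
```

Modules left out of the result would remain in, or return to, the standby state.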
[0028]
Further, each behavior hierarchy can be configured as a tree structure of behavior modules according to the realization level of the body motion. The behavior state control unit disposed in the top-level behavior module of the tree structure can then instruct the behavior modules, from the top toward the lower levels of the tree structure, to evaluate their activity levels and used resources, select behavior modules, and instruct execution of their state machines.
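The top-down evaluation over the tree can be sketched as a recursive call from the root. The rule used here, in which a parent reports the best result among its children, is an assumption made for illustration; the class `Schema`, its fields, and the sample tree are hypothetical.

```python
class Schema:
    """Hypothetical behavior module node in the tree."""

    def __init__(self, name, level=0.0, resources=(), children=()):
        self.name = name
        self.level = level                      # activity level of a leaf module
        self.resources = frozenset(resources)   # resources a leaf module would use
        self.children = list(children)

    def evaluate(self):
        """Return (activity level, resources, path) of the best leaf below this node."""
        if not self.children:
            return self.level, self.resources, [self.name]
        level, resources, path = max(
            (child.evaluate() for child in self.children), key=lambda r: r[0]
        )
        return level, resources, [self.name] + path


root = Schema("root", children=[
    Schema("play", children=[
        Schema("kick_ball", 0.7, {"legs"}),
        Schema("dance", 0.4, {"legs", "arms"}),
    ]),
    Schema("talk", 0.5, {"speaker"}),
])
level, resources, path = root.evaluate()  # the root asks each subtree in turn
```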
[0029]
In such a case, a behavior hierarchy monitoring behavior module may be provided immediately below the root behavior module in the first behavior hierarchy. This module monitors the activity log of the behavior modules of the second behavior hierarchy via the log memory module and detects the start and end of each behavior module in the second behavior hierarchy.
[0030]
The behavior hierarchy monitoring behavior module can raise its own activity level sufficiently high and request resources in response to the activity of a behavior module on the second behavior hierarchy side detected via the log memory module.
[0031]
In such a case, since the behavior hierarchy monitoring behavior module is disposed immediately below the root behavior module, it can greatly affect the selection and execution of another schema within the same behavior hierarchy. That is, since the root behavior module always selects the behavior hierarchy monitoring behavior module, resource competition between the behavior layers is avoided, and a system-critical schema can be executed without any trouble.
[0032]
In addition, the behavior hierarchy monitoring behavior module can release the restriction on the behavior module selection and its execution imposed by the activity detection by releasing its own required resources and lowering the activity level. This removes resource contention and allows other behavior modules to continue normal activity.
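The effect of the monitoring behavior module on root-level selection can be sketched as below. This is a simplified illustration under stated assumptions: the reflex hierarchy's log is modeled as a plain set of running behavior names, the root selects a single best module, and the names `MonitorSchema`, `root_select`, and `ALL_RESOURCES` are hypothetical.

```python
ALL_RESOURCES = frozenset({"head", "arms", "legs", "speaker"})  # hypothetical resource names


class MonitorSchema:
    """Hypothetical monitoring module placed directly below the root."""

    def __init__(self, reflex_log):
        self.reflex_log = reflex_log  # set of running reflex-hierarchy behaviors

    def evaluate(self):
        if self.reflex_log:
            # A reflex behavior is running: maximize activity and claim every resource.
            return float("inf"), ALL_RESOURCES
        # Nothing is running: release the resources and lower the activity level.
        return 0.0, frozenset()


def root_select(monitor, others):
    """Pick the module with the highest activity level; others maps name -> (level, resources)."""
    candidates = dict(others)
    candidates["reflex_monitor"] = monitor.evaluate()
    best = max(candidates, key=lambda name: candidates[name][0])
    return best if candidates[best][0] > 0 else None


reflex_log = set()
monitor = MonitorSchema(reflex_log)
others = {"dance": (0.8, frozenset({"legs"}))}

idle = root_select(monitor, others)      # normal activity
reflex_log.add("shutdown")
during = root_select(monitor, others)    # the monitor wins and blocks the rest
reflex_log.discard("shutdown")
after = root_select(monitor, others)     # the restriction is lifted
```

While selected, the monitoring module itself does nothing with the body; it only occupies the resources so that the situation-dependent modules cannot compete with the reflex hierarchy.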
[0033]
In this way, with the intervention of the behavior hierarchy monitoring behavior module, resource contention between a plurality of behavior hierarchies can be suitably avoided, and the hierarchies can execute in parallel, cooperatively and organically, without inappropriate interference.
[0034]
In each action hierarchy, an action module can be selected and executed according to a unique policy. Further, resource management in each behavior hierarchy can be simplified.
[0035]
Further objects, features, and advantages of the present invention will become apparent from more detailed descriptions based on embodiments of the present invention described below and the accompanying drawings.
[0036]
[Best mode for carrying out the invention]
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[0037]
A. Configuration of robot device
FIG. 1 schematically shows the functional configuration of a robot apparatus 1 used in the present invention. As shown in FIG. 1, the robot apparatus 1 includes a control unit 20 that performs overall control of the entire operation and other data processing, an input / output unit 40, a drive unit 50, and a power supply unit 60. Each unit is described below.
[0038]
The input / output unit 40 includes, as input units, a CCD camera 15 corresponding to the eyes of the robot apparatus 1, a microphone 16 corresponding to the ears, touch sensors 18 disposed at parts such as the head and the back to detect a user's contact, and various other sensors corresponding to the five senses. As output units, it is provided with a speaker 17 corresponding to the mouth and an LED indicator (eye lamp) 19 that forms facial expressions by combinations of blinking and lighting timing. These output units can express user feedback from the robot apparatus 1 in forms other than mechanical movement patterns of the legs and the like, such as sound or blinking lamps.
[0039]
The drive unit 50 is a functional block that realizes the body motions of the robot apparatus 1 according to predetermined movement patterns commanded by the control unit 20, and is the target of behavior control. The drive unit 50 is a functional module for realizing the degrees of freedom at each joint of the robot apparatus 1, and includes a plurality of drive units provided for each axis, such as roll, pitch, and yaw, at each joint. Each drive unit is composed of a combination of a motor 51 that performs a rotation operation around a predetermined axis, an encoder 52 that detects the rotational position of the motor 51, and a driver 53 that adaptively controls the rotational position and rotational speed of the motor 51 based on the output of the encoder 52.
[0040]
Depending on how the drive units are combined, the robot apparatus 1 can be configured as a legged mobile robot such as a bipedal or quadrupedal walking robot.
[0041]
As the name implies, the power supply unit 60 is a functional module that supplies power to each electric circuit and the like in the robot apparatus 1. The robot apparatus 1 according to the present embodiment is of an autonomously driven type using a battery, and the power supply unit 60 includes a charging battery 61 and a charge / discharge control unit 62 that manages the charge / discharge state of the charging battery 61.
[0042]
The charging battery 61 is configured, for example, in the form of a “battery pack” in which a plurality of lithium ion secondary battery cells are packaged in a cartridge type.
[0043]
Further, the charge / discharge control unit 62 grasps the remaining capacity of the battery 61 by measuring the terminal voltage and charge / discharge current of the battery 61, the ambient temperature of the battery 61, and the like, and determines the start and end times of charging. The start and end times of charging determined by the charge / discharge control unit 62 are notified to the control unit 20 and serve as triggers for the robot apparatus 1 to start and end charging operations.
[0044]
The control unit 20 corresponds to a “brain” and is mounted on, for example, the head or the body of the robot apparatus 1.
[0045]
FIG. 2 illustrates the configuration of the control unit 20 in more detail. As shown in this figure, the control unit 20 has a configuration in which a CPU (Central Processing Unit) 21 as a main controller is bus-connected to a memory, other circuit components, and peripheral devices. The bus 28 is a common signal transmission path including a data bus, an address bus, a control bus, and the like. Each device on the bus 28 is assigned a unique address (memory address or I / O address). The CPU 21 can communicate with a specific device on the bus 28 by specifying its address.
[0046]
A RAM (Random Access Memory) 22 is a writable memory composed of volatile memory such as DRAM (Dynamic RAM), and is used to load the program code executed by the CPU 21 and to temporarily store the work data of the running program.
[0047]
The ROM (Read Only Memory) 23 is a read-only memory that permanently stores programs and data. The program code stored in the ROM 23 includes a self-diagnosis test program executed when the power of the robot apparatus 1 is turned on, an operation control program for defining the operation of the robot apparatus 1, and the like.
[0048]
The control programs of the robot apparatus 1 include a “sensor input / recognition processing program” that processes sensor inputs from the camera 15, the microphone 16, and the like and recognizes them as symbols; an “action control program” that controls the action of the robot apparatus 1 based on sensor inputs and a predetermined action control model while managing storage operations (described later) such as short-term storage and long-term storage; and a “drive control program” that controls the driving of each joint motor and the audio output of the speaker 17 according to the action control model.
[0049]
The non-volatile memory 24 is composed of a memory element that can be electrically erased and rewritten, such as an EEPROM (Electrically Erasable and Programmable ROM), and is used to hold data to be sequentially updated in a non-volatile manner. The data to be sequentially updated includes an encryption key and other security information, a device control program to be installed after shipment, and the like.
[0050]
The interface 25 is a device for interconnecting with devices outside the control unit 20 and enabling data exchange. The interface 25 performs data input / output with the camera 15, the microphone 16, and the speaker 17, for example. Also, the interface 25 inputs and outputs data and commands to and from each of the drivers 53-1 in the drive unit 50.
[0051]
The interface 25 may include general-purpose interfaces for connecting computer peripheral devices, such as a serial interface such as RS (Recommended Standard)-232C, a parallel interface such as IEEE (Institute of Electrical and Electronics Engineers) 1284, a USB (Universal Serial Bus) interface, a SCSI (Small Computer System Interface) interface, and a memory card interface (card slot) for receiving a PC card or a memory stick, so that programs and data can be moved to and from locally connected external devices.
[0052]
As another example of the interface 25, an infrared communication (IrDA) interface may be provided to perform wireless communication with an external device.
[0053]
Further, the control unit 20 includes a wireless communication interface 26, a network interface card (NIC) 27, and the like, and performs near field wireless data communication such as Bluetooth, a wireless network such as IEEE 802.11b, or a wide area such as the Internet. Data communication can be performed with various external host computers via the network.
[0054]
By such data communication between the robot device 1 and the host computer, complicated operation control of the robot device 1 can be calculated or remotely controlled using a remote computer resource.
[0055]
B. Behavior control system for robot device
FIG. 3 schematically shows a functional configuration of the behavior control system 100 of the robot device 1 according to the embodiment of the present invention. The robot apparatus 1 can perform behavior control according to the recognition result of the external stimulus and a change in the internal state. Furthermore, by providing a long-term memory function and associatively storing a change in an internal state from an external stimulus, behavior control can be performed according to a recognition result of the external stimulus or a change in the internal state.
[0056]
The illustrated behavior control system 100 can be implemented by adopting object-oriented programming. In this case, each software is handled in a module unit called an “object” that integrates data and a processing procedure for the data. Further, each object can perform data transfer and call (Invoke) by message communication and an inter-object communication method using a shared memory.
[0057]
The behavior control system 100 includes a visual recognition function unit 101, a hearing recognition function unit 102, and a contact recognition function unit 103 in order to recognize an external environment (Environments).
[0058]
The visual recognition function unit (Video) 101 performs image recognition processing, such as face recognition and color recognition, and feature extraction based on a captured image input via an image input device such as a CCD (Charge Coupled Device) camera. The visual recognition function unit 101 is composed of a plurality of objects such as “MultiColorTracker”, “FaceDetector”, and “FaceIdentify”, which will be described later.
[0059]
The auditory recognition function unit (Audio) 102 performs voice recognition on voice data input via a voice input device such as a microphone to extract features or perform word set (text) recognition. The auditory recognition function unit 102 is composed of a plurality of objects such as “AudioRecog” and “AuthurDecoder” to be described later.
[0060]
The contact recognition function unit (Tactile) 103 recognizes a sensor signal from a contact sensor built into, for example, the head of the body, and recognizes an external stimulus such as “patted” or “hit”.
[0061]
An internal state manager (ISM: Internal Status Manager) 104 manages several kinds of feelings, such as instinct and emotion, as mathematical models, and manages the internal state of the robot apparatus 1, such as instinct and emotion, according to the external stimuli (ES: External Stimula) recognized by the above-described visual recognition function unit 101, auditory recognition function unit 102, and contact recognition function unit 103.
[0062]
The emotion model and the instinct model have recognition results and action histories as inputs, respectively, and manage emotion values and instinct values. The behavior model can refer to these emotion values and instinct values.
[0063]
In the present embodiment, emotion is composed of a plurality of layers according to the significance of its existence, and operates at each layer. Which of the plurality of determined operations to perform is decided according to the external environment and internal state at that time. Actions are selected at each layer, but by expressing lower-order actions preferentially, instinctive actions such as reflexes and higher-order actions such as action selection using memory can be expressed consistently in a single individual.
[0064]
In order to perform control according to the recognition results of external stimuli and changes in the internal state, the robot apparatus 1 according to the present embodiment includes a short-term storage unit 105 for short-term memory that is lost over time and a long-term storage unit 106 for holding information for a relatively long time. The classification into short-term memory and long-term memory follows neuropsychology.
[0065]
The short-term storage unit (ShortTermMemory) 105 is a functional module that holds, for a short time, a target or an event recognized from the external environment by the above-described visual recognition function unit 101, auditory recognition function unit 102, and contact recognition function unit 103. For example, the input image from the camera 15 is stored for only a short period of about 15 seconds.
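The time-limited retention described above can be sketched as a store that discards entries older than a retention window. This is only an illustrative sketch: the class `ShortTermStore`, the event strings, and the explicit `now` timestamps (used so the example is deterministic) are hypothetical, and the 15-second window is taken from the example in the text.

```python
from collections import deque


class ShortTermStore:
    """Hypothetical short-term store that forgets events after a retention window."""

    def __init__(self, retention_s=15.0):
        self.retention_s = retention_s
        self._items = deque()  # (timestamp in seconds, event), oldest first

    def add(self, event, now):
        self._items.append((now, event))

    def recall(self, now):
        # Drop everything older than the retention window, then return what remains.
        while self._items and now - self._items[0][0] > self.retention_s:
            self._items.popleft()
        return [event for _, event in self._items]


stm = ShortTermStore()
stm.add("face:alice", now=0.0)
stm.add("sound:clap", now=10.0)
at_12 = stm.recall(now=12.0)  # both events are still within 15 seconds
at_20 = stm.recall(now=20.0)  # the face seen at t=0 has been forgotten
```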
[0066]
The long-term storage unit (LongTermMemory) 106 is used to hold information obtained by learning, such as the name of an object, for a long time. For example, the long-term storage unit 106 can associate and store a change in an internal state from an external stimulus in a certain behavior module.
[0067]
The behavior control of the robot apparatus 1 according to the present embodiment is broadly divided into “reflex actions” realized by the reflex behavior unit 109, “situation-dependent actions” realized by the situation-dependent behavior layer 108, and “deliberate actions” realized by the reflective behavior layer 107.
[0068]
The reflex behavior unit (ReflexiveSituatedBehaviorsLayer) 109 is a behavior hierarchy that governs behaviors that are extremely critical to the system. Such system-critical behaviors include the following.
[0069]
(1) System boot behavior
(2) System shutdown action
(3) Reflex action to clear a specific error, or reflexive body motion in response to a recognized external stimulus
[0070]
The reflex action is, for example, an action that directly receives the recognition result of external information input by a sensor, classifies it, and directly determines the output action. It also includes behaviors such as following a human face and nodding.
[0071]
Boot and shutdown actions require the use of all body resources. In contrast, a reflex action uses only the body resources related to the specific error or the like.
[0072]
The situation-dependent behavior hierarchy (SituatedBehaviorsLayer) 108 controls actions in response to the situation in which the robot apparatus 1 is currently placed, based on the storage contents of the short-term storage unit 105 and the long-term storage unit 106 and the internal state managed by the internal state management unit 104.
[0073]
The situation-dependent behavior hierarchy 108 prepares a state machine (or state transition model) for each action, classifies the recognition results of external information input by the sensors depending on the preceding actions and situation, and expresses the action on the robot body. The situation-dependent behavior hierarchy 108 also implements actions for keeping the internal state within a certain range (also called “homeostasis behavior”): when the internal state goes outside the specified range, it activates behavior so that actions for returning the internal state to within that range appear more readily (in practice, actions are selected in consideration of both the internal state and the external environment). Situation-dependent behavior has a slower reaction time than reflex behavior.
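The homeostasis behavior can be illustrated with a small scoring function in which the activity level of a restoring action rises as the internal state leaves its allowed range, while still taking the external environment into account. The weighted sum, the weights, and the name `homeostasis_level` are assumptions made for this sketch; the source does not specify how the two factors are combined.

```python
def homeostasis_level(value, low, high, stimulus, w_internal=0.7, w_external=0.3):
    """Activity level of an action that restores an internal state to [low, high].

    value:    current internal state value (for example, remaining energy)
    stimulus: strength of a relevant external stimulus, assumed to lie in [0, 1]
    """
    # Distance outside the allowed range; zero while the state is inside it.
    deviation = max(low - value, value - high, 0.0)
    return w_internal * deviation + w_external * stimulus


in_range = homeostasis_level(0.5, low=0.3, high=0.8, stimulus=0.0)
too_low = homeostasis_level(0.1, low=0.3, high=0.8, stimulus=0.0)
too_low_with_cue = homeostasis_level(0.1, low=0.3, high=0.8, stimulus=1.0)
```

With these hypothetical weights, the restoring action scores nothing while the state is in range and no stimulus is present, scores higher once the state drops below the range, and scores higher still when a relevant external cue is also present.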
[0074]
The reflective behavior layer 107 performs a relatively long-term action plan of the robot apparatus 1 based on the contents stored in the short-term storage unit 105 and the long-term storage unit 106.
[0075]
Reflective (deliberate) behavior is behavior performed by making inferences and plans for realizing it, based on a given situation or a command from a human. For example, searching for a route from the position of the robot and the position of the target corresponds to deliberate behavior. Such inference and planning may require more processing time and computational load than the reaction time needed for the robot apparatus 1 to maintain interaction. Deliberate behavior therefore performs inference and planning while the reflex and situation-dependent behaviors respond in real time.
[0076]
The reflective behavior layer 107, the situation-dependent behavior hierarchy 108, and the reflex behavior unit 109 can be described as higher-level application programs independent of the hardware configuration of the robot apparatus 1. In contrast, the hardware dependent layer control unit (ConfigurationDependentActionsAndReactions) 110 directly operates the hardware of the body (the external environment), such as driving the joint actuators, in response to commands from these higher-level applications (behavior modules called “schemas”).
[0077]
C. Situation-dependent behavior control
The situation-dependent behavior hierarchy (SituatedBehaviorsLayer) 108 controls actions in response to the situation in which the robot apparatus 1 is currently placed, based on the storage contents of the short-term storage unit 105 and the long-term storage unit 106 and the internal state managed by the internal state management unit 104. In addition, a reflex behavior unit 109 that performs reflexive, direct body motions in response to recognized external stimuli is included as a part of the situation-dependent behavior layer 108.
[0078]
C-1. Structure of the situation-dependent behavior hierarchy
In the present embodiment, the situation-dependent behavior hierarchy 108 prepares a state machine (or state transition model) for each behavior module, classifies the recognition results of external information input by the sensors depending on the preceding behavior and situation, and expresses the behavior on the robot body. Each behavior module is described as a schema having a Monitor function that judges the situation according to external stimuli and changes in the internal state, and an Action function that realizes the state transitions (state machine) accompanying execution of the behavior. The situation-dependent behavior hierarchy 108 is configured as a tree structure in which a plurality of schemas are hierarchically connected (described later).
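A schema with a Monitor function and an Action function can be sketched as a small class. The sketch is illustrative only: the method names `monitor` and `action`, the two-state `State` enum, the `TrackFace` example, and the `face_visible` stimulus key are hypothetical and are not taken from the patent.

```python
from enum import Enum


class State(Enum):
    STANDBY = "standby"
    ACTIVE = "active"


class Schema:
    """Hypothetical base class for a behavior module (schema)."""

    def __init__(self, name, resources):
        self.name = name
        self.resources = frozenset(resources)
        self.state = State.STANDBY

    def monitor(self, stimulus, internal):
        """Monitor function: judge the situation, return (activity level, resources)."""
        raise NotImplementedError

    def action(self):
        """Action function: advance the behavior's state machine by one step."""
        raise NotImplementedError


class TrackFace(Schema):
    def __init__(self):
        super().__init__("track_face", {"head"})
        self.steps = 0

    def monitor(self, stimulus, internal):
        level = 1.0 if stimulus.get("face_visible") else 0.0
        return level, self.resources

    def action(self):
        self.state = State.ACTIVE
        self.steps += 1
        return f"turn head toward face (step {self.steps})"


schema = TrackFace()
evaluation = schema.monitor({"face_visible": True}, internal={})
state_before = schema.state   # evaluating does not start the behavior
command = schema.action()     # only the Action function changes the state
```

Keeping evaluation separate from execution lets a parent schema call `monitor` on every child before deciding which `action` to run.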
[0079]
The situation-dependent behavior hierarchy 108 also implements actions for keeping the internal state within a certain range (also called “homeostasis behavior”): when the internal state goes outside the specified range, it activates behavior so that actions for returning the internal state to within that range appear more readily (in practice, actions are selected in consideration of both the internal state and the external environment).
[0080]
Each functional module in the behavior control system 100 of the robot 1 as shown in FIG. 3 is configured as an object. Each object can perform data transfer and call (Invoke) by a message communication and an inter-object communication method using a shared memory. FIG. 4 schematically shows an object configuration of the behavior control system 100 according to the present embodiment.
[0081]
The visual recognition function unit 101 is composed of three objects, “FaceDetector”, “MultiColorTracker”, and “FaceIdentify”.
[0082]
FaceDetector is an object that detects a face area in an image frame and outputs the detection result to FaceIdentify. MultiColorTracker is an object that performs color recognition and outputs the recognition result to FaceIdentify and ShortTermMemory (the object constituting the short-term storage unit 105). FaceIdentify identifies a person by, for example, searching a person dictionary that it holds for the detected face image, and outputs the person's ID information together with the position and size of the face image area to ShortTermMemory.
[0083]
The auditory recognition function unit 102 is composed of two objects “AudioRecog” and “SpeechRecog”. AudioRecog is an object that receives voice data from a voice input device such as a microphone and performs feature extraction and voice section detection, and outputs the feature amount and sound source direction of voice data in a voice section to SpeechRecog and ShortTermMemory. The SpeechRecog is an object that performs speech recognition using the speech feature quantity received from the AudioRecog, the speech dictionary, and the syntax dictionary, and outputs a set of recognized words to the ShortTermMemory.
[0084]
The contact recognition function unit 103 is configured by an object called “TactileSensor” that recognizes sensor input from a contact sensor, and outputs the recognition result to ShortTermMemory and to InternalStateModel (ISM), the object that manages the internal state.
[0085]
ShortTermMemory (STM) is the object constituting the short-term storage unit 105. It is a functional module that holds, for a short period, targets and events recognized from the external environment by each of the above-described recognition objects (for example, it stores an input image from the camera 15 for only about 15 seconds), and it periodically notifies (Notify) its STM client, SituatedBehaviorsLayer, of external stimuli.
[0086]
The LongTermMemory (LTM) is an object constituting the long-term storage unit 106, and is used to hold information obtained by learning, such as the name of an object, for a long time. LongTermMemory can, for example, associatively store a change in an internal state from an external stimulus in a certain behavior module.
[0087]
The InternalStateModel (ISM) is an object constituting the internal state management unit 104. It manages several kinds of internal states, such as instincts and emotions, as mathematical models, and updates the instincts and emotions of the robot apparatus 1 according to the external stimuli (ES: External Stimuli) recognized by each of the above-described recognition system objects.
[0088]
The SituatedBehaviorsLayer (SBL) is an object constituting the situation-dependent behavior hierarchy 108. The SBL is a client of the ShortTermMemory (an STM client). Each time it periodically receives a notification (Notify) of information on an external stimulus (a target or an event) from the ShortTermMemory, it determines the schema, that is, the behavior module, to be executed (described later).
[0089]
The ReflexiveSituatedBehaviorsLayer is a behavior hierarchy that controls behaviors that are extremely critical for the system. Such system-critical behaviors include the following.
[0090]
(1) System boot behavior
(2) System shutdown behavior
(3) Reflex behavior for clearing a specific error, or reflexive body motion in response to a recognized external stimulus
[0091]
A reflex behavior is, for example, a behavior in which the recognition result of external information input by a sensor is received directly, classified, and used to determine the output behavior directly. For example, behaviors such as tracking a human face or nodding are reflex behaviors. Boot and shutdown behaviors require the use of all machine resources. A reflex behavior, on the other hand, uses only the machine resources related to the specific error or the like.
[0092]
The SituatedBehaviorsLayer selects a behavior according to the situation, such as an external stimulus or a change in the internal state. The ReflexiveSituatedBehaviorsLayer, on the other hand, acts reflexively in response to an external stimulus. Because these two objects select behaviors independently, the behavior modules (schemas, described later) that they select may compete for the hardware resources of the robot 1 when executed on the body, and in some cases cannot both be realized.
[0093]
An object called ResourceManager arbitrates hardware conflicts arising from the behavior selections of the SituatedBehaviorsLayer and the ReflexiveSituatedBehaviorsLayer. The body is then driven by notifying each object that realizes body operation of the arbitration result.
[0094]
The SoundPerformer, the MotionController, and the LedController are objects that implement body operation. The SoundPerformer is an object for voice output: it performs voice synthesis in accordance with a text command given from the SituatedBehaviorsLayer via the ResourceManager, and outputs the voice from a speaker on the body of the robot 1. The MotionController is an object for operating each joint actuator on the body: it calculates the corresponding joint angles in response to a command to move a hand, a leg, or the like, received from the SituatedBehaviorsLayer via the ResourceManager. The LedController is an object for blinking the LED 19: it drives the LED 19 to blink in response to a command received from the SituatedBehaviorsLayer via the ResourceManager.
[0095]
FIG. 5 schematically shows a form of the situation-dependent behavior control by the situation-dependent behavior hierarchy (SBL) 108 (including the reflex behavior unit 109). The recognition result of the external environment by the recognition systems 101 to 103 is given to the situation-dependent behavior hierarchy 108 (including the reflex behavior unit 109) as an external stimulus. Further, a change in the internal state according to the recognition result of the external environment by the recognition system is also given to the situation-dependent behavior hierarchy 108. Then, the situation-dependent behavior hierarchy 108 can determine a situation according to an external stimulus or a change in the internal state, and implement the behavior selection.
[0096]
FIG. 6 shows a basic operation example of the behavior control by the situation-dependent behavior hierarchy 108 shown in FIG. As shown in the figure, the situation-dependent behavior hierarchy 108 (SBL) calculates the activity level of each behavior module (schema) based on an external stimulus or a change in the internal state, selects a schema according to its activity level, and executes the behavior. For the calculation of the activity level, a uniform calculation process can be performed for all schemas by using, for example, a library (the same applies hereinafter). For example, the schema having the highest activity level may be selected, or two or more schemas whose activity levels exceed a predetermined threshold may be selected and executed in parallel (provided that, when executing in parallel, there is no hardware resource conflict between the schemas).
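The two selection rules named above (highest activity level, or all schemas above a threshold) can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names, the numeric activity levels, and the threshold are assumptions made for the example, and the resource-conflict check is omitted.

```python
from typing import Dict, List


def select_highest(activity_levels: Dict[str, float]) -> str:
    """Return the schema whose activity level is highest."""
    return max(activity_levels, key=lambda name: activity_levels[name])


def select_above_threshold(activity_levels: Dict[str, float],
                           threshold: float) -> List[str]:
    """Return all schemas whose activity level exceeds the threshold, highest first."""
    chosen = [name for name, level in activity_levels.items() if level > threshold]
    return sorted(chosen, key=lambda name: activity_levels[name], reverse=True)


# Hypothetical activity levels, for illustration only.
levels = {"Play": 0.9, "Eat": 0.7, "InvestigativeSniffing": 0.2}
print(select_highest(levels))
print(select_above_threshold(levels, 0.5))
```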
[0097]
FIG. 7 shows an operation example when a reflex action is performed by the situation-dependent action hierarchy 108 shown in FIG. In this case, as shown in the figure, the reflexive action unit 109 (ReflexiveSBL) included in the context-dependent action hierarchy 108 calculates the activity level by directly using the external stimulus recognized by each object of the recognition system as input, selects a schema according to the activity level, and executes the action. In this case, the change in the internal state is not used for calculating the activity level.
[0098]
FIG. 8 shows an operation example when emotion expression is performed by the situation-dependent behavior hierarchy 108 shown in FIG. The internal state management unit 104 manages internal states such as instincts and emotions as mathematical models, and notifies (Notify) the situation-dependent behavior hierarchy 108 of a change in the internal state when the state value of an emotion parameter reaches a predetermined value. The situation-dependent behavior hierarchy 108 calculates the activity level by using the change in the internal state as input, selects a schema according to the activity level, and executes the action. In this case, the external stimulus recognized by each object of the recognition system is used for managing and updating the internal state in the internal state management unit 104 (ISM), but is not used for calculating the activity level of the schema.
[0099]
C-2. Schema (behavior module)
The situation-dependent action hierarchy 108 prepares a state machine for each action module, classifies the recognition result of the external information input by the sensor depending on the preceding actions and situation, and expresses the action on the body. An action module is described as a schema having an Action function, which describes the body operation and implements the state transitions (state machine) accompanying the execution of the action, and a Monitor function, which evaluates the execution of the action described in the Action function according to the external stimulus and the internal state and thereby judges the situation. FIG. 9 schematically illustrates how the situation-dependent behavior hierarchy 108 is configured by a plurality of schemas.
[0100]
The context-dependent behavior hierarchy 108 (more strictly, the part of the context-dependent behavior hierarchy 108 that controls normal context-dependent behavior) is configured as a tree structure in which a plurality of schemas are hierarchically connected, and performs behavior control by determining the optimal schema in an integrated manner according to the external stimulus and changes in the internal state. The tree includes a plurality of subtrees (or branches), such as a behavior model in which ethological situation-dependent behavior is formalized, and a subtree for executing emotional expression.
[0101]
FIG. 10 schematically shows the tree structure of the schemas in the context-dependent behavior hierarchy 108. As shown in the figure, the situation-dependent behavior hierarchy 108 starts from a root schema that receives a notification (Notify) of an external stimulus from the short-term storage unit 105, and a schema is provided at each level, proceeding from abstract behavior categories to specific behavior categories. For example, in the level immediately below the root schema, the schemas “Investigate”, “Ingestive”, and “Play” are provided. Below “Investigate”, schemas describing more specific investigative behaviors, such as “InvestigativeLocomotion”, “HeadinAirSniffing”, and “InvestigativeSniffing”, are provided. Similarly, schemas describing more specific eating and drinking behaviors, such as “Eat” and “Drink”, are provided below the schema “Ingestive”, and schemas describing more specific playing behaviors, such as “Play Bowing”, “Play Greeting”, and “Play Pawing”, are provided below the schema “Play”.
[0102]
As shown, each schema inputs an external stimulus and an internal state. Each schema has at least a Monitor function and an Action function.
[0103]
FIG. 11 schematically shows the internal structure of the schema. As shown in the figure, the schema includes an Action function that describes a body operation in the form of a state transition model (state machine) in which the state changes according to the occurrence of predetermined events; a Monitor function that evaluates each state of the Action function according to the external stimulus and the internal state and returns the result as an activity level value; and a state management unit that stores and manages the state of the schema, treating the state machine of the Action function as being in one of the states READY (ready), ACTIVE (active), and SLEEP (waiting).
[0104]
The Monitor function is a function that calculates the activity level (Activation Level: AL value) of the schema according to the external stimulus and the internal state. When the tree structure shown in FIG. 10 is configured, the upper (parent) schema can call the Monitor function of the lower (child) schema with the external stimulus and the internal state as arguments, and the child schema returns its AL value as the return value. A schema can also call the Monitor functions of its child schemas in order to calculate its own AL value. Because the AL value from each subtree is returned to the root schema, the root schema can determine the optimal schema, that is, the behavior, in an integrated manner according to the external stimulus and the change in the internal state.
[0105]
For example, the root schema may select the schema having the highest AL value, or may select two or more schemas whose AL values exceed a predetermined threshold and execute them in parallel (provided that, when executing in parallel, there is no hardware resource conflict between the schemas).
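The recursive use of the Monitor function in the tree can be sketched as follows. The text does not state how a parent combines its children's AL values, so taking the maximum is an assumption of this sketch; the class `SchemaNode`, the helper `select_subtree`, the subtree name `Feeding`, and the scoring lambdas are likewise hypothetical.

```python
from typing import Callable, Dict, List, Optional

Inputs = Dict[str, float]


class SchemaNode:
    """A schema in the tree; leaves evaluate inputs, parents aggregate children."""

    def __init__(self, name: str,
                 evaluate: Optional[Callable[[Inputs, Inputs], float]] = None,
                 children: Optional[List["SchemaNode"]] = None) -> None:
        self.name = name
        self.evaluate = evaluate
        self.children = children or []

    def monitor(self, external: Inputs, internal: Inputs) -> float:
        """Return this schema's AL value, calling the children's Monitor functions."""
        if self.children:
            # Assumption: a parent's AL is the maximum AL among its children.
            return max(child.monitor(external, internal) for child in self.children)
        return self.evaluate(external, internal) if self.evaluate else 0.0


def select_subtree(root: SchemaNode, external: Inputs, internal: Inputs) -> SchemaNode:
    """Root-level selection: pick the child subtree that returns the highest AL."""
    return max(root.children, key=lambda child: child.monitor(external, internal))


# Hypothetical leaves: AL = stimulus strength weighted by an internal drive.
eat = SchemaNode("Eat", lambda es, st: es.get("food", 0.0) * st.get("hunger", 0.0))
bow = SchemaNode("PlayBowing", lambda es, st: es.get("ball", 0.0) * st.get("boredom", 0.0))
root = SchemaNode("Root", children=[
    SchemaNode("Feeding", children=[eat]),
    SchemaNode("Play", children=[bow]),
])
external = {"food": 1.0, "ball": 0.5}
internal = {"hunger": 0.25, "boredom": 1.0}
print(select_subtree(root, external, internal).name)
```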
[0106]
FIG. 12 schematically shows the internal configuration of the Monitor function. As shown in the figure, the Monitor function includes a behavior induction evaluation value calculator, which calculates, as the activity level, an evaluation value for inducing the behavior described in the schema, and a used resource calculator, which specifies the machine resources to be used. In the example shown in FIG. 11, when the Monitor function is called from the behavior state control unit (tentative name) that manages the schema, that is, the behavior module, the Monitor function virtually executes the state machine of the Action function, calculates the behavior induction evaluation value (that is, the activity level) and the resources used, and returns them.
[0107]
The Action function includes a state machine (or state transition model) (described later) that describes the behavior of the schema itself. When the tree structure shown in FIG. 10 is configured, the parent schema can call the Action function to start or suspend the execution of the child schema. In this embodiment, the Action state machine is not initialized unless it becomes Ready. In other words, the state is not reset even if interrupted, and the schema saves the work data being executed, so that interrupted re-execution is possible (described later).
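A schema with a Monitor function, an Action function, and the three managed states can be sketched as follows. This is an illustration under stated assumptions: the class and method names, the list of steps standing in for the state machine, and the scoring rule in `monitor` are all hypothetical. The sketch shows only that suspending keeps the work data, so execution resumes where it was interrupted, and that only the transition to READY re-initializes the state machine.

```python
from enum import Enum
from typing import Dict, List


class SchemaState(Enum):
    READY = "READY"
    ACTIVE = "ACTIVE"
    SLEEP = "SLEEP"


class Schema:
    """A behavior module with a Monitor function, an Action function, and a state."""

    def __init__(self, name: str, steps: List[str]) -> None:
        self.name = name
        self.steps = steps            # hypothetical body operations, in order
        self.state = SchemaState.READY
        self.position = 0             # work data: index of the next step

    def monitor(self, external: Dict[str, float], internal: Dict[str, float]) -> float:
        # Assumed scoring rule: stimulus strength weighted by an internal drive.
        return external.get("stimulus", 0.0) * internal.get("drive", 0.0)

    def action(self) -> str:
        """Run one step of the state machine and return the operation performed.

        The sketch does not handle running past the last step.
        """
        self.state = SchemaState.ACTIVE
        step = self.steps[self.position]
        self.position += 1
        return step

    def suspend(self) -> None:
        """Interrupt execution; the position is kept so the schema can resume."""
        self.state = SchemaState.SLEEP

    def make_ready(self) -> None:
        """Only the transition to READY re-initializes the state machine."""
        self.state = SchemaState.READY
        self.position = 0


greet = Schema("PlayGreeting", ["approach", "bow", "bark"])
first = greet.action()
greet.suspend()
second = greet.action()   # resumes after the interruption instead of restarting
print(first, second)
```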
[0108]
In the example shown in FIG. 11, the behavior state control unit (tentative name) that manages the schema, that is, the behavior module, selects the behavior to be executed based on the return value from the Monitor function, and either calls the Action function of the corresponding schema or instructs a transition of the schema state stored in the state management unit. For example, the schema having the highest activity level as its behavior induction evaluation value is selected, or a plurality of schemas are selected in priority order so that resources do not conflict. The behavior state control unit also controls the state of the schemas: for example, when a higher-priority schema is activated and a resource conflict occurs, it saves the state of the lower-priority schema from ACTIVE to SLEEP, and restores it to ACTIVE when the conflict is resolved.
[0109]
As shown in FIG. 13, the behavior state control unit may be provided in the context-dependent behavior hierarchy 108 only once, and all schemas constituting the hierarchy 108 may be centrally managed.
[0110]
In the illustrated example, the behavior state control unit includes a behavior evaluation unit, a behavior selection unit, and a behavior execution unit. The behavior evaluation unit calls the Monitor function of each schema at a predetermined control cycle, for example, to acquire each activity level and used resources.
[0111]
The action selection unit performs action control and management of machine resources using each schema. For example, schemas are selected in descending order of the aggregated activity levels, and two or more schemas are simultaneously selected (within a range managed in the context-dependent behavior hierarchy 108) so that resources used do not conflict.
[0112]
The action execution unit issues an action execution command to the Action function of the selected schema, and manages the state of the schema (READY, ACTIVE, SLEEP) to control the execution of the schema. For example, when a schema with a higher priority is activated and a resource conflict occurs, the state of a schema with a lower priority is saved from ACTIVE to SLEEP, and when the conflict is resolved, the schema is restored to ACTIVE.
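The selection and state handling described in paragraphs [0111] and [0112] can be sketched as a single arbitration pass. This is a minimal sketch, assuming that priority is given by activity level; `SchemaEntry`, `arbitrate`, and the schema and resource names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass
class SchemaEntry:
    name: str
    activity_level: float
    resources: FrozenSet[str]
    state: str = "READY"   # READY, ACTIVE, or SLEEP


def arbitrate(entries: List[SchemaEntry]) -> List[str]:
    """Grant execution in descending activity-level order without resource conflicts.

    A running schema that loses its resources is saved from ACTIVE to SLEEP;
    a sleeping schema whose resources are free again is restored to ACTIVE.
    """
    in_use: Set[str] = set()
    granted: List[str] = []
    for entry in sorted(entries, key=lambda e: e.activity_level, reverse=True):
        if in_use.isdisjoint(entry.resources):
            in_use |= entry.resources
            entry.state = "ACTIVE"
            granted.append(entry.name)
        elif entry.state == "ACTIVE":
            entry.state = "SLEEP"
    return granted


# Hypothetical schemas: "Walk" is already running when "Kick" becomes more active.
walk = SchemaEntry("Walk", 0.6, frozenset({"legs"}), "ACTIVE")
talk = SchemaEntry("Talk", 0.5, frozenset({"speaker"}))
kick = SchemaEntry("Kick", 0.9, frozenset({"legs"}))
print(arbitrate([walk, talk, kick]), walk.state)
```

Calling `arbitrate` again after the activity level of "Kick" drops restores "Walk" to ACTIVE, which corresponds to the recovery step once the conflict is resolved.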
[0113]
Alternatively, the function of the behavior state control unit may be arranged in each schema in the context-dependent behavior hierarchy 108. For example, when the schemas form a tree structure as shown in FIG. 10 (see FIG. 14), the behavior state control unit of the upper (parent) schema calls the Monitor function of the lower (child) schema with the external stimulus and the internal state as arguments, and receives the activity level and the resources used as return values from the child schema. The child schema in turn calls the Monitor functions of its own child schemas in order to calculate its activity level and resources used. The activity levels and resources used from each subtree are thus returned to the behavior state control unit of the root schema, which determines the optimal schema, that is, the behavior, according to the external stimulus and the change in the internal state, and calls the Action function to start or suspend execution of the child schema.
[0114]
FIG. 15 schematically shows a mechanism for controlling normal context-dependent behavior in the context-dependent behavior hierarchy 108.
[0115]
As shown in the drawing, an external stimulus is input (Notify) from the short-term storage unit 105 and a change in the internal state is input from the internal state management unit 104 to the context-dependent behavior hierarchy 108. The context-dependent behavior hierarchy 108 is composed of a plurality of subtrees, such as a behavior model in which ethological situation-dependent behavior is formalized, and a subtree for executing emotional expression. In response to the notification (Notify) of the external stimulus, the root schema calls the Monitor function of each subtree, performs integrated behavior selection by referring to the activity levels (AL values) returned, and calls the Action function of the subtree that realizes the selected behavior. The context-dependent behavior determined in the context-dependent behavior hierarchy 108 is applied to the body operation (MotionController) after the resource manager arbitrates hardware resource conflicts with the reflex behavior of the reflex behavior unit 109.
[0116]
In addition, within the context-dependent behavior layer 108, the reflex behavior section 109 executes reflexive, direct body motions in response to external stimuli recognized by each object of the above-described recognition system (for example, immediately avoiding an obstacle upon detecting it). Therefore, unlike the case of controlling normal situation-dependent behavior (FIG. 10), a plurality of schemas that directly receive signals from the respective objects of the recognition system are arranged in parallel without being hierarchized.
[0117]
C-3. Structure of reflex behavior hierarchy
The reflex behavior hierarchy 109 is a behavior hierarchy that governs behavior that is extremely critical for the system, and has a state machine (or state transition model) for each behavior module. Each behavior module is described as a schema that includes a Monitor function, which judges the situation according to an external stimulus or a change in the internal state, and an Action function, which implements the state transitions (state machine) accompanying the execution of the behavior (as described above).
[0118]
FIG. 16 shows the internal configuration of the reflex behavior hierarchy 109. As shown in the figure, the configuration is made up of a plurality of schemas, and the configuration of each schema is as described with reference to FIGS. 11 and 12.
[0119]
In the example shown in the drawing, the reflex behavior hierarchy 109 has, arranged under the root schema, a schema for booting the system, a schema for shutting down the system, and a group of schemas that perform reflex behavior for clearing a specific error or reflexive body motion in response to a recognized external stimulus.
[0120]
When an external stimulus is input from a recognition function unit such as the visual recognition function unit 101, the auditory recognition function unit 102, or the contact recognition function unit 103, the root schema calls the Monitor function of each lower-level schema, receives the activity level and the resources used during schema execution as return values, and executes the schema with the highest activity level.
[0121]
The boot action module and the shutdown action module need to use all the machine resources. On the other hand, the error handling / reflex action module uses only the machine resources related to a specific error or the like.
[0122]
C-4. Cooperative behavior of the context-dependent behavior hierarchy and reflex behavior hierarchy
According to the situation-dependent behavior layer 108, a situation-dependent behavior determined by the external stimulus obtained from sensor input and the internal state of the robot itself can be selectively executed, depending on the preceding behavior and situation.
[0123]
On the other hand, the reflex behavior layer 109 executes extremely critical behaviors for the system, such as booting and shutting down the system and clearing a specific error.
[0124]
The situation-dependent behavior and the reflex behavior are implemented as different control hierarchy levels in the robot behavior control system (see FIG. 9) and can be executed simultaneously or in parallel. However, in such a hierarchical system architecture, there is no way to communicate, that is, exchange information, between the behavioral layers. For this reason, if activities that interfere with each other are independently expressed in each behavioral hierarchy, a malfunction of the system is caused, and it is difficult to perform consistent performance as the whole system.
[0125]
Therefore, in the present embodiment, an interface function is provided for communicating log information describing schema activity in one behavior hierarchy to the other behavior hierarchy, and the other behavior hierarchy monitors schema execution through this interface so that schemas are executed cooperatively or intensively. As a result, the performance of the system is improved in terms of use of computer resources, user satisfaction, and the like.
[0126]
FIG. 17 illustrates a mechanism for realizing a cooperative operation that can be executed concurrently between the context-dependent behavior hierarchy 108 and the reflex behavior hierarchy 109.
[0127]
As shown in the figure, a log memory client (log memory module) 151 is provided in the context-dependent behavior hierarchy 108, a log memory server (log memory module) 152 is provided in the reflex behavior hierarchy 109, and the server and the client are connected via a communication unit 153.
[0128]
Further, in the context-dependent behavior hierarchy 108, a reflex behavior hierarchy monitoring schema 154 is arranged immediately below the root schema of the tree-structured schema group. The reflex behavior hierarchy monitoring schema 154 has a dummy Action function, and in response to the execution of a schema in the reflex behavior hierarchy 109, its Monitor function maximizes its own activity level and sets all resources (or the resources used in the reflex behavior hierarchy 109) as the used resources in its return value. As a result, the root schema always selects the reflex behavior hierarchy monitoring schema 154, so that resource contention between the context-dependent behavior hierarchy 108 and the reflex behavior hierarchy 109 is avoided and system-critical schemas can be executed without hindrance.
[0129]
A mechanism for generating log information related to the execution of the schema is provided for each behavior hierarchy. A mechanism for exchanging log information between the behavior layers is implemented in a client-server format. In each action hierarchy, a mechanism for extracting log information from the log memory is provided.
[0130]
During system execution, a schema is appropriately selected and executed according to the above-described mechanism. Then, an activity log including information on the schema being executed is generated using a logging mechanism in each behavior hierarchy. The log information preferably includes, for example, the following data.
[0131]
(1) Schema identification information (layer ID, schema ID, schema name)
(2) Execution status (for example, READY, ACTIVE, SLEEP)
(3) Sub-execution status (for example, start (Start), end (End), success (Success), failure (Fail))
(4) Execution target
(5) Execution time
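The five items of log information listed above map naturally onto a record type. The following sketch shows one possible layout; the field names, the use of enumerations, and the example values (including using the reference numeral 109 as a layer ID) are assumptions for illustration and are not specified in the text.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionStatus(Enum):
    READY = "READY"
    ACTIVE = "ACTIVE"
    SLEEP = "SLEEP"


class SubStatus(Enum):
    START = "Start"
    END = "End"
    SUCCESS = "Success"
    FAIL = "Fail"


@dataclass(frozen=True)
class LogRecord:
    layer_id: int              # (1) schema identification information
    schema_id: int
    schema_name: str
    status: ExecutionStatus    # (2) execution status
    sub_status: SubStatus      # (3) sub-execution status
    target: str                # (4) execution target
    time: float                # (5) execution time


# Hypothetical record: a boot schema in the reflex hierarchy starts executing.
record = LogRecord(109, 1, "Boot", ExecutionStatus.ACTIVE, SubStatus.START,
                   target="system", time=0.0)
print(record)
```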
[0132]
FIG. 18 shows a configuration example of the log memory module. FIG. 19 shows a setting example of the memory configuration.
[0133]
Using the log information exchange mechanism between the action layers, that is, the communication unit 153, the log information generated in one action layer is transferred to the log memory of the other action layer.
[0134]
In the example illustrated in FIG. 17, the log memory module of each behavior layer operates as the client 151 and the server 152. The log memory module stores log information generated by executing a schema in the behavior hierarchy. When the external behavior hierarchy is connected as a client, the log memory module transfers the log information to the client. At the same time, the log memory module can be connected as a client to the memory module of the external behavior hierarchy. In this case, the log information generated in the external action hierarchy can be extracted and stored.
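The server and client roles of the log memory module can be sketched in a single process as follows. Direct method calls stand in for the communication unit 153, which is an assumption of this sketch, as are the class name `LogMemoryModule`, its methods, and the record contents.

```python
from typing import Dict, List


class LogMemoryModule:
    """Stores local log records and forwards them to connected client modules."""

    def __init__(self, layer_name: str) -> None:
        self.layer_name = layer_name
        self.records: List[Dict[str, str]] = []
        self._clients: List["LogMemoryModule"] = []

    def connect_client(self, client: "LogMemoryModule") -> None:
        """Server side: register a client and send it the records stored so far."""
        self._clients.append(client)
        client.records.extend(self.records)

    def log(self, record: Dict[str, str]) -> None:
        """Store a locally generated record and transfer it to every client."""
        self.records.append(record)
        for client in self._clients:
            client.records.append(record)


# The reflex hierarchy acts as server, the situation-dependent hierarchy as client.
server = LogMemoryModule("reflex")
client = LogMemoryModule("situation-dependent")
server.log({"schema": "Boot", "status": "ACTIVE", "sub_status": "Start"})
server.connect_client(client)
server.log({"schema": "Boot", "status": "ACTIVE", "sub_status": "Success"})
print(len(client.records))
```

Because the client keeps its own record list, it can hold both locally generated records and records received from the server, which matches the statement that the activity log covers information from inside and outside the behavior hierarchy.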
[0135]
Using a log information retrieval mechanism, each schema can access log information from a log memory module in the behavior hierarchy. Due to the communication function between the internal log memory module and the external behavioral hierarchy, the activity log includes information of both inside and outside the behavioral hierarchy.
[0136]
As described above, improper interference between the schemas (contention of used resources) causes a malfunction of the system, and must be avoided or limited.
[0137]
More specifically, the context-dependent behavior hierarchy 108 should not prevent the execution of system-critical behavior. For example, schemas of the context-dependent behavior hierarchy 108 should not be activated while the reflex behavior hierarchy 109 is booting the system (boot schema active), shutting the system down (shutdown schema active), or handling a detected error situation through error avoidance processing or reflex behavior (error processing / reflex behavior schema active).
[0138]
The reflex behavior hierarchy monitoring schema 154 is disposed in the context-dependent behavior hierarchy 108 immediately below the root schema of the tree-structured schema group. The reflex behavior hierarchy monitoring schema 154 monitors the activity log in the log memory client 151 and detects the start and end of each schema in the reflex behavior hierarchy 109, that is, the boot schema, the shutdown schema, and the error avoidance schema. When schema activity in the reflex behavior hierarchy 109 is detected, the reflex behavior hierarchy monitoring schema 154 performs appropriate processing; for example, it restricts the execution of other schemas in the context-dependent behavior hierarchy 108. When the schema activity in the reflex behavior hierarchy 109 ends, the restriction imposed on detection is released.
[0139]
In the reflex behavior hierarchy 109, activity logging is enabled. That is, when a schema is executed, log information is generated.
[0140]
The log memory module in the context-dependent behavior hierarchy 108 is connected as a client to the log memory module in the reflex behavior hierarchy 109. Accordingly, the log information generated in the reflex behavior hierarchy 109 is transferred to the log memory module in the context-dependent behavior hierarchy 108, that is, the log memory client 151.
[0141]
The reflex behavior hierarchy monitoring schema 154 searches for the logs of related schemas using identification information such as the hierarchy ID, the schema ID, and the schema name. A specific schema can be found by using both the hierarchy ID and the schema ID. The activity of all schemas in a hierarchy can be obtained by using only the hierarchy ID.
[0142]
Use the execution status and sub-execution status to retrieve the relevant activity of the selected schema. For example, execution start and execution end can be found by searching for (ACTIVE, Start) and (ACTIVE, (End, Success)), respectively.
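The log search described in paragraphs [0141] and [0142] can be sketched as a filter over log records. Records are plain dictionaries here, and the key names, the function names, and the rule that a schema counts as running when it has more start records than end records are assumptions of this sketch.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, object]


def find_records(log: List[Record], layer_id: int,
                 schema_id: Optional[int] = None,
                 status: Optional[str] = None,
                 sub_statuses: Tuple[str, ...] = ()) -> List[Record]:
    """Filter by hierarchy ID, and optionally by schema ID, status, and sub-status."""
    matches = []
    for record in log:
        if record["layer_id"] != layer_id:
            continue
        if schema_id is not None and record["schema_id"] != schema_id:
            continue
        if status is not None and record["status"] != status:
            continue
        if sub_statuses and record["sub_status"] not in sub_statuses:
            continue
        matches.append(record)
    return matches


def is_running(log: List[Record], layer_id: int, schema_id: int) -> bool:
    """Assumed rule: running if (ACTIVE, Start) outnumbers (ACTIVE, (End, Success))."""
    starts = find_records(log, layer_id, schema_id, "ACTIVE", ("Start",))
    ends = find_records(log, layer_id, schema_id, "ACTIVE", ("End", "Success"))
    return len(starts) > len(ends)


# Hypothetical log: the boot schema has finished, the error schema is still running.
log = [
    {"layer_id": 109, "schema_id": 1, "status": "ACTIVE", "sub_status": "Start"},
    {"layer_id": 109, "schema_id": 2, "status": "ACTIVE", "sub_status": "Start"},
    {"layer_id": 109, "schema_id": 1, "status": "ACTIVE", "sub_status": "Success"},
]
print(is_running(log, 109, 1), is_running(log, 109, 2))
```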
[0143]
In the present embodiment, the execution of the schema is selected or restricted in the context-dependent behavior hierarchy 108 based on the following criteria using resource contention and activity level.
[0144]
(1) Grant execution rights to all schemas that do not conflict with resources used
(2) When a conflict occurs, the execution right is granted to the schema having a higher activity level, and execution of all other conflicting schemas already being executed is stopped.
[0145]
Such a schema selection and execution policy is applied only in the behavior hierarchy, and only the schema in the behavior hierarchy is considered. Therefore, even if resource conflicts occur between schemas executed in different behavioral hierarchies, they cannot be detected, resulting in inappropriate interference between schema activities.
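The limitation described above can be illustrated with a short sketch: each hierarchy applies the selection policy only to its own schemas, so a resource claimed in both hierarchies goes unnoticed by either. The function names, schema names, and activity levels are hypothetical.

```python
from typing import List, Set, Tuple

Candidate = Tuple[str, float, Set[str]]   # (schema name, activity level, resources)


def select_within_hierarchy(candidates: List[Candidate]) -> List[Candidate]:
    """Apply the policy inside one hierarchy: highest activity level wins a conflict."""
    in_use: Set[str] = set()
    granted: List[Candidate] = []
    for candidate in sorted(candidates, key=lambda c: c[1], reverse=True):
        if in_use.isdisjoint(candidate[2]):
            in_use |= candidate[2]
            granted.append(candidate)
    return granted


def cross_hierarchy_conflicts(a: List[Candidate], b: List[Candidate]) -> Set[str]:
    """Resources claimed by both hierarchies; neither local policy can see these."""
    used_a: Set[str] = set().union(*(c[2] for c in a))
    used_b: Set[str] = set().union(*(c[2] for c in b))
    return used_a & used_b


situation_dependent = select_within_hierarchy([("Walk", 0.6, {"legs"}),
                                               ("Talk", 0.5, {"speaker"})])
reflex = select_within_hierarchy([("ObstacleAvoid", 0.9, {"legs"})])
print(cross_hierarchy_conflicts(situation_dependent, reflex))
```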
[0146]
In this embodiment, the reflex behavior hierarchy monitoring schema 154 can use the schema selection and execution policy based on resource contention and activity levels to limit the activity of other schemas within the context-dependent behavior hierarchy 108. That is, the reflex behavior hierarchy monitoring schema 154 requests the appropriate resources and raises its activity level sufficiently high. Because of the resulting resource contention, running schemas are stopped and other schemas are prevented from being selected and executed.
[0147]
The reflex behavior hierarchy monitoring schema 154 selects an appropriate resource according to the activity of the schema detected in the reflex behavior hierarchy 109. When a boot schema and a shutdown schema are selected and executed, the reflex behavior hierarchy monitoring schema 154 requests all machine resources. When the error avoidance schema or the reflex behavior schema is selected and executed, the reflex behavior hierarchy monitoring schema 154 requests only the machine resources related to the specific error or the like.
[0148]
Because the reflex behavior hierarchy monitoring schema 154 is disposed immediately below the root schema in the context-dependent behavior hierarchy 108, it can significantly affect the selection and execution of other schemas in that behavior hierarchy.
[0149]
In response to the execution of a schema in the reflex behavior hierarchy 109, the Monitor function of the reflex behavior hierarchy monitoring schema 154 maximizes its own activity level and sets all resources (or the resources used in the reflex behavior hierarchy 109) as the used resources in its return value. As a result, the root schema always selects the reflex behavior hierarchy monitoring schema 154, so that resource contention between the context-dependent behavior hierarchy 108 and the reflex behavior hierarchy 109 is avoided and system-critical schemas can be executed without hindrance.
[0150]
In addition, the reflex behavior hierarchy monitoring schema 154 can release the restriction on schema selection and execution imposed on detecting the activity by releasing the resources it requested and lowering its activity level. This removes the resource contention and allows other schemas to continue normal activity.
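The behavior of the monitoring schema's Monitor function in paragraphs [0147] to [0150] can be sketched as follows. The resource names, the maximum activity level of 1.0, and the function name are assumptions; the sketch takes as input the reflex schemas currently detected as running, mapped to the resources they use.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical machine resources and activity-level ceiling.
ALL_RESOURCES: FrozenSet[str] = frozenset({"head", "arms", "legs", "speaker", "led"})
MAX_ACTIVITY_LEVEL = 1.0
SYSTEM_WIDE_SCHEMAS = {"Boot", "Shutdown"}


def monitoring_schema_monitor(
        running_reflex: Dict[str, FrozenSet[str]]) -> Tuple[float, FrozenSet[str]]:
    """Return (activity level, resources requested) for the monitoring schema."""
    if not running_reflex:
        # No reflex activity: release resources and lower the activity level.
        return 0.0, frozenset()
    if any(name in SYSTEM_WIDE_SCHEMAS for name in running_reflex):
        # Boot and shutdown need every machine resource.
        return MAX_ACTIVITY_LEVEL, ALL_RESOURCES
    # Error avoidance or reflex behavior: request only the resources involved.
    needed = frozenset().union(*running_reflex.values())
    return MAX_ACTIVITY_LEVEL, needed


print(monitoring_schema_monitor({"ObstacleAvoid": frozenset({"legs"})}))
```

Because the Action function of the monitoring schema is a dummy, selecting it performs no body operation in the context-dependent hierarchy; it only keeps the requested resources away from that hierarchy's other schemas while the reflex schema runs.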
[0151]
In this way, through the intervention of the reflex behavior hierarchy monitoring schema 154, resource contention between a plurality of behavior hierarchies can be suitably avoided, and the hierarchies can execute in parallel cooperatively and organically without inappropriate interference.
[0152]
In each behavior hierarchy, a schema can be selected and executed according to a unique policy. Further, resource management in each behavior hierarchy can be simplified.
[0153]
In addition, a new behavior hierarchy can be developed independently of the existing behavior hierarchy, and no special modification is required when installing this in the system.
[0154]
[Supplement]
The present invention has been described in detail with reference to the specific embodiments. However, it is obvious that those skilled in the art can modify or substitute the embodiment without departing from the scope of the present invention.
[0155]
The gist of the present invention is not necessarily limited to products called “robots”. That is, the present invention can be similarly applied to any mechanical device that uses electric or magnetic action to perform motions resembling those of a human, to other mobile machines, and to information processing systems that control the behavior of such devices, even when the product belongs to another industrial field, such as toys.
[0156]
In short, the present invention has been disclosed by way of example, and the contents described in this specification should not be interpreted in a limited manner. In order to determine the gist of the present invention, the claims described at the beginning should be considered.
[0157]
[Effect of the invention]
As described in detail above, the present invention can provide an excellent robot behavior control system that can appropriately perform both situation-dependent behavior, in which the situation where the robot is placed is judged in an integrated manner and an appropriate behavior is selected, and reflex behavior, in which a reflexive body operation is realized in response to a recognized external stimulus.
[0158]
Further, according to the present invention, it is possible to provide an excellent robot behavior control system capable of executing situation-dependent behavior and reflex behavior in parallel, cooperatively and organically, without inappropriate interference.
[Brief description of the drawings]
FIG. 1 is a diagram schematically showing a functional configuration of a robot device 1 used in the present invention.
FIG. 2 is a diagram showing the configuration of a control unit 20 in further detail.
FIG. 3 is a diagram schematically showing a functional configuration of a behavior control system 100 of the robot device 1 according to the embodiment of the present invention.
FIG. 4 is a diagram schematically illustrating an object configuration of the behavior control system 100 according to the embodiment of the present invention.
FIG. 5 is a diagram schematically showing a form of situation-dependent behavior control by a situation-dependent behavior hierarchy 108.
FIG. 6 is a diagram showing a basic operation example of behavior control by the situation-dependent behavior hierarchy shown in FIG.
FIG. 7 is a diagram showing an operation example when a reflex action is performed by the situation-dependent behavior hierarchy shown in FIG. 5.
FIG. 8 is a diagram illustrating an operation example when emotion expression is performed by the situation-dependent behavior hierarchy shown in FIG. 5.
FIG. 9 is a diagram schematically illustrating a situation in which the situation-dependent behavior hierarchy 108 is configured by a plurality of schemas.
FIG. 10 is a diagram schematically showing a tree structure of schemas in the situation-dependent behavior hierarchy 108.
FIG. 11 is a diagram schematically showing an internal configuration of a schema.
FIG. 12 is a diagram schematically illustrating an internal configuration of a Monitor function.
FIG. 13 is a diagram schematically illustrating a configuration example of a behavior state control unit.
FIG. 14 is a diagram schematically illustrating another configuration example of the behavior state control unit.
FIG. 15 is a diagram schematically showing a mechanism for controlling normal situation-dependent behavior in the situation-dependent behavior hierarchy 108.
FIG. 16 is a diagram showing an internal configuration of the reflex behavior hierarchy 109.
FIG. 17 is a diagram for explaining a mechanism of a cooperative operation between the situation-dependent behavior hierarchy 108 and the reflex behavior hierarchy 109.
FIG. 18 is a diagram illustrating a configuration example of a log memory module.
FIG. 19 is a diagram showing a setting example of a memory configuration.
[Explanation of symbols]
1 ... Robot device
15 ... CCD camera
16 ... Microphone
17 ... Speaker
18 ... Touch sensor
19 ... LED indicator
20 ... Control unit
21 ... CPU
22 ... RAM
23 ... ROM
24 ... Non-volatile memory
25 ... Interface
26 ... Wireless communication interface
27 ... Network interface card
28 ... Bus
29 ... Keyboard
40 ... Input/output unit
50 ... Drive unit
51 ... Motor
52 ... Encoder
53 ... Driver
100 ... Behavior control system
101 ... Visual recognition function unit
102 ... Auditory recognition function unit
103 ... Touch recognition function unit
105 ... Short-term memory
106 ... Long-term storage unit
107 ... Reflection behavior hierarchy
108 ... Situation-dependent behavior hierarchy
109 ... Reflex action part

Claims

A behavior control system for an autonomously operating robot,
A first behavior hierarchy for controlling the behavior of the robot;
A second behavior hierarchy that controls the behavior of the robot independently of the first behavior hierarchy;
A log memory module that is provided for each behavior hierarchy and records the execution status of the behavior in the behavior hierarchy,
A connection unit for communicating recorded contents between the log memory modules of each behavior hierarchy,
A behavior control system for a robot, comprising:
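The arrangement of per-hierarchy log memory modules joined by a connection unit can be sketched as follows. This is only an illustration under stated assumptions: the class names LogMemory and ConnectionUnit, their methods, and the hierarchy and behavior names are hypothetical and do not appear in the specification.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class LogMemory:
    """Per-hierarchy record of which behaviors are running and what they use."""
    records: dict[str, frozenset] = field(default_factory=dict)

    def start(self, behavior: str, resources: frozenset) -> None:
        # Record that a behavior has started and which resources it occupies.
        self.records[behavior] = resources

    def end(self, behavior: str) -> None:
        # Remove the record when the behavior ends.
        self.records.pop(behavior, None)


class ConnectionUnit:
    """Lets one behavior hierarchy read the log memory of another."""

    def __init__(self) -> None:
        self._logs: dict[str, LogMemory] = {}

    def attach(self, hierarchy: str, log: LogMemory) -> None:
        self._logs[hierarchy] = log

    def active_behaviors(self, hierarchy: str) -> dict[str, frozenset]:
        # Return a copy so that a reader cannot alter the other hierarchy's log.
        return dict(self._logs[hierarchy].records)


link = ConnectionUnit()
reflex_log = LogMemory()
link.attach("reflex", reflex_log)
reflex_log.start("avoid_obstacle", frozenset({"legs"}))
busy = link.active_behaviors("reflex")
```

In this sketch the first behavior hierarchy would call active_behaviors("reflex") to learn which behaviors of the second hierarchy are running and which resources they occupy.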

The first behavior hierarchy is a situation-dependent behavior hierarchy that judges, in an integrated manner, the situation where the robot is placed and selects an appropriate behavior,
The second behavior hierarchy is a reflex behavior hierarchy that governs behaviors that are critical to the system,
The robot behavior control system according to claim 1, wherein:

In the first behavior hierarchy, behavior is controlled in accordance with the recorded contents of the log memory module of the second behavior hierarchy obtained via the connection unit,
The robot behavior control system according to claim 1, wherein:

Based on the recorded contents of the log memory module of the second behavior hierarchy obtained via the connection unit, the first behavior hierarchy enters a standby state while the second behavior hierarchy is active,
The robot behavior control system according to claim 3, wherein:

Based on the recorded contents of the log memory module of the second behavior hierarchy obtained via the connection unit, the first behavior hierarchy selects a behavior that avoids the body resources being used in the second behavior hierarchy,
The robot behavior control system according to claim 3, wherein:

Each of the behavior hierarchies comprises:
One or more behavior modules, each comprising a state machine that describes a body operation, and a behavior evaluator that evaluates the activity level of the current body operation state in the state machine and the body resources used when that body operation state is activated,
A behavior state control unit that controls the behavior state of each behavior module by instructing the behavior evaluator of each behavior module to calculate an activity level and used resources, selecting the behavior module to be activated according to each activity level and the used resources, and instructing the state machine of the selected behavior module to execute,
The robot behavior control system according to claim 1, further comprising:
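The relationship between a behavior module (state machine plus behavior evaluator) and the behavior state control unit can be sketched as below. This is a simplified illustration, not the claimed implementation: the names BehaviorModule and BehaviorStateControl, the two-state machine, the single-winner selection rule, and the example evaluators are all assumptions.

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Callable


@dataclass
class BehaviorModule:
    name: str
    resources: frozenset                 # body resources used when active
    evaluate: Callable[[dict], float]    # behavior evaluator: context -> activity level
    state: str = "standby"               # minimal state machine: "standby" or "active"


class BehaviorStateControl:
    """Asks each evaluator for an activity level and activates the best module."""

    def __init__(self, modules: list[BehaviorModule]) -> None:
        self.modules = modules

    def step(self, context: dict) -> BehaviorModule | None:
        if not self.modules:
            return None
        scored = [(m.evaluate(context), m) for m in self.modules]
        level, winner = max(scored, key=lambda pair: pair[0])
        for _, m in scored:
            # Modules whose level no longer wins fall back to standby.
            m.state = "active" if (m is winner and level > 0) else "standby"
        return winner if level > 0 else None


# Hypothetical evaluators driven by external environment and internal state.
kick = BehaviorModule("kick_ball", frozenset({"legs"}),
                      lambda c: 80.0 if c.get("ball_visible") else 0.0)
rest = BehaviorModule("rest", frozenset({"legs", "head"}),
                      lambda c: 100.0 - c.get("battery", 100.0))
control = BehaviorStateControl([kick, rest])
selected = control.step({"ball_visible": True, "battery": 90.0})
```

Calling step again with a different context moves the previously active module to standby and the newly highest-rated module to active, which corresponds to the state transitions driven by changes in activity level.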

The behavior evaluator evaluates the activity level of the state machine according to the external environment of the robot body and/or the internal state of the robot,
The robot behavior control system according to claim 6, wherein:

The behavior state control unit transitions the behavior module having the decreased activity level from the active state to the standby state, and transitions the behavior module having the increased activity level from the standby state to the active state.
The robot behavior control system according to claim 6, wherein:

The behavior state control unit selects two or more behavior modules whose resources do not compete with each other according to the activity level.
The robot behavior control system according to claim 6, wherein:

In each of the behavior hierarchies, the behavior modules are arranged in a tree structure according to the level of realization of the body operation,
The behavior state control unit disposed in the top-level behavior module of the tree structure instructs the behavior modules, proceeding toward the lower levels of the tree structure, to evaluate their activity levels and used resources, selects behavior modules, and instructs execution of their state machines,
The robot behavior control system according to claim 6, wherein:
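Evaluation and selection proceeding down a tree of behavior modules can be sketched as follows. The sketch rests on assumptions not stated in the claim: the class name SchemaNode is hypothetical, a parent is assumed to report the level and resources of its best child, and only one branch is executed.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class SchemaNode:
    name: str
    own_level: float = 0.0               # hypothetical activity level of a leaf
    resources: frozenset = frozenset()   # resources a leaf would use
    children: list[SchemaNode] = field(default_factory=list)

    def evaluate(self) -> tuple[float, frozenset]:
        """Evaluate downward: a parent reports its best child's level and resources."""
        if not self.children:
            return self.own_level, self.resources
        return max((c.evaluate() for c in self.children), key=lambda r: r[0])

    def execute(self) -> list[str]:
        """Descend along the highest-activity branch and return the path taken."""
        if not self.children:
            return [self.name]
        best = max(self.children, key=lambda c: c.evaluate()[0])
        return [self.name] + best.execute()


root = SchemaNode("root", children=[
    SchemaNode("play", children=[
        SchemaNode("kick_ball", 70.0, frozenset({"legs"})),
        SchemaNode("chase_ball", 50.0, frozenset({"legs", "head"})),
    ]),
    SchemaNode("rest", 30.0, frozenset({"legs"})),
])
path = root.execute()
```

Here the evaluation request issued at the root propagates to the leaves, and execution then follows the branch with the highest reported activity level.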

Immediately below the root behavior module in the first behavior hierarchy, a behavior hierarchy monitoring behavior module is provided that monitors, via the log memory module, the activity log of the behavior modules in the second behavior hierarchy and detects the start and end of each behavior module in the second behavior hierarchy,
The robot behavior control system according to claim 10, wherein:

The behavior hierarchy monitoring behavior module raises its own activity level sufficiently high, and requests resources, in response to the activity of a behavior module in the second behavior hierarchy detected via the log memory module,
The robot behavior control system according to claim 12, wherein: