JP2004283958A - Robot device, method of controlling its behavior and program thereof

Info

Publication number: JP2004283958A
Application number: JP2003079145A
Authority: JP
Inventors: Rika Horinaka (堀中 里香); Tsutomu Sawada (澤田 務)
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2003-03-20
Filing date: 2003-03-20
Publication date: 2004-10-14

Abstract

PROBLEM TO BE SOLVED: To provide a robot device that can express varied actions according to the external situation and its own internal state even when a specific external stimulus, such as a command from a user, designates the action to be expressed, together with a method of controlling its behavior and a program therefor.

SOLUTION: The behavior control system of the robot device has a plurality of schemas in which actions are described, and outputs an action selected from the plurality of actions. Each schema calculates an activation level (AL) for its action based on the internal state and on external stimuli such as recognized information on a user's face or an object, or a command from the user. A schema is selected based on the AL and its action is output. For example, if a schema with a negative desire is selected by a user command, the schema outputs an action expressing that negative desire, namely that it does not want to perform the action described in the schema itself.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

【０００１】
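【Technical Field of the Invention】
The present invention relates to an entertainment robot device that imitates humans or animals, and to a control method and program therefor. In particular, it relates to a robot device that, like a human or an animal, has desires regarding the expression of behaviors and can select and express a behavior based on those desires, and to a control method and program therefor.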
【０００２】
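【Prior Art】
A mechanical device that uses electrical or magnetic action to perform movements resembling those of a human (living being) is called a "robot device". Robot devices began to spread in Japan at the end of the 1960s, but most were industrial robot devices (Industrial Robot), such as manipulators and transport robot devices, intended to automate and unman production work in factories.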
【０００３】
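Recently, development has progressed on practical robot devices that support daily life as partners of humans, that is, that support human activities in the living environment and in various other everyday situations. Unlike industrial robot devices, such practical robot devices have the ability to learn by themselves how to adapt to humans with differing personalities and to various environments across the many aspects of the human living environment. For example, "pet-type" robot devices that model the body mechanisms and movements of four-legged animals such as dogs or cats, and "humanoid" robot devices (Humanoid Robot) designed on the model of the body mechanisms and movements of humans and other beings that walk upright on two legs, are already coming into practical use.
【０００４】
Compared with industrial robot devices, these robot devices can perform various actions that emphasize entertainment value, and are therefore sometimes called entertainment robot devices. Some of them act autonomously in response to external information and their internal state.
【０００５】
In such a pet robot device, if functions could be provided that make it perform the optimum next behavior and action for the current situation, as a human or a real dog or cat would, and that change the next behavior and action based on past experience, the user would feel greater affinity and satisfaction, and the amusement value of the pet robot device would improve further. A robot device with such improved amusement value and its control method are described in Patent Document 1 below.
【０００６】
The robot device described in Patent Document 1 has a plurality of types of behavior models and uses behavior selection means to select the output of one behavior model from among the outputs of the behavior models, based on at least one of external input information and its own behavior history and/or growth history. This allows it to perform, in succession, the optimum next behavior for the current situation.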
【０００７】
【Patent Document 1】
Japanese Unexamined Patent Application Publication No. 2001-157981
【０００８】
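【Problem to be Solved by the Invention】
However, conventional robot devices such as the above aim to select a behavior model with a high priority so as to express the behavior the robot wants to perform. Behaviors (actions) with low priority, that is, those judged to be unwanted, are never selected and have therefore received little consideration. In conventional behavior selection, when the user designates a behavior by saying "do this", the robot is programmed always to take the corresponding behavior (action), so it takes the designated behavior in the same way regardless of the external situation and its internal state. With only such fixed responses the user grows bored, and entertainment value is lacking.
【０００９】
That is, for cases where the robot is made to act even when it does not want to, suppose it could be given negative desires, for example "I don't want to play" because it is "tired" or "hungry" even when told "let's play together", and suppose these negative desires could also be reflected in its behavior. Its behavior would then more closely resemble that of a human or of an animal such as a dog or cat, giving the user greater affinity and satisfaction and further improving entertainment value.
【００１０】
The present invention has been proposed in view of these circumstances. Its object is to provide a robot device that can express a wide variety of actions according to the external situation and its own internal state even when a specific external stimulus, such as an instruction from the user, designates the action to be expressed, together with an action control method and a program therefor.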
【００１１】
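【Means for Solving the Problem】
To achieve the above object, a robot device according to the present invention is a robot device that selects and expresses a behavior based on an internal state and external stimuli, comprising: behavior output means in which a plurality of behaviors are described and which outputs a behavior selected from the plurality of behaviors; and priority calculation means for calculating an execution priority of each behavior from the internal state and/or the external stimuli. The execution priority indicates a positive desire or a negative desire regarding the expression of each behavior, and when the execution priority of the selected behavior indicates a negative desire, the behavior output means outputs a behavior different from the selected behavior.
【００１２】
Because the present invention has not only positive desires, such as wanting to do something, but also negative desires, such as not wanting to do it, these negative desires can be reflected in behavior. For example, when the desire for a behavior designated by a predetermined external stimulus, such as a user instruction, is negative, the robot can express a substitute behavior, such as telling the user by voice that it does not want to perform the designated behavior.
【００１３】
A behavior control method for a robot device according to the present invention is a method for a robot device that selects and expresses a behavior based on an internal state and external stimuli, comprising: a priority calculation step of calculating an execution priority of each behavior from the internal state and/or the external stimuli; and a behavior output step of outputting a behavior selected from a plurality of behaviors. The execution priority indicates a positive desire or a negative desire regarding the expression of each behavior, and in the behavior output step, when the execution priority of the selected behavior indicates a negative desire, a behavior different from the selected behavior is output.
【００１４】
A program according to the present invention causes a computer to execute the action control processing described above.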
【００１５】
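【Embodiments of the Invention】
Specific embodiments of the present invention are described in detail below with reference to the drawings. The robot device of this embodiment is a robot device capable of acting autonomously according to its internal state, characterized in that it also has negative desires such as "I don't want to do it" and can express them. A suitable configuration and control system for such a robot device are described first, followed by a detailed description of the robot device of this embodiment that shows negative desires.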
【００１６】
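(1) Configuration of the robot device
FIG. 1 is a perspective view showing the external appearance of the robot device of this embodiment. As shown in FIG. 1, the robot device 1 is configured by connecting a head unit 3 to a predetermined position on a trunk unit 2, together with left and right arm units 4R/L and left and right leg units 5R/L (R and L are suffixes denoting right and left, respectively; the same applies below).
【００１７】
FIG. 2 is a block diagram schematically showing the functional configuration of the robot device 1 in this embodiment. As shown in FIG. 2, the robot device 1 comprises a control unit 20 that performs overall control of operation and other data processing, an input/output unit 40, a drive unit 50, and a power supply unit 60. Each is described below.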
【００１８】
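The input/output unit 40 includes, as its input section, a CCD camera 15 that corresponds to the human eye and captures the external situation, a microphone 16 that corresponds to the ear, touch sensors 18 that are arranged on parts such as the head and back and sense user contact by electrically detecting a predetermined pressure, a distance sensor for measuring the distance to an object located ahead, and various other sensors corresponding to the five senses. As its output section, it is equipped with a speaker 17 that is provided in the head unit 3 and corresponds to the human mouth, and, for example, LED indicators (eye lamps) 19 that are provided at the position of the human eyes and express emotion and the state of visual recognition. Through sound, blinking of the LED indicators 19, and the like, these output sections can express user feedback from the robot device 1 in forms other than mechanical movement patterns of the legs and so on.
【００１９】
For example, a plurality of touch sensors 18 can be provided at predetermined locations on the top of the head unit, and contact detection at each touch sensor 18 can be used in combination to detect user actions such as "stroking", "hitting", or "tapping" the head of the robot device 1. For example, when several of the pressure sensors detect contact one after another at predetermined intervals, this is judged as "stroked"; when contact is detected within a short time, it is judged as "hit". The internal state changes according to this classification, and actions can be expressed according to the change in internal state.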
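The stroked/hit distinction described above can be sketched as follows. This is a minimal illustration only: the function name `classify_touch`, the `(sensor_id, timestamp)` input format, and the threshold `STROKE_MIN_GAP_S` are assumptions, since the text gives no numeric values or interfaces.

```python
from typing import List, Tuple

# Hypothetical threshold; the source only speaks of "predetermined intervals"
# and "a short time" without giving numbers.
STROKE_MIN_GAP_S = 0.15


def classify_touch(contacts: List[Tuple[int, float]]) -> str:
    """Classify a touch from (sensor_id, timestamp_seconds) pairs in detection order."""
    if len(contacts) < 2:
        return "unknown"
    sensors = {sensor_id for sensor_id, _ in contacts}
    gaps = [t2 - t1 for (_, t1), (_, t2) in zip(contacts, contacts[1:])]
    # Several sensors touched one after another at intervals: stroked.
    if len(sensors) >= 2 and all(gap >= STROKE_MIN_GAP_S for gap in gaps):
        return "stroked"
    # All contacts fall within a short time window: hit.
    if contacts[-1][1] - contacts[0][1] < STROKE_MIN_GAP_S:
        return "hit"
    return "unknown"
```

In this sketch the result would then be passed to whatever component updates the internal state; that step is not shown.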
【００２０】
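The drive unit 50 is a functional block that realizes body motion of the robot device 1 according to predetermined movement patterns commanded by the control unit 20, and is the object of behavior control. It is a functional module for realizing the degrees of freedom at each joint of the robot device 1 and consists of a plurality of drive units 54_1 to 54_n provided for each axis (roll, pitch, yaw, and so on) at each joint. Each drive unit 54_1 to 54_n is a combination of a motor 51_1 to 51_n that rotates about a predetermined axis, an encoder 52_1 to 52_n that detects the rotational position of the motor 51_1 to 51_n, and a driver 53_1 to 53_n that adaptively controls the rotational position and rotational speed of the motor 51_1 to 51_n based on the output of the encoder 52_1 to 52_n.
【００２１】
Although the robot device 1 described here is bipedal, it can also be configured as another kind of legged mobile robot device, for example a quadruped, depending on how the drive units are combined.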
【００２２】
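The power supply unit 60 is, as its name indicates, a functional module that supplies power to the electric circuits in the robot device 1. The robot device 1 of this embodiment is an autonomously driven type using a battery, and the power supply unit 60 consists of a rechargeable battery 61 and a charge/discharge control unit 62 that manages the charge/discharge state of the rechargeable battery 61.
【００２３】
The rechargeable battery 61 takes the form of, for example, a "battery pack" in which a plurality of lithium-ion secondary cells are packaged as a cartridge.
【００２４】
The charge/discharge control unit 62 determines the remaining capacity of the battery 61 by measuring its terminal voltage, charge/discharge current, ambient temperature, and so on, and decides when to start and end charging. The start and end times it decides are notified to the control unit 20 and serve as triggers for the robot device 1 to start and end the charging operation.
【００２５】
The control unit 20 corresponds to the "brain" and is mounted, for example, in the head or trunk of the robot device 1.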
【００２６】
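FIG. 3 is a block diagram showing the configuration of the control unit 20 in more detail. As shown in FIG. 3, the control unit 20 has a configuration in which a CPU (Central Processing Unit) 21 serving as the main controller is connected by a bus to memory, other circuit components, and peripheral devices. The bus 28 is a common signal transmission path that includes a data bus, an address bus, a control bus, and so on. Each device on the bus 28 is assigned a unique address (memory address or I/O address). The CPU 21 can communicate with a specific device on the bus 28 by specifying its address.
【００２７】
The RAM (Random Access Memory) 22 is a writable memory composed of volatile memory such as DRAM (Dynamic RAM). It is used to load program code executed by the CPU 21 and to temporarily store the working data of running programs.
【００２８】
The ROM (Read Only Memory) 23 is a read-only memory that permanently stores programs and data. Program code stored in the ROM 23 includes a self-diagnostic test program executed when the robot device 1 is powered on and an operation control program that defines the operation of the robot device 1.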
【００２９】
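The control programs of the robot device 1 include a "sensor input and recognition processing program" that processes sensor input from the camera 15, microphone 16, and so on and recognizes it as symbols; a "behavior control program" that controls the behavior of the robot device 1 based on sensor input and a predetermined behavior control model while managing memory operations such as short-term and long-term memory (described later); and a "drive control program" that controls the driving of each joint motor, the audio output of the speaker 17, and so on according to the behavior control model.
【００３０】
The non-volatile memory 24 is composed of electrically erasable and rewritable memory elements such as EEPROM (Electrically Erasable and Programmable ROM) and is used to hold, in a non-volatile manner, data that must be updated from time to time. Such data includes encryption keys and other security information, and device control programs to be installed after shipment.
【００３１】
The interface 25 is a device for interconnecting with equipment outside the control unit 20 and enabling data exchange. It performs data input/output with, for example, the camera 15, the microphone 16, or the speaker 17. It also exchanges data and commands with each of the drivers 53_1 to 53_n in the drive unit 50.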
【００３２】
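The interface 25 may also include general-purpose interfaces for connecting computer peripherals, such as a serial interface such as RS (Recommended Standard)-232C, a parallel interface such as IEEE (Institute of Electrical and Electronics Engineers) 1284, a USB (Universal Serial Bus) interface, an i-Link (IEEE 1394) interface, a SCSI (Small Computer System Interface) interface, and a memory card interface (card slot) that accepts PC cards or Memory Sticks, so that programs and data can be transferred to and from locally connected external devices.
【００３３】
As another example, the interface 25 may include an infrared communication (IrDA) interface and communicate wirelessly with external devices.
【００３４】
Furthermore, the control unit 20 includes a wireless communication interface 26, a network interface card (NIC) 27, and so on, and can exchange data with various external host computers via short-range wireless data communication such as Bluetooth, a wireless network such as IEEE 802.11b, or a wide-area network such as the Internet.
【００３５】
Through such data communication between the robot device 1 and a host computer, remote computing resources can be used to compute complex motion control of the robot device 1 or to control it remotely.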
【００３６】
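(2) Control system of the robot device
Next, the behavior (action) control system of the robot device is described. As noted above, the robot device of this embodiment is one that autonomously expresses actions from external stimuli and its own internal state and that, in order to express more humanlike behavior, has negative desires such as not wanting to do something and reflects them in its behavior. Here, the behavior control system of a robot device that autonomously expresses actions is described first, followed by the method of reflecting negative desires in behavior.
【００３７】
FIG. 4 is a schematic diagram showing the functional configuration of the control system 10 of the robot device 1 in this embodiment. The robot device 1 can perform action control according to the recognition results of external stimuli and changes in its internal state. It also has a long-term memory function and, by associatively storing changes in internal state with external stimuli, can perform action control according to the recognition results of external stimuli and changes in internal state.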
【００３８】
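Here, an external stimulus is perceptual information obtained when the robot device 1 recognizes sensor input, for example color information, shape information, and face information obtained by processing images input from the camera 15. More specifically, it consists of components such as color, shape, face, general 3D objects, hand gestures, motion, voice, contact, smell, and taste.
【００３９】
The internal state refers to emotions, such as instincts and feelings, that are grounded in the body of the robot device. The instinctive elements are at least one of, for example, fatigue, heat or body temperature (temperature), pain, appetite or hunger, thirst, affection, curiosity, elimination, and sexual desire (sexual). The emotional elements are at least one of happiness, sadness, anger, surprise, disgust, fear, frustration, boredom, sleepiness (somnolence), sociability (gregariousness), patience, tension (tense), relaxation (relaxed), alertness, guilt, spite, loyalty, submission, and jealousy.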
【００４０】
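The illustrated control system 10 can be implemented using object-oriented programming. In that case, each piece of software is handled in units of modules called "objects", which combine data with the procedures that process that data. Each object can pass data and perform Invoke through inter-object communication using message passing and shared memory.
【００４１】
To recognize the external environment (Environments) 70, the control system 10 includes a state recognition unit 80, a functional module consisting of a visual recognition function unit 81, an auditory recognition function unit 82, a contact recognition function unit 83, and so on.
【００４２】
The visual recognition function unit (Video) 81 performs image recognition processing, such as face recognition and color recognition, and feature extraction on captured images input through an image input device such as a CCD (Charge Coupled Device) camera. The auditory recognition function unit (Audio) 82 performs speech recognition on audio data input through a voice input device such as a microphone, extracting features and recognizing word sets (text). The contact recognition function unit (Tactile) 83 recognizes sensor signals from contact sensors built into, for example, the head of the body, and recognizes external stimuli such as "stroked" or "hit".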
【００４３】
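The internal state manager (ISM: InternalStatusManager) 91 manages several kinds of emotions, such as the instincts and feelings mentioned above, as mathematical models. It manages the internal state of the robot device 1, such as its instincts and emotions, according to the external stimuli (ES: External Stimuli) recognized by the visual recognition function unit 81, the auditory recognition function unit 82, and the contact recognition function unit 83.
【００４４】
These emotion and instinct models each take recognition results and behavior (action) history as input and manage emotion values and instinct values. The behavior models can refer to these emotion and instinct values.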
【００４５】
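To perform action control according to the recognition results of external stimuli and changes in internal state, the system also includes a short-term memory unit (STM: Short Term Memory) 92 for short-term storage that is lost as time passes, and a long-term memory unit (LTM: Long Term Memory) 93 for holding information for a relatively long period. The classification of memory mechanisms into short-term and long-term memory is based on neuropsychology.
【００４６】
The short-term memory unit 92 is a functional module that holds, for a short period, targets and events recognized from the external environment by the visual recognition function unit 81, the auditory recognition function unit 82, and the contact recognition function unit 83. For example, it stores input images from the camera 15 shown in FIG. 2 for only a short period of about 15 seconds.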
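As a rough illustration of time-limited retention in the short-term memory unit, the sketch below keeps items for a fixed period and discards older ones. The class name `ShortTermMemory`, its methods, and the explicit `now` argument are hypothetical; only the figure of about 15 seconds comes from the text.

```python
from collections import deque


class ShortTermMemory:
    """Keeps recognized items for a fixed retention period (in seconds)."""

    def __init__(self, retention_s: float = 15.0):
        self.retention_s = retention_s
        self._items = deque()  # (timestamp, item) pairs, oldest first

    def store(self, item, now: float) -> None:
        self._items.append((now, item))

    def recall(self, now: float) -> list:
        # Drop entries older than the retention period, then return the rest.
        while self._items and now - self._items[0][0] > self.retention_s:
            self._items.popleft()
        return [item for _, item in self._items]
```

Passing the clock value in explicitly keeps the sketch deterministic; a real system would presumably read a timer instead.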
【００４７】
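The long-term memory unit 93 is used to hold, for a long period, information obtained through learning, such as the names of objects. For example, it can associatively store changes in internal state with external stimuli in a given behavior description module.
【００４８】
The action control of the robot device 1 is broadly divided into "reflexive behavior" realized by the reflexive behavior unit (Reflexive Situated Behaviors Layer) 103, "situated behavior" realized by the situated behavior layer (SBL: Situated Behaviors Layer) 102, and "deliberative behavior" realized by the deliberative behavior layer (Deliberative Layer) 101.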
【００４９】
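The reflexive behavior unit 103 is a functional module that realizes reflexive body motion in response to external stimuli recognized by the visual recognition function unit 81, the auditory recognition function unit 82, and the contact recognition function unit 83. Reflexive behavior is basically behavior in which the recognition result of sensor-input external information is received directly, classified, and used to determine the output behavior (action) directly. For example, behaviors such as tracking a human face or nodding are preferably implemented as reflexive behaviors.
【００５０】
The situated behavior layer 102 controls behavior suited to the situation in which the robot device 1 currently finds itself, based on the contents of the short-term memory unit 92 and the long-term memory unit 93 and on the internal state managed by the internal state manager 91.
【００５１】
The situated behavior layer 102 has a plurality of behavior description modules (schemas) in which behaviors (actions) for particular purposes are described, and provides a state machine for each behavior (schema). Depending on preceding actions and situations, it classifies the recognition results of sensor-input external information and expresses actions on the body. The situated behavior layer 102 also realizes behaviors for keeping the internal state within a certain range (also called "homeostasis behaviors"): when the internal state goes outside the specified range, it activates behaviors that return the internal state to that range so that they are more likely to appear (in practice, behaviors (actions) are selected taking both the internal state and the external environment into account).
【００５２】
Specifically, each schema calculates, based on changes in internal state and external stimuli, an activity level (activation level, hereinafter also "AL") that indicates the execution priority of that schema. One or more schemas with a high activation level are selected, and the selected actions are expressed. For example, the schema with the highest activation level can be selected, or two or more schemas whose activation levels exceed a predetermined threshold can be selected and executed in parallel (parallel execution assumes that there is no hardware resource conflict between the schemas). Situated behavior has a slower reaction time than reflexive behavior.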
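The two selection policies mentioned above (highest activation level, or every schema above a threshold run in parallel) might look like the sketch below. The `Candidate` type, the resource labels, and the greedy rule that skips a schema whose resources are already taken are assumptions; the text only states that parallel execution presupposes no hardware resource conflict.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass
class Candidate:
    name: str
    al: float                               # activation level reported by the schema
    resources: FrozenSet[str] = frozenset()  # hypothetical hardware resource labels


def select_highest(candidates: List[Candidate]) -> Candidate:
    """Policy 1: pick the single schema with the highest activation level."""
    return max(candidates, key=lambda c: c.al)


def select_parallel(candidates: List[Candidate], threshold: float) -> List[Candidate]:
    """Policy 2: pick every schema above the threshold, skipping resource conflicts."""
    chosen: List[Candidate] = []
    used: set = set()
    for c in sorted(candidates, key=lambda c: c.al, reverse=True):
        if c.al > threshold and used.isdisjoint(c.resources):
            chosen.append(c)
            used |= c.resources
    return chosen
```

Visiting candidates in descending AL order means that, when two schemas compete for the same resource, the one with the higher activation level is kept.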
【００５３】
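The deliberative behavior layer 101 performs relatively long-term behavior planning and the like for the robot device 1 based on the contents of the short-term memory unit 92 and the long-term memory unit 93. Deliberative behavior is behavior performed by making inferences, and plans for realizing them, in response to a given situation or a command from a human. For example, searching for a path from the position of the robot device to a target position is deliberative behavior. Because such inference and planning may require more processing time and computational load than the reaction time needed for the robot device 1 to maintain interaction, the reflexive and situated behaviors return reactions in real time while the deliberative behavior performs inference and planning.
【００５４】
The deliberative behavior layer 101, the situated behavior layer 102, and the reflexive behavior unit 103 can be written as upper-level application programs that do not depend on the hardware configuration of the robot device 1. In contrast, the hardware-dependent layer control unit (Configuration Dependent Actions And Reactions) 104 directly operates the hardware of the body (external environment), such as driving joint actuators, in response to commands from these upper-level applications, that is, from the behavior description modules (schemas). With this configuration, the robot device 1 can judge its own state and its surroundings based on the control program and act autonomously in response to instructions and approaches from the user.
【００５５】
Next, the behavior control system 10 is described in more detail. FIG. 5 is a schematic diagram showing the object configuration of the behavior control system 10 in this embodiment.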
【００５６】
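As shown in FIG. 5, the visual recognition function unit 81 consists of three objects: FaceDetector 114, MultiColorTracker 113, and FaceIdentify 115.
【００５７】
FaceDetector 114 is an object that detects face regions in an image frame and outputs the detection result to FaceIdentify 115. MultiColorTracker 113 is an object that performs color recognition and outputs the recognition result to FaceIdentify 115 and ShortTermMemory (STM) 92. FaceIdentify 115 identifies a person by, for example, searching its person dictionary for the detected face image, and outputs the person's ID information to STM 92 together with the position and size of the face image region.
【００５８】
The auditory recognition function unit 82 consists of two objects, AudioRecog 111 and SpeechRecog 112. AudioRecog 111 is an object that receives audio data from a voice input device such as a microphone and performs feature extraction and speech-segment detection; it outputs the features of the audio data in the speech segment and the sound source direction to SpeechRecog 112 and STM 92. SpeechRecog 112 is an object that performs speech recognition using the audio features received from AudioRecog 111 together with a speech dictionary and a syntax dictionary, and outputs the set of recognized words to STM 92.
【００５９】
The contact recognition function unit 83 consists of an object called TactileSensor 119 that recognizes sensor input from the contact sensors, and outputs the recognition result to STM 92 and to InternalStateModel (ISM) 91, the object that manages the internal state.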
【００６０】
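STM 92 is an object that constitutes the short-term memory unit. It is a functional module that holds, for a short period, targets and events recognized from the external environment by the recognition objects described above (for example, it stores input images from the camera 15 for only about 15 seconds), and it periodically notifies (Notify) SBL 102, which is an STM client, of external stimuli.
【００６１】
LTM 93 is an object that constitutes the long-term memory unit and is used to hold, for a long period, information obtained through learning, such as the names of objects. For example, LTM 93 can associatively store changes in internal state with external stimuli in a given behavior description module (schema).
【００６２】
ISM 91 is an object that constitutes the internal state manager. It manages several kinds of emotions, such as instincts and feelings, as mathematical models, and manages the internal state of the robot device 1, such as its instincts and emotions, according to the external stimuli (ES: External Stimuli) recognized by the recognition objects described above.
【００６３】
SBL 102 is an object that constitutes the situated behavior layer. It is a client of STM 92 (an STM client); when it receives the periodic notification (Notify) of information on external stimuli (targets and events) from STM 92, it determines the schema, that is, the behavior description module, to execute (described later).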
【００６４】
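ReflexiveSBL (Situated Behaviors Layer) 103 is an object that constitutes the reflexive behavior unit and executes reflexive, direct body motion in response to external stimuli recognized by the recognition objects described above. For example, it performs behaviors such as tracking a human face, nodding, and instantly avoiding a detected obstacle.
【００６５】
SBL 102 selects actions according to the situation, such as external stimuli and changes in internal state, whereas ReflexiveSBL 103 selects reflexive actions in response to external stimuli. Because these two objects select behaviors independently, the hardware resources of the robot device 1 may conflict when the behavior description modules (schemas) they select are executed on the body, making execution impossible. The object called RM (Resource Manager) 116 arbitrates hardware conflicts arising from behavior selection by SBL 102 and ReflexiveSBL 103. The body is then driven by notifying the objects that realize body motion based on the arbitration result.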
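A minimal sketch of such arbitration is given below. The request format and the rule that the reflexive request wins any conflict are assumptions made for illustration; the text says only that RM 116 arbitrates hardware conflicts between the two selections.

```python
from typing import List, Optional, Set, Tuple

# A request is (behavior_name, set_of_required_resources); both are hypothetical.
Request = Tuple[str, Set[str]]


def arbitrate(sbl_request: Optional[Request],
              reflex_request: Optional[Request]) -> List[str]:
    """Return the behaviors allowed to run on the body.

    Assumed rule: a reflexive request is always granted, and a situated
    request is granted only if it needs none of the same resources.
    """
    granted: List[str] = []
    if reflex_request is not None:
        granted.append(reflex_request[0])
    if sbl_request is not None:
        if reflex_request is None or sbl_request[1].isdisjoint(reflex_request[1]):
            granted.append(sbl_request[0])
    return granted
```

For instance, a situated "talk" that needs only the speaker can run alongside a reflexive "avoid" that needs the legs, while a situated "kick" cannot.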
【００６６】
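SoundPerformer 172, MotionController 173, and LEDController 174 are objects that realize body motion. SoundPerformer 172 is an object for audio output; it performs speech synthesis according to text commands given from SBL 102 via RM 116 and outputs the speech from the speaker on the body of the robot device 1. MotionController 173 is an object for operating each joint actuator on the body; in response to receiving commands to move the hands, legs, and so on from SBL 102 via RM 116, it calculates the corresponding joint angles. LEDController 174 is an object for blinking the LED 19; in response to receiving commands from SBL 102 via RM 116, it drives the LED 19 to blink.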
【００６７】
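(2-1) Situated behavior control
Next, the situated behavior layer is described in more detail. FIG. 6 schematically shows the form of situated behavior control by the situated behavior layer (SBL), here including the reflexive behavior unit. The recognition results (sensor information) 182 of the external environment 70 produced by the functional modules of the recognition system, namely the visual recognition function unit 81, the auditory recognition function unit 82, and the contact recognition function unit 83, are given to the situated behavior layer 102a (which includes the reflexive behavior unit 103) as external stimuli 183. Changes 184 in internal state that follow from the recognition results of the external environment 70 are also given to the situated behavior layer 102a. The situated behavior layer 102a can then judge the situation according to the external stimuli 183 and the changes 184 in internal state and realize behavior selection.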
【００６８】
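FIG. 7 shows a basic operation example of behavior control by the situated behavior layer (SBL) 102a, including the reflexive behavior unit 103, shown in FIG. 6. As shown in the figure, the situated behavior layer 102a calculates the activation level of each behavior description module (schema) from the external stimuli 183 and the changes 184 in internal state, selects schemas according to their activation levels, and executes behaviors (actions). By using, for example, a library 185 to calculate activation levels, a uniform calculation can be performed for all schemas (the same applies below). For example, the schema with the highest activation level may be selected, or two or more schemas whose activation levels exceed a predetermined threshold may be selected and executed in parallel (parallel execution assumes that there is no hardware resource conflict between the schemas).
【００６９】
FIG. 8 shows an operation example in which the situated behavior layer 102a shown in FIG. 6 performs reflexive behavior. In this case, as shown in the figure, the reflexive behavior unit (ReflexiveSBL) 103 included in the situated behavior layer 102a takes the external stimuli 183 recognized by the recognition objects as direct input, calculates activation levels, selects a schema according to the activation levels, and executes the behavior. In this case, the changes 184 in internal state are not used to calculate the activation levels.
【００７０】
FIG. 9 shows an operation example in which the situated behavior layer 102 shown in FIG. 6 expresses emotion. The internal state manager 91 manages emotions such as instincts and feelings as mathematical models and, when the state value of an emotion parameter reaches a predetermined value, notifies (Notify) the situated behavior layer 102 of the change 184 in internal state. The situated behavior layer 102 takes the change 184 in internal state as input, calculates activation levels, selects a schema according to the activation levels, and executes the behavior. In this case, the external stimuli 183 recognized by the recognition objects are used to manage and update the internal state in the internal state manager (ISM) 91 but are not used to calculate the activation levels of the schemas.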
【００７１】
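(2-2) Schemas
FIG. 10 schematically shows how the situated behavior layer 102 is composed of a plurality of schemas 132. The situated behavior layer 102 provides a state machine for each behavior description module, that is, for each schema; depending on preceding behaviors (actions) and situations, it classifies the recognition results of sensor-input external information and expresses actions on the body. Each schema is described as a schema (Schema) 132 that has a Monitor function, which judges the situation according to external stimuli and the internal state, and an Action function, which realizes the state transitions (state machine) that accompany behavior execution.
【００７２】
The situated behavior layer 102b (more strictly, the layer within the situated behavior layer 102 that controls ordinary situated behavior) is configured as a tree structure in which a plurality of schemas 132 are hierarchically connected, and performs behavior control by comprehensively judging which schema 132 is most suitable according to external stimuli and changes in internal state. The tree 300 includes a plurality of subtrees (or branches), such as behavior models that mathematically formulate ethological situated behaviors and subtrees for executing emotional expression.
【００７３】
FIG. 11 schematically shows the tree structure of the schemas in the situated behavior layer 102. As shown in the figure, the situated behavior layer 102 has root schemas 201_1, 202_1, and 203_1, which receive the notification (Notify) of external stimuli from the short-term memory unit 92, at the top, with schemas arranged in each layer so as to proceed from abstract behavior categories toward concrete ones. For example, the layer immediately below the root schemas contains schemas 201_2, 202_2, and 203_2 named "Investigate", "Ingestive", and "Play". Below schema 201_2 "Investigate" are a plurality of schemas 201_3 that describe more concrete investigative behaviors, such as "InvestigativeLocomotion". Similarly, below schema 202_2 "Ingestive" are a plurality of schemas 202_3 that describe more concrete eating and drinking behaviors, such as "Eat" and "Drink", and below schema 203_2 "Play" are a plurality of schemas 203_3 that describe more concrete play behaviors, such as "PlayBowing" and "PlayGreeting".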
【００７４】
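As shown in the figure, each schema takes the external stimuli 183 and (changes in) the internal state 184 as input. Each schema also has at least a Monitor function and an Action function.
【００７５】
Here, the Monitor function is a function that calculates the activation level (Activation Level: AL value) of the schema according to the external stimuli 183 and the internal state 184. In a tree structure such as that shown in FIG. 11, an upper (parent) schema can call the Monitor function of a lower (child) schema with the external stimuli 183 and the internal state 184 as arguments, and the child schema returns its activation level. A schema can in turn call the Monitor functions of its own child schemas in order to calculate its own activation level. Because the activation levels from each subtree are returned to the root schema, the optimum schema, that is, the optimum behavior, for the external stimuli and the change in internal state can be judged in an integrated manner.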
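The recursive Monitor call through the schema tree can be sketched as below. The `Schema` class here is a simplified stand-in, not the class design of the embodiment, and the rule that a parent reports the best result found in its subtree is an assumption, since the text does not say how a parent combines its children's values.

```python
class Schema:
    """Simplified schema node with a Monitor-like evaluation method."""

    def __init__(self, name, own_al=None, children=()):
        self.name = name
        # own_al: callable(stimulus, internal_state) -> float, or None for pure parents.
        self.own_al = own_al
        self.children = list(children)

    def monitor(self, stimulus, internal_state):
        """Return (activation_level, best_schema_name) for this subtree.

        Assumes every leaf provides own_al; a node with neither children nor
        own_al has nothing to evaluate and raises ValueError.
        """
        results = [c.monitor(stimulus, internal_state) for c in self.children]
        if self.own_al is not None:
            results.append((self.own_al(stimulus, internal_state), self.name))
        return max(results, key=lambda r: r[0])
```

Calling `monitor` on the root evaluates the children first and the parent afterwards, so the root ends up with the highest activation level in the tree and the schema that produced it.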
【００７６】
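For example, the schema with the highest activation level may be selected, or two or more schemas whose activation levels exceed a predetermined threshold may be selected and executed in parallel (parallel execution assumes that there is no hardware resource conflict between the schemas).
【００７７】
The Action function has a state machine that describes the behavior owned by the schema itself. In a tree structure such as that shown in FIG. 11, a parent schema can call the Action function to start or interrupt the execution of a child schema. In this embodiment, the state machine of an Action is not initialized unless it becomes Ready. In other words, the state is not reset when execution is interrupted, and the schema keeps its working data from execution, so an interrupted behavior can be resumed.
【００７８】
FIG. 12 schematically shows the mechanism for controlling ordinary situated behavior in the situated behavior layer 102.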
【００７９】
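As shown in the figure, the situated behavior layer (SBL) 102 receives external stimuli 183 as input (Notify) from the short-term memory unit (STM) 92 and changes 184 in internal state from the internal state manager 91. The situated behavior layer 102 consists of a plurality of subtrees, such as behavior models that mathematically formulate ethological situated behaviors and subtrees for executing emotional expression. In response to the notification (Notify) of external stimuli 183, the root schema calls the Monitor function of each subtree, refers to the activation level (AL) values returned, performs integrated behavior selection, and calls the Action function of the subtree that realizes the selected behavior. The situated behavior decided in the situated behavior layer 102 is applied to body motion (MotionController) after the resource manager RM 116 arbitrates hardware resource conflicts with the reflexive behavior of the reflexive behavior unit 103.
【００８０】
Within the situated behavior layer 102, the reflexive behavior unit 103 executes reflexive, direct body motion, such as instantly avoiding a detected obstacle, in response to the external stimuli 183 recognized by the recognition objects described above. For this reason, unlike the case of controlling ordinary situated behavior shown in FIG. 11, a plurality of schemas 132 that directly receive signals from the recognition objects are arranged in parallel rather than hierarchically, as shown in FIG. 10.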
【００８１】
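FIG. 13 schematically shows the configuration of the schemas in the reflexive behavior unit 103. As shown in the figure, the reflexive behavior unit 103 contains AvoidBigSound 204, FacetoBigSound 205, and NoddingSound 209 as schemas that operate in response to auditory recognition results, FacetoMovingObject 206 and AvoidMovingObject 207 as schemas that operate in response to visual recognition results, and "withdraw hand" 208 as a schema that operates in response to tactile recognition results, each arranged on an equal footing (in parallel).
【００８２】
As shown in the figure, each schema that performs reflexive behavior takes the external stimuli 183 as input. Each schema also has at least a Monitor function and an Action function. The Monitor function calculates the activation level of the schema according to the external stimuli 183, and this value is used to judge whether the corresponding reflexive behavior should be expressed. The Action function has a state machine (described later) that describes the reflexive behavior owned by the schema itself; when called, it expresses the corresponding reflexive behavior and advances the state of the Action.
【００８３】
FIG. 14 schematically shows the mechanism for controlling reflexive behavior in the reflexive behavior unit 103. As also shown in FIG. 13, schemas that describe reactive behaviors and schemas that describe immediate response behaviors exist in parallel within the reflexive behavior unit 103. When a recognition result is input from one of the objects that make up the recognition function module 80, the corresponding reflexive behavior schema calculates its activation level with its Monitor function, and that value is used to judge whether its Action should be started. The reflexive behavior whose start has been decided in the reflexive behavior unit 103 is applied to body motion (MotionController 173) after the resource manager RM 116 arbitrates hardware resource conflicts with the situated behavior of the situated behavior layer 102.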
【００８４】
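The schemas that make up the situated behavior layer 102 and the reflexive behavior unit 103 can be written as "class objects" based on, for example, the C++ language. FIG. 15 schematically shows the class definitions of the schemas used in the situated behavior layer 102. Each block shown in the figure corresponds to one class object.
【００８５】
As shown in the figure, the situated behavior layer (SBL) 102 has one or more schemas, an EventDataHandler (EDH) 211 that assigns IDs to input/output events of SBL 102, a SchemaHandler (SH) 212 that manages the schemas in SBL 102, one or more ReceiveDataHandlers (RDH) 213 that receive data from external objects (STM, LTM, the resource manager, the recognition objects, and so on), and one or more SendDataHandlers (SDH) 214 that send data to external objects.
【００８６】
SchemaHandler 212 stores, as a file, information such as the schemas and tree structure that make up the situated behavior layer (SBL) 102 and the reflexive behavior unit 103 (the SBL configuration information). For example, at system startup, SchemaHandler 212 reads this configuration information file, constructs (reproduces) the schema configuration of the situated behavior layer 102 as shown in FIG. 11, and maps the entity of each schema into memory space.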
【００８７】
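Each schema has OpenR_Guest 215, which is positioned as the base of a schema. OpenR_Guest 215 has one or more of each of two class objects: DSubject 216, through which the schema sends data to the outside, and DObject 217, through which the schema receives data from the outside. For example, when a schema sends data to an object external to SBL 102 (STM, LTM, the recognition objects, and so on), DSubject 216 writes the data to be sent to SendDataHandler 214. DObject 217 can read data received from objects external to SBL 102 from ReceiveDataHandler 213.
【００８８】
SchemaManagerBase 218 and SchemaBase 219 are both class objects that inherit from OpenR_Guest 215. Class inheritance means taking over the definition of the original class; here it means that SchemaManagerBase 218 and SchemaBase 219 also have the class objects defined in OpenR_Guest 215, such as DSubject 216 and DObject 217 (the same applies below). For example, when a plurality of schemas form a tree structure as shown in FIG. 11, SchemaManagerBase 218 has a class object SchemaList 220 that manages the list of child schemas (it holds pointers to the child schemas) and can call the functions of the child schemas. SchemaBase 219 holds a pointer to the parent schema and can return the return value of a function called by the parent schema.
【００８９】
SchemaBase 219 has two class objects, StateMachine 221 and Pronome 222. StateMachine 221 manages the state machine for the behavior (Action function) of the schema. A parent schema can switch (cause a state transition in) the state machine of a child schema's Action function. Pronome 222 is assigned the target on which the schema executes or applies its behavior (Action function). As described later, a schema is occupied by the target assigned to Pronome 222 and is not released until the behavior (action) ends (completes, terminates abnormally, and so on). To execute the same behavior for a new target, a schema with the same class definition is generated in memory space. As a result, the same schema can be executed independently for each target (the working data of the individual schemas do not interfere with one another), and the reentrancy of the behavior is ensured (described later).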
【００９０】
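ParentSchemaBase 223 is a class object that multiply inherits from SchemaManagerBase 218 and SchemaBase 219; in the schema tree structure, it manages the parent schema and child schemas of the schema itself, that is, its parent-child relationships.
【００９１】
IntermediateParentSchemaBase 224 is a class object that inherits from ParentSchemaBase 223 and realizes interface conversion for each class. IntermediateParentSchemaBase 224 also has SchemaStatusInfo 225, a class object that manages the state machine of the schema itself. A parent schema can switch the state of that state machine by calling the Action function of a child schema. It can also call the Monitor function of a child schema to ask for the activation level that corresponds to the state of the state machine. Note, however, that the state machine of a schema is different from the state machine of the Action function described earlier.
【００９２】
AndParentSchema 226, NumOrParentSchema 227, and OrParentSchema 228 are class objects that inherit from IntermediateParentSchemaBase 224. AndParentSchema 226 holds pointers to a plurality of child schemas that are executed simultaneously. OrParentSchema 228 holds pointers to a plurality of child schemas of which one is executed alternatively. NumOrParentSchema 227 holds pointers to a plurality of child schemas of which only a predetermined number are executed simultaneously.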
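The difference between the three parent types can be illustrated with a small helper that decides which children run. The function name, the string tags for the parent kind, and the use of activation levels to rank the children are assumptions; the text specifies only how many children each parent type executes.

```python
from typing import Dict, List


def children_to_run(kind: str, child_als: Dict[str, float], n: int = 1) -> List[str]:
    """Return the names of the child schemas a parent of the given kind would run.

    kind: "and" (all children), "or" (exactly one), "num_or" (at most n).
    child_als: child name -> activation level, used here only for ranking.
    """
    ranked = sorted(child_als, key=child_als.get, reverse=True)
    if kind == "and":
        return list(child_als)   # all children execute simultaneously
    if kind == "or":
        return ranked[:1]        # one child is chosen
    if kind == "num_or":
        return ranked[:n]        # only a predetermined number execute
    raise ValueError(f"unknown parent kind: {kind}")
```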
【００９３】
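ParentSchema 229 is a class object that multiply inherits from AndParentSchema 226, NumOrParentSchema 227, and OrParentSchema 228.
【００９４】
FIG. 16 schematically shows the functional configuration of the classes in the situated behavior layer (SBL) 102. The situated behavior layer (SBL) 102 has one or more ReceiveDataHandlers (RDH) 213 that receive data from external objects such as STM, LTM, the resource manager, and the recognition objects, and one or more SendDataHandlers (SDH) 214 that send data to external objects.
【００９５】
EventDataHandler (EDH) 211 is a class object for assigning IDs to the input/output events of SBL 102, and receives notification of input/output events from RDH 213 and SDH 214.
【００９６】
SchemaHandler 212 is a class object for managing the schemas 132 and stores the configuration information of the schemas that make up SBL 102 as a file. For example, at system startup, SchemaHandler 212 reads this configuration information file and constructs the schema configuration in SBL 102.
【００９７】
Each schema is generated according to the class definitions shown in FIG. 15, and its entity is mapped into memory space. Each schema has OpenR_Guest 215 as its base class object and has class objects such as DSubject 216 and DObject 217 for accessing external data.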
【００９８】
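The main functions and state machines of a schema 132 are listed below. These functions are described in SchemaBase 219.
ActivationMonitor(): evaluation function for a schema in the Ready state to become Active
Actions(): state machine for execution in the Active state
Goal(): function that evaluates whether the schema has reached its Goal in the Active state
Fail(): function that determines whether the schema is in a fail state in the Active state
SleepActions(): state machine executed before Sleep
SleepMonitor(): evaluation function for resuming (Resume) from the Sleep state
ResumeActions(): state machine executed before Resume in order to resume
DestroyMonitor(): evaluation function that determines whether the schema is in a fail state in the Sleep state
MakePronome(): function that determines the target for the whole tree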
【００９９】
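(2-3) Functions of the situated behavior layer
The situated behavior layer (SBL) 102 controls actions suited to the situation in which the robot device 1 currently finds itself, based on the contents of the short-term memory unit 92 and the long-term memory unit 93 and on the internal state managed by the internal state manager 91.
【０１００】
As described in the previous section, the situated behavior layer 102 in this embodiment is configured as a tree structure of schemas (see FIG. 11). Each schema remains independent while knowing the information of its own children and parent. With this schema configuration, the situated behavior layer 102 has four main features: Concurrent evaluation, Concurrent execution, Preemption, and Reentrant. These features are explained in detail below.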
【０１０１】
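(2-3-1) Concurrent evaluation:
As already stated, a schema as a behavior description module has a Monitor function that judges the situation according to external stimuli and changes in internal state. The Monitor function is implemented by the schema having a Monitor function in the class object SchemaBase. The Monitor function calculates the activation level of the schema according to the external stimuli and the internal state.
【０１０２】
In a tree structure such as that shown in FIG. 11, an upper (parent) schema can call the Monitor function of a lower (child) schema with the external stimuli 183 and the changes 184 in internal state as arguments, and the child schema returns its activation level. A schema can in turn call the Monitor functions of its own child schemas in order to calculate its own activation level. Because the activation levels from each subtree are returned to the root schemas 201_1 to 203_1, the optimum schema, that is, the optimum action, for the external stimuli 183 and the changes 184 in internal state can be judged in an integrated manner.
【０１０３】
Because of this tree structure, the evaluation of each schema based on the external stimuli 183 and the changes 184 in internal state is first performed concurrently from the bottom of the tree toward the top. That is, when a schema has child schemas, it calls the Monitor functions of the selected children and then executes its own Monitor function. Next, execution permission, as the result of the evaluation, is passed from the top of the tree toward the bottom. Evaluation and execution are performed while resolving conflicts over the resources that the actions use.
【０１０４】
Because the situated behavior layer 102 in this embodiment can evaluate behaviors in parallel by using the tree structure of the schemas, it can adapt to situations such as the external stimuli 183 and the changes 184 in internal state. In addition, the whole tree is evaluated at evaluation time and the tree is changed by the activation level (AL) values calculated then, so the schemas, that is, the actions to be executed, can be prioritized dynamically.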
【０１０５】
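(2-3-2) Concurrent execution:
Because the activation levels from each subtree are returned to the root schema, the optimum schema, that is, the optimum action, for the external stimuli 183 and the changes 184 in internal state can be judged in an integrated manner. For example, the schema with the highest activation level may be selected, or two or more schemas whose activation levels exceed a predetermined threshold may be selected and executed in parallel (parallel execution assumes that there is no hardware resource conflict between the schemas).
【０１０６】
A schema that has received execution permission is executed. That is, the schema observes the external stimuli 183 and the changes 184 in internal state in more detail and executes commands. Execution proceeds sequentially, that is, concurrently, from the top of the tree toward the bottom. When a schema has child schemas, it executes the Actions functions of its children.
【０１０７】
The Action function has a state machine that describes the behavior (action) owned by the schema itself. In a tree structure such as that shown in FIG. 11, a parent schema can call the Action function to start or interrupt the execution of a child schema.
【０１０８】
By using the tree structure of the schemas, the situated behavior layer (SBL) 102 in this embodiment can simultaneously execute other schemas that use the remaining resources when resources do not conflict. However, unless limits are placed on the resources used up to the Goal, incoherent behavior may appear. The situated behavior decided in the situated behavior layer 102 is applied to body motion (MotionController) after the resource manager arbitrates hardware resource conflicts with the reflexive behavior of the reflexive behavior unit (ReflexiveSBL) 103.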
【０１０９】
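(2-3-3) Preemption:
Even after a schema has been put into execution, if there is a more important (higher-priority) behavior, the schema must be interrupted and the execution right handed over. When the more important behavior ends (completes, is aborted, and so on), the original schema must also be resumed and its execution continued.
【０１１０】
Executing tasks according to priority in this way resembles the function called Preemption in a computer operating system (OS). An OS follows the policy of executing tasks in descending order of priority at the timings when scheduling is considered.
【０１１１】
In contrast, the control system 10 of the robot device 1 in this embodiment spans a plurality of objects, so arbitration between objects is required. For example, the reflexive behavior unit 103, the object that controls reflexive behavior, must avoid objects and keep balance without regard to the behavior evaluation of the situated behavior layer 102, the upper-level object that controls situated behavior. In doing so it actually takes the execution right and executes. The upper-level behavior description module (SBL) is notified that the execution right has been taken, and the upper level retains its preemptive capability by handling that notification.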
【０１１２】
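Suppose also that, within the situated behavior layer 102, a schema is given execution permission as a result of evaluating activation levels based on the external stimuli 183 and the changes 184 in internal state, and that a later evaluation of activation levels based on the external stimuli 183 and the changes 184 in internal state makes another schema more important. In such a case, preemptive switching of behavior can be performed by using the Actions function of the running schema to put it into the Sleep state and interrupt it.
【０１１３】
The state of Actions() of the running schema is saved, and Actions() of a different schema is executed. After Actions() of the different schema has finished, Actions() of the interrupted schema can be executed again.
【０１１４】
In addition, SleepActions() is executed after Actions() of the running schema is interrupted and before the execution right moves to a different schema. For example, when the robot device 1 finds a soccer ball during a conversation, it can say "Wait a moment" and then play soccer.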
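The interrupt-and-resume behavior described above can be sketched as follows. The class and method names loosely mirror Actions() and SleepActions() but are a simplified stand-in, and the step counter is a hypothetical placeholder for the saved state of the Actions() state machine.

```python
class Schema:
    """Toy schema whose Actions() progress is kept across interruptions."""

    def __init__(self, name, sleep_line=None):
        self.name = name
        self.state = "ready"
        self.step = 0                 # stands in for the saved Actions() state
        self.sleep_line = sleep_line  # what to say before going to sleep

    def actions(self, out):
        self.state = "active"
        self.step += 1                # continues from where it left off
        out.append(f"{self.name}:{self.step}")

    def sleep_actions(self, out):
        if self.sleep_line:
            out.append(self.sleep_line)
        self.state = "sleep"          # step is deliberately not reset


def preempt(running, other, out):
    """Run SleepActions() on the running schema, then hand execution to the other."""
    running.sleep_actions(out)
    other.actions(out)
```

Because `step` survives the sleep, calling `actions` on the interrupted schema afterwards continues from step 2 rather than restarting.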
【０１１５】
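(2-3-4) Reentrant:
Each schema that makes up the situated behavior layer 102 is a kind of subroutine. When a schema is called from a plurality of parents, it needs a storage space for each parent in order to store its internal state.
【０１１６】
This resembles the reentrancy of an OS in the computer world and is called the reentrancy of schemas in this specification. As shown in FIG. 16, a schema 132 is composed of class objects, and reentrancy is realized by generating an entity, that is, an instance, of the class object for each target (Pronome).
【０１１７】
The reentrancy of schemas is explained more concretely with reference to FIG. 17. SchemaHandler 212 is a class object for managing schemas and stores the configuration information of the schemas that make up SBL 102 as a file. At system startup, SchemaHandler 212 reads this configuration information file and constructs the schema configuration in SBL 102. In the example shown in FIG. 17, the entities of schemas that define behaviors (actions) such as Eat 221 and Dialog 222 are assumed to be mapped into memory space.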
【０１１８】
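Suppose that, through evaluation of activation levels based on the external stimuli 183 and the changes 184 in internal state, a target (Pronome) A is set for the schema Dialog 222, and Dialog 222 begins a conversation with person A.
【０１１９】
Suppose then that person B interrupts the conversation between the robot device 1 and person A, and that a subsequent evaluation of activation levels based on the external stimuli 183 and the changes 184 in internal state gives the schema 223 for conversing with B a higher priority.
【０１２０】
In such a case, SchemaHandler 212 maps into memory space another Dialog entity (instance), derived by class inheritance, for conversing with B. Because the conversation with B is carried out using this other Dialog entity, independently of the earlier Dialog entity, the content of the conversation with A is not destroyed. Dialog A can therefore keep its data consistent, and when the conversation with B ends, the conversation with A can be resumed from the point at which it was interrupted.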
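The per-target instancing that gives schemas their reentrancy might be sketched like this. `Dialog` and `SchemaHandler` here are simplified stand-ins for the class objects in the text, and the dictionary keyed by (schema class, target) is an assumed bookkeeping detail.

```python
class Dialog:
    """Toy dialog schema holding conversation data for one target (Pronome)."""

    def __init__(self, target):
        self.target = target
        self.history = []

    def say(self, text):
        self.history.append(text)


class SchemaHandler:
    """Creates one schema instance per (schema class, target) pair."""

    def __init__(self):
        self._instances = {}

    def instance_for(self, schema_cls, target):
        key = (schema_cls, target)
        if key not in self._instances:
            self._instances[key] = schema_cls(target)
        return self._instances[key]
```

Since the instance for A is untouched while B's instance runs, returning to A finds its history exactly as it was left.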
【０１２１】
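Schemas in the Ready list are evaluated, that is, their activation levels are calculated, according to their object (external stimuli 183), and the execution right is handed over. After that, an instance of the schema that has moved into the Ready list is generated, and evaluation is performed for other objects. In this way, the same schema can be put into the active or sleep state.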
【０１２２】
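(3) Application of the present invention to the robot device
Next, the robot device of this embodiment, which has negative desires such as "I don't want to do it" and can express them, is described in detail. The robot device of this embodiment selects the optimum behavior from its own internal state and the external situation. As the internal state used in this behavior selection, it has not only positive desire values such as "I want to do it" but also negative desire values such as "I don't want to do it", and it decides on that basis whether to express a designated behavior. Conventional robot devices expressed only the behaviors they wanted to perform and did not take into account the negative desire (negative internal state) of not wanting to do something. By holding negative desire values as part of its internal state, the present robot device avoids always acting exactly as instructed and offers a much wider variety of expressed actions.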
【０１２３】
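(3-1) Situated behavior layer (Situated Behavior Layer: SBL)
As described above, the robot device has the SBL as the algorithm that makes behavior decisions in consideration of information from inside and outside itself. The SBL is configured as a tree structure of a plurality of behavior description modules (schemas), each a unit with its own independent meaning and function, ranging from units with an abstract meaning, such as dancing, to units with a concrete meaning, such as outputting motion commands that specify body motion, for example the rotation angle of an actuator. Each schema has an internal state machine that describes a sequence of behaviors. It makes state transitions using short-term memory information obtained from sensor information about the external environment, the internal state obtained by evaluating its own body information, and long-term memory information obtained by storing past short-term memory and internal-state experience in association with each other; while doing so, it judges the situation according to external stimuli and changes in internal state, and behaviors are generated (selected).
【０１２４】
Normally, an execution priority (activity level, or activation level: Activation Level) that indicates how much each schema wants to be executed is calculated from both the external information (external stimuli) input from external input devices such as the various sensors (the state recognition unit) and the internal information of the robot device (internal state parameters obtained from the emotion and instinct model, which calculates the robot's own internal state parameters and emotion parameters), that is, the degree of satisfaction of the robot's primary emotions (instincts) and the values of the secondary emotions (feelings) that change accordingly. Which schema to execute is then decided (selected). In this way, what behavior to actually perform is judged autonomously according to the external input (external stimuli) and the internal state, and the behavior is executed using the body of the robot device or expression means such as the speaker or the LEDs.
【０１２５】
This activation level is calculated from a Release Value (RV), a value representing a first desire that indicates whether the robot device is able to express the action in the current situation, and a desire value (Motivation Value: MV), a value representing a second desire that indicates whether the robot device itself wants to perform the action.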
【０１２６】
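The Release Value (RV) is calculated by, for example, adding together values based on stimuli from outside, the physical external information of an object if there is one (the presence or absence of the object, the distance to it, its color and shape, and so on), and memories from each memory unit. For example, if the schema for kicking a ball cannot recognize a ball with the camera or the like at that moment, it determines that the action cannot be expressed, and the value becomes small.
【０１２７】
The desire value MV is calculated from the internal state of the robot, that is, from the instinct (desire) values and feeling (emotion) values calculated in the instinct and emotion model. For example, for the schema for kicking a ball, the desire to kick the ball grows, and the value becomes large, when the battery is sufficiently charged or when a ball of a favorite color is found. As described above, the emotion model of the robot device holds, for a total of six emotions, for example "Joy", "Sadness", "Anger", "Surprise", "Disgust", and "Fear", a parameter representing the strength of each emotion; the instinct model holds, for four mutually independent desires, "exercise", "affection", "appetite", and "curiosity", a parameter representing the strength of each desire; and the desire value MV is calculated from these values. The internal state manager 91 shown in FIG. 4 takes as input the external stimuli and information such as the remaining battery level and the rotation angles of the motors, and calculates and manages the values corresponding to this plurality of internal states (the internal state vector).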
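The text states that the activation level is calculated from RV and MV but does not give the formula at this point. The sketch below uses a weighted sum purely as an assumed example; the weight `beta`, the function name, and the value ranges are not from the source.

```python
def activation_level(mv: float, rv: float, beta: float = 0.7) -> float:
    """Combine the desire value MV and the Release Value RV into one level.

    Assumed form: AL = beta * MV + (1 - beta) * RV, with 0 <= beta <= 1.
    A negative MV can therefore pull AL below zero even when RV is positive.
    """
    return beta * mv + (1.0 - beta) * rv
```

Under this assumed form, a strongly negative MV outweighs a favorable RV, which matches the later observation that a negative desire value tends to produce a negative activation level.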
【０１２８】
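(3-2) Operation of the robot device and its effect
When selecting a schema, the behavior (schema) to be executed is chosen according to the magnitude of the activation level that each schema holds. Each schema calculates its own activation level from external information and the internal state, and actions are expressed autonomously on that basis. On the other hand, the robot device is also set up to express actions heteronomously, that is, in obedience to commands, separately from autonomous actions; for example, when the user says "do such-and-such", it executes the corresponding schema.
【０１２９】
For example, to make the robot follow such user instructions, the schemas of the robot device are configured so that the activation level of the schema corresponding to the user's instruction is forcibly raised (added to), for example by the DeliberativeSBL described later, making that schema more likely to be selected. With such a method, the robot device executes the designated behavior regardless of the activation level that each of its schemas calculates from external information and the internal state.
【０１３０】
In this embodiment, therefore, a strategy is introduced that takes the activation level calculated by each schema of the robot device into account even in such cases, giving variety to the actions the robot device expresses.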
【０１３１】
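When activation levels are calculated, normally only positive desire values in the emotion and instinct model described above are considered, and the activation level is calculated according to the magnitude of these positive desire values. The activation level therefore indicates how much the robot wants to perform each behavior. A method is then used such as selecting the schema the robot most wants to perform, that is, the schema with the largest activation level. Also, when the user designates a behavior, for example, the robot is configured to execute it forcibly, either by selecting the designated schema regardless of activation level or by raising the activation level of the designated schema, for example by adding a predetermined value, so that it is selected.
【０１３２】
In contrast, in this embodiment the robot device is given, depending on its internal state, a negative desire value MV that indicates, for example, "I don't want to do it". Accordingly, the activation level obtained from a negative desire value MV and the Release Value (RV) may also be negative. If the activation level can thus indicate not only positive desires for a behavior but also negative desires such as not wanting to do it, and if this can be reflected in behavior, the behavior becomes more humanlike.
【０１３３】
For example, suppose the robot device is instructed from outside to perform a predetermined action, that is, instructed to act forcibly regardless of the activation level it has calculated itself. If the activation level of the instructed schema is negative, the robot device can express an action that shows the user it does not currently want to express that behavior, and it can further be given the function of refusing the instruction, that is, of not expressing the designated behavior. When the activation level is negative, the robot can also vary its refusal according to the magnitude, telling the user, for example, "Absolutely not", "I don't really want to", or "I'm not in the mood".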
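Choosing a refusal phrase from the magnitude of a negative activation level could be sketched as below. The threshold values and the function name are hypothetical; the three phrases are English renderings of the examples in the text.

```python
from typing import Optional


def refusal_phrase(al: float) -> Optional[str]:
    """Return a refusal phrase for a negative activation level, else None.

    The thresholds -70 and -30 are illustrative assumptions.
    """
    if al >= 0:
        return None                      # no negative desire, nothing to refuse
    if al <= -70:
        return "Absolutely not!"
    if al <= -30:
        return "I don't really want to."
    return "I'm not in the mood."
```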
【０１３４】
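Even when the activation level is negative, the robot can give more humanlike responses, such as saying "Oh, all right" and carrying out the action reluctantly when asked repeatedly. Likewise, when the activation level is positive, the response can also be varied according to its magnitude.
【０１３５】
Furthermore, if each schema describes both a behavior corresponding to a positive desire and a behavior that negates it, and if the schema is selected and outputs the behavior corresponding to the negative desire when the negative desire indicated by the activation level falls below a predetermined threshold, the robot device can tell the user that it currently has a negative desire.
【０１３６】
By making the activation level indicate negative as well as positive desires in this way, the robot device gains a wider variety of behaviors and expresses more humanlike behavior. As one example of the variety added by having negative desires, the case in which the user says "Kick the ball!" as the predetermined external stimulus and the schema for kicking a ball is selected is described concretely here. FIG. 18 and FIG. 20 are schematic diagrams showing the activation levels of the schemas in the schema tree when the robot device does not, and when it does, express the behavior in accordance with an instruction from outside.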
【０１３７】
先ず、指定されたスキーマのアクティベーションレベルが負であって、ロボット装置自身（スキーマ）は、その行動をやりたがっていない場合について説明する。ロボット装置がユーザに「ボールを蹴って！」と言われた場合、例えば、ユーザからの命令を解釈する後述する音声解釈スキーマが起動してその意味を把握し、このスキーマからの指令により図１８に示すように、ボールを蹴るスキーマ（スキーマＡ）を起動させるものとする。この際、スキーマＡは、自身のアクティベーションレベルを算出し、算出したアクティベーションレベルを用いて、自身がどれだけやりたいかを把握し、その値に応じて指示に従うか否かを判断する。
【０１３８】
ここでボールを蹴るスキーマＡのアクティベーションレベルは、上述したようにスキーマＡに定義されている１以上の内部状態が満たす現在の内部状態に基づき得られる欲求値ＭＶと、例えばボールを認識した等の外部刺激に基づき得られるＲｅｌｅａｓｅＶａｌｕｅ（ＲＶ）によって求めることができる。
【０１３９】
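A function such as that shown in FIG. 19 can be used to calculate the desire value MV. FIG. 19 is a graph showing an example of the relationship between the internal state and the desire value MV. The desire value MV, one of the elements from which the action value AL is calculated, is obtained as a desire value vector InsV (Instinct Variable) corresponding to the several internal states defined in each schema. For example, the schema that outputs the "kick the ball" action defines an internal state vector IntV {IntV_NOURISHMENT ("nourishment"), IntV_FATIGUE ("fatigue")}, from which the desire value vector InsV {InsV_NOURISHMENT, InsV_FATIGUE} is obtained as the desire value MV for that action. A predetermined function can be prepared for each internal state and its associated action: for example, a function in which the desire value MV for "kick the ball" increases as the "nourishment" value increases, or one in which the desire value MV decreases as the "fatigue" value increases and becomes negative once "fatigue" reaches a predetermined level.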
【０１４０】
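Specifically, a function such as that given by equation (1) below and shown in FIG. 19 can be used. FIG. 19 plots each component of the internal state vector IntV on the horizontal axis and each component of the desire value vector InsV on the vertical axis, showing the relationship between the internal state and the desire value MV given by equation (1).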
【０１４１】
【数１】

【０１４２】
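As equation (1) and FIG. 19 show, the desire value vector InsV is determined solely by the value of the internal state vector IntV. The function shown here takes an internal state in the range 0 to 100 and yields a desire value in the range −1 to 1. For example, by setting an internal state/desire value curve L1 that is a positive, increasing function for internal states from 0 to 80, gives a desire value MV of 0 at an internal state of 80, and becomes a negative, decreasing function as the internal state is filled further, the robot device comes to have a desire value MV that always works to keep the internal state at 80%.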
【０１４３】
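By varying the constants A to F in equation (1), a different desire value MV can be obtained for each internal state. For example, the desire value MV may vary from 1 to 0 as the internal state goes from 0 to 100, or an internal state/desire value function different from equation (1) may be prepared for each internal state. Although the case described here is one in which a negative desire value MV arises once the internal state exceeds 80%, depending on the internal state a function may be set so that the desire value MV never becomes negative, or so that it is always negative. For example, if the internal state "nourishment" is based on the remaining battery charge and a full charge is always preferable, a function whose desire value MV never becomes negative should be set.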
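To make the sign behavior of such a curve concrete, the following sketch uses a piecewise-linear stand-in. It is not equation (1), whose form and constants A to F are not reproduced in this text. Only three properties are taken from the description: the input range 0 to 100, the zero crossing at 80, and the value −1 at 100. The linear shape, the value +1 at state 0, and the function name `desire_value` are assumptions made for illustration.

```python
def desire_value(internal_state: float) -> float:
    """Piecewise-linear stand-in for an internal-state/desire-value curve.

    Assumed shape (NOT equation (1)): +1 at state 0, 0 at state 80,
    -1 at state 100, linear in between.
    """
    if not 0.0 <= internal_state <= 100.0:
        raise ValueError("internal state is defined on 0..100")
    set_point = 80.0  # state the robot tries to maintain (taken from the text)
    if internal_state <= set_point:
        # Below the set point the desire is positive: the robot wants to raise the state.
        return (set_point - internal_state) / set_point
    # Above the set point the desire is negative: the robot does not want to act.
    return -(internal_state - set_point) / (100.0 - set_point)
```

A curve that must never go negative, as suggested for a battery-based "nourishment" state, could be obtained by moving the assumed set point to 100.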
【０１４４】
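With a function such as that in FIG. 19, when the internal state is 100 the desire value MV is negative (−1), and the activation level calculated from this negative desire value and the ReleaseValue (RV) is, with high probability, negative as shown in FIG. 18. Even so, when the user instructs that schema A be executed, schema A is selected, for example regardless of its activation level.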
【０１４５】
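When schema A with a negative activation level is designated in this way, schema A outputs, instead of the behavior described in it, a compensatory behavior that tells the user it does not want to perform that behavior.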
【０１４６】
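The same applies when the voice interpretation schema of the DeliberativeSBL, described later, adds a predetermined value to the activation level of the designated schema and the activation level of schema A is still negative after the addition, as in FIG. 18. Schema A, designated by the voice interpretation schema, is selected once without being compared with the activation levels of other schemas, but because its activation level is negative it outputs, as described above, a compensatory behavior conveying that it does not want to perform the behavior, instead of the behavior described in it.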
【０１４７】
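In FIG. 18, schema A has an activation level (AL) of −30; it produces speech telling the user of its refusal, "Aww, I don't want to", expressing that it has the negative desire of not wanting to act.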
【０１４８】
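Normally the schema with the highest activation level, calculated from external stimuli and changes in internal state, is selected and its behavior expressed. Where a system is introduced in which a schema is selected forcibly regardless of activation level, for example on the user's instruction, the selected schema calculates its own activation level from external stimuli and changes in internal state; in the present embodiment, if this value is negative, the schema judges that it indicates a desire not to act (a negative desire) and expresses a compensatory behavior according to the negative value.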
【０１４９】
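Several such compensatory behaviors can be prepared according to the negative value of the activation level. For example, at an activation level of −50 the schema may give a negative expression such as "I don't want to" and not output the behavior described in schema A that the user designated, whereas at −30, as in the example above, it may show a gesture of dislike while still outputting the behavior described in schema A; the behavior can thus be selected according to the activation level.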
【０１５０】
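To make the robot act differently according to the activation level in this way, several schemas that serve as compensation for the behavior described in schema A can be prepared according to the negative magnitude of the activation level; the compensation schema corresponding to the activation level is then called and outputs the compensatory behavior described in it.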
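The graded response described above can be sketched as a lookup on the activation level. The two sample points (−30: express dislike but perform, −50: refuse) come from the text; the boundary `REFUSE_BELOW = -40.0`, the `Response` type, and the phrases are hypothetical choices for illustration only.

```python
from typing import NamedTuple


class Response(NamedTuple):
    utterance: str        # what the robot says or expresses
    perform_action: bool  # whether the designated behavior is output


# Hypothetical boundary between reluctant compliance and outright refusal.
REFUSE_BELOW = -40.0


def respond_to_command(al: float) -> Response:
    """Choose a compensatory response from the schema's activation level."""
    if al >= 0:
        return Response("Okay, got it!", True)
    if al > REFUSE_BELOW:
        # Mildly negative: show dislike but still perform the behavior.
        return Response("Aww, I don't really feel like it...", True)
    # Strongly negative: express refusal and do not perform the behavior.
    return Response("I don't want to.", False)
```

In a fuller design each branch would call a separate compensation schema rather than return a string; the single function here only shows the thresholding.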
【０１５１】
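In FIG. 20, on the other hand, when the internal state is below 80 the desire value MV takes a large (positive) value so as to raise the internal state. When the desire value MV is positive, the activation level of schema A calculated from the desire value MV and the ReleaseValue (RV) is, as described above, positive with high probability. When the activation level is positive and the schema is selected by the predetermined external stimulus, or when the value after the predetermined activation level has been added by that stimulus is positive, the robot device is regarded as wanting to kick the ball, and schema A outputs the ball-kicking behavior.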
【０１５２】
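That is, when the activation level is positive and large, the robot can kick the ball together with an affirmative reply such as "OK, got it!", showing that it too wanted to perform the behavior the user instructed. Here too the degree of desire can be expressed in the same way, for example by preparing compensation schemas that use expression means such as the speaker and LEDs to show the user how much the robot wants to act, and calling the one that matches the AL.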
【０１５３】
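In the present embodiment, when the robot device itself does not want to perform the designated schema it can say so, and variations become possible such as refusing the designated behavior or, when instructed repeatedly after refusing, "doing it reluctantly". When something it very much wants to do is designated, it can also "do it gladly"; all of this helps make the robot device appear more intelligent.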
【０１５４】
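The curve in FIG. 19 described above yields desire values that drive the internal state toward a predetermined value (= 80), but a curve may instead yield desire values that keep the internal state within a certain range rather than at a single value.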
【０１５５】
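(4) Specific example
Next, a specific example is described of how an autonomously operating robot device, when instructed by the user, selects the corresponding schema and expresses the action, that is, of how the robot device is made to act heteronomously.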
【０１５６】
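When the SBL is used as the behavior control means of an actual robot device, several SBLs are prepared according to the role of the schema tree each holds. Specifically, these are: the NormalSBL, which calculates the activation level of each schema from external stimuli and internal states, lets the schemas compete, decides the behavior autonomously, and issues commands for behavior output; the DeliberativeSBL, which externally sets activation levels for specific NormalSBL schemas so that they generate behavior, in order to combine the functions of NormalSBL schemas into a fixed behavior or behavior sequence; the SystemSBL, which monitors abnormal states such as a drop in supply voltage or a fall and performs avoidance behavior with priority over the other SBLs; and the ReflexiveSBL, which produces reflexive behavior in response to sudden changes in sound pressure (volume) at the auditory sensor or in image information (brightness) at the visual sensor.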
【０１５７】
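In behavior selection by this SBL algorithm, a schema, which is a behavior description module, normally calculates an activation level that defines the priority of its own behavior on the basis of the internal state obtained from the emotion/instinct model, and the schemas compete within the schema tree. Finally, schemas are launched simultaneously in descending order of activation level, to the extent that the robot's hardware resources do not conflict, and behavior is output. With this algorithm the robot selects its behavior autonomously from its internal state and the conditions of external stimuli at its sensors. This autonomous behavior selection method is hereinafter called the homeostasis mode, and the schema tree that realizes it is called the NormalSBL.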
【０１５８】
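As described above, each schema in the NormalSBL calculates its own activation level from external stimuli, such as a face being seen or a ball being found, and internal states evaluated by the emotion/instinct model, such as pain, hunger, fatigue, and sleepiness. The activation level defines the execution priority among the schemas, and schemas with larger values acquire the right to execute first. Finally, schemas are launched simultaneously in descending order of activation level, to the extent that the robot's hardware resources do not conflict, and behavior is output. With the SBL algorithm, the robot's behavior selection is carried out autonomously on the basis of its internal desires.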
【０１５９】
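That is, the NormalSBL is the most basic schema tree structure, making autonomous behavior decisions with the SBL behavior selection algorithm. Because the robot's own desires take priority in behavior decisions (selection) in the NormalSBL, what behavior results depends on the state of the environment, including the context in which the robot device is placed.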
【０１６０】
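Because no behavior decision contrary to the internal desires is made, with the schema tree as it stands it is difficult to give the robot device user commands top-down and have it act on them, or to give a demonstration by replaying a fixed series of actions.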
【０１６１】
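To enable top-down commands even in a robot device with such a NormalSBL, a DeliberativeSBL that selects behavior heteronomously is provided separately from the NormalSBL that selects behavior autonomously. That is, the SBL serving as the behavior control means includes a DeliberativeSBL, by which activation levels can be set between schemas of a single SBL or between schemas of different SBLs, independently of external stimuli and internal states. FIG. 21 is a schematic diagram showing the relationship between the NormalSBL and the DeliberativeSBL in this example. When the DeliberativeSBL 210 shown in the upper part of FIG. 21 externally sets an activation level for any of the schemas 231 to 233 constituting the NormalSBL 230 shown in the lower part, the activation level calculated from external stimuli and internal states becomes invalid and the activation level given externally by the DeliberativeSBL 210 takes precedence. This mechanism makes it possible to activate a specific schema at a specific activation level.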
【０１６２】
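The DeliberativeSBL 210 has the same structure as the NormalSBL 230 described above but does not compete with the schemas in the NormalSBL 230; it is configured as an independent tree. A schema in the DeliberativeSBL 210 can externally set a high activation level for a specific schema in the NormalSBL 230 and thereby cause that schema, that is, a specific behavior, to be executed. This behavior selection method based on top-down requests is hereinafter called the Intention mode, the function is called the Intention function, and the schema tree 220 that realizes the Intention mode is called the DeliberativeSBL 210. The NormalSBL and DeliberativeSBL in this example are described in more detail below.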
【０１６３】
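(4-1) Normal Situated Behavior Layer (NormalSBL)
The NormalSBL provides a state machine for each behavior description module (schema); depending on preceding behavior and circumstances, it classifies the input from the state recognition unit, that is, the recognition results of external information input from the sensors, and expresses behavior on the robot body. A schema takes external stimuli and (changes in) internal states as input and is described as a Schema having at least a Monitor function, which judges the situation according to external stimuli and changes in internal state, and an Action function, which realizes the state transitions (state machine) that accompany execution of the behavior. As shown in the lower part of FIG. 21, the NormalSBL 230 is configured as a tree structure (schema tree) 240 in which a plurality of schemas 231 to 233, describing behaviors such as dancing, playing soccer, and solving riddles, are connected hierarchically.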
【０１６４】
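The schema tree 240 configured as such a tree structure judges in an integrated manner which schema is most suitable given external stimuli and changes in internal state, and controls behavior accordingly. The schema tree 240 contains a plurality of subtrees (or branches), such as behavior models that formulate ethological situated behaviors and subtrees for executing emotional expressions.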
【０１６５】
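The Monitor function mentioned above calculates the activation level of the schema from external stimuli and internal states. In a tree structure, an upper (parent) schema can call the Monitor function of a lower (child) schema with the external stimuli and internal states as arguments, and the child schema returns its activation level. A schema can in turn call the Monitor functions of its own children in order to calculate its activation level. Because the activation levels from each subtree are returned to the root schema 234, the optimal schema, that is, behavior, for the external stimuli and changes in internal state can be judged in an integrated manner.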
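The recursive Monitor call can be sketched as below. This is a simplified illustration, not the class definition of the specification: the `Schema` dataclass, its `evaluate` callback, and the rule that a parent reports the maximum activation level found in its subtree are all assumptions; the text only states that parents call their children's Monitor functions and receive activation levels back.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Schema:
    name: str
    # Hypothetical callback: computes this schema's own AL from stimulus and state.
    evaluate: Callable[[dict, dict], float]
    children: List["Schema"] = field(default_factory=list)

    def monitor(self, stimulus: dict, state: dict) -> Tuple[float, "Schema"]:
        """Return the highest AL in this subtree and the schema that produced it."""
        best_al, best = self.evaluate(stimulus, state), self
        for child in self.children:
            # The parent passes stimulus and state down; the child returns its AL.
            al, schema = child.monitor(stimulus, state)
            if al > best_al:
                best_al, best = al, schema
        return best_al, best


root = Schema("root", lambda stimulus, state: float("-inf"), [
    Schema("dance", lambda stimulus, state: 85.0),
    Schema("soccer", lambda stimulus, state: 70.0 if stimulus.get("ball") else 5.0),
])
al, chosen = root.monitor({"ball": False}, {})
```

Here the root contributes no behavior of its own, so the call returns the dance schema with an activation level of 85.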
【０１６６】
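For example, the schema with the highest activation level may be selected, or two or more schemas whose activation levels exceed a predetermined threshold may be selected and executed in parallel (provided that the schemas executed in parallel do not compete for hardware resources).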
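Parallel selection without resource conflicts can be sketched as a greedy pass in descending order of activation level. The `Candidate` type, the resource names, and the greedy rule are assumptions for illustration; the specification only requires that schemas run in parallel do not compete for hardware resources.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Candidate(NamedTuple):
    name: str
    al: float                  # activation level reported by the schema
    resources: FrozenSet[str]  # hardware resources the schema would occupy


def select_parallel(candidates: List[Candidate], threshold: float) -> List[str]:
    """Pick schemas above the threshold, highest AL first, skipping resource clashes."""
    selected: List[str] = []
    in_use: Set[str] = set()
    for cand in sorted(candidates, key=lambda c: c.al, reverse=True):
        if cand.al <= threshold:
            break  # the remaining candidates are at or below the threshold
        if in_use.isdisjoint(cand.resources):
            selected.append(cand.name)
            in_use |= cand.resources
    return selected


candidates = [
    Candidate("kick", 70.0, frozenset({"legs"})),
    Candidate("talk", 60.0, frozenset({"speaker"})),
    Candidate("walk", 50.0, frozenset({"legs"})),
    Candidate("blink", 10.0, frozenset({"led"})),
]
running = select_parallel(candidates, threshold=30.0)
```

In this run "walk" is skipped because "kick" already holds the legs, and "blink" is dropped for being below the threshold.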
【０１６７】
The Action function has a state machine describing the behavior of the schema itself. In a tree structure, a parent schema can call the Action function to start or interrupt execution of a child schema.
【０１６８】
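(4-2) Deliberative Situated Behavior Layer (DeliberativeSBL)
As described above, this example has a DeliberativeSBL 210 that realizes a behavior selection method (Intention mode) for forcing an autonomously operating robot device to act on instructions from a user or the like. The DeliberativeSBL is the same as the NormalSBL in basic structure, in that schemas, which are behavior description modules, are arranged in a tree. Unlike the schemas of the NormalSBL 230, however, the schemas of the DeliberativeSBL 210 cannot by themselves output commands such as speaking or playing back a motion; instead they have the function (Intention function) of generating behavior indirectly by forcibly activating a specific schema in the NormalSBL 230. This is hereinafter referred to as "applying an Intention to a schema". The DeliberativeSBL 210 can also pass parameters at the same time as it applies an Intention to a schema in the NormalSBL 230, so that the behavior is performed in a more restricted way.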
【０１６９】
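(4-3) Functions of the SBL
As described above, the SBL in this example has the function of applying an Intention from the DeliberativeSBL 210 and the function of passing parameters and, as explained in the embodiment above, the function by which a NormalSBL 230 schema to which an Intention has been applied can decline rather than simply being forcibly activated. These functions of the SBL 200 are described in detail below.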
【０１７０】
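(4-3-1) Intention function
Of the SBL functions above, the function of generating behavior indirectly by forcibly activating a specific schema (the Intention function) is realized by the DeliberativeSBL 210 externally setting an activation level for a schema in the NormalSBL 230.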
【０１７１】
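As concrete uses of the DeliberativeSBL 210 having this function, two schemas shown in the upper part of FIG. 21 are described as examples: a voice interpretation schema (ViceCommandHandler) 201, which interprets the user's voice commands and applies an Intention to the schema corresponding to the command so that the robot acts on it, and a function introduction schema (ScriptPlayer) 202, which applies Intentions to specific schemas while replaying a prepared schema activation sequence file so that the robot device gives a demonstration (an introduction of its functions).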
【０１７２】
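The function of the voice interpretation schema (ViceCommandHandler) 201, namely interpreting the user's voice commands and applying an Intention to the corresponding schema so that the robot acts on the command, can be realized by creating in advance a database that maps the user's voice commands to the corresponding schemas in the NormalSBL. For example, on receiving a user command such as "Dance", "Play soccer", or "Let's do riddles", it activates the dance schema 231, the soccer schema 232, or the riddle schema 233, thereby suppressing the robot device's autonomous behavior selection and making it perform the desired behavior.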
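The correspondence database between voice commands and schemas can be sketched as a simple lookup table. The dictionary contents, the schema names, and the normalization step are hypothetical; the specification does not describe how recognized speech is matched against the database.

```python
from typing import Dict, Optional

# Hypothetical command-to-schema database; keys are normalized utterances.
COMMAND_TO_SCHEMA: Dict[str, str] = {
    "dance": "DanceSchema",
    "play soccer": "SoccerSchema",
    "let's do riddles": "RiddleSchema",
}


def interpret(utterance: str) -> Optional[str]:
    """Return the name of the schema to receive an Intention, or None if unknown."""
    return COMMAND_TO_SCHEMA.get(utterance.strip().lower())
```

An unknown utterance returns `None`, in which case no Intention would be applied and autonomous selection continues unchanged.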
【０１７３】
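Thus, when the Intention function of the DeliberativeSBL 210 is used to make the autonomously selecting NormalSBL 230 select behavior heteronomously, a schema in the DeliberativeSBL 210 applies an Intention to a specific schema in the NormalSBL 230, producing top-down behavior generation.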
【０１７４】
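FIG. 22 is a schematic diagram showing the relationship between the activation level given by an Intention and the internally evaluated activation level. As FIG. 22 shows, when an Intention is applied to a schema, the activation level set externally by the Intention (hereinafter AL2) is added to the activation level the schema originally calculated from its internal state and external stimuli (hereinafter AL1). The AL value reported to the upper schema for deciding whether to activate the schema is, for example, the sum of the two (AL1 + AL2, hereinafter AL_total). This makes it possible to give a schema a larger AL from outside than it would otherwise have. The AL_total reported to the upper schema may instead be, for example, a sum of the two AL values weighted by suitable coefficients.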
【０１７５】
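Because the Intention merely raises the original AL (AL1), it could in principle have no effect when the AL1 of another schema is very large. In practice, however, the ALs are tuned when the integrated schema tree is built so that AL1 varies within a fixed range in normal operation, for example 0 to 100; giving an Intention that sufficiently exceeds that range therefore ensures that the target schema is activated whenever an Intention is applied.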
【０１７６】
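For example, when the voice interpretation schema 201 of the DeliberativeSBL 210 shown in FIG. 21 applies an Intention 250 to the dance schema 231 of the NormalSBL 230, the ALs of the schemas are as shown in FIG. 22. If the AL1 that the dance schema 231 itself calculated from its internal state and external stimuli is, say, 85 and the AL2 added externally by the voice interpretation schema 201 is, say, 1500, then applying the Intention 250 gives the dance schema 231 a total activation level (AL_total) of 1585. The range of AL1 calculated by each schema in the NormalSBL 230 is set to a predetermined range such as 0 to 100, and adding an AL2 well beyond this range makes AL_total larger than that of the other schemas, which have no Intention 250 applied and only their AL1. In the example of FIG. 22 the AL_total values of the other schemas 232 to 234 are 5, 12, and 67 respectively, so the schema 231, with AL_total = 1585, is selected.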
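The arithmetic of the FIG. 22 example can be reproduced with a short sketch. The numbers (85, 5, 12, 67, 1500) are those given in the text; the function names and the optional weights `w1` and `w2` (standing in for the "suitable coefficients" mentioned for the weighted variant) are illustrative assumptions.

```python
from typing import Dict, Tuple


def al_total(al1: float, al2: float = 0.0, w1: float = 1.0, w2: float = 1.0) -> float:
    """AL_total as a (optionally weighted) sum of the internal AL1 and the Intention AL2."""
    return w1 * al1 + w2 * al2


def select_schema(al1_by_schema: Dict[int, float],
                  intentions: Dict[int, float]) -> Tuple[int, Dict[int, float]]:
    """Return the schema with the largest AL_total together with all totals."""
    totals = {sid: al_total(al1, intentions.get(sid, 0.0))
              for sid, al1 in al1_by_schema.items()}
    winner = max(totals, key=lambda sid: totals[sid])
    return winner, totals


# AL1 values from the FIG. 22 example, keyed by schema reference numeral.
AL1 = {231: 85.0, 232: 5.0, 233: 12.0, 234: 67.0}
winner, totals = select_schema(AL1, {231: 1500.0})
```

Because AL2 far exceeds the 0 to 100 range of AL1, the schema that receives the Intention wins whichever one it is; applying the same Intention to schema 232 instead selects 232 with a total of 1505.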
【０１７７】
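Thus, where the activation levels calculated by the schemas are compared and, for example, the schema with the highest activation level is selected, applying an Intention 250 forcibly raises the activation level so that the schema is selected, fires, and expresses its behavior.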
【０１７８】
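In this example a sufficiently large value is added as the Intention so that the designated schema is always selected, whereas in the earlier example of FIG. 18 a smaller value, for example an activation level of +30, is added as the Intention. If the activation level is still negative after the Intention is added, the schema expresses that it does not want to perform its behavior; in that case, although the added activation level is small, the design is such that the schema designated by the DeliberativeSBL is selected once without its activation level being compared with those of other schemas. In other words, if the activation level is allowed to indicate a negative desire, it can also serve as a function for declining an Intention.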
【０１７９】
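Next, as another example of applying Intentions, consider the case where voice commands such as explanations are to be executed between Intention commands while Intentions are applied to specific schemas of the NormalSBL 230. This is described using the function introduction (demonstration) schema 202, which applies Intentions to specific schemas while replaying a prepared schema activation sequence file. Such behavior can be realized by applying Intentions between schemas of the NormalSBL.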
【０１８０】
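The function of the function introduction schema 202, applying Intentions to specific schemas while replaying a prepared schema activation sequence file, is useful, for example, when introducing the robot device's functions interactively with the user. FIG. 23 is a schematic diagram explaining the relationship between the function introduction schema 202 and the schemas in the NormalSBL 230. The script file of the function introduction schema 202 describes a series of state machines consisting of speech containing the explanatory text, the timing at which motion output commands are executed, and information on the target schemas to which Intentions are applied in order to execute a particular schema's function.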
【０１８１】
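The function introduction schema 202, which gives the demonstration, reads the script file and either executes an explanation command or applies an Intention to a specific schema to execute that schema's function, repeating this to introduce the robot's functions to the user.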
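The replay loop of the function introduction schema can be sketched as follows. The step format (a list of `("say", text)` and `("intend", schema_name)` pairs) and the callback names are hypothetical; the actual script file format is not given in the specification.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical script step: ("say", text) or ("intend", schema_name).
Step = Tuple[str, str]


def play_script(steps: Iterable[Step],
                say: Callable[[str], None],
                intend: Callable[[str], None]) -> None:
    """Replay a demonstration script, alternating explanations and Intentions."""
    for kind, arg in steps:
        if kind == "say":
            say(arg)      # explanation command executed by the schema itself
        elif kind == "intend":
            intend(arg)   # apply an Intention to the named NormalSBL schema
        else:
            raise ValueError(f"unknown step kind: {kind}")


log: List[str] = []
script = [
    ("say", "I can dance."),
    ("intend", "DanceSchema"),
    ("say", "I can also play soccer."),
    ("intend", "SoccerSchema"),
]
play_script(script,
            say=lambda text: log.append("say:" + text),
            intend=lambda name: log.append("intend:" + name))
```

The callbacks here only record what would happen; in a robot they would issue the speech command and set the activation level of the target schema.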
【０１８２】
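For example, suppose the NormalSBL 230 contains a dance schema 231, a soccer schema 232, and a quiz (riddle) schema 233 for autonomously dancing, playing soccer, and posing questions. The function introduction schema 202 first executes commands of its own to tell the user, for example by voice, that the robot device can dance, play soccer, and pose questions; then, depending on the user's response, it applies an Intention to the dance schema 231, the soccer schema 232, or the quiz schema 233 and actually demonstrates dancing, kicking a ball, or posing a question to the user.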
【０１８３】
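Thus, by applying Intentions to schemas in the NormalSBL that were prepared for autonomous behavior decisions, the Intention function reuses the functions of existing schemas as they are and enables many variations of behavior generation, such as interpreting user commands to generate behavior or introducing the robot's functions to the user. Because the schema sequences (algorithms) that actually generate behavior commands in Intention mode are the NormalSBL schemas used in homeostasis mode, there is no need to write separate programs from scratch to realize an Intention mode that functions independently of the homeostasis mode, which is highly efficient.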
【０１８４】
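(4-3-2) Function for declining commands
Executing a schema by means of an Intention was devised as a mechanism for forcibly activating the schema, ignoring the activation level evaluated in homeostasis mode from the internal state and external stimuli. If commanded behavior were always carried out, however, the correspondence between input information and behavior output would become too rigid, and the repetition of particular responses could bore the user. In Intention mode, therefore, the homeostasis-mode activation level is not ignored completely but partly taken into account: the robot decides whether to accept or reject the user's command depending on whether the original activation level, before the Intention was applied, is small or, as described above, indicates a negative desire. This gives diversity to behavior generation.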
【０１８５】
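Inside a schema to which an Intention has been applied, not only AL_total, with the Intention's AL2 added, but also AL1 from before the Intention was applied is stored and can be referenced, so behavior such as "refusing activation because the original activation level is low" becomes possible.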
【０１８６】
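Specifically, when a schema to which an Intention has been applied is activated, it also calculates the activation level based on its internal state and external stimuli (AL1). If AL1 is at or above a certain threshold, the schema outputs its prescribed behavior; if it is at or below the threshold, the schema expresses that it is not in the mood and terminates. Suppose, for example, that the threshold is set to AL1_th = 60. In the example of FIG. 22, schema 231 has AL1 = 85 before the Intention is applied, so it acts as the user commanded, that is, as the voice interpretation schema instructed.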
【０１８７】
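Now suppose an Intention is applied to the soccer schema 232, adding AL2 = 1500. The AL1 that schema 232 calculated itself may be as low as 5, below the threshold, because, for example, the robot has already played soccer many times that day, cannot see the ball, or does not like the ball's color. In such a case the robot device can refuse to play soccer.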
【０１８８】
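For example, one or more compensation schemas, such as a schema that shows a tired attitude, a schema that shakes the head, or a schema that tells the user by voice that the robot does not want to play soccer, may be prepared below the soccer schema 232; when an Intention is applied although the AL1 of schema 232 is at or below the predetermined threshold, these compensation schemas may be activated according to the AL1 of schema 232 instead of the soccer schema 232.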
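The threshold test and the choice among compensation schemas can be sketched together. The threshold AL1_th = 60 and the sample values AL1 = 85 and AL1 = 5 come from the text; the intermediate cut-offs (40 and 20), the schema names, and the mapping from AL1 to a particular compensation schema are hypothetical.

```python
AL1_THRESHOLD = 60.0  # AL1_th from the example in the text


def react_to_intention(al1: float) -> str:
    """Name the schema that runs when an Intention is applied to the soccer schema.

    The cut-offs below the threshold (40 and 20) are illustrative assumptions.
    """
    if al1 >= AL1_THRESHOLD:
        return "soccer"       # original AL is high enough: obey the command
    if al1 >= 40.0:
        return "look_tired"   # mild reluctance
    if al1 >= 20.0:
        return "shake_head"   # clearer refusal
    return "say_no"           # tell the user by voice that it does not want to play
```

Equality with the threshold is treated as acceptance here; the text leaves that boundary case open.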
【０１８９】
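When a function for refusing commands issued through an Intention is implemented, a function for disabling it, that is, for forcing the behavior without allowing refusal, also becomes necessary. Without it, if for example the function introduction schema 202 used Intentions to execute (select) schemas in the NormalSBL 230 for a demonstration of the robot device's functions and one of those schemas refused, the demonstration could not continue. In this example a forced flag is used, and whether refusal by a NormalSBL 230 schema is accepted depends on whether this flag is set. During a demonstration, therefore, information meaning that the schema is to be executed forcibly is passed as a "forced flag" together with the Intention applied to the NormalSBL 230 schema, disabling the refusal function.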
【０１９０】
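If behavior decisions made from the internal state and external stimuli alone, that is, by the NormalSBL only, constitute a fully autonomous mode, then the function of referring to the original activation level (AL1) when the DeliberativeSBL applies an Intention and refusing the command when it is at or below a certain threshold can be called a semi-autonomous mode. Demonstrating the robot using the Intention function together with the forced flag can be called a fully heteronomous mode.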
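The three modes and the effect of the forced flag can be summarized in a small sketch. The enum, the function names, and the default threshold of 60 (borrowed from the AL1_th example) are illustrative assumptions rather than identifiers from the specification.

```python
from enum import Enum


class Mode(Enum):
    FULLY_AUTONOMOUS = "fully autonomous"      # NormalSBL alone decides
    SEMI_AUTONOMOUS = "semi-autonomous"        # Intention applied, refusal allowed
    FULLY_HETERONOMOUS = "fully heteronomous"  # Intention applied with forced flag


def control_mode(intention_applied: bool, forced: bool) -> Mode:
    """Classify the current situation into one of the three modes."""
    if not intention_applied:
        return Mode.FULLY_AUTONOMOUS
    return Mode.FULLY_HETERONOMOUS if forced else Mode.SEMI_AUTONOMOUS


def accepts_intention(al1: float, forced: bool, al1_threshold: float = 60.0) -> bool:
    """A schema obeys if the forced flag is set or its own AL1 reaches the threshold."""
    return forced or al1 >= al1_threshold
```

With the forced flag set, a schema whose AL1 is only 5 still runs, which is what keeps a demonstration from stalling.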
【０１９１】
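(4-3-3) Function for passing parameters together with an Intention
Normally, when a schema in the NormalSBL is activated by applying an Intention, an abstract voice command such as "Dance" or "Play soccer" is handled and behavior is executed schema by schema. Providing a function for passing parameters together with the Intention, however, allows the Intention command to be specified in more detail. For example, when an Intention is applied to the soccer schema in response to the voice command "Kick the pink ball", passing information corresponding to "pink ball" makes it possible to restrict the command so that the soccer schema specifically looks for and kicks a pink ball. The information to be passed together with the Intention may be information indicating features of the target object, such as color and shape.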
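Passing object features along with an Intention can be sketched as a small record plus a filter over recognized objects. The `Intention` dataclass, the `params` dictionary, and the representation of recognized objects as dictionaries are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Intention:
    target_schema: str  # schema in the NormalSBL that receives the Intention
    al2: float          # externally added activation level
    # Hypothetical feature constraints on the target object, e.g. color and shape.
    params: Dict[str, str] = field(default_factory=dict)


def find_target(intention: Intention,
                visible_objects: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Return the first recognized object matching every feature in the Intention."""
    for obj in visible_objects:
        if all(obj.get(key) == value for key, value in intention.params.items()):
            return obj
    return None


objects = [
    {"color": "blue", "shape": "ball"},
    {"color": "pink", "shape": "ball"},
]
kick_pink = Intention("SoccerSchema", 1500.0, {"color": "pink", "shape": "ball"})
target = find_target(kick_pink, objects)
```

An Intention without parameters matches the first object seen, which corresponds to the abstract command "Play soccer".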
【０１９２】
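(4-4) Other examples
Variations in how Intentions are applied
The way Intentions are applied is not limited to the examples above and can be varied. For example, in a demonstration the schemas may be activated in order according to a scenario, with spoken explanations, to introduce the robot's functions; or, on receiving a voice command, the command may be interpreted and the corresponding schema selected, or the parameters needed to activate a schema passed and the schema activated.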
【０１９３】
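Many other variations in activating schemas by Intention are possible. Variations in the Intention algorithm are basically produced by changing the following values in conjunction with some condition:
the magnitude of the activation level added when an Intention is applied
the threshold used to decide whether to obey or decline the command when an Intention is applied
For example, based on information obtained by face image recognition or speaker recognition, the magnitude of the activation level (AL2) added when an Intention is applied can be changed according to past experience with the speaker, so that how readily the robot obeys depends on who is speaking: it readily does what a person it likes says but is reluctant to do what a person it dislikes says.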
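The speaker-dependent Intention strength can be sketched as below, combined with the rule from the FIG. 18 example that a command is declined when the activation level remains negative after AL2 is added. The `affinity` score in the range 0 to 1, the linear formula, and its constants are hypothetical; the specification does not define how experience with a speaker is quantified.

```python
def intention_strength(affinity: float, base: float = 10.0, span: float = 80.0) -> float:
    """AL2 to add for a recognized speaker; affinity is a hypothetical 0..1 score."""
    if not 0.0 <= affinity <= 1.0:
        raise ValueError("affinity must be between 0 and 1")
    return base + span * affinity


def obeys(al1: float, affinity: float) -> bool:
    """Obey only if the activation level is positive after the Intention is added."""
    return al1 + intention_strength(affinity) > 0.0
```

With AL1 = −60, a well-liked speaker (affinity 0.9) adds enough to make the total positive, while a disliked speaker (affinity 0.1) does not, so the same command is obeyed for one and declined for the other.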
【０１９４】
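In this example, using the Intention function of the DeliberativeSBL makes it possible to interpret a person's spoken commands and generate behavior, or to give a demonstration by replaying a configuration file that defines a series of actions, regardless of the robot device's internal state or external stimuli. Because schemas written for homeostasis mode can be reused as Intention-mode schemas, not all programs for behavior selection and generation need to be prepared anew. The only programs that must be prepared are frameworks specifying which schema to execute for a given command, or in what order to execute schemas, so programming efficiency is extremely high.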
【０１９５】
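This makes it possible to handle, within the common SBL behavior control algorithm, both schema execution in homeostasis mode, which decides behavior autonomously according to internal and external conditions, and Intention mode, which generates behavior according to top-down commands such as user commands and demo scripts.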
【０１９６】
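Even when an activation level (AL2) is set externally, the activation level calculated from the internal state (AL1) is referenced, and the added activation level (AL2) is increased or decreased according to the emotional state, so that the robot can be made reluctant to obey when in an unpleasant emotional state but willing to do anything asked when in a happy one. Although the schema is being forced to activate from outside, the robot can at the same time virtually consider how much it actually wants to perform the behavior; depending on its situation it can therefore generate the behavior of refusing the externally applied Intention, preventing the response between input information and behavior output from becoming fixed and diversifying its behavior. That is, by not merely obeying commands but refusing them as the situation warrants, the robot keeps the user from tiring of fixed responses and generates behavior closer to that of humans and animals.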
【０１９７】
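[Effect of the invention]
As described in detail above, the robot device according to the present invention is a robot device that selects and expresses actions based on an internal state and external stimuli, and has behavior output means, in which a plurality of actions are described and which outputs an action selected from them, and priority calculation means for calculating the execution priority of each action from the internal state and/or external stimuli. The execution priority indicates a positive or negative desire to express each action, and when the execution priority of the selected action indicates a negative desire, the behavior output means outputs an action different from the selected action. Whereas conventionally only positive desires, such as wanting to act, were reflected in behavior, the robot can therefore convey by gesture or by voice that it has a negative desire, namely that it does not want to perform the selected action, and reflect this in its behavior, giving its behavior a wide range of variation and making it closer to that of a human.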
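[Brief description of the drawings]
[FIG. 1] A perspective view showing the external appearance of the robot device according to an embodiment of the present invention.
[FIG. 2] A block diagram schematically showing the functional configuration of the robot device according to the embodiment of the present invention.
[FIG. 3] A block diagram showing in more detail the configuration of the control unit of the robot device according to the embodiment of the present invention.
[FIG. 4] A schematic diagram showing the functional configuration of the behavior control system of the robot device according to the embodiment of the present invention.
[FIG. 5] A schematic diagram showing the object configuration of the behavior control system according to the embodiment of the present invention.
[FIG. 6] A schematic diagram showing the form of situated behavior control by the situated behavior layer according to the embodiment of the present invention.
[FIG. 7] A schematic diagram showing a basic operation example of behavior control by the situated behavior layer.
[FIG. 8] A schematic diagram showing an operation example in which the situated behavior layer performs reflexive behavior.
[FIG. 9] A schematic diagram showing an operation example in which the situated behavior layer performs emotional expression.
[FIG. 10] A schematic diagram showing how the situated behavior layer is composed of a plurality of schemas.
[FIG. 11] A schematic diagram showing the tree structure of schemas in the situated behavior layer.
[FIG. 12] A schematic diagram showing the mechanism for controlling normal situated behavior in the situated behavior layer.
[FIG. 13] A schematic diagram showing the configuration of schemas in the reflexive behavior unit.
[FIG. 14] A schematic diagram showing the mechanism by which the reflexive behavior unit controls reflexive behavior.
[FIG. 15] A schematic diagram showing the class definition of the schemas used in the situated behavior layer.
[FIG. 16] A schematic diagram showing the functional configuration of the classes in the situated behavior layer.
[FIG. 17] A diagram explaining the reentrant property of schemas.
[FIG. 18] A diagram explaining the negative desire of the robot device according to the embodiment of the present invention; a schematic diagram showing the activation levels of schemas in the schema tree when the robot device does not perform an action in response to an external instruction.
[FIG. 19] A graph showing an example of the relationship between the internal state and the desire value MV.
[FIG. 20] A schematic diagram showing the activation levels of schemas in the schema tree when the robot device performs an action in response to an external instruction.
[FIG. 21] A schematic diagram showing the relationship between the NormalSBL and the DeliberativeSBL according to the embodiment of the present invention.
[FIG. 22] A schematic diagram showing the relationship between the activation level given by an Intention and the internally evaluated activation level.
[FIG. 23] A schematic diagram explaining the relationship between the function introduction schema of the DeliberativeSBL and the schemas in the NormalSBL according to the embodiment of the present invention.
[Explanation of reference numerals]
1 robot device, 10 behavior control system, 15 CCD camera, 16 microphone, 17 speaker, 18 touch sensor, 19 LED indicator, 20 control unit, 21 CPU, 22 RAM, 23 ROM, 24 non-volatile memory, 25 interface, 26 wireless communication interface, 27 network interface card, 28 bus, 29 keyboard, 40 input/output unit, 50 drive unit, 51 motor, 52 encoder, 53 driver, 81 visual recognition function unit, 82 auditory recognition function unit, 83 contact recognition function unit, 91 internal state management unit, 92 short-term memory unit (STM), 93 long-term memory unit (LTM), 101 deliberative behavior layer, 102 situated behavior layer (SBL), 103 reflexive behavior unit
[0001]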
TECHNICAL FIELD OF THE INVENTION
The present invention relates to an entertainment robot device that imitates a human or an animal, and to a control method and a program therefor. In particular, it relates to a robot device that, like a human or an animal, has desires regarding the expression of actions and can select and express actions on the basis of those desires, and to a control method and a program therefor.
[0002]
[Prior art]
A mechanical device that uses electric or magnetic action to perform movements resembling those of a human (living organism) is called a "robot device". Robot devices began to spread in Japan at the end of the 1960s, but most of them were industrial robot devices (Industrial Robot), such as manipulators and transfer robot devices, intended to automate production work in factories and make it unmanned.
[0003]
Recently, practical robot devices have been developed that support life as a human partner, that is, support human activities in various situations in the living environment and other aspects of daily life. Unlike industrial robot devices, such practical robot devices have the ability to learn by themselves how to adapt to people with differing personalities and to various environments in the many aspects of the human living environment. For example, "pet-type" robot devices that imitate the body mechanism and movements of a four-legged animal such as a dog or a cat, and "humanoid" robot devices (Humanoid Robot) designed on the model of the body mechanism and movements of a human or other creature that walks upright on two legs, are already being put to practical use.
[0004]
Since these robot devices can perform, for example, various operations with an emphasis on entertainment, as compared with industrial robot devices, they are sometimes referred to as entertainment robot devices. Some of such robot devices operate autonomously according to external information and internal conditions.
[0005]
In such a pet robot device, if it can be given a function for taking the optimal next action according to the current situation and a function for changing its next action on the basis of past experience, as a human or a real dog or cat would, the user's sense of closeness and satisfaction increases and the amusement value of the pet robot device improves further. A robot apparatus and a control method therefor that improve amusement value in this way are described in Patent Document 1 below.
[0006]
The robot device described in Patent Literature 1 has a plurality of types of behavior models, and uses a behavior selection unit based on at least one of externally input information and its own behavior history and / or growth history. The configuration is such that the output of one behavior model is selected from the outputs of the respective behavior models, whereby the optimal next behavior according to the current situation can be continuously performed.
[0007]
[Patent Document 1]
JP 2001-157981 A
[0008]
[Problems to be solved by the invention]
However, conventional robot devices such as the one described above aim to select the behavior model with the highest priority for expression, that is, the action the robot most wants to perform; actions with low priority, that is, actions judged to be ones the robot does not want to perform, are simply not selected and have otherwise received little consideration. Consequently, in conventional behavior selection, when the user says "do this", the robot is programmed to perform the corresponding action without fail, whatever its state, and there is the problem that the user tires of such fixed responses and the robot lacks entertainment value.
[0009]
In other words, suppose the robot is set up so that there are cases in which it must act even when it does not want to. If, for example, it can hold a negative desire such as "I do not want to play" because it is tired or hungry even when told "Let's play together", and can reflect that negative desire in its behavior, then its behavior more closely resembles that of a human or of an animal such as a dog or a cat, giving the user a greater sense of closeness and satisfaction and further improving its entertainment value.
[0010]
The present invention has been proposed in view of this situation. Its object is to provide a robot device that, even when directed to perform an action by a specific external stimulus such as a user's instruction, can express a variety of actions according to the external situation and its own internal state, together with a behavior control method and a program therefor.
[0011]
[Means for Solving the Problems]
In order to achieve the above object, a robot device according to the present invention is a robot device that selects and expresses actions based on an internal state and external stimuli, and comprises behavior output means, in which a plurality of actions are described and which outputs an action selected from the plurality of actions, and priority calculation means for calculating the execution priority of each action from the internal state and/or the external stimuli. The execution priority indicates a positive desire or a negative desire to express each action, and the behavior output means outputs an action different from the selected action when the execution priority of the selected action indicates a negative desire.
[0012]
Because the present invention provides not only positive desires, such as wanting to act, but also negative desires, such as not wanting to act, the negative desire can be reflected in behavior. For example, when the desire for an action designated by a predetermined external stimulus, such as a user's instruction, is negative, the robot can express a compensatory behavior such as telling the user by voice that it does not want to perform the designated action.
[0013]
The behavior control method for a robot device according to the present invention is the behavior control method for a robot device for selecting and expressing a behavior based on an internal state and an external stimulus, wherein the execution priority of each behavior is calculated from the internal state and / or the external stimulus. Priority calculation step, and an action output step of outputting an action selected from a plurality of actions, the execution priority indicates a positive desire or a negative desire to express each action, In the action output step, when the execution priority of the selected action indicates a negative desire, an action different from the selected action is output.
[0014]
A program according to the present invention causes a computer to execute the above-described operation control processing.
[0015]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, specific embodiments to which the present invention is applied will be described in detail with reference to the drawings. The robot device of the present embodiment can act autonomously according to its internal state, and it also has negative desires such as "I do not want to do this" and can express them. A suitable configuration and control system for such a robot device are described first, followed by a detailed description of the robot device of the present embodiment that exhibits negative desires.
[0016]
(1) Robot device configuration
FIG. 1 is a perspective view illustrating an appearance of a robot device according to the present embodiment. As shown in FIG. 1, in the robot apparatus 1, a head unit 3 is connected to a predetermined position of a trunk unit 2, and two left and right arm units 4 R / L and two left and right leg units 5 R / L are connected to each other (however, each of R and L is a suffix indicating each of right and left. The same applies hereinafter).
[0017]
FIG. 2 is a block diagram schematically illustrating a functional configuration of the robot device 1 according to the present embodiment. As shown in FIG. 2, the robot device 1 includes a control unit 20 that performs overall control of the entire operation and other data processing, an input/output unit 40, a drive unit 50, and a power supply unit 60. Each unit is described below.
[0018]
The input/output unit 40 includes, as input units, a CCD camera 15 that corresponds to the human eye and captures the external scene, a microphone 16 that corresponds to the ear, touch sensors 18 that are arranged on the head, back, and elsewhere and sense the user's contact by electrically detecting that a predetermined pressure has been applied, a distance sensor for measuring the distance to an object ahead, and various other sensors corresponding to the five senses. As output units, it includes a speaker 17 that is provided in the head unit 3 and corresponds to the human mouth, and an LED indicator (eye lamp) 19 that is provided at the position of the human eye and expresses emotion or the state of visual recognition. Through sound, blinking of the LED indicator 19, and the like, these output units can give user feedback from the robot device 1 in forms other than mechanical movement patterns of the legs and so on.
[0019]
For example, a plurality of touch sensors 18 are provided at predetermined locations on the top of the head unit, and by combining the contact detection of the individual touch sensors 18 the robot can detect actions by the user such as stroking, hitting, or tapping. For example, when it is detected that several of the pressure sensors have been contacted one after another at predetermined intervals, this is judged to be "stroking", and when contact is detected within a short time, it is classified as "hitting"; the internal state changes accordingly, and behavior can be expressed in response to such changes in internal state.
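The distinction between stroking and hitting can be sketched as a simple rule over timestamped contact events. The event format (sensor index, time in seconds), the 0.2-second window, and the requirement of at least two distinct sensors for a stroke are assumptions; the text gives no numerical criteria.

```python
from typing import List, Tuple


def classify_touch(events: List[Tuple[int, float]], short_window: float = 0.2) -> str:
    """Classify contact events given as (sensor_index, time_in_seconds) pairs.

    short_window is a hypothetical duration below which contact counts as a hit.
    """
    if not events:
        return "none"
    times = [t for _, t in events]
    duration = max(times) - min(times)
    sensors = {index for index, _ in events}
    if duration <= short_window:
        return "hit"      # all contact happened within a short time
    if len(sensors) >= 2:
        return "stroke"   # several sensors contacted one after another
    return "touch"        # prolonged contact on a single sensor
```

The returned label would then feed the internal state model, for instance raising or lowering an emotion value, which is outside the scope of this sketch.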
[0020]
The drive unit 50 is a functional block that implements the body movements of the robot device 1 according to a predetermined movement pattern commanded by the control unit 20, and is the object of behavior control. The drive unit 50 is a functional module for realizing the degrees of freedom at each joint of the robot apparatus 1, and consists of a plurality of drive units 54₁ to 54ₙ provided for each axis, such as roll, pitch, and yaw, at each joint. Each drive unit 54₁ to 54ₙ is a combination of a motor 51₁ to 51ₙ that rotates about a predetermined axis, an encoder 52₁ to 52ₙ that detects the rotational position of the motor 51₁ to 51ₙ, and a driver 53₁ to 53ₙ that adaptively controls the rotational position and rotational speed of the motor 51₁ to 51ₙ based on the output of the encoder 52₁ to 52ₙ.
[0021]
Although the robot device 1 walks on two legs, it may also be configured as another legged mobile robot device, such as a quadruped, depending on how the drive units are combined.
[0022]
The power supply unit 60 is, as its name implies, a functional module that supplies power to each electric circuit and the like in the robot device 1. The robot apparatus 1 according to the present embodiment is of an autonomous driving type using a battery, and the power supply unit 60 includes a rechargeable battery 61 and a charge/discharge control unit 62 that manages the charge/discharge state of the rechargeable battery 61.
[0023]
The charging battery 61 is configured, for example, in the form of a “battery pack” in which a plurality of lithium ion secondary battery cells are packaged in a cartridge type.
[0024]
The charge/discharge control unit 62 determines the remaining capacity of the battery 61 by measuring its terminal voltage, its charge/discharge current, the ambient temperature, and the like, and decides when charging should start and end. The start and end times decided by the charge/discharge control unit 62 are notified to the control unit 20 and serve as triggers for the robot apparatus 1 to start and end charging operations.
[0025]
The control unit 20 corresponds to a “brain” and is mounted on, for example, the head or the body of the robot apparatus 1.
[0026]
FIG. 3 is a block diagram showing the configuration of the control unit 20 in further detail. As shown in FIG. 3, the control unit 20 has a configuration in which a CPU (Central Processing Unit) 21 as a main controller is bus-connected to a memory and other circuit components and peripheral devices. The bus 28 is a common signal transmission path including a data bus, an address bus, a control bus, and the like. Each device on the bus 28 is assigned a unique address (memory address or I / O address). The CPU 21 can communicate with a specific device on the bus 28 by specifying an address.
[0027]
A RAM (Random Access Memory) 22 is a writable memory composed of a volatile memory such as a DRAM (Dynamic RAM), and is used to load the program code executed by the CPU 21 and to temporarily store work data of the running program.
[0028]
The ROM (Read Only Memory) 23 is a read-only memory that permanently stores programs and data. The program codes stored in the ROM 23 include a self-diagnosis test program to be executed when the power of the robot apparatus 1 is turned on, an operation control program for defining the operation of the robot apparatus 1, and the like.
[0029]
The control programs of the robot apparatus 1 include a "sensor input/recognition processing program" that processes sensor inputs from the camera 15, the microphone 16, and the like and recognizes them as symbols; a "behavior control program" that controls the behavior of the robot apparatus 1 on the basis of the sensor inputs and a predetermined behavior control model while managing storage operations (described later) such as short-term and long-term memory; and a "drive control program" that controls the driving of each joint motor and the audio output of the speaker 17 according to the behavior control model.
[0030]
The non-volatile memory 24 is composed of a memory element that can be electrically erased and rewritten, such as an EEPROM (Electrically Erasable and Programmable ROM), and is used to hold data to be sequentially updated in a non-volatile manner. The data to be sequentially updated includes an encryption key and other security information, a device control program to be installed after shipment, and the like.
[0031]
The interface 25 is a device for interconnecting with devices outside the control unit 20 and enabling data exchange. The interface 25 performs data input / output with, for example, the camera 15, the microphone 16, and the speaker 17. The interface 25 also inputs and outputs data and commands to and from each of the drivers 53₁ to 53ₙ in the drive unit 50.
[0032]
The interface 25 may include general-purpose interfaces for connecting computer peripheral devices, such as a serial interface such as RS (Recommended Standard)-232C, a parallel interface such as IEEE (Institute of Electrical and Electronics Engineers) 1284, a USB (Universal Serial Bus) interface, a SCSI (Small Computer System Interface) interface, and a memory card interface (card slot) that accepts a PC card or a memory stick, so that programs and data can be transferred to and from external devices.
[0033]
As another example of the interface 25, an infrared communication (IrDA) interface may be provided to perform wireless communication with an external device.
[0034]
Further, the control unit 20 includes a wireless communication interface 26, a network interface card (NIC) 27, and the like, and can perform data communication with various external host computers via short-range wireless data communication such as Bluetooth, a wireless network such as IEEE 802.11b, or a wide area network such as the Internet.
[0035]
Through such data communication between the robot device 1 and a host computer, remote computer resources can be used to compute complicated operation control of the robot device 1 or to control it remotely.
[0036]
(2) Control system of the robot device
Next, the behavior (motion) control system of the robot device will be described. As described above, the robot apparatus according to the present embodiment autonomously expresses actions based on external stimuli and its own internal state and, in order to express more human-like behavior, reflects in its actions negative desires such as not wanting to perform an action. Here, the behavior control system of a robot device that autonomously expresses actions will be described first, and then a method of reflecting a negative desire in the action will be described.
[0037]
FIG. 4 is a schematic diagram illustrating a functional configuration of the control system 10 of the robot device 1 according to the present embodiment. The robot apparatus 1 according to the present embodiment is capable of performing operation control in accordance with a recognition result of an external stimulus and a change in an internal state. In addition, by providing a long-term memory function and associatively storing a change in the internal state from the external stimulus, it is possible to perform operation control according to the recognition result of the external stimulus and a change in the internal state.
[0038]
Here, the external stimulus is perceptual information obtained by the robot apparatus 1 recognizing a sensor input, for example, color information, shape information, and face information obtained by processing an image input from the camera 15. More specifically, it is composed of components such as color, shape, face, 3D general object, hand gesture, movement, voice, contact, smell, and taste.
[0039]
The internal state refers to, for example, instincts and emotions grounded in the body of the robot device. The instinctive elements are, for example, at least one of fatigue, heat or temperature, pain, appetite or hunger, thirst, affection, curiosity, elimination, and sexual desire. The emotional elements are at least one of happiness, sadness, anger, surprise, disgust, fear, frustration, boredom, sleepiness, sociability (gregariousness), patience, tension, relaxation, alertness, guilt, spite, loyalty, submissiveness, and jealousy.
[0040]
The illustrated control system 10 can be implemented using object-oriented programming. In this case, each piece of software is handled in units of modules called “objects”, in which data and the processing procedures for that data are integrated. Each object can transfer data and invoke other objects (Invoke) through inter-object communication methods using message communication and shared memory.
[0041]
The control system 10 includes a state recognition unit 80, which is a functional module including a visual recognition function unit 81, an auditory recognition function unit 82, a contact recognition function unit 83, and the like, in order to recognize an external environment (Environments) 70.
[0042]
The visual recognition function unit (Video) 81 performs image recognition processing, such as face recognition and color recognition, and feature extraction based on a captured image input via an image input device such as a CCD (Charge Coupled Device) camera. The auditory recognition function unit (Audio) 82 performs voice recognition on voice data input via a voice input device such as a microphone, and performs feature extraction and word set (text) recognition. The contact recognition function unit (Tactile) 83 recognizes a sensor signal from a contact sensor built into, for example, the head of the body, and recognizes an external stimulus such as “patted” or “hit”.
[0043]
An internal state management unit (ISM: Internal Status Manager) 91 manages several types of emotions, such as the instincts and emotions described above, as mathematical models, and manages the internal state of the robot apparatus 1, such as instinct and emotion, in accordance with the external stimuli (ES: External Stimuli) recognized by the visual recognition function unit 81, the auditory recognition function unit 82, and the contact recognition function unit 83.
[0044]
Such an emotion model and an instinct model each have a recognition result and an action (action) history as inputs, and manage an emotion value and an instinct value. The behavior model can refer to these emotion values and instinct values.
[0045]
In addition, in order to perform operation control in accordance with the recognition result of the external stimulus and the change in the internal state, the control system has a short-term storage unit (STM: Short Term Memory) 92 that holds short-term memories that are lost over time, and a long-term storage unit (LTM: Long Term Memory) 93 that holds information for a comparatively long period. The classification into short-term memory and long-term memory follows neuropsychology.
[0046]
The short-term storage unit 92 is a functional module that holds a target or an event recognized from the external environment by the visual recognition function unit 81, the auditory recognition function unit 82, and the contact recognition function unit 83 for a short time. For example, the input image from the camera 15 shown in FIG. 2 is stored for a short period of about 15 seconds.
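The time-limited retention described above can be illustrated with a minimal C++ sketch. The class name `ShortTermMemory`, the `Percept` record, and the use of plain seconds as timestamps are assumptions made for illustration; the source states only that recognized targets and events are held for a short period of about 15 seconds.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical record for one recognized target or event.
struct Percept {
    std::string label;
    double timestampSec;
};

// Minimal short-term store: entries older than the retention window are dropped.
class ShortTermMemory {
public:
    explicit ShortTermMemory(double retentionSec = 15.0) : retentionSec_(retentionSec) {}

    // Store a percept; assumes timestamps arrive in non-decreasing order.
    void store(const std::string& label, double nowSec) {
        expire(nowSec);
        items_.push_back({label, nowSec});
    }

    // Return the labels still retained at time nowSec, oldest first.
    std::vector<std::string> recall(double nowSec) {
        expire(nowSec);
        std::vector<std::string> out;
        for (const auto& p : items_) out.push_back(p.label);
        return out;
    }

private:
    // Drop every entry whose age exceeds the retention window.
    void expire(double nowSec) {
        while (!items_.empty() && nowSec - items_.front().timestampSec > retentionSec_) {
            items_.pop_front();
        }
    }

    double retentionSec_;
    std::deque<Percept> items_;
};
```

With the default 15-second window, a percept stored at time 0 is still returned by `recall(12.0)` but is gone by `recall(20.0)`.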
[0047]
The long-term storage unit 93 is used to hold information obtained by learning, such as the name of an object, for a long time. For example, the long-term storage unit 93 can associate and store a change in an internal state from an external stimulus in a certain behavior description module.
[0048]
The operation control of the robot apparatus 1 is broadly divided into “reflex behavior” realized by the reflex behavior unit (Reflexive Situated Behaviors Layer) 103, “situation-dependent behavior” realized by the situation-dependent behavior hierarchy (SBL: Situated Behaviors Layer) 102, and “deliberative behavior” realized by the deliberative behavior hierarchy (Deliberative Layer) 101.
[0049]
The reflex behavior unit 103 is a functional module that implements a reflexive body operation in response to an external stimulus recognized by the visual recognition function unit 81, the auditory recognition function unit 82, and the contact recognition function unit 83 described above. A reflex behavior is basically a behavior that directly receives the recognition result of external information input from a sensor, classifies it, and directly determines the output action. For example, behaviors such as tracking a human face or nodding are preferably implemented as reflex behaviors.
[0050]
The situation-dependent behavior hierarchy 102 controls behavior appropriate to the situation in which the robot apparatus 1 is currently placed, based on the storage contents of the short-term storage unit 92 and the long-term storage unit 93 and the internal state managed by the internal state management unit 91.
[0051]
The situation-dependent behavior hierarchy 102 has a plurality of behavior description modules (schemas) in which actions according to purposes are described, and a state machine is prepared for each schema. The recognition result of the external information input from the sensor is classified depending on the situation, and the action is expressed on the body. The situation-dependent behavior hierarchy 102 also implements actions for keeping the internal state within a certain range (also referred to as “homeostasis behavior”). When the internal state exceeds a specified range, the action for returning the internal state to that range is activated so that it appears more easily (in practice, the action is selected in consideration of both the internal state and the external environment).
[0052]
Specifically, each schema calculates an activity level (activation level, hereinafter also referred to as AL) indicating the execution priority of the schema based on the change in the internal state and the external stimulus. At least one schema having a high activation level is selected, and the selected action is expressed. That is, for example, the schema having the highest activation level can be selected, or two or more schemas whose activation levels exceed a predetermined threshold can be selected and executed in parallel (parallel execution, however, assumes that there is no hardware resource contention between the schemas). Situation-dependent behavior has a slower reaction time than reflex behavior.
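The two selection policies described above can be sketched in C++ as follows. This is an illustrative sketch only: the `SchemaAL` record and both function names are hypothetical, and the activation levels are assumed to have been computed already.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: a schema name paired with its computed activation level (AL).
struct SchemaAL {
    std::string name;
    double al;
};

// Policy A: pick the single schema with the highest AL.
// Returns the index into `schemas`, or -1 when the list is empty.
int selectHighest(const std::vector<SchemaAL>& schemas) {
    if (schemas.empty()) return -1;
    auto it = std::max_element(
        schemas.begin(), schemas.end(),
        [](const SchemaAL& a, const SchemaAL& b) { return a.al < b.al; });
    return static_cast<int>(it - schemas.begin());
}

// Policy B: pick every schema whose AL exceeds `threshold`.
// Running the result in parallel assumes the picked schemas do not
// contend for the same hardware resources.
std::vector<std::size_t> selectAboveThreshold(const std::vector<SchemaAL>& schemas,
                                              double threshold) {
    std::vector<std::size_t> picked;
    for (std::size_t i = 0; i < schemas.size(); ++i) {
        if (schemas[i].al > threshold) picked.push_back(i);
    }
    return picked;
}
```

For the list {Investigate: 0.2, Ingestive: 0.7, Play: 0.5}, `selectHighest` returns index 1, and `selectAboveThreshold` with a threshold of 0.4 returns indices 1 and 2.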
[0053]
The deliberative behavior hierarchy 101 makes relatively long-term action plans for the robot apparatus 1 based on the storage contents of the short-term storage unit 92 and the long-term storage unit 93. Deliberative behavior is behavior performed by making inferences, and plans to realize them, in response to a given situation or a command from a human. For example, searching for a route from the position of the robot device and the position of the target corresponds to deliberative behavior. Such inference and planning may require more processing time and calculation load than the reaction time needed for the robot apparatus 1 to maintain an interaction. Therefore, the deliberative behavior performs inference and planning while the reflex behavior and the situation-dependent behavior described above return responses in real time.
[0054]
The deliberative behavior hierarchy 101, the situation-dependent behavior hierarchy 102, and the reflex behavior unit 103 can be described as higher-level application programs independent of the hardware configuration of the robot device 1. In contrast, the hardware-dependent layer control unit (Configuration Dependent Actions And Reactions) 104 directly operates the hardware of the body (the external environment), such as driving the joint actuators, in response to commands from these higher-level applications, that is, the behavior description modules (schemas). With such a configuration, the robot device 1 can determine its own condition and that of its surroundings based on the control program, and can act autonomously according to instructions and actions from the user.
[0055]
Next, the behavior control system 10 will be described in more detail. FIG. 5 is a schematic diagram illustrating an object configuration of the behavior control system 10 according to the present embodiment.
[0056]
As shown in FIG. 5, the visual recognition function unit 81 is composed of three objects: a Face Detector 114, a Multi Color Tracker 113, and a Face Identify 115.
[0057]
The Face Detector 114 is an object that detects a face area within an image frame and outputs the detection result to the Face Identify 115. The Multi Color Tracker 113 is an object that performs color recognition and outputs the recognition result to the Face Identify 115 and the Short Term Memory (STM) 92. The Face Identify 115 identifies a person by searching its own person dictionary for the detected face image, and outputs the ID information of the person together with the position and size information of the face image area to the STM 92.
[0058]
The auditory recognition function unit 82 is composed of two objects, Audio Recog 111 and Speech Recog 112. The Audio Recog 111 is an object that receives voice data from a voice input device such as a microphone and performs feature extraction and voice section detection, and outputs the feature amount and sound source direction of voice data in the voice section to the Speech Recog 112 and the STM 92. The Speech Recog 112 is an object that performs speech recognition using the speech feature amount received from the Audio Recog 111, the speech dictionary, and the syntax dictionary, and outputs a set of recognized words to the STM 92.
[0059]
The contact recognition function unit 83 is configured by an object called a Tactile Sensor 119 that recognizes a sensor input from a contact sensor, and outputs the recognition result to the STM 92 and to an Internal State Model (ISM) 91, which is an object for managing the internal state.
[0060]
The STM 92 is an object constituting the short-term storage unit. It is a functional module that holds a target or an event recognized from the external environment by each of the above-described recognition system objects for a short period of time (for example, an input image from the camera 15 is stored only for a short period of about 15 seconds), and periodically notifies (Notify) the SBL 102, which is an STM client, of the external stimulus.
[0061]
The LTM 93 is an object that forms a long-term storage unit, and is used to hold information obtained by learning, such as the name of an object, for a long time. The LTM 93 can, for example, associate and store a change in an internal state from an external stimulus in a certain behavior description module (schema).
[0062]
The ISM 91 is an object that constitutes the internal state management unit. It manages several types of emotions, such as instincts and emotions, as mathematical models, and manages the internal state of the robot apparatus 1, such as instinct and emotion, according to the external stimuli (ES: External Stimuli) recognized by each of the above-described recognition system objects.
[0063]
The SBL 102 is an object that forms the situation-dependent behavior hierarchy. The SBL 102 is an object serving as a client (STM client) of the STM 92. Upon periodically receiving a notification (Notify) of information on an external stimulus (target or event) from the STM 92, it determines the schema (Schema), that is, the behavior description module, to be executed (described later).
[0064]
The Reflexive SBL (Situated Behaviors Layer) 103 is an object that constitutes the reflex behavior unit, and executes reflexive, direct body motion in response to an external stimulus recognized by each of the above-described recognition system objects. For example, it performs behaviors such as following a human face, nodding, or immediately avoiding an obstacle upon detecting it.
[0065]
The SBL 102 selects an action according to the situation, such as an external stimulus or a change in the internal state. In contrast, the Reflexive SBL 103 selects a reflexive action in response to an external stimulus. Because these two objects select actions independently, when the behavior description modules (schemas) they select are executed on the body, the hardware resources of the robot apparatus 1 may conflict, and the actions sometimes cannot both be realized. An object called the RM (Resource Manager) 116 arbitrates hardware conflicts when actions are selected by the SBL 102 and the Reflexive SBL 103. The body is then driven by notifying each object that realizes the body operation based on the arbitration result.
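The following C++ sketch illustrates one way such arbitration could work. The source does not specify the arbitration rule used by the RM 116, so the priority-ordered, first-fit policy here is an assumption, as are the `Resource` bit flags, the `Request` record, and the function name `arbitrate`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical hardware resources, one bit each.
enum Resource : std::uint32_t {
    Head = 1u << 0,
    Arms = 1u << 1,
    Legs = 1u << 2,
    Speaker = 1u << 3
};

// Hypothetical request issued by a schema that wants to act.
struct Request {
    std::string schema;
    std::uint32_t resources;  // bitmask of the Resource values it needs
    double priority;          // assumption: e.g. the schema's activation level
};

// Grant requests in descending priority order; a request is granted only
// if none of its resources has already been claimed by a granted request.
std::vector<std::string> arbitrate(std::vector<Request> requests) {
    std::stable_sort(requests.begin(), requests.end(),
                     [](const Request& a, const Request& b) {
                         return a.priority > b.priority;
                     });
    std::uint32_t inUse = 0;
    std::vector<std::string> granted;
    for (const auto& r : requests) {
        if ((r.resources & inUse) == 0) {
            inUse |= r.resources;
            granted.push_back(r.schema);
        }
    }
    return granted;
}
```

Under this assumed policy, a reflexive hand-withdrawing request with higher priority would claim the arms, a lower-priority request that also needs the arms would be refused, and a request that only needs the head would still be granted.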
[0066]
The Sound Performer 172, the Motion Controller 173, and the LED Controller 174 are objects that implement the body operation. The Sound Performer 172 is an object for performing voice output; it performs voice synthesis according to a text command given from the SBL 102 via the RM 116, and outputs voice from a speaker on the body of the robot apparatus 1. The Motion Controller 173 is an object for operating each joint actuator on the body, and calculates the corresponding joint angles in response to receiving a command to move a hand, a leg, or the like from the SBL 102 via the RM 116. The LED Controller 174 is an object for performing the blinking operation of the LED 19, and drives the LED 19 to blink in response to receiving a command from the SBL 102 via the RM 116.
[0067]
(2-1) Situation-dependent behavior control
Next, the situation-dependent behavior hierarchy will be described in more detail. FIG. 6 schematically illustrates a form of situation-dependent behavior control using the situation-dependent behavior hierarchy (SBL) (including the reflex behavior unit). The recognition result (sensor information) 182 of the external environment 70 obtained by the functional modules of the recognition system, namely the visual recognition function unit 81, the auditory recognition function unit 82, and the contact recognition function unit 83, is given as an external stimulus 183 to the situation-dependent behavior hierarchy 102a (including the reflex behavior unit 103). The change 184 of the internal state according to the recognition result of the external environment 70 by the recognition system is also given to the situation-dependent behavior hierarchy 102a. The situation-dependent behavior hierarchy 102a can then determine the situation according to the external stimulus 183 and the change 184 in the internal state, and select a behavior.
[0068]
FIG. 7 shows a basic operation example of behavior control by the situation-dependent behavior hierarchy (SBL) 102a including the reflex behavior unit 103 shown in FIG. 6. As shown in the figure, the situation-dependent behavior hierarchy 102a calculates the activation level of each behavior description module (schema) based on the external stimulus 183 and the change 184 of the internal state, selects a schema according to the degree of the activation level, and executes the action. In calculating the activation level, a unified calculation process can be performed for all schemas by using, for example, the library 185 (the same applies hereinafter). For example, the schema having the highest activation level may be selected, or two or more schemas whose activation levels exceed a predetermined threshold may be selected and executed in parallel (parallel execution, however, assumes that there is no hardware resource conflict between the schemas).
[0069]
FIG. 8 shows an operation example when a reflex action is performed by the situation-dependent action hierarchy 102a shown in FIG. In this case, as shown in the figure, the reflexive action unit (ReflexiveSBL) 103 included in the context-dependent action hierarchy 102a calculates the activation level by directly using the external stimulus 183 recognized by each object of the recognition system. Then, an action is performed by selecting a schema according to the degree of the activation level. In this case, the change 184 of the internal state is not used for calculating the activation level.
[0070]
FIG. 9 shows an operation example when emotion expression is performed by the situation-dependent behavior hierarchy 102 shown in FIG. 6. The internal state management unit 91 manages emotions such as instinct and emotion as a mathematical model and, in response to the state value of an emotion parameter reaching a predetermined value, notifies (Notify) the situation-dependent behavior hierarchy 102 of the change 184 of the internal state. The situation-dependent behavior hierarchy 102 calculates an activation level by using the change 184 of the internal state as an input, selects a schema according to the degree of the activation level, and executes the behavior. In this case, the external stimulus 183 recognized by each object of the recognition system is used for managing and updating the internal state in the internal state management unit (ISM) 91, but is not used for calculating the activation level of the schema.
[0071]
(2-2) Schema
FIG. 10 schematically illustrates how the situation-dependent behavior hierarchy 102 is configured by a plurality of schemas 132. The situation-dependent behavior hierarchy 102 prepares a state machine for each behavior description module, that is, for each schema, classifies the recognition results of external information input from a sensor depending on the preceding actions and situation, and expresses the action on the body. Each schema is described as a schema (Schema) 132 having a Monitor function that makes a situation determination according to an external stimulus or an internal state, and an Action function that realizes the state transitions (state machine) accompanying action execution.
[0072]
The context-dependent behavior hierarchy 102b (more strictly, of the context-dependent behavior hierarchy 102, a hierarchy that controls normal context-dependent behavior) is configured as a tree structure in which a plurality of schemas 132 are hierarchically connected. In addition, the behavior control is performed by integrally determining a more optimal schema 132 according to the change of the internal state. The tree 300 includes a plurality of subtrees (or branches) such as a behavior model in which ethological situation-dependent behavior is formalized, a subtree for executing emotional expression, and the like.
[0073]
FIG. 11 schematically shows the tree structure of the schemas in the context-dependent behavior hierarchy 102. As shown in the drawing, the context-dependent behavior hierarchy 102 receives a notification (Notify) of an external stimulus from the short-term storage unit 92, and schemas are arranged in each layer, starting from the root schemas 201₁, 202₁, 203₁, so as to go from abstract action categories to specific action categories. For example, in the layer immediately below the root schemas, schemas 201₂, 202₂, 203₂ named “Investigate”, “Ingestive”, and “Play” are arranged. Below the schema 201₂ “Investigate”, a plurality of schemas 201₃ describing more specific investigative behaviors, such as “Investigative Locomotion”, are arranged. Similarly, below the schema 202₂ “Ingestive”, a plurality of schemas 202₃ describing more specific eating and drinking behaviors, such as “Eat” and “Drink”, are arranged, and below the schema 203₂ “Play”, a plurality of schemas 203₃ describing more specific play behaviors, such as “Play Bowing” and “Play Greeting”, are arranged.
[0074]
As shown, each schema receives an external stimulus 183 and an internal state (change) 184 as inputs. Each schema includes at least a Monitor function and an Action function.
[0075]
Here, the Monitor function is a function that calculates the activation level (Activation Level: AL value) of the schema according to the external stimulus 183 and the internal state 184. When a tree structure as shown in FIG. 11 is configured, an upper (parent) schema can call the Monitor function of a lower (child) schema with the external stimulus 183 and the internal state 184 as arguments, and the child schema returns its activation level as the return value. A schema can also call the Monitor functions of its child schemas in order to calculate its own activation level. Because the activation level from each subtree is returned to the root schema, the optimal schema, that is, the action, can be determined in an integrated manner according to the changes in the external stimulus and the internal state.
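The recursive use of the Monitor function can be illustrated with the following C++ sketch. It is not the class design of the source: the `Schema` class, the stimulus and state fields, the per-leaf evaluator functions, and the rule that a parent reports the maximum of its children's activation levels are all assumptions chosen to keep the example small. The schema names are taken from FIG. 11.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical inputs; the field names are illustrative only.
struct ExternalStimulus { double foodVisible; };
struct InternalState { double hunger; double thirst; double curiosity; };

class Schema {
public:
    using Evaluator = std::function<double(const ExternalStimulus&, const InternalState&)>;

    explicit Schema(std::string name, Evaluator eval = nullptr)
        : name_(std::move(name)), eval_(std::move(eval)) {}

    Schema& addChild(std::unique_ptr<Schema> child) {
        children_.push_back(std::move(child));
        return *children_.back();
    }

    // Monitor: a leaf evaluates its own AL; a parent calls Monitor on each
    // child and (assumption) reports the highest AL among them.
    double monitor(const ExternalStimulus& es, const InternalState& is) const {
        if (children_.empty()) return eval_ ? eval_(es, is) : 0.0;
        return bestChild(es, is).second;
    }

    // Starting here, descend to the child with the highest AL at each layer.
    const Schema& selectLeaf(const ExternalStimulus& es, const InternalState& is) const {
        const Schema* cur = this;
        while (!cur->children_.empty()) cur = cur->bestChild(es, is).first;
        return *cur;
    }

    const std::string& name() const { return name_; }

private:
    std::pair<const Schema*, double> bestChild(const ExternalStimulus& es,
                                               const InternalState& is) const {
        const Schema* best = nullptr;
        double bestAl = 0.0;
        for (const auto& c : children_) {
            double al = c->monitor(es, is);
            if (best == nullptr || al > bestAl) { best = c.get(); bestAl = al; }
        }
        return {best, bestAl};
    }

    std::string name_;
    Evaluator eval_;
    std::vector<std::unique_ptr<Schema>> children_;
};

// Build a small part of the tree in FIG. 11 with made-up AL formulas.
std::unique_ptr<Schema> buildExampleTree() {
    auto root = std::make_unique<Schema>("Root");
    Schema& investigate = root->addChild(std::make_unique<Schema>("Investigate"));
    investigate.addChild(std::make_unique<Schema>("Investigative Locomotion",
        [](const ExternalStimulus&, const InternalState& is) { return is.curiosity; }));
    Schema& ingestive = root->addChild(std::make_unique<Schema>("Ingestive"));
    ingestive.addChild(std::make_unique<Schema>("Eat",
        [](const ExternalStimulus& es, const InternalState& is) { return is.hunger * es.foodVisible; }));
    ingestive.addChild(std::make_unique<Schema>("Drink",
        [](const ExternalStimulus&, const InternalState& is) { return is.thirst; }));
    return root;
}
```

With food visible and a high hunger value, `selectLeaf` called on the root descends through “Ingestive” to “Eat”; with a high curiosity value it descends through “Investigate” instead.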
[0076]
For example, the schema having the highest activation level may be selected, or two or more schemas whose activation levels exceed a predetermined threshold may be selected and executed in parallel (parallel execution, however, assumes that there is no hardware resource conflict between the schemas).
[0077]
The Action function has a state machine that describes the behavior of the schema itself. When the tree structure shown in FIG. 11 is configured, the parent schema can call the Action function to start or interrupt the execution of a child schema. In this embodiment, the Action state machine is not initialized unless it becomes Ready. In other words, the state is not reset even if the action is interrupted, and the schema saves the work data of the action being executed, so that the action can be resumed after an interruption.
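The interrupt-and-resume behavior described above can be sketched in C++ as follows. The source names only the Ready state; the `Active` and `Interrupted` states, the step counter used as work data, and the class name `ActionStateMachine` are assumptions for illustration.

```cpp
#include <cassert>

// Hypothetical states; the source names only "Ready".
enum class ActionState { Ready, Active, Interrupted };

class ActionStateMachine {
public:
    explicit ActionStateMachine(int totalSteps) : totalSteps_(totalSteps) {}

    // Start or resume the action. Work data is initialized only when
    // starting from Ready, so a resumed action continues where it stopped.
    void start() {
        if (state_ == ActionState::Ready) step_ = 0;
        state_ = ActionState::Active;
    }

    // Advance by one step; ignored unless Active.
    // Returns true when the action has just completed (state returns to Ready).
    bool tick() {
        if (state_ != ActionState::Active) return false;
        ++step_;
        if (step_ >= totalSteps_) {
            state_ = ActionState::Ready;
            return true;
        }
        return false;
    }

    // Interrupt the action without resetting its work data.
    void interrupt() {
        if (state_ == ActionState::Active) state_ = ActionState::Interrupted;
    }

    ActionState state() const { return state_; }
    int step() const { return step_; }

private:
    int totalSteps_;
    int step_ = 0;  // work data preserved across interruptions
    ActionState state_ = ActionState::Ready;
};
```

In this sketch, a three-step action that is interrupted after its first step keeps `step() == 1`, and a later `start()` resumes from that point rather than from zero.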
[0078]
FIG. 12 schematically shows a mechanism for controlling normal context-dependent behavior in the context-dependent behavior hierarchy 102.
[0079]
As shown in the figure, the context-dependent behavior hierarchy (SBL) 102 receives an external stimulus 183 as input (Notify) from the short-term storage unit (STM) 92, and receives a change 184 in the internal state from the internal state management unit 91. The context-dependent behavior hierarchy 102 is composed of a plurality of subtrees, such as a behavior model in which ethological situation-dependent behavior is formalized and a subtree for executing emotional expression. In response to the notification (Notify) of the external stimulus 183, the root schema calls the Monitor function of each subtree, performs an integrated action selection by referring to the activation level (AL) values returned, and calls the Action function of the subtree that implements the selected action. The context-dependent behavior determined in the context-dependent behavior hierarchy 102 is applied to the body operation (Motion Controller) after the resource manager RM 116 arbitrates hardware resource competition with the reflex behavior of the reflex behavior unit 103.
[0080]
In addition, within the context-dependent behavior hierarchy 102, the reflex behavior unit 103 performs reflexive, direct body operations in response to the external stimulus 183 recognized by each object of the above-described recognition system, for example, immediately avoiding an obstacle upon detecting it. Therefore, unlike the case of controlling the normal situation-dependent behavior shown in FIG. 11, the plurality of schemas 132 that directly receive signals from each object of the recognition system are not hierarchized but are arranged in parallel, as shown in FIG. 10.
[0081]
FIG. 13 schematically illustrates the configuration of the schemas in the reflex behavior unit 103. As shown in the drawing, the reflex behavior unit 103 has the following schemas arranged at equivalent positions (in parallel): Avoid Big Sound 204, Face to Big Sound 205, and Nodding Sound 209 as schemas that operate in response to recognition results of the auditory system; Face to Moving Object 206 and Avoid Moving Object 207 as schemas that operate in response to recognition results of the visual system; and hand-withdrawing 208 as a schema that operates in response to recognition results of the tactile system.
[0082]
As shown, each schema performing reflexive behavior has an external stimulus 183 as input. Each schema has at least a Monitor function and an Action function. The Monitor function calculates the activation level of the schema according to the external stimulus 183, and determines whether or not to exhibit the corresponding reflex behavior in accordance with the calculated activation level. The Action function includes a state machine (described later) that describes the reflexive behavior of the schema itself, and when called, expresses the relevant reflexive behavior and transitions the state of the Action.
[0083]
FIG. 14 schematically shows a mechanism for controlling the reflex behavior in the reflex behavior unit 103. As also shown in FIG. 13, schemas describing reaction behaviors and schemas describing immediate response behaviors exist in parallel in the reflex behavior unit 103. When a recognition result is input from each object constituting the recognition-related function module 80, the corresponding reflex behavior schema calculates an activation level with its Monitor function, and whether or not to start its Action is determined according to that value. The reflex behavior that the reflex behavior unit 103 decides to start is then applied to the body operation (Motion Controller 173) after the resource manager RM 116 arbitrates hardware resource competition with the context-dependent behavior of the context-dependent behavior hierarchy 102.
[0084]
The schemas that make up the situation-dependent behavior hierarchy 102 and the reflex behavior unit 103 can be described, for example, as “class objects” written in the C++ language. FIG. 15 schematically shows the class definitions of the schemas used in the context-dependent behavior hierarchy 102. Each block shown in the figure corresponds to one class object.
[0085]
As illustrated, the context-dependent behavior hierarchy (SBL) 102 includes one or more schemas, an Event Data Handler (EDH) 211 that assigns IDs to input / output events of the SBL 102, a Schema Handler (SH) 212 that manages the schemas in the SBL 102, one or more Receive Data Handlers (RDH) 213 that receive data from external objects (STM, LTM, resource manager, recognition-related objects, etc.), and one or more Send Data Handlers (SDH) 214 that transmit data to external objects.
[0086]
The Schema Handler 212 stores information such as the schemas and tree structures (SBL configuration information) constituting the context-dependent behavior hierarchy (SBL) 102 and the reflex behavior unit 103 as a file. For example, when the system is started, the Schema Handler 212 reads this configuration information file, constructs (reproduces) the schema configuration of the situation-dependent behavior hierarchy 102 shown in FIG. 11, and maps the entity of each schema onto the memory space.
[0087]
Each schema has an OpenR_Guest 215 positioned as the base of the schema. The OpenR_Guest 215 includes one or more class objects DSubject 216 for transmitting data to the outside of the schema and one or more class objects DObject 217 for receiving data from the outside of the schema. For example, when the schema sends data to an object external to the SBL 102 (STM, LTM, each object of the recognition system, etc.), the DSubject 216 writes the transmission data to the Send Data Handler 214. The DObject 217 can read data received from an object external to the SBL 102 from the Receive Data Handler 213.
[0088]
The Schema Manager Base 218 and the Schema Base 219 are both class objects that inherit from the OpenR_Guest 215. Class inheritance means taking over the definition of the original class. In this case, it means that the class objects defined in the OpenR_Guest 215, such as the DSubject 216 and the DObject 217, are also provided in the Schema Manager Base 218 and the Schema Base 219 (the same applies hereinafter). For example, when a plurality of schemas form a tree structure as shown in FIG. 11, the Schema Manager Base 218 has a class object Schema List 220 that manages a list of child schemas (it holds pointers to the child schemas) and can call the functions of the child schemas. The Schema Base 219 has a pointer to the parent schema and can return the return value of a function called from the parent schema.
[0089]
The Schema Base 219 has two class objects, State Machine 221 and Pronome 222. The State Machine 221 manages a state machine for the action (Action function) of the schema. The parent schema can switch the state (cause a state transition) of the state machine of the Action function of a child schema. The target on which the schema executes or to which it applies the action (Action function) is assigned to the Pronome 222. As will be described later, the schema is occupied by the target assigned to the Pronome 222, and the schema is not released until the action (operation) ends (completion, abnormal termination, etc.). To execute the same action for a new target, a schema of the same class definition is created in the memory space. As a result, the same schema can be executed independently for each target (the work data of the individual schemas do not interfere with each other), and the Reentrance property of the behavior is secured (described later).
[0090]
The Parent Schema Base 223 is a class object that multiply inherits from Schema Manager Base 218 and Schema Base 219, and manages the parent schema and child schemas of the schema itself, that is, the parent-child relationships in the tree structure of the schemas.
[0091]
The Intermediate Parent Schema Base 224 is a class object that inherits the Parent Schema Base 223, and implements interface conversion for each class. Further, the Intermediate Parent Schema Base 224 has a Schema Status Info 225. The Schema Status Info 225 is a class object that manages a state machine of the schema itself. The parent schema can switch the state of its state machine by calling the Action function of the child schema. In addition, a Monitor function of the child schema can be called to inquire about an activation level according to the normal state of the state machine. However, it should be noted that the state machine of the schema is different from the state machine of the Action function described above.
[0092]
The And Parent Schema 226, Num Or Parent Schema 227, and Or Parent Schema 228 are class objects that inherit from the Intermediate Parent Schema Base 224. The And Parent Schema 226 has pointers to a plurality of child schemas to be executed simultaneously. The Or Parent Schema 228 has pointers to a plurality of child schemas to be executed alternatively. The Num Or Parent Schema 227 has pointers to a plurality of child schemas of which only a predetermined number are executed at the same time.
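The three parent types above differ only in how many children may run at once. The following Python sketch illustrates that difference; the function names, the (name, activation level) tuples, and the rule of ranking children by activation level are assumptions made for illustration and are not taken from the specification.

```python
from typing import List, Tuple

# (child schema name, activation level); a hypothetical representation.
Child = Tuple[str, float]


def and_parent(children: List[Child]) -> List[str]:
    """And Parent Schema: every child is executed simultaneously."""
    return [name for name, _ in children]


def num_or_parent(children: List[Child], n: int) -> List[str]:
    """Num Or Parent Schema: at most n children run at the same time.

    Assumption: children are ranked by activation level, highest first.
    """
    ranked = sorted(children, key=lambda c: c[1], reverse=True)
    return [name for name, _ in ranked[:n]]


def or_parent(children: List[Child]) -> List[str]:
    """Or Parent Schema: children are alternatives, so only one runs."""
    return num_or_parent(children, 1)


children = [("walk", 20.0), ("track", 55.0), ("speak", 40.0)]
print(and_parent(children))
print(or_parent(children))
print(num_or_parent(children, 2))
```

In this sketch the Or case is simply the Num Or case with a limit of one, which mirrors the way the three classes share a common base.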
[0093]
The Parent Schema 229 is a class object that multiply inherits from the And Parent Schema 226, Num Or Parent Schema 227, and Or Parent Schema 228.
[0094]
FIG. 16 schematically shows a functional configuration of the classes in the context-dependent behavior hierarchy (SBL) 102. The context-dependent behavior hierarchy (SBL) 102 includes one or more Receive Data Handlers (RDH) 213 for receiving data from external objects such as the STM, LTM, resource manager, and recognition system objects, and one or more Send Data Handlers (SDH) 214 for transmitting data to the external objects.
[0095]
An Event Data Handler (EDH) 211 is a class object for allocating an ID to an input / output event of the SBL 102, and receives a notification of the input / output event from the RDH 213 or the SDH 214.
[0096]
The Schema Handler 212 is a class object for managing the schema 132, and stores configuration information of the schema configuring the SBL 102 as a file. For example, when the system is started, the Schema Handler 212 reads the configuration information file and constructs a schema configuration in the SBL 102.
[0097]
Each schema is generated according to the class definition shown in FIG. 15, and its entity is mapped onto the memory space. Each schema has an OpenR_Guest 215 as a base class object, and includes class objects such as DSubject 216 and DObject 217 for accessing data externally.
[0098]
The functions and state machines that the schema 132 mainly has are described below. The following functions are described in Schema Base 219.
ActivationMonitor (): Evaluation function for making the schema Active when the schema is Ready
Actions (): Execution state machine at the time of Active
Goal (): a function that evaluates whether the schema has reached Goal during Active
Fail (): Function for determining whether the schema is in the Fail state during Active
SleepActions (): State machine executed before Sleep
SleepMonitor (): Evaluation function for resuming at the time of sleep
ResumeActions (): State machine executed before the schema resumes
DestroyMonitor (): Evaluation function for determining whether the schema is in the fail state at the time of sleep
MakePronome (): a function that determines the target of the whole tree
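The functions listed above can be read as hooks around a Ready / Active / Sleep life cycle. The Python skeleton below is a minimal sketch of that reading. The method names follow the list, but the class layout, the placeholder bodies, and the exact transitions (for example, returning to Ready after Goal or Fail) are assumptions, and MakePronome () is omitted.

```python
from enum import Enum


class State(Enum):
    READY = "Ready"
    ACTIVE = "Active"
    SLEEP = "Sleep"


class Schema:
    """Skeleton schema; every body is a placeholder for illustration."""

    def __init__(self) -> None:
        self.state = State.READY
        self.log = []

    # Evaluation functions: return True when the condition holds.
    def activation_monitor(self) -> bool: return True   # Ready: go Active?
    def goal(self) -> bool: return False                 # Active: Goal reached?
    def fail(self) -> bool: return False                 # Active: failed?
    def sleep_monitor(self) -> bool: return True         # Sleep: resume?
    def destroy_monitor(self) -> bool: return False      # Sleep: failed?

    # State machines: here they only record that they were called.
    def actions(self) -> None: self.log.append("Actions")
    def sleep_actions(self) -> None: self.log.append("SleepActions")
    def resume_actions(self) -> None: self.log.append("ResumeActions")

    def step(self) -> None:
        """Advance one cycle while Ready or Active."""
        if self.state is State.READY:
            if self.activation_monitor():
                self.state = State.ACTIVE
        elif self.state is State.ACTIVE:
            if self.goal() or self.fail():
                self.state = State.READY  # assumed transition
            else:
                self.actions()

    def sleep(self) -> None:
        """Suspend an Active schema; SleepActions runs first."""
        if self.state is State.ACTIVE:
            self.sleep_actions()
            self.state = State.SLEEP

    def resume(self) -> None:
        """Resume a sleeping schema; ResumeActions runs first."""
        if self.state is State.SLEEP:
            if self.destroy_monitor():
                self.state = State.READY  # assumed transition
            elif self.sleep_monitor():
                self.resume_actions()
                self.state = State.ACTIVE
```

A concrete schema would override the evaluation functions with real conditions and the state machines with real motion or speech output.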
[0099]
(2-3) Situation-dependent behavior hierarchy function
The situation-dependent behavior hierarchy (SBL) 102 controls actions suited to the situation in which the robot apparatus 1 is currently placed, based on the storage contents of the short-term storage unit 92 and the long-term storage unit 93 and on the internal state managed by the internal state management unit 91.
[0100]
As described in the previous section, the situation-dependent behavior hierarchy 102 in the present embodiment is configured by a tree structure of a schema (see FIG. 11). Each schema keeps its independence, knowing its child and parent information. With such a schema configuration, the situation-dependent behavior hierarchy 102 has the main features of Concurrent evaluation, Concurrent execution, Preemption, and Reentrant. Hereinafter, these features will be described in detail.
[0101]
(2-3-1) Concurrent evaluation:
It has already been described that the schema as the action description module has a Monitor function for making a situation judgment according to an external stimulus or a change in the internal state. The Monitor function is implemented by the schema having a Monitor function in the class object Schema Base. The Monitor function is a function that calculates the activation level of the schema according to the external stimulus and the internal state.
[0102]
When the tree structure shown in FIG. 11 is configured, the upper (parent) schema can call the Monitor function of the lower (child) schema with the external stimulus 183 and the change 184 of the internal state as arguments, and the child schema returns its activation level as the return value. A schema can also call the Monitor functions of its own child schemas in order to calculate its activation level. Because the activation level from each subtree is returned to the root schemas 201₁ to 203₁, the optimal schema, that is, the operation suited to the external stimulus 183 and the change 184 of the internal state, can be determined in an integrated manner.
[0103]
Because of the tree structure, the evaluation of each schema against the external stimulus 183 and the change 184 of the internal state is first performed concurrently from the bottom of the tree structure toward the top. That is, when a schema has child schemas, it calls the Monitor function of the selected child and then executes its own Monitor function. Next, an execution permission, as the result of the evaluation, is passed from the top of the tree structure toward the bottom. Evaluation and execution are performed while resolving contention for the resources used by the operation.
[0104]
The situation-dependent behavior hierarchy 102 in the present embodiment can evaluate behaviors in parallel using the tree structure of the schemas, and is therefore adaptive to situations such as the external stimulus 183 and the change 184 of the internal state. In addition, evaluation is performed over the entire tree, and the tree is changed according to the activation level (AL) values calculated at that time. Therefore, the schema, that is, the operation to be executed, can be prioritized dynamically.
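The bottom-up evaluation described in this subsection can be sketched as a recursive call of Monitor over the tree. In the Python sketch below, the class layout, the numeric stimulus and internal-state arguments, and the rule that a parent reports the highest activation level found in its subtree are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Schema:
    name: str
    # Leaf evaluation: (external stimulus, internal state change) -> AL.
    leaf_al: Callable[[float, float], float]
    children: List["Schema"] = field(default_factory=list)

    def monitor(self, stimulus: float, internal: float) -> Tuple[float, str]:
        """Return (activation level, name of the best leaf) for this subtree."""
        if not self.children:
            return self.leaf_al(stimulus, internal), self.name
        # Children are evaluated first; the parent then reports the best one.
        return max(child.monitor(stimulus, internal) for child in self.children)


kick = Schema("kick", lambda s, i: 10.0 * s + i)
rest = Schema("rest", lambda s, i: -i)
root = Schema("root", lambda s, i: 0.0, [kick, rest])

print(root.monitor(1.0, 5.0))
print(root.monitor(0.0, -20.0))
```

Because every call re-evaluates the whole subtree, the schema that wins can change from one evaluation to the next as the stimulus or internal state changes.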
[0105]
(2-3-2) Concurrent execution:
Since the activation level from each subtree is returned to the root schema, the optimal schema, that is, the operation suited to the external stimulus 183 and the change 184 of the internal state, can be determined in an integrated manner. For example, the schema having the highest activation level may be selected, or two or more schemas whose activation levels exceed a predetermined threshold may be selected and executed in parallel (provided that, when executing in parallel, there is no hardware resource conflict between the schemas).
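The threshold-based parallel selection described above can be sketched as a greedy pass over the candidates. The Python example below is only an illustration: the data layout, the resource names, and the policy of giving contested resources to the schema with the higher activation level are assumptions.

```python
from typing import Dict, List, Set, Tuple


def select_concurrent(
    candidates: Dict[str, Tuple[float, Set[str]]], threshold: float
) -> List[str]:
    """Pick schemas whose AL exceeds the threshold and whose resources are free.

    candidates maps a schema name to (activation level, required resources).
    """
    selected: List[str] = []
    used: Set[str] = set()
    ranked = sorted(candidates.items(), key=lambda kv: kv[1][0], reverse=True)
    for name, (al, resources) in ranked:
        if al > threshold and not (resources & used):
            selected.append(name)
            used |= resources
    return selected


candidates = {
    "kick": (70.0, {"legs"}),
    "talk": (60.0, {"speaker"}),
    "walk": (50.0, {"legs"}),   # conflicts with "kick" over the legs
    "blink": (10.0, {"led"}),   # below the threshold
}
print(select_concurrent(candidates, threshold=40.0))
```

Setting the threshold very high and keeping only the first result reduces this to the single highest-activation selection.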
[0106]
A schema that has been granted execution permission is executed. That is, the schema actually observes the external stimulus 183 and the change 184 of the internal state in more detail and executes the command. Execution proceeds sequentially, that is, concurrently, from the top of the tree structure toward the bottom. In other words, if the schema has a child schema, the Actions function of the child is executed.
[0107]
The Action function includes a state machine that describes an action (operation) of the schema itself. When the tree structure shown in FIG. 11 is configured, the parent schema can call the Action function to start or interrupt the execution of the child schema.
[0108]
The context-dependent behavior hierarchy (SBL) 102 according to the present embodiment can, by using the tree structure of the schemas, execute another schema at the same time using the surplus resources when resources do not conflict. However, unless the resources used up to the Goal are restricted, inconsistent behavior may occur. The context-dependent behavior determined in the context-dependent behavior hierarchy 102 is applied to the body operation (Motion Controller) after the resource manager arbitrates the competition for hardware resources with the reflex behavior of the reflex behavior unit (Reflexive SBL) 103.
[0109]
(2-3-3) Preemption:
Even if a schema has been executed once, if there is a more important (higher priority) action, the schema must be interrupted and the execution right must be given to it. In addition, when more important actions are completed (completed or stopped), it is necessary to resume the original schema and continue the execution.
[0110]
Executing a task according to such a priority is similar to a function called Preemption of an OS (Operating System) in the computer world. The OS has a policy of sequentially executing tasks with higher priorities at a timing when the schedule is considered.
[0111]
On the other hand, since the control system 10 of the robot apparatus 1 according to the present embodiment extends over a plurality of objects, arbitration between the objects is required. For example, the reflex behavior unit 103, which is an object for controlling reflex behavior, needs to avoid an obstacle or keep its balance without regard to the behavior evaluation of the context-dependent behavior hierarchy 102, which is an object for controlling higher-level context-dependent behavior. This means that the execution right is actually taken away and the reflex behavior is executed. However, the upper-level behavior description module (SBL) is notified that the execution right has been taken away, and by performing the corresponding processing it retains its preemptive capability.
[0112]
Further, it is assumed that, as a result of the evaluation of the activation level based on the external stimulus 183 and the change 184 of the internal state in the context-dependent behavior layer 102, the execution permission is given to a certain schema. Further, it is assumed that the evaluation of the activation level based on the external stimulus 183 and the change 184 of the internal state thereafter has made the importance of another schema higher. In such a case, by switching to the sleep state using the Actions function of the schema being executed, the preemptive behavior can be switched.
[0113]
The state of Actions () of the running schema is saved, and Actions () of a different schema is executed. After Actions () of the different schema is completed, Actions () of the suspended schema can be executed again.
[0114]
Also, Actions () of the schema being executed is interrupted, and SleepActions () is executed before the execution right is transferred to a different schema. For example, if the robot apparatus 1 finds a soccer ball during a conversation, it can say, "Wait a moment" and play soccer.
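The suspend-and-resume sequence described in this subsection can be sketched with a stack of sleeping schemas. In the Python example below, the Executor class, the integer progress counter that stands in for the saved state of Actions (), and the last-in-first-out resume order are assumptions for illustration.

```python
from typing import List, Optional


class Schema:
    def __init__(self, name: str) -> None:
        self.name = name
        self.progress = 0  # stands in for the saved state of Actions()

    def actions(self) -> str:
        self.progress += 1
        return f"{self.name}: step {self.progress}"

    def sleep_actions(self) -> str:
        return f"{self.name}: Wait a moment"


class Executor:
    """Hypothetical holder of the execution right."""

    def __init__(self) -> None:
        self.running: Optional[Schema] = None
        self.sleeping: List[Schema] = []
        self.log: List[str] = []

    def start(self, schema: Schema) -> None:
        """Give the execution right to schema, suspending the current one."""
        if self.running is not None:
            self.log.append(self.running.sleep_actions())
            self.sleeping.append(self.running)
        self.running = schema

    def tick(self) -> None:
        if self.running is not None:
            self.log.append(self.running.actions())

    def finish(self) -> None:
        """The running schema ends; the most recently suspended one resumes."""
        self.running = self.sleeping.pop() if self.sleeping else None


ex = Executor()
dialog, soccer = Schema("Dialog"), Schema("Soccer")
ex.start(dialog)
ex.tick()
ex.start(soccer)   # the ball is found during the conversation
ex.tick()
ex.finish()        # soccer ends, so the dialogue continues where it stopped
ex.tick()
print(ex.log)
```

The dialogue continues from step 2 rather than restarting, because its state was kept while it slept.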
[0115]
(2-3-4) Reentrant:
Each schema constituting the context-dependent behavior hierarchy 102 is a kind of subroutine. When a schema is called from a plurality of parents, it is necessary to have a storage space corresponding to each parent in order to store its internal state.
[0116]
This is similar to the reentrant property of the OS in the computer world, and is referred to as a schema reentrant property in this specification. As shown in FIG. 16, the schema 132 is composed of class objects, and the entity, that is, an instance of the class object is generated for each target (Pronome), thereby realizing the reentrant property.
[0117]
The reentrancy of the schema will be described more specifically with reference to FIG. 17. The Schema Handler 212 is a class object for managing schemas, and stores the configuration information of the schemas constituting the SBL 102 as a file. When the system is started, the Schema Handler 212 reads this configuration information file and constructs the schema configuration in the SBL 102. In the example illustrated in FIG. 17, it is assumed that the entities of schemas that define actions (operations) such as Eat 221 and Dialog 222 are mapped onto the memory space.
[0118]
Here, suppose that, as a result of evaluating the activation level based on the external stimulus 183 and the change 184 of the internal state, a target (Pronome) A is set for the schema Dialog 222, and the Dialog 222 carries out a dialogue with the person A.
[0119]
Suppose that the person B then interrupts the dialogue between the robot apparatus 1 and the person A, and that the subsequent evaluation of the activation level based on the external stimulus 183 and the change 184 of the internal state gives a higher priority to the schema for conducting a dialogue with B.
[0120]
In such a case, the Schema Handler 212 maps another Dialog entity (instance) inheriting a class for interacting with B on the memory space. Since the conversation with B is performed using another Dialog entity independently of the previous Dialog entity, the contents of the conversation with A are not destroyed. Therefore, DialogA can maintain data consistency, and when the conversation with B ends, the dialogue with A can be resumed from the point of interruption.
[0121]
A schema in the Ready list is evaluated according to its object (external stimulus 183), that is, its activation level is calculated, and the execution right is handed over. After that, an instance of that schema is generated in the Ready list, and the other objects are evaluated. In this way, the same schema can be placed in the active or sleep state.
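The per-target instantiation that gives a schema its reentrant property can be sketched as a registry keyed by schema class and Pronome. The Python example below is a minimal sketch; the SchemaHandler method name and the list used as conversation state are assumptions for illustration.

```python
from typing import Dict, List, Tuple


class Dialog:
    """A schema whose working data (the conversation so far) is per target."""

    def __init__(self, pronome: str) -> None:
        self.pronome = pronome
        self.history: List[str] = []

    def actions(self, utterance: str) -> None:
        self.history.append(utterance)


class SchemaHandler:
    """Hypothetical registry: one schema instance per (class, Pronome)."""

    def __init__(self) -> None:
        self.instances: Dict[Tuple[str, str], Dialog] = {}

    def instance_for(self, schema_cls: type, pronome: str) -> Dialog:
        key = (schema_cls.__name__, pronome)
        if key not in self.instances:
            self.instances[key] = schema_cls(pronome)
        return self.instances[key]


handler = SchemaHandler()
dialog_a = handler.instance_for(Dialog, "A")
dialog_a.actions("Hello, A")
dialog_b = handler.instance_for(Dialog, "B")   # B interrupts
dialog_b.actions("Hello, B")
# Going back to A finds the earlier instance with its history intact.
resumed = handler.instance_for(Dialog, "A")
print(resumed.history)
```

Because each instance owns its own history, the conversation with B cannot overwrite the conversation with A.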
[0122]
(3) Application of the present invention to a robot device
Next, the robot apparatus according to the present embodiment, which has a negative desire such as "I do not want to do it" and can express that desire, will be described in detail. The robot apparatus according to the present embodiment selects an optimal action from its own internal state and the external state. As the internal state used in this action selection, it has not only a positive desire value such as "I want to do it" but also a negative desire value such as "I do not want to do it", and it determines on this basis whether or not to perform a specified action. A conventional robot apparatus expressed only the actions it wanted to perform and did not take into account the negative desire (negative internal state) of not wanting to perform an action. By having a negative desire value as an internal state, the robot apparatus of the present embodiment offers a wider variety of expressed actions than such a conventional apparatus; for example, it does not always act as instructed.
[0123]
(3-1) Situation-based behavior layer (SBL)
As described above, the robot apparatus has the SBL as an algorithm for determining actions in consideration of information inside and outside the robot apparatus. In the SBL, a plurality of action description modules (schemas), each having an independent meaning and function, are arranged in a tree structure, ranging from abstract units such as dancing to units with a concrete meaning such as actually outputting a motion command that instructs a body motion, for example the rotation angle of an actuator. Each schema contains a state machine that describes a sequence of actions. While making state transitions, it uses short-term memory information obtained from sensor information about the external environment, internal state information obtained by evaluating the robot's own body information, and long-term memory information obtained by associating and storing past short-term memories, experiences of the internal state, and the like. With this information it judges the situation according to external stimuli and changes in the internal state, and generates (selects) an action.
[0124]
Usually, each schema calculates an execution priority (activation level) indicating how much it wants to be executed. The calculation uses external information (external stimuli) input from external input devices (the state recognition unit) such as various sensors, and internal information of the robot device (the internal state parameters and emotion parameters obtained from the emotion and instinct model), that is, the degree of satisfaction of the robot's primary emotion (instinct) and the value of the secondary emotion (emotion) that changes with that degree of satisfaction. The schema to be executed is then determined (selected). In this way the robot can autonomously determine which action to actually perform in accordance with the external input (external stimulus) and the internal state, and execute that action using the robot body or expression means such as a speaker or LED.
[0125]
Such an activation level is obtained from two values: ReleaseValue (RV), a value indicating a first desire, namely whether or not the robot device can express the action in the current situation (whether or not it is able to do it), and a desire value (MotionValue: MV), a value indicating a second desire, namely whether or not the robot apparatus itself wants to do it.
[0126]
Release Value (RV) is calculated, for example by addition, from values based on stimuli from the outside, on physical external information when an object is present (the existence of the object, the distance to the object, the color and shape of the object, etc.), and on the contents of each storage unit. For example, for the schema that kicks the ball, if the ball cannot be recognized by the camera or the like at that time, it is determined that the action cannot be expressed, and the value decreases.
[0127]
The desire value MV is calculated based on the internal state of the robot, that is, the instinct (desire) values and the emotion values calculated in the instinct/emotion model. If the battery is sufficiently charged or a ball of a favorite color is found, the desire to kick the ball increases, and the value increases. As described above, the emotion model of the robot apparatus holds, for each of six emotions, "joy", "sadness", "anger", "surprise", "disgust", and "fear", a parameter indicating the intensity of that emotion. The instinct model holds, for each of four mutually independent desires, "exercise", "affection", "appetite", and "curiosity", a parameter indicating the strength of that desire. The desire value MV is calculated based on these values. The internal state management unit 91 receives external stimuli and information such as the remaining charge of the robot's own battery and the rotation angles of its motors, and calculates and manages the values (internal state vector) corresponding to the plurality of internal states described above.
[0128]
(3-2) Operation of robot device and its operation
When a schema is selected, the action (schema) to be executed is chosen according to the activation level of each schema; each schema calculates its own activation level from external information and the internal state, and the robot autonomously expresses an action on this basis. On the other hand, the robot device is also configured so that, when the user tells it to "do such-and-such", the corresponding schema is executed.
[0129]
For example, when following such a user instruction, the robot apparatus is configured so that the activation level of the schema corresponding to the instruction is forcibly raised (a value is added to it), for example by the DerivativeSBL described later, so that the corresponding schema is easily selected. With such a method, therefore, the robot device performs the action of the specified schema regardless of the activation level that each schema of the robot device has itself calculated from the external information and the internal state.
[0130]
Therefore, in this embodiment, even in such a case, variation is given to the actions expressed by the robot device by introducing a strategy that takes into account the activation level calculated by each schema of the robot device itself.
[0131]
When the activation level is determined, usually only a positive desire value is considered as the desire value in the emotion/instinct model described above, and the activation level is calculated according to the magnitude of that positive desire value; the activation level thus indicates how much the robot wants to perform each action. Then, for example, the schema that is most wanted, that is, the schema with the largest activation level, is selected. When the user specifies a schema, for example, either the specified schema is selected regardless of the activation level, or a predetermined value is added to the activation level of the specified schema so that it ranks higher in the selection, thereby forcing the robot to perform the action specified by the user.
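The three selection modes described in this paragraph (autonomous, forced, and biased by an added value) can be sketched as follows. The function name, the dictionary of activation levels, and the numbers used are assumptions for illustration only.

```python
from typing import Dict, Optional


def select_schema(
    levels: Dict[str, float],
    commanded: Optional[str] = None,
    bonus: Optional[float] = None,
) -> str:
    """Return the name of the schema to execute.

    No command: the schema with the largest activation level wins.
    Command without bonus: the commanded schema is selected regardless of AL.
    Command with bonus: the bonus is added to the commanded schema's AL,
    and the largest adjusted AL wins.
    """
    if commanded is None:
        return max(levels, key=levels.get)
    if bonus is None:
        return commanded
    adjusted = {**levels, commanded: levels[commanded] + bonus}
    return max(adjusted, key=adjusted.get)


levels = {"kick": 20.0, "sleep": 30.0, "dance": 10.0}
print(select_schema(levels))                              # autonomous
print(select_schema(levels, commanded="kick"))            # forced
print(select_schema(levels, commanded="dance", bonus=25.0))  # biased
```

With the biased mode, a small bonus may still leave the commanded schema below a competitor, which is the difference between the two command-handling methods.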
[0132]
On the other hand, in the present embodiment, the robot device is given, depending on the internal state, a negative desire value MV indicating, for example, "I do not want to do it". Consequently, the activation level obtained from the negative desire value MV and ReleaseValue (RV) may also be negative. If the activation level can indicate not only a positive desire for an action but also a negative desire such as not wanting to perform it, and if this can be reflected in the behavior, the behavior becomes more human-like.
[0133]
For example, when the robot apparatus is instructed from the outside to perform a predetermined action, that is, when it is instructed to act forcibly regardless of the activation level it has calculated itself, and the activation level of the specified schema is negative, the robot apparatus can express an action telling the user that it does not want to perform that action, and can furthermore be given a function of rejecting the instruction, that is, of not expressing the specified action. In addition, when the activation level is negative, the robot can vary its reply according to the magnitude of the value, for example "absolutely not", "I do not really want to", or "I do not feel like it".
[0134]
Further, even if the activation level is negative, the robot can return a more human-like reaction, such as saying "it can't be helped" and executing the action if the request is made repeatedly. Similarly, when the activation level is positive, the response can be varied according to the magnitude of the activation level.
[0135]
Furthermore, if an action corresponding to a positive desire and an action that negates this are described in each schema, this schema is selected when the negative desire indicated by the activation level falls below a predetermined threshold. Then, if the action corresponding to the negative desire is output, the robot apparatus can inform the user of the current state of having the negative desire.
[0136]
As described above, by making the activation level indicate not only a positive desire but also a negative desire, the robot apparatus gains more variation in behavior and exhibits more human-like behavior. Here, as an example of the variation gained by having such a negative desire, a case will be described specifically in which the user says "Kick the ball!" as the predetermined external stimulus and the schema for kicking the ball is selected. FIG. 18 and FIG. 20 are schematic diagrams showing the activation levels of the schemas in the schema tree when the robot device does not express the action according to the external instruction and when it does, respectively.
[0137]
First, the case where the activation level of the specified schema is negative and the robot apparatus itself (schema) does not want to perform the action will be described. When the robot apparatus is instructed by the user to "kick the ball!", For example, a speech interpretation schema described later for interpreting a command from the user is activated and its meaning is grasped. It is assumed that a schema for kicking a ball (schema A) is activated as shown in FIG. At this time, the schema A calculates its own activation level, uses the calculated activation level to grasp how much it wants to do, and determines whether or not to follow the instruction according to the value.
[0138]
Here, as described above, the activation level of the schema A that kicks the ball can be obtained from the desire value MV, which is determined from the current values of the one or more internal states defined in the schema A, and the Release Value (RV), which is determined from the external stimulus, for example whether the ball is recognized.
[0139]
In calculating the desire value MV, for example, a function as shown in FIG. 19 can be used. FIG. 19 is a graph showing an example of the relationship between the internal state and the desire value MV. The desire value MV, which is one element for calculating the activation level AL, is obtained as a desire value vector InsV (Instinct Variable) corresponding to the internal states defined in each schema. For example, in the schema that outputs the action "kick the ball", an internal state vector IntV {IntV_NOURISHMENT "nutrition state", IntV_FATIGUE "fatigue"} is defined, and from it a desire value vector InsV {InsV_NOURISHMENT, InsV_FATIGUE} is determined. For the desire value MV, a predetermined function can be prepared and used for each internal state and the action associated with it. Examples are a function such that the larger the value of the internal state "nutrition state", the larger the desire value MV for the action "kick the ball", and a function such that the larger the value of the internal state "fatigue", the smaller the desire value MV for the action "kick the ball", with the desire value MV becoming negative when the value of "fatigue" exceeds a predetermined value.
[0140]
Specifically, a function as shown in the following equation (1) and FIG. 19 is given. FIG. 19 shows the relationship between the internal state and the desire value MV represented by the following equation (1), with each component of the internal state vector IntV on the horizontal axis and each component of the desire value vector InsV on the vertical axis.
[0141]
(Equation 1)

[0142]
The desire value vector InsV is determined only by the value of the internal state vector IntV, as shown in the above equation (1) and FIG. 19. Here, a function is shown in which the magnitude of the internal state is 0 to 100 and the magnitude of the desire value at that time is -1 to 1. For example, when the internal state is from 0 to 80, the function is a positive increasing function. When the internal state is 80, the desire value MV is 0, and when the internal state is satisfied, the desire value MV becomes a negative decreasing function. By setting such an internal state-desire value curve L1, the robot device has a desire value MV that works to keep the internal state at 80% at all times.
[0143]
By variously changing the constants A to F in the above equation (1), a different desire value MV can be obtained for each internal state. For example, the desire value MV may change from 1 to 0 as the internal state goes from 0 to 100, or an internal state-desire value function different from the above equation (1) may be prepared for each internal state. The case where a negative desire value MV occurs when the internal state exceeds 80% has been described here. Depending on the internal state, however, a function may be set such that the desire value MV never becomes negative, or a function may be set such that the desire value MV is always negative. For example, if the internal state "nutrition state" of the robot apparatus is based on the remaining battery charge and it is preferable for the battery to be always fully charged, a function may be set such that the desire value MV does not become negative.
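As an illustration of an internal state-desire value function with a set point, the Python sketch below uses a simple piecewise-linear curve. It is not equation (1), and the constants A to F are not used. It only reproduces the anchor values stated in the text (a desire value of 0 at an internal state of 80 and of −1 at 100, within the range −1 to 1); the linear shape between those points and the function and parameter names are assumptions.

```python
def desire_value(internal_state: float, set_point: float = 80.0) -> float:
    """Hypothetical desire value MV in [-1, 1] for an internal state in [0, 100].

    MV is positive below the set point, 0 at the set point, and negative
    above it, reaching -1 when the internal state is 100.
    """
    if not 0.0 <= internal_state <= 100.0:
        raise ValueError("internal_state must be between 0 and 100")
    if internal_state <= set_point:
        return 1.0 - internal_state / set_point
    return -(internal_state - set_point) / (100.0 - set_point)


for state in (0.0, 40.0, 80.0, 90.0, 100.0):
    print(state, desire_value(state))
```

A function that never becomes negative, as suggested for a battery-based "nutrition state", could be obtained in this sketch by clamping the result at 0.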
[0144]
Here, in the case of the function shown in FIG. 19, if the internal state is 100, the desire value MV is negative (−1), and the activation level calculated from this negative desire value and ReleaseValue (RV) is, with high probability, a negative value as shown in FIG. 18. Even in such a case, when the user instructs the robot to execute the schema A, the schema A is selected, for example, regardless of the activation level.
[0145]
As described above, when the schema A having the negative activation level is specified, the schema A outputs a compensation action that informs the user that he or she does not want to exhibit the action, instead of the action described in the schema A itself.
[0146]
The same applies when a predetermined value is added to the activation level of the specified schema by the speech interpretation schema of the DerivativeSBL described later, and the activation level of the schema A after the addition is still a negative value as shown in FIG. 18. The schema A specified by the speech interpretation schema is first selected without being compared with the activation levels of the other schemas. However, since the activation level of the schema A is negative, the schema A, as described above, does not output the action described in itself but instead outputs a compensation action notifying the user that it does not want to perform that action.
[0147]
In FIG. 18, the schema A has an activation level (AL) of −30 and therefore has a negative desire. It generates a voice such as "Eh, no, no" that notifies the user of its refusal, thereby expressing that it does not want to perform the action.
[0148]
As described above, the action whose activation level, calculated from the external stimulus and the change in the internal state, is high is normally selected and expressed. When a mechanism is introduced in which an action is selected forcibly, for example by an instruction from the user, regardless of the activation level, the selected schema calculates its own activation level from the external stimulus and the change in the internal state. In this embodiment, if this value is negative, the schema judges that it indicates a desire not to perform the action (a negative desire), and expresses a compensation action according to the negative value.
[0149]
A plurality of such compensation actions can be prepared according to the negative value of the activation level. When the activation level is, for example, −50, a negative expression such as "I do not want to do it" is made, and the action described in the schema A specified by the user is not output. When the activation level is, for example, −30, the action described in the schema A specified by the user is output while the robot makes a reluctant gesture. In this way, the behavior can be selected according to the activation level.
[0150]
To perform different operations according to the activation level in this way, a plurality of schemas that function as compensation for the action described in schema A are prepared according to the magnitude of the negative activation level. The compensation schema corresponding to the activation level can then be called to output the compensation action described in it.
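The mapping from a negative activation level to a compensation action can be sketched as follows. This is only an illustration, not the implementation of the embodiment: the names `Response` and `respond_to_forced_selection` and the exact boundary at −50 are assumptions. Only the two sample points (at −50 the action is refused, at −30 it is performed with a gesture of dislike) come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Response:
    expression: str       # what the robot expresses (voice, gesture)
    perform_action: bool  # whether the commanded action itself is output


def respond_to_forced_selection(al: float) -> Response:
    """Choose a response for a schema that was selected by command.

    The band boundaries are assumed for illustration only.
    """
    if al >= 0:
        # Positive desire: perform the action with an affirmative reply.
        return Response("affirmative reply", True)
    if al > -50:
        # Mildly negative desire: perform the action, but show dislike.
        return Response("gesture of dislike", True)
    # Strongly negative desire: refuse and do not output the action.
    return Response("I do not want to do it", False)
```

In this sketch each band stands in for one compensation schema; a finer division of the negative range would simply add more bands.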
[0151]
On the other hand, in FIG. 20, if the internal state is lower than 80, the desire value MV takes a large positive value in an attempt to raise the internal state. If the desire value MV is positive, the activation level of schema A, calculated from the desire value MV and the ReleaseValue (RV), is positive with high probability, as described above. When the activation level is positive at the time schema A is selected by a predetermined external stimulus, or when the value is positive after a predetermined activation level has been added by the predetermined external stimulus, the robot apparatus is taken to want to “kick a ball”, and schema A outputs the ball-kicking action.
[0152]
That is, when the activation level is a large positive number, the action of kicking the ball is expressed together with an affirmative response such as “Yes, I understand!”, and the robot apparatus can show that the action instructed by the user is one it wanted to perform itself. In this case as well, the robot apparatus can express that it wants to perform the action through expression means such as a speaker and LEDs; compensation schemas can be prepared according to how strongly it wants to perform the action and called according to the AL.
[0153]
In the present embodiment, if the robot apparatus does not want to execute the specified schema, it can say so. Various variations are also possible: it can refuse the specified action, or, when it is instructed many times despite refusing, it can perform the action reluctantly. Conversely, when the instructed action is something it wants to do, it can express pleasure, which helps the robot apparatus appear more intelligent.
[0154]
The curve shown in FIG. 19 described above yields a desire value that drives the internal state toward a predetermined value (= 80). However, a desire value that keeps the internal state within a certain range, rather than at a constant value, may be obtained instead.
[0155]
(4) Specific example
Next, a specific example will be described of a method by which a robot apparatus that operates autonomously selects the corresponding schema and expresses an action when instructed by a user, that is, a method of operating the robot apparatus under external direction rather than autonomously.
[0156]
When an SBL is used as the behavior control unit of an actual robot apparatus, a plurality of SBLs are prepared according to the role of each SBL's schema tree. Specifically, these include: a NormalSBL, which calculates the activation level of each schema from the external stimulus and the internal state, decides actions autonomously by having the schemas compete, and issues commands for action output; a DerivativeSBL, which sets activation levels externally for specific schemas of the NormalSBL in order to generate an action, for example to execute a certain action or action sequence by combining the functions of NormalSBL schemas; a System SBL, which monitors abnormal states such as a drop in power supply voltage or a fall and avoids them with priority over the other SBLs; and a ReflexiveSBL, which performs reflex actions in response to sudden changes in the sound pressure (volume) given to the auditory sensor or in the image information (brightness) given to the visual sensor.
[0157]
In an action selection method using the SBL algorithm, each schema, which is an action description module, normally calculates an activation level that defines the priority of its own action based on the internal state obtained from an emotion and instinct model, and the schemas compete within the schema tree. Ultimately, schemas are started in descending order of activation level, running simultaneously as long as they do not compete for the hardware resources of the robot, and action output is realized. With this algorithm, the action of the robot is selected autonomously based on the internal state of the robot and the external stimuli to its sensors. This autonomous action selection method is hereinafter referred to as the homeostasis mode, and a schema tree that implements the homeostasis mode is hereinafter referred to as a NormalSBL.
[0158]
As described above, each schema constituting the NormalSBL calculates its own activation level based on external stimuli, such as the appearance of a face or the discovery of a ball, and on internal states, such as pain, hunger, fatigue, and drowsiness, evaluated by the emotion and instinct model. The activation level defines the execution priority among the schemas, and execution rights are granted preferentially to schemas with larger values. Ultimately, schemas are started in descending order of activation level, running simultaneously as long as they do not compete for the hardware resources of the robot, and action output is realized. With the SBL algorithm, the robot selects its own actions autonomously based on its internal desires and generates them.
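The rule that schemas start in descending order of activation level, as long as they do not compete for hardware resources, can be illustrated with a short sketch. The function name `select_schemas`, the tuple layout, and the resource names are hypothetical; the description does not specify how resources are represented.

```python
def select_schemas(candidates):
    """Start schemas in descending order of activation level.

    candidates: list of (name, activation_level, resources), where
    resources is the set of hardware resources the schema needs.
    A schema is started only if none of its resources is already in use.
    """
    used = set()
    started = []
    for name, _al, resources in sorted(
        candidates, key=lambda c: c[1], reverse=True
    ):
        if used.isdisjoint(resources):
            started.append(name)
            used |= resources
    return started


# Hypothetical example: dancing and soccer both need the legs,
# so only the one with the higher activation level is started.
example = [
    ("dance", 85, {"legs", "arms"}),
    ("soccer", 67, {"legs"}),
    ("talk", 12, {"speaker"}),
]
started = select_schemas(example)
```

Here `dance` takes the legs first, `soccer` is skipped because of the conflict, and `talk` runs in parallel because it only needs the speaker.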
[0159]
That is, NormalSBL is the most basic schema tree structure for performing autonomous action determination using an action selection algorithm based on SBL. In this NormalSBL, the action determination (selection) is performed by giving priority to the desire of the robot itself. Therefore, the behavior generation result depends on the environmental state including the context in which the robot device is placed.
[0160]
Because no action decision contrary to the internal desire is made, it is difficult with the schema tree as it is to give the robot apparatus a top-down user command and have it perform a particular action, or to have it reproduce a fixed series of actions as a demonstration.
[0161]
Therefore, to enable top-down commands even in a robot apparatus having such a NormalSBL, a DerivativeSBL that performs action selection under external direction is prepared separately from the NormalSBL that performs autonomous action selection. In other words, the SBL serving as the behavior control means has a Derivative SBL, which makes it possible to set activation levels independently of the external stimulus and internal state, between schemas of a single SBL or between schemas of a plurality of SBLs. FIG. 21 is a schematic diagram showing the relationship between the NormalSBL and the DerivativeSBL in this specific example. When the Derivative SBL 210 shown in the upper diagram of FIG. 21 externally sets an activation level for any of the schemas 231 to 233 constituting the Normal SBL 230 shown in the lower diagram of FIG. 21, the activation level calculated from the external stimulus and the internal state becomes invalid, and the activation level given externally by the Derivative SBL 210 takes priority. This mechanism allows a specific schema to be activated at a specific activation level.
[0162]
The Derivative SBL 210 has the same structure as the Normal SBL 230 described above, but it does not compete with the schemas in the Normal SBL 230 and is configured as an independent tree. A schema in the Derivative SBL 210 can externally set a high activation level for a specific schema in the Normal SBL 230, thereby causing that schema, that is, a specific action, to be executed. Hereinafter, the action selection method based on such top-down requests is referred to as the Intention mode, this function is referred to as the Intention function, and the schema tree 220 that implements the Intention mode is referred to as the Derivative SBL 210. The Normal SBL and the Derivative SBL in this specific example are described in more detail below.
[0163]
(4-1) Normal Situated Behavior Layer (NormalSBL)
NormalSBL provides a state machine for each action description module (schema), classifies the input from the state recognition unit, that is, the recognition results for external information input from the sensors, depending on preceding actions and situations, and expresses actions on the robot body. Each schema takes an external stimulus and (changes in) the internal state as inputs and is described as having at least a Monitor function, which judges the situation according to the external stimulus and changes in the internal state, and an Action function, which realizes the state transitions (state machine) accompanying execution of the action. As shown in the lower diagram of FIG. 21, the NormalSBL 230 has a schema tree 240, a tree structure in which a plurality of schemas (for example, schemas 231 to 233, which describe actions such as dancing, playing soccer, and posing riddles) are hierarchically connected.
[0164]
The plurality of schema trees 240 configured in such a tree structure are adapted to integrally determine a more optimal schema according to a change in an external stimulus or an internal state to perform behavior control. The schema tree 240 includes a plurality of subtrees (or branches) such as a behavior model in which ethological situation-dependent behavior is formalized, and a subtree for executing emotional expression.
[0165]
Here, the Monitor function mentioned above is a function that calculates the activation level of the schema according to the external stimulus and the internal state. When a tree structure is constructed, the upper (parent) schema can call the Monitor function of a lower (child) schema with the external stimulus and the internal state as arguments, and the child schema returns its activation level as the return value. A schema can also call the Monitor function of its child schemas in order to calculate its own activation level. The activation level from each subtree is then returned to the root schema 234, so that the optimal schema, that is, the action suited to the changes in the external stimulus and the internal state, can be determined in an integrated manner.
[0166]
For example, the schema with the highest activation level can be selected, or two or more schemas whose activation levels exceed a predetermined threshold can be selected and executed in parallel (provided, for parallel execution, that there is no hardware resource contention between the schemas).
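The way activation levels are returned up the tree by the Monitor function can be sketched as below. The class layout, the `evaluate` callbacks, and the rule that a parent reports the highest activation level among its children are all assumptions made for illustration; the description only states that a parent calls the child's Monitor function with the external stimulus and internal state as arguments and receives the activation level as the return value.

```python
class Schema:
    """Minimal schema node with a Monitor function (illustrative only)."""

    def __init__(self, name, evaluate=None, children=()):
        self.name = name
        # Hypothetical callback for leaf schemas:
        # (stimulus, state) -> activation level.
        self.evaluate = evaluate
        self.children = list(children)

    def monitor(self, stimulus, state):
        """Return (activation level, schema name) for the best schema
        in this subtree."""
        if not self.children:
            return self.evaluate(stimulus, state), self.name
        # Assumed rule: a parent reports its best child's result.
        return max(child.monitor(stimulus, state) for child in self.children)


def ball_interest(stimulus, state):
    # Hypothetical leaf evaluation: interested only when a ball is visible.
    return 70 if stimulus.get("ball_visible") else 5


root = Schema("root", children=[
    Schema("dance", lambda stimulus, state: state.get("energy", 0)),
    Schema("soccer", ball_interest),
])
```

Calling `root.monitor({"ball_visible": True}, {"energy": 20})` evaluates both leaves and reports the soccer schema, since its activation level is the higher of the two under these inputs.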
[0167]
The Action function has a state machine that describes the behavior of the schema itself. When configuring a tree structure, the parent schema can call the Action function to start or suspend execution of the child schema.
[0168]
(4-2) Derivative Situated Behavior Layer (Derivative SBL)
As described above, in this specific example, the robot apparatus capable of autonomous operation includes the Derivative SBL 210, which implements an action selection method (Intention mode) for operating the robot forcibly according to an instruction from a user or the like. The Derivative SBL has the same basic structure as the Normal SBL, in that schemas, which are action description modules, are arranged in a tree. Unlike the schemas constituting the Normal SBL 230, however, the schemas constituting the Derivative SBL 210 do not themselves output commands such as uttering speech or playing back a motion; instead, they have a function (Intention function) of forcibly activating a specific schema in the Normal SBL 230 to generate an action indirectly. Hereinafter, this is referred to as “adding an Intention to the schema”. The Derivative SBL 210 can also pass parameters at the same time as it adds an Intention to a schema in the Normal SBL 230, so that the action can be specified more narrowly.
[0169]
(4-3) Function of SBL
The functions of the SBL in this specific example are, as described above, the function of adding an Intention from the Derivative SBL 210 and the function of passing parameters, together with the function, described in the above embodiment, by which a schema of the Normal SBL 230 refuses even when an Intention has been added to forcibly activate it. Each of these functions of the SBL 200 in this specific example is described in detail below.
[0170]
(4-3-1) Intention function
Among the functions of the SBL described above, the function of indirectly generating an action by forcibly starting a specific schema (Intention function) is realized by the Derivative SBL 210 setting the activation level of a schema in the Normal SBL 230 from outside.
[0171]
As specific uses of the Derivative SBL 210 having such a function, the upper diagram of FIG. 21 shows two examples: a speech interpretation schema (VoiceCommandHandler) 201, which interprets a user's voice command and adds an Intention to the schema corresponding to the command so that the robot acts in accordance with the command; and a function introduction schema (ScriptScript) 202, which performs a demonstration (function introduction) of the robot apparatus by applying Intentions to specific schemas while playing back a series of schema activation sequence files prepared in advance.
[0172]
The function of the speech interpretation schema (VoiceCommandHandler) 201, namely interpreting a user's voice command or the like, adding an Intention to the schema corresponding to the command, and performing an action in accordance with the command, can be realized by creating in advance a database that associates user voice commands with the corresponding schemas in the NormalSBL. For example, in response to a user command such as “dance”, “play soccer”, or “ask a riddle”, the dance schema 231, the soccer schema 232, or the riddle schema 233 is activated, respectively. This makes it possible to suppress the robot apparatus's autonomous action selection and have it execute the intended action.
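A database that associates voice commands with schemas could be as simple as a keyword table. The sketch below is hypothetical: the keyword matching, the name `interpret_command`, and the schema identifiers are assumptions, since the description does not state how commands are matched.

```python
# Hypothetical command database: keyword -> target schema in the NormalSBL.
COMMAND_DB = {
    "dance": "dance_schema_231",
    "soccer": "soccer_schema_232",
    "riddle": "riddle_schema_233",
}


def interpret_command(utterance):
    """Return the schema that should receive an Intention, or None
    if the recognized utterance matches no known command."""
    text = utterance.lower()
    for keyword, schema in COMMAND_DB.items():
        if keyword in text:
            return schema
    return None
```

For an utterance that matches no entry, the function returns `None`, in which case no Intention would be added and the NormalSBL would continue to select actions autonomously.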
[0173]
As described above, when the Normal SBL 230, which normally selects actions autonomously, is to be made to select an action under external direction, the Intention function of the Derivative SBL 210 is used: a schema in the Derivative SBL 210 adds an Intention to a specific schema in the Normal SBL 230, thereby generating a top-down action.
[0174]
FIG. 22 is a schematic diagram showing the relationship between the activation level given by an Intention and the activation level evaluated internally. As shown in FIG. 22, when an Intention is added to a specific schema, an externally set activation level (hereinafter, AL2) is added to the activation level calculated from the schema's own internal state and external stimulus (hereinafter, AL1). To decide whether to activate the schema, the sum of these two values (AL1 + AL2; hereinafter, AL_total) or a similar quantity is used as the AL value recorded in the higher-level schema. This makes it possible to set a larger AL for a given schema from outside. The AL_total recorded in the higher-level schema may also be, for example, a sum of the two AL values weighted by appropriate coefficients.
[0175]
Because the Intention simply raises the original AL (AL1) in this way, the Intention could be ineffective if the AL1 of another schema were very large. In practice, however, the ALs are adjusted when the integrated schema tree is configured so that AL1 varies within a fixed range in normal operation, for example from 0 to 100. By giving an Intention (AL2) that is large relative to this range, the target schema can be started reliably whenever an Intention is applied.
[0176]
For example, when the speech interpretation schema 201 of the Derivative SBL 210 shown in FIG. 21 adds the Intention 250 to the dance schema 231 of the Normal SBL 230, the AL of each schema is as shown in FIG. 22. That is, when the AL1 calculated from the internal state and external stimulus of the dance schema 231 itself is, for example, 85, and the AL2 added externally by the speech interpretation schema 201 is, for example, 1500, adding the Intention 250 (that is, AL2) to the dance schema 231 gives a total activation level (AL_total) of 1585. Here, the activation level (AL1) that each schema in the Normal SBL 230 calculates from the internal state and the external stimulus is confined to a predetermined range, for example 0 to 100. By adding an AL2 that is large compared with this range, the AL_total becomes larger than that of the other schemas, to which the Intention 250 has not been added and which have only the AL1 calculated from the internal state and the external stimulus. In the example of FIG. 22, the AL_total values of the other schemas 232 to 234 are 5, 12, and 67, respectively, so the schema 231, with AL_total = 1585 after the Intention 250 is added, is selected.
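The arithmetic of this example can be reproduced in a few lines. The AL values are those given for FIG. 22; the function name `al_total`, the dictionary keys, and the default weights of 1.0 are assumptions (the weights stand in for the optional weighting coefficients mentioned earlier).

```python
def al_total(al1, al2=0.0, w1=1.0, w2=1.0):
    """Combine the internally calculated AL1 with the externally added AL2.

    With the default weights this is the plain sum AL1 + AL2.
    """
    return w1 * al1 + w2 * al2


# AL1 values from the FIG. 22 example.
al1 = {"schema_231": 85, "schema_232": 5, "schema_233": 12, "schema_234": 67}
# The Intention adds AL2 = 1500 to the dance schema 231 only.
intention = {"schema_231": 1500}

totals = {name: al_total(value, intention.get(name, 0))
          for name, value in al1.items()}
selected = max(totals, key=totals.get)
```

Because AL1 is confined to the range 0 to 100, any AL2 well above 100 makes the targeted schema win the comparison, whatever the other schemas calculate for themselves.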
[0177]
As described above, the activation levels calculated by the schemas are compared. When, for example, the schema with the highest activation level is set to be selected, adding such an Intention 250 forcibly raises the activation level, so that the schema is selected and fired and its action is expressed.
[0178]
In this specific example, the specified schema is always selected because a sufficiently large value is added as the Intention. In the example shown in FIG. 18 above, by contrast, a smaller value, such as +30, is added to the activation level as the Intention. If the activation level is still negative after the Intention is added, this indicates that the schema does not want to perform its own action. In this case the activation level added as the Intention is small, but the design is such that the schema specified by the DerivativeSBL is selected once without its activation level being compared with those of other schemas. That is, an activation level indicating a negative desire can also serve as a function for rejecting an Intention.
[0179]
Next, as another example of adding an Intention, the function introduction schema 202 will be described. This schema applies Intentions to specific schemas of the Normal SBL 230 while playing back a series of schema activation sequence files prepared in advance, and executes commands such as spoken explanations between the Intention commands, thereby performing a demonstration (function introduction) of the robot. Such a case can be realized by adding Intention between NormalSBL schemas.
[0180]
The function by which the function introduction schema 202 adds Intentions to specific schemas while playing back a series of schema activation sequence files prepared in advance is effective, for example, when introducing the functions of the robot apparatus interactively with a user. FIG. 23 is a schematic diagram illustrating the relationship between the function introduction schema 202 and each schema in the NormalSBL 230. The script file held by the function introduction schema 202 describes the speech that explains each function, the timing at which motion output commands are executed, and information on the target schemas to which Intentions are to be applied in order to execute the functions of specific schemas.
[0181]
The function introduction schema 202 that performs the demonstration reads the script file and, as it does so, either executes an explanation command or adds an Intention to a specific schema to execute that schema's function. By repeating this operation, it introduces the functions of the robot to the user.
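The playback loop can be sketched as follows. The script format used here, a list of `(kind, payload)` steps, is purely hypothetical, as are the names `play_script`, `say`, and `add_intention`; the description does not define the file format.

```python
def play_script(steps, say, add_intention):
    """Play back a demonstration script step by step.

    steps: list of (kind, payload) pairs; kind is "say" for a spoken
    explanation or "intention" for adding an Intention to a schema.
    say / add_intention: callbacks supplied by the caller.
    """
    for kind, payload in steps:
        if kind == "say":
            say(payload)
        elif kind == "intention":
            add_intention(payload)
        else:
            raise ValueError(f"unknown step kind: {kind}")


# Usage with callbacks that simply record what would happen.
log = []
play_script(
    [("say", "I can dance."), ("intention", "dance_schema_231")],
    say=lambda text: log.append(("say", text)),
    add_intention=lambda schema: log.append(("intention", schema)),
)
```

Keeping the speech and Intention handlers as callbacks means the same loop could drive a real robot or, as here, a simple log for checking the order of steps.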
[0182]
For example, suppose the NormalSBL 230 has a dance schema 231, a soccer schema 232, and a riddle schema 233 for autonomously dancing, playing soccer, and posing questions (riddles). The function introduction schema 202 first executes commands within itself to tell the user, for example by voice, that the robot apparatus can dance, can play soccer, and can pose questions. It then adds an Intention to the dance schema 231, the soccer schema 232, or the riddle schema 233 to actually demonstrate dancing using the dance schema 231, kicking a ball using the soccer schema 232, or posing a question to the user using the riddle schema 233.
[0183]
By using the Intention function in this way, that is, by applying Intentions to the schemas in the NormalSBL that were prepared for autonomous action decisions, the functions of the existing schemas can be reused as they are to generate a wide variety of actions, such as generating an action by interpreting a user command or introducing the robot's functions to the user. As described above, when a schema is executed in the Intention mode, the schema sequence (algorithm) that actually generates the action commands is the NormalSBL schema used in the homeostasis mode, reused unchanged. This is very efficient, because there is no need to write a program from scratch to realize an Intention mode that functions independently.
[0184]
(4-3-2) Function to reject instructions
Schema execution by Intention was devised as a mechanism for forcibly starting a schema while ignoring the homeostasis-mode activation level evaluated from the internal state and the external stimulus. However, if the commanded action is always carried out, the correspondence between input information and action output becomes too fixed, and the user may become bored as the same response is repeated. Therefore, even in the Intention mode, the activation level of the homeostasis mode is not ignored completely but is partially taken into account: depending on whether the original activation level before the Intention was applied is small, or indicates a negative desire as described above, the robot apparatus itself decides whether to accept or reject the user's command, which gives variety to action generation.
[0185]
As described above, a schema to which an Intention has been added stores not only AL_total, which includes the AL2 from the Intention, but also AL1 from before the Intention, and AL1 can be referenced. This makes it possible to “refuse activation in view of a low AL1”.
[0186]
Specifically, when a schema to which an Intention has been applied is activated, the activation level (AL1) based on the internal state and the external stimulus is calculated at the same time. If AL1 is equal to or greater than a certain threshold, the schema outputs the prescribed action; if it is equal to or less than the threshold, the schema expresses that it is not in the mood and terminates. For example, consider the case where the threshold is set to AL1_th = 60. In the example shown in FIG. 22, the schema 231 has AL1 = 85 before the Intention is added, so it operates according to the user's instruction, that is, the instruction of the speech interpretation schema.
[0187]
Here, suppose an Intention is added to the soccer schema 232 with AL2 = 1500. If, for example, the robot has already played soccer many times that day, cannot see a ball, or the ball is not a color it likes, the AL1 that the schema 232 calculates for itself may be low, for example AL1 = 5, and thus at or below the threshold. In such a case, the robot apparatus can refuse to play soccer.
[0188]
For example, one or more compensation schemas, such as a schema that shows a tired attitude, a schema that shakes the head, and a schema that tells the user by voice that the robot does not want to play soccer, may be prepared in the layer below the soccer schema 232. If an Intention is added while the AL1 of the schema 232 is at or below the predetermined threshold, these compensation schemas may be activated, for example according to the AL1 of the schema 232, instead of the soccer schema 232 itself.
[0189]
When a function for rejecting instructions is implemented for schema execution by Intention, a function for disabling that rejection and forcing the action to be executed is also required. Without it, when the function introduction schema 202 uses Intention to execute (select) a schema in the Normal SBL 230 for a function introduction demonstration of the robot apparatus, the demonstration could not continue if the schema rejected the execution instruction; the disabling function prevents this. In this specific example, whether a rejection from a schema of the Normal SBL 230 is accepted is determined by whether a flag is set. When performing a demonstration, therefore, information indicating that the schema is to be executed forcibly is passed as a “force flag” at the same time as the Intention is added to the schema of the Normal SBL 230, which disables the function of rejecting the instruction.
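The interplay of the rejection threshold and the force flag can be summarized in a small sketch. The threshold AL1_th = 60 and the sample values AL1 = 85 and AL1 = 5 come from the description; the function name `handle_intention`, the string results, and the choice to execute when AL1 exactly equals the threshold are assumptions.

```python
AL1_THRESHOLD = 60  # AL1_th from the example in the text


def handle_intention(al1, force=False, threshold=AL1_THRESHOLD):
    """Decide whether a schema that received an Intention executes.

    al1: activation level calculated by the schema itself.
    force: the "force flag"; when set, rejection is disabled.
    Treating al1 == threshold as "execute" is an assumption.
    """
    if force or al1 >= threshold:
        return "execute"
    # Below the threshold: a compensation schema would express refusal.
    return "reject"
```

With these definitions, the dance schema (AL1 = 85) obeys a voice command, the soccer schema (AL1 = 5) refuses it, and the same soccer schema still runs during a demonstration when the force flag is passed.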
[0190]
If action determination based only on the internal state and the external stimulus, that is, action determination by the Normal SBL alone, is regarded as a fully autonomous mode, then the case in which an Intention is added by the Derivative SBL while the original activation level (AL1) is also referenced, and the command is rejected when AL1 is at or below a threshold, can be called a semi-autonomous mode. The case in which a robot demonstration is performed using the Intention function together with the force flag can be called a fully heteronomous mode.
[0191]
(4-3-3) Function to pass parameters at the same time as adding Intention
When a schema in the NormalSBL is started by adding an Intention, the basic assumption is that an abstract voice command such as “dance” or “soccer” is handled and the action is executed in units of schemas. However, by providing a function that passes parameters together with the Intention, the Intention command can be specified in more detail. For example, when the voice command “Kick the pink ball” is handled and an Intention is applied to the soccer schema, information corresponding to “pink ball” is passed along, so that the soccer schema can be given the narrower instruction of searching for and kicking a pink ball specifically. The information passed together with the Intention may be, for example, information indicating features of the target object, such as its color or shape.
[0192]
(4-4) Other examples
Variation of how to add Intention
The method of adding Intention is not limited to the above specific example, and various changes can be made. For example, in a demonstration, spoken explanations can be added according to a scenario while schemas are activated in order to introduce the functions of the robot; or, when a voice command is received, the command can be interpreted to select the corresponding schema, and the parameters required for starting the schema can be passed when the schema is started.
[0193]
In addition to the above, the method of activating a schema using Intention can be varied in many ways. Variations of the Intention algorithm can be created basically by changing the following two values in conjunction with some condition: the activation level to be added when an Intention is applied, and the size of the threshold that determines whether the command is obeyed or rejected when an Intention is applied. For example, based on information obtained by face image recognition or speaker recognition, the size of the activation level (AL2) added when applying an Intention can be changed according to past experience with that speaker. The condition for obeying can thus depend on who the other party is: the robot readily does what a favorite person says but does not readily do what a disliked person says.
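One way to picture a speaker-dependent Intention is to let AL2 depend on how the robot regards the speaker and to refuse when the total remains negative, as in the FIG. 18 example. Everything numeric in the sketch below is invented for illustration, and the names `AL2_BY_SPEAKER` and `listens_to` are hypothetical.

```python
# Illustrative values only: a liked speaker adds a larger AL2.
AL2_BY_SPEAKER = {"liked": 50, "disliked": 10}


def listens_to(speaker_kind, al1):
    """Return True if the robot obeys the command from this speaker.

    Assumed rule: obey when AL1 + AL2 is not negative, i.e. when the
    schema no longer indicates a negative desire after the addition.
    """
    al2 = AL2_BY_SPEAKER[speaker_kind]
    return al1 + al2 >= 0
```

Under these assumed values, a schema with AL1 = −30 obeys a liked speaker (total 20) but refuses a disliked one (total −20), which reproduces the behavior described above.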
[0194]
In this specific example, by using the Intention function of the DerivativeSBL, the robot apparatus can, regardless of its internal state or the state of external stimuli, interpret words spoken by a person to generate actions, or perform a demonstration by playing back a configuration file that defines a series of operations. As described above, the schemas written for the homeostasis mode can be reused as the schemas for the Intention mode, so it is not necessary to prepare all the programs for action selection and generation anew. In other words, the only program that must be prepared is a framework that determines which schema to execute for a given command, or in what order to execute the schemas, so programming efficiency is extremely high.
[0195]
As a result, schema execution in the homeostasis mode, in which behavior is determined autonomously according to internal and external environmental conditions, and in the Intention mode, in which behavior is generated according to a top-down command such as a user instruction or a demo script, can both be handled within the common behavior control algorithm of the SBL.
[0196]
Furthermore, even when an activation level (AL2) is set externally, the activation level (AL1) calculated from the internal state is referenced, and the activation level to be added (AL2) is increased or decreased according to the emotional state. The robot can thus behave in such a way that it does not readily obey in an unpleasant emotional state but obeys anything in a happy emotional state. In this way, although the schema is started forcibly from outside, the robot can at the same time virtually consider internally how much it actually wants to perform the action. Depending on the situation of the robot apparatus, therefore, an action rejecting the externally applied Intention is generated, which prevents the correspondence between input information and action output from becoming fixed and realizes diversification of actions. That is, by not only following commands but also rejecting them according to the situation, the robot apparatus keeps the user from tiring of fixed responses and can generate actions closer to those of a human or an animal.
[0197]
【The invention's effect】
As described in detail above, the robot device according to the present invention is a robot device that selects and expresses an action based on an internal state and an external stimulus. It comprises action output means, in which a plurality of actions are described and which outputs an action selected from the plurality of actions, and priority calculating means for calculating an execution priority of each action from the internal state and/or the external stimulus. The execution priority indicates a positive desire or a negative desire toward expressing each action, and when the execution priority of the selected action indicates a negative desire, the action output means outputs an action different from the selected action. Whereas previously only positive desires were reflected in actions, the robot device can inform the user by action or by voice that it has a negative desire not to perform the selected action. The actions of the robot apparatus thus have a wide range of variations, and actions closer to those of a human can be output.
[Brief description of the drawings]
FIG. 1 is a perspective view showing an appearance of a robot device according to an embodiment of the present invention.
FIG. 2 is a block diagram schematically showing a functional configuration of the robot device according to the embodiment of the present invention.
FIG. 3 is a block diagram showing the configuration of a control unit of the robot device in more detail in the embodiment of the present invention.
FIG. 4 is a schematic diagram showing a functional configuration of a behavior control system of the robot device according to the embodiment of the present invention.
FIG. 5 is a schematic diagram showing an object configuration of the behavior control system according to the embodiment of the present invention.
FIG. 6 is a schematic diagram showing a form of situation-dependent behavior control by a situation-dependent behavior hierarchy according to an embodiment of the present invention.
FIG. 7 is a schematic diagram illustrating a basic operation example of behavior control using a situation-dependent behavior hierarchy.
FIG. 8 is a schematic diagram showing an operation example when a reflex action is performed by a situation-dependent action hierarchy.
FIG. 9 is a schematic diagram showing an operation example when emotion expression is performed by a situation-dependent behavior hierarchy.
FIG. 10 is a schematic diagram showing a situation where a situation-dependent behavior hierarchy is composed of a plurality of schemas.
FIG. 11 is a schematic diagram showing a tree structure of a schema in a situation-dependent behavior hierarchy.
FIG. 12 is a schematic diagram showing a mechanism for controlling normal context-dependent behavior in a context-dependent behavior hierarchy.
FIG. 13 is a schematic diagram showing a configuration of a schema in a reflex action unit.
FIG. 14 is a schematic diagram showing a mechanism for controlling reflex behavior by a reflex behavior unit.
FIG. 15 is a schematic diagram showing a class definition of a schema used in a situation-dependent behavior hierarchy.
FIG. 16 is a schematic diagram showing a functional configuration of a class in a situation-dependent behavior hierarchy.
FIG. 17 is a diagram illustrating the reentrant property of a schema.
FIG. 18 is a schematic diagram illustrating a negative desire of the robot device according to the embodiment of the present invention, showing an activation level of the schema in the schema tree when the robot device does not exhibit an action according to an external instruction.
FIG. 19 is a graph showing an example of a relationship between an internal state and a desire value MV.
FIG. 20 is a schematic diagram showing an activation level of a schema in a schema tree when the robot device expresses an action according to an external instruction.
FIG. 21 is a schematic diagram showing a relationship between NormalSBL and DerivativeSBL in the embodiment of the present invention.
FIG. 22 is a schematic diagram showing a relationship between an activation level given by Intention and an activation level evaluated internally.
FIG. 23 is a schematic diagram illustrating a relationship between a functional introduction schema of Deliverable SBL and each schema in Normal SBL according to the embodiment of the present invention.
[Explanation of symbols]
1 robot device, 10 behavior control system, 15 CCD camera, 16 microphone, 17 speaker, 18 touch sensor, 19 LED indicator, 20 control unit, 21 CPU, 22 RAM, 23 ROM, 24 nonvolatile memory, 25 interface, 26 wireless communication interface, 27 network interface card, 28 bus, 29 keyboard, 40 input/output unit, 50 drive unit, 51 motor, 52 encoder, 53 driver, 81 visual recognition function unit, 82 auditory recognition function unit, 83 contact recognition function unit, 91 internal state management unit, 92 short-term storage unit (STM), 93 long-term storage unit (LTM), 101 reflection behavior hierarchy, 102 context-dependent behavior hierarchy (SBL), 103 reflex behavior unit

Claims

A robot device that selects and expresses an action based on an internal state and an external stimulus, comprising:
action output means in which a plurality of actions are described and which outputs an action selected from the plurality of actions; and
priority calculating means for calculating an execution priority of each action from the internal state and/or the external stimulus,
wherein the execution priority indicates a positive desire or a negative desire to express each action, and
the action output means outputs an action different from the selected action when the execution priority of the selected action indicates a negative desire.

The robot apparatus according to claim 1, wherein the different action is an action that denies the selected action.

The robot apparatus according to claim 1, wherein the action output means does not output the selected action when the execution priority of the selected action indicates a negative desire.

The robot apparatus according to claim 1, further comprising an action selection unit that selects the action based on the execution priority or a predetermined external stimulus.

The robot apparatus according to claim 1, wherein the predetermined external stimulus is a command from a user.

The robot device according to claim 1, wherein the execution priority is calculated based on a value indicating a first desire calculated according to the external stimulus and a value indicating a second desire calculated according to the internal state.

The robot apparatus according to claim 1, wherein the internal state has a plurality of parameters indicating the magnitude of emotion and/or instinct, at least one of the parameters changes when an action is expressed, and the value indicating the second desire serves to keep each parameter of the internal state within a predetermined range and indicates a negative desire when a parameter falls outside the predetermined range.

The robot apparatus according to claim 1, wherein the action output unit has a plurality of actions different from the selected action and outputs, from the plurality of different actions, an action corresponding to the value indicating the negative desire.

The robot apparatus according to claim 1, wherein the action output means has a plurality of action description modules in which actions are described, and
the action description modules are configured in a tree structure according to a realization level of each action.

A behavior control method for a robot device that selects and expresses an action based on an internal state and an external stimulus, comprising:
a priority calculating step of calculating an execution priority of each action from the internal state and/or the external stimulus; and
an action output step of outputting an action selected from a plurality of actions,
wherein the execution priority indicates a positive desire or a negative desire to express each action, and
in the action output step, when the execution priority of the selected action indicates a negative desire, an action different from the selected action is output.

The method according to claim 10, wherein the different action is an action that denies the selected action.

The action control method according to claim 10, wherein in the action output step, when the execution priority of the selected action indicates a negative desire, the selected action is not output.

The action control method for a robot device according to claim 10, further comprising an action selection step of selecting the action based on the execution priority or a predetermined external stimulus.

The method according to claim 10, wherein the predetermined external stimulus is a command from a user.

The behavior control method for a robot device according to claim 10, wherein the execution priority is calculated based on a value indicating a first desire calculated according to the external stimulus and a value indicating a second desire calculated according to the internal state.

The behavior control method for a robot device according to claim 10, wherein the internal state has a plurality of parameters indicating the magnitude of emotion and/or instinct, and at least one of the parameters changes when an action is expressed, and
the value indicating the second desire serves to keep each parameter of the internal state within a predetermined range and indicates a negative desire when a parameter falls outside the predetermined range.

The action control method according to claim 10, wherein in the action output step, an action corresponding to the value indicating the negative desire is output from a plurality of actions different from the selected action.

A program for causing a computer to execute an operation of selecting and expressing an action based on an internal state and an external stimulus, the program comprising:
a priority calculating step of calculating an execution priority of each action from the internal state and/or the external stimulus; and
an action output step of outputting an action selected from a plurality of actions,
wherein the execution priority indicates a positive desire or a negative desire to express each action, and
in the action output step, when the execution priority of the selected action indicates a negative desire, an action different from the selected action is output.