JP2002205289A - Action control method for robot device, program, recording medium and robot device

Info

Publication number: JP2002205289A
Application number: JP2000403454A
Authority: JP
Inventors: Kouchiyo Nakatsuka; 洪長中塚; Masatoshi Takeda; 正資武田
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2000-12-28
Filing date: 2000-12-28
Publication date: 2002-07-23

Abstract

PROBLEM TO BE SOLVED: To allow a robot device to identify a conversation partner and take a desirable action accordingly. SOLUTION: The robot device comprises an action learning means that, using an RNN 100, learns input information concerning the conversation partner and learns highly evaluated actions, and an action control means that, using an inverse RNN (RNN^-1), identifies the conversation partner from the input information on the basis of the learning result and presents, for the identified partner, the highly evaluated action learned by the action learning means.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

Field of the Invention: The present invention relates to a robot device and to an action control method for controlling the actions of a robot device, and more particularly to a robot device capable of learning actions, an action control method for such a robot device, a program, and a recording medium.

[0002]

Description of the Related Art: In recent years, robot devices whose external shape imitates an animal such as a dog or a cat have become available. Some of these robot devices act autonomously according to external information and their internal state.

[0003]

Problems to Be Solved by the Invention: Animals, like humans, sometimes try to attract the attention of another party. Doing so can, for example, raise their own motivation. The other party, seeing such attention-seeking behavior, is unlikely to be displeased.

[0004] For example, in the relationship between an owner and a pet, the owner is pleased to see the pet trying to attract attention in order to be praised, and the pet itself feels better when praised, which may, for example, also improve its character. Some robot devices have a character that changes through interaction with a user or the like. For such a robot device, the result of an action performed in order to be praised can be made to influence its character, which further increases its entertainment value.

[0005] The action that earns praise differs from one partner (for example, one owner) to another. For one owner, the so-called "sit" action or posture may be what is praised; for another, it may be the so-called "lie down" action or posture.

[0006] The present invention has been made in view of the above circumstances, and its object is to provide an action control method for a robot device, a program, a recording medium, and a robot device that can identify a conversation partner and perform actions, such as ones that earn praise, according to that partner.

[0007]

Means for Solving the Problems: To solve the above problems, the action control method for a robot device according to the present invention has a learning step of causing the robot device to identify a conversation partner and to learn an action that the identified conversation partner evaluates highly, and an action control step of identifying the conversation partner from information input about the conversation partner on the basis of the learning result of the learning step, and causing the robot device to express the learned highly evaluated action according to the identified conversation partner.

[0008] This action control method causes the robot device to identify a conversation partner and learn an action that the identified partner evaluates highly, then identifies the conversation partner from information input about the partner on the basis of the learning result, and causes the robot device to express the learned highly evaluated action according to the identified partner.

[0009] With this action control method, the robot device learns the conversation partner and the action that this partner evaluates highly and, on the basis of the learning result, expresses a highly evaluated action according to the conversation partner.

[0010] To solve the above problems, the program according to the present invention causes execution of a learning step of causing a robot device to identify a conversation partner and to learn an action that the identified conversation partner evaluates highly, and an action control step of identifying the conversation partner from information input about the conversation partner on the basis of the learning result of the learning step, and causing the robot device to express the learned highly evaluated action according to the identified conversation partner.

[0011] Such a program causes the robot device to identify a conversation partner and learn an action that the identified partner evaluates highly, identifies the conversation partner from information input about the partner on the basis of the learning result, and causes the robot device to express the learned highly evaluated action according to the identified partner.

[0012] A robot device whose actions are controlled by such a program learns the conversation partner and the action that this partner evaluates highly and, on the basis of the learning result, expresses a highly evaluated action according to the conversation partner.

[0013] To solve the above problems, the recording medium according to the present invention has recorded on it a program that causes execution of a learning step of causing a robot device to identify a conversation partner and to learn an action that the identified conversation partner evaluates highly, and an action control step of identifying the conversation partner from information input about the conversation partner on the basis of the learning result of the learning step, and causing the robot device to express the learned highly evaluated action according to the identified conversation partner.

[0014] Such a recording medium causes the robot device to identify a conversation partner and learn an action that the identified partner evaluates highly, identifies the conversation partner from information input about the partner on the basis of the learning result, and causes the robot device to express the learned highly evaluated action according to the identified partner.

[0015] A robot device whose actions are controlled by the program recorded on such a recording medium learns the conversation partner and the action that this partner evaluates highly and, on the basis of the learning result, expresses a highly evaluated action according to the conversation partner.

[0016] To solve the above problems, the robot device according to the present invention comprises a learning means that identifies a conversation partner and learns an action that the identified conversation partner evaluates highly, and an action control means that identifies the conversation partner from information input about the conversation partner on the basis of the learning result of the learning means, and expresses the learned highly evaluated action according to the identified conversation partner.

[0017] Such a robot device identifies a conversation partner, learns an action that the identified partner evaluates highly, identifies the conversation partner from information input about the partner on the basis of the learning result, and expresses the learned highly evaluated action according to the identified partner. The robot device thus learns the conversation partner and the action that this partner evaluates highly and, on the basis of the learning result, expresses a highly evaluated action according to the conversation partner.

[0018]

Embodiments of the Invention: Embodiments of the present invention are described in detail below with reference to the drawings. The embodiment is an autonomous robot device that acts autonomously according to its surrounding environment (external factors) and its internal state (internal factors). By applying the present invention, the robot device can, through learning, identify a user and express an action that earns praise from that user.

[0019] The description first covers the configuration of the robot device and then explains in detail the part of the robot device to which the present invention is applied.

[0020] (1) Configuration of the Robot Device According to the Present Embodiment

As shown in FIG. 1, the robot device 1 is a so-called pet robot shaped like a "dog", in which leg units 3A, 3B, 3C, and 3D are connected to the front, rear, left, and right of a body unit 2, and a head unit 4 and a tail unit 5 are connected to the front end and the rear end of the body unit 2, respectively.

[0021] As shown in FIG. 2, the body unit 2 houses a control unit 16, formed by connecting a CPU (Central Processing Unit) 10, a DRAM (Dynamic Random Access Memory) 11, a flash ROM (Read Only Memory) 12, a PC (Personal Computer) card interface circuit 13, and a signal processing circuit 14 to one another via an internal bus 15, and a battery 17 serving as the power source of the robot device 1. The body unit 2 also houses an angular velocity sensor 18 and an acceleration sensor 19 for detecting the orientation and the acceleration of movement of the robot device 1.

[0022] The head unit 4 has, each at a predetermined position, a CCD (Charge Coupled Device) camera 20 for imaging the external surroundings, a touch sensor 21 for detecting pressure applied by physical contact from the user such as "stroking" or "hitting", a distance sensor 22 for measuring the distance to an object ahead, a microphone 23 for picking up external sound, a speaker 24 for outputting sounds such as cries, and LEDs (Light Emitting Diodes, not shown) corresponding to the "eyes" of the robot device 1.

[0023] Furthermore, actuators 25_1 to 25_n and potentiometers 26_1 to 26_n, as many as the number of degrees of freedom, are provided at the joints of the leg units 3A to 3D, at the connections between the leg units 3A to 3D and the body unit 2, at the connection between the head unit 4 and the body unit 2, at the connection of the tail 5A of the tail unit 5, and so on. For example, each of the actuators 25_1 to 25_n includes a servomotor. Driving the servomotors controls the leg units 3A to 3D so that the robot transitions to a target posture or motion.

[0024] The various sensors, such as the angular velocity sensor 18, the acceleration sensor 19, the touch sensor 21, the distance sensor 22, the microphone 23, the speaker 24, and the potentiometers 26_1 to 26_n, as well as the LEDs and the actuators 25_1 to 25_n, are connected to the signal processing circuit 14 of the control unit 16 via the corresponding hubs 27_1 to 27_n, while the CCD camera 20 and the battery 17 are each connected directly to the signal processing circuit 14.

[0025] The signal processing circuit 14 sequentially takes in the sensor data, image data, and audio data supplied from the above sensors and stores them in turn at predetermined locations in the DRAM 11 via the internal bus 15. Along with this, the signal processing circuit 14 also sequentially takes in remaining-battery data, which indicates the remaining charge and is supplied from the battery 17, and stores it at a predetermined location in the DRAM 11.

[0026] The sensor data, image data, audio data, and remaining-battery data stored in the DRAM 11 in this way are subsequently used when the CPU 10 controls the actions of the robot device 1.

[0027] In practice, at the initial stage when the robot device 1 is powered on, the CPU 10 reads the control program stored in the memory card 28 inserted in a PC card slot (not shown) of the body unit 2, or stored in the flash ROM 12, via the PC card interface circuit 13 or directly, respectively, and stores it in the DRAM 11.

[0028] Thereafter, on the basis of the sensor data, image data, audio data, and remaining-battery data sequentially stored in the DRAM 11 by the signal processing circuit 14 as described above, the CPU 10 judges its own state and that of its surroundings, and whether there are instructions or contact from the user.

[0029] The CPU 10 then determines the next action on the basis of this judgment and the control program stored in the DRAM 11 and, on the basis of that decision, drives the necessary actuators 25_1 to 25_n, thereby causing actions such as swinging the head unit 4 up, down, left, and right, moving the tail 5A of the tail unit 5, and driving the leg units 3A to 3D to walk.

[0030] At this time the CPU 10 also generates audio data as needed and supplies it as an audio signal to the speaker 24 via the signal processing circuit 14, so that sound based on that audio signal is output externally, and turns the above LEDs on or off or makes them blink.

[0031] In this way, the robot device 1 can act autonomously according to its own state and that of its surroundings, and according to instructions and contact from the user.

[0032] (2) Software Configuration of the Control Program

The software configuration of the above control program in the robot device 1 is shown in FIG. 3. In FIG. 3, the device driver layer 30 is located in the lowest layer of the control program and consists of a device driver set 31 made up of a plurality of device drivers. Each device driver is an object permitted to directly access hardware used in an ordinary computer, such as the CCD camera 20 (FIG. 2) or a timer, and performs processing in response to an interrupt from the corresponding hardware.

[0033] The robotic server object 32 is located in the lowest layer of the device driver layer 30 and consists of a virtual robot 33, a group of software providing an interface for accessing hardware such as the above sensors and the actuators 25_1 to 25_n; a power manager 34, a group of software managing power switching and the like; a device driver manager 35, a group of software managing various other device drivers; and a designed robot 36, a group of software managing the mechanism of the robot device 1.

[0034] The manager object 37 consists of an object manager 38 and a service manager 39. The object manager 38 is a group of software that manages the startup and shutdown of the software groups included in the robotic server object 32, the middleware layer 40, and the application layer 41. The service manager 39 is a group of software that manages the connections between objects on the basis of the inter-object connection information described in a connection file stored in the memory card 28 (FIG. 2).

[0035] The middleware layer 40 is located above the robotic server object 32 and consists of a group of software providing basic functions of the robot device 1 such as image processing and audio processing. The application layer 41 is located above the middleware layer 40 and consists of a group of software for determining the actions of the robot device 1 on the basis of the results produced by the software groups of the middleware layer 40.

[0036] The specific software configurations of the middleware layer 40 and the application layer 41 are each shown in FIG. 4.

[0037] As shown in FIG. 4, the middleware layer 40 consists of a recognition system 60, which has signal processing modules 50 to 58 for noise detection, temperature detection, brightness detection, musical scale recognition, distance detection, posture detection, the touch sensor, motion detection, and color recognition, together with an input semantics converter module 59 and the like; and an output system 69, which has an output semantics converter module 68 together with signal processing modules 61 to 67 for posture management, tracking, motion playback, walking, fall recovery, LED lighting, and sound playback, and the like.

[0038] Each of the signal processing modules 50 to 58 of the recognition system 60 takes in the corresponding data from among the sensor data, image data, and audio data that the virtual robot 33 of the robotic server object 32 reads from the DRAM 11 (FIG. 2), performs predetermined processing on that data, and passes the result to the input semantics converter module 59. Here, the virtual robot 33 is configured, for example, as a part that exchanges or converts signals according to a predetermined communication protocol.

[0039] On the basis of the processing results given by the signal processing modules 50 to 58, the input semantics converter module 59 recognizes the robot's own state and that of its surroundings, as well as commands and contact from the user, such as "noisy", "hot", "bright", "a ball was detected", "a fall was detected", "stroked", "hit", "the do-mi-so scale was heard", "a moving object was detected", or "an obstacle was detected", and outputs the recognition result to the application layer 41 (FIG. 2).

[0040] As shown in FIG. 5, the application layer 41 consists of five modules: a behavior model library 70, an action switching module 71, a learning module 72, an emotion model 73, and an instinct model 74.

[0041] As shown in FIG. 6, the behavior model library 70 contains independent behavior models 70_1 to 70_n, each corresponding to one of several preselected condition items such as "when the remaining battery charge is low", "recovering from a fall", "when avoiding an obstacle", "when expressing an emotion", and "when a ball is detected".

[0042] When a recognition result is given from the input semantics converter module 59, or when a fixed time has elapsed since the last recognition result was given, each of these behavior models 70_1 to 70_n determines the next action, referring as necessary to the parameter value of the corresponding emotion held in the emotion model 73 and the parameter value of the corresponding desire held in the instinct model 74, as described later, and outputs its decision to the action switching module 71.

[0043] In this embodiment, each behavior model 70_1 to 70_n determines the next action with an algorithm called a finite probabilistic automaton. As shown in FIG. 7, the algorithm decides probabilistically which of the nodes NODE_0 to NODE_n to transition to from a given node (state) NODE_0 to NODE_n, on the basis of the transition probabilities P_1 to P_n set for the arcs ARC_1 to ARC_n1 that connect the nodes NODE_0 to NODE_n.

[0044] Specifically, each behavior model 70_1 to 70_n has, for each of the nodes NODE_0 to NODE_n that form it, a state transition table 80 such as the one shown in FIG. 8.

[0045] In the state transition table 80, the input events (recognition results) that serve as transition conditions at that node NODE_0 to NODE_n are listed in priority order in the "input event name" row, and further conditions on each transition condition are described in the corresponding columns of the "data name" and "data range" rows.

[0046] Thus, at node NODE_100 represented by the state transition table 80 of FIG. 8, the conditions for transitioning to another node are that, when the recognition result "ball detected (BALL)" is given, the "size (SIZE)" of the ball given with that result is in the range "0 to 1000", or that, when the recognition result "obstacle detected (OBSTACLE)" is given, the "distance (DISTANCE)" to the obstacle given with that result is in the range "0 to 100".

[0047] At node NODE_100, even when no recognition result is input, a transition to another node is possible when, among the emotion and desire parameter values held in the emotion model 73 and the instinct model 74 that the behavior models 70_1 to 70_n refer to periodically, the parameter value of "joy (JOY)", "surprise (SURPRISE)", or "sadness (SADNESS)" held in the emotion model 73 is in the range "50 to 100".
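As a rough illustration of the condition in [0047], the check could be sketched as below. The function name, the dictionary layout, and the inclusive range bounds are assumptions made for this example; they do not come from the patent text.

```python
# Hypothetical sketch: may NODE_100 transition when no recognition result
# is input? The emotion names and the 50..100 range follow the text; the
# data layout and the inclusive bounds are assumptions.
WATCHED_EMOTIONS = ("JOY", "SURPRISE", "SADNESS")


def emotion_allows_transition(emotion_params, low=50, high=100):
    """Return True if any watched emotion parameter lies in [low, high]."""
    return any(
        low <= emotion_params.get(name, 0) <= high
        for name in WATCHED_EMOTIONS
    )
```

A behavior model would call such a check each time it periodically reads the emotion model, for example `emotion_allows_transition({"JOY": 60})`.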

[0048] In the state transition table 80, the names of the nodes to which a transition can be made from that node NODE_0 to NODE_n are listed in the "transition destination node" column of the "transition probability to other nodes" section. The probability of transitioning to each of the other nodes NODE_0 to NODE_n when all the conditions described in the "input event name", "data value", and "data range" rows are met is written at the corresponding place in the "transition probability to other nodes" section, and the action to be output on transitioning to that node NODE_0 to NODE_n is written in the "output action" row of that section. The probabilities in each row of the "transition probability to other nodes" section sum to 100 [%].

[0049] Thus, at node NODE_100 represented by the state transition table 80 of FIG. 8, when, for example, the recognition result is given that a "ball is detected (BALL)" and that the "SIZE" of the ball is in the range "0 to 1000", a transition to "node NODE_120 (node 120)" can occur with a probability of "30 [%]", and the action "ACTION1" is output at that time.
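The state transition table and the probabilistic choice described in [0045] to [0049] can be sketched as follows. This is a minimal illustration and not the patent's implementation: only the BALL row with its 30 % transition to NODE_120 and ACTION1, and the OBSTACLE condition with its 0 to 100 range, come from the text. Every other destination, probability, and action, as well as the names `NODE_100_TABLE` and `step`, is a hypothetical placeholder.

```python
import random

# Each row: input event, data name, data range, and a list of
# (destination node, probability in %, output action).
# Entries marked "hypothetical" are placeholders, not from the patent.
NODE_100_TABLE = [
    {
        "event": "BALL",
        "data": "SIZE",
        "range": (0, 1000),
        "transitions": [
            ("NODE_120", 30, "ACTION1"),  # from the text
            ("NODE_100", 70, None),       # hypothetical: stay in place
        ],
    },
    {
        "event": "OBSTACLE",
        "data": "DISTANCE",
        "range": (0, 100),
        "transitions": [("NODE_150", 100, "ACTION2")],  # hypothetical
    },
]


def step(table, event, value, rng=random):
    """Return (next_node, action) for a recognition result, or None."""
    for row in table:  # rows are listed in priority order
        low, high = row["range"]
        if row["event"] != event or not (low <= value <= high):
            continue
        transitions = row["transitions"]
        if sum(prob for _, prob, _ in transitions) != 100:
            raise ValueError("each row must sum to 100 %")
        pick = rng.uniform(0, 100)
        cumulative = 0
        for node, prob, action in transitions:
            cumulative += prob
            if pick <= cumulative:
                return node, action
        return transitions[-1][0], transitions[-1][2]
    return None  # no transition condition matched
```

Calling `step(NODE_100_TABLE, "BALL", 500)` returns `("NODE_120", "ACTION1")` in roughly 30 % of calls under these assumptions; a value outside the data range returns `None`.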

[0050] Each behavior model 70_1 to 70_n is built by linking together many nodes NODE_0 to NODE_n, each described by such a state transition table 80. When, for example, a recognition result is given from the input semantics converter module 59, the behavior model determines the next action probabilistically using the state transition table of the corresponding node NODE_0 to NODE_n and outputs its decision to the action switching module 71.

[0051] The action switching module 71 shown in FIG. 5 selects, from among the actions output by the behavior models 70_1 to 70_n of the behavior model library 70, the action output by the behavior model with the highest predetermined priority, and sends a command to execute that action (hereinafter, an action command) to the output semantics converter module 68 of the middleware layer 40. In this embodiment, behavior models 70_1 to 70_n shown lower in FIG. 6 are given higher priority.
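The priority rule in [0051] can be sketched as a simple selection. The function name and the convention that a behavior model with nothing to propose reports `None` are assumptions for this example; the text only states that models drawn lower in FIG. 6 have higher priority.

```python
def switch_action(proposals):
    """Pick the action of the highest-priority behavior model.

    `proposals` lists one entry per behavior model in FIG. 6 order
    (top to bottom); an entry is an action name, or None when that
    model proposes nothing (an assumed convention). Entries later in
    the list, i.e. lower in FIG. 6, take priority.
    """
    for action in reversed(proposals):
        if action is not None:
            return action
    return None
```

For instance, `switch_action(["play with ball", None, "recover from fall"])` selects `"recover from fall"`, since it comes from the lowest listed model.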

[0052] After an action is completed, the action switching module 71 notifies the learning module 72, the emotion model 73, and the instinct model 74 that the action has been completed, on the basis of the action completion information given by the output semantics converter module 68.

【００５３】一方、学習モジュール７２は、入力セマン
ティクスコンバータモジュール５９から与えられる認識
結果のうち、「叩かれた」や「撫でられた」など、使用
者からの働きかけとして受けた教示の認識結果を入力す
る。On the other hand, the learning module 72 receives, from among the recognition results given from the input semantics converter module 59, the recognition results of teaching received as actions from the user, such as "hit" or "stroked".

【００５４】そして、学習モジュール７２は、この認識
結果及び行動切換えモジュール７１からの通知に基づい
て、「叩かれた（叱られた）」ときにはその行動の発現
確率を低下させ、「撫でられた（褒められた）」ときに
はその行動の発現確率を上昇させるように、行動モデル
ライブラリ７０における対応する行動モデル７０_１〜７
０_ｎの対応する遷移確率を変更する。Then, based on this recognition result and the notification from the action switching module 71, the learning module 72 changes the corresponding transition probabilities of the corresponding behavior models 70_1 to 70_n in the behavior model library 70 so as to lower the probability of occurrence of an action when the robot is "hit (scolded)" and to raise it when the robot is "stroked (praised)".
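One way to picture this adjustment is the sketch below. The step size, the renormalisation, and the function name `reinforce` are assumptions; the text only says that the probability is lowered when scolded and raised when praised.

```python
def reinforce(transitions, action, feedback, step=0.05):
    """Return a new probability table with `action` nudged by feedback.

    transitions: dict mapping action name -> probability (sums to 1).
    feedback: "praised" raises the probability, "scolded" lowers it.
    The fixed step and the renormalisation are assumptions of this sketch.
    """
    if feedback == "praised":
        delta = step
    elif feedback == "scolded":
        delta = -step
    else:
        raise ValueError("feedback must be 'praised' or 'scolded'")
    updated = dict(transitions)
    updated[action] = min(1.0, max(0.0, updated[action] + delta))
    total = sum(updated.values())
    if total == 0.0:
        return dict(transitions)  # degenerate row: leave unchanged
    # Renormalise so the row remains a probability distribution.
    return {name: p / total for name, p in updated.items()}
```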

【００５５】他方、感情モデル７３は、「喜び（jo
y）」、「悲しみ（sadness）」、「怒り（anger）」、
「驚き（surprise）」、「嫌悪（disgust）」及び「恐
れ（fear）」の合計６つの情動について、各情動ごとに
その情動の強さを表すパラメータを保持している。そし
て、感情モデル７３は、これら各情動のパラメータ値
を、それぞれ入力セマンティクスコンバータモジュール
５９から与えられる「叩かれた」及び「撫でられた」な
どの特定の認識結果と、経過時間及び行動切換えモジュ
ール７１からの通知となどに基づいて周期的に更新す
る。On the other hand, the emotion model 73 holds, for each of a total of six emotions, "joy", "sadness", "anger", "surprise", "disgust", and "fear", a parameter indicating the strength of that emotion. The emotion model 73 periodically updates the parameter values of these emotions based on specific recognition results such as "hit" and "stroked" given from the input semantics converter module 59, the elapsed time, the notification from the action switching module 71, and the like.

【００５６】具体的には、感情モデル７３は、入力セマ
ンティクスコンバータモジュール５９から与えられる認
識結果と、そのときのロボット装置１の行動と、前回更
新してからの経過時間となどに基づいて所定の演算式に
より算出されるそのときのその情動の変動量を△Ｅ
［ｔ］、現在のその情動のパラメータ値をＥ［ｔ］、そ
の情動の感度を表す係数をｋ_ｅとして、（１）式によっ
て次の周期におけるその情動のパラメータ値Ｅ［ｔ＋
１］を算出し、これを現在のその情動のパラメータ値Ｅ
［ｔ］と置き換えるようにしてその情動のパラメータ値
を更新する。また、感情モデル７３は、これと同様にし
て全ての情動のパラメータ値を更新する。Specifically, let ΔE[t] be the amount of change in an emotion at that time, calculated by a predetermined arithmetic expression based on the recognition result given from the input semantics converter module 59, the behavior of the robot device 1 at that time, the time elapsed since the last update, and the like; let E[t] be the current parameter value of the emotion; and let k_e be a coefficient representing the sensitivity of the emotion. The emotion model 73 calculates the parameter value E[t+1] of the emotion in the next period by equation (1) and updates the parameter value by replacing the current value E[t] with it. The emotion model 73 updates the parameter values of all emotions in the same manner.

【００５７】[0057]

【数１】 (Equation 1)
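The figure for equation (1) is not reproduced in this text. From the definitions in paragraph [0056] (current value E[t], variation ΔE[t], sensitivity coefficient k_e), the update presumably has the following additive form; this is a reconstruction under that assumption, not the original figure.

```latex
E[t+1] = E[t] + k_e \times \Delta E[t] \qquad (1)
```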

【００５８】なお、各認識結果や出力セマンティクスコ
ンバータモジュール６８からの通知が各情動のパラメー
タ値の変動量△Ｅ［ｔ］にどの程度の影響を与えるかは
予め決められており、例えば「叩かれた」といった認識
結果は「怒り」の情動のパラメータ値の変動量△Ｅ
［ｔ］に大きな影響を与え、「撫でられた」といった認
識結果は「喜び」の情動のパラメータ値の変動量△Ｅ
［ｔ］に大きな影響を与えるようになっている。The degree to which each recognition result and the notification from the output semantics converter module 68 affect the variation ΔE[t] of the parameter value of each emotion is determined in advance. For example, a recognition result such as "hit" greatly affects the variation ΔE[t] of the parameter value of the emotion "anger", and a recognition result such as "stroked" greatly affects the variation ΔE[t] of the parameter value of the emotion "joy".

【００５９】ここで、出力セマンティクスコンバータモ
ジュール６８からの通知とは、いわゆる行動のフィード
バック情報（行動完了情報）であり、行動の出現結果の
情報であり、感情モデル７３は、このような情報によっ
ても感情を変化させる。これは、例えば、「吠える」と
いった行動により怒りの感情レベルが下がるといったよ
うなことである。なお、出力セマンティクスコンバータ
モジュール６８からの通知は、上述した学習モジュール
７２にも入力されており、学習モジュール７２は、その
通知に基づいて行動モデル７０_１〜７０_ｎの対応する遷
移確率を変更する。Here, the notification from the output semantics converter module 68 is so-called action feedback information (action completion information), that is, information on the result of expressing an action, and the emotion model 73 also changes the emotions according to such information. For example, an action such as "barking" lowers the emotion level of anger. The notification from the output semantics converter module 68 is also input to the learning module 72 described above, and the learning module 72 changes the corresponding transition probabilities of the behavior models 70_1 to 70_n based on the notification.

【００６０】なお、行動結果のフィードバックは、行動
切換えモジュレータ７１の出力（感情が付加された行
動）によりなされるものであってもよい。The feedback of the action result may be made by the output of the action switching modulator 71 (the action to which the emotion is added).

【００６１】一方、本能モデル７４は、「運動欲（exer
cise）」、「愛情欲（affection）」、「食欲（appetit
e）」及び「好奇心（curiosity）」の互いに独立した４
つの欲求について、これら欲求ごとにその欲求の強さを
表すパラメータを保持している。そして、本能モデル７
４は、これらの欲求のパラメータ値を、それぞれ入力セ
マンティクスコンバータモジュール５９から与えられる
認識結果や、経過時間及び行動切換えモジュール７１か
らの通知などに基づいて周期的に更新する。On the other hand, the instinct model 74 holds, for each of four mutually independent desires, "exercise", "affection", "appetite", and "curiosity", a parameter indicating the strength of that desire. The instinct model 74 periodically updates these desire parameter values based on the recognition results given from the input semantics converter module 59, the elapsed time, the notification from the action switching module 71, and the like.

【００６２】具体的には、本能モデル７４は、「運動
欲」、「愛情欲」及び「好奇心」については、認識結
果、経過時間及び出力セマンティクスコンバータモジュ
ール６８からの通知などに基づいて所定の演算式により
算出されるそのときのその欲求の変動量をΔＩ［ｋ］、
現在のその欲求のパラメータ値をＩ［ｋ］、その欲求の
感度を表す係数ｋ_ｉとして、所定周期で（２）式を用い
て次の周期におけるその欲求のパラメータ値Ｉ［ｋ＋
１］を算出し、この演算結果を現在のその欲求のパラメ
ータ値Ｉ［ｋ］と置き換えるようにしてその欲求のパラ
メータ値を更新する。また、本能モデル７４は、これと
同様にして「食欲」を除く各欲求のパラメータ値を更新
する。More specifically, for "exercise", "affection", and "curiosity", let ΔI[k] be the amount of change in the desire at that time, calculated by a predetermined arithmetic expression based on the recognition result, the elapsed time, the notification from the output semantics converter module 68, and the like; let I[k] be the current parameter value of the desire; and let k_i be a coefficient representing the sensitivity of the desire. The instinct model 74 calculates the parameter value I[k+1] of the desire in the next period using equation (2) at a predetermined period and updates the parameter value by replacing the current value I[k] with the result. The instinct model 74 updates the parameter value of each desire except "appetite" in the same manner.

【００６３】[0063]

【数２】 (Equation 2)
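The figure for equation (2) is likewise not reproduced in this text. From the definitions in paragraph [0062] (current value I[k], variation ΔI[k], sensitivity coefficient k_i), the update presumably mirrors the emotion update; this is a reconstruction under that assumption, not the original figure.

```latex
I[k+1] = I[k] + k_i \times \Delta I[k] \qquad (2)
```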

【００６４】なお、認識結果及び出力セマンティクスコ
ンバータモジュール６８からの通知などが各欲求のパラ
メータ値の変動量△Ｉ［ｋ］にどの程度の影響を与える
かは予め決められており、例えば出力セマンティクスコ
ンバータモジュール６８からの通知は、「疲れ」のパラ
メータ値の変動量△Ｉ［ｋ］に大きな影響を与えるよう
になっている。Note that the degree to which the recognition result, the notification from the output semantics converter module 68, and the like affect the variation ΔI[k] of the parameter value of each desire is determined in advance. For example, the notification from the output semantics converter module 68 greatly affects the variation ΔI[k] of the parameter value of "fatigue".

【００６５】なお、本実施の形態においては、各情動及
び各欲求（本能）のパラメータ値がそれぞれ0から100ま
での範囲で変動するように規制されており、また係数ｋ
_ｅ、ｋ_ｉの値も各情動及び各欲求ごとに個別に設定され
ている。In the present embodiment, the parameter values of each emotion and each desire (instinct) are restricted to vary within the range of 0 to 100, and the values of the coefficients k_e and k_i are set individually for each emotion and each desire.
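A periodic update with per-parameter sensitivity and the 0 to 100 restriction can be sketched as follows. The additive form (current value plus sensitivity times variation) is an assumption based on the definitions in the text, and the numeric values are invented for illustration.

```python
def update_parameter(current, delta, sensitivity, low=0.0, high=100.0):
    """One periodic update: current + sensitivity * delta, clamped to [low, high].

    The additive form is assumed from the definitions of E[t], delta E[t]
    and k_e; the clamp reflects the stated 0 to 100 range.
    """
    return max(low, min(high, current + sensitivity * delta))


# Illustrative values only; none of these numbers appear in the source.
emotions = {"joy": 50.0, "anger": 95.0}
k_e = {"joy": 1.0, "anger": 2.0}        # sensitivity set per emotion
delta_e = {"joy": 4.0, "anger": 10.0}   # variation computed elsewhere

emotions = {
    name: update_parameter(value, delta_e[name], k_e[name])
    for name, value in emotions.items()
}
# joy rises to 54.0; anger would reach 115.0 and is clamped to 100.0
```

The same function applies to the desire parameters, with `k_i` and the desire variation in place of `k_e` and the emotion variation.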

【００６６】一方、ミドル・ウェア・レイヤ４０の出力
セマンティクスコンバータモジュール６８は、図４に示
すように、上述のようにしてアプリケーション・レイヤ
４１の行動切換えモジュール７１から与えられる「前
進」、「喜ぶ」、「鳴く」又は「トラッキング（ボール
を追いかける）」といった抽象的な行動コマンドを出力
系６９の対応する信号処理モジュール６１〜６７に与え
る。On the other hand, as shown in FIG. 4, the output semantics converter module 68 of the middleware layer 40 passes abstract action commands such as "move forward", "be pleased", "make a sound", or "track (chase the ball)", given from the action switching module 71 of the application layer 41 as described above, to the corresponding signal processing modules 61 to 67 of the output system 69.

【００６７】そしてこれら信号処理モジュール６１〜６
７は、行動コマンドが与えられると当該行動コマンドに
基づいて、その行動を行うために対応するアクチュエー
タ２５_１〜２５_ｎ（図２）に与えるべきサーボ指令値
や、スピーカ２４（図２）から出力する音の音声データ
及び又は「目」のＬＥＤに与える駆動データを生成し、
これらのデータをロボティック・サーバ・オブジェクト
３２のバーチャル・ロボット３３及び信号処理回路１４
（図２）を順次介して対応するアクチュエータ２５_１〜
２５_ｎ又はスピーカ２４又はＬＥＤに順次送出する。When an action command is given, the signal processing modules 61 to 67 generate, based on that command, the servo command values to be given to the corresponding actuators 25_1 to 25_n (FIG. 2) to perform the action, the audio data of the sound to be output from the speaker 24 (FIG. 2), and/or the drive data to be given to the LEDs of the "eyes", and send these data in order to the corresponding actuators 25_1 to 25_n, the speaker 24, or the LEDs via the virtual robot 33 of the robotic server object 32 and the signal processing circuit 14 (FIG. 2).

【００６８】このようにしてロボット装置１において
は、制御プログラムに基づいて、自己（内部）及び周囲
（外部）の状況や、使用者からの指示及び働きかけに応
じた自律的な行動を行うことができるようになされてい
る。In this way, the robot device 1 can act autonomously, based on the control program, in accordance with its own (internal) state, the surrounding (external) situation, and instructions and actions from the user.

【００６９】（３）ロボット装置の情報の学習上述したロボット装置は、対話相手されるユーザを識別
して、識別したユーザが高い評価をする動作を学習し、
ユーザに関して入力された情報から学習結果に基づいて
ユーザを識別して、学習した評価の高い動作を、識別し
た対話相手に応じて表出することができる。
(3) Learning of Information by the Robot Device
The robot device described above identifies the user who is its interaction partner and learns the actions that the identified user rates highly. It then identifies the user from information input about the user, based on the learning result, and expresses the learned, highly rated action according to the identified interaction partner.

【００７０】具体的には、ロボット装置は、ユーザに関
して入力される情報として、飼い主（ユーザ）の顔（画
像情報）と音声（音響情報）を学習する。ユーザの顔に
ついては、図２に示すＣＣＤカメラ２０により入力され
て、信号処理回路１４等により信号処理して得た撮像画
像情報を用いる。また、ユーザの声については、図２に
示すマイクロホン２３により入力されて、信号処理回路
１４等により信号処理して得た音声情報（或いは音響情
報）を用いる。また、ユーザが高い評価を動作として
は、ユーザから褒められる動作或いは褒められた動作等
が挙げられる。More specifically, the robot device learns the face (image information) and voice (acoustic information) of the owner (user) as the information input about the user. For the user's face, it uses captured image information input by the CCD camera 20 shown in FIG. 2 and processed by the signal processing circuit 14 or the like. For the user's voice, it uses voice information (or acoustic information) input by the microphone 23 shown in FIG. 2 and processed by the signal processing circuit 14 or the like. Actions that the user rates highly include, for example, actions that are praised or have been praised by the user.

【００７１】このような学習による動作の制御を、ロボ
ット装置は、ユーザに関して入力された情報を学習する
ユーザ識別学習工程と、評価の高い動作を学習する動作
学習工程と、ユーザに関して入力された情報からユーザ
識別学習工程における学習結果に基づいてユーザを識別
して、動作学習工程にて学習した評価値の高い動作を、
識別したユーザに応じて表出する動作制御工程とを有し
て、実現している。The robot device realizes such learning-based action control through a user identification learning step of learning the information input about the user, an action learning step of learning highly rated actions, and an action control step of identifying the user from the information input about the user based on the learning result of the user identification learning step and expressing the action with a high evaluation value learned in the action learning step according to the identified user.

【００７２】そして、このような処理工程をロボット装
置は、モジュール或いはオブジェクトとして構築して実
行している。すなわち、ロボット装置は、このような処
理工程を実現するプログラムにより構築されている学習
モジュール若しくは学習オブジェクト及び動作制御モジ
ュール若しくは動作制御オブジェクトを備えている。ま
た、例えば、このようなプログラムは、ディスク状記録
媒体等に記録されたものとして提供され、或いはいわゆ
るインターネット上において配信されており、ロボット
装置は、そのような媒体を介してプログラムを取り込
み、上述したような処理を実現している。The robot device implements and executes these processing steps as modules or objects. That is, the robot device includes a learning module or learning object and an action control module or action control object built from a program that realizes these processing steps. Such a program is, for example, provided recorded on a disk-shaped recording medium or the like, or distributed over the Internet, and the robot device loads the program via such a medium to realize the processing described above.

【００７３】このようにロボット装置は、ユーザの識別
の学習や動作の学習をしているが、そのような学習手法
としては、例えば、ＲＮＮ（リカレント型ニューラルネ
ットワーク）の手法等が挙げられる。ＲＮＮは、例え
ば、Long Ji Liがその論文「Reinforcement Learning W
ith Hidden States」において提唱しいている学習の一
手法である。As described above, the robot device learns to identify the user and learns actions. One example of such a learning method is the RNN (recurrent neural network). The RNN is a learning method proposed, for example, by Long Ji Li in the paper "Reinforcement Learning With Hidden States".

【００７４】図９には、ＲＮＮ１００による学習と、学
習されたＲＮＮ１００の利用例を示している。図９中
（Ａ）は、学習時のＲＮＮ１００の使用態様を示し、図
９中（Ｂ）は、学習後のＲＮＮ１００の使用態様を示し
ている。FIG. 9 shows an example of learning by the RNN 100 and use of the learned RNN 100. (A) in FIG. 9 illustrates a use mode of the RNN 100 during learning, and (B) in FIG. 9 illustrates a use mode of the RNN 100 after the learning.

【００７５】図９中（Ａ）に示すように、学習時におい
て、ＲＮＮ１００に学習する対象の情報とされる学習情
報が入力される。これに対応して、ＲＮＮ１００は、そ
の学習情報に対応される対応データの出力をする。この
対応データは、いわゆる教示データ或いは目標データと
して把握されるものである。As shown in FIG. 9A, at the time of learning, learning information to be learned is input to the RNN 100. In response, the RNN 100 outputs corresponding data corresponding to the learning information. This correspondence data is grasped as so-called teaching data or target data.

【００７６】このようなＲＮＮ１００による学習によ
り、学習情報に対応データが対応されるようになる。す
なわち、学習後においては、図９中（Ｂ）に示すよう
に、学習後のＲＮＮ１００にある情報を入力することに
より、あるデータが出力されるようになるが、入力され
る情報が予め学習した学習情報である場合には、それに
対応する対応データ（教示データ或いは目標データ）が
同一或いは近似の値として出力されるようになる。よっ
て、このようなＲＮＮ１００の学習により、例えば、あ
る情報をＲＮＮ１００に入力させることにより、出力さ
れるデータ（対応データ）に応じて、その入力された情
報の特徴を知ることができる。The learning by the RNN 100 causes the corresponding data to correspond to the learning information. That is, after learning, as shown in FIG. 9B, by inputting certain information to the RNN 100 after learning, certain data is output, but the input information is learned in advance. If the information is learning information, corresponding data (teaching data or target data) corresponding to the learning information is output as an identical or approximate value. Therefore, by learning the RNN 100, for example, by inputting certain information to the RNN 100, it is possible to know the characteristics of the input information according to the output data (corresponding data).

【００７７】図１０には、ＲＮＮ１００の構成例を示し
ている。この図１０に示すように、ＲＮＮ１００は、入
力層１００_１、中間層１００_２及び出力層１００_３を有
している。入力層１００_１、中間層１００_２及び出力層
１００_３は、所定の数のニューロンを有し、各層間の各
ニューロンが接続されている。そして、各ニューロン
は、所定の重み係数を記憶している。FIG. 10 shows a configuration example of the RNN 100. As shown in FIG. 10, the RNN 100 has an input layer 100_1, an intermediate layer 100_2, and an output layer 100_3. The input layer 100_1, the intermediate layer 100_2, and the output layer 100_3 each have a predetermined number of neurons, and the neurons of adjacent layers are connected to each other. Each neuron stores a predetermined weight coefficient.

【００７８】このように構成されているＲＮＮ１００に
おいて、入力層１００_１の各ニューロンに学習情報が入
力される。入力層１００_１に入力された学習情報は、中
間層１００_２を介して、出力層１００_３から出力され
る。出力層１００_３から出力されるデータが学習情報に
対応されるデータとなる。In the RNN 100 configured in this manner, learning information is input to the neurons of the input layer 100_1. The learning information input to the input layer 100_1 passes through the intermediate layer 100_2 and is output from the output layer 100_3. The data output from the output layer 100_3 is the data corresponding to the learning information.

【００７９】学習時には、各ニューロンは、入力に対し
て重み係数を乗算して、出力側の層の接続されている他
のニューロンに出力している。学習により、ニューロン
は、ＲＮＮ１００に入力される情報に応じて、所定の値
を出力するようになる。また、出力層１００_３における
所定のニューロンの出力の一部は、コンテクスト（cont
ext）Ｃ_ｔ＋１として、入力層１００_１のニューロンに
フィードバックされる。At the time of learning, each neuron multiplies its input by a weight coefficient and outputs the result to the connected neurons of the layer on the output side. Through learning, the neurons come to output predetermined values according to the information input to the RNN 100. In addition, part of the output of certain neurons in the output layer 100_3 is fed back to the neurons of the input layer 100_1 as the context C_{t+1}.
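The layered structure with context feedback can be sketched as a forward pass only. This is a minimal illustration under stated assumptions: the class name `TinyContextRNN`, the layer sizes, the tanh activation, and the random weight initialisation are not from the source, and no training rule is shown.

```python
import math
import random


class TinyContextRNN:
    """Three-layer network whose context output is fed back to the input layer."""

    def __init__(self, n_in, n_hidden, n_out, n_context, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.n_out = n_out
        # The intermediate layer sees the external input plus the context.
        self.w_hidden = matrix(n_hidden, n_in + n_context)
        # The output layer produces B_{t+1} followed by the context C_{t+1}.
        self.w_output = matrix(n_out + n_context, n_hidden)
        self.context = [0.0] * n_context  # C_t, zero at the first step

    @staticmethod
    def _layer(weights, inputs):
        # Each neuron: weighted sum of its inputs, squashed by tanh (assumed).
        return [math.tanh(sum(w * x for w, x in zip(row, inputs)))
                for row in weights]

    def step(self, x):
        """Consume the input at time t and return the output for time t+1."""
        hidden = self._layer(self.w_hidden, list(x) + self.context)
        out = self._layer(self.w_output, hidden)
        self.context = out[self.n_out:]   # C_{t+1}, fed back at the next step
        return out[:self.n_out]           # B_{t+1}
```

Because the context carries over between calls, feeding the same input twice in a row generally yields different outputs, which is what lets the network represent sequences rather than single inputs.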

【００８０】このようにして、学習時には、ＲＮＮ１０
０は、現在の時刻（ステップ）ｔでの学習情報を入力
し、次の時刻ｔ＋１の情報としての対応データを予測し
て出力し、学習情報の入力に対して、所定の対応データ
を出力することができるようになる。Thus, at the time of learning, the RNN 100 receives the learning information at the current time (step) t and predicts and outputs the corresponding data as the information at the next time t+1, so that it becomes able to output predetermined corresponding data in response to the input of learning information.

【００８１】本実施の形態では、ロボット装置によりこ
のような学習を実現させており、学習情報が、ユーザの
声情報（音声情報）や顔情報（画像情報）とされる。ま
た、入力される学習情報及び出力される対応データのデ
ータ形式は、具体的には、ベクトル形式とされている。
例えば、ベクトルデータとして扱うことにで、特徴空間
における特徴量として学習情報や対応データを特定でき
るようになる。In the present embodiment, such learning is realized by the robot device, and the learning information is voice information (voice information) and face information (image information) of the user. Further, the data format of the input learning information and the output corresponding data is, specifically, a vector format.
For example, by treating as vector data, learning information and corresponding data can be specified as a feature amount in a feature space.

【００８２】このように、ＲＮＮを使用した学習によ
り、学習後は、ＲＮＮに情報を入力することで、ＲＮＮ
が学習に基づいて、あるデータを出力するようになる。
すなわち、ＲＮＮは、ある情報の入力に対して、その出
力として、その特徴を示すあるベクトルデータとしての
データを出力するようになる。As described above, after learning with the RNN, inputting information to the RNN causes it to output certain data based on what it has learned. That is, in response to the input of certain information, the RNN outputs vector data indicating the features of that information.

【００８３】さらに、実施の形態では、学習後のＲＮＮ
を利用して、その逆の操作により情報を得ている。すな
わち、学習後のＲＮＮを利用して、今度は、そのような
特徴を示すデータを入力として、ある情報を特定してい
る。具体的には、図１１中（Ａ）に示したようにある学
習情報により学習がなされたＲＮＮ１００から、図１１
中（Ｂ）に示すように、新たなＲＮＮを生成して、その
新たなＲＮＮ１１０を使用して、対応データから学習情
報を特定する。Further, in the embodiment, information is also obtained by the reverse operation using the RNN after learning. That is, the RNN after learning is used, this time with data indicating such features as the input, to specify certain information. Specifically, from the RNN 100 trained with certain learning information as shown in (A) of FIG. 11, a new RNN 110 is generated as shown in (B) of FIG. 11, and this new RNN 110 is used to specify the learning information from the corresponding data.

【００８４】ここで、学習後のＲＮＮ１００から新たに
生成するＲＮＮ１１０は、いわゆるインバースＲＮＮ
（Inverse-ＲＮＮ、ＲＮＮ^−１）１１０であり、いわゆ
るインバースフォーワード（Inverse-Forward）を利用
して情報を特定することを可能とするものである。Here, the RNN 110 newly generated from the RNN 100 after learning is a so-called inverse RNN (Inverse-RNN, RNN^-1) 110, which makes it possible to specify information using the so-called Inverse-Forward operation.

【００８５】以上のように、ＲＮＮにより情報を学習す
ることができ、学習したＲＮＮを利用して種々の情報を
得ることができる。ロボット装置は、このような学習手
法が適用されることにより、ユーザ（飼い主）の顔及び
声を識別して、ユーザから褒められたいときのパフォー
マンスを学習することを実現している。そして、ロボッ
ト装置は、褒められたいときに、学習して得たパフォー
マンスを特定して実行することができるようになされて
いる。次に、このような動作を実現するためのロボット
装置におけるより具体的な処理について説明する。As described above, information can be learned by the RNN, and various information can be obtained by using the trained RNN. By applying this learning method, the robot device identifies the face and voice of the user (owner) and learns the performance to use when it wants to be praised by the user. When it wants to be praised, the robot device can then specify and execute the performance obtained by learning. Next, more specific processing in the robot device for realizing such actions will be described.

【００８６】（３−１）学習によるユーザのモデル化及
び学習モデルを利用したユーザの識別（３−１−１）学習によるユーザのモデル化図１２には、学習時のＲＮＮ１００、すなわち、学習情
報として、センサ情報Ｐｔ、音声情報（ユーザの声の情
報）Ｓ_ｔ及び画像情報（ユーザの顔の情報）Ｆ _ｔの入力
がされ、対応してその出力Ｂ_ｔ＋１をするＲＮＮ１００
を示している。
(3-1) Modeling of the User by Learning and Identification of the User Using the Learned Model
(3-1-1) Modeling of the User by Learning
FIG. 12 shows the RNN 100 during learning, that is, an RNN 100 that receives sensor information P_t, voice information (information on the user's voice) S_t, and image information (information on the user's face) F_t as learning information and produces the corresponding output B_{t+1}.

【００８７】ここで、センサ入力情報Ｐ_ｔは、例えば、
従来型のセンサ入力情報であって、具体的には、ロボッ
ト装置１の外部環境の情報や内部状態を示す情報であ
る。このようなセンサ入力情報Ｐ_ｔや上述の音声情報Ｓ
_ｔ及び画像情報Ｆ_ｔは、時系列の入力シーケンスとして
のベクトルデータとしてＲＮＮ１００に入力される。ま
た、このような入力に対して出力されるデータＢ_ｔ＋１
は、特徴シーケンスとしてベクトルデータ｛Ｂ（１），
Ｂ（２），・・・，Ｂ（ｎ）｝である。なお、ベクトル
データについて、｛Ｂ_１，Ｂ_２，・・・，Ｂ_ｎ｝との記
述は、上述の｛Ｂ（１），Ｂ（２），・・・，Ｂ
（ｎ）｝との記述と同義である。Here, the sensor input information P_t is, for example, conventional sensor input information, specifically information on the external environment of the robot device 1 and information indicating its internal state. The sensor input information P_t, the voice information S_t, and the image information F_t are input to the RNN 100 as vector data forming a time-series input sequence. The data B_{t+1} output in response to such input is the vector data {B(1), B(2), ..., B(n)} forming a feature sequence. For vector data, the notation {B_1, B_2, ..., B_n} is synonymous with the notation {B(1), B(2), ..., B(n)} above.

【００８８】このようなセンサ入力情報Ｐ_ｔ，音声情報
Ｓ_ｔ及び画像情報Ｆ_ｔを入力した学習によってＲＮＮ１
００により特定される特徴シーケンスとされるデータＢ
_ｔ＋ _１に基づいてユーザの特徴を確定して、モデル化し
たユーザ（以下、ユーザモデルという。）Ｍ_ｕｓｅｒが
得られる。例えば、ユーザモデルＭ_ｕｓｅｒは、ベクト
ルデータＢ_ｔ＋１によって、特徴空間上で特定されるも
のとなる。すなわち、特定のユーザを示す情報をなすセ
ンサ入力情報Ｐ_ｔ，音声情報Ｓ_ｔ及び画像情報Ｆ_ｔの学
習によって、そのユーザの特徴を示す指標となるユーザ
モデルＭ_ｕｓｅ _ｒが得られる。The features of the user are determined based on the data B_{t+1}, the feature sequence specified by the RNN 100 through learning with the sensor input information P_t, the voice information S_t, and the image information F_t as input, and a modeled user (hereinafter referred to as a user model) M_user is obtained. For example, the user model M_user is specified in the feature space by the vector data B_{t+1}. That is, by learning the sensor input information P_t, the voice information S_t, and the image information F_t, which constitute information indicating a specific user, a user model M_user that serves as an index of that user's features is obtained.

【００８９】（３−１−２）学習モデルを利用したユー
ザの識別上述したように、ＲＮＮ１００を利用して、ユーザをデ
ータとしてモデル化することができる。ＲＮＮ１００を
利用したユーザの識別では、学習により特定したユーザ
モデルＭ_ｕｓｅｒを利用して、ユーザの識別をする。図
１３には、ユーザを学習した後のＲＮＮ１００と、ユー
ザモデルＭ_ｕｓｅｒとを使用して行うユーザの認識を示
している。具体的には、次のようにユーザの識別がなさ
れる。
(3-1-2) Identification of the User Using the Learned Model
As described above, the RNN 100 can be used to model the user as data. In user identification using the RNN 100, the user is identified using the user model M_user specified by learning. FIG. 13 shows user recognition performed using the RNN 100 after learning the user and the user model M_user. Specifically, the user is identified as follows.

【００９０】学習後のＲＮＮ１００に対して、識別対象
とするユーザの情報として、上述した学習時の学習情報
の入力形式と同様な情報を入力する。すなわち、ＲＮＮ
１００への入力は、センサ入力情報Ｐ_ｔ、音声情報Ｓ_ｔ
及び画像情報Ｆ_ｔである。そして、ＲＮＮ１００は、こ
のような入力に対応してデータＢ_ｔ＋１を出力する。こ
こで、データＢ_ｔ＋１のデータ形式は、学習時のデータ
形式と同様に、ベクトルデータ｛Ｂ（１），Ｂ（２），
・・・，Ｂ（ｎ）｝である。Information in the same input format as the learning information used during learning is input to the trained RNN 100 as the information on the user to be identified. That is, the input to the RNN 100 is the sensor input information P_t, the voice information S_t, and the image information F_t. The RNN 100 outputs data B_{t+1} in response to such input. Here, the data format of the data B_{t+1} is the vector data {B(1), B(2), ..., B(n)}, the same as the data format at the time of learning.

【００９１】そして、このようにして得られるデータＢ
_ｔ＋１と、ベクトルデータからなるユーザモデルＭ
_ｕｓｅｒとを照合し比較して、差分を得る。この差分に
基づいて、入力された情報が、ユーザの特定がなされ
る。すなわち、この差分Ｅｒｒが、所定の閾値α以下で
ある場合には、予め学習しているユーザとして判断し、
一方で、その差分Ｅｒｒが所定の閾値αより大きい場合
には、予め学習したユーザとは別人であると判断し、入
力された情報に基づいてユーザを識別する。The data B_{t+1} obtained in this way is compared against the user model M_user, which consists of vector data, to obtain a difference. The user is identified from the input information based on this difference. That is, if the difference Err is less than or equal to a predetermined threshold α, the input is judged to come from the previously learned user; if the difference Err is greater than the threshold α, the input is judged to come from a person other than the previously learned user. In this way the user is identified based on the input information.
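The threshold test on the difference Err can be sketched as follows. The text does not say how the difference between the two vectors is measured; the Euclidean distance used here is an assumption, as are the function names.

```python
import math


def difference(output, user_model):
    """Err between an RNN output vector and the user model.

    Euclidean distance is an assumption; the source only says "difference".
    """
    return math.sqrt(sum((o - m) ** 2 for o, m in zip(output, user_model)))


def is_known_user(output, user_model, alpha):
    """True when Err <= alpha, i.e. the input is judged to be the learned user."""
    return difference(output, user_model) <= alpha
```

With a user model of `[3.0, 4.0]` and an output of `[0.0, 0.0]`, the difference is 5.0, so a threshold of 4.9 rejects the input while a threshold of 5.0 accepts it.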

【００９２】以上のように、ＲＮＮにより、ユーザの識
別に使用するユーザについての情報を学習して、そして
学習結果として得たユーザに対応されるユーザモデルＭ
_ｕｓ _ｅｒに基づいて、ユーザを識別することができる。As described above, the RNN learns the information about the user that is used for user identification, and the user can be identified based on the user model M_user corresponding to the user, obtained as the learning result.

【００９３】（３−２）褒められる行動の学習及び学習
した行動の表出（３−２−１）褒められる行動の学習ユーザに褒められたときに表出させる動作の学習につい
て説明する。図１４には、学習時のＲＮＮ１００、すな
わち、学習情報として、パフォーマンスシーケンスＡ_ｔ
が入力され、対応してその出力Ｂ_ｔ＋１をするＲＮＮ１
００を示している。
(3-2) Learning of Praised Actions and Expression of Learned Actions
(3-2-1) Learning of Praised Actions
The learning of the action to be expressed when praised by the user will now be described. FIG. 14 shows the RNN 100 during learning, that is, an RNN 100 that receives a performance sequence A_t as learning information and produces the corresponding output B_{t+1}.

【００９４】ここで、パフォーマンスシーケンスＡ
_ｔは、ロボット装置が表出する所定の行動と対応付けさ
れている。パフォーマンスシーケンスＡ_ｔに対応付けさ
れる行動が、ユーザに褒められたときの行動となる。例
えば、パフォーマンスシーケンスＡ_ｔに対応付けされる
行動としては、いわゆる「お座り」等の動作或いは姿勢
が挙げられる。また、このパフォーマンスシーケンスＡ
_ｔは、ベクトルデータ｛Ａ（１），Ａ（２），・・・，
Ａ（ｎ）｝として構成されている。Here, the performance sequence A_t is associated with a predetermined action expressed by the robot device. The action associated with the performance sequence A_t is the action performed when the robot device was praised by the user. Examples of an action associated with the performance sequence A_t include motions or postures such as the so-called "sitting". The performance sequence A_t is configured as the vector data {A(1), A(2), ..., A(n)}.

【００９５】学習時に、パフォーマンスシーケンスＡ_ｔ
がＲＮＮ１００に入力され、それに対応してデータＢ
_ｔ＋１が出力される。ここで、データＢ_ｔ＋１は、シー
ケンスデータであって、ベクトルデータ｛Ｂ（１），Ｂ
（２），・・・，Ｂ（ｎ）｝である。At the time of learning, the performance sequence A_t is input to the RNN 100, and the data B_{t+1} is output in response. Here, the data B_{t+1} is sequence data, namely the vector data {B(1), B(2), ..., B(n)}.

【００９６】そして、パフォーマンスシーケンスＡ
_ｔを、褒められたときの行動を示すデータであることを
前提としてＲＮＮ１００に入力させて学習することによ
り、出力されるデータＢ_ｔ＋１は、褒められたときの行
動の特徴を示す特徴データ（或いは特徴シーケンス）を
示すものとなる。When the RNN 100 is trained with the performance sequence A_t as input, on the premise that it is data indicating the action performed when praised, the output data B_{t+1} represents feature data (or a feature sequence) indicating the features of the action performed when praised.

【００９７】（３−２−２）インバースＲＮＮを用いた
学習した行動の表出上述したように、褒められたときの行動がＲＮＮにより
学習することができる。そして、ロボット装置は、その
ようにして学習した行動を、学習してＲＮＮを利用して
表出させることができる。具体的には、次のようにし
て、学習後のＲＮＮを利用して行動を表出させる。
(3-2-2) Expression of Learned Actions Using the Inverse RNN
As described above, the action performed when praised can be learned by the RNN. The robot device can then express the action learned in this way using the trained RNN. Specifically, the action is expressed using the trained RNN as follows.

【００９８】先ず、図１１を用いて説明したように、褒
められたいときに表出させる行動を学習したＲＮＮ１０
０から、インバースフォーワード（Inverse-Forward）
を用いてその行動シーケンスを特定することが可能な、
いわゆるインバースＲＮＮ（Inverse-ＲＮＮ、ＲＮＮ
^−１）１１０を生成する。First, as described with reference to FIG. 11, a so-called inverse RNN (Inverse-RNN, RNN^-1) 110, which can specify the action sequence using Inverse-Forward, is generated from the RNN 100 that has learned the action to be expressed when the robot device wants to be praised.

【００９９】図１５には、シーケンスＢ_ｔ＋１が入力と
され、対応してパフォーマンスシーケンスＡ_ｔを出力す
るインバースＲＮＮ１１０を示している。ここで、シー
ケンスＢ_ｔ＋１の入力に対応して出力されるパフォーマ
ンスシーケンスＡ_ｔはベクトルデータ｛Ａ（１），Ａ
（２），・・・，Ａ（ｎ）｝とされている。なお、ベク
トルデータについて、｛Ａ_１，Ａ_２，・・・，Ａ_ｎ｝と
の記述は、上述の｛Ａ（１），Ａ（２），・・・，Ａ
（ｎ）｝との記述と同義である。FIG. 15 shows the inverse RNN 110, which receives the sequence B_{t+1} as input and outputs the corresponding performance sequence A_t. Here, the performance sequence A_t output in response to the input of the sequence B_{t+1} is the vector data {A(1), A(2), ..., A(n)}. For vector data, the notation {A_1, A_2, ..., A_n} is synonymous with the notation {A(1), A(2), ..., A(n)} above.

【０１００】これにより、シーケンスＢ_ｔ＋１をインバ
ースＲＮＮ１１０に入力させることにより、褒められた
い動作を示すデータとされるパフォーマンスシーケンス
Ａ_ｔが出力されるようになる。そして、そのようなパフ
ォーマンスシーケンスＡ_ｔに基づいて、ロボット装置
は、対応付けされている「お座り」等といったユーザに
褒められる動作を実際に表出させる。Thus, by inputting the sequence B_{t+1} to the inverse RNN 110, the performance sequence A_t, which is data indicating the action the robot device wants to be praised for, is output. Based on this performance sequence A_t, the robot device actually expresses the associated action that the user praises, such as "sitting".

【０１０１】このように、ロボット装置は、ユーザに褒
められたいと思うタイミングにおいて、学習後のＲＮＮ
１００から得たインバースＲＮＮ１１０に、ＲＮＮ１０
０による学習の際に得たシーケンスＢ_ｔ＋１を入力させ
ることで、ユーザに褒められる動作を表出することがで
きるようになる。In this way, at a time when the robot device wants to be praised by the user, it can express an action that the user praises by inputting the sequence B_{t+1}, obtained during learning with the RNN 100, to the inverse RNN 110 obtained from the trained RNN 100.

【０１０２】（３−３）所定のユーザに対して褒められ
たいときのパフォーマンスシーケンスの特定ここでは、所定のユーザを識別して、褒められる動作を
表出させるまでの処理手順の概略を示している。図１６
には、その手順を示している。
(3-3) Specifying the Performance Sequence for Being Praised by a Given User
This section outlines the processing procedure from identifying a given user to expressing the action that is praised. FIG. 16 shows the procedure.

【０１０３】先ず、ステップＳ１として、事前に所定の
ユーザを識別するためのＲＮＮによる学習を行う。所定
のユーザの識別のための学習は、図１２を用いて説明し
た手順によって実現される。すなわち、ここにおける学
習により、所定のユーザに対応されるユーザモデルＭ
_ｕｓｅｒを得ることができる。First, in step S1, learning by the RNN for identifying a given user is performed in advance. The learning for identifying the given user is realized by the procedure described with reference to FIG. 12. That is, the learning here yields the user model M_user corresponding to the given user.

【０１０４】そして、ステップＳ２として、ユーザモデ
ルＭ_ｕｓｅｒに基づいて、ユーザを識別する処理を行
う。すなわち、図１３を用いて説明した処理であり、識
別対象とされる人間の情報を学習後のＲＮＮに入力し
て、それによって得られるＲＮＮからの出力とを、ユー
ザモデルＭ_ｕｓｅｒとを比較して、差分Ｅｒｒを得る。
そして、この差分Ｅｒｒが、所定の閾値以下である場合
には、当該ＲＮＮによって学習されている所定のユーザ
である判断して、所定の閾値を超えている場合には、学
習されたユーザではないと判断する。或いは、例特徴空
間において、ＲＮＮからの出力が、ベクトルデータとし
て把握されるユーザモデルＭ_ｕｓｅｒに対して所定の要
件を満たす値であれば、所定のユーザとして特定すると
いうこととしても良い。Then, in step S2, the user is identified based on the user model M_user. This is the process described with reference to FIG. 13: information on the person to be identified is input to the trained RNN, and the resulting output from the RNN is compared with the user model M_user to obtain the difference Err. If the difference Err is less than or equal to a predetermined threshold, the person is judged to be the given user learned by the RNN; if it exceeds the threshold, the person is judged not to be the learned user. Alternatively, for example, the person may be identified as the given user if, in the feature space, the output from the RNN satisfies predetermined requirements with respect to the user model M_user, which is treated as vector data.

【０１０５】一方で、ステップＳ３において、褒められ
たときを特定して、ＲＮＮによる学習を行っている。褒
められる動作の学習は、図１４を用いて説明した手順に
よって実現される。そして、図１５を用いて説明したよ
うに、学習後のＲＮＮからインバースＲＮＮを生成し、
ステップＳ４において、インバースＲＮＮに、動作表出
のためのシーケンスＢ_ｔ＋１を入力する。具体的には、
ステップＳ２において識別したユーザに対応されるシー
ケンスＢ_ｔ＋１を入力する。例えば、ロボット装置にお
いて、ユーザとシーケンスＢ_ｔ＋１を対応させてテーブ
ルとして記憶されており、このようなテーブルを参照し
て、識別したユーザに対応されるシーケンスＢ_ｔ＋１を
特定する。このようなユーザとシーケンスＢ_ｔ＋１との
対応付けについては、図１７に示すように、ユーザを識
別するための学習時におけるＲＮＮの出力とされるユー
ザの特徴を示すユーザ特徴シーケンスと、褒められる動
作の学習時におけるＲＮＮの出力とされる動作の特徴を
示す動作特徴シーケンスとを関連付けすることになされ
る。On the other hand, in step S3, the occasions on which the robot device was praised are specified and learning by the RNN is performed. The learning of the praised action is realized by the procedure described with reference to FIG. 14. Then, as described with reference to FIG. 15, an inverse RNN is generated from the trained RNN, and in step S4 the sequence B_{t+1} for expressing the action is input to the inverse RNN. Specifically, the sequence B_{t+1} corresponding to the user identified in step S2 is input. For example, the robot device stores users and sequences B_{t+1} in association with each other as a table, and refers to this table to specify the sequence B_{t+1} corresponding to the identified user. As shown in FIG. 17, this association between a user and a sequence B_{t+1} is made by associating the user feature sequence, which indicates the features of the user and is the output of the RNN during learning for user identification, with the action feature sequence, which indicates the features of the action and is the output of the RNN during learning of the praised action.

[0106] In step S5, the sequence B_{t+1} specified as being associated with the identified user is input to the inverse RNN to specify a predetermined performance sequence A_t. The robot device then expresses the behavior (Action), such as "sit", that corresponds to this performance sequence A_t.

[0107] The above processing is executed when the robot device wants to be praised. The timing at which the processing for expressing such a praise-seeking action starts is obtained by providing a model (for example, a module or an object) that yields such timing. That is, for example, as described above, the robot device has, as a model for raising motivation, a model that defines the timing for being praised, and executes the processing for expressing the action for which it wants to be praised when this model reaches a predetermined level.
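One way to picture a model that reaches a predetermined level and then triggers the praise-seeking processing is a simple accumulator. This is a sketch under stated assumptions: the class name, the fixed step per tick, and the reset after triggering are all hypothetical, since the text does not specify how the level evolves.

```python
class PraiseTimingModel:
    """Toy timing model: a level that rises each tick until a threshold."""

    def __init__(self, threshold: float, step: float = 1.0) -> None:
        self.threshold = threshold  # the "predetermined level"
        self.step = step            # assumed constant growth per tick
        self.level = 0.0

    def tick(self) -> bool:
        """Advance the level; return True when the threshold is reached."""
        self.level += self.step
        if self.level >= self.threshold:
            self.level = 0.0  # reset after triggering (an assumption)
            return True
        return False
```

A caller would start the praise-seeking processing (steps S4 and S5) whenever `tick()` returns `True`.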

[0108] As described above, the robot device identifies the face and voice of the user (owner) and, based on the RNN technique, learns the action (performance) to produce when it wants to be praised by the owner. When the robot device wants to be praised by the owner, it executes that performance using the inverse RNN. As a result, for example, the user is pleased to see the robot device trying to attract attention in order to be praised, and the robot device itself also feels better when praised; for example, its character improves and its motivation rises. As described above, the robot device has an emotion model whose character changes through interaction with the user or others, and the result of an action performed to be praised can influence that character, which further increases the entertainment value. In this way, the robot device has improved learning ability, and its entertainment value is improved accordingly.

[0109] (3-4) Specific Example of Learning by the RNN

For example, the relationship between the input data of the RNN (here, P_t, F_t, S_t, C_t, θ) and the output data (predicted value) B_{t+1} can be expressed as in equation (3). That is, the RNN can be represented by a function R_NN as shown in equation (3). Here, B_{t+1}, P_t, and θ are given by equations (4) to (6), respectively, and θ is the bias value of the units (neurons) constituting the RNN.

[0110]

(Equation 3)

[0111]

(Equation 4)

[0112]

(Equation 5)

[0113]

(Equation 6)

[0114] Thus, the output data B_{t+1} can be obtained by the function R_NN with the input data P_t, F_t, S_t, C_t, and θ as variables. Here, the following hypothesis is made, for example using equation (7). Equation (7) is what is called a membership function or sigmoid function.

[0115]

(Equation 7)

[0116] Here, as shown in FIG. 18, consider a unit j regarded as a neuron constituting the RNN. The unit receives inputs n_{j,1}(t), ..., n_{j,Nj}(t) and, in response, outputs o_{j,1}(t), ..., o_{j,Lj}(t). This relationship is expressed by equation (8), in which W_{1,1}, ..., W_{Nj,Lj} are weighting coefficients. Equation (8) can also be written as equation (9).

[0117]

(Equation 8)

[0118]

(Equation 9)
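To make the unit with N_j inputs and L_j outputs concrete, the sketch below computes each output as a sigmoid of a weighted sum of the inputs plus a bias. Because equations (7) to (9) are not reproduced in this text, this common form is an assumption, and the function names and the `weights[i][l]` layout are hypothetical.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Standard logistic sigmoid, 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def unit_outputs(
    inputs: List[float],
    weights: List[List[float]],
    bias: float,
) -> List[float]:
    """Compute the outputs o_{j,1..Lj} of one unit from inputs n_{j,1..Nj}.

    weights[i][l] is the weighting coefficient from input i to output l
    (assumed layout); bias plays the role of the bias value theta.
    """
    n_out = len(weights[0])
    return [
        sigmoid(sum(inputs[i] * weights[i][l] for i in range(len(inputs))) + bias)
        for l in range(n_out)
    ]
```

With inputs `[1.0, -1.0]` and equal weights on both inputs, the weighted sums cancel and every output is `sigmoid(bias)`.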

[0119] Here, the left side of equation (9) is assumed to take the form of the left side of equation (10). That is, the output value o_{j,i} of the unit is assumed to be given as a function of t, P_t, F_t, S_t, and C_t, the data actually input to the RNN. Equation (11) is then obtained from the relationship between equation (10) and equation (7) above. Equation (11) defines an RNN that outputs predetermined data in response to an input.

[0120]

(Equation 10)

[0121]

(Equation 11)

[0122] Next, learning by the RNN characterized by the above equations is described using mathematical expressions. The input data to the RNN can be represented as vector data (or sequence data), as shown in equation (12).

[0123]

(Equation 12)

[0124] For example, each of the elements A_1, A_2, ..., A_t, ..., A_n of this vector data is a value specified by the audio information S_t and the image information F_t, as shown in equation (13). The audio information S_t and image information F_t making up each element are feature-quantity data.

[0125]

(Equation 13)

[0126] In this case, vector data B_{t+1} is output in response to the input to the RNN, and each element of this vector data B_{t+1} is likewise a value specifying audio information S_{t+1} and image information F_{t+1}, as shown in equation (14).

[0127]

(Equation 14)

[0128] FIG. 19 shows the input of data {A_1, A_2, ..., A_n} to the RNN during learning and the corresponding output data {B_1, B_2, ..., B_n}. For example, when learning the user's voice and face, vector data {A_1, A_2, ..., A_n} whose elements are the image information F_t and audio information S_t input as time-series data is input to the RNN, as shown in FIG. 20.

[0129] For the image information F_t making up each element, a feature quantity is detected from each frame of the consecutive captured images, and that feature quantity is used as the image information F_t. For the audio information S_t, a feature quantity is detected from each sample obtained, for example, by sampling at predetermined intervals the continuous audio signal picked up by a microphone or the like, and that feature quantity is used as the audio information S_t. In this way it is possible to obtain vector data {A_1, A_2, ..., A_n} consisting of a time series whose elements are, for example, the feature quantity of the voice at each point in time while "sit" is being spoken and the feature quantity of the user's face, which changes as it is spoken, at each point in time. Each of the elements A_1, A_2, ..., A_n of the vector data A_t thus represents the features of the image information F_t and the audio information S_t at the respective point in time, supplied as time-series data during learning. When the RNN is trained with such vector data A_t as input, the relationship between the input data A_t and the output data B_{t+1} can be expressed, using the function R_NN, as equation (15).

[0130]

(Equation 15)
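The construction of the input sequence {A_1, ..., A_n} from per-time-step image features F_t and audio features S_t can be sketched as below. The function name and the decision to concatenate F_t and S_t into one tuple are assumptions made for illustration; the text only says that each element A_t is specified by S_t and F_t.

```python
from typing import List, Sequence, Tuple

FeatureVector = Tuple[float, ...]


def build_input_sequence(
    image_features: Sequence[Sequence[float]],
    audio_features: Sequence[Sequence[float]],
) -> List[FeatureVector]:
    """Pair F_t and S_t at each time step into one element A_t."""
    if len(image_features) != len(audio_features):
        raise ValueError("image and audio feature streams must have the same length")
    # Concatenate the two feature vectors of the same time step (assumed layout).
    return [tuple(f) + tuple(s) for f, s in zip(image_features, audio_features)]
```

The length check reflects the assumption that both streams have already been sampled onto the same time steps.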

[0131] Regarding W and θ, the parameters of the function R_NN: W denotes the weighting coefficients held by the neurons (units) of the RNN, and θ denotes the bias values of those neurons (units). By learning the data A_t, the weighting coefficients W and bias values θ are determined as weighting coefficients W^* and bias values θ^*. Because the weighting coefficients W^* and bias values θ^* obtained by learning are determined by the vector data {A_1, A_2, ..., A_n}, they can be given by equations (16) and (17).

[0132]

(Equation 16)

[0133]

(Equation 17)

[0134] That is, for example, when the user says "sit", the function R_NN can be obtained as equation (18), with the weighting coefficients W^* and bias values θ^* as variables.

[0135]

(Equation 18)

[0136] An inverse RNN is then generated from the RNN trained in this way. With this inverse RNN (function R_NN^{-1}), inputting data {B_1', B_2', ..., B_n'} yields as output the data {A_1', A_2', ..., A_n'} that was input during learning, as shown in FIG. 21.

[0137] For example, the relationship among the inverse RNN (function R_NN^{-1}), the input value B_t', and the corresponding output value A_t' can be expressed as in equation (19).

[0138]

(Equation 19)
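The inverse-function relationship between R_NN and R_NN^{-1} can be illustrated with a deliberately tiny stand-in. The sketch below is not the patent's inverse RNN: it assumes a single scalar sigmoid unit with hypothetical parameters `w` and `theta`, chosen only because such a unit can be inverted in closed form.

```python
import math
from typing import List


def forward(a: float, w: float, theta: float) -> float:
    """Toy stand-in for R_NN: b = sigmoid(w * a + theta)."""
    return 1.0 / (1.0 + math.exp(-(w * a + theta)))


def inverse(b: float, w: float, theta: float) -> float:
    """Toy stand-in for R_NN^{-1}: solve b = sigmoid(w * a + theta) for a."""
    if not 0.0 < b < 1.0 or w == 0.0:
        raise ValueError("need 0 < b < 1 and w != 0")
    return (math.log(b / (1.0 - b)) - theta) / w


def recover_sequence(bs: List[float], w: float, theta: float) -> List[float]:
    """Map a sequence {B_1', ..., B_n'} back to {A_1', ..., A_n'}."""
    return [inverse(b, w, theta) for b in bs]
```

Feeding the outputs of `forward` into `recover_sequence` with the same `w` and `theta` returns the original inputs up to floating-point error, which is the property the text attributes to the inverse RNN.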

[0139] Here, the output value A_t' and the input value B_t' are expressed by, for example, the user's voice data (S_t or S_t') and face data (F_t or F_t'), as in equations (20) and (21).

[0140]

(Equation 20)

[0141]

(Equation 21)

[0142] The learning process by the RNN has been described above using mathematical expressions. Through this learning process, the "sit" spoken by the user is learned in advance as the input data {A_1, A_2, ..., A_n}. After learning, inputting the resulting time-series output data {B_1, B_2, ..., B_n} to the inverse RNN obtained from the trained RNN makes it possible to recover the data {A_1, A_2, ..., A_n} that was input for "sit".

[0143] By associating execution of the "sit" action with such data {A_1, A_2, ..., A_n}, inputting the data {B_1, B_2, ..., B_n} to the inverse RNN after learning causes the robot device to express the actual "sit" action.

[0144] Furthermore, by setting the timing at which the "sit" action is expressed, based on the data {A_1, A_2, ..., A_n} obtained from the learned data {B_1, B_2, ..., B_n}, to the timing at which the robot device wants to be praised by the user, the robot device comes to express the "sit" behavior when it wants to be praised by the user.

[0145] By having a learning model constructed from the above equations, the robot device can, through learning, identify a predetermined user and express an action that is praised.

[0146] In the embodiment described above, the conversation partner is a human user. However, the invention is not limited to this; for example, the conversation partner may also be a robot device. In that case, the robot device identifies the other robot device through learning and expresses an action corresponding to the identified robot device.

[0147] Also, in the embodiment described above, the information input about the conversation partner was described as audio information and image information. However, the invention is not limited to this, and any other information that can identify the conversation partner may be used by the robot device to identify the conversation partner.

[0148]

[Effects of the Invention] The action control method for a robot device according to the present invention has a learning step of causing the robot device to identify a conversation partner and learn an action that the identified conversation partner evaluates highly, and an action control step of identifying the conversation partner, based on the learning result of the learning step, from information input about the conversation partner and causing the robot device to express the learned highly evaluated action according to the identified conversation partner. The method can thereby cause the robot device to identify a conversation partner and learn an action that the identified conversation partner evaluates highly, to identify the conversation partner from the input information based on the learning result, and to express the learned highly evaluated action according to the identified conversation partner.

[0149] With this action control method, the robot device learns the conversation partner and the action that the conversation partner evaluates highly and, based on the learning result, can express a highly evaluated action according to the conversation partner.

[0150] The program according to the present invention causes execution of a learning step of causing a robot device to identify a conversation partner and learn an action that the identified conversation partner evaluates highly, and an action control step of identifying the conversation partner, based on the learning result of the learning step, from information input about the conversation partner and causing the robot device to express the learned highly evaluated action according to the identified conversation partner. The program can thereby cause the robot device to identify a conversation partner and learn an action that the identified conversation partner evaluates highly, to identify the conversation partner from the input information based on the learning result, and to express the learned highly evaluated action according to the identified conversation partner.

[0151] A robot device whose actions are controlled by such a program learns the conversation partner and the action that the conversation partner evaluates highly and, based on the learning result, can express a highly evaluated action according to the conversation partner.

[0152] The recording medium according to the present invention has recorded on it a program that causes execution of a learning step of causing a robot device to identify a conversation partner and learn an action that the identified conversation partner evaluates highly, and an action control step of identifying the conversation partner, based on the learning result of the learning step, from information input about the conversation partner and causing the robot device to express the learned highly evaluated action according to the identified conversation partner. The recorded program can thereby cause the robot device to identify a conversation partner and learn an action that the identified conversation partner evaluates highly, to identify the conversation partner from the input information based on the learning result, and to express the learned highly evaluated action according to the identified conversation partner.

[0153] A robot device whose actions are controlled by the program recorded on such a recording medium learns the conversation partner and the action that the conversation partner evaluates highly and, based on the learning result, can express a highly evaluated action according to the conversation partner.

[0154] The robot device according to the present invention includes learning means for identifying a conversation partner and learning an action that the identified conversation partner evaluates highly, and action control means for identifying the conversation partner, based on the learning result of the learning means, from information input about the conversation partner and expressing the learned highly evaluated action according to the identified conversation partner. The robot device can thereby identify a conversation partner and learn an action that the identified conversation partner evaluates highly, identify the conversation partner from the input information based on the learning result, and express the learned highly evaluated action according to the identified conversation partner. The robot device can therefore learn the conversation partner and the action that the conversation partner evaluates highly and, based on the learning result, express a highly evaluated action according to the conversation partner.

[Brief description of the drawings]

FIG. 1 is a perspective view showing the external configuration of a robot device according to an embodiment of the present invention.

FIG. 2 is a block diagram showing the circuit configuration of the robot device.

FIG. 3 is a block diagram showing the software configuration of the robot device.

FIG. 4 is a block diagram showing the configuration of the middleware layer in the software configuration of the robot device.

FIG. 5 is a block diagram showing the configuration of the application layer in the software configuration of the robot device.

FIG. 6 is a block diagram showing the configuration of the behavior model library of the application layer.

FIG. 7 is a diagram used to explain the finite probability automaton that serves as information for determining the behavior of the robot device.

FIG. 8 is a diagram showing the state transition table prepared for each node of the finite probability automaton.

FIG. 9 is a diagram used to explain learning of information by the RNN.

FIG. 10 is a diagram showing a specific example of the structure of the RNN.

FIG. 11 is a diagram used to explain learning performed by generating an inverse RNN (RNN^{-1}) from the RNN.

FIG. 12 is a diagram used to explain learning of user information by the RNN.

FIG. 13 is a diagram used to explain user identification using the trained RNN.

FIG. 14 is a diagram used to explain learning of a praised action.

FIG. 15 is a diagram used to explain expressing a praised action using the inverse RNN.

FIG. 16 is a flowchart showing the processing procedure for identifying a user and expressing a praised action.

FIG. 17 is a diagram used to explain an example of associating the user feature sequence obtained by learning user identification with the action feature sequence obtained by learning the action.

FIG. 18 is a diagram showing a unit (neuron) of the RNN together with its inputs and outputs.

FIG. 19 is a diagram used to explain the RNN expressed in mathematical form.

FIG. 20 is a diagram used to explain the vector data input to the RNN as user information during learning.

FIG. 21 is a diagram used to explain the inverse RNN expressed in mathematical form.

[Explanation of symbols]

1 robot device, 10 CPU

Continued from the front page. F-terms (reference): 2C150 CA02 DA05 DA24 DA25 DA26 DA27 DA28 DF03 DF04 DF06 DF33 ED42 ED52 EF07 EF16 EF23 EF29 EF33 EF36; 3F059 AA00 BA00 BB06 DB04 DC00 DC01 FC15; 3F060 AA00 BA10 CA14

Claims

[Claims]

1. An action control method for a robot device, comprising: a learning step of causing the robot device to identify a conversation partner and learn an action that the identified conversation partner evaluates highly; and an action control step of identifying the conversation partner, based on the learning result of the learning step, from information input about the conversation partner, and causing the robot device to express the learned highly evaluated action according to the identified conversation partner.

2. The action control method for a robot device according to claim 1, wherein the learning step includes a conversation partner learning step of causing the robot device to learn information input about the conversation partner, and an action learning step of causing the robot device to learn a highly evaluated action, and wherein, in the action control step, the conversation partner is identified, based on the learning result of the conversation partner learning step, from the information input about the conversation partner, and the action having a high evaluation value learned in the action learning step is expressed by the robot device according to the identified conversation partner.

3. The action control method for a robot device according to claim 2, wherein, in the action learning step, the robot device learns the action by an RNN (recurrent neural network), and, in the action control step, the action is expressed by the robot device based on an inverse RNN (RNN^{-1}) that is in an inverse-function relationship with the trained RNN.

4. The action control method for a robot device according to claim 1, wherein the information input about the conversation partner is face information and voice information of the conversation partner.

5. The action control method for a robot device according to claim 1, wherein the highly evaluated action is an action for which the robot device is praised by the conversation partner.

6. The action control method for a robot device according to claim 1, wherein the robot device has an external shape modeled on an animal.

7. The action control method for a robot device according to claim 1, wherein the conversation partner is a human.

8. A program that causes execution of: a learning step of causing a robot device to identify a conversation partner and learn an action that the identified conversation partner evaluates highly; and an action control step of identifying the conversation partner, based on the learning result of the learning step, from information input about the conversation partner, and causing the robot device to express the learned highly evaluated action according to the identified conversation partner.

9. A recording medium on which is recorded a program that causes execution of: a learning step of causing a robot device to identify a conversation partner and learn an action that the identified conversation partner evaluates highly; and an action control step of identifying the conversation partner, based on the learning result of the learning step, from information input about the conversation partner, and causing the robot device to express the learned highly evaluated action according to the identified conversation partner.

10. A robot device comprising: learning means for identifying a conversation partner and learning an action that the identified conversation partner evaluates highly; and action control means for identifying the conversation partner, based on the learning result of the learning means, from information input about the conversation partner, and expressing the learned highly evaluated action according to the identified conversation partner.

11. The robot device according to claim 10, wherein the learning means includes conversation partner learning means for learning information input about the conversation partner, and action learning means for learning a highly evaluated action, and wherein the action control means identifies the conversation partner, based on the learning result of the conversation partner learning means, from the information input about the conversation partner, and expresses the highly evaluated action learned by the action learning means according to the identified conversation partner.

12. The robot device according to claim 10, wherein the action learning means learns the action by an RNN (recurrent neural network), and the action control means expresses the action based on an inverse RNN (RNN^{-1}) that is in an inverse-function relationship with the trained RNN.

13. The robot device according to claim 10, wherein the information input about the conversation partner is face information and voice information of the conversation partner.

14. The robot device according to claim 10, wherein the highly evaluated action is an action for which the robot device is praised by the conversation partner.

15. The robot device according to claim 11, wherein the robot device has an external shape modeled on an animal.

16. The robot device according to claim 11, wherein the conversation partner is a human.