JP2010094799A

JP2010094799A - Humanoid robot

Info

Publication number: JP2010094799A
Application number: JP2008290366A
Authority: JP
Inventors: Hiroaki Koike; 浩昭小池; Yoshitaka Ajioka; 嘉孝味岡
Original assignee: LITTLEISLAND Inc
Current assignee: LITTLEISLAND Inc
Priority date: 2008-10-17
Filing date: 2008-10-17
Publication date: 2010-04-30

Abstract

PROBLEM TO BE SOLVED: To provide a humanoid robot that conveys information with a face and voice resembling those of a specific person, and that can express the affection or the like that the speaker intends to convey by reflecting the speaker's personality in the movements made during voice output.

SOLUTION: Phoneme data, such as the Japanese syllabary (the fifty sounds), needed to record the specific person's voice and speak with it, and motion data representing the specific person's habits are registered in advance in a storage (3) mounted inside a robot body (1) having a face resembling the specific person. During speech, the voice is synthesized by a main CPU unit (211) within a controller (2) and output through a speaker (26). At the same time, actuators (9) to (25) are driven to move the limbs.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a humanoid robot that communicates using gestures, facial expressions, and the like based on images, sounds, and input from a plurality of sensors.

Various robots are widely used in consumer fields such as toys and games and in industrial fields such as manufacturing and distribution. Among these, so-called humanoid robots, which closely resemble humans in appearance and communicate with users, have also been developed.
Some conventional humanoid robots are tailored to a specific individual with respect to one of the face, the utterance mechanism, or the limb movement, but in general the face, the utterance mechanism, and the limb movement are specified uniformly.

For example, the invention disclosed in Patent Document 1 describes a robot that reads e-mail aloud. The robot analyzes the content of the text to extract emotion-related information and produces utterances and gestures (body and hand movements) corresponding to the extracted emotion information.

Patent Document 1: JP 2006-142407 A

As in the conventional example described above, a robot can analyze the content of the text it is to convey, extract emotion-related information from the text, and perform a gesture corresponding to the extracted emotion information together with the voice. In this way it can express emotion through both voice and movement, that is, through verbal and non-verbal means.
However, the gestures performed by that robot are limited to gestures corresponding to specific emotional expressions, and the movements merely express, as gestures, the actions for joy, anger, sorrow, and pleasure that are fixedly stored in the robot.

In general, the gestures people make when speaking in conversation vary widely from person to person, and each person has their own habits. Gestures accompanying a specific individual's speech are preferably performed in a way that includes that individual's habits. When such gestures are performed with a face and voice resembling the specific individual, the robot can not only convey information but also express emotions such as affection.

An object of the present invention is to provide a humanoid robot that conveys information with a face and voice resembling a specific individual and that can express the affection or the like that the speaker intends to convey by reflecting the personality of that specific speaker in the movements made while speaking.

Another object of the present invention is to provide a humanoid robot in which the voice and habits of the specific individual can be changed easily and which can reproduce the habits of the specific individual.

The humanoid robot of the present invention, made to solve the above problems, has a human-shaped exterior with a face resembling a specific individual, a group of actuators that cause the parts constituting the exterior to move in the same way as the corresponding parts of the human body, and a speaker that emits voice. The robot comprises: a model information database that stores information on the specific individual serving as the model of the robot; a speech synthesis phoneme database that stores phonemes of the specific individual's voice; a habit motion database that stores movements such as the specific individual's habits as time-series data; a partner information database that stores face image data and feature information of conversation partners; an action history database that stores the robot's action history; and a knowledge information database that stores information obtained from dialogue and information on the Internet. The robot further comprises: speech synthesis means for synthesizing the specific individual's phonemes and speaking based on the data stored in the speech synthesis phoneme database; operation means for performing the specific individual's habits based on the data stored in the habit motion database; emotion recognition means for recognizing the emotion toward a conversation partner based on the data stored in the partner information database, the model information database, and the action history database; speech recognition means for recognizing the content of the conversation partner's voice data; learning means for storing, according to the recognition result of the speech recognition means, dialogue with the conversation partner and information on the Internet as knowledge in the knowledge information database; and emotion production control means for producing an emotion toward the conversation partner according to the recognition result of the emotion recognition means and the knowledge accumulated by the learning means, and for controlling the speech synthesis means and the operation means according to the result of that production.

The humanoid robot converses with the conversation partner based on the model information database, in which scenarios for the specific individual are described, and recognizes from that conversation the robot's emotion toward the partner. To perform in a manner that reflects that emotion, the robot controls the speaker to output the voice produced by speech synthesis and controls the body habit imitation unit to reproduce the movements of each body part stored in the habit data corresponding to that voice, so that the humanoid robot operates with voice and movement synchronized.

According to the present invention, a robot whose face resembles a specific individual speaks in a voice resembling that individual's and, while speaking, performs the individual's habits through movement, so that emotion and affection can be expressed.

BEST MODE FOR CARRYING OUT THE INVENTION

An embodiment of the present invention will be described with reference to FIGS. 1 to 9.
The present invention realizes an unprecedented humanoid robot resembling a specific individual by reproducing the specific individual's voice and habits on a robot whose face resembles that individual. In outline, with reference to FIGS. 1 and 2, phoneme data such as the Japanese syllabary, needed to record the specific individual's voice and speak with it, and motion data representing the specific individual's habits are registered in advance in the storage (220) mounted in the control unit (2) inside the robot body (1), whose face resembles the specific individual. At the time of utterance, the voice is synthesized by the main CPU unit (211) in the control unit (2) and output from the speaker (26), and at the same time the actuators (9) to (25) are driven to move the limbs.

FIG. 1 is a configuration diagram of the robot body (1) showing an outline of the present invention.
The robot body (1), whose face resembles a specific individual based on photographs or the like, is composed mainly of a speaker (26) serving as an utterance mechanism based on recorded sound sources or the like, microphones (27), (28) serving as a listening mechanism, cameras (11), (12), actuators (9) to (25) for limb movement, a network connection mechanism (6), touch sensors (29), (30), a temperature sensor (31), and a control unit (2) that supervises these parts.

FIG. 2 is a configuration diagram of the control unit (2) included in the robot body (1).
The control unit (2) comprises a main CPU unit (210) that provides the artificial intelligence of the humanoid robot, a sub CPU unit (230) that controls the actuators and sensors, and a storage unit (220) that stores data files such as databases. The main CPU unit (210) and the sub CPU unit (230) constantly exchange information through the communication module (213) in the main CPU unit (210) and the communication module (232) in the sub CPU unit (230).
The main CPU unit (210) comprises a CPU (211) that executes the application software serving as the artificial intelligence, a memory (212), a sensor input/output module (214) that controls the cameras (7), (8) and the wireless LAN (6), and a communication module (213) for obtaining information from the sub CPU.
The sub CPU unit (230) comprises a CPU (231) that controls each actuator and manages each sensor; an actuator control module (237) that controls the actuators (9) to (25); a sensor input/output module (236) that controls the touch sensors (29), (30) for detecting handshakes and the like, the temperature sensor (31) for measuring the ambient temperature, the human presence sensor (32) for detecting people, and so on; a GPS (233) for obtaining position information; an acceleration sensor (234) for detecting vibration, such as when the humanoid robot is shaken; and a temperature sensor (235) for measuring the temperature inside the humanoid robot.

FIG. 3 is a configuration diagram showing an outline of the software.
The software runs on the operating system (S101) of the CPU in the control unit (2).
On the operating system (S101) are an application program (S102) that serves as the artificial intelligence, a camera device program (S117) that controls the cameras, a microphone device program (S118) that controls the microphones, a speaker device program (S119) that controls the speaker, and an action history database (S103) that stores the specific individual's data, the action history, and the like.
The application program (S102) comprises an image recognition program (S111) for recognizing the speaker's face and objects, a speech recognition program (S112) for recognizing speech and identifying the conversation partner, an emotion recognition program (S113) for recognizing the conversation partner's emotion, a learning control program (S114) that stores dialogue with the conversation partner and information on the Internet in the databases as knowledge, a speech synthesis program (S115) for synthesizing speech, and an emotion production program for producing emotion.
The database (S103) comprises a knowledge information database (S121) that stores information obtained from dialogue and information on the Internet, a speech synthesis phoneme database (S122) for synthesizing a voice resembling the specific individual's, a habit motion database (S123) that stores motion data of the specific individual's habits, a model information database (S124) that describes the scenarios used to portray the specific individual, a partner information database (S125) that stores the face information and features of the people who converse with the robot, and an action history database (S126) that stores the data acquired from each sensor of the system over time.

To portray the specific individual, the present invention causes the robot to perform a response action that depends on its emotion toward the speaker it is conversing with, based on inputs from a plurality of sensors.
In this control method, the pattern of the response action is varied according to the emotion level toward the speaker registered in the partner information database.
The partner information database holds an emotion level of "like", "normal", or "dislike" for each speaker. As response actions, for example, the robot speaks to a speaker it likes with gentle wording and behaves charmingly, and it acts as if ignoring a person it dislikes.
The emotion level is also updated. For example, even for a speaker the robot loves, the level changes from "love" to "like" if that speaker talks to the robot less often.
The wording of a response action is varied by changing the intonation. The accompanying movement is selected from the pre-registered habit motion database and executed.

The flow for portraying the specific individual is described with reference to the flowchart of FIG. 4. When a speaker starts talking to the robot, the robot engages in everyday conversation (S401). During this, the robot performs image-based face recognition to identify the speaker (S402). In parallel, it performs voice-based speaker identification (S403). When the results of (S402) and (S403) agree with a certain degree of confidence, the robot calls the speaker by name to confirm and fix the identity (S404). The robot then checks the identified speaker against the partner information database. If the speaker is liked (S405), the favorite person mode (S407) is processed; if the speaker is disliked (S406), the disliked person mode (S408) is processed; otherwise, the normal person mode (S409) is processed. When the processing of (S407), (S408), or (S409) ends, the flow returns to everyday conversation (S401).

As an example, the processing of the favorite person mode (S407) is described with reference to the flowchart of FIG. 5. On entering the favorite person mode, the robot carries on everyday conversation for a liked person while synthesizing speech with charming phonemes (S501). During this, the robot measures the speaker's emotion from the voice (S502) and raises a topic intended to pin down the emotion, thereby identifying the speaker's emotion (S503). If the speaker is in good spirits (S504), the energetic mode (S506) is processed; if the speaker is depressed (S505), the comfort mode (S507) is processed; otherwise, the normal mode (S508) is processed.
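The two flowcharts (FIGS. 4 and 5) amount to a two-level dispatch, which can be sketched roughly as follows. This is only an illustrative sketch: the names `Identification`, `confirm_speaker`, `select_person_mode`, and `select_conversation_mode`, the confidence threshold of 0.8, and the string labels are all assumptions and do not come from the specification, which also does not say what happens when face and voice identification disagree (the sketch simply returns `None`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Identification:
    """Result of one recognizer (hypothetical structure)."""
    name: str
    confidence: float  # assumed range 0.0 to 1.0


def confirm_speaker(face: Identification, voice: Identification,
                    threshold: float = 0.8) -> Optional[str]:
    """S402 to S404: accept the speaker only when face and voice
    identification agree and both are confident enough.
    The threshold value is an assumption."""
    if face.name == voice.name and min(face.confidence, voice.confidence) >= threshold:
        return face.name
    return None


def select_person_mode(emotion_level: str) -> str:
    """S405 to S409: choose a mode from the emotion level stored
    in the partner information database."""
    if emotion_level == "like":
        return "favorite_person_mode"   # S407
    if emotion_level == "dislike":
        return "disliked_person_mode"   # S408
    return "normal_person_mode"         # S409


def select_conversation_mode(speaker_mood: str) -> str:
    """S504 to S508: choose a conversation mode from the
    speaker's identified mood."""
    if speaker_mood == "energetic":
        return "energetic_mode"         # S506
    if speaker_mood == "depressed":
        return "comfort_mode"           # S507
    return "normal_mode"                # S508


# Usage with a toy stand-in for the partner information database.
partner_emotion = {"Hanako": "like", "Taro": "dislike"}
speaker = confirm_speaker(Identification("Hanako", 0.93),
                          Identification("Hanako", 0.88))
if speaker is not None:
    print(select_person_mode(partner_emotion.get(speaker, "normal")))
    print(select_conversation_mode("depressed"))
```

Keeping the person mode and the conversation mode as separate functions mirrors the separation between FIG. 4 and FIG. 5.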

Speech synthesis for the favorite person mode, the disliked person mode, and the normal person mode is handled by selecting a different intonation from the speech synthesis phoneme database.

The conversation content for the energetic mode, the comfort mode, and the normal mode is described and registered in advance in the model information database (S124), as model information of the specific individual serving as the model, in the form of scenarios that follow the flow of conversation.

FIG. 6 shows an example of a scenario registered in the model information database (S124). As shown in the figure, the model information database (S124), which serves as the model for the robot, describes in XML format a name (S601), date of birth (S602), hobbies (S603), special skills (S604), scenarios for the performance (S605), speech recognition phrases that trigger the scenarios (S606), the voice for the performance (S608), and the movement for the performance (S609).
Because the robot is modeled on a specific individual, the highly flexible XML format is used so that the individual's habits and the like can be portrayed.

The speech recognition phrases (S606) serve as the speech recognition dictionary.
The number of speech recognition phrases described in the model information database (S124) is the number of phrases the robot can recognize.
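As a rough illustration of how such an XML description might be loaded, the sketch below parses a hypothetical model file and builds a lookup table from trigger phrase to voice and movement. The element and attribute names (`model`, `scenario`, `trigger`, `voice`, `motion`, `mode`) and all sample values are invented for this example; the actual schema of FIG. 6 is not reproduced in the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical model information file; tag names and values are assumptions.
MODEL_XML = """
<model>
  <name>Taro Yamada</name>
  <birthdate>1950-01-01</birthdate>
  <hobby>fishing</hobby>
  <skill>shogi</skill>
  <scenario mode="energetic">
    <trigger>good morning</trigger>
    <voice>Good morning! You look well today.</voice>
    <motion>nod_twice</motion>
  </scenario>
  <scenario mode="comfort">
    <trigger>i am tired</trigger>
    <voice>Take it easy and rest a little.</voice>
    <motion>tilt_head</motion>
  </scenario>
</model>
"""


def load_scenarios(xml_text: str) -> dict:
    """Build a table keyed by trigger phrase. The keys double as the
    speech recognition dictionary, so its size is the number of
    phrases the robot can recognize."""
    root = ET.fromstring(xml_text.strip())
    table = {}
    for scenario in root.findall("scenario"):
        phrase = scenario.findtext("trigger").strip().lower()
        table[phrase] = {
            "mode": scenario.get("mode"),
            "voice": scenario.findtext("voice"),
            "motion": scenario.findtext("motion"),
        }
    return table


scenarios = load_scenarios(MODEL_XML)
print(len(scenarios), "recognizable phrases")
print(scenarios["good morning"]["motion"])
```

Adding a new habit or phrase then only requires a new `scenario` element, which is one way to read the flexibility the specification attributes to XML.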

The speech synthesis phoneme database (S122), on the other hand, is populated in advance by reading aloud basic sentences prepared beforehand and registering the recordings.

In addition, time-series data of each joint representing the individual's habits are registered in advance in the habit motion database (S123).
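One possible shape for such per-joint time-series data is a list of keyframes per joint, sampled at the current playback time so that movement can follow the synthesized voice. The sketch below assumes (time in seconds, angle in degrees) keyframes and linear interpolation between them; the specification states neither the storage format nor the interpolation method, and the joint names and the `HEAD_SCRATCH` motion are invented.

```python
from bisect import bisect_right

# Hypothetical habit motion: joint name -> keyframes sorted by time.
HEAD_SCRATCH = {
    "right_shoulder": [(0.0, 0.0), (0.5, 60.0), (1.5, 60.0), (2.0, 0.0)],
    "neck_tilt": [(0.0, 0.0), (1.0, 10.0), (2.0, 0.0)],
}


def sample_joint(keyframes, t):
    """Return the joint angle at time t by linear interpolation,
    holding the first or last angle outside the keyframe range."""
    times = [time for time, _ in keyframes]
    if t <= times[0]:
        return keyframes[0][1]
    if t >= times[-1]:
        return keyframes[-1][1]
    i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
    (t0, a0), (t1, a1) = keyframes[i - 1], keyframes[i]
    return a0 + (a1 - a0) * (t - t0) / (t1 - t0)


def sample_pose(motion, t):
    """Sample every joint of a habit motion at playback time t."""
    return {joint: sample_joint(kf, t) for joint, kf in motion.items()}


# Poses that a controller could send to the actuators while the
# synthesized voice plays back.
for t in (0.0, 0.25, 1.0, 2.0):
    print(t, sample_pose(HEAD_SCRATCH, t))
```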

As shown in FIG. 7, the partner information database (S125) consists of fields such as an ID (S701), the conversation partner's name (S702), date of birth (S703), hobbies (S704), special skills (S705), phonemes for speech recognition (S706), eigenvalues for face recognition (S707), and the emotion level toward the conversation partner (S708). The emotion level (S708) is updated by determining an emotion value from two sources. The first is the action history database (S126), which provides the frequency of conversation with the robot and the emotion value the robot assigned to the speaker during conversations that followed a conversation sequence. The second is the data acquired from the acceleration sensor (234) in the sub CPU unit (230) of the control unit (2), from which the robot judges, for example, whether it is being held gently or being hit on the head.
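The record layout and the update of the emotion level could be sketched as below. The field names follow the list above, but the scoring rule is purely an assumption made for illustration: the specification names the inputs (conversation frequency, emotion values from conversations, accelerometer-based judgments) without giving any formula, weights, or thresholds.

```python
from dataclasses import dataclass, field


@dataclass
class PartnerRecord:
    partner_id: int                                        # S701
    name: str                                              # S702
    birthdate: str                                         # S703
    hobby: str                                             # S704
    skill: str                                             # S705
    voice_phonemes: list = field(default_factory=list)     # S706
    face_eigenvalues: list = field(default_factory=list)   # S707
    emotion_level: str = "normal"                          # S708


def update_emotion_level(record: PartnerRecord,
                         conversations_per_week: int,
                         mean_conversation_emotion: float,
                         gentle_holds: int,
                         head_hits: int) -> str:
    """Combine the inputs into one score and map it to a level.
    All weights and thresholds are illustrative assumptions."""
    score = min(conversations_per_week, 7) / 7.0   # frequency, 0 to 1
    score += mean_conversation_emotion             # assumed range -1 to 1
    score += 0.2 * gentle_holds - 0.5 * head_hits  # accelerometer events
    if score >= 1.0:
        record.emotion_level = "like"
    elif score <= -0.5:
        record.emotion_level = "dislike"
    else:
        record.emotion_level = "normal"
    return record.emotion_level


partner = PartnerRecord(1, "Hanako", "1980-04-01", "gardening", "piano")
print(update_emotion_level(partner, conversations_per_week=7,
                           mean_conversation_emotion=0.5,
                           gentle_holds=1, head_hits=0))
```

Because the score is recomputed from recent history, a partner who stops talking to the robot drifts back toward "normal", which is in the spirit of the level change described earlier.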

As shown in the figure, the action history database (S126) consists of fields such as an ID (S801), the date (S802) and time (S803) at which the history entry was registered, the place (S804), the conversation partner (S805), the scenario (S806) to which the conversation transitioned at that time, the face-recognition eigenvalues (S807) newly registered when the conversation partner is unknown, and new words (S808) such as an unknown person's name or a newly learned word. New words (S808) are registered by the learning control program (S114) when they are obtained through dialogue with the conversation partner or from data on the Internet. The place (S804) is identified and registered from the position information acquired from the GPS (233) in the sub CPU unit (230) of the control unit (2) and from the conversation sequence.

The knowledge information database (S121) consists of fields such as an ID (S901), the date the information was acquired (S902), time (S903), place (S904), conversation partner (S905), the category (S906) of the scenario to which the conversation transitioned at that time, and new words (S907), which are newly learned names of people and words. Entries are registered in the knowledge information database by the learning control program (S114) when data are obtained from the action history database (S126) or from the Internet.
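A minimal sketch of this registration step is shown below using an in-memory SQLite table whose columns follow the fields listed above. The use of a relational store, the column names, the `UNIQUE` constraint on the new word (so that a word is learned only once), and the function names `open_knowledge_db` and `learn_word` are all assumptions; the specification does not describe the storage technology.

```python
import sqlite3


def open_knowledge_db() -> sqlite3.Connection:
    """Create an in-memory table mirroring fields S901 to S907."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE knowledge (
            id            INTEGER PRIMARY KEY,  -- S901
            acquired_date TEXT,                 -- S902
            acquired_time TEXT,                 -- S903
            place         TEXT,                 -- S904
            partner       TEXT,                 -- S905
            category      TEXT,                 -- S906
            new_word      TEXT UNIQUE           -- S907 (uniqueness is an assumption)
        )
    """)
    return conn


def learn_word(conn, date, time, place, partner, category, word) -> bool:
    """Register a new word; return True if it was not known before."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO knowledge "
        "(acquired_date, acquired_time, place, partner, category, new_word) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (date, time, place, partner, category, word),
    )
    conn.commit()
    return cur.rowcount == 1


conn = open_knowledge_db()
print(learn_word(conn, "2008-10-17", "09:30", "living room",
                 "Hanako", "greeting", "Pochi"))
print(learn_word(conn, "2008-10-18", "10:00", "garden",
                 "Hanako", "greeting", "Pochi"))
```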

FIG. 1 is an explanatory diagram showing an example of the system of the present invention.
FIG. 2 is a block diagram showing the electrical configuration of the control unit, actuators, and sensors shown in FIG. 1.
FIG. 3 is a block diagram showing the software configuration of the control unit.
FIG. 4 is a flowchart of the software that expresses emotion.
FIG. 5 is a flowchart of the software in the favorite person mode.
FIG. 6 shows the model information database.
FIG. 7 shows the partner information database.
FIG. 8 shows the action history database.
FIG. 9 shows the knowledge information database.

Explanation of symbols

1 Robot
2 Control unit
210 Main CPU unit
220 Storage
230 Sub CPU unit
6 Wireless LAN
7, 8 Cameras
9 to 25 Actuators
26 Speaker
27, 28 Microphones
29, 30 Touch sensors
31 Temperature sensor
32 Human presence sensor

Claims

A humanoid robot whose face resembles a specific individual, the robot having a face part, an utterance mechanism, a listening mechanism, a camera mechanism, a limb movement mechanism, a network connection mechanism, an information storage mechanism, and a control mechanism that supervises these parts,
the humanoid robot comprising response action execution means for causing the robot to execute a response action according to a stimulus to the robot recognized based on inputs from a plurality of sensors provided in each part of the robot's limbs.

The humanoid robot according to claim 1, wherein the face part is a likeness of the face of a specific individual, the utterance mechanism uses a sound source approximating the voice of the specific individual, and the response action is an action approximating the habits of the specific individual.

The humanoid robot according to claim 1, further comprising:
a model information database that stores information on the specific individual;
a speech synthesis phoneme database that stores phonemes of the specific individual;
a habit motion database that stores movements such as the habits of the specific individual as time-series data;
a partner information database that stores face image data and feature information of conversation partners;
an action history database that stores the robot's action history;
a knowledge information database that stores information obtained from dialogue and information on the Internet;
speech synthesis means for synthesizing the specific individual's phonemes and speaking based on the data stored in the speech synthesis phoneme database;
operation means for performing the specific individual's habits based on the data stored in the habit motion database;
emotion recognition means for recognizing the emotion toward a conversation partner based on the data stored in the partner information database, the model information database, and the action history database;
speech recognition means for recognizing the content of the conversation partner's voice data;
learning means for storing, according to the recognition result of the speech recognition means, dialogue with the conversation partner and information on the Internet as knowledge in the knowledge information database; and
emotion production control means for producing an emotion toward the conversation partner according to the recognition result of the emotion recognition means and the knowledge accumulated by the learning means, and for controlling the speech synthesis means and the operation means according to the result of that production.