JP5751610B2

JP5751610B2 - Conversation robot

Info

Publication number: JP5751610B2
Application number: JP2010221556A
Authority: JP
Inventors: 哲則小林; 真也藤江; 洋一松山
Original assignee: Waseda University
Current assignee: Waseda University
Priority date: 2010-09-30
Filing date: 2010-09-30
Publication date: 2015-07-22
Anticipated expiration: 2030-09-30
Also published as: JP2012076162A

Description

The present invention relates to a conversation robot and is suitable for use, for example, in a conversation held among several people who share a topic (hereinafter referred to as a group conversation).

In recent years, many robots that use electrical or magnetic action to perform movements resembling those of humans or animals have been commercialized. Among such robots, some are known to be equipped with a voice conversation function that lets them hold conversations with a user similar to those that people hold with one another every day (see, for example, Non-Patent Document 1).

In practice, some of these conversation robots carry various external sensors such as CCD (Charge Coupled Device) cameras and microphones, recognize external conditions from the outputs of these sensors, and act autonomously based on the recognition results. One known example is a conversation robot that recognizes the direction of a speaking user from its external sensors and then, for example, moves its body so that the body front and the head front face that user, conversing as if it were reacting to the user's speech and directing its gaze at the user.

Non-Patent Document 1: 伊吹征太, 木村憲次, 武田夏佳: "Life support system for the elderly using communication robots," Journal of the Japan Society of Mechanical Engineers, Vol. 108, No. 1038, pp. 392-395 (2005)

However, such a conversation robot has the following problem. If, during an actual group conversation, the robot turns its body front and head front toward the speaking user and directs its gaze only at that user, both the speaking user and the other users are given the unnatural impression that the users who are not speaking have been excluded from the group conversation, even though they too are participants.

The present invention has been made in view of the above, and its object is to propose a conversation robot capable of natural conversation suited to the situation.

To solve this problem, claim 1 of the present invention is a conversation robot that has an upper body part rotatably provided on a body part and a head rotatably provided on the upper body part, and that autonomously conducts a group conversation with a plurality of objects based on output from an external sensor. The robot comprises: position detection means for detecting the positions of the plurality of objects based on the output obtained from the external sensor; role identification means for recognizing, based on the output obtained from the external sensor, one of the plurality of objects as the main attention object and the remaining objects as subordinate attention objects; body part control means for controlling the rotation angle of the upper body part so that the front of the upper body part faces the center-of-gravity direction indicated by a center-of-gravity direction line obtained from the positions of the main attention object and the subordinate attention objects; and head control means for controlling the rotation angle of the head so that the front of the head faces the main attention object. Based on data from the role identification means, the body part control means and the head control means can rotate only the head so that the front of the head faces the attention object while the front of the upper body part remains facing the center-of-gravity direction, and can rotate only the upper body part so that the front of the upper body part faces the center-of-gravity direction while the front of the head remains facing the main attention object.

According to claim 2 of the present invention, the plurality of objects are users, and the role identification means recognizes, based on images and/or sound obtained from the external sensor, the user who is speaking or the user who is the main listener among the plurality of users as the main attention object, and recognizes the remaining users as subordinate attention objects.

According to claim 3 of the present invention, the role identification means recognizes an estimated next attention object, that is, the object to which the user recognized as the main attention object is paying attention, and makes that estimated next attention object the new main attention object after the user finishes speaking.

According to claim 4 of the present invention, the center-of-gravity direction angle formed by the center-of-gravity direction line and a predetermined reference line is obtained as follows: the main attention object angle, formed by the reference line and the main attention object direction line extending toward the main attention object, and the subordinate attention object angles, formed by the reference line and the subordinate attention object direction lines extending toward the subordinate attention objects, are calculated, and the sum of the main attention object angle and all subordinate attention object angles is divided by the total number of objects.
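The calculation in claim 4 is an arithmetic mean of direction angles. The following is a minimal sketch of that arithmetic, assuming all angles are measured in degrees from the same reference line and lie in a range that does not wrap around (for example 0 to 180 degrees in front of the robot). The function name and argument names are illustrative and do not come from the patent.

```python
def center_of_gravity_angle(main_angle_deg, subordinate_angles_deg):
    """Mean of the main and subordinate attention object angles.

    All angles are assumed to be measured from one shared reference line.
    """
    angles = [main_angle_deg, *subordinate_angles_deg]
    # Sum of all angles divided by the total number of objects.
    return sum(angles) / len(angles)


if __name__ == "__main__":
    # One main object at 60 degrees and one subordinate object at 120 degrees.
    print(center_of_gravity_angle(60.0, [120.0]))
```

With a single user (no subordinate objects) the result equals the main attention object angle, which matches the one-to-one case in which the head and the upper body face the same user.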

According to claim 5 of the present invention, when directing the front of the upper body part toward the center-of-gravity direction obtained from the plurality of objects would make the rotation angle of the upper body part equal to or greater than a predetermined rotation angle, the body part control means obtains the center-of-gravity direction after excluding a predetermined object from the plurality of objects, so that the rotation angle of the upper body part falls within the predetermined rotation angle range.
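The exclusion rule of claim 5 can be sketched as a loop that recomputes the mean angle after dropping objects. The claim only says that "a predetermined object" is excluded; the choice below (drop the subordinate object whose angle is farthest from the zero-rotation direction) is an assumption made for illustration, as are the function and parameter names. Angles are taken as signed degrees relative to the zero-rotation direction of the upper body part.

```python
def limited_center_angle(main_angle, subordinate_angles, limit):
    """Return (center angle, subordinate angles actually used).

    Assumption: objects are excluded farthest-first until the required
    rotation is strictly below `limit`. The main object is never excluded.
    """
    remaining = sorted(subordinate_angles, key=abs)  # nearest to zero first
    while True:
        angles = [main_angle, *remaining]
        center = sum(angles) / len(angles)
        # Stop when the rotation is within range or nothing is left to drop.
        if abs(center) < limit or not remaining:
            return center, remaining
        remaining.pop()  # exclude the farthest subordinate object


if __name__ == "__main__":
    print(limited_center_angle(20.0, [40.0, 170.0], 60.0))
```

If the main attention object alone is already out of range, this sketch simply returns its angle and leaves the handling to the caller.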

Claim 6 of the present invention is characterized by comprising indicating part control means for controlling the movement of an indicating part.

Claim 7 of the present invention is characterized by comprising movement control means that, when directing the front of the upper body part toward the center-of-gravity direction would make the rotation angle of the upper body part equal to or greater than a predetermined rotation angle, moves the body part by moving means so that the rotation of the upper body part stays within the range of that rotation angle.

According to the present invention, the robot can give the impression that it is paying attention not only to the main attention object but also to the subordinate attention objects, and can thus hold a natural conversation suited to the situation.

A schematic diagram showing the external configuration of the conversation robot of the present invention.
A schematic diagram showing how the conversation robot behaves when user B newly joins.
A block diagram showing the circuit configuration of the conversation device mounted in the conversation robot.
A schematic diagram used to explain how the center-of-gravity direction is obtained.
A schematic diagram showing how the conversation robot behaves when user B leaves the group conversation.

Embodiments of the present invention are described in detail below with reference to the drawings.

(1) Outline of the Present Invention
In FIG. 1, reference numeral 1 denotes a conversation robot. The conversation robot 1 is shaped to resemble a person and comprises a head 3 rotatably connected to a body part 2 and arms 4a and 4b movably connected to the left and right sides of the body part 2. In practice, the body part 2 has a base 6 fitted with wheels 5 for movement, and an upper body part 8 is mounted on the base 6 so that it can rotate in rotation direction Ca about the z direction as its rotation axis.

The arms 4a and 4b are attached to the upper left and right of the upper body part 8 via shoulder joints 9a and 9b, and the head 3, which can move up, down, left, and right, is connected to a neck 8a provided at the top. In the arm 4a, an upper arm 11 is attached by the shoulder joint 9a so that it can move in the x, y, and z directions relative to the upper body part 8; a forearm 13 is rotatably attached to the upper arm 11 via an elbow joint 12; and a hand 15 is rotatably attached to the forearm 13 via a wrist joint 14. With the shoulder joint 9a, elbow joint 12, and wrist joint 14 driven by their respective actuators (not shown), the arms 4a and 4b can imitate the movements of a human arm, for example by thrusting the upper arm 11, forearm 13, and hand 15 toward the front 8b of the upper body part 8 (hereinafter simply called the body front).

The head 3 has its front 3a (hereinafter simply called the head front) on the same side as the body front 8b, and an eye part 21 imitating human eyes and a mouth part 22 imitating a human mouth are formed on the head front 3a. A pair of CCD (Charge Coupled Device) cameras 23 that function as "eyes" is provided in the eye part 21 of the head front 3a, and a speaker (not shown) that functions as a voice output device is arranged inside the mouth part 22.

The conversation robot 1 also picks up the user's voice with a microphone (not shown in FIG. 1) and, according to what the user has said, can utter a suitable reply or question from the speaker and move the head 3, upper body part 8, arms 4a and 4b, and wheels 5, so that it acts autonomously toward the user in a way suited to the situation. In addition, the conversation robot 1 according to the present invention rotates the head 3 and the upper body part 8 in rotation direction Ca according to the number of users in the conversation, changing the directions of the head front 3a and body front 8b as appropriate, so that a natural group conversation is possible even when the number of users increases. The operation of the conversation robot 1 during a group conversation with a plurality of users is described below.

FIG. 2(A) shows, as seen from above, a one-to-one conversation between the conversation robot 1 and user A. FIG. 2(B) shows, also as seen from above, a group conversation among the conversation robot 1, user A, and user B after user B has newly joined. In FIGS. 2(A) and 2(B), the triangle marks indicate the face fronts A1 and B1 of users A and B and the head front 3a of the conversation robot 1, and the solid arrows indicate the directions in which the face fronts A1 and B1 and the head front 3a are pointed. The dotted arrows indicate the directions in which the body fronts A2 and B2 of users A and B and the body front 8b of the conversation robot 1 are pointed.

In this embodiment, as one example of how the voices of users A and B are picked up, a dedicated microphone is provided for each user. The microphones, each attached to a predetermined part of user A or user B, are connected to the conversation robot 1 by wiring. The conversation robot 1 thereby obtains the voices of users A and B picked up by the microphones as voice signals, and by analyzing these signals it can recognize whether and what users A and B are speaking.

As shown in FIG. 2(A), when the conversation robot 1 converses with user A, it rotates the head 3 so that the head front 3a faces user A and also rotates the upper body part 8 so that the body front 8b faces user A, behaving as if it were gazing at user A. By directing both the head front 3a and the body front 8b toward user A in this way, the conversation robot 1 can give user A the impression that it is clearly signaling its intention to converse with user A.

In addition, if user B approaches during the conversation with user A, for example, the conversation robot 1 detects the orientations of user B's face front B1 and body front B2 from the moving image captured by the CCD cameras 23. If it recognizes from this detection result that part of user B's face front B1 and part of user B's body front B2 (described later) are facing the conversation robot 1, it can judge that user B has newly joined and that a group conversation is taking place.

As shown in FIG. 2(B), when the conversation robot 1 recognizes a group conversation that includes users A and B, it identifies the positions of all participants (users A and B in FIG. 2(B)) from the moving image captured by the CCD cameras 23 and calculates a center-of-gravity direction line CG1 (described later) from their positional relationship. It then rotates only the upper body part 8, keeping the head front 3a directed at user A, with whom it is currently conversing, so that the body front 8b faces the center-of-gravity direction indicated by the line CG1. By keeping the head front 3a on user A, the conversation robot 1 gives users A and B the impression that it is continuing its conversation with user A. By directing the body front 8b toward the center-of-gravity direction obtained from the positions of users A and B, it also gives both users the impression that it is attending not only to user A but to user B as well, which makes a natural conversation with users A and B possible.
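The split between head direction and body direction can be sketched as follows. This is a minimal illustration, assuming user positions are given as (x, y) coordinates in a robot-centered frame, that the reference line is the robot's x axis, and that the center-of-gravity direction is the mean of the direction angles as in claim 4. The function names are hypothetical.

```python
import math


def direction_angle(x, y):
    """Angle in degrees between the x axis (assumed reference line) and (x, y)."""
    return math.degrees(math.atan2(y, x))


def gaze_targets(main_pos, subordinate_positions):
    """Return (head angle, body angle) in degrees.

    The head faces the main attention object; the upper body faces the mean
    of the direction angles of all participants.
    """
    main_angle = direction_angle(*main_pos)
    angles = [main_angle] + [direction_angle(*p) for p in subordinate_positions]
    body_angle = sum(angles) / len(angles)
    return main_angle, body_angle


if __name__ == "__main__":
    # User A (speaker) front-right of the robot, user B front-left.
    head, body = gaze_targets((1.0, 1.0), [(-1.0, 1.0)])
    print(head, body)
```

In the one-to-one case of FIG. 2(A), passing an empty list of subordinate positions makes both angles equal, so the head front and body front point at the same user.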

(2) Circuit Configuration of the Conversation Robot
Next, the behavior of the conversation robot 1 shown in FIGS. 2(A) and 2(B) is described using the circuit configuration shown in FIG. 3. In this embodiment, the conversation robot 1 incorporates a conversation device 30 as shown in FIG. 3. The CCD cameras 23 provided on the head front 3a, for example, capture the scene in the direction of the head front 3a and send the resulting moving image as moving image data to a face/body orientation detection unit 32 and a face recognition unit 33. From the moving image generated from the moving image data, the face recognition unit 33 identifies a roughly elliptical skin-color region based on a skin-color likelihood determined in advance by a statistical method and extracts it as a face region. After normalizing this skin-color region, the face recognition unit 33 calculates features (face feature values) such as the distance between the eyes and the width of the nose from the normalized image. The face recognition unit 33 stores in advance, as face data, the face feature values of users A and B, such as the distance between the eyes and the width of the nose. By comparing the face feature values detected from the moving image data with the feature values of the registered face data, the face recognition unit 33 can recognize that user A or user B is present in the moving image, and it sends this to a position detection unit 35 as face identification result data.
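The comparison between detected face feature values and registered face data can be illustrated with a nearest-neighbor match. The patent does not state which distance measure or acceptance rule is used, so the Euclidean distance and the `max_distance` threshold below are assumptions, and the feature values are made-up numbers.

```python
import math


def identify_face(features, registered, max_distance=0.1):
    """Return the registered user whose face data is closest to `features`.

    `features` and each registered entry are tuples such as
    (eye distance, nose width) taken from a normalized face region.
    Returns None when no registered user is close enough (assumed rule).
    """
    best_name, best_dist = None, float("inf")
    for name, reference in registered.items():
        dist = math.dist(features, reference)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


if __name__ == "__main__":
    face_data = {"A": (0.30, 0.20), "B": (0.40, 0.25)}  # hypothetical values
    print(identify_face((0.31, 0.20), face_data))
```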

In this embodiment, when the face/body orientation detection unit 32 receives moving image data from the CCD cameras 23, it likewise identifies a roughly elliptical skin-color region in the moving image based on a skin-color likelihood determined in advance by a statistical method, extracts it as a face region, normalizes it, and calculates features (face feature values) such as the distance between the eyes and the width of the nose from the normalized image. The face/body orientation detection unit 32 also identifies the upper body of each of users A and B in the moving image based on human upper-body contour data determined in advance by a statistical method, extracts it as an upper-body region, normalizes it, and calculates features (upper-body contour feature values) such as shoulder width from the normalized image. The face/body orientation detection unit 32 stores in advance, for example, face orientation template information and body orientation template information (both described later). By checking the face feature values against the face orientation template information, it identifies the orientations of the face fronts A1 and B1 from changes such as the distance between the eyes; by checking the upper-body contour feature values against the body orientation template information, it identifies the orientations of the body fronts A2 and B2 from changes such as the shoulder width.

For example, the face orientation template information consists of statistical models of face feature values (hereinafter called statistical face orientation feature values) for a human face that is frontal to the CCD cameras 23 and for a face turned 30, 60, and 90 degrees diagonally to the left or right. By comparing these statistical face orientation feature values with the face feature values of users A and B, the orientations of the face fronts A1 and B1 are identified and obtained as face orientation identification data. Similarly, the body orientation template information consists of statistical models of upper-body contour feature values (hereinafter called statistical body orientation feature values) for a human upper body that is frontal to the CCD cameras and for an upper body turned 30, 60, and 90 degrees diagonally to the left or right. By comparing these statistical body orientation feature values with the upper-body contour feature values of users A and B, the orientations of the body fronts A2 and B2 are identified and obtained as body orientation identification data. The face/body orientation detection unit 32 then sends the face orientation identification data and body orientation identification data of users A and B, detected from the moving image data, to a role identification unit 36.
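Matching a measured feature against orientation templates can be sketched as picking the template with the smallest difference. For simplicity this sketch assumes a single scalar feature (for example a normalized eye distance or shoulder width that shrinks as the person turns away) and does not distinguish left from right. The template values are invented for illustration and are not taken from the patent.

```python
def classify_orientation(feature, templates):
    """Return the template angle (degrees) whose model value is closest.

    `templates` maps an orientation angle to the statistical model value of
    the feature at that angle.
    """
    return min(templates, key=lambda angle: abs(templates[angle] - feature))


if __name__ == "__main__":
    # Hypothetical normalized eye-distance models for 0, 30, 60, 90 degrees.
    face_templates = {0: 1.00, 30: 0.87, 60: 0.50, 90: 0.00}
    print(classify_orientation(0.55, face_templates))
```

A real detector would compare feature vectors rather than one value, but the selection step has the same shape.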

As a method of identifying the orientations of the face fronts A1 and B1 and body fronts A2 and B2 of users A and B, the technique described in "Upper body contour extraction and tracking method considering the tendency of variation in the appearance and shape of the face and body" (俵直弘, 藤江真也, 小林哲則, Meeting on Image Recognition and Understanding (MIRU2010), July 2010) may be applied, for example. Various other methods may also be applied as long as they can identify the orientations of the face fronts A1 and B1 and body fronts A2 and B2 of users A and B.

Meanwhile, when the microphones 37a and 37b pick up the voices of users A and B, they send them as voice signals to a voice processing unit 39. The voice processing unit 39 stores in advance the voice features (voice feature values) of users A and B as voice identification data. It extracts feature values from each voice signal received from the microphones 37a and 37b and compares them with the voice identification data, so that it can recognize which of the microphones 37a and 37b is being used by user A and which by user B.

The voice processing unit 39 then generates voice identification result data indicating, for example, that one microphone 37a is picking up user A's voice and the other microphone 37b is picking up user B's voice, and sends this data to the position detection unit 35. The voice processing unit 39 also performs frame-synchronous continuous speech recognition with a vocabulary of about 700 words using a bigram language model and HMMs (Hidden Markov Models), converts the voice signal into a sequence of words, and sends this sequence to an action selection unit 40 as word string data.

The position detection unit 35 associates the face identification result data received from the face recognition unit 33 with the voice identification result data received from the voice processing unit 39, so that it can recognize where users A and B are located in the moving image and which of them is speaking.

In practice, the position detection unit 35 associates user A, as recognized in the moving image, with the voice signal obtained from one microphone 37a, and associates user B, as recognized in the moving image, with the voice signal obtained from the other microphone 37b. It thereby recognizes which of users A and B in the moving image is speaking and sends this to the role identification unit 36 as position detection result data.
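The association performed by the position detection unit 35 can be pictured as a join between the face identification result (who is where) and the voice identification result (which microphone belongs to whom). The sketch below is illustrative only: the dictionary layout, the use of a signal level, and the `threshold` for deciding that someone is speaking are all assumptions, since the patent does not specify how speech activity is judged.

```python
def detect_speakers(face_positions, mic_owner, mic_levels, threshold=0.1):
    """Combine face and voice identification results per user.

    face_positions: user -> (x, y) position found in the moving image
    mic_owner:      microphone id -> user (voice identification result)
    mic_levels:     microphone id -> current signal level (assumed measure)
    """
    result = {}
    for mic, level in mic_levels.items():
        user = mic_owner.get(mic)
        if user in face_positions:  # only users also seen in the image
            result[user] = {
                "position": face_positions[user],
                "speaking": level >= threshold,
            }
    return result


if __name__ == "__main__":
    positions = {"A": (1.0, 1.0), "B": (-1.0, 1.0)}
    owners = {"37a": "A", "37b": "B"}
    print(detect_speakers(positions, owners, {"37a": 0.6, "37b": 0.02}))
```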

The role identification unit 36 can determine, from the face orientation identification data and body orientation identification data received from the face/body orientation detection unit 32, whether users A and B are participating in the conversation. In practice, the role identification unit 36 determines, for example from user A's face orientation identification data and body orientation identification data obtained from the moving image, whether part of user A's face front A1 and part of user A's body front A2 both face the conversation robot 1. If they do (for example, when user A's face is frontal to the conversation robot 1, or turned 30 or 60 degrees diagonally to the left or right), the role identification unit 36 judges that user A is directing the face front A1 and body front A2 toward the conversation robot 1 or the other user B in order to hold a group conversation with them, and recognizes user A as a participant in the group conversation.

The role identification unit 36 makes the same determination for user B: from user B's face orientation identification data and body orientation identification data obtained from the moving image, it determines whether part of user B's face front B1 and part of user B's body front B2 face the conversation robot 1. If they do, the role identification unit 36 judges that user B is directing the face front B1 and body front B2 toward the conversation robot 1 or the other user A in order to hold a group conversation with them, and recognizes user B as a participant in the group conversation.
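The participation test can be sketched as a simple check on the classified orientations. The description gives frontal, 30 degrees, and 60 degrees as examples of facing the robot, so this sketch assumes a cutoff of 60 degrees; the cutoff, the signed-angle convention (negative for left, positive for right), and the function name are illustrative assumptions.

```python
def is_participant(face_angle_deg, body_angle_deg, max_angle_deg=60.0):
    """True when both the face front and the body front partly face the robot.

    Angles are the classified orientations relative to the camera,
    0 meaning frontal.
    """
    return (abs(face_angle_deg) <= max_angle_deg
            and abs(body_angle_deg) <= max_angle_deg)


if __name__ == "__main__":
    print(is_participant(30, 60))  # face 30 degrees, body 60 degrees
    print(is_participant(90, 0))   # face turned fully sideways
```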

Next, the role identification unit 36 determines, based on the position detection result data received from the position detection unit 35, whether either user A or user B, both judged to be participants in the group conversation, is the speaker. For example, when the role identification unit 36 acquires an audio signal from the microphone 37a of user A, who has been judged to be a participant based on the position detection result data, it treats user A as the speaker (main attention object) and the other user B as a listener (subordinate attention object). It then generates main/subordinate attention position detection data, in which these roles of user A and user B are associated with the position detection result data, and sends this data to both the head drive control unit 41 and the body drive control unit 42.

At this time, if the role identification unit 36 determines, for example from the face orientation identification data of user A, the current speaker, that the face front A1 of user A is directed toward user B, it estimates that user A is currently speaking to user B and that user B is likely to speak in reply once user A has finished. It therefore designates user B as the estimated next speaker (the next attention target estimate, which becomes the main attention object after user A finishes speaking), associates this with the main/subordinate attention position detection data, and sends it to the head drive control unit 41.
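The role assignment and next-speaker estimation described above might be sketched like this. The function name, the string labels, and the `speaker_faces` argument (the participant the speaker's face front points at) are hypothetical; the specification does not define a data format for the main/subordinate attention position detection data.

```python
from typing import Dict, Optional, Sequence, Tuple


def assign_roles(
    participants: Sequence[str],
    speaker: str,
    speaker_faces: Optional[str] = None,
) -> Tuple[Dict[str, str], Optional[str]]:
    """Label the speaker as the main attention object and everyone else as a
    subordinate attention object; also return the estimated next speaker."""
    if speaker not in participants:
        raise ValueError("speaker must be a recognized participant")
    roles = {p: ("main" if p == speaker else "subordinate") for p in participants}
    # The participant the speaker is facing is assumed likely to reply next.
    faces_listener = speaker_faces in participants and speaker_faces != speaker
    next_speaker = speaker_faces if faces_listener else None
    return roles, next_speaker
```

For example, `assign_roles(["A", "B"], "A", speaker_faces="B")` labels A as main and B as subordinate, and reports B as the estimated next speaker.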

Furthermore, when the role identification unit 36 receives from the action selection unit 40 (described later) the utterance content that unit has selected, and thereby recognizes that the conversation robot 1 itself is currently speaking, it determines from the face orientation identification data and body orientation identification data supplied by the face/body orientation detection unit 32 whether there is a user A or user B whose face front A1, B1 and body front A2, B2 are directed toward the conversation robot 1. The role identification unit 36 then treats the user A or user B who is facing the conversation robot 1 as the main listener (main attention object), associates this with the main/subordinate attention position detection data, and sends it to both the head drive control unit 41 and the body drive control unit 42. In this embodiment, when both user A and user B have their face fronts A1, B1 and body fronts A2, B2 directed toward the conversation robot 1, one of them is selected at random as the main listener, and this selection is associated with the main/subordinate attention position detection data and sent to the head drive control unit 41.
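A minimal sketch of the main-listener choice while the robot itself is speaking is shown below. The function name and the optional `rng` parameter are assumptions added so that the random choice can be made reproducible in tests; the text only states that one user is chosen at random when several face the robot.

```python
import random
from typing import Optional, Sequence


def choose_main_listener(
    facing_robot: Sequence[str],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Pick the main listener among the users currently facing the robot.

    Returns None when nobody faces the robot, the single user when only one
    does, and a randomly chosen user when several do.
    """
    if not facing_robot:
        return None
    chooser = rng if rng is not None else random
    return chooser.choice(list(facing_robot))
```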

On receiving the main/subordinate attention position detection data from the role identification unit 36, the head drive control unit 41 identifies the direction of the user A or user B treated as the main attention object, such as the speaker or main listener (hereinafter, the main attention target direction), and then calculates the angular difference between the head front 3a and the main attention target direction. In practice, as shown in FIG. 4(A), the center line d of the imaging frame 51 of the CCD camera 23 (hereinafter, the image frame center line) is preset as the head front 3a. Taking the lower end of the image frame center line as the angle center point O, the head drive control unit 41 calculates the main attention target direction line d1, which extends from the angle center point O toward the torso center of the main attention object, for example user A. The head drive control unit 41 then calculates the angle θ1 from the image frame center line d to the main attention target direction line d1 about the angle center point O (hereinafter, the main attention target angle), and sends this main attention target angle θ1 to the head actuator 43 as a head rotation command.

In FIG. 4(A), the image frame center line d and the main attention target direction line d1 already coincide, so the main attention target angle θ1 is 0 degrees. If the image frame center line d and the main attention target direction line d1 do not coincide, the head actuator 43 rotates the head 3 by the main attention target angle θ1 in accordance with the head rotation command. This aligns the image frame center line d with the main attention target direction line d1 and directs the head front 3a toward the main attention target direction (FIGS. 2(A) and 2(B)).
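The angle calculation described above can be illustrated with a short sketch. It assumes pixel coordinates with the origin at the top-left of the imaging frame and y increasing downward, places the angle center point O at the lower end of the vertical center line, and treats angles to the right of the center line as positive. These conventions, and the function name, are assumptions for illustration; how the angle measured in the frame maps to a physical rotation is not specified here.

```python
import math


def attention_angle_deg(
    target_x: float,
    target_y: float,
    frame_width: float,
    frame_height: float,
) -> float:
    """Angle from the image frame center line d to the direction line that
    runs from the angle center point O to the target's torso center."""
    # O is the lower end of the vertical center line of the frame.
    origin_x = frame_width / 2.0
    origin_y = float(frame_height)
    dx = target_x - origin_x   # offset to the right of the center line
    dy = origin_y - target_y   # distance upward from O
    # atan2(dx, dy) is zero when the target lies on the center line.
    return math.degrees(math.atan2(dx, dy))
```

A target on the center line yields 0 degrees, which corresponds to the situation in FIG. 4(A); a target offset to the right by as much as it is above O yields 45 degrees under these assumptions.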

When the conversation robot 1 is conversing with user A only, the body drive control unit 42 calculates the upper body rotation angle needed to rotate the upper body 8 to the main attention target angle θ1, and sends this upper body rotation angle to the body actuator 44 as a body rotation command. The body actuator 44 rotates the upper body 8 by the upper body rotation angle in accordance with the body rotation command, so that the body front 8b also faces the main attention target direction that the head front 3a is facing (FIG. 2(A)).

In contrast, when the conversation robot 1 is conversing with user B as well as user A, the body drive control unit 42, on receiving the main/subordinate attention position detection data from the role identification unit 36, calculates the main attention target direction line d1 extending from the angle center point O toward the torso center of user A, the main attention object, as shown in FIG. 4(A). It then calculates the main attention target angle θ1 from the image frame center line d to the main attention target direction line d1 about the angle center point O.

At the same time, the body drive control unit 42 calculates the subordinate attention target direction line d2 extending from the angle center point O toward the torso center of user B, the subordinate attention object, and calculates the angle θ2 from the image frame center line d to the subordinate attention target direction line d2 about the angle center point O (hereinafter, the subordinate attention target angle). The body drive control unit 42 then adds the main attention target angle θ1 and the subordinate attention target angle θ2 and divides the sum by 2, the total number of recognized users (user A and user B). This gives the angle θ_CG1 from the image frame center line d to the center-of-gravity direction line CG1 about the angle center point O (hereinafter, the center-of-gravity direction angle), which is sent to the body actuator 44 as a body rotation command.

The body actuator 44 then rotates the upper body 8 by the center-of-gravity direction angle θ_CG1 in accordance with the body rotation command, directing the body front 8b toward the center-of-gravity direction line CG1 (FIGS. 2(A) and 2(B)). By rotating only the upper body 8 and directing the body front 8b toward the center-of-gravity direction in this way, the conversation robot 1 can signal to user A and user B that the conversation robot 1, user A, and user B are all taking part in the group conversation together.

The embodiment above describes the operation of the conversation robot 1 when two people, user A and user B, participate in the group conversation. The present invention is not limited to this, and the conversation robot 1 can operate in the same way when three, four, or more people participate. As an example, the operation of the conversation robot 1 when user C joins user A and user B, making three participants in the group conversation as shown in FIG. 4(B), is described below.

In this case, the role identification unit 36 handles user C in the same way: from the face orientation identification data and body orientation identification data obtained from the moving image of user C, it determines whether part of the face front C1 and part of the body front C2 of user C face the conversation robot 1. If they do, the role identification unit 36 judges that user C is directing the face front C1 and body front C2 toward the conversation robot 1 in order to hold a group conversation with the conversation robot 1 and the other users A and B, and recognizes user C as a participant in the group conversation.

On receiving the main/subordinate attention position detection data from the role identification unit 36, the body drive control unit 42 calculates, as described above, the main attention target angle θ1 from the image frame center line d to the main attention target direction line d1 and the subordinate attention target angle θ2 from the image frame center line d to the subordinate attention target direction line d2, both about the angle center point O. Here, as shown in FIG. 4(B), user C has been newly recognized in addition to user A and user B. The body drive control unit 42 therefore also calculates the subordinate attention target direction line d3 extending from the angle center point O toward the torso center of user C, a subordinate attention object, and calculates the subordinate attention target angle θ3 from the image frame center line d to the subordinate attention target direction line d3 about the angle center point O.

The body drive control unit 42 then adds the main attention target angle θ1, the subordinate attention target angle θ2, and the subordinate attention target angle θ3, and divides the sum by 3, the total number of recognized users (user A, user B, and user C). This gives the center-of-gravity direction angle θ_CG2 from the image frame center line d to the center-of-gravity direction line CG1, which is sent to the body actuator 44 as a body rotation command. In other words, the body drive control unit 42 calculates the center-of-gravity direction angle θ_CG from the image frame center line d about the angle center point O using the formula θ_CG = (θ1 + θ2 + θ3 + … + θn) / n, where θ1 is the main attention target angle, θ2 to θn are the subordinate attention target angles, and n is the total number of recognized users.
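The averaging formula above can be written as a short sketch. The function name and the sample angles are hypothetical; angles are assumed to be signed values in degrees measured from the image frame center line d.

```python
from typing import Sequence


def centroid_direction_angle(
    main_angle_deg: float,
    sub_angles_deg: Sequence[float],
) -> float:
    """Center-of-gravity direction angle: the sum of the main attention
    target angle and all subordinate attention target angles, divided by
    the total number of recognized objects."""
    angles = [main_angle_deg, *sub_angles_deg]
    return sum(angles) / len(angles)


# Two users placed symmetrically about the center line: the body faces straight ahead.
two_users = centroid_direction_angle(-30.0, [30.0])
# Three users at -30, 10 and 50 degrees: the body turns to their mean direction.
three_users = centroid_direction_angle(-30.0, [10.0, 50.0])
```

With no subordinate attention objects, the result is simply the main attention target angle, which matches the one-to-one conversation case described earlier.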

As shown in FIG. 3, when the action selection unit 40 receives word string data from the speech processing unit 39, it reads the keywords stored in advance in the database 47 and extracts the keywords contained in the word string data. It then reads from the database 47 the template, among a set of predetermined templates, that corresponds to the extracted keyword string. From the template selected on the basis of the keyword string, the action selection unit 40 can determine what the keyword string means. The database 47 stores in advance an action pattern table that associates each template with an action pattern specifying the utterance content to be spoken by the conversation robot 1 and the motion content, such as moving the arms 4a and 4b.

The action selection unit 40 thus uses the template to select, from the action pattern table, the action pattern corresponding to the keyword string extracted from the word string data. It reads the utterance content and motion content associated with the selected action pattern from the database 47, sends the utterance content to the speech synthesis unit 48, and sends the motion content to the drive units for the arms 4a and 4b and other parts. The speech synthesis unit 48 converts the utterance content supplied by the action selection unit 40 into an audio signal and sends the audio signal to the speaker 49, which outputs the corresponding sound. Based on the motion content supplied by the action selection unit 40, the drive units for the arms 4a and 4b and other parts perform autonomous motions suited to the situation, such as raising the hand 15 in response to an utterance by user A or user B.
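The keyword, template, and action pattern lookup described above could look roughly like the following. All table contents, names, and the choice to key templates on the ordered sequence of extracted keywords are invented for illustration; the specification does not disclose the contents or schema of the database 47.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical stand-ins for the keyword list, templates and action
# pattern table held in the database.
KEYWORDS = {"hello", "weather", "today"}
TEMPLATES: Dict[Tuple[str, ...], str] = {
    ("hello",): "greeting",
    ("weather", "today"): "ask_weather",
}
ACTION_PATTERNS: Dict[str, Tuple[str, str]] = {
    # template name -> (utterance content, motion content)
    "greeting": ("Hello, nice to meet you.", "raise_hand"),
    "ask_weather": ("Let me think about the weather.", "none"),
}


def select_action(words: Sequence[str]) -> Optional[Tuple[str, str]]:
    """Extract known keywords from the word string, find the matching
    template, and return its (utterance, motion) action pattern."""
    keyword_string = tuple(w for w in words if w in KEYWORDS)
    template = TEMPLATES.get(keyword_string)
    if template is None:
        return None  # no template matches this keyword string
    return ACTION_PATTERNS[template]
```

In this sketch the utterance would go to the speech synthesis stage and the motion label to the arm drive units; a real system would need a more tolerant matching scheme than exact keyword sequences.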

While the conversation robot 1 is holding a group conversation with user A and user B as shown in FIG. 2(B), user B may, for example, turn away so that neither part of the face front B1 nor part of the body front B2 faces the conversation robot 1, as shown in FIG. 5(A). In that case the conversation robot 1 judges that user B is no longer a participant in the group conversation, directs the body front 8b toward user A as shown in FIG. 5(B), and converses with user A only.

In practice, when the role identification unit 36 judges from the face orientation identification data and body orientation identification data obtained from the moving image of user B that part of the face front B1 and part of the body front B2 of user B are not facing the conversation robot 1 (for example, when the face of user B is turned 90 degrees or more diagonally to the left or right relative to the conversation robot 1), it generates main/subordinate attention position detection data indicating that user B is not a participant in the group conversation, and sends this data to the body drive control unit 42.

The body drive control unit 42 calculates the upper body rotation angle needed to rotate the upper body 8 to the main attention target angle θ1, and sends this upper body rotation angle to the body actuator 44 as a body rotation command. The body actuator 44 rotates the upper body 8 by the upper body rotation angle in accordance with the body rotation command, so that the body front 8b also faces the main attention target direction that the head front 3a is facing (FIG. 5(B)). As a result, even when user B leaves the group conversation, the conversation robot 1 can turn the body front 8b toward user A to signal its intention to continue the conversation with user A, and can thus hold a natural conversation suited to the situation.
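The transition described above, from facing the center-of-gravity direction to facing the remaining user, can be sketched by recomputing the body target over the current participants only. The dictionary layout and names are assumptions for illustration.

```python
from typing import Dict, Sequence


def body_target_angle(
    angles_deg: Dict[str, float],
    participants: Sequence[str],
) -> float:
    """Body rotation target: the mean attention target angle over the users
    currently judged to be participants. With a single participant this
    reduces to that user's own angle."""
    active = [angles_deg[u] for u in participants]
    if not active:
        raise ValueError("at least one participant is required")
    return sum(active) / len(active)


angles = {"A": -20.0, "B": 40.0}
during_group_talk = body_target_angle(angles, ["A", "B"])  # both take part
after_b_leaves = body_target_angle(angles, ["A"])          # B has turned away
```

Under these sample values the body first faces 10 degrees (between A and B) and then turns to -20 degrees (directly toward A) once B is no longer counted.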

(3) Operation and Effect
With the above configuration, the conversation robot 1 recognizes which of user A and user B is the speaker based on the voices and moving images of user A and user B, and changes its action pattern, such as the utterance content output from the speaker 49 and the motion of the arms 4a and 4b, according to the utterance content of user A or user B. It can thereby behave autonomously in response to the conversation with user A and user B.

The conversation robot 1 also recognizes, from among user A and user B, the speaker, the main listener, or the estimated next speaker after an utterance ends as the main attention object, and rotates the head 3 so that the head front 3a faces the main attention object. This gives the impression that the conversation robot 1 is gazing at user A or user B, whichever is the main attention object, in order to converse with that user in response to the user's behavior, and so allows a natural conversation with user A or user B suited to the situation.

In addition, the conversation robot 1 calculates the main attention target angle θ1 and the subordinate attention target angle θ2, adds them, divides the sum by the total number of recognized users (user A and user B) to obtain the center-of-gravity direction angle θ_CG1, and rotates the body front 8b to this center-of-gravity direction angle θ_CG1. By rotating the body front 8b to the center-of-gravity direction angle θ_CG1 and thus directing it toward the center-of-gravity direction, the conversation robot 1 gives the impression that it is paying attention not only to user A but also to user B, and so can hold a natural conversation with user A or user B suited to the situation.

(4) Other Embodiments
The present invention is not limited to this embodiment, and various modifications are possible within the scope of the gist of the invention. For example, the invention may be applied to a conversation robot resembling an animal such as a dog. The embodiment above describes the case where the objects are users, with user A as the main attention object and user B and user C as subordinate attention objects. The present invention is not limited to this: a posted board on which characters can be written with a writing instrument, a display device showing various kinds of information, a posted printed item, or the like may serve as the main attention object or a subordinate attention object (object).

In this case, when the conversation robot 1 recognizes a board as the main attention object and user B as a subordinate attention object, pointing unit control means (not shown) that controls the movement of the arms 4a and 4b may control the arms 4a and 4b, which serve as pointing units, so that they point toward the board, the main attention object.

Here, the body drive control unit 42 of the conversation robot 1 adds the main attention target angle θ1, obtained from the position of the board as the main attention object, and the subordinate attention target angle θ2, obtained from the position of user B as the subordinate attention object, and divides the sum by 2, the total number of recognized objects (the board and user B). This gives the center-of-gravity direction angle θ_CG from the image frame center line d to the center-of-gravity direction line CG about the angle center point O, which is sent to the body actuator 44 as a body rotation command.

The body actuator 44 of the conversation robot 1 then rotates the upper body 8 by the center-of-gravity direction angle θ_CG in accordance with the body rotation command, directing the body front 8b toward the center-of-gravity direction line CG1. By rotating only the upper body 8 and directing the body front 8b toward the center-of-gravity direction, the conversation robot 1 gives the impression that it is pointing at the board with the arms 4a and 4b while turning part of the upper body 8 toward user B, that is, conversing with user B while attending to the board.

The embodiment above describes the case where the center-of-gravity direction is obtained from the positions of user A and user B within the imaging frame 51, and only the upper body 8 is rotated, without moving the base 6, to direct the body front 8b toward the center-of-gravity direction. The present invention is not limited to this. For example, when the rotation angle needed to direct the body front 8b toward the center-of-gravity direction, obtained from the positions of user A and user B within the imaging frame 51, is equal to or greater than a predetermined rotation angle, movement control means may turn the wheels 5 of the base 6 so that the rotation of the upper body 8 stays within a predetermined rotation angle range. The body 2 itself may be rotated toward the rotation angle direction, or translated toward that side, so that the body 2 is moved automatically in the rotation angle direction.
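One way to read the variation above is to cap the upper body rotation at the predetermined angle and let the wheeled base supply the remainder. The sketch below shows that policy only as an assumption: the text also allows translating the body instead of rotating it, and it does not give a value for the limit.

```python
from typing import Tuple


def split_rotation(
    target_deg: float,
    upper_body_limit_deg: float,
) -> Tuple[float, float]:
    """Split the requested rotation into (upper_body_deg, base_deg) so that
    the upper body never rotates beyond the assumed limit."""
    if upper_body_limit_deg < 0:
        raise ValueError("limit must be non-negative")
    # Clamp the upper body share to the range [-limit, +limit].
    upper = max(-upper_body_limit_deg, min(upper_body_limit_deg, target_deg))
    # Whatever is left over is handled by turning the base on its wheels.
    base = target_deg - upper
    return upper, base
```

For a hypothetical limit of 45 degrees, a 30 degree request is handled by the upper body alone, while a 70 degree request becomes 45 degrees of upper body rotation plus 25 degrees of base rotation.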

The embodiment above also describes the case where the center-of-gravity direction is obtained from the positions of all users (user A and user B) within the imaging frame 51 and the upper body 8 is rotated toward that direction. The present invention is not limited to this. When directing the body front 8b toward the center-of-gravity direction obtained from the positions of all users within the imaging frame 51 would require rotating the upper body 8 by a predetermined rotation angle or more, the center-of-gravity direction may instead be obtained from a subset of the users, for example by excluding any user A or user B located at or beyond the predetermined rotation angle, so that the rotation stays within the predetermined rotation angle.
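The exclusion rule above might be sketched as follows. The function name, the handling of the case where every user lies beyond the limit, and the sample limit are assumptions; the text does not say, for instance, whether the main attention object may be excluded.

```python
from typing import Optional, Sequence


def centroid_with_exclusion(
    angles_deg: Sequence[float],
    limit_deg: float,
) -> Optional[float]:
    """Mean direction over all users if it is inside the rotation limit;
    otherwise the mean over only those users located inside the limit.
    Returns None when no user is inside the limit."""
    if not angles_deg:
        raise ValueError("at least one angle is required")
    mean_all = sum(angles_deg) / len(angles_deg)
    if abs(mean_all) < limit_deg:
        return mean_all
    # Drop users located at or beyond the predetermined rotation angle.
    kept = [a for a in angles_deg if abs(a) < limit_deg]
    if not kept:
        return None
    return sum(kept) / len(kept)
```

With a hypothetical 45 degree limit, users at 30, 40, and 110 degrees give an overall mean of 60 degrees, so the user at 110 degrees is dropped and the body target becomes 35 degrees.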

The embodiment above further describes the case where the CCD camera 23, serving as imaging means, is provided in the eye 21 of the conversation robot 1 as an external sensor, and the center-of-gravity direction is obtained from the positions of user A and user B within the imaging frame 51 captured by the CCD camera 23. The present invention is not limited to this. For example, a camera serving as imaging means may be installed on the ceiling of the room as an external sensor to capture the positional relationship of user A, user B, and the conversation robot 1 from above; the center-of-gravity direction may then be obtained from the positions of user A and user B in the captured image, and the body front 8b of the conversation robot 1 directed toward that center-of-gravity direction.

As a further variation of the embodiment above, the sound source direction may be identified using, for example, the technique described in "Sound source localization using two microphones mounted on a robot head" (Tetsunori Kobayashi, Daisuke Miyata, and Yosuke Matsusaka, Proceedings of the Spring Meeting of the Acoustical Society of Japan, pp. 469-470, 1999). In this case, two microphones are provided on the head 3, sound source localization is performed on the signals received by the microphones to obtain the approximate direction of arrival, and the moving image obtained by the CCD camera 23 is then processed to find the user in that direction, who is identified as the speaker.

1 Conversation robot
2 Body
3 Head
4a, 4b Arm (pointing unit)
5 Wheel (moving means)
8 Upper body
23 CCD camera (external sensor)
37a, 37b Microphone (external sensor)
35 Position detection unit (position detection means)
36 Role identification unit (role identification means)
41 Head drive control unit (head control means)
42 Body drive control unit (body control means)

Claims

A conversation robot comprising an upper body part rotatably provided on a torso part and a head rotatably provided on the upper body part, the robot autonomously conducting a group conversation with a plurality of objects based on an output result acquired from an external sensor, the conversation robot comprising:
position detecting means for detecting positions of the plurality of objects based on the output result acquired from the external sensor;
role identification means for recognizing, based on the output result acquired from the external sensor, any one of the plurality of objects as a main attention object and the remaining objects as sub attention objects;
torso control means for controlling a rotation angle of the upper body part so that the front of the upper body part faces a center-of-gravity direction indicated by a center-of-gravity direction line obtained from the positions of the main attention object and the sub attention objects; and
head control means for controlling a rotation angle of the head so that the front of the head faces the direction of the main attention object,
wherein, based on data from the role identification means, the torso control means and the head control means are capable of rotating only the head so that the front of the head faces the main attention object while the front of the upper body part remains directed in the center-of-gravity direction indicated by the center-of-gravity direction line, and are capable of controlling the upper body part so that its front faces the center-of-gravity direction indicated by the center-of-gravity direction line while the front of the head remains directed toward the main attention object.
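
The division of labor between the two control means can be illustrated with a minimal sketch. It assumes planar coordinates, takes the center-of-gravity direction as the mean of the bearings to all participants, and assumes everyone stands in front of the robot so that angles do not wrap around. The names `bearing_deg` and `pose_targets` are hypothetical and do not appear in the patent.

```python
import math


def bearing_deg(robot_xy, target_xy):
    """Direction from the robot to a target in degrees (0 = +x axis, CCW positive)."""
    dx = target_xy[0] - robot_xy[0]
    dy = target_xy[1] - robot_xy[1]
    return math.degrees(math.atan2(dy, dx))


def pose_targets(robot_xy, main_xy, sub_xys):
    """Return (upper_body_angle, head_angle_relative_to_upper_body) in degrees.

    The upper body is aimed at the group as a whole, the head at the main
    attention object only.
    """
    main_angle = bearing_deg(robot_xy, main_xy)
    angles = [main_angle] + [bearing_deg(robot_xy, p) for p in sub_xys]
    # Assumption: center-of-gravity direction = mean of all bearings.
    upper_body_angle = sum(angles) / len(angles)
    # The head joint only has to cover the offset from the upper body.
    head_relative = main_angle - upper_body_angle
    return upper_body_angle, head_relative
```

For three participants at (1, 1), (1, 0), and (1, -1) around a robot at the origin, the upper-body target stays at 0 degrees regardless of which of them is the main attention object. Only the head offset changes (for example from +45 to -45 degrees), which corresponds to the case of rotating only the head.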

The conversation robot according to claim 1, wherein
the plurality of objects are users, and
the role identification means recognizes, based on an image and/or sound acquired from the external sensor, the user who is speaking or who is the main listener among the plurality of users as the main attention object, and recognizes the remaining users as the sub attention objects.

The conversation robot according to claim 2, wherein the role identification means
identifies an estimated next attention object that the user recognized as the main attention object is directing attention to, and sets the estimated next attention object as the new main attention object after that user finishes speaking.

The conversation robot according to claim 1, wherein a center-of-gravity direction angle formed by the center-of-gravity direction line and a predetermined reference line is calculated by
obtaining a main attention object angle formed by the reference line and a main attention object direction line extending toward the main attention object, and sub attention object angles each formed by the reference line and a sub attention object direction line extending toward a sub attention object, and
dividing the sum of the main attention object angle and all the sub attention object angles by the total number of the plurality of objects.
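
Written as a formula, the calculation in this claim is an arithmetic mean of angles. The symbols below are introduced here for illustration and do not appear in the original: \theta_g is the center-of-gravity direction angle, \theta_m the main attention object angle, \theta_{s,i} the angle of the i-th sub attention object, and N the total number of objects, all measured from the same reference line.

```latex
\theta_g = \frac{1}{N}\left(\theta_m + \sum_{i=1}^{N-1} \theta_{s,i}\right)
```

For example, with a main attention object at 40 degrees and two sub attention objects at 10 and -20 degrees, N = 3 and \theta_g = (40 + 10 - 20) / 3 = 10 degrees.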

The conversation robot, wherein the torso control means,
when the rotation angle of the upper body part required to direct the front of the upper body part in the center-of-gravity direction obtained from the plurality of objects is equal to or greater than a predetermined rotation angle, calculates the center-of-gravity direction by excluding a predetermined object from the plurality of objects so that the rotation angle of the upper body part falls within the predetermined rotation angle range.
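
A minimal sketch of this exclusion step follows. The claim only says that "a predetermined object" is excluded, so the policy used here (drop the sub attention object farthest from the reference line, and never drop the main attention object) is an assumption, as is the function name `limited_centroid_angle`. Angles are in degrees relative to the reference line, and the center-of-gravity direction is taken as the mean of the angles.

```python
def limited_centroid_angle(main_angle, sub_angles, max_rotation):
    """Return (centroid_angle, kept_sub_angles).

    Sub attention objects are excluded one at a time while the required
    upper-body rotation is equal to or greater than max_rotation.
    """
    remaining = list(sub_angles)
    centroid = (main_angle + sum(remaining)) / (1 + len(remaining))
    while abs(centroid) >= max_rotation and remaining:
        # Hypothetical policy: exclude the sub attention object that lies
        # farthest from the reference line.
        remaining.remove(max(remaining, key=abs))
        centroid = (main_angle + sum(remaining)) / (1 + len(remaining))
    return centroid, remaining
```

With a main attention object at 30 degrees, sub attention objects at 50 and 100 degrees, and a 50 degree limit, the mean of all three (60 degrees) is too large, so the object at 100 degrees is excluded and the result becomes 40 degrees. If the main attention object alone already lies beyond the limit, exclusion cannot help and the function returns the out-of-range angle unchanged.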

The conversation robot according to any one of claims 1 to 5, further comprising instruction unit control means for controlling movement of the instruction unit.

The conversation robot according to claim 1, further comprising movement control means for moving the torso part by moving means so that, when the rotation angle of the upper body part required to direct the front of the upper body part in the center-of-gravity direction is equal to or greater than a predetermined rotation angle, the rotation of the upper body part falls within the range of the rotation angle.