JPWO2017145929A1 - Attitude control device, robot, and attitude control method

Info

Publication number: JPWO2017145929A1
Application number: JP2018501632A
Authority: JP
Inventors: 誠悟伊藤; 秀俊篠原
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2016-02-25
Filing date: 2017-02-17
Publication date: 2018-10-25
Also published as: WO2017145929A1; CN108698231A

Abstract

To indicate to the user, at the start of a dialogue with the user, whether or not the robot itself intends to speak. A posture control device (3) controls the posture of a robot (101) capable of dialogue with a user. When the posture of the robot (101) specified at the start of a dialogue with the user is not an utterance intention presentation posture, the device drives a drive system (1) to cause the robot (101) to assume the utterance intention presentation posture.

Description

The present invention relates to a posture control device that controls the posture of a robot capable of dialogue with a user, a robot including the posture control device, and a posture control method.

In recent years, robots that perform motions in accordance with their own utterances have been developed, and such robots are required to perform these motions more naturally. For example, Patent Document 1 discloses a robot apparatus that performs motions matching its utterances naturally by synthesizing speech synchronized with the actual motion. Patent Document 2 discloses a humanoid robot that performs motions matching its utterances naturally by generating robot gestures while the robot outputs speech.

Patent Document 1: Japanese Patent No. 5402648 (registered November 8, 2013)

Patent Document 2: Japanese Translation of PCT International Application Publication No. 2014-504959 (published February 27, 2014)

To make a dialogue between a user and a robot proceed smoothly, the robot needs to inform the user clearly, when the dialogue starts, whether or not the robot itself intends to speak. The robots disclosed in the patent documents above are devised so that their motions while speaking are natural, but they give no particular consideration to indicating to the user, at the start of a dialogue, whether or not the robot itself intends to speak.

The present invention has been made in view of the above problem, and its object is to realize a posture control device and a posture control method that can clearly indicate to the user, at the start of a dialogue with the user, whether or not the robot itself intends to speak.

In order to solve the above problem, a posture control device according to one aspect of the present invention is provided in a robot that is capable of dialogue with a user and can assume various postures by driving a plurality of drive units, and controls the posture of the robot. The posture control device includes a posture specifying unit that specifies the posture of the robot from the drive state of each drive unit, and a drive control unit that performs drive control of each drive unit. When the posture of the robot specified by the posture specifying unit at the start of a dialogue with the user is not an utterance intention presentation posture, which indicates that the robot intends to speak, the drive control unit drives the drive units to cause the robot to assume the utterance intention presentation posture.

A posture control method according to one aspect of the present invention controls the posture of a robot that is capable of dialogue with a user and can assume various postures by driving a plurality of drive units. The method includes a posture specifying step of specifying the posture of the robot at the start of a dialogue with the user, and a drive control step of driving the drive units to cause the robot to assume an utterance intention presentation posture, which indicates that the robot intends to speak, when the posture specified in the posture specifying step is not the utterance intention presentation posture.

According to one aspect of the present invention, the robot can clearly indicate to the user, at the start of a dialogue with the user, whether or not the robot itself intends to speak.

FIG. 1 is a schematic block diagram of a robot according to Embodiment 1 of the present invention.
FIG. 2 is a sequence diagram showing the flow of posture control processing performed by the posture control device included in the robot shown in FIG. 1.
FIG. 3 is a schematic block diagram of a robot according to Embodiment 2 of the present invention.
FIG. 4 is a sequence diagram showing the flow of posture control processing performed by the posture control device included in the robot shown in FIG. 3.
FIG. 5 is a schematic block diagram of a robot according to Embodiment 3 of the present invention.
FIG. 6 is a schematic block diagram of a robot according to a modification of Embodiment 3 of the present invention.

[Embodiment 1]
Hereinafter, an embodiment of the present invention will be described in detail. This embodiment describes a robot that has at least an outer shell resembling a human, an animal, or the like, and a drive system composed of a plurality of drive units that move the outer shell, and that is capable of dialogue with a user.

(Robot overview)
FIG. 1 is a schematic configuration diagram of the robot 101 according to this embodiment. The robot 101 has at least an outer shell (not shown) resembling a human, an animal, or the like. The robot 101 further includes a drive system 1 composed of a plurality of drive units (manipulators) that move the outer shell, a voice system 2 for realizing dialogue with the user, and a posture control device 3 that drives the drive system 1 to make the robot assume various postures.

The voice system 2 includes a microphone 21, an input device 22, a speech recognition device 23, a dialogue device 24, a speech synthesis device 25, a playback device 26, a speaker 27, and a playback status acquisition device 28. The microphone 21 collects the voice uttered by the user and converts it into electronic waveform data, which it sends to the input device 22 in the subsequent stage.

The input device 22 records the electronic waveform data. If, during recording, the waveform data continues to indicate silence for a predetermined time or longer, the input device 22 ends the recording and sends the posture control device 3 a signal indicating that recording has ended, that is, that input has ended. At the same time as it sends this end-of-input signal, the input device 22 sends the recorded waveform data to the speech recognition device 23 in the subsequent stage. The speech recognition device 23 is an automatic speech recognition (ASR) device that converts the electronic waveform data sent from the input device 22 into text data, which it sends to the dialogue device 24 in the subsequent stage.
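The end-of-input rule described for the input device 22 can be sketched as follows. This is a minimal illustration only: the function name `record_until_silence`, the peak-amplitude test for silence, and the frame-count threshold are all assumptions, since the text specifies only that silence lasting a predetermined time ends the recording.

```python
from typing import Iterable, List, Sequence, Tuple


def record_until_silence(
    frames: Iterable[Sequence[int]],
    silence_threshold: int,
    max_silent_frames: int,
) -> Tuple[List[Sequence[int]], bool]:
    """Record frames until silence has lasted max_silent_frames frames.

    Returns the recorded frames and a flag that is True when recording
    ended because of sustained silence (the "end of input" signal).
    The silence test (peak amplitude below a threshold) is an assumption.
    """
    recorded: List[Sequence[int]] = []
    silent_run = 0
    for frame in frames:
        recorded.append(frame)
        peak = max((abs(sample) for sample in frame), default=0)
        # Count consecutive silent frames; any loud frame resets the count.
        silent_run = silent_run + 1 if peak < silence_threshold else 0
        if silent_run >= max_silent_frames:
            return recorded, True
    return recorded, False
```

In a design like this, the returned flag would play the role of the end-of-input signal, and the recorded frames would be handed to the speech recognition stage at the same moment.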

The dialogue device 24 analyzes the text data sent from the speech recognition device 23 to identify the content of the user's utterance (the analysis result), and acquires dialogue data indicating a response that makes a coherent conversation with the identified utterance. The dialogue device 24 then extracts the text data corresponding to the response from the acquired dialogue data and sends it to the speech synthesis device 25 in the subsequent stage.

The speech synthesis device 25 is a text-to-speech (TTS) device that converts the text data sent from the dialogue device 24 into PCM data, which it sends to the playback device 26 in the subsequent stage. The playback device 26 outputs the PCM data sent from the speech synthesis device 25 through the speaker 27 as sound waves. The sound waves output here are sounds that a person can perceive, and they constitute the response to the user's utterance. A conversation is thereby established between the user and the robot 101. At the same time as it outputs the PCM data to the speaker 27, the playback device 26 also outputs the PCM data to the playback status acquisition device 28.

When PCM data is sent from the playback device 26, the playback status acquisition device 28 sends the posture control device 3 a signal indicating that audio output from the speaker 27 has started, that is, a signal indicating that the robot 101 has started playing back speech to the user (the start of its utterance).

The posture control device 3 controls the posture of the robot 101 and includes a drive control device 31, a housing state acquisition device 32, a posture recording device 33, and a behavior pattern recording device 34. The drive control device 31 includes a posture specifying unit 31a that specifies the posture of the robot 101 from the drive state of the drive system (drive units) 1, and a drive control unit 31b that performs drive control of the drive system 1.

The housing state acquisition device 32 acquires information indicating the drive state of the drive system 1, that is, information on what drive state the drive system 1 is in, which is used to specify the posture of the robot 101. Examples of such information include joint angle information obtained from rotary encoders attached to the robot's joints and the on/off state of torque. This information is sent from the housing state acquisition device 32 to the posture specifying unit 31a of the drive control device 31.

The posture recording device 33 records the utterance intention presentation posture to be taken by the robot 101. Specifically, the posture recording device 33 stores information indicating the drive state of the drive system 1 that puts the robot 101 in the utterance intention presentation posture. The utterance intention presentation posture is a posture that shows the user that the robot intends to speak, for example a posture in which the robot puts a hand to its mouth, a posture of standing at attention, or a posture in which the robot faces the user's face.

The behavior pattern recording device 34 records behavior patterns associated with the utterance content of the robot 101. Specifically, the behavior pattern recording device 34 stores, as behavior patterns, information indicating the drive state of the drive system 1 associated with each utterance content. In addition to the information in the posture recording device 33, the behavior patterns may also incorporate information from the housing state acquisition device 32 and various sensors (for example, fall detection or gravitational acceleration), or the internal state of the robot 101 (for example, past behavior patterns associated with speech recognition results). The behavior patterns may also be organized by category of the user's utterance content, or matched to the pitch of the utterance. Because the appropriate utterance intention presentation posture depends on the situation of the robot 101, for example whether the robot 101 is holding an object, there may be multiple types of utterance intention presentation posture rather than only one.

The posture specifying unit 31a acquires the information indicating the drive state of the drive system 1 of the robot 101 and thereby specifies what posture the robot 101 is currently in. Information indicating the specified posture is sent from the posture specifying unit 31a to the drive control unit 31b.
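One way to picture posture specification from the drive state is a tolerance comparison between measured joint angles and recorded postures. The sketch below is hypothetical: `DriveState`, `identify_posture`, the joint names, and the 5-degree tolerance are illustrative assumptions, not details given in the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DriveState:
    """Drive-state snapshot (joint angles from encoders, torque on/off)."""
    joint_angles_deg: Dict[str, float]
    torque_on: bool


def identify_posture(
    state: DriveState,
    recorded_postures: Dict[str, Dict[str, float]],
    tolerance_deg: float = 5.0,  # assumed tolerance, not from the text
) -> Optional[str]:
    """Return the name of the first recorded posture that matches, else None."""
    for name, target in recorded_postures.items():
        # A missing joint reading counts as a mismatch (infinite error).
        if all(
            abs(state.joint_angles_deg.get(joint, float("inf")) - angle)
            <= tolerance_deg
            for joint, angle in target.items()
        ):
            return name
    return None
```

Here `recorded_postures` stands in for the contents of the posture recording device 33, and the return value stands in for the posture information passed on to the drive control unit 31b.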

The drive control unit 31b determines whether or not the posture of the robot 101 specified by the posture specifying unit 31a at the start of a dialogue with the user is the utterance intention presentation posture. Here, the start of the dialogue with the user is the time at which the robot 101 starts playing back speech to the user. In other words, the drive control unit 31b evaluates the posture specified by the posture specifying unit 31a at the moment it receives, from the playback status acquisition device 28, the signal indicating that the robot 101 has started playing back speech to the user.

If the drive control unit 31b determines that the posture of the robot 101 is not the utterance intention presentation posture, it drives the drive system 1 to cause the robot 101 to assume the utterance intention presentation posture. That is, the posture of the robot 101 is specified at the start of a dialogue with the user (posture specifying step), and it is determined whether or not the specified posture is the utterance intention presentation posture, which indicates that the robot 101 intends to speak. If it is not, the drive system 1 is driven to cause the robot 101 to assume the utterance intention presentation posture (drive control step). Because the robot 101 thus moves back to the utterance intention presentation posture whenever it is not in that posture at the start of a dialogue with the user, the user can easily understand that the robot 101 intends to speak.

If, on the other hand, the drive control unit 31b determines that the posture of the robot 101 is already the utterance intention presentation posture, it makes the robot 101 perform a motion, before the robot 101 starts speaking, that tells the user it is about to speak. For example, if the head faces the front in the utterance intention presentation posture, the robot is made to lower its head once and then return it to the front before the utterance starts. The robot 101 speaks after performing this motion. The user can thereby easily understand that the robot 101 intends to speak.
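The two branches described for the drive control unit 31b (move into the utterance intention presentation posture, or perform a brief cue motion when already in it) can be sketched as a small planning function. The function name, the string labels for motions, and the `head_faces_front` flag are hypothetical; a real controller would issue drive commands rather than return labels.

```python
from typing import List


def plan_pre_speech_motion(
    current_posture: str,
    intention_posture: str = "utterance_intention",
    head_faces_front: bool = True,
) -> List[str]:
    """Return the motions to perform before the robot starts speaking."""
    if current_posture != intention_posture:
        # Not in the utterance intention presentation posture: move into it.
        return ["move_to_intention_posture"]
    if head_faces_front:
        # Already in the posture: lower the head once, then face front again.
        return ["lower_head", "return_head_to_front"]
    # Already in the posture and no cue motion defined for this variant.
    return []
```

For example, `plan_pre_speech_motion("arms_raised")` yields a single transition step, while `plan_pre_speech_motion("utterance_intention")` yields the two-step head cue.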

(Posture control processing)
FIG. 2 is a sequence diagram showing the flow of the posture control processing of the robot 101 shown in FIG. 1. The sequence diagram covers process (1), which runs until the robot 101 plays back speech; process (2), which is performed when a behavior of the robot 101 ends while speech is still being played back; and process (3), which is performed when speech playback ends while the robot 101 is still performing a behavior.

Outline of process (1): In the voice system 2 of the robot 101, the user's utterance is basically acquired through the microphone 21 and recorded by the input device 22. The recorded utterance is then recognized by the speech recognition device 23, the dialogue device 24 obtains a dialogue character string from the recognition result, the speech synthesis device 25 synthesizes speech from the dialogue character string, and the playback device 26 sounds the synthesized speech through the speaker 27. The steps from acquiring the user's utterance to sounding the synthesized speech form one continuous series of operations.
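The chain in process (1) is a straightforward pipeline, which can be sketched with each stage passed in as a callable. All names here (`run_dialogue_turn` and its parameters) are illustrative assumptions; the stages stand in for the speech recognition device 23, dialogue device 24, speech synthesis device 25, and playback device 26, and no real ASR or TTS library is implied.

```python
from typing import Callable


def run_dialogue_turn(
    waveform: bytes,
    recognize: Callable[[bytes], str],   # stands in for device 23 (ASR)
    respond: Callable[[str], str],       # stands in for device 24 (dialogue)
    synthesize: Callable[[str], bytes],  # stands in for device 25 (TTS)
    play: Callable[[bytes], None],       # stands in for device 26 + speaker 27
) -> str:
    """Run one user-utterance-to-robot-reply turn and return the reply text."""
    text = recognize(waveform)
    reply = respond(text)
    pcm = synthesize(reply)
    play(pcm)
    return reply
```

Passing the stages in as functions keeps the sketch self-contained and makes it easy to substitute stubs, for example `play=played.append` to capture the PCM data in a list during testing.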

In this embodiment, the start of the dialogue with the user in the voice system 2 is defined as the time at which, after the user's utterance has been acquired through the microphone 21, the playback device 26 starts playing back the response utterance corresponding to the acquired utterance.

In the posture control device 3, at the start of the dialogue, the housing state acquisition device 32 acquires information (drive information) on the housing (the drive system 1 of the robot 101). The drive control device 31 then activates the drive system 1 if necessary, changes the posture to the utterance intention presentation posture in accordance with the information in the posture recording device 33, and selects one of the behavior patterns from the behavior pattern recording device 34 according to the utterance content.

When the drive system 1 starts driving in accordance with the behavior pattern, the robot 101 starts to speak. Specifically, process (1) corresponds to steps (1. voice data input) through (13. voice data sounding) of the sequence shown in FIG. 2. The microphone 21 converts the voice input by the user's utterance into waveform data and outputs it to the input device 22 as sound data (1. voice data input). The input device 22 takes in the sound data and outputs it to the speech recognition device 23 (2. speech recognition start command).

On receiving a speech recognition start command from a control unit (not shown), the speech recognition device 23 converts the input sound data into text data and outputs it to the dialogue device 24 (3. dialogue start command). On receiving a dialogue start command from the control unit (not shown), the dialogue device 24 analyzes the content of the user's utterance from the input text data and acquires, from a database (not shown), the text data of a dialogue sentence corresponding to the utterance content. It then outputs the acquired text data to the speech synthesis device 25 (4. dialogue wording synthesis command).

On receiving a dialogue wording synthesis command from the control unit (not shown), the speech synthesis device 25 converts the input text data into output sound wave data (PCM data) and outputs it to the playback device 26 (5. voice data playback command). On receiving a voice data playback command from the control unit (not shown), the playback device 26 outputs utterance start state change information to the playback status acquisition device 28 when it plays back the output sound wave data (6. utterance start state change). The utterance start state change information indicates whether or not an utterance by the robot 101 has started; in this case, it indicates that the utterance by the robot 101 has started.

Based on the input utterance start state change information, the playback status acquisition device 28 notifies the drive control device 31 that the robot 101 has started speaking (7. utterance start status notification). What is sent here is a signal indicating that the robot 101 has started playing back speech to the user.

At the moment it receives, from the playback status acquisition device 28, the signal indicating that the robot 101 has started playing back speech to the user, the drive control device 31 acquires the state of the robot 101 (housing state) from the housing state acquisition device 32 (8. housing information acquisition). The drive control device 31 also acquires the utterance intention presentation posture recorded in the posture recording device 33 (9. utterance intention presentation posture acquisition). The drive control device 31 can then specify the posture of the robot 101 from the acquired housing state by means of the posture specifying unit 31a, and determine whether or not the specified posture is the acquired utterance intention presentation posture. The drive control device 31 drives the drive system 1 according to the result of this determination (10. transition to utterance intention presentation posture).

If the specified posture of the robot 101 is not the utterance intention presentation posture, the drive control device 31 drives the drive system 1 so that the robot assumes the utterance intention presentation posture. If the specified posture is already the utterance intention presentation posture, and the head faces the front in that posture, the drive control device 31 makes the robot lower its head once and then return it to the front.

When the robot 101 starts speaking, the drive control device 31 acquires a behavior pattern corresponding to the utterance content from the behavior pattern recording device 34 (11. behavior pattern acquisition) and starts driving the drive system 1 so that it follows the acquired behavior pattern (12. behavior start command). Once the drive system 1 has started driving, the playback device 26, on receiving a voice data playback command from the control unit (not shown), sounds the input output sound wave data as sound waves through the speaker 27 (13. voice data sounding).

Outline of process (2): If the information acquired from the playback status acquisition device 28 indicates that the utterance is continuing, that is, if the utterance (playback) has not finished, the drive control device 31 again activates one of the behavior patterns in the behavior pattern recording device 34. The behavior pattern may be selected at the time the utterance ends, or it may be selected in advance.

Specifically, process (2) corresponds to steps (14. behavior end) through (18. behavior start command) of the sequence shown in FIG. 2. Whether a behavior pattern of the drive system 1 has ended while the robot 101 is speaking (during speech playback) is determined from the drive state (housing state) of the drive system 1 acquired by the housing state acquisition device 32 (14. behavior end). The housing state acquisition device 32 outputs information indicating that the behavior has ended to the drive control device 31 as a behavior notification (15. behavior notification command).

When notified, on the basis of the housing state acquired from the housing state acquisition device 32, that the behavior has ended, the drive control device 31 acquires the playback status from the playback status acquisition device 28 (16. playback status acquisition). If it determines from the acquired playback status that playback is still in progress, the drive control device 31 again acquires a behavior pattern corresponding to the utterance content from the behavior pattern recording device 34 (17. behavior pattern acquisition) and starts driving the drive system 1 so that it follows the acquired behavior pattern (18. behavior start command).
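Process (2) amounts to an event handler: when a behavior ends, check whether playback is still running and, if so, start another behavior pattern. The sketch below assumes a round-robin choice among the patterns available for the current utterance; the text says only that one of the patterns is activated again, so the selection rule and all names are hypothetical.

```python
from typing import Optional, Sequence


def on_behavior_finished(
    playback_active: bool,
    patterns_for_utterance: Sequence[str],
    last_index: int,
) -> Optional[int]:
    """Return the index of the next behavior pattern to start, or None.

    Round-robin selection is an assumption made for this sketch.
    """
    if not playback_active or not patterns_for_utterance:
        # Speech has finished (or nothing to play): start no new behavior.
        return None
    return (last_index + 1) % len(patterns_for_utterance)
```

With two patterns available, repeated calls during a long utterance would alternate between index 0 and index 1 until playback stops.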

Outline of process (3): If the information acquired from the playback status acquisition device 28 indicates that the utterance has ended, the drive control device 31 determines that the utterance has ended and puts the drive system 1 into an idle or inactive state. When the playback status acquisition device 28 detects the end of the utterance, the drive control device 31 checks the housing state acquisition device 32; if a motion is still in progress, it issues a stop command to the drive system 1 and drives the drive system 1 so that it returns to the utterance intention presentation posture, which serves as the initial posture. However, if the motion is within a predetermined time (for example, 400 ms), this is treated as tolerable and no stop command is issued.
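The end-of-utterance handling in process (3) can be sketched as a three-way decision. This sketch interprets the 400 ms allowance as the time the motion still needs in order to finish, which is an assumption; the text does not say exactly which duration is measured. The function name and the returned labels are also hypothetical.

```python
def on_playback_finished(
    motion_active: bool,
    remaining_motion_ms: float,
    tolerance_ms: float = 400.0,  # example value given in the text
) -> str:
    """Decide what the drive system should do when speech playback ends."""
    if not motion_active:
        # No behavior running: the drive system can go idle or inactive.
        return "idle"
    if remaining_motion_ms <= tolerance_ms:
        # Assumed reading: a motion about to finish is allowed to complete.
        return "let_motion_finish"
    # Otherwise stop the motion and return to the initial posture.
    return "stop_and_return_to_intention_posture"
```

Under this reading, a motion with 300 ms left is allowed to complete, while one with 900 ms left is interrupted.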

Specifically, process (3) corresponds to steps (19. playback end) through (22. playback end command) of the sequence shown in FIG. 2. When the playback device 26 finishes playback (19. playback end), it outputs playback end state change information to the playback status acquisition device 28 (20. playback end state change).

Based on the input playback end state change information, the playback status acquisition device 28 notifies the drive control device 31 that the robot 101 has finished speaking (21. playback end notification). What is sent here is a signal indicating that the robot 101 has finished playing back speech to the user.

In response to the playback end notification from the playback status acquisition device 28, the drive control device 31 issues a playback end command (stop command) to the drive system 1 (22. Playback end command). This stops the operation of the drive system 1.
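
The end-of-utterance handling of process (3) might be sketched as follows. This is only an illustration under stated assumptions, not the implementation described in the patent: the `DriveSystem` class, the function `on_playback_end`, and the posture name are hypothetical, and the 400 ms tolerance is read here as the remaining duration of the motion, which the text does not state explicitly.

```python
from dataclasses import dataclass, field

TOLERANCE_MS = 400  # example value given in the text


@dataclass
class DriveSystem:
    """Hypothetical stand-in for drive system 1; it only records commands."""
    log: list = field(default_factory=list)

    def stop(self):
        self.log.append("stop")

    def move_to(self, posture):
        self.log.append(f"move_to:{posture}")

    def idle(self):
        self.log.append("idle")


def on_playback_end(drive, motion_in_progress, remaining_ms,
                    intent_posture="utterance_intention"):
    """Handle a playback end notification (step 21 in FIG. 2)."""
    if motion_in_progress and remaining_ms > TOLERANCE_MS:
        # The motion would clearly outlast the speech: abort it and
        # return to the initial (utterance intention presentation) posture.
        drive.stop()
        drive.move_to(intent_posture)
    elif not motion_in_progress:
        # Nothing is moving, so the drive system can simply go idle.
        drive.idle()
    # Otherwise the motion ends within the tolerance and is left to finish.
    return drive.log
```

With these assumptions, a motion with 1000 ms left is stopped and the robot is returned to the initial posture, while a motion with 300 ms left is allowed to complete.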

(Effect)
As described above, at the start of a dialogue with the user, the robot 101 assumes the utterance intention presentation posture, a posture that lets the user know that the robot intends to speak. In other words, the robot 101 can show the user, at the start of the dialogue, whether or not it intends to speak. This makes the dialogue between the user and the robot 101 proceed smoothly, so natural non-verbal communication between the user and the robot can be realized.

In this embodiment, the start of the dialogue with the user is taken to be the moment the robot starts playing voice to the user, but this is not a limitation; it may instead be the moment the robot finishes inputting the user's voice. In that case, in the sequence shown in FIG. 2, the dialogue start status notification is sent to the drive control device 31 when input by the input device 22 ends. At that timing, the drive control device 31 acquires the housing state from the housing state acquisition device 32. The subsequent processing is the same as described above.

Because the end of input by the input device 22 is also the start of speech recognition, the start of the dialogue with the user may be the start of speech recognition by the robot. In a device in which a switch is provided on the housing of the robot 101, with the microphone 21 turned ON when the switch is pressed and OFF when it is released, the start of the dialogue may be the moment the switch is released. Alternatively, a camera may be provided on the housing of the robot 101 to detect a person and then detect that the person's lips have stopped moving; the timing at which a conversation is thereby presumed to begin may be taken as the start of the dialogue with the user.

The robot 101 starts speaking only when the dialogue device 24 has been able to form utterance content for a response. No utterance content can be formed when the user's intent is insufficiently expressed in the utterance, or when the user's utterance is meaningless, such as a sneeze. In such cases, instead of speaking, the robot 101 may move from the utterance intention presentation posture into an utterance intention release behavior indicating that it no longer intends to speak. Any behavior that results in a posture different from the utterance intention presentation posture may be used, but a behavior that makes it easy for the user to recognize that the robot 101 no longer intends to speak is preferable.
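
The choice between speaking and releasing the utterance intention could look like the sketch below. The function names, the dictionary representation of a posture, and the rule of taking the first differing candidate are assumptions made for illustration; the text only requires that the release behavior end in a posture different from the utterance intention presentation posture.

```python
from typing import Optional


def choose_release_posture(intent_posture: dict, candidates: list) -> Optional[dict]:
    """Return the first candidate posture that differs from the intention posture."""
    for posture in candidates:
        if posture != intent_posture:
            return posture
    return None


def respond(reply: Optional[str], intent_posture: dict, candidates: list) -> tuple:
    """Speak when response content exists; otherwise release the intention."""
    if reply and reply.strip():
        return ("speak", reply)
    # No response content could be formed (unclear intent, a sneeze, ...).
    return ("release", choose_release_posture(intent_posture, candidates))
```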

In the posture control device 3 configured as above, the posture recording device 33 and the behavior pattern recording device 34 are shown as separate devices, but the two may be combined into a single recording device.

[Embodiment 2]
Another embodiment of the present invention is described below. For convenience, members having the same functions as those described in Embodiment 1 are given the same reference numerals, and their description is omitted.

(Robot overview)
FIG. 3 is a schematic configuration diagram of the robot 201 according to this embodiment. The robot 201 differs from the robot 101 of Embodiment 1 in that the speech recognition device 23 is located in a server (not shown) on a network and a communication device 29 is added for communicating with that speech recognition device 23. That is, in the robot 201, voice picked up by the microphone 21 is input by the input device 22, the input voice data is sent by the communication device 29 to the server on the network, and the speech recognition device 23 in the server performs speech recognition. The recognition result is then sent to the dialogue device 24 via the communication device 29. The communication device 29 may use any communication scheme, as long as it can communicate with the speech recognition device 23 provided on an external network such as the Internet.

(Posture control processing)
FIG. 4 is a sequence diagram showing the flow of the posture control processing of the robot 201 shown in FIG. 3. The sequence comprises process (11), which runs until the robot 201 plays its voice; process (12), performed when the behavior of the robot 201 ends during voice playback; and process (13), performed when voice playback ends during the behavior of the robot 201.

Outline of process (11): Process (11) is almost the same as process (1) described in Embodiment 1, but the start of the dialogue with the user is defined differently. The user's utterance is acquired as voice data by the microphone 21, the acquired voice data is input by the input device 22, and the moment the input ends is taken as the start of the dialogue with the user. In this respect process (11) differs from process (1) of Embodiment 1.

Specifically, process (11) corresponds to steps (1. Voice data input) through (15. Sound data output) of the sequence shown in FIG. 4. The microphone 21 converts the voice produced by the user's speech into waveform data and outputs it as sound data to the input device 22 (1. Voice data input). The input device 22 takes in the sound data and, when the input ends, sends a dialogue start status notification to the drive control device 31 (2. Dialogue start status notification). This notification tells the drive control device 31 that input of the user's voice has ended.

At the timing at which it receives, from the playback status acquisition device 28, the signal indicating that playback of the robot 101's voice to the user has started, the drive control device 31 acquires the state of the robot 201 (housing state) from the housing state acquisition device 32 (3. Housing information acquisition). The drive control device 31 also acquires the utterance intention presentation posture recorded in the posture recording device 33 (4. Utterance intention presentation posture acquisition). The posture specifying unit 31a then identifies the posture of the robot 201 from the acquired housing state, and the drive control device 31 determines whether the identified posture is the acquired utterance intention presentation posture. The drive control device 31 drives the drive system 1 according to the result (5. Transition to utterance intention presentation posture).

If the identified posture of the robot 201 is not the utterance intention presentation posture, the drive control device 31 drives the drive system 1 so that the robot assumes that posture. If the identified posture already is the utterance intention presentation posture, and the head faces the front in that posture, the drive control device 31 makes the robot lower its head once and then return it to the front.
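
The decision made at the start of the dialogue can be sketched as a small planning function. Everything concrete here is an assumption for illustration: the `Posture` fields, the angle convention (0 degrees means the head faces the front, positive values mean the head is lowered), the 30 degree nod, and the 1 degree matching tolerance do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posture:
    head_pitch_deg: float  # 0 = facing front, positive = lowered (assumed)
    arm_deg: float


INTENT = Posture(0.0, 0.0)  # hypothetical utterance intention presentation posture


def plan_at_dialogue_start(current, intent=INTENT, tol=1.0):
    """Return the list of target postures to drive through, in order."""
    def same(a, b):
        return (abs(a.head_pitch_deg - b.head_pitch_deg) <= tol
                and abs(a.arm_deg - b.arm_deg) <= tol)

    if not same(current, intent):
        # Not in the intention posture yet: move into it.
        return [intent]
    if abs(intent.head_pitch_deg) <= tol:
        # Already in the posture with the head facing front:
        # lower the head once, then bring it back to the front.
        return [Posture(30.0, intent.arm_deg), intent]
    return []
```

Returning a list of targets keeps the decision separate from the actual motor commands, which makes the rule easy to check in isolation.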

The input device 22 then receives speech recognition start command (1) from a control unit (not shown) and transmits the input voice data, via the communication device 29, to the speech recognition device 23 provided in the server on the network (6. Speech recognition start command (1)). On receiving speech recognition start command (2) from the control unit in the server, the speech recognition device 23 converts the input sound data into text data (7. Speech recognition start command (2)) and outputs it to the dialogue device 24 (8. Dialogue start command).

On receiving the dialogue start command from a control unit (not shown), the dialogue device 24 analyzes the content of the user's utterance from the input text data and retrieves the text data of a dialogue sentence corresponding to that content from a database (not shown). It then outputs the retrieved text data to the speech synthesis device 25 (9. Dialogue text synthesis command).

On receiving the dialogue text synthesis command from a control unit (not shown), the speech synthesis device 25 converts the input text data into output sound wave data (PCM data) and outputs it to the playback device 26 (10. Voice data playback command). On receiving the voice data playback command from a control unit (not shown), the playback device 26, when it plays the output sound wave data, outputs utterance start state change information to the playback status acquisition device 28 (11. Utterance start state change). This information indicates whether the robot 201 has started speaking; here it indicates that the utterance has started.

Based on the utterance start state change information it receives, the playback status acquisition device 28 notifies the drive control device 31 that the robot 201 has started speaking (12. Utterance start status notification). The notification is a signal indicating that playback of the robot 201's voice to the user has started. When the robot 201 starts speaking, the drive control device 31 acquires a behavior pattern corresponding to the utterance content from the behavior pattern recording device 34 (13. Behavior pattern acquisition) and starts driving the drive system 1 so as to realize the acquired behavior pattern (14. Behavior start command). Once the drive system 1 starts, the playback device 26, in response to the voice data playback command from a control unit (not shown), sounds the output sound wave data as sound waves through the speaker 27 (15. Voice data output).

That concludes process (11). Process (12) is the same as process (2) of Embodiment 1 and process (13) is the same as process (3) of Embodiment 1, so their description is omitted.
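
The ordering of process (11) can be summarized in a short sketch that records each step of FIG. 4 as it happens. The function `run_process_11` and its three callables are hypothetical placeholders for the speech recognition device 23 (reached through the communication device 29 in this embodiment), the dialogue device 24, and the speech synthesis device 25; no real speech API is implied.

```python
def run_process_11(audio, recognize, make_reply, synthesize):
    """Trace the step order of process (11); returns (events, pcm)."""
    events = [
        "1. voice data input",
        "2. dialogue start status notification",
        # The posture is checked before recognition even begins.
        "3-5. posture check and transition",
    ]
    text = recognize(audio)        # remote call in Embodiment 2
    events.append("6-8. speech recognition (remote)")
    reply = make_reply(text)
    events.append("9. dialogue text synthesis command")
    pcm = synthesize(reply)
    events.append("10. voice data playback command")
    events.append("11-12. utterance start notification")
    events.append("13-14. behavior start")
    events.append("15. voice data output")
    return events, pcm
```

The point of the trace is that the transition to the utterance intention presentation posture happens while the network round trip for recognition is still ahead, so the user gets visual feedback before the robot has anything to say.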

(Effect)
As described above, at the start of a dialogue with the user, the robot 201 assumes the utterance intention presentation posture, a posture that lets the user know that the robot intends to speak. In other words, the robot 201 can show the user, at the start of the dialogue, whether or not it intends to speak. This makes the dialogue between the user and the robot 201 proceed smoothly. Moreover, in this embodiment the speech recognition device 23 is located in a server on the network, so no speech recognition processing has to be performed inside the robot 201, which reduces the processing load on the robot 201.

In this embodiment, the start of the dialogue with the user is taken to be the moment the robot finishes inputting the user's voice, but this is not a limitation; it may instead be the moment the robot starts playing voice to the user.

As in Embodiment 1, the end of input by the input device 22 is also the start of speech recognition, so in this embodiment too the start of the dialogue with the user may be the start of speech recognition by the robot. In a device in which a switch is provided on the housing of the robot 201, with the microphone 21 turned ON when the switch is pressed and OFF when it is released, the start of the dialogue may be the moment the switch is released. Alternatively, a camera may be provided on the housing of the robot 201 to detect a person and then detect that the person's lips have stopped moving; the timing at which a conversation is thereby presumed to begin may be taken as the start of the dialogue with the user.

The robot 201 starts speaking only when the dialogue device 24 has been able to form utterance content for a response. No utterance content can be formed when the user's intent is insufficiently expressed in the utterance, or when the user's utterance is meaningless, such as a sneeze. In such cases, instead of speaking, the robot 201 may move from the utterance intention presentation posture into an utterance intention release behavior indicating that it no longer intends to speak.

In the posture control device 3 configured as above, the posture recording device 33 and the behavior pattern recording device 34 are shown as separate devices, but the two may be combined into a single recording device.

In Embodiments 1 and 2, posture control of the robots 101 and 201 is based on output signals from the voice system 2 (the signal from the input device 22 indicating the end of voice input, and the signal from the playback status acquisition device 28 indicating the start of voice playback). Embodiment 3 below describes an example in which posture control is instead based on an image of the user's face captured by a camera.

[Embodiment 3]
Yet another embodiment of the present invention is described below. For convenience, members having the same functions as those described in Embodiment 1 are given the same reference numerals, and their description is omitted.

(Robot overview)
FIG. 5 is a schematic configuration diagram of the robot 301 according to this embodiment. The robot 301 has almost the same configuration as the robot 101 of Embodiment 1, differing in that an image system 4 is newly provided. The image system 4 includes a camera 41 that captures the user's face and an image acquisition device (image acquisition unit) 42 that acquires the face image captured by the camera 41. It further includes an image determination device (image determination unit) 43 that determines whether the face image acquired by the image acquisition device 42 is an image indicating the end of the user's utterance.

The camera 41 is a digital camera that captures the user who is the dialogue partner of the robot 301; any type or format of camera may be used as long as it can be mounted inside the robot 301. The image acquisition device 42 extracts the user's face image from the image of the user captured by the camera 41 and sends it to the image determination device 43.

The image determination device 43 performs face authentication on the user's face image sent from the image acquisition device 42 and, from the result, determines whether the image indicates the end of the user's utterance. Here, it determines whether the face image shows the user's mouth closed, and sends the result to the posture control device 3. The posture control device 3 thus performs posture control of the robot 301 at the timing at which the user's mouth closes. In other words, in this embodiment the timing at which the posture control device 3 determines the posture of the robot 301, that is, the start of the dialogue with the user, is the moment the image determination device 43 determines that the image indicates the end of the user's utterance.
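
One way to turn per-frame mouth observations into an "utterance ended" decision is sketched below. It is not the method of the image determination device 43, which the text does not detail. The input is assumed to be a hypothetical mouth-opening ratio per frame (0 means fully closed), and both the closing threshold and the requirement that the mouth stay closed for several consecutive frames are assumptions added to avoid firing on brief closures during speech.

```python
def utterance_end_frame(mouth_open_ratios, closed_below=0.1, hold_frames=3):
    """Return the index of the frame at which the utterance is judged to
    have ended, or None if no such frame exists."""
    spoke = False      # becomes True once the mouth has been seen open
    closed_run = 0     # consecutive closed frames since the last open frame
    for i, ratio in enumerate(mouth_open_ratios):
        if ratio >= closed_below:
            spoke = True
            closed_run = 0
        elif spoke:
            closed_run += 1
            if closed_run == hold_frames:
                return i
    return None
```

A user who never opens their mouth produces no trigger, which matches the idea that the dialogue starts only after the user has actually spoken.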

(Effect)
As described above, at the start of a dialogue with the user (when the image determination device 43 determines that the image indicates the end of the user's utterance), the robot 301 assumes the utterance intention presentation posture, a posture that lets the user know that the robot intends to speak. In other words, the robot 301 can show the user, at the start of the dialogue, whether or not it intends to speak. This makes the dialogue between the user and the robot 301 proceed smoothly.

(Modification)
FIG. 6 is a schematic block diagram of a robot 401, a modification of the robot 301 shown in FIG. 5. The robot 401 has almost the same configuration as the robot 201 of Embodiment 2, differing in that the image system 4 is newly provided. Posture control using the image system 4 is the same as in the robot 301 shown in FIG. 5, so its description is omitted.

(Effect)
The robot 401 provides almost the same effect as the robot 301. In addition, because the speech recognition device 23 is located in a server on the network, no speech recognition processing has to be performed inside the robot 401, which reduces the processing load on the robot 401.

In both the robot 301 of this embodiment and the robot 401 of the modification, the start of the dialogue with the user is the moment an image is determined to indicate the end of the user's utterance, but this is not a limitation. For example, as with the robot 101 of Embodiment 1 and the robot 201 of Embodiment 2, it may be the moment an output signal from the voice system 2 is received (the signal from the input device 22 indicating the end of voice input, or the signal from the playback status acquisition device 28 indicating the start of voice playback).
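
Since the embodiments differ mainly in which event counts as the start of the dialogue, the choice could be made configurable. The enum and class below are hypothetical names introduced for illustration; they simply enumerate the three signals discussed in Embodiments 1 to 3.

```python
from enum import Enum, auto


class DialogueStartTrigger(Enum):
    PLAYBACK_START = auto()       # signal from playback status acquisition device 28
    VOICE_INPUT_END = auto()      # signal from input device 22
    UTTERANCE_END_IMAGE = auto()  # decision by image determination device 43


class DialogueStartDetector:
    """Reports whether an incoming event should be treated as dialogue start."""

    def __init__(self, enabled):
        self.enabled = frozenset(enabled)

    def on_event(self, event):
        return event in self.enabled
```

A robot like the robot 301 would enable only `UTTERANCE_END_IMAGE`, while the variation described above would enable one of the voice system signals instead.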

[Example of software implementation]
The control blocks of the drive control device 31 may be realized by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or by software using a CPU (Central Processing Unit).

In the latter case, the drive control device 31 includes a CPU that executes the instructions of a program, which is software realizing each function; a ROM (Read Only Memory) or storage device (referred to as a "recording medium") on which the program and various data are recorded so as to be readable by a computer (or CPU); a RAM (Random Access Memory) into which the program is loaded; and so on. The object of the present invention is achieved when the computer (or CPU) reads the program from the recording medium and executes it. As the recording medium, a "non-transitory tangible medium" can be used, for example a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit. The program may also be supplied to the computer via any transmission medium capable of transmitting it (a communication network, broadcast waves, etc.). The present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission.

[Summary]
The posture control device according to aspect 1 of the present invention is a posture control device (31) that is provided in a robot (101, 201, 301, 401) capable of dialogue with a user and capable of assuming various postures by driving a plurality of drive units (drive system 1), and that controls the posture of the robot (101, 201, 301, 401). The device comprises a posture specifying unit (31a) that identifies the posture of the robot (101, 201, 301, 401) from the drive state of each drive unit (drive system 1), and a drive control unit (31b) that controls the driving of each drive unit (drive system 1). When the posture of the robot (101, 201, 301, 401) identified by the posture specifying unit (31a) at the start of a dialogue with the user is not the utterance intention presentation posture, which indicates that the robot (101, 201, 301, 401) intends to speak, the drive control unit (31b) drives the drive units (drive system 1) to make the robot (101, 201, 301, 401) assume the utterance intention presentation posture.

With this configuration, the robot (101, 201, 301, 401) can always be placed in the utterance intention presentation posture at the start of a dialogue with the user, so the user can easily recognize visually, from the robot's posture, that the robot intends to speak.

The robot can therefore clearly show the user, at the start of the dialogue, whether or not it intends to speak. This makes the dialogue between the user and the robot proceed smoothly and, as a result, realizes natural non-verbal communication between the user and the robot.
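
The division of labor between the posture specifying unit (31a) and the drive control unit (31b) might be sketched as two functions. The joint names, recorded angles, and the 5 degree tolerance are invented for illustration; the text only says that the posture is identified from the drive state of each drive unit.

```python
# Hypothetical recorded postures: joint name -> target angle in degrees.
RECORDED_POSTURES = {
    "utterance_intention": {"neck_pitch": 0.0, "left_arm": 0.0, "right_arm": 0.0},
    "head_down": {"neck_pitch": 30.0, "left_arm": 0.0, "right_arm": 0.0},
}


def specify_posture(drive_state, recorded=RECORDED_POSTURES, tol_deg=5.0):
    """Posture specifying step: name the recorded posture that matches
    every joint angle within the tolerance, or return None."""
    for name, target in recorded.items():
        if all(abs(drive_state.get(joint, float("inf")) - angle) <= tol_deg
               for joint, angle in target.items()):
            return name
    return None


def drive_control(drive_state, recorded=RECORDED_POSTURES,
                  intent="utterance_intention"):
    """Drive control step: return the joint targets to command, or None
    when the robot is already in the utterance intention presentation posture."""
    if specify_posture(drive_state, recorded) == intent:
        return None
    return recorded[intent]
```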

In the posture control device according to aspect 2 of the present invention, in aspect 1, when the robot (101, 201, 301, 401) conducts a dialogue with the user by inputting the user's voice and playing voice toward the user in response to the input voice, the start of the dialogue with the user may be the moment the robot (101, 201, 301, 401) starts playing voice to the user.

With this configuration, the start of the dialogue with the user is the moment the robot (101, 201, 301, 401) starts playing voice to the user, so the robot can assume the utterance intention presentation posture toward the user just as it is about to speak. The user can thus clearly recognize that the robot intends to speak, from the voice in addition to the robot's posture.

In the posture control device according to aspect 3 of the present invention, in aspect 1, when the robot (101, 201, 301, 401) conducts a dialogue with the user by inputting the user's voice and playing voice toward the user in response to the input voice, the start of the dialogue with the user may be the moment the robot (101, 201, 301, 401) finishes inputting the user's voice.

With this configuration, the start of the dialogue with the user is the moment the robot (101, 201, 301, 401) finishes inputting the user's voice, so the robot can assume the utterance intention presentation posture toward the user at the moment the user stops speaking. The robot can thus quickly let the user know that it intends to speak.

The posture control device according to aspect 4 of the present invention, in aspect 1, may further comprise an image acquisition unit (image acquisition device 42) that acquires a face image of the user, and an image determination unit (image determination device 43) that determines whether the face image acquired by the image acquisition unit (image acquisition device 42) is an image indicating the end of the user's utterance, and the start of the dialogue with the user may be the moment the image determination unit (image determination device 43) determines that the image indicates the end of the user's utterance.

With this configuration, the start of the dialogue with the user is the moment the image determination unit (image determination device 43) determines that the image indicates the end of the user's utterance, so the robot can assume the utterance intention presentation posture toward the user at the moment the user stops speaking. The robot can thus quickly let the user know that it intends to speak.

A robot according to aspect 5 of the present invention includes the posture control device (31) according to any one of aspects 1 to 4 above. According to this configuration, the robot can clearly inform the user that it intends to speak.

A posture control method according to aspect 6 of the present invention is a method for controlling the posture of a robot (101, 201, 301, 401) that is capable of interacting with a user and of taking various postures by driving a plurality of drive units (drive system 1). The method includes: a posture specifying step of specifying the posture of the robot (101, 201, 301, 401) at the start of dialogue with the user; and a drive control step of, when the posture of the robot (101, 201, 301, 401) specified in the posture specifying step is not an utterance intention presentation posture indicating that the robot (101, 201, 301, 401) intends to speak, driving the drive units to cause the robot (101, 201, 301, 401) to assume the utterance intention presentation posture. This configuration provides the same effect as aspect 1 above.
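The two steps of the method can be illustrated with a minimal sketch. Everything concrete here is assumed for illustration only: the source does not give joint names, angle units, target angles, or a tolerance for deciding whether two postures match, and `DriveUnit` is a hypothetical stand-in for one unit of the drive system.

```python
from typing import Dict

# Assumed representation: drive-unit name -> angle in degrees.
Posture = Dict[str, float]

# Hypothetical target posture and matching tolerance (not from the source).
UTTERANCE_INTENTION_POSTURE: Posture = {"neck_pitch": 10.0, "right_arm": 45.0}
TOLERANCE_DEG = 2.0


class DriveUnit:
    """Stand-in for one drive unit; holds its current angle."""

    def __init__(self, angle: float) -> None:
        self.angle = angle

    def drive_to(self, target: float) -> None:
        self.angle = target


def specify_posture(units: Dict[str, DriveUnit]) -> Posture:
    """Posture specifying step: derive the posture from each unit's state."""
    return {name: unit.angle for name, unit in units.items()}


def is_utterance_intention_posture(posture: Posture) -> bool:
    """True if every tracked joint is within tolerance of the target."""
    return all(
        abs(posture[name] - target) <= TOLERANCE_DEG
        for name, target in UTTERANCE_INTENTION_POSTURE.items()
    )


def on_dialogue_start(units: Dict[str, DriveUnit]) -> bool:
    """Drive control step; returns True if the drive units were driven."""
    posture = specify_posture(units)
    if is_utterance_intention_posture(posture):
        return False  # already in the target posture, nothing to do
    for name, target in UTTERANCE_INTENTION_POSTURE.items():
        units[name].drive_to(target)
    return True
```

Calling `on_dialogue_start` a second time without moving the robot returns `False`, reflecting that the drive units are driven only when the specified posture differs from the utterance intention presentation posture.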

The posture control device according to each aspect of the present invention may be realized by a computer. In that case, a posture control program that realizes the posture control device on the computer by causing the computer to operate as each unit (software element) of the posture control device, and a computer-readable recording medium on which that program is recorded, also fall within the scope of the present invention.

The present invention is not limited to the embodiments described above. Various modifications are possible within the scope of the claims, and embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention. Furthermore, new technical features can be formed by combining the technical means disclosed in the respective embodiments.

Description of Reference Signs
1 drive system (drive unit)
2 voice system
3 posture control device
4 image system
21 microphone
22 input device
23 speech recognition device
24 dialogue device
25 speech synthesis device
26 playback device
27 speaker
28 playback status acquisition device
29 communication device
31 drive control device
31a posture specifying unit
31b drive control unit
32 housing state acquisition device
33 posture recording device
34 behavior pattern recording device
41 camera
42 image acquisition device (image acquisition unit)
43 image determination device (image determination unit)
101, 201, 301, 401 robot

Claims

A posture control device that is provided in a robot capable of interacting with a user and of taking various postures by driving a plurality of drive units, and that controls the posture of the robot, the posture control device comprising:
a posture specifying unit that specifies the posture of the robot from the driving state of each of the drive units; and
a drive control unit that performs drive control of each of the drive units,
wherein, when the posture of the robot specified by the posture specifying unit at the start of dialogue with the user is not an utterance intention presentation posture indicating that the robot intends to speak, the drive control unit drives the drive units to cause the robot to assume the utterance intention presentation posture.

The posture control device according to claim 1, wherein, when the robot interacts with the user by receiving the user's voice as input and reproducing voice toward the user in response to the input voice,
the start of the dialogue with the user is the time at which the robot starts reproducing voice toward the user.

The posture control device according to claim 1, wherein, when the robot interacts with the user by receiving the user's voice as input and reproducing voice toward the user in response to the input voice,
the start of the dialogue with the user is the time at which the robot finishes receiving the user's voice input.

The posture control device according to claim 1, further comprising:
an image acquisition unit that acquires a face image obtained by imaging the user's face; and
an image determination unit that determines whether the face image acquired by the image acquisition unit is an image indicating the end of the user's utterance,
wherein the start of the dialogue with the user is the time at which the image determination unit determines that the face image is an image indicating the end of the user's utterance.

A robot capable of interacting with a user and of taking various postures by driving a plurality of drive units,
the robot comprising the posture control device according to claim 1.

A posture control method for controlling the posture of a robot capable of interacting with a user and of taking various postures by driving a plurality of drive units, the posture control method comprising:
a posture specifying step of specifying the posture of the robot at the start of dialogue with the user; and
a drive control step of, when the posture of the robot specified in the posture specifying step is not an utterance intention presentation posture indicating that the robot intends to speak, driving the drive units to cause the robot to assume the utterance intention presentation posture.