JP6391386B2

JP6391386B2 - Server, server control method, and server control program

Info

Publication number: JP6391386B2
Application number: JP2014192653A
Authority: JP
Inventors: 新開　誠
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2014-09-22
Filing date: 2014-09-22
Publication date: 2018-09-19
Anticipated expiration: 2034-09-22
Also published as: JP2016062077A

Description

The present invention relates to an interactive device, a dialogue system, a dialogue program, a server, a server control method, and a server control program that provide pseudo-communication.

As an interactive device, a voice interactive device capable of conversing with a user has been proposed (Patent Document 1).
That device provides emotion parameters so that conversation as natural as that between humans is possible, and changes the content of its responses according to the values of the emotion parameters.

As another example, an interactive device that keeps conversation with a person from becoming unnatural has also been proposed (Patent Document 2). That device changes the attitude it expresses according to its level of understanding of the dialogue.

Patent Document 1: JP 2002-123289 A
Patent Document 2: JP 2013-154458 A

On the other hand, in the method according to Patent Document 1, the emotion parameters are adjusted based on the user's emotions. In a natural conversation, however, the user's emotions and the interactive device need not be linked, and linking them may instead make the conversation unnatural.

In the method according to Patent Document 2, the device responds according to its level of understanding, so the user has to adapt to the device's level of understanding, and the conversation does not become natural.

The present disclosure has been made to solve the above problems, and its object is to provide an interactive device, a dialogue system, a dialogue program, a server, a server control method, and a server control program that enable natural conversation with a user.

An interactive device according to one aspect of the present invention includes: an input reception unit that receives input from a user; a parameter management unit that manages a mood parameter indicating the emotion of the device; a storage unit that stores a plurality of pieces of response information relating to responses to the user input received by the input reception unit, each piece being associated with a value of the mood parameter as an index for its selection; a selection unit that refers to the storage unit and selects one of the plurality of pieces of response information based on the mood parameter managed by the parameter management unit; and a response processing execution unit that executes response processing based on the response information selected by the selection unit.

A dialogue system according to one aspect of the present invention includes: an input reception unit that receives input from a user; a parameter management unit that manages a mood parameter indicating the emotion of the device; a storage unit that stores a plurality of pieces of response information relating to responses to the user input received by the input reception unit, each piece being associated with a value of the mood parameter as an index for its selection; a selection unit that refers to the storage unit and selects one of the plurality of pieces of response information based on the mood parameter managed by the parameter management unit; and a response processing execution unit that executes response processing based on the response information selected by the selection unit.

A dialogue program according to one aspect of the present invention is executed on a computer and causes the computer to execute processing that includes the steps of: receiving input from a user; managing a mood parameter indicating the emotion of the device; referring to a storage unit that stores a plurality of pieces of response information relating to responses to the received user input, each piece being associated with a value of the mood parameter as an index for its selection, and selecting one of the plurality of pieces of response information based on the mood parameter; and executing response processing based on the selected response information.

A dialogue system according to one aspect of the present invention includes an interactive device and a server provided so as to be able to communicate with it. The interactive device includes an input reception unit that receives input from a user, and a response processing execution unit that executes response processing in accordance with an instruction from the server. The server includes: a parameter management unit that manages a mood parameter indicating the emotion of the device; a storage unit that stores a plurality of pieces of response information relating to responses to the user input received by the input reception unit, each piece being associated with a value of the mood parameter as an index for its selection; a selection unit that refers to the storage unit and selects one of the plurality of pieces of response information based on the mood parameter managed by the parameter management unit; and a response processing execution instruction unit that instructs the interactive device to execute response processing based on the response information selected by the selection unit.

An interactive device according to another aspect of the present invention is provided so as to be able to communicate with a server, and includes: an input reception unit that receives input from a user; and a response processing execution unit that executes response processing based on response information that is selected, in accordance with the user input received by the input reception unit, from among a plurality of pieces of response information stored in the server relating to responses to user input, the selection being based on a mood parameter that is managed by the server and indicates the emotion of the device.

A dialogue program according to another aspect of the present invention is executed on a computer of an interactive device provided so as to be able to communicate with a server, and causes the computer to execute processing that includes the steps of: receiving input from a user; and executing response processing based on response information that is selected, in accordance with the received user input, from among a plurality of pieces of response information stored in the server relating to responses to user input, the selection being based on a mood parameter that is managed by the server and indicates the emotion of the device.

A server according to one aspect of the present invention is provided so as to be able to communicate with an interactive device, and includes: a receiving unit that receives user input accepted by the interactive device; a parameter management unit that manages a mood parameter indicating the emotion of the device; a storage unit that stores a plurality of pieces of response information relating to responses to the user input received by the receiving unit, each piece being associated with a value of the mood parameter as an index for its selection; a selection unit that refers to the storage unit and selects one of the plurality of pieces of response information based on the mood parameter managed by the parameter management unit; and a response processing execution instruction unit that instructs the interactive device to execute response processing based on the response information selected by the selection unit.

Preferably, the parameter management unit updates the mood parameter.
Preferably, the parameter management unit updates the mood parameter based on information acquired from outside.

Preferably, the parameter management unit updates the mood parameter based on environmental information acquired from outside.

Preferably, the parameter management unit updates the mood parameter when the information acquired from outside satisfies a predetermined condition.

Preferably, the parameter management unit updates the mood parameter based on the mode of the user input received by the receiving unit.

Preferably, the parameter management unit updates the mood parameter based on the content of the user's voice input received by the receiving unit.

Preferably, the parameter management unit updates the mood parameter based on sensor input information that is detected in connection with the user and received by the receiving unit.

Preferably, the server further includes a history storage unit that stores a history of inputs from the user, and the parameter management unit refers to the history storage unit and updates the mood parameter based on the modes of a plurality of inputs from the user.

Preferably, the response processing execution instruction unit instructs the interactive device to execute speech synthesis processing, and the pieces of response information contain mutually different parameter information for use in executing the speech synthesis processing.

Preferably, the response processing execution instruction unit instructs the interactive device to execute audio output processing, and the pieces of response information contain mutually different output content for use in executing the audio output processing.
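As a minimal sketch of updating the mood parameter only when externally acquired environmental information satisfies a predetermined condition, the following Python fragment may help. The 0 to 100 range, the temperature and humidity thresholds, the penalty value, and all names (`EnvironmentInfo`, `is_uncomfortable`, `update_mood`) are assumptions made for illustration; the source does not specify them.

```python
from dataclasses import dataclass

# Hypothetical bounds; the source does not fix a range for the mood parameter.
MOOD_MIN, MOOD_MAX = 0, 100


@dataclass
class EnvironmentInfo:
    """Environmental information acquired from outside (field names assumed)."""
    temperature_c: float
    humidity_pct: float


def is_uncomfortable(env: EnvironmentInfo) -> bool:
    """Hypothetical predetermined condition: hot and humid surroundings."""
    return env.temperature_c >= 30.0 and env.humidity_pct >= 70.0


def update_mood(mood: int, env: EnvironmentInfo, penalty: int = 10) -> int:
    """Lower the mood only when the condition holds, then clamp to the range."""
    if is_uncomfortable(env):
        mood -= penalty
    return max(MOOD_MIN, min(MOOD_MAX, mood))
```

When the condition is not satisfied the parameter is returned unchanged, which mirrors the "updates when a predetermined condition is satisfied" wording above.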

A server control method according to one aspect of the present invention is a method for controlling a server provided so as to be able to communicate with an interactive device, and includes the steps of: receiving user input accepted by the interactive device; managing a mood parameter indicating the emotion of the device; referring to a storage unit that stores a plurality of pieces of response information relating to responses to the received user input, each piece being associated with a value of the mood parameter as an index for its selection, and selecting one of the plurality of pieces of response information based on the mood parameter; and instructing the interactive device to execute response processing based on the selected response information.

A server control program according to one aspect of the present invention is executed on a computer of a server provided so as to be able to communicate with an interactive device, and causes the computer to execute processing that includes the steps of: receiving user input accepted by the interactive device; managing a mood parameter indicating the emotion of the device; referring to a storage unit that stores a plurality of pieces of response information relating to responses to the received user input, each piece being associated with a value of the mood parameter as an index for its selection, and selecting one of the plurality of pieces of response information based on the mood parameter; and instructing the interactive device to execute response processing based on the selected response information.

According to one aspect of the present disclosure, natural conversation with a user is possible.

FIG. 1 is a diagram illustrating the dialogue system 1 according to the embodiment.
FIG. 2 is a diagram illustrating the configuration of the main parts of the dialogue system 1 according to the embodiment.
FIG. 3 is a diagram illustrating the response content database 232 according to Embodiment 1.
FIG. 4 is a diagram illustrating the history storage unit 233 according to Embodiment 1.
FIG. 5 is a diagram illustrating the sensor information database 234 according to Embodiment 1.
FIG. 6 is a diagram illustrating the change table database 231 according to Embodiment 1.
FIG. 7 is a sequence diagram showing the flow of response processing in the dialogue system 1 according to Embodiment 1.
FIG. 8 is a flowchart of the dialogue output processing executed by the server 20 according to Embodiment 1.
FIG. 9 is a flowchart of the mood parameter acquisition processing executed by the server 20 according to Embodiment 1.
FIG. 10 is a diagram illustrating the speech synthesis parameter table according to Embodiment 2.
FIG. 11 is a diagram illustrating the response content database 232A according to Embodiment 3.
FIG. 12 is a diagram illustrating the character database according to Embodiment 4.
FIG. 13 is a diagram illustrating the configuration of the server according to Embodiment 5.

This embodiment is described below with reference to the drawings. Where the description of the embodiments refers to numbers, amounts, and the like, the scope of the present invention is not necessarily limited to those numbers or amounts unless otherwise stated. In the description of the embodiments, identical and corresponding parts are given the same reference numerals, and overlapping description may not be repeated. Unless otherwise restricted, it is intended from the outset that the configurations shown in the embodiments may be used in appropriate combination.

<Embodiment 1>
(Configuration of Dialogue System 1)
FIG. 1 is a diagram illustrating the dialogue system 1 according to the embodiment.

Referring to FIG. 1, the dialogue system 1 according to the embodiment consists of a doll (interactive device) 10, a network 5, and a server 20.

The doll 10 is provided so as to be able to communicate with the server 20 via the network 5. This example describes communication with the server 20 via the network 5, but the doll 10 may instead communicate with the server 20 directly.

As one example, when a person (user) inputs speech to the doll 10 (hereinafter also called a "user utterance"), the dialogue system 1 performs speech recognition on it in the server 20 and outputs from the doll 10 speech representing the content of the response to the input speech (hereinafter also called "audio playback"). By repeating this processing, the dialogue system 1 according to the embodiment realizes a pseudo-conversation between the user and the doll 10. This example describes the case where the user inputs speech to the doll 10 first, but the doll 10 may instead output speech to the user first, with the user then inputting speech to the doll 10 in reply; the order is not particularly limited.

The embodiment is described using, as an example of the interactive device, a doll 10 that recognizes speech and outputs a voice response to the user, but the present invention is not limited to this. For example, a home appliance other than the doll 10 that has a dialogue function (such as a television or a microwave oven) may also be adopted as the interactive device.

The embodiment is also described using a configuration in which the server 20 is realized by a single server, but the present invention is not limited to this; at least some of the units (functions) of the server 20 may be realized by other servers.

(Configuration of the Main Parts of Dialogue System 1)
FIG. 2 is a diagram illustrating the configuration of the main parts of the dialogue system 1 according to the embodiment.

Referring to FIG. 2, the configuration of the doll 10 is described first.
The doll 10 according to the embodiment includes a communication unit 101, a control unit 102, a microphone 103, a speaker 104, a drive unit 106, a sensor 108, and a storage unit 109.

The communication unit 101 is a means for communicating with the outside. Specifically, the communication unit 101 communicates with the server 20 via the network 5, such as the Internet. The communication may be either wireless or wired.

The microphone 103 accepts sound input from outside. In the embodiment, the sound data representing the sound accepted by the microphone 103 is described as consisting mainly of sound in the frequency band of human speech (also called voice data), but it may also contain sound in frequency bands outside that of voice data. The microphone 103 outputs voice data representing the input sound to the control unit 102.

The speaker 104 outputs a voice response representing the response content output from the control unit 102. Hereinafter, the output of a voice response by the doll 10 through the speaker 104 is also called "playback". The response content is described in detail later.

The drive unit 106 moves the doll 10 based on instructions from the control unit 102. For example, the drive unit 106 can drive movable parts of the doll 10 such as its hands and feet.

The operation unit 107 receives instructions to execute various operations on the doll 10.
The sensor 108 detects various kinds of information. In this example, it includes a temperature sensor and a humidity sensor that can detect air temperature and humidity as environmental information. It also includes a sensor (such as a vibration sensor) that detects that the user has touched the doll 10. The sensor 108 further includes a camera that acquires image information, together with a sensor that can image the user's face with the camera and detect a smile as the user's facial expression contained in the image information.

The storage unit 109 is a storage device such as a RAM (Random Access Memory) or a flash memory, and stores programs and the like for realizing the various functions of the doll 10.

The control unit 102 consists mainly of a CPU (Central Processing Unit), and the function of each unit is realized by the CPU executing the programs stored in the storage unit 109.

The control unit 102 performs overall control of the units of the doll 10. Specifically, the control unit 102 controls the movement of the doll 10 by controlling the drive unit 106. The control unit 102 also transmits voice data representing sound acquired from outside by the microphone 103 to the server 20 via the communication unit 101.

The control unit 102 also receives, via the communication unit 101, the answer phrase data that the server 20 produces by performing speech recognition on the voice data transmitted to it. The control unit 102 can then output from the speaker 104 a voice response representing the response content in accordance with the received answer phrase data.

The main functional configuration of the control unit 102 is described next.
The control unit 102 includes a response processing execution unit 112, a voice input reception unit 114, and an information acquisition unit 115.

The voice input reception unit 114 detects (extracts) voice data. In other words, it detects voice data from the sound data received from outside by extracting the frequency band of human speech.

One way for the voice input reception unit 114 to detect voice data from sound data is to extract the frequency band of human speech (for example, 100 Hz or more and 1 kHz or less) from the sound data. In this case, the voice input reception unit 114 may include, for example, a band-pass filter, or a filter combining a high-pass filter and a low-pass filter, in order to extract that band.
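The combination of a high-pass filter and a low-pass filter mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the first-order RC filter design, the function names, and the use of plain Python lists of samples are choices made here and are not specified in the source; only the 100 Hz and 1 kHz band edges come from the text.

```python
import math


def _rc_and_dt(cutoff_hz, sample_rate_hz):
    """Return the RC time constant for the cutoff and the sampling interval."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    return rc, dt


def high_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order RC high-pass: attenuates content below cutoff_hz."""
    rc, dt = _rc_and_dt(cutoff_hz, sample_rate_hz)
    a = rc / (rc + dt)
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in samples:
        prev_y = a * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out


def low_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order RC low-pass: attenuates content above cutoff_hz."""
    rc, dt = _rc_and_dt(cutoff_hz, sample_rate_hz)
    a = dt / (rc + dt)
    out, prev_y = [], 0.0
    for x in samples:
        prev_y = prev_y + a * (x - prev_y)
        out.append(prev_y)
    return out


def extract_voice_band(samples, sample_rate_hz, low_hz=100.0, high_hz=1000.0):
    """Cascade the two filters so that roughly low_hz..high_hz passes through."""
    return low_pass(high_pass(samples, low_hz, sample_rate_hz), high_hz, sample_rate_hz)
```

First-order filters roll off gently (about 6 dB per octave), so a practical device would likely use a steeper design; the cascade is shown only to make the high-pass plus low-pass idea concrete.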

The voice input reception unit 114 transmits the voice data detected from the sound data to the server 20 via the communication unit 101.

The response processing execution unit 112 performs speech synthesis based on the answer phrase data from the server 20 and, as one example, speaks to the user through the speaker 104.

The information acquisition unit 115 acquires information such as the temperature and humidity detected by the sensor 108, as well as detection information indicating that the user's touch or smile has been detected, and transmits it to the server 20 via the communication unit 101.
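The information passed from the information acquisition unit 115 to the server 20 could be packaged along the following lines. The source does not define a message format, so the JSON encoding, the `SensorReport` type, and its field names are all hypothetical and serve only to show the kind of data involved (temperature, humidity, touch, smile).

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SensorReport:
    """One snapshot of what the sensor 108 detected (field names assumed)."""
    temperature_c: float
    humidity_pct: float
    touched: bool
    smile_detected: bool


def encode_report(report: SensorReport) -> str:
    """Serialize a report for transmission to the server."""
    return json.dumps(asdict(report), sort_keys=True)


def decode_report(payload: str) -> SensorReport:
    """Rebuild the report on the server side."""
    return SensorReport(**json.loads(payload))
```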

Next, the configuration of the server 20 according to the embodiment is described.
The server 20 according to the embodiment includes a communication unit 201, a control unit 202, and a storage unit 203.

The communication unit 201 is a means for communicating with the outside. Specifically, the communication unit 201 communicates with the doll 10 via the network 5, such as the Internet. It is also provided so as to be able to communicate with external devices other than the doll 10. The communication may be either wireless or wired.

The storage unit 203 is a storage device such as a RAM (Random Access Memory) or a flash memory, and stores programs and the like for realizing the various functions of the server 20. As one example, the storage unit 203 also holds a change table database 231 used when updating the mood parameter, a response content database 232 holding information on responses to voice input (also called response information), a history storage unit 233 that manages history, and a sensor information database 234 that stores sensor information.

The control unit 202 consists mainly of a CPU (Central Processing Unit), and is realized by the CPU executing the programs stored in the storage unit 203.

The control unit 202 performs overall control of the units of the server 20. Specifically, the control unit 202 performs speech recognition on the voice data received from the doll 10 via the communication unit 201 and, as a result, outputs answer phrase data to the doll 10 via the communication unit 201.

Next, the main functional configuration of the control unit 202 of the server 20 is described.
The control unit 202 has a voice input receiving unit 221, a management unit 222, a voice recognition unit 223, a selection unit 224, and a response processing execution instruction unit 225.

The voice input receiving unit 221 receives the voice data transmitted from the doll 10 via the communication unit 201, and outputs the received voice data to the voice recognition unit 223.

The management unit 222 manages the mood parameter for the doll 10 (the device). In this example, the management unit 222 calculates and updates the mood parameter of the doll 10 based on, for example, the change table database 231.
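A table-driven update of the mood parameter might look like the sketch below. The contents of the change table database 231 are not given at this point in the text, so the event names, the deltas, the 0 to 100 range, the initial value, and the `MoodManager` class are hypothetical placeholders.

```python
# Hypothetical range; the source does not state one here.
MOOD_MIN, MOOD_MAX = 0, 100

# Hypothetical change table: event name -> change applied to the mood parameter.
CHANGE_TABLE = {
    "user_touched": 5,
    "user_smiled": 10,
    "harsh_words": -15,
}


class MoodManager:
    """Holds the mood parameter of one device and updates it from a change table."""

    def __init__(self, initial=50, table=None):
        self.mood = initial
        self.table = dict(CHANGE_TABLE if table is None else table)

    def apply(self, event):
        """Apply the table entry for the event (no change if absent) and clamp."""
        delta = self.table.get(event, 0)
        self.mood = max(MOOD_MIN, min(MOOD_MAX, self.mood + delta))
        return self.mood
```

Keeping the deltas in a table rather than in code means the device's temperament can be tuned by editing data, which is one plausible reason for a dedicated change table database.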

The voice recognition unit 223 recognizes the content of the speech represented by the voice data received by the voice input receiving unit 221, and treats it as the recognized content. Specifically, it obtains a recognition phrase for the voice data using a speech recognition dictionary provided in advance in the storage unit 203. If no recognition phrase for the voice data can be obtained using that dictionary, speech recognition is judged to have failed.
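The success or failure rule for recognition can be illustrated with a simplified sketch. It assumes that an earlier acoustic stage, which is out of scope here, has already turned the voice data into candidate text; the dictionary contents are English renderings of phrases used in this embodiment, and `normalize` and `recognize` are hypothetical helper names.

```python
from typing import Optional

# Hypothetical dictionary of recognition phrases (English renderings).
RECOGNITION_DICTIONARY = {
    "good morning",
    "you're cute",
    "you're no good",
    "tell me about tokyo",
}


def normalize(text: str) -> str:
    """Lower-case, collapse whitespace, and drop trailing punctuation."""
    return " ".join(text.lower().split()).rstrip("!?.")


def recognize(candidate_text: str) -> Optional[str]:
    """Return the matching recognition phrase, or None when recognition fails."""
    phrase = normalize(candidate_text)
    return phrase if phrase in RECOGNITION_DICTIONARY else None
```

Returning `None` corresponds to the case where no recognition phrase can be obtained from the dictionary and recognition is judged to have failed.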

The selection unit 224 determines the response content based on the voice recognition unit 223's recognition result for the speech content. Specifically, the selection unit 224 refers to the response content database 232 stored in the storage unit 203 and, based on the mood parameter managed by the management unit 222, selects (determines) the response content (response information) corresponding to the speech content represented by the voice data.
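A minimal sketch of mood-based selection follows. It assumes that each piece of response information carries a mood range and that the first entry whose range contains the current mood parameter is chosen; the ranges, the 0 to 100 scale, and the names `ResponseInfo` and `select_response` are illustrative assumptions, and the answer phrases are English renderings of the "good morning" examples in this embodiment.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ResponseInfo:
    response_id: int
    recognition_phrase: str
    answer_phrase: str
    mood_min: int  # hypothetical lower bound of the associated mood range
    mood_max: int  # hypothetical upper bound of the associated mood range


RESPONSES = (
    ResponseInfo(1, "good morning", "Good morning! Let's do our best again today!", 67, 100),
    ResponseInfo(2, "good morning", "Good morning", 34, 66),
    ResponseInfo(3, "good morning", "Yaaawn. I'm still sleepy.", 0, 33),
)


def select_response(
    phrase: str, mood: int, responses: Sequence[ResponseInfo] = RESPONSES
) -> Optional[ResponseInfo]:
    """Pick the entry for the phrase whose mood range contains the current mood."""
    for info in responses:
        if info.recognition_phrase == phrase and info.mood_min <= mood <= info.mood_max:
            return info
    return None
```

With these assumed ranges, the same utterance yields a cheerful, neutral, or sleepy reply depending only on the device's own mood, not on the user's emotion.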

The response processing execution instruction unit 225 transmits the answer phrase data, which is the response content (response information) selected by the selection unit 224, to the doll 10 via the communication unit 201.

(Response Content Database)
FIG. 3 is a diagram illustrating the response content database 232 according to Embodiment 1.

Referring to FIG. 3, the response content database 232 is stored, as one example, in the storage unit 203 of the server 20 according to the embodiment.

Specifically, a plurality of pieces of response information are registered in the response content database 232. Each piece associates recognized content (a recognition phrase) with response content (an answer phrase). In this example, an identification number (response ID) is assigned to each combination of recognition phrase and answer phrase. The recognition phrases registered in the response content database 232 in this example are assumed to be registered in the speech recognition dictionary as well.
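The record layout described above (response ID, recognition phrase, answer phrase) can be sketched as a small in-memory structure. The `Entry` type, the index helpers, and the choice of a dictionary keyed by recognition phrase are assumptions for illustration; the sample rows are English renderings of entries with response IDs 4 to 7 from FIG. 3.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    response_id: int
    recognition_phrase: str
    answer_phrase: str


ENTRIES = [
    Entry(4, "You're cute", "Thank you! That makes me happy."),
    Entry(5, "You're cute", "Hmph. You're just flattering me, aren't you?"),
    Entry(6, "You're cute", "Yeah, yeah. Enough with the flattery."),
    Entry(7, "You're no good", "Sorry. I'll study up."),
]


def build_index(entries):
    """Group entries by recognition phrase so candidates can be looked up quickly."""
    index = defaultdict(list)
    for entry in entries:
        index[entry.recognition_phrase].append(entry)
    return dict(index)


def candidates_for(index, phrase):
    """Return every registered answer for a recognition phrase (possibly none)."""
    return index.get(phrase, [])
```

One recognition phrase maps to several candidate answers, which is what leaves room for a later selection step to choose among them.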

一例として、ここでは認識フレーズとして、「おはよう」、「かわいいね」、「ダメだね」、「東京のことを教えて」、・・・に対応して回答フレーズがそれぞれ関連付けられて格納されている。 As an example, answer phrases are stored here in association with each of the recognition phrases “good morning”, “cute”, “no good”, “Tell me about Tokyo”, and so on.

例えば、応答ＩＤ「１」の認識フレーズ「おはよう」に対応して回答フレーズ「おはよう！今日も１日頑張ろう！」が関連付けられて登録されている場合が示されている。 For example, a case is shown in which an answer phrase “Good morning! Let's do our best today also” is registered in association with the recognition phrase “Good morning” of the response ID “1”.

また、応答ＩＤ「２」の認識フレーズ「おはよう」に対応して回答フレーズ「おはよう」が関連付けられて登録されている場合が示されている。 In addition, a case is shown in which the answer phrase “good morning” is associated and registered in correspondence with the recognition phrase “good morning” of the response ID “2”.

また、応答ＩＤ「３」の認識フレーズ「おはよう」に対応して回答フレーズ「ふわぁー。まだ眠いよぉ」が関連付けられて登録されている場合が示されている。 Further, a case is shown in which an answer phrase “Fuwaa. Still sleepy yo” is associated and registered in correspondence with the recognition phrase “Good morning” of the response ID “3”.

また、応答ＩＤ「４」の認識フレーズ「かわいいね」に対応して回答フレーズ「ありがとう！嬉しいなぁ。」が関連付けられて登録されている場合が示されている。 In addition, a case is shown in which the answer phrase “Thank you! I'm so happy.” is registered in association with the recognition phrase “cute” of the response ID “4”.

また、応答ＩＤ「５」の認識フレーズ「かわいいね」に対応して回答フレーズ「ふーん。どうせお世辞でしょ？」が関連付けられて登録されている場合が示されている。 In addition, a case is shown in which the answer phrase “Hmm. You're just flattering me, aren't you?” is registered in association with the recognition phrase “cute” of the response ID “5”.

例えば、応答ＩＤ「６」の認識フレーズ「かわいいね」に対応して回答フレーズ「うん。お世辞はもういいから。」が関連付けられて登録されている場合が示されている。 For example, a case is shown in which the answer phrase “Yeah. That's enough flattery.” is registered in association with the recognition phrase “cute” of the response ID “6”.

また、応答ＩＤ「７」の認識フレーズ「ダメだね」に対応して回答フレーズ「ごめんね。勉強しておくよ。」が関連付けられて登録されている場合が示されている。 In addition, a case is shown in which an answer phrase “I'm sorry. I'll study.” Is registered in association with the recognition phrase “no good” of the response ID “7”.

また、応答ＩＤ「８」の認識フレーズ「ダメだね」に対応して回答フレーズ「ごめんね。」が関連付けられて登録されている場合が示されている。 In addition, a case is shown in which the answer phrase “I'm sorry” is registered in association with the recognition phrase “no good” of the response ID “8”.

また、応答ＩＤ「９」の認識フレーズ「ダメだね」に対応して回答フレーズ「はぁ。もう寝ようかな。」が関連付けられて登録されている場合が示されている。 In addition, a case is shown in which the answer phrase “Sigh. Maybe I'll just go to sleep.” is registered in association with the recognition phrase “no good” of the response ID “9”.

また、応答ＩＤ「１０」の認識フレーズ「東京のことを教えて」に対応して回答フレーズ「東京と言えば、隅田川の花火大会が有名だよ。」が関連付けられて登録されている場合が示されている。 In addition, a case is shown in which the answer phrase “Speaking of Tokyo, the Sumida River fireworks festival is famous.” is registered in association with the recognition phrase “Tell me about Tokyo” of the response ID “10”.

また、応答ＩＤ「１１」の認識フレーズ「東京のことを教えて」に対応して回答フレーズ「東京には、初詣の参拝客第１位の明治神宮があるよ。」が関連付けられて登録されている場合が示されている。 In addition, a case is shown in which the answer phrase “Tokyo has Meiji Jingu, which draws the most New Year shrine visitors.” is registered in association with the recognition phrase “Tell me about Tokyo” of the response ID “11”.

また、応答ＩＤ「１２」の認識フレーズ「東京のことを教えて」に対応して回答フレーズ「そのくらい自分で調べたら？」が関連付けられて登録されている場合が示されている。 In addition, a case is shown in which the answer phrase “Why don't you look that up yourself?” is registered in association with the recognition phrase “Tell me about Tokyo” of the response ID “12”.

そして、各認識フレーズに対応する回答フレーズに関して、本例においては機嫌範囲が対応付けられている。 And in this example, the mood range is matched with the answer phrase corresponding to each recognition phrase.

当該機嫌範囲は、同じ認識フレーズに対して、複数の回答フレーズが有る場合に選択する指標として用いられる。本例においては、管理部２２２で管理されている機嫌パラメータＰに基づいて回答フレーズが選択される。複数の回答フレーズの中から機嫌パラメータＰに対応する機嫌範囲の回答フレーズが選択される。 The mood range is used as the criterion for choosing among answer phrases when a plurality of answer phrases exist for the same recognition phrase. In this example, an answer phrase is selected based on the mood parameter P managed by the management unit 222: from the plurality of answer phrases, the one whose mood range corresponds to the mood parameter P is selected.

なお、必ずしも機嫌範囲に含まれる回答フレーズを選択するものではなく、相対的に選択される確率を高くするようにしても良い。これにより、パターン化された回答内容になることなく、ユーザとの間での自然なコミュニケーションを図ることが可能である。 Note that the answer phrase whose mood range contains the mood parameter need not always be selected; instead, its probability of being selected may be made relatively higher. This avoids stereotyped answers and enables natural communication with the user.
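
The selection rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the numeric mood ranges are hypothetical (the text refers to FIG. 3 for them but does not list values), and the names `ResponseEntry`, `select_answer`, `strict`, and `in_range_weight` are not from the source. With `strict=True` the phrase whose range contains P is always chosen; with `strict=False` it is merely weighted more heavily, which corresponds to the variant in which the in-range phrase is only more likely to be selected.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseEntry:
    response_id: int
    recognition_phrase: str
    answer_phrase: str
    mood_min: int  # assumed inclusive lower bound of the mood range
    mood_max: int  # assumed inclusive upper bound of the mood range


# Entries for "おはよう" from the text; the numeric ranges are hypothetical.
RESPONSES = [
    ResponseEntry(1, "おはよう", "おはよう！今日も１日頑張ろう！", 67, 100),
    ResponseEntry(2, "おはよう", "おはよう", 34, 66),
    ResponseEntry(3, "おはよう", "ふわぁー。まだ眠いよぉ", 0, 33),
]


def select_answer(phrase, mood, strict=True, in_range_weight=5.0, rng=random):
    """Pick an answer entry for `phrase` given the mood parameter P (`mood`)."""
    candidates = [e for e in RESPONSES if e.recognition_phrase == phrase]
    if not candidates:
        return None
    in_range = [e for e in candidates if e.mood_min <= mood <= e.mood_max]
    if strict and in_range:
        return in_range[0]
    # Variant: in-range phrases are only more likely, not guaranteed.
    weights = [in_range_weight if e in in_range else 1.0 for e in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

For instance, `select_answer("おはよう", 80)` returns the entry with response ID 1 under these assumed ranges, while a low mood value such as 10 yields the sleepy answer with response ID 3.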

また、認識フレーズが無い場合（ｎｕｌｌ）に対応して再応答を要求する回答フレーズ（再応答回答フレーズ）が設けられている。ここで、認識フレーズが無い場合とは、音声認識に失敗した場合を意味する。なお、音声認識に利用される辞書に登録されている認識フレーズが、応答内容データベース２３２に登録されていない場合、すなわち、音声認識は成功したが対応する認識フレーズが応答内容データベース２３２に登録されていない場合にも、認識フレーズが無い場合として処理するようにしても良い。 In addition, answer phrases that ask the user to speak again (re-response answer phrases) are provided for the case where there is no recognition phrase (null). Here, the case where there is no recognition phrase means that speech recognition has failed. Note that the case where a recognition phrase registered in the dictionary used for speech recognition is not registered in the response content database 232, that is, where speech recognition succeeded but the corresponding recognition phrase is not registered in the response content database 232, may also be processed as a case where there is no recognition phrase.

具体的には、応答ＩＤ「１００」に関して、認識フレーズが無い場合（ｎｕｌｌ）に回答フレーズ「なになに」が関連付けられて登録されている場合が示されている。 Specifically, for the response ID “100”, a case is shown in which the answer phrase “What was that?” is registered in association with the case where there is no recognition phrase (null).

また、応答ＩＤ「１０１」に関して、認識フレーズが無い場合（ｎｕｌｌ）に回答フレーズ「もう一度言って」が関連付けられて登録されている場合が示されている。当該認識フレーズが無い場合（ｎｕｌｌ）の回答フレーズを複数設けることによりパターン化された応答となることを回避することが可能である。 In addition, regarding the response ID “101”, a case where the answer phrase “say again” is associated and registered when there is no recognized phrase (null) is shown. By providing a plurality of answer phrases when there is no such recognition phrase (null), it is possible to avoid a patterned response.
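
The null case can be illustrated with a short Python sketch. The two re-response phrases and their IDs come from the text; the function names `retry_response` and `resolve_phrase` and the use of a uniform random choice are assumptions, since the text only says that several phrases are provided to avoid a patterned response.

```python
import random

# Re-response phrases registered for the "null" case (response IDs 100 and 101).
RETRY_PHRASES = {100: "なになに", 101: "もう一度言って"}


def retry_response(rng=random):
    """Return (response_id, phrase), chosen at random to avoid a fixed pattern."""
    response_id = rng.choice(sorted(RETRY_PHRASES))
    return response_id, RETRY_PHRASES[response_id]


def resolve_phrase(recognized, known_phrases):
    """Map a recognition result to a database key, or None for the null case.

    `recognized` is None when speech recognition failed. A phrase that was
    recognized but is not registered in the response content database is
    also treated as null, following the optional handling in the text.
    """
    if recognized is None or recognized not in known_phrases:
        return None
    return recognized
```

A caller would first run `resolve_phrase` on the recognition result and fall back to `retry_response` whenever it returns `None`.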

本例においては、ユーザに対する発話、また、ユーザ発話に対するユーザへの回答等のユーザに対応する応答処理を実行する場合に、装置の機嫌パラメータＰに基づいてユーザに対する発話あるいは回答等の応答を決定する方式について説明する。 In this example, a method is described in which, when response processing directed at the user is executed, such as speaking to the user or answering a user utterance, the response (utterance, answer, or the like) to the user is determined based on the mood parameter P of the device.

そして、本例においては、各認識フレーズに対応して、会話種別と機嫌変化度が登録されている場合が示されている。会話種別は、会話における認識フレーズの種類を示すものであり、機嫌パラメータＰを更新する際に利用される。また、機嫌変化度についても機嫌パラメータＰを更新する際に利用されるパラメータである。 And in this example, the case where conversation classification and a mood change degree are registered corresponding to each recognition phrase is shown. The conversation type indicates the type of recognition phrase in the conversation, and is used when the mood parameter P is updated. The mood change degree is also a parameter used when the mood parameter P is updated.

一例として、認識フレーズ「おはよう」に対応して会話種別「あいさつ」、機嫌変化度「０」がそれぞれ関連付けられて登録されている。 As an example, the conversation type “greeting” and the mood change degree “0” are associated and registered in correspondence with the recognition phrase “good morning”.

また、認識フレーズ「かわいいね」、「ダメだね」に対応して会話種別「評価」が関連付けられて登録されており、それぞれ認識フレーズの内容に従って異なる機嫌変化度がそれぞれ関連付けられて登録されている。 In addition, the conversation type “evaluation” is registered in association with the recognition phrases “cute” and “no good”, and a different mood change degree is associated with each according to the content of the recognition phrase.

また、認識フレーズ「東京のことを教えて」に対応して会話種別「質問」、機嫌変化度「０」がそれぞれ関連付けられて登録されている。 Corresponding to the recognition phrase “Tell me about Tokyo”, a conversation type “question” and a mood change degree “0” are associated and registered.

他の認識フレーズについても基本的に同様に登録されている。 The other recognition phrases are basically registered in the same manner.

機嫌変化度に関して、認識フレーズの内容が積極的な内容であれば、機嫌パラメータＰの値が大きくなるように機嫌変化度が付与される。一方、認識フレーズの内容が消極的な内容であれば、機嫌パラメータＰの値が低くなるように機嫌変化度が付与される。 Regarding the mood change degree, if the content of the recognition phrase is positive, a mood change degree that increases the value of the mood parameter P is assigned. Conversely, if the content of the recognition phrase is negative, a mood change degree that lowers the value of the mood parameter P is assigned.

例えば、選択部２２４は、管理部２２２で管理される装置の機嫌パラメータＰの値が大きい場合には、認識フレーズに対応して積極的あるいは快活な回答フレーズを選択する。 For example, when the value of the mood parameter P of the device managed by the management unit 222 is large, the selection unit 224 selects a positive or cheerful answer phrase corresponding to the recognition phrase.

一方で、選択部２２４は、管理部２２２で管理される装置の機嫌パラメータＰの値が低い場合には、認識フレーズに対応して消極的あるいは陰鬱な回答フレーズが選択される。 On the other hand, when the value of the mood parameter P of the device managed by the management unit 222 is low, the selection unit 224 selects a passive or gloomy answer phrase corresponding to the recognized phrase.

機嫌変化度等に基づく機嫌パラメータの算出処理により、機嫌パラメータの変化に応じた応答処理が実行される。これにより、ユーザとの間での自然なコミュニケーションを図ることが可能である。 A response process corresponding to the change in the mood parameter is executed by the mood parameter calculation process based on the mood change degree or the like. Thereby, natural communication with the user can be achieved.

（履歴記憶部） (History storage unit)
図４は、実施形態１に基づく履歴記憶部２３３について説明する図である。 FIG. 4 is a diagram illustrating the history storage unit 233 based on the first embodiment.

図４を参照して、当該履歴記憶部２３３は、一例として実施形態１に基づくサーバ２０の記憶部２０３に格納されている。 With reference to FIG. 4, the history storage unit 233 is stored in the storage unit 203 of the server 20 based on the first embodiment as an example.

具体的には、履歴記憶部２３３は、人形１０等の履歴に関する情報を格納している。本例においては、応答処理実行指示部２２５が選択部２２４により選択した応答内容（応答情報）である回答フレーズデータを通信部２０１を介して人形１０に送信した際に、当該履歴記憶部２３３に履歴に関する情報を格納するものとする。また、履歴記憶部２３３は、人形１０の情報取得部１１５から送信されたセンサ情報の履歴に関する情報を格納する。さらに、外部装置から取得した情報の履歴に関する情報を格納する。 Specifically, the history storage unit 233 stores information related to the history of the doll 10 and the like. In this example, when the response process execution instructing unit 225 transmits the answer phrase data, which is the response content (response information) selected by the selection unit 224, to the doll 10 via the communication unit 201, information related to that history is stored in the history storage unit 233. The history storage unit 233 also stores information related to the history of sensor information transmitted from the information acquisition unit 115 of the doll 10, as well as information related to the history of information acquired from external devices.

図４（Ａ）を参照して、ここでは、履歴記憶部２３３で記憶されている応答内容の履歴情報が示されている。 Referring to FIG. 4A, here, the history information of the response content stored in history storage unit 233 is shown.

サーバ２０は、複数の人形についてそれぞれ管理することが可能であり、それぞれの人形に対して固有の識別番号が割り当てられている。本例においては、一例として、人形のＩＤ（装置ＩＤ）として「１０」が割り当てられている人形の履歴がそれぞれ登録されている場合が示されている。 The server 20 can manage a plurality of dolls individually, and a unique identification number is assigned to each doll. In this example, a case is shown in which the history of the doll assigned “10” as its doll ID (device ID) is registered.

ここでは、発話された「時刻」、「応答ＩＤ」、「機嫌変化度」の情報が登録されている場合が示されている。 Here, the case where the information of the uttered "time", "response ID", and "degree of change in mood" is registered is shown.

「時刻」は、サーバ２０から人形１０を介してユーザに対して発話した時刻を意味する。なお、本例においては、ユーザに対して発話した時刻を意味するが、当該時刻に限られず発話処理（応答処理）に関する時刻が特定できればどのような時間であっても良い。例えば、サーバ２０がユーザからの音声データの入力を受け付けた時刻でも良いし、音声認識した時刻でも良い。 “Time” means the time at which the server 20 spoke to the user via the doll 10. Although in this example it means the time of the utterance to the user, it is not limited to this; any time may be used as long as a time related to the utterance process (response process) can be identified. For example, it may be the time at which the server 20 accepted the input of voice data from the user, or the time at which the speech was recognized.

「応答ＩＤ」は、一例としてサーバ２０が人形１０を介してユーザに対して発話した応答情報を特定する情報であり、基本的に応答内容データベース２３２の応答ＩＤに対応するものである。 The “response ID” is information for identifying the response information uttered by the server 20 to the user via the doll 10 as an example, and basically corresponds to the response ID in the response content database 232.

「機嫌変化度」は、機嫌パラメータＰを変更する際に利用するパラメータであり、応答ＩＤに関連付けられた機嫌変化度が示されている。 The “mood change degree” is a parameter used when changing the mood parameter P, and indicates the mood change degree associated with the response ID.

本例においては、一例として、人形の装置ＩＤ「１０」に関して、時刻「２０１４−０７−０１０６：３０：４２」に応答ＩＤ「２」に対応する応答情報に基づく発話が実行されたことが示されている。また、その際の機嫌変化度は「０」である。 In this example, as an example, it is shown that, for the doll with the device ID “10”, an utterance based on the response information corresponding to the response ID “2” was executed at the time “2014-07-01 06:30:42”. The mood change degree at that time is “0”.

人形の装置ＩＤ「１０」に関して、時刻「２０１４−０７−０１０６：３１：０５」に応答ＩＤ「１１０」に対応する応答情報に基づく発話が実行されたことが示されている。また、その際の機嫌変化度は「０」である。 For the doll with the device ID “10”, it is shown that an utterance based on the response information corresponding to the response ID “110” was executed at the time “2014-07-01 06:31:05”. The mood change degree at that time is “0”.

また、人形の装置ＩＤ「１０」に関して、時刻「２０１４−０７−０１０６：３１：２０」に応答ＩＤ「１３０」に対応する応答情報に基づく発話が実行されたことが示されている。その際の機嫌変化度は「０」である。 Also, for the doll with the device ID “10”, it is shown that an utterance based on the response information corresponding to the response ID “130” was executed at the time “2014-07-01 06:31:20”. The mood change degree at that time is “0”.

また、人形の装置ＩＤ「１０」に関して、時刻「２０１４−０７−０１０６：３０：４４」に応答ＩＤ「４」に対応する応答情報に基づく発話が実行されたことが示されている。また、その際の機嫌変化度は「＋５」である。 In addition, for the doll with the device ID “10”, it is shown that an utterance based on the response information corresponding to the response ID “4” was executed at the time “2014-07-01 06:30:44”. The mood change degree at that time is “+5”.
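
As a rough illustration of the response history described for FIG. 4A, the records could be modeled as below. The field layout (device ID, time, response ID, mood change degree) follows the text; the names `ResponseHistoryRecord`, `log_response`, and `records_for`, and the in-memory list, are assumptions made for this sketch only.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ResponseHistoryRecord:
    device_id: int     # doll (device) ID
    time: datetime     # time the response was uttered
    response_id: int   # key into the response content database
    mood_change: int   # mood change degree tied to the response ID


history = []


def log_response(device_id, time, response_id, mood_change):
    """Append one history record, as done when answer phrase data is sent."""
    record = ResponseHistoryRecord(device_id, time, response_id, mood_change)
    history.append(record)
    return record


# Two of the rows described for device ID 10.
log_response(10, datetime(2014, 7, 1, 6, 30, 42), 2, 0)
log_response(10, datetime(2014, 7, 1, 6, 30, 44), 4, +5)


def records_for(device_id):
    """Return one device's records in chronological order."""
    return sorted((r for r in history if r.device_id == device_id),
                  key=lambda r: r.time)
```

Keeping the device ID on every record is what lets a single server hold the histories of several dolls side by side.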

図４（Ｂ）を参照して、ここでは、履歴記憶部２３３で記憶されている外部から取得した情報（センサ情報も含む）の履歴情報が示されている。 Referring to FIG. 4B, here, history information of information (including sensor information) acquired from the outside stored in history storage unit 233 is shown.

なお、人形１０が通信部２０１を介して取得したセンサ情報は、センサ情報データベース２３４に格納される。センサ情報データベース２３４については後述する。 Sensor information acquired by the doll 10 via the communication unit 201 is stored in the sensor information database 234. The sensor information database 234 will be described later.

ここでは、取得した装置の「装置ＩＤ」、「時刻」、「内容」の情報が登録されている場合が示されている。 Here, a case is shown in which “device ID”, “time”, and “content” information of the acquired device is registered.

「装置ＩＤ」は、割り当てられた固有の識別番号であり、本例においては、「１０」が割り当てられている人形の履歴がそれぞれ登録されている場合が示されている。 “Apparatus ID” is an assigned unique identification number. In this example, the history of dolls to which “10” is assigned is registered.

「時刻」は、サーバ２０が人形１０からの情報の入力を受けた時刻を意味する。なお、本例においては、情報の入力を受けた時刻を意味するが、当該時刻に限られず、取得に関する時刻が特定できればどのような時間であっても良い。例えば、人形１０がセンサ１０８で検知した時刻でも良いし、情報取得部１１５が情報を送信した時刻でも良い。 “Time” means the time at which the server 20 received the input of information from the doll 10. Although in this example it means the time at which the information was received, it is not limited to this; any time may be used as long as a time related to the acquisition can be identified. For example, it may be the time at which the doll 10 detected the information with the sensor 108, or the time at which the information acquisition unit 115 transmitted the information.

「内容」は、一例としてサーバ２０が人形１０から受けたセンサ情報等を特定する情報であり、センサ情報の場合には、センサ情報データベース２３４のセンサＩＤに対応するものである。また、外部装置から取得した情報の場合には、当該取得情報が含まれていてもよい。 “Content” is, as an example, information identifying the sensor information or the like that the server 20 received from the doll 10; in the case of sensor information, it corresponds to the sensor ID in the sensor information database 234. In the case of information acquired from an external device, the acquired information itself may be included.

本例においては、一例として、人形の装置ＩＤ「１０」に関して、時刻「２０１４−０７−０１０９：５４：１０」にセンサＩＤ「１」に対応するセンサ情報を取得したことが示されている。 In this example, as an example, it is shown that sensor information corresponding to the sensor ID “1” is acquired at the time “2014-07-01 09:54:10” regarding the device ID “10” of the doll. .

また、人形の装置ＩＤ「１０」に関して、時刻「２０１４−０７−０１０９：５５：３３」にセンサＩＤ「２」に対応するセンサ情報を取得したことが示されている。 In addition, regarding the doll device ID “10”, the sensor information corresponding to the sensor ID “2” is acquired at the time “2014-07-01 09:55:33”.

また、人形の装置ＩＤ「１０」に関して、時刻「２０１４−０７−０１０９：５９：４２」にセンサＩＤ「３」に対応するセンサ情報を取得したことが示されている。 In addition, regarding the doll device ID “10”, it is indicated that the sensor information corresponding to the sensor ID “3” is acquired at the time “2014-07-01 09:59:42”.

また、人形の装置ＩＤ「１０」に関して、時刻「２０１４−０７−０１０９：５９：４３」にセンサＩＤ「４」に対応するセンサ情報を取得したことが示されている。 In addition, regarding the doll device ID “10”, the sensor information corresponding to the sensor ID “4” is acquired at the time “2014-07-01 09:59:43”.

また、人形の装置ＩＤ「１０」に関して、時刻「２０１４−０７−０１１０：００：０１」にセンサＩＤ「５」に対応するセンサ情報を取得したことが示されている。 In addition, regarding the doll device ID “10”, it is indicated that the sensor information corresponding to the sensor ID “5” is acquired at the time “2014-07-01 10:00:01”.

また、人形の装置ＩＤ「１０」に関して、時刻「２０１４−０７−０１１０：００：０１」にセンサＩＤ「６」に対応するセンサ情報を取得したことが示されている。 In addition, regarding the doll device ID “10”, it is indicated that the sensor information corresponding to the sensor ID “6” is acquired at the time “2014-07-01 10:00:01”.

また、人形の装置ＩＤ「１０」に関して、時刻「２０１４−０７−０１１２：００：００」に「天気「晴れ」」の情報を取得したことが示されている。 Further, regarding the doll device ID “10”, it is shown that the information of “weather“ sunny ”” is acquired at the time “2014-07-01 12:00:00”.

他の履歴についても同様である。 The same applies to other histories.

（センサ情報データベース） (Sensor information database)
図５は、実施形態１に基づくセンサ情報データベース２３４について説明する図である。 FIG. 5 is a diagram illustrating the sensor information database 234 based on the first embodiment.

図５を参照して、当該センサ情報データベース２３４は、一例として実施形態に基づくサーバ２０の記憶部２０３に格納されている。 Referring to FIG. 5, the sensor information database 234 is stored in the storage unit 203 of the server 20 based on the embodiment as an example.

具体的には、センサ情報データベース２３４は、人形１０等のセンサ１０８で検知されたセンサ情報を格納している。本例においては、通信部２０１を介して情報取得部１１５から送信されたセンサ情報を制御部２０２が受信した際に、記憶部２０３のセンサ情報データベース２３４に格納するものとする。 Specifically, the sensor information database 234 stores sensor information detected by the sensor 108 of the doll 10 or the like. In this example, when the control unit 202 receives, via the communication unit 201, the sensor information transmitted from the information acquisition unit 115, it stores the information in the sensor information database 234 of the storage unit 203.

センサ情報データベースに格納する際に、それぞれのセンサ情報に対して固有の識別番号が割り当てられている。本例においては、一例として、センサＩＤが昇順的に割り当てられている場合が示されている。そして、当該割り当てられたセンサＩＤが図４（Ｂ）で説明したように履歴記憶部２３３の外部から取得した情報の履歴に格納される。 When sensor information is stored in the sensor information database, a unique identification number is assigned to each piece of sensor information. In this example, a case is shown in which sensor IDs are assigned in ascending order. The assigned sensor ID is then stored in the history of externally acquired information in the history storage unit 233, as described with reference to FIG. 4B.

一例として、センサＩＤに対応して、「時刻」、「種別」、「値」がそれぞれ関連付けられて登録されている。 As an example, “time”, “type”, and “value” are associated and registered in correspondence with the sensor ID.

「時刻」は、サーバ２０が人形１０からのセンサ情報の入力を受けた時刻を意味する。なお、本例においては、センサ情報の入力を受けた時刻を意味するが、当該時刻に限られず、センサ取得に関する時刻が特定できればどのような時間であっても良い。例えば、人形１０がセンサ１０８で検知した時刻でも良いし、情報取得部１１５がセンサ情報を送信した時刻でも良い。 “Time” means the time at which the server 20 received the input of sensor information from the doll 10. Although in this example it means the time at which the sensor information was received, it is not limited to this; any time may be used as long as a time related to the sensor acquisition can be identified. For example, it may be the time at which the doll 10 detected the information with the sensor 108, or the time at which the information acquisition unit 115 transmitted the sensor information.

「種別」は、センサ情報の種別を意味し、一例として、種別「ｔｏｕｃｈ」は、人形１０に対するユーザの接触を検知したことを意味する。また、種別「ｓｍｉｌｅ」は、ユーザの微笑みを検知したことを意味する。また、種別「湿度」は、センサ１０８で取得した湿度を意味する。また、種別「温度」は、センサ１０８で取得した温度を意味する。 “Type” means the type of sensor information. As an example, the type “touch” means that the user's contact with the doll 10 has been detected. The type “smile” means that a smile of the user has been detected. The type “humidity” means the humidity acquired by the sensor 108. The type “temperature” means the temperature acquired by the sensor 108.

また、「値」は、種別「湿度」、「温度」の場合に検出された数値を意味する。 “Value” means the numerical value detected in the case of the types “humidity” and “temperature”.

本例においては、一例として、センサＩＤ「１」として、時刻「２０１４−０７−０１０９：５４：１０」に種別「ｔｏｕｃｈ」のセンサ情報を取得したことが示されている。 In this example, as an example, it is indicated that sensor information of type “touch” is acquired at time “2014-07-01 09:54:10” as sensor ID “1”.

また、センサＩＤ「２」として、時刻「２０１４−０７−０１０９：５５：３３」に種別「ｔｏｕｃｈ」のセンサ情報を取得したことが示されている。 Further, it is indicated that the sensor information of the type “touch” is acquired at the time “2014-07-01 09:55:33” as the sensor ID “2”.

また、センサＩＤ「３」として、時刻「２０１４−０７−０１０９：５９：４２」に種別「ｓｍｉｌｅ」のセンサ情報を取得したことが示されている。 Further, it is indicated that the sensor information of the type “smile” is acquired at the time “2014-07-01 09:59:42” as the sensor ID “3”.

また、センサＩＤ「４」として、時刻「２０１４−０７−０１０９：５９：４３」に種別「ｔｏｕｃｈ」のセンサ情報を取得したことが示されている。 Further, it is indicated that the sensor information of the type “touch” is acquired at the time “2014-07-01 09:59:43” as the sensor ID “4”.

また、センサＩＤ「５」として、時刻「２０１４−０７−０１１０：００：０１」に種別「温度」のセンサ情報を取得したことが示されている。また、値は「２８」を取得したことが示されている。 Further, it is indicated that the sensor information of the type “temperature” is acquired at the time “2014-07-01 10:00:01” as the sensor ID “5”. Further, it is indicated that the value “28” has been acquired.

また、センサＩＤ「６」として、時刻「２０１４−０７−０１１０：００：０１」に種別「湿度」のセンサ情報を取得したことが示されている。また、値は「５２」を取得したことが示されている。 Further, it is indicated that the sensor information of the type “humidity” is acquired at the time “2014-07-01 10:00:01” as the sensor ID “6”. Further, it is indicated that the value “52” is acquired.
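
The sensor information database and its ascending sensor IDs can be sketched as follows. The columns (sensor ID, time, type, value) and the six example rows follow the text; the class and method names (`SensorRecord`, `SensorInfoDatabase`, `store`) are hypothetical, and returning the new ID from `store` is one assumed way of letting the caller record that ID in the history of externally acquired information.

```python
import itertools
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class SensorRecord:
    sensor_id: int
    time: datetime
    kind: str               # "touch", "smile", "温度", "湿度", ...
    value: Optional[float]  # only set for numeric kinds


class SensorInfoDatabase:
    def __init__(self):
        self._next_id = itertools.count(1)  # ascending IDs starting at 1
        self.records = {}

    def store(self, time, kind, value=None):
        """Store one reading and return its newly assigned sensor ID."""
        record = SensorRecord(next(self._next_id), time, kind, value)
        self.records[record.sensor_id] = record
        return record.sensor_id  # the caller can log this ID in the history


db = SensorInfoDatabase()
first = db.store(datetime(2014, 7, 1, 9, 54, 10), "touch")
db.store(datetime(2014, 7, 1, 9, 55, 33), "touch")
db.store(datetime(2014, 7, 1, 9, 59, 42), "smile")
db.store(datetime(2014, 7, 1, 9, 59, 43), "touch")
temp_id = db.store(datetime(2014, 7, 1, 10, 0, 1), "温度", 28)
hum_id = db.store(datetime(2014, 7, 1, 10, 0, 1), "湿度", 52)
```

Storing only the sensor ID in the history and the full reading here keeps the history rows small while still allowing the reading to be looked up later.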

他のセンサ情報についても同様である。 The same applies to other sensor information.

（変更テーブルデータベース） (Change table database)
図６は、実施形態１に基づく変更テーブルデータベース２３１について説明する図である。 FIG. 6 is a diagram illustrating the change table database 231 based on the first embodiment.

図６（Ａ）には、会話種別連続回数と機嫌変化度との対応関係が示されている。 FIG. 6A shows the correspondence between the number of consecutive occurrences of a conversation type and the mood change degree.

会話種別が連続する場合にはマイナスの機嫌変化度となるように設定されている。 The table is set so that the mood change degree is negative when the same conversation type continues.

一例として、会話種別としての連続回数が１０回以上の場合には、機嫌変化度「−５」が対応付けられている。 As an example, when the number of consecutive conversation types is 10 or more, the mood change degree “−5” is associated.

また、連続回数が５回以上の場合には、機嫌変化度「−１」が対応付けられている。 When the number of consecutive times is 5 or more, the mood change degree “−1” is associated.

また、連続回数が０回の場合には、機嫌変化度「０」が対応付けられている。 Further, when the number of consecutive times is 0, the mood change degree “0” is associated.
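
The table of FIG. 6A amounts to a threshold lookup, sketched below in Python. The thresholds (10 or more, 5 or more) and their mood change degrees are from the text; the function name is hypothetical, and treating counts of 1 to 4 as “0” is an assumption because the text only lists the value for a count of 0.

```python
# (minimum consecutive count, mood change degree), checked from the top.
CONSECUTIVE_TABLE = [(10, -5), (5, -1)]


def mood_change_for_repeats(consecutive_count):
    """Return the mood change degree for a run of the same conversation type."""
    for minimum, change in CONSECUTIVE_TABLE:
        if consecutive_count >= minimum:
            return change
    return 0  # text gives 0 for a count of 0; 1 to 4 is assumed to be 0 too
```

Checking the larger threshold first matters here, because a count of 10 also satisfies the “5 or more” row.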

図６（Ｂ）には、不快指数と機嫌変化度との対応関係が示されている。 FIG. 6B shows the correspondence between the discomfort index and the mood change degree.

一例として、不快指数Ｑが高いほどマイナスの機嫌変化度となるように設定されている。 As an example, the table is set so that the higher the discomfort index Q, the more negative the mood change degree.

不快指数Ｑが７５より大きく８０以下（７５＜Ｑ≦８０）の場合に、機嫌変化度「−１」が対応付けられている。 When the discomfort index Q is greater than 75 and equal to or less than 80 (75 <Q ≦ 80), the mood change degree “−1” is associated.

不快指数Ｑが８０より大きく８５以下（８０＜Ｑ≦８５）の場合に、機嫌変化度「−５」が対応付けられている。 When the discomfort index Q is greater than 80 and less than or equal to 85 (80 <Q ≦ 85), the mood change degree “−5” is associated.

不快指数Ｑが８５より大きい（８５＜Ｑ）場合に、機嫌変化度「−１０」が対応付けられている。 When the discomfort index Q is greater than 85 (85 <Q), the mood change degree “−10” is associated.
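
The bands of FIG. 6B can be sketched as a lookup on Q. This section does not state how Q is computed, so the sketch assumes the widely used formula Q = 0.81T + 0.01H(0.99T − 14.3) + 46.3, with temperature T in degrees Celsius and relative humidity H in percent; the function names and the value “0” for Q ≦ 75 are likewise assumptions. Under those assumptions, a temperature of 28 and a humidity of 52 (the values in the sensor example, whose units the text does not give) yield Q of roughly 76, which falls in the 75 < Q ≦ 80 band.

```python
def discomfort_index(temp_c, humidity_pct):
    """Assumed formula: temperature in deg C, relative humidity in percent."""
    return 0.81 * temp_c + 0.01 * humidity_pct * (0.99 * temp_c - 14.3) + 46.3


def mood_change_for_discomfort(q):
    """Map the discomfort index Q to a mood change degree (FIG. 6B bands)."""
    if q > 85:
        return -10
    if q > 80:
        return -5
    if q > 75:
        return -1
    return 0  # assumed: the text lists no entry for Q <= 75
```

The comparisons use strict `>` so that the upper bound of each band (for example Q = 80) stays in the lower band, matching the ≦ in the text.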

図６（Ｃ）には、時間帯と機嫌変化度との対応関係が示されている。 FIG. 6C shows a correspondence relationship between the time zone and the mood change degree.

早朝や深夜の時間帯についてマイナスの機嫌変化度となるように設定されている。 It is set to have a negative mood change degree in the early morning and late night hours.

一例として、時間帯「０：００〜３：５９」の場合に、機嫌変化度「−１０」が対応付けられている。 As an example, in the case of the time zone “0:00 to 3:59”, the mood change degree “−10” is associated.

時間帯「４：００〜５：５９」の場合に、機嫌変化度「−５」が対応付けられている。 In the case of the time zone “4:00 to 5:59”, the mood change degree “−5” is associated.

時間帯「６：００〜６：５９」の場合に、機嫌変化度「−２」が対応付けられている。 In the case of the time zone “6:00 to 6:59”, the mood change degree “−2” is associated.

時間帯「２３：００〜２３：５９」の場合に、機嫌変化度「−２」が対応付けられている。 In the case of the time zone “23:00 to 23:59”, the mood change degree “−2” is associated.
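
The time-of-day table of FIG. 6C can be sketched in the same style. The four bands and their mood change degrees come from the text; the function name and the value “0” for hours that the text does not list are assumptions.

```python
from datetime import time

# (first hour, last hour, mood change degree); both hours inclusive.
TIME_BANDS = [(0, 3, -10), (4, 5, -5), (6, 6, -2), (23, 23, -2)]


def mood_change_for_time(t):
    """Return the mood change degree for the time band containing `t`."""
    for first_hour, last_hour, change in TIME_BANDS:
        if first_hour <= t.hour <= last_hour:
            return change
    return 0  # assumed for hours the text does not list
```

Because every band starts and ends on an hour boundary, comparing only `t.hour` is enough; for example, 3:59 still belongs to the first band.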

図６（Ｄ）には、曜日と機嫌変化度との対応関係が示されている。 FIG. 6D shows the correspondence between the day of the week and the mood change degree.

一例として、週の初めはマイナスの機嫌変化度、週末はプラスの機嫌変化度となるように設定されている。 As an example, a negative mood change degree is set at the beginning of the week, and a positive mood change degree is set at the weekend.

月曜日の場合に、機嫌変化度「−１０」が対応付けられている。 In the case of Monday, the mood change degree “−10” is associated.

金曜日の場合に、機嫌変化度「＋１０」が対応付けられている。 In the case of Friday, the mood change degree “+10” is associated.

土曜日の場合に、機嫌変化度「＋１０」が対応付けられている。 In the case of Saturday, the mood change degree “+10” is associated.

図６（Ｅ）には、天気と機嫌変化度との対応関係が示されている。 FIG. 6E shows a correspondence relationship between the weather and the mood change degree.

一例として、天気がいい日は、プラスの機嫌変化度、悪い日には、マイナスの機嫌変化度となるように設定されている。 As an example, a positive mood change degree is set on a day when the weather is good, and a negative mood change degree is set on a bad day.

晴れの場合に機嫌変化度「＋１０」が対応付けられている。 When the weather is fine, the mood change degree “+10” is associated.

くもりの場合に機嫌変化度「０」が対応付けられている。 In the case of cloudy weather, the mood change degree “0” is associated.

雨の場合に機嫌変化度「−１０」が対応付けられている。 In the case of rain, the mood change degree “−10” is associated.

（応答処理） (Response processing)
図７は、実施形態１に基づく対話システム１における応答処理の流れを示すシーケンス図である。 FIG. 7 is a sequence diagram showing a response process flow in the interactive system 1 based on the first embodiment.

図７に示されるように、ユーザは、人形１０に対して発話（ユーザ発話とも称する）する（シーケンスｓｑ０）。 As shown in FIG. 7, the user utters (also referred to as user utterance) to doll 10 (sequence sq0).

人形１０は、ユーザ発話に対して音声の入力を受け付ける（シーケンスｓｑ１）。具体的には、音声入力受付部１１４は、マイク１０３を介して外部からの音の入力を受け付ける。 The doll 10 receives an input of voice in response to the user utterance (sequence sq1). Specifically, the voice input receiving unit 114 receives a sound input from the outside via the microphone 103.

次に、人形１０は、音声データをサーバ２０に出力する（シーケンスｓｑ２）。具体的には、音声入力受付部１１４は、通信部１０１を介してサーバ２０に出力する。 Next, the doll 10 outputs audio data to the server 20 (sequence sq2). Specifically, the voice input reception unit 114 outputs to the server 20 via the communication unit 101.

次に、サーバ２０は、人形１０から送信された音声データを受信して音声認識を実行する（シーケンスｓｑ３）。具体的には、音声入力受信部２２１は、通信部２０１を介して音声データを受信して、音声認識部２２３に出力する。そして、音声認識部２２３は、音声内容を認識する。そして、音声認識部２２３は、認識結果を選択部２２４に出力する。 Next, server 20 receives the voice data transmitted from doll 10 and performs voice recognition (sequence sq3). Specifically, the voice input reception unit 221 receives voice data via the communication unit 201 and outputs the voice data to the voice recognition unit 223. Then, the voice recognition unit 223 recognizes the voice content. Then, the voice recognition unit 223 outputs the recognition result to the selection unit 224.

次に、サーバ２０は、認識結果に基づいて回答フレーズを決定する対話出力処理を実行する（シーケンスｓｑ４）。具体的には、選択部２２４は、回答フレーズを決定して応答処理実行指示部２２５に出力する。対話出力処理については後述する。 Next, the server 20 executes a dialog output process for determining an answer phrase based on the recognition result (sequence sq4). Specifically, the selection unit 224 determines an answer phrase and outputs it to the response process execution instruction unit 225. The dialog output process will be described later.

そして、サーバ２０は、決定した回答フレーズデータを人形１０に送信する（シーケンスｓｑ５）。具体的には、応答処理実行指示部２２５は、通信部２０１を介して選択部２２４により決定した回答フレーズデータを人形１０に送信する。本例においては、回答フレーズは一例として音声ファイルであるものとする。なお、テキスト形式のファイルであっても良い。他の例においても同様である。 Then, server 20 transmits the determined answer phrase data to doll 10 (sequence sq5). Specifically, the response process execution instruction unit 225 transmits the reply phrase data determined by the selection unit 224 to the doll 10 via the communication unit 201. In this example, the answer phrase is an audio file as an example. It may be a text file. The same applies to other examples.

次に、人形１０は、音声対話出力を実行する（シーケンスｓｑ６）。具体的には、応答処理実行部１１２は、通信部２０１を介して受信した回答フレーズデータに基づいてスピーカ１０４を介してユーザに応答（音声対話）する。すなわち、応答処理実行部１１２は、回答フレーズデータである音声ファイルを再生してスピーカ１０４により音声をユーザに応答（発話）する（シーケンスｓｑ６Ａ）。 Next, the doll 10 executes voice dialogue output (sequence sq6). Specifically, the response process execution unit 112 responds (voice conversation) to the user via the speaker 104 based on the answer phrase data received via the communication unit 201. That is, the response process execution unit 112 reproduces an audio file that is answer phrase data and responds (speaks) the audio to the user through the speaker 104 (sequence sq6A).

そして、ユーザは、人形１０からの応答処理に対する反応として、人形１０に対して発話（回答）する（シーケンスｓｑ６Ｂ）。 Then, as a reaction to the response process from the doll 10, the user speaks (replies) to the doll 10 (sequence sq6B).

人形１０は、応答処理に対するユーザからの音声の入力を受け付ける（シーケンスｓｑ７）。具体的には、音声入力受付部１１４は、マイク１０３を介して外部からの音の入力を受け付ける。 The doll 10 receives a voice input from the user for the response process (sequence sq7). Specifically, the voice input receiving unit 114 receives a sound input from the outside via the microphone 103.

次に、人形１０は、音声データをサーバ２０に出力する（シーケンスｓｑ８）。具体的には、音声入力受付部１１４は、通信部１０１を介してサーバ２０に出力する。 Next, the doll 10 outputs audio data to the server 20 (sequence sq8). Specifically, the voice input reception unit 114 outputs to the server 20 via the communication unit 101.

次に、サーバ２０は、人形１０から送信された音声データを受信して音声認識を実行する（シーケンスｓｑ９）。具体的には、音声入力受信部２２１は、通信部２０１を介して音声データを受信して、音声認識部２２３に出力する。そして、音声認識部２２３は、音声内容を認識する。そして、音声認識部２２３は、認識結果を選択部２２４に出力する。 Next, server 20 receives the voice data transmitted from doll 10 and performs voice recognition (sequence sq9). Specifically, the voice input reception unit 221 receives voice data via the communication unit 201 and outputs the voice data to the voice recognition unit 223. Then, the voice recognition unit 223 recognizes the voice content. Then, the voice recognition unit 223 outputs the recognition result to the selection unit 224.

次に、サーバ２０は、認識結果に基づいて回答フレーズを決定する対話出力処理を実行する（シーケンスｓｑ１０）。具体的には、選択部２２４は、回答フレーズを決定して応答処理実行指示部２２５に出力する。対話出力処理については後述する。 Next, the server 20 executes a dialog output process for determining an answer phrase based on the recognition result (sequence sq10). Specifically, the selection unit 224 determines an answer phrase and outputs it to the response process execution instruction unit 225. The dialog output process will be described later.

そして、サーバ２０は、決定した回答フレーズデータを人形１０に送信する（シーケンスｓｑ１１）。具体的には、応答処理実行指示部２２５は、通信部２０１を介して選択部２２４により決定した回答フレーズデータを人形１０に送信する。 Then, server 20 transmits the determined answer phrase data to doll 10 (sequence sq11). Specifically, the response process execution instruction unit 225 transmits the reply phrase data determined by the selection unit 224 to the doll 10 via the communication unit 201.

次に、人形１０は、音声対話出力を実行する（シーケンスｓｑ１２）。具体的には、応答処理実行部１１２は、通信部２０１を介して受信した回答フレーズデータに基づいてスピーカ１０４を介してユーザに応答（音声対話）する。すなわち、応答処理実行部１１２は、回答フレーズデータである音声ファイルを再生してスピーカ１０４により音声をユーザに応答（発話）する（シーケンスｓｑ１２Ａ）。以降、同様の処理が繰り返される。 Next, the doll 10 executes voice dialogue output (sequence sq12). Specifically, the response process execution unit 112 responds (voice conversation) to the user via the speaker 104 based on the answer phrase data received via the communication unit 201. That is, the response process execution unit 112 reproduces an audio file that is answer phrase data and responds (speaks) the audio to the user through the speaker 104 (sequence sq12A). Thereafter, the same processing is repeated.
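
One round trip of the sequence in FIG. 7 can be sketched as below. This is a simplified stand-in, not the described implementation: the network hop is replaced by a direct method call, the “audio” is already-decoded text, speech recognition is a dictionary lookup, and the class and method names (`Server`, `Doll`, `handle_audio`, `on_user_utterance`) are hypothetical. The comments map each step to the sequence labels in the text.

```python
class Server:
    """Stands in for server 20; recognition and selection are stubbed."""

    def __init__(self, answers):
        self.answers = answers  # recognition phrase -> answer phrase

    def recognize(self, audio):
        # Placeholder: treat the "audio" payload as already-decoded text.
        return audio if audio in self.answers else None

    def handle_audio(self, audio):
        phrase = self.recognize(audio)       # sq3 / sq9: speech recognition
        if phrase is None:
            return "もう一度言って"            # re-response request
        return self.answers[phrase]          # sq4 / sq10, then sq5 / sq11


class Doll:
    """Stands in for doll 10: forwards audio and plays back the reply."""

    def __init__(self, server):
        self.server = server
        self.spoken = []

    def on_user_utterance(self, audio):      # sq1 / sq7: accept voice input
        reply = self.server.handle_audio(audio)  # sq2 / sq8: send to server
        self.spoken.append(reply)            # sq6 / sq12: playback via speaker
        return reply
```

Calling `on_user_utterance` repeatedly mirrors the repetition of sq1 to sq6 as sq7 to sq12, with the doll acting only as an input and output terminal while the server decides what is said.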

（対話出力処理） (Dialog output processing)
図８は、実施形態１に基づくサーバ２０の対話出力処理を実行するフロー図である。 FIG. 8 is a flowchart of the dialog output process executed by the server 20 based on the first embodiment.

図８を参照して、当該フロー図は、記憶部２０３に格納されているプログラムを実行して制御部２０２の各部が機能することにより実行される処理である。 Referring to FIG. 8, the flowchart shows a process that is carried out when the program stored in the storage unit 203 is executed and each unit of the control unit 202 functions.

まず、音声認識が成功したかどうかを判断する（ステップＳ１）。具体的には、選択部２２４は、音声認識部２２３から音声認識結果として認識フレーズが通知されたか否かを判断する。 First, it is determined whether or not the voice recognition is successful (step S1). Specifically, the selection unit 224 determines whether or not a recognition phrase is notified from the voice recognition unit 223 as a voice recognition result.

ステップＳ１において、音声認識が成功したと判断した場合（ステップＳ１においてＹＥＳ）には、次に、回答フレーズが複数有るかどうかを判断する（ステップＳ２）。具体的には、選択部２２４は、応答内容データベース２３２（図３）を参照して、認識フレーズに対応する回答フレーズが複数登録されているか否かを判断する。 If it is determined in step S1 that the speech recognition is successful (YES in step S1), it is next determined whether or not there are a plurality of answer phrases (step S2). Specifically, the selection unit 224 refers to the response content database 232 (FIG. 3) and determines whether or not a plurality of answer phrases corresponding to the recognized phrase are registered.

ステップＳ２において、回答フレーズが複数有ると判断した場合（ステップＳ２においてＹＥＳ）には、機嫌パラメータを取得する（ステップＳ３）。具体的には、選択部２２４は、管理部２２２に指示して機嫌パラメータを要求する。管理部２２２は、機嫌パラメータを算出して、算出結果を選択部２２４に出力する。機嫌パラメータの算出処理については後述する。 If it is determined in step S2 that there are a plurality of answer phrases (YES in step S2), a mood parameter is acquired (step S3). Specifically, the selection unit 224 requests the mood parameter from the management unit 222. The management unit 222 calculates the mood parameter and outputs the calculation result to the selection unit 224. The mood parameter calculation process will be described later.

そして、次に、回答フレーズを選択する（ステップＳ４）。具体的には、一例として選択部２２４は、管理部２２２から出力された機嫌パラメータに基づいて応答内容データベース２３２（図３）を参照して認識フレーズに対応する回答フレーズを選択する。 Next, an answer phrase is selected (step S4). Specifically, as an example, the selection unit 224 selects an answer phrase corresponding to the recognized phrase with reference to the response content database 232 (FIG. 3) based on the mood parameter output from the management unit 222.

そして、次に、出力処理を実行する（ステップＳ５）。具体的には、選択部２２４は、選択した回答フレーズを応答処理実行指示部２２５に出力する。応答処理実行指示部２２５は、選択部２２４が選択（決定）した回答フレーズデータを通信部２０１を介して人形１０に出力する。 Next, an output process is executed (step S5). Specifically, the selection unit 224 outputs the selected answer phrase to the response process execution instruction unit 225. The response process execution instructing unit 225 outputs the reply phrase data selected (determined) by the selection unit 224 to the doll 10 via the communication unit 201.

そして、処理を終了する（リターン）。
一方、ステップＳ２において、回答フレーズが複数無いと判断した場合（ステップＳ２においてＮＯ）には、回答フレーズを決定する（ステップＳ７）。具体的には、選択部２２４は、応答内容データベース２３２（図３）を参照して認識フレーズに対応する回答フレーズを選択（決定）する。 Then, the process ends (return).
On the other hand, if it is determined in step S2 that there are not a plurality of answer phrases (NO in step S2), an answer phrase is determined (step S7). Specifically, the selection unit 224 selects (determines) an answer phrase corresponding to the recognized phrase with reference to the response content database 232 (FIG. 3).

そして、処理を終了する（リターン）。
一方、ステップＳ１において、音声認識が成功しなかったと判断した場合（ステップＳ１においてＮＯ）には、再応答回答フレーズを決定する（ステップＳ６）。 Then, the process ends (return).
On the other hand, if it is determined in step S1 that the speech recognition was not successful (NO in step S1), a re-response reply phrase is determined (step S6).

具体的には、選択部２２４は、音声認識が成功しなかったと判断した場合、応答内容データベース２３２（図３）を参照して認識フレーズが無い場合（ｎｕｌｌ）に対応する再応答回答フレーズを選択（決定）する。例えば、再度、ユーザからの応答を得るために例えば「なになに」、「もう一度ゆって」等の再応答回答フレーズを選択（決定）する。また、特に当該再応答回答フレーズに限られず他のフレーズ、たとえば、「さっきの良かったでしょう」等でも良い。 Specifically, when the selection unit 224 determines that the speech recognition was not successful, it refers to the response content database 232 (FIG. 3) and selects (determines) the re-response reply phrase corresponding to the case where there is no recognized phrase (null). For example, to prompt the user to speak again, it selects (determines) a re-response reply phrase such as “What was that?” or “Say it again”. The phrase is not limited to such re-response reply phrases; another phrase, for example “That last one was good, wasn't it?”, may also be used.

そして、処理を終了する（リターン）。
当該処理により、機嫌パラメータに基づいて応答内容データベース２３２から回答フレーズを選択して、サーバ２０から回答フレーズデータが人形１０に出力されて発話される。 Then, the process ends (return).
With this process, an answer phrase is selected from the response content database 232 based on the mood parameters, and the answer phrase data is output from the server 20 to the doll 10 and uttered.

これにより、機嫌パラメータの変化に応じた応答処理（発話）が実行されることにより、ユーザとの間での自然なコミュニケーションを図ることが可能である。 Thereby, it is possible to achieve natural communication with the user by executing the response process (speech) according to the change in the mood parameter.
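The flow of steps S1 to S7 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the table contents, the mood ranges, the phrase texts, and the names `RESPONSE_DB` and `select_answer` are all assumptions made for the example, and treating a phrase that is missing from the table like a recognition failure is also an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

NEG_INF, POS_INF = float("-inf"), float("inf")

# Hypothetical stand-in for the response content database 232.
# Key: recognized phrase (None models the "no recognized phrase (null)" entry).
# Value: list of (exclusive lower bound, inclusive upper bound, answer phrase).
RESPONSE_DB: Dict[Optional[str], List[Tuple[float, float, str]]] = {
    "good morning": [
        (5.0, POS_INF, "Good morning! I feel great today!"),
        (-5.0, 5.0, "Good morning."),
        (NEG_INF, -5.0, "...morning."),
    ],
    "good night": [(NEG_INF, POS_INF, "Good night.")],
    None: [(NEG_INF, POS_INF, "Say it again?")],
}


def select_answer(recognized: Optional[str], get_mood: Callable[[], float]) -> str:
    """Return the answer phrase for a recognition result (steps S1 to S7)."""
    # Step S1: did recognition yield a phrase? Unknown phrases are treated
    # like a failure here, which is an assumption of this sketch.
    if recognized is None or recognized not in RESPONSE_DB:
        return RESPONSE_DB[None][0][2]        # step S6: re-response reply phrase
    candidates = RESPONSE_DB[recognized]
    if len(candidates) == 1:                  # step S2: only one answer phrase
        return candidates[0][2]               # step S7
    mood = get_mood()                         # step S3: request the mood parameter
    for low, high, phrase in candidates:      # step S4: select by mood range
        if low < mood <= high:
            return phrase
    return candidates[0][2]                   # defensive fallback
```

The mood parameter is requested lazily through `get_mood`, so it is only computed when more than one answer phrase is registered, which mirrors the branch at step S2.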

なお、上記においては、選択部２２４は、応答内容データベース２３２（図３）を参照して認識フレーズが無い場合（ｎｕｌｌ）に対応する再応答回答フレーズを通信部２０１を介して人形１０に出力する場合について説明したが、履歴記憶部２３３を参照して前回出力した回答フレーズデータを再出力するようにしても良い。 In the above, the selection unit 224 refers to the response content database 232 (FIG. 3) and outputs a re-response reply phrase corresponding to the case where there is no recognized phrase (null) to the doll 10 via the communication unit 201. Although the case has been described, the answer phrase data output last time with reference to the history storage unit 233 may be output again.

なお、音声認識部２２３は、音声内容の認識結果（音声認識結果）として得られる認識の確度（確からしさを示す度合）を示す信頼度を算出することも可能である。当該信頼度が低い場合には、認識フレーズが無いと判断するようにしても良い。なお、音声認識部２２３における音声認識結果の信頼度の判定方法としては、例えば、予め複数用意されている、所定の言葉（フレーズ）を示す音声波形モデル（音響モデル）と音声データの示す波形との一致度を判定し、最も高い一致度を信頼度とする判定方法などを用いることができる。なお、本発明はこれに限定されるものではなく、他の方式を用いることもできる。 Note that the speech recognition unit 223 can also calculate a reliability that indicates the accuracy (degree of certainty) of the speech recognition result. When the reliability is low, it may be determined that there is no recognized phrase. As a method for determining the reliability of the speech recognition result in the speech recognition unit 223, for example, the degree of match between the waveform of the speech data and each of a plurality of speech waveform models (acoustic models) prepared in advance, each representing a predetermined word (phrase), can be determined, and the highest degree of match can be used as the reliability. The present invention is not limited to this method, and other methods can also be used.
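As an illustration of the reliability check, the sketch below takes a set of match scores (one per registered phrase), uses the highest score as the reliability, and reports "no recognized phrase" when that reliability falls below a threshold. The function name `recognize`, the score scale of 0 to 1, and the threshold value 0.6 are assumptions for the example; the source does not specify them, nor how the match scores themselves are computed.

```python
from typing import Dict, Optional, Tuple


def recognize(match_scores: Dict[str, float],
              threshold: float = 0.6) -> Tuple[Optional[str], float]:
    """Return (recognized phrase or None, reliability).

    match_scores maps each registered phrase to the degree of match between
    its acoustic model and the input waveform (assumed to be in [0, 1]).
    """
    if not match_scores:
        return None, 0.0
    # The highest degree of match is used as the reliability.
    phrase, reliability = max(match_scores.items(), key=lambda item: item[1])
    if reliability < threshold:
        # Low reliability: treat the input as having no recognized phrase.
        return None, reliability
    return phrase, reliability
```

A result of `None` would then lead to the re-response reply phrase branch of the dialogue output process.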

また、サーバ２０からの回答フレーズデータに基づいて人形１０から応答処理を実行させる場合、応答処理に時間がかかることが想定される。したがって、「え〜っと」等の音声を発話させたり、他の応答処理をさせることで、ユーザに違和感を与えることを軽減し、人形１０に対して親近感を抱かせることが可能となる。すなわち、より自然なコミュニケーションを図ることが可能である。このようなつなぎの音声を発話する等の応答処理を一定時間ごとに実行してもよい。このような応答処理は、予め定められた応答でもよいし、いくつかのパターンの中から選択されるものでもよく、また、その選択はランダムに選択されるものでもよい。このようなつなぎの音声を発話する等の応答処理は応答速度の面で人形１０により実行させる方がより好ましいが、サーバ２０の指示により実行する方式を採用することも可能である。具体的には、図７のシーケンス図のシーケンスｓｑ２において、サーバ２０が人形１０からの音声データを受信した際に、当該つなぎの音声を発話する等の応答処理を実行するように、サーバ２０から人形１０に対して指示する構成を採用するようにしても良い。なお、以下の形態についても同様に適用可能である。 Further, when the doll 10 executes the response process based on the answer phrase data from the server 20, the response is expected to take some time. Therefore, having the doll utter a filler such as “um...” or perform another response process in the meantime reduces the user's sense of discomfort and helps the user feel an affinity for the doll 10. That is, more natural communication can be achieved. A response process such as uttering such a filler voice may be executed at regular intervals. Such a response process may be a predetermined response or may be selected from several patterns, and the selection may be random. In terms of response speed, it is preferable that the doll 10 itself executes a response process such as uttering the filler voice, but a method in which it is executed at the instruction of the server 20 can also be adopted. Specifically, in sequence sq2 of the sequence diagram of FIG. 7, when the server 20 receives the voice data from the doll 10, the server 20 may instruct the doll 10 to execute a response process such as uttering the filler voice. The same applies to the embodiments described below.

（機嫌パラメータ取得処理）
図９は、実施形態１に基づくサーバ２０の機嫌パラメータ取得処理を実行するフロー図である。 (Mood parameter acquisition process)
FIG. 9 is a flowchart for executing the mood parameter acquisition process of the server 20 based on the first embodiment.

図９を参照して、当該フロー図は、記憶部２０３に格納されているプログラムを実行して制御部２０２の管理部２２２により実行される処理である。 With reference to FIG. 9, the flowchart is a process executed by the management unit 222 of the control unit 202 by executing a program stored in the storage unit 203.

まず、機嫌パラメータＰを算出するための評価値を算出する（ステップＳ１０）。
まず、機嫌パラメータは１つのパラメータで表現することとし、ここでは一例として、−１０〜＋１０の間の値で表現する。すなわち、−１０が最も機嫌が悪い状態であり、＋１０が最も機嫌が良い状態となる。機嫌は多くの要因に影響を受けると考えられるため、一例として、機嫌パラメータは種々の要因の線形結合として算出する。 First, an evaluation value for calculating the mood parameter P is calculated (step S10).
First, the mood parameter is expressed by one parameter. Here, as an example, it is expressed by a value between −10 and +10. That is, −10 is the state with the worst mood, and +10 is the state with the best mood. Since mood is considered to be influenced by many factors, as an example, the mood parameter is calculated as a linear combination of various factors.

例えば、要因としては、直近の会話頻度、直近の会話内容、直近のセンサ情報、現在日時、外部情報の５つを用いる。直近の会話頻度、直近の会話内容、直近のセンサ情報、現在日時、外部情報のそれぞれの評価値をＶ１，Ｖ２，Ｖ３，Ｖ４，Ｖ５とする。 For example, as the factors, five of the latest conversation frequency, the latest conversation content, the latest sensor information, the current date and time, and external information are used. Assume that the evaluation values of the latest conversation frequency, the latest conversation content, the latest sensor information, the current date and time, and the external information are V1, V2, V3, V4, and V5.

次に、評価値に基づいて機嫌パラメータＰを算出する（ステップＳ１２）。
本例においては、機嫌パラメータＰは、次式で算出する。 Next, the mood parameter P is calculated based on the evaluation value (step S12).
In this example, the mood parameter P is calculated by the following equation.

機嫌パラメータＰ＝Ｋ１×Ｖ１＋Ｋ２×Ｖ２＋Ｋ３×Ｖ３＋Ｋ４×Ｖ４＋Ｋ５×Ｖ５
ここで、Ｋ１〜Ｋ５は、それぞれの要因の重み付け値であり、この重み付け値については任意に変更することが可能である。 Mood parameter P = K1 × V1 + K2 × V2 + K3 × V3 + K4 × V4 + K5 × V5
Here, K1 to K5 are weighting values of the respective factors, and the weighting values can be arbitrarily changed.
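The weighted sum above can be written as a short Python function. The function name `mood_parameter` and the sample evaluation values are made up for illustration; only the formula P = K1×V1 + ... + K5×V5 comes from the text.

```python
from typing import Sequence


def mood_parameter(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return P = K1*V1 + ... + Kn*Vn for evaluation values V and weights K."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(k * v for k, v in zip(weights, values))


# Example evaluation values V1..V5 (made up for illustration).
values = [10.0, 0.0, -5.0, 5.0, 0.0]
weights = [0.2] * 5  # equal weights, so P is the mean of the five values
p = mood_parameter(values, weights)
```

If every evaluation value is limited to the range −10 to +10 and the weights are non-negative and sum to 1, P also stays within −10 to +10, so no extra clamping is needed in that case.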

そして、算出結果に基づいて機嫌パラメータを更新する（ステップＳ１４）。管理部２２２は、更新した機嫌パラメータＰを選択部２２４に出力する。 Then, the mood parameter is updated based on the calculation result (step S14). The management unit 222 outputs the updated mood parameter P to the selection unit 224.

そして、処理を終了する（エンド）。
管理部２２２の評価値Ｖ１〜Ｖ５の算出方法について説明する。 Then, the process ends (END).
A method for calculating the evaluation values V1 to V5 of the management unit 222 will be described.

直近の会話頻度の評価値Ｖ１については、まず、履歴記憶部２３３を参照して、応答ＩＤから記録されている直近（例えば７２時間）の会話回数Ｔ１を取得する。 For the most recent conversation frequency evaluation value V1, first, the history storage unit 233 is referred to, and the most recent (for example, 72 hours) conversation count T1 recorded from the response ID is acquired.

会話回数Ｔ１を用いて、会話頻度による評価値Ｖ１を次式により算出する。
評価値Ｖ１＝Ｔ１／１０−１０
なお、Ｖ１の上限値は１０、下限値は−１０とする。他の評価値Ｖ２〜Ｖ５についても同様である。 Using the conversation count T1, an evaluation value V1 based on the conversation frequency is calculated by the following equation.
Evaluation value V1 = T1 / 10 − 10
The upper limit value of V1 is 10 and the lower limit value is −10. The same applies to the other evaluation values V2 to V5.

すなわち、会話回数が２００回以上であれば評価値Ｖ１＝１０となり、会話回数が０回であればＶ１＝−１０となる。 That is, the evaluation value V1 = 10 when the number of conversations is 200 or more, and V1 = −10 when the number of conversations is zero.
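Applying the formula V1 = T1 / 10 − 10 literally, together with the stated upper limit of 10 and lower limit of −10, gives the following sketch. The helper names `clamp` and `conversation_frequency_value` are hypothetical.

```python
def clamp(x: float, low: float = -10.0, high: float = 10.0) -> float:
    """Limit x to the range [low, high]."""
    return max(low, min(high, x))


def conversation_frequency_value(t1: int) -> float:
    """Evaluation value V1 from the recent conversation count T1."""
    return clamp(t1 / 10 - 10)
```

With this formula, 100 conversations give V1 = 0, and any count of 200 or more is capped at V1 = 10.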

直近の会話内容の評価値Ｖ２については、履歴記憶部２３３を参照して、記録されている直近（例えば２４時間）の機嫌変化度の値（合計値）を取得する。 For the evaluation value V2 of the latest conversation content, the history storage unit 233 is referred to, and the recorded value (total value) of the latest mood change degree (for example, 24 hours) is acquired.

なお、履歴記憶部２３３を参照して、応答ＩＤに対応する直近の会話種別を取得し、図６（Ａ）の変更テーブルデータベース２３１を参照して同じ会話種別が連続している場合にはマイナス評価として先の合計値と加算するようにしても良い。 Note that the most recent conversation types corresponding to the response IDs may be acquired by referring to the history storage unit 233, and, when the same conversation type occurs consecutively, a negative evaluation obtained by referring to the change table database 231 of FIG. 6(A) may be added to the above total value.

直近のセンサ情報の評価値Ｖ３については、履歴記憶部２３３およびセンサ情報データベース２３４を参照して、記録されている直近（例えば２４時間）のセンサ情報を取得する。 For the evaluation value V3 of the latest sensor information, the latest sensor information recorded (for example, 24 hours) is acquired by referring to the history storage unit 233 and the sensor information database 234.

撫でられた回数（種別「ｔｏｕｃｈ」）分だけプラス評価、笑顔を検出した回数（種別「ｓｍｉｌｅ」）分だけプラス評価を行う。また、検出された温度および湿度に基づいて不快指数Ｑを算出して、図６（Ｂ）の変更テーブルデータベース２３１を参照して、機嫌変化度を算出して、これらの値の合計値を取得する。 A positive evaluation is given for each time the doll was stroked (type “touch”) and for each time a smile was detected (type “smile”). In addition, the discomfort index Q is calculated from the detected temperature and humidity, the corresponding mood change degree is obtained by referring to the change table database 231 of FIG. 6(B), and the total of these values is acquired.
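A rough sketch of the sensor-based evaluation value V3 follows. The text does not give the discomfort index formula or the contents of FIG. 6(B), so the example uses a commonly cited temperature-humidity formula for the discomfort index and a made-up mapping from Q to a mood change degree; the +1 per touch or smile is likewise an assumption.

```python
def discomfort_index(temp_c: float, humidity_pct: float) -> float:
    """Commonly used discomfort index formula (not taken from the source)."""
    return 0.81 * temp_c + 0.01 * humidity_pct * (0.99 * temp_c - 14.3) + 46.3


def sensor_value(touch_count: int, smile_count: int,
                 temp_c: float, humidity_pct: float) -> float:
    """Evaluation value V3 from recent sensor information, limited to [-10, 10]."""
    q = discomfort_index(temp_c, humidity_pct)
    # Hypothetical stand-in for the change table of FIG. 6(B).
    if q >= 80:
        climate_change = -3.0
    elif q >= 75:
        climate_change = -1.0
    else:
        climate_change = 0.0
    # Assumed +1 for each detected touch and each detected smile.
    total = touch_count * 1.0 + smile_count * 1.0 + climate_change
    return max(-10.0, min(10.0, total))
```

For instance, 30 °C at 70 % humidity gives Q of about 81.4 under this formula, which the made-up table maps to a negative mood change.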

現在日時の評価値Ｖ４については、現在日時が特定の日付/時間帯に基づいて評価する。 The evaluation value V4 of the current date and time is determined according to whether the current date and time fall on a specific date or within a specific time period.

図６（Ｃ）および（Ｄ）を参照して、対応する特定の日付／時間帯に基づいて、機嫌変化度を算出する。 Referring to FIGS. 6C and 6D, the mood change degree is calculated based on the corresponding specific date / time zone.

外部情報の評価値Ｖ５については、履歴記憶部２３３を参照して、外部装置から取得した情報に基づいて算出する。本例においては、図６（Ｅ）を参照して、天気情報に基づく機嫌変化度を算出する。 The external information evaluation value V5 is calculated based on the information acquired from the external device with reference to the history storage unit 233. In this example, referring to FIG. 6E, the mood change degree based on the weather information is calculated.

上記評価値Ｖ１〜Ｖ５の算出結果に基づいて機嫌パラメータＰを算出する。一例として、重み付け値Ｋ１〜Ｋ５を全て０．２とすれば、平均として機嫌パラメータＰを算出することが可能である。 The mood parameter P is calculated based on the calculation results of the evaluation values V1 to V5. As an example, if the weight values K1 to K5 are all 0.2, the mood parameter P can be calculated as an average.

なお、機嫌パラメータＰの算出方式は、上記方式に限られず、種々の方式を採用することが可能である。例えば、本例においては、評価値Ｖ１〜Ｖ５に基づいて機嫌パラメータＰを算出する方式について説明したが、いずれか１つの評価値に基づいて機嫌パラメータＰを算出するようにしても良く、その組み合わせも任意である。また、上記においては、機嫌パラメータＰの算出方式として、各評価値を線形加算する方式について説明したが、各評価値をそれぞれ乗算し、乗算した値を機嫌パラメータの値として算出するようにしても良い。 Note that the method of calculating the mood parameter P is not limited to the above, and various methods can be adopted. For example, although this example describes calculating the mood parameter P from the evaluation values V1 to V5, the mood parameter P may be calculated from any one of the evaluation values, or from any combination of them. Also, although the above describes linearly adding the evaluation values, the evaluation values may instead be multiplied together and the product used as the value of the mood parameter.

なお、本例においては、人形１０とサーバ２０とが協働して動作する対話システム１の構成について説明したが、音声認識等のサーバ２０の機能を人形１０に含めてスタンドアローンで動作する対話装置を実現するようにしても良い。 In this example, the configuration of the interactive system 1 in which the doll 10 and the server 20 operate in cooperation has been described. However, the functions of the server 20, such as voice recognition, may be included in the doll 10 to realize an interactive device that operates stand-alone.

また、本例においては、ユーザからの発話に対する応答処理に対して、機嫌パラメータＰに基づく応答処理を実行する方式について説明したが、特にこれに限られず、ユーザからの発話ではなく、独り言のような所定周期における自動応答処理についても機嫌パラメータＰに基づく応答処理を実行するようにしてもよい。 Further, this example has described executing a response process based on the mood parameter P in response to an utterance from the user. However, the invention is not limited to this: an automatic response process executed at a predetermined cycle, such as the doll talking to itself rather than responding to an utterance from the user, may also be executed based on the mood parameter P.

＜実施形態２＞
上記の実施形態１においては、機嫌パラメータに基づいて回答フレーズを変更する方式について説明した。 <Embodiment 2>
In the first embodiment above, a method of changing the answer phrase based on the mood parameter was described.

実施形態２においては、さらに、音声合成パラメータも変更する方式について説明する。 In the second embodiment, a method of additionally changing the speech synthesis parameters will be described.

図１０は、実施形態２に基づく音声合成パラメータテーブルを説明する図である。
当該テーブルは、一例として応答内容データベース２３２に格納されているものとする。 FIG. 10 is a diagram illustrating a speech synthesis parameter table based on the second embodiment.
Assume that the table is stored in the response content database 232 as an example.

図１０を参照して、「機嫌範囲」と、「音量」、「ピッチ」、「話速」とがそれぞれ対応付けられている場合が示されている。本例においては、機嫌パラメータが高いほど「音量」、「ピッチ」、「話速」が大きく、機嫌パラメータが低いほど「音量」、「ピッチ」、「話速」が小さくなるように設定されている。 Referring to FIG. 10, a case is shown in which a “mood range” is associated with “volume”, “pitch”, and “speech speed”. In this example, the higher the mood parameter, the larger the “volume”, “pitch”, and “speech speed” are set, and the lower the mood parameter, the smaller they are set.

本例においては、基準値を「１．０」としており、音量の値が小さいほど音が小さく、値が大きいほど音が大きくなるものとする。また、ピッチの値が小さいほど音程が低く、大きいほど音程が高くなるものとする。また、話速の値が小さいほど再生スピードが遅く、大きいほど再生スピードが速くなるものとする。 In this example, it is assumed that the reference value is “1.0”, the sound is smaller as the volume value is smaller, and the sound is larger as the value is larger. Also, the pitch is lower as the pitch value is smaller, and the pitch is higher as the pitch value is larger. Further, it is assumed that the smaller the speech speed value, the slower the reproduction speed, and the larger the value, the faster the reproduction speed.

具体的には、機嫌パラメータＰの機嫌範囲が−６未満の場合には、音量「０．９」、ピッチ「０．９」、話速「０．９」に設定されている場合が示されている。 Specifically, when the mood parameter P is less than −6, the volume is set to “0.9”, the pitch to “0.9”, and the speech speed to “0.9”.

また、機嫌パラメータＰの機嫌範囲が−６以上であり、−２未満の場合には、音量「０．９５」、ピッチ「０．９５」、話速「１．０」に設定されている場合が示されている。 When the mood parameter P is −6 or more and less than −2, the volume is set to “0.95”, the pitch to “0.95”, and the speech speed to “1.0”.

また、機嫌パラメータＰの機嫌範囲が−２以上であり、２未満の場合には、音量「１．０」、ピッチ「１．０」、話速「１．０」に設定されている場合が示されている。 When the mood parameter P is −2 or more and less than 2, the volume is set to “1.0”, the pitch to “1.0”, and the speech speed to “1.0”.

また、機嫌パラメータＰの機嫌範囲が２以上であり、６未満の場合には、音量「１．０５」、ピッチ「１．１」、話速「１．０」に設定されている場合が示されている。 When the mood parameter P is 2 or more and less than 6, the volume is set to “1.05”, the pitch to “1.1”, and the speech speed to “1.0”.

また、機嫌パラメータＰの機嫌範囲が６以上の場合には、音量「１．１」、ピッチ「１．２」、話速「１．１」に設定されている場合が示されている。 When the mood parameter P is 6 or more, the volume is set to “1.1”, the pitch to “1.2”, and the speech speed to “1.1”.

したがって、機嫌パラメータＰが低い場合には、暗い感じのトーンの再生となり、機嫌パラメータＰが高い場合には、明るい感じのトーンの再生となる。 Accordingly, when the mood parameter P is low, a dark tone is reproduced, and when the mood parameter P is high, a bright tone is reproduced.
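The speech synthesis parameter table of FIG. 10 can be expressed as a range lookup. The sketch below encodes the five mood ranges and their values as described in the text; the function name `synthesis_params`, the dictionary keys, and the use of Python's `bisect` module are choices made for this example only.

```python
import bisect
from typing import Dict

# Lower bounds of the 2nd to 5th mood ranges (each range is "bound <= P < next").
_BOUNDS = [-6, -2, 2, 6]

# (volume, pitch, speech speed) per mood range, lowest range first.
_PARAMS = [
    (0.9, 0.9, 0.9),     # P < -6
    (0.95, 0.95, 1.0),   # -6 <= P < -2
    (1.0, 1.0, 1.0),     # -2 <= P < 2
    (1.05, 1.1, 1.0),    # 2 <= P < 6
    (1.1, 1.2, 1.1),     # 6 <= P
]


def synthesis_params(p: float) -> Dict[str, float]:
    """Return the speech synthesis parameters for mood parameter P."""
    # bisect_right counts the bounds that are <= p, which is the row index.
    volume, pitch, speed = _PARAMS[bisect.bisect_right(_BOUNDS, p)]
    return {"volume": volume, "pitch": pitch, "speed": speed}
```

Because `bisect_right` counts the bounds that are less than or equal to P, a value exactly on a boundary (for example P = −6) falls into the upper range, matching the "or more / less than" wording of the table.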

実施形態２において、サーバ２０の選択部２２４は、選択した回答フレーズを応答処理実行指示部２２５に出力するとともに、図１０の音声合成パラメータテーブルを参照して、機嫌パラメタＰに応じた音声合成パラメータ（音量、ピッチ、話速のデータ）を出力する。 In the second embodiment, the selection unit 224 of the server 20 outputs the selected answer phrase to the response process execution instruction unit 225 and, referring to the speech synthesis parameter table of FIG. 10, also outputs the speech synthesis parameters (volume, pitch, and speech speed data) corresponding to the mood parameter P.

応答処理実行指示部２２５は、選択部２２４が選択（決定）した回答フレーズデータとともに音声合成パラメータを通信部２０１を介して人形１０に出力する。 The response process execution instructing unit 225 outputs the speech synthesis parameters to the doll 10 via the communication unit 201 together with the answer phrase data selected (determined) by the selection unit 224.

人形１０の応答処理実行部１１２は、通信部２０１を介して受信した回答フレーズデータおよび音声合成パラメータに基づいてスピーカ１０４を介してユーザに応答（音声対話）する。 The response processing execution unit 112 of the doll 10 responds (voice dialogue) to the user via the speaker 104 based on the answer phrase data and the voice synthesis parameter received via the communication unit 201.

当該処理により、機嫌パラメータに基づく回答フレーズが、機嫌パラメータに基づく音声合成パラメータにより人形１０から再生出力されて発話される。 With this processing, the answer phrase based on the mood parameter is reproduced and output from the doll 10 by the speech synthesis parameter based on the mood parameter and is uttered.

なお、本例においては、機嫌パラメータに基づいて回答フレーズおよび音声合成パラメータがともに選択される場合について説明したが、特にこれに限られず、音声合成パラメータのみが選択され回答フレーズは共通のものを利用することも可能である。 In this example, the case where both the answer phrase and the speech synthesis parameter are selected based on the mood parameter has been described. However, the present invention is not limited to this, and only the speech synthesis parameter is selected and the same answer phrase is used. It is also possible to do.

＜実施形態３＞
上記の実施形態においては、ユーザの発話に対する応答内容として音声再生する場合について説明したが、特に音声再生する場合に限られず、他の方式による応答内容としても良い。 <Embodiment 3>
In the above embodiments, the case where voice is reproduced as the response content to the user's utterance has been described. However, the response is not limited to voice reproduction, and response content in other forms may be used.

図１１は、実施形態３に基づく応答内容データベース２３２Ａについて説明する図である。 FIG. 11 is a diagram illustrating the response content database 232A based on the third embodiment.

図１１を参照して、当該応答内容データベース２３２Ａは、一例として実施形態３に基づくサーバ２０の備える記憶部２０３に格納されている。 Referring to FIG. 11, the response content database 232A is stored in the storage unit 203 included in the server 20 based on the third embodiment as an example.

具体的には、応答内容データベース２３２Ａには、複数の応答情報が登録されている。具体的には、認識内容（認識フレーズ）と応答内容（回答フレーズ）とが関連付けられて登録されている。本例においては、それぞれの認識フレーズと回答フレーズとの組み合わせに対して識別番号（応答ＩＤ）が割り当てられている。なお、一例として本例における応答内容データベース２３２に登録されている認識フレーズは、音声認識に利用される辞書にも同様に登録されているものとする。 Specifically, a plurality of pieces of response information are registered in the response content database 232A. More specifically, recognition content (recognition phrase) and response content (answer phrase) are registered in association with each other. In this example, an identification number (response ID) is assigned to each combination of recognition phrase and answer phrase. As an example, the recognition phrases registered in the response content database 232 in this example are assumed to be registered in the dictionary used for speech recognition as well.

一例として、ここでは認識フレーズとして、「おはよう」・・・に対応して回答フレーズがそれぞれ関連付けられて格納されている。 As an example, answer phrases are stored here in association with the recognition phrases “good morning”, and so on.

例えば、応答ＩＤ「１」の認識フレーズ「おはよう」に対応して応答態様「動作パターン１」が関連付けられて登録されている場合が示されている。 For example, the response mode “motion pattern 1” is registered in association with the recognition phrase “good morning” of response ID “1”.

また、応答ＩＤ「２」の認識フレーズ「おはよう」に対応して応答態様「動作パターン２」が関連付けられて登録されている場合が示されている。 The response mode “motion pattern 2” is registered in association with the recognition phrase “good morning” of response ID “2”.

また、応答ＩＤ「３」の認識フレーズ「おはよう」に対応して応答態様「動作パターン３」が関連付けられて登録されている場合が示されている。 The response mode “motion pattern 3” is registered in association with the recognition phrase “good morning” of response ID “3”.

ここで、応答態様「動作パターン１」〜「動作パターン３」は、サーバ２０が人形１０に対して所定の動作パターンの動き（移動処理）を実行するように指示することを意味する。 Here, the response modes “motion pattern 1” to “motion pattern 3” mean that the server 20 instructs the doll 10 to execute a motion (movement process) of a predetermined motion pattern.

例えば、「動作パターン１」は、駆動部１０６により人形１０が手を所定期間動かす動きを実行するパターンを意味する。 For example, “motion pattern 1” means a pattern in which the drive unit 106 moves the hands of the doll 10 for a predetermined period.

「動作パターン２」は、駆動部１０６により人形１０の足を所定期間動かす動きを実行するパターンを意味する。 “Motion pattern 2” means a pattern in which the drive unit 106 moves the feet of the doll 10 for a predetermined period.

「動作パターン３」は、駆動部１０６により人形１０の頭を所定期間動かす動きを実行するパターンを意味する。 “Motion pattern 3” means a pattern in which the drive unit 106 moves the head of the doll 10 for a predetermined period.

そして、各認識フレーズに対応する応答態様に関して、本例においては機嫌変化度が対応付けられている。 And in this example, the mood change degree is matched regarding the response mode corresponding to each recognition phrase.

機嫌変化度に関して、認識フレーズの内容が積極的な内容であれば、機嫌パラメータＰの値が大きくなるように機嫌変化度が付与される。一方、認識フレーズの内容が消極的な内容であれば、機嫌パラメータＰの値が低くなるように機嫌変化度が付与される。上記したように機嫌変化度等に基づく機嫌パラメータの算出処理により、機嫌パラメータの変化に応じた応答処理が実行される。 Regarding the mood change degree, if the content of the recognition phrase is positive, the mood change degree is given so that the value of the mood parameter P increases. On the other hand, if the content of the recognition phrase is passive content, the mood change degree is given so that the value of the mood parameter P becomes low. As described above, the response process according to the change in the mood parameter is executed by the mood parameter calculation process based on the mood change degree or the like.

当該機嫌範囲は、同じ認識フレーズに対して、複数の回答フレーズが有る場合に選択する指標として用いられる。本例においては、管理部２２２で管理されている機嫌パラメータＰに基づいて回答フレーズが選択される。機嫌パラメータＰがどの機嫌範囲に属するかが判断されて、対応する回答フレーズが選択される。 The mood range is used as an index to be selected when there are a plurality of answer phrases for the same recognition phrase. In this example, an answer phrase is selected based on the mood parameter P managed by the management unit 222. It is determined which mood range the mood parameter P belongs to, and a corresponding answer phrase is selected.

たとえば、機嫌範囲として機嫌パラメータＰが５より大きい場合には、応答ＩＤ「１」が選択される。また、機嫌範囲として機嫌パラメータＰが−５より大きく、５以下の場合には、応答ＩＤ「２」が選択される。また、機嫌範囲として機嫌パラメータＰが−５以下の場合には、応答ＩＤ「３」が選択される。 For example, when the mood parameter P is greater than 5, response ID “1” is selected. When the mood parameter P is greater than −5 and not greater than 5, response ID “2” is selected. When the mood parameter P is −5 or less, response ID “3” is selected.
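The selection of a motion pattern by mood range can be sketched as below. The thresholds (greater than 5, greater than −5 up to 5, −5 or less) and the hand, feet, and head motions follow the text; the function name `select_motion` and the returned description strings are hypothetical.

```python
from typing import Tuple

# Response ID -> motion pattern executed by the drive unit of the doll.
MOTION_PATTERNS = {
    1: "motion pattern 1: move hands",
    2: "motion pattern 2: move feet",
    3: "motion pattern 3: move head",
}


def select_motion(mood: float) -> Tuple[int, str]:
    """Return (response ID, motion pattern) for the phrase "good morning"."""
    if mood > 5:
        response_id = 1
    elif mood > -5:          # -5 < mood <= 5
        response_id = 2
    else:                    # mood <= -5
        response_id = 3
    return response_id, MOTION_PATTERNS[response_id]
```

In a server-driven design, the server would send only the response ID or the pattern name, and the doll would translate it into drive commands.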

＜実施形態４＞
ユーザとの間での自然なコミュニケーションを図るために、画一的な音声再生ではなく、各装置毎にそれぞれ対応するキャラクタを割り当てることも可能である。 <Embodiment 4>
In order to achieve natural communication with the user, it is possible to assign a character corresponding to each device instead of uniform audio reproduction.

図１２は、実施形態４に基づくキャラクタデータベースについて説明する図である。
図１２を参照して、本例におけるキャラクタデータベースは、記憶部２０３に格納されているものとする。 FIG. 12 is a diagram illustrating a character database based on the fourth embodiment.
Referring to FIG. 12, it is assumed that the character database in this example is stored in storage unit 203.

キャラクタデータベースには、複数のキャラクタ情報が登録されている。具体的には、キャラクタＩＤに関連づけられたキャラクタ情報（名前、話者、音量、ピッチ、話速、応援チーム）が関連付けられて登録されている。 A plurality of pieces of character information are registered in the character database. Specifically, character information (name, speaker, volume, pitch, speaking speed, support team) associated with the character ID is associated and registered.

具体的には、キャラクタＩＤに対応して、「名前」、「話者」、「音量」、「ピッチ」、「話速」、「応援チーム」がそれぞれ関連付けられて登録されている。 Specifically, “name”, “speaker”, “volume”, “pitch”, “speaking speed”, and “support team” are associated and registered in correspondence with the character ID.

一例として、キャラクタＩＤが「１」の場合には、名前「ココ」、話者「Ｗｏｍａｎ１」、音量「１．０」、ピッチ「１．０」、話速「１．０」、応援チーム「Ｎｕｌｌ」が登録されている場合が示されている。 As an example, for character ID “1”, the name “Coco”, the speaker “Woman1”, the volume “1.0”, the pitch “1.0”, the speech speed “1.0”, and the support team “Null” are registered.

キャラクタＩＤが「２」の場合には、名前「はなこ」、話者「Ｗｏｍａｎ２」、音量「１．０」、ピッチ「１．０５」、話速「０．９」、応援チーム「Ｎｕｌｌ」が登録されている場合が示されている。 When the character ID is “2”, the name “Hanako”, the speaker “Woman2”, the volume “1.0”, the pitch “1.05”, the speech speed “0.9”, and the support team “Null” The registered case is shown.

キャラクタＩＤが「３」の場合には、名前「げんき」、話者「ｍａｎ１」、音量「１．２」、ピッチ「１．０」、話速「１．１」、応援チーム「阪神」が登録されている場合が示されている。 When the character ID is “3”, the name “Genki”, the speaker “man1”, the volume “1.2”, the pitch “1.0”, the speech speed “1.1”, and the support team “Hanshin” The registered case is shown.

装置ＩＤが「１０」の人形１０のキャラクタＩＤが「１」に割り当てられている場合には、名前「ココ」、話者「Ｗｏｍａｎ１」、音量「１．０」、ピッチ「１．０」、話速「１．０」、応援チーム「Ｎｕｌｌ」のキャラクタ特性としてユーザと対話することが可能である。 When the character ID of the doll 10 whose device ID is “10” is assigned to “1”, the name “Coco”, speaker “Woman1”, volume “1.0”, pitch “1.0”, It is possible to interact with the user as the character characteristics of the speaking speed “1.0” and the support team “Null”.

名前は、自己紹介する際に用いられる。話者は、音声合成処理する際の基準となる音声パターンデータのことを意味する。話者「Ｗｏｍａｎ１」は、女性パターンデータ１を指し示す。応援チームは、キャラクタの個性として有する野球球団の応援チームを指し示す。 The name is used when the character introduces itself. The speaker refers to the voice pattern data used as the basis for speech synthesis; the speaker “Woman1” indicates female pattern data 1. The support team indicates the baseball team that the character roots for as part of its personality.

本例においては、「Ｎｕｌｌ」なので特になしとする。
一方、応援チームが「阪神」の場合には、阪神を応援チームとする。 In this example, the support team is “Null”, so the character has no particular support team.
On the other hand, when the support team is “Hanshin”, the character supports Hanshin.

サーバ２０は、通信部２０１を介して所定のタイミングで野球球団の勝敗結果に関する情報を取得する。たとえば、前日の勝敗結果を取得して、応援チームが勝った場合に機嫌パラメータを上昇させる。あるいは、応援チームが負けた場合に、機嫌パラメータを下降させる。 The server 20 acquires information on the result of winning or losing the baseball team at a predetermined timing via the communication unit 201. For example, the winning / losing result of the previous day is acquired, and the mood parameter is increased when the support team wins. Or, when the support team loses, the mood parameter is lowered.

当該処理により、画一的な音声再生ではなく、キャラクタ毎に個性を持たせることにより、キャラクタの機嫌パラメータの変化に応じた応答処理（発話）が実行されることにより、ユーザとの間での自然なコミュニケーションを図ることが可能である。 With this process, response processing (utterance) according to changes in the character's mood parameters is executed by giving each character a personality, rather than uniform audio reproduction. Natural communication is possible.
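The character database and the win/loss adjustment can be sketched as follows. The three character entries reproduce the values described for FIG. 12; the class name `Character`, the function `adjust_mood`, the adjustment size of 2.0, and the direct modification of the mood value are assumptions for illustration, since the text only says that the mood parameter is raised on a win and lowered on a loss.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Character:
    name: str
    speaker: str
    volume: float
    pitch: float
    speed: float
    team: Optional[str]  # None corresponds to "Null" (no support team)


CHARACTERS: Dict[int, Character] = {
    1: Character("Coco", "Woman1", 1.0, 1.0, 1.0, None),
    2: Character("Hanako", "Woman2", 1.0, 1.05, 0.9, None),
    3: Character("Genki", "man1", 1.2, 1.0, 1.1, "Hanshin"),
}


def adjust_mood(mood: float, character: Character,
                results: Dict[str, bool], delta: float = 2.0) -> float:
    """Raise or lower the mood according to the support team's latest result.

    results maps a team name to True (won) or False (lost).
    """
    if character.team is None or character.team not in results:
        return mood  # no support team, or no result available
    change = delta if results[character.team] else -delta
    return max(-10.0, min(10.0, mood + change))
```

A character without a support team is unaffected by game results, so two dolls given the same input can end up in different moods.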

＜実施形態５＞
図１３は、実施形態５に基づくサーバの構成について説明する図である。 <Embodiment 5>
FIG. 13 is a diagram illustrating a configuration of a server based on the fifth embodiment.

図１３を参照して、本例においては、サーバが複数設けられている場合が示されている。 Referring to FIG. 13, in this example, a case where a plurality of servers are provided is shown.

本例においては、一例としてサーバ２０Ａと、サーバ２０Ｂとが設けられている場合が示されている。 In this example, the case where the server 20A and the server 20B are provided is shown as an example.

上記の構成においては、音声認識と音声認識に対する回答フレーズ（応答態様）とを決定する処理とを同じサーバで実行する場合について説明したが、一方で、当該処理をそれぞれ独立のサーバで実行することも可能である。 In the configurations above, the case where voice recognition and the process of determining the answer phrase (response mode) for the recognition result are executed by the same server has been described. However, these processes can also be executed by separate, independent servers.

具体的には、サーバ２０Ａにおいて音声データに対する音声認識を実行し、サーバ２０Ｂにおいて回答フレーズデータを人形１０に出力する構成としてもよい。 Specifically, the server 20A may perform voice recognition on the voice data, and the server 20B may output answer phrase data to the doll 10.

例えば、人形１０から音声データをサーバ２０Ａに送信する（１）。サーバ２０Ａが音声データの音声認識を実行する（２）。そして、サーバ２０Ａが人形１０に対して認識フレーズを送信する（３）。 For example, voice data is transmitted from the doll 10 to the server 20A (1). The server 20A executes voice recognition of the voice data (2). Then, the server 20A transmits the recognition phrase to the doll 10 (3).

人形１０がサーバ２０Ａから認識フレーズを受信して、別のサーバ２０Ｂに当該認識フレーズを送信する（４）。 The doll 10 receives the recognition phrase from the server 20A, and transmits the recognition phrase to another server 20B (4).

サーバ２０Ｂは、人形１０から認識フレーズを受信して、当該認識フレーズに対応する回答フレーズを決定する（５）。そして、サーバ２０Ｂは、人形に対して回答フレーズデータを送信する（６）。 The server 20B receives the recognition phrase from the doll 10 and determines the answer phrase corresponding to the recognition phrase (5). The server 20B then transmits the answer phrase data to the doll 10 (6).
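Steps (1) to (6) can be simulated in a single process as follows. All class and method names are hypothetical, network transport is replaced by direct method calls, and the "recognition" simply decodes the payload as UTF-8 text, which is a placeholder and not a real speech recognizer.

```python
from typing import Dict, Optional


class RecognitionServer:
    """Plays the role of server 20A: audio in, recognized phrase out."""

    def recognize(self, audio: bytes) -> Optional[str]:
        # Placeholder for real speech recognition (step 2).
        text = audio.decode("utf-8").strip().lower()
        return text or None


class AnswerServer:
    """Plays the role of server 20B: recognized phrase in, answer phrase out."""

    def __init__(self, answers: Dict[str, str], fallback: str = "Say it again?"):
        self.answers = answers
        self.fallback = fallback

    def answer(self, phrase: Optional[str]) -> str:
        if phrase is None:
            return self.fallback
        return self.answers.get(phrase, self.fallback)   # step (5)


class Doll:
    """Plays the role of the doll 10, relaying data between the two servers."""

    def __init__(self, recognizer: RecognitionServer, answerer: AnswerServer):
        self.recognizer = recognizer
        self.answerer = answerer

    def converse(self, audio: bytes) -> str:
        phrase = self.recognizer.recognize(audio)   # steps (1) to (3)
        return self.answerer.answer(phrase)         # steps (4) to (6)


doll = Doll(RecognitionServer(), AnswerServer({"good morning": "Good morning!"}))
reply = doll.converse(b"Good morning")
```

Keeping the doll as the relay means the two servers do not need to know about each other, at the cost of one extra round trip per utterance.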

なお、本例においては、サーバ２０Ａが音声データの音声認識を実行した認識フレーズを人形１０に対して送信する場合について説明したが、認識フレーズに限られず音声認識の結果を示す情報であればどのようなものでも良い。例えば、サーバ２０Ｂに格納されている回答フレーズにアクセスするために必要なアクセス情報（ＵＲＬ（Uniform Resource Locator）等）であってもよい。例えば、当該アクセス情報（ＵＲＬ）を人形１０は、サーバ２０Ａから受信して、サーバ２０Ｂにアクセスすることにより回答フレーズをサーバ２０Ｂから取得する構成としてもよい。また、アクセス情報に限られず、サーバ２０Ｂに格納されている回答フレーズがファイル形式で保存されている場合には、サーバ２０Ａからの音声認識の結果を示す情報として、ファイル名を指定する情報であってもよい。例えば、当該ファイル名を人形１０は、サーバ２０Ａから受信して、サーバ２０Ｂに対してファイル名を指定して情報を要求することにより、回答フレーズに関連するファイルをサーバ２０Ｂから取得することが可能である。 In this example, the case where the server 20A transmits to the doll 10 the recognition phrase obtained by performing voice recognition on the voice data has been described. However, the information is not limited to the recognition phrase and may be any information that indicates the result of the voice recognition. For example, it may be access information (such as a URL (Uniform Resource Locator)) needed to access an answer phrase stored in the server 20B. In that case, the doll 10 may receive the access information (URL) from the server 20A and obtain the answer phrase from the server 20B by accessing the server 20B. The information is also not limited to access information: when the answer phrases stored in the server 20B are saved as files, the information indicating the result of the voice recognition from the server 20A may be information that specifies a file name. In that case, the doll 10 receives the file name from the server 20A and requests the file from the server 20B by specifying the file name, thereby obtaining the file related to the answer phrase from the server 20B.

Similarly, text information obtained by converting the recognition phrase into text may be transmitted as the information indicating the result of the speech recognition from the server 20A. The doll 10 may extract the recognition phrase from the text information and access the server 20B to obtain the answer phrase. Alternatively, the doll 10 may transmit the text information to the server 20B, and the server 20B may analyze the text information containing the recognition phrase, determine an answer phrase based on the analysis result, and transmit it to the doll 10.

The configuration described above transmits answer phrase data from the server 20B to the doll 10. Specifically, an audio file serving as the answer phrase data is transmitted, and the doll 10 speaks in accordance with that audio file. However, the answer phrase data is not limited to an audio file: text information may be transmitted as the answer phrase data, and the doll 10 may analyze the text information (for example, with a so-called read-aloud function) and speak (response processing).

In this example, speech recognition is performed by the server 20. Alternatively, the doll 10 may perform the speech recognition, determine within the doll 10 the answer phrase for the result, and obtain that answer phrase from the server 20B. This can be realized by providing the storage unit 109 with a URL correspondence table in which each recognition phrase is associated with access information (a URL) for accessing the corresponding answer phrase on the server 20B.
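A URL correspondence table of this kind can be sketched as a simple mapping from recognition phrase to URL. The table contents, the URLs, and the function name are hypothetical; the source only states that such a table is held in the storage unit 109.

```python
from typing import Optional

# Hypothetical contents of the URL correspondence table in storage unit 109.
URL_CORRESPONDENCE_TABLE = {
    "good morning": "https://server20b.example/answers/good_morning.wav",
    "good night": "https://server20b.example/answers/good_night.wav",
}


def answer_url_for(recognition_phrase: str) -> Optional[str]:
    """Return the server 20B URL for a phrase recognized on the doll.

    Returns None when the table has no entry for the phrase.
    """
    return URL_CORRESPONDENCE_TABLE.get(recognition_phrase)
```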

It is also possible to obtain the answer phrase for a speech recognition result by using information stored in the doll 10.

For example, when a cache memory capable of temporarily storing information contains answer phrase information for a previously used recognition phrase, the doll 10 can use the answer phrase information stored in the cache memory to obtain the answer phrase and speak (response processing) without, for example, accessing the server 20B. Using the information stored in the cache memory thus allows the doll 10 to speak sooner.
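The cache behavior described above can be sketched as a lookup that contacts the server 20B only on a miss. The class name and the injected fetch function are assumptions for illustration; the source does not describe the cache's structure or eviction policy, and none is modeled here.

```python
from typing import Callable, Dict


class AnswerPhraseCache:
    """Cache of answer phrases keyed by previously used recognition phrases."""

    def __init__(self, fetch_from_server_20b: Callable[[str], str]) -> None:
        # Hypothetical callable that performs the actual request to server 20B.
        self._fetch = fetch_from_server_20b
        self._cache: Dict[str, str] = {}

    def get(self, recognition_phrase: str) -> str:
        """Return the answer phrase, accessing server 20B only on a cache miss."""
        if recognition_phrase not in self._cache:
            self._cache[recognition_phrase] = self._fetch(recognition_phrase)
        return self._cache[recognition_phrase]
```

A repeated recognition phrase is then answered from the doll's own memory, which is the source of the faster response the passage mentions.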

When an audio file serving as an answer phrase is stored in the doll 10, the server 20A may designate that audio file as the information indicating the result of the speech recognition. With this processing, the doll 10 can speak sooner by using the audio file stored in the doll 10, without accessing the server 20B. If the designated audio file is not stored in the doll 10, the doll 10 may request it from the server 20B, obtain it from the server 20B, and then speak.
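The local-file-first behavior with a fallback to the server 20B can be sketched as below. The function name, the directory layout, and the injected fetch function are hypothetical. Whether a fetched file is then saved on the doll is not stated in the source, so this sketch does not save it.

```python
from pathlib import Path
from typing import Callable


def load_answer_audio(
    file_name: str,
    local_dir: Path,
    fetch_from_server_20b: Callable[[str], bytes],
) -> bytes:
    """Return the audio designated by server 20A.

    Uses the copy stored in the doll when present; otherwise requests the
    designated file from server 20B via the supplied (hypothetical) callable.
    """
    local_path = local_dir / file_name
    if local_path.is_file():
        # The file is stored in the doll: no access to server 20B is needed.
        return local_path.read_bytes()
    # Not stored locally: request the designated file from server 20B.
    return fetch_from_server_20b(file_name)
```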

The server configuration of Embodiment 5 is applicable to any of Embodiments 1 to 4 described above.

<Embodiment 6>
The control blocks of the doll 10, the server 20, and the like may be realized by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or may be realized by software using a CPU (Central Processing Unit).

In the latter case, the doll 10 and the server 20 each include a CPU that executes the instructions of a program, which is the software realizing each function; a ROM (Read Only Memory) or a storage device (these are referred to as a "recording medium") on which the program and various data are recorded so as to be readable by a computer (or the CPU); a RAM (Random Access Memory) into which the program is loaded; and the like. The object of the present invention is achieved when the computer (or the CPU) reads the program from the recording medium and executes it. As the recording medium, a "non-transitory tangible medium" such as a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used. The program may also be supplied to the computer via any transmission medium capable of transmitting it (such as a communication network or a broadcast wave). The present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission.

(Summary)
The embodiments disclosed herein should be considered illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims rather than by the above description of the embodiments, and is intended to include all modifications within the meaning and scope equivalent to the claims.

1: dialogue system; 5: network; 10: doll; 20, 20A, 20B: server; 101, 201: communication unit; 102, 202: control unit; 103: microphone; 104: speaker; 106: drive unit; 107: operation unit; 109, 203: storage unit; 112: response processing execution unit; 114: voice input reception unit; 221: voice input receiving unit; 222: management unit; 223: voice recognition unit; 224: selection unit; 225: response processing execution unit; 231: change table database; 232: response content database; 233: history storage unit; 234: sensor information database.

Claims

A server provided so as to be able to communicate with a dialog device, the server comprising:
A receiving unit for receiving voice input from a user received by the interactive device;
A parameter management unit that manages mood parameters indicating emotions of the device;
A storage unit that stores a plurality of response information relating to responses to voice input from the user received by the reception unit, each of which is associated with the value of the mood parameter as an index when selected,
A selection unit that refers to the storage unit and selects one of the plurality of response information based on the mood parameter managed by the parameter management unit;
A response process execution instructing unit that instructs the dialogue apparatus to execute a response process based on the response information selected by the selection unit;
The storage unit stores change information of the mood parameter provided corresponding to each of the plurality of response information,
The server, wherein the parameter management unit updates the mood parameter based on change information of the mood parameter corresponding to one of the plurality of response information selected by the selection unit.

The server according to claim 1, wherein the parameter management unit updates the mood parameter based on information acquired from the outside.

The server according to claim 2, wherein the parameter management unit updates the mood parameter based on environmental information acquired from the outside.

The server according to claim 2, wherein the parameter management unit updates the mood parameter when information acquired from the outside satisfies a predetermined condition.

The server according to claim 1, wherein the parameter management unit updates the mood parameter based on an input mode received from the user by the receiving unit.

The server according to claim 5, wherein the parameter management unit updates the mood parameter based on voice input content from the user received by the receiving unit.

The server according to claim 5, wherein the parameter management unit updates the mood parameter based on sensor input information detected from the user and received by the receiving unit.

The server further includes a history storage unit that stores a history of input from the user,
The server according to claim 1, wherein the parameter management unit refers to the history storage unit and updates the mood parameter based on a plurality of input modes from the user.

The response process execution instructing unit instructs the dialogue apparatus to execute a speech synthesis process,
The server according to claim 1, wherein each of the plurality of response information includes different parameter information when executing the speech synthesis process.

The response process execution instructing unit instructs the dialogue apparatus to execute a voice output process,
The server according to claim 1, wherein each of the plurality of response information includes different output contents when executing the voice output process.

The storage unit further stores other response information related to the voice input from the user received by the receiving unit, which is not associated with the value of the mood parameter,
The server according to claim 1, wherein the selection unit selects the other response information when a predetermined condition is satisfied.

A method of controlling a server provided so as to be able to communicate with a dialog device, the method comprising the steps of:
Receiving input from a user accepted by the interactive device;
Managing mood parameters indicating the emotion of the device;
Referring to a storage unit that stores a plurality of response information relating to responses to the received input from the user, each of which is associated with a value of the mood parameter serving as an index at the time of selection, and selecting one of the plurality of response information based on the mood parameter;
Instructing the interactive device to perform response processing based on selected response information, and
The storage unit stores change information of the mood parameters provided corresponding to the plurality of response information, respectively.
The step of managing the mood parameter includes a step of updating the mood parameter based on change information of the mood parameter corresponding to one of the selected plurality of response information.

A server control program executed in a computer of a server provided so as to be communicable with a dialog device,
The server control program causes the computer to execute the steps of:
Receiving input from a user accepted by the interactive device;
Managing mood parameters indicating the emotion of the device;
Referring to a storage unit that stores a plurality of response information relating to responses to the received input from the user, each of which is associated with a value of the mood parameter serving as an index at the time of selection, and selecting one of the plurality of response information based on the mood parameter;
Instructing the interactive device to perform response processing based on selected response information, and
The storage unit stores change information of the mood parameters provided corresponding to the plurality of response information, respectively.
The step of managing the mood parameter includes a step of updating the mood parameter based on change information of the mood parameter corresponding to one of the selected plurality of response information.