JP2016020963A

JP2016020963A - Interaction evaluation device, interaction evaluation system, interaction evaluation method, and interaction evaluation program

Info

Publication number: JP2016020963A
Application number: JP2014144179A
Authority: JP
Inventors: Eiji Kitsuke (木付 英士); Toshio Nomura (野村 敏男); Satoru Nakamura (中村 哲); Masahiro Mizukami (水上 雅博)
Original assignee: Nara Institute of Science and Technology NUC; Sharp Corp
Current assignee: Nara Institute of Science and Technology NUC; Sharp Corp
Priority date: 2014-07-14
Filing date: 2014-07-14
Publication date: 2016-02-04

Abstract

PROBLEM TO BE SOLVED: To provide an interaction evaluation device capable of evaluating interaction with a user. SOLUTION: An interaction evaluation device includes: a storage part which stores interaction information for interacting with a user; an interaction part which performs interaction processing with the user based on the interaction information stored in the storage part; an input reception part which receives input from the user in the interaction processing; and an evaluation part which calculates a comfort level associated with the interaction with the user based on the manner of the input received by the input reception part. SELECTED DRAWING: Figure 2

Description

The present invention relates to a dialogue evaluation device, a dialogue evaluation system, a dialogue evaluation method, and a dialogue evaluation program for evaluating simulated communication.

As a dialogue device, a robot device capable of interacting with a user has been proposed (Patent Document 1).
The device has a learning function: it detects user actions such as stroking the robot device, learns which of its behaviors the user has praised, and changes the content of its responses to suit the user's preferences.

Patent Document 1: JP 2002-205289 A

However, the above robot device learns the user's preferences from actions such as stroking by the user; it does not analyze and evaluate the dialogue with the user.

The present invention has been made to solve the above problem, and an object of the present invention is to provide a dialogue evaluation device, a dialogue evaluation system, a dialogue evaluation method, and a dialogue evaluation program capable of evaluating a dialogue with a user.

A dialogue evaluation device according to an aspect of the present invention includes: a storage unit that stores dialogue information for interacting with a user; a dialogue unit that executes dialogue processing with the user based on the dialogue information stored in the storage unit; an input reception unit that receives input from the user in the dialogue processing; and an evaluation unit that calculates a comfort level for the dialogue with the user based on the manner of the input received by the input reception unit.

Preferably, the dialogue unit executes dialogue processing with the user based on the dialogue information stored in the storage unit, and the input reception unit receives voice input from the user in the dialogue processing.

In particular, the evaluation unit calculates the comfort level for the dialogue with the user based on at least one of the response content, the response speed, and the tone of the voice input from the user in the dialogue processing.

In particular, the evaluation unit calculates the comfort level for the dialogue with the user based on the manner of voice input from the user over a plurality of rounds of dialogue processing.

A dialogue evaluation system according to an aspect of the present invention includes: a storage unit that stores dialogue information for interacting with a user; a dialogue unit that executes dialogue processing with the user based on the dialogue information stored in the storage unit; an input reception unit that receives input from the user in the dialogue processing; and an evaluation unit that calculates a comfort level for the dialogue with the user based on the manner of the input received by the input reception unit.

A dialogue evaluation method according to an aspect of the present invention includes: a step of executing dialogue processing with a user based on dialogue information for interacting with the user; a step of receiving input from the user in the dialogue processing; and a step of calculating a comfort level for the dialogue with the user based on the manner of the received input.

A dialogue evaluation program according to an aspect of the present invention is a dialogue evaluation program executed on a computer, and causes the computer to execute: a step of executing dialogue processing with a user based on dialogue information for interacting with the user; a step of receiving input from the user in the dialogue processing; and a step of calculating a comfort level for the dialogue with the user based on the manner of the received input.

According to one aspect of the present invention, it is possible to evaluate a dialogue with a user.

FIG. 1 is a diagram illustrating the dialogue system 1 according to Embodiment 1.
FIG. 2 is a diagram illustrating the main configuration of the dialogue system 1 according to Embodiment 1.
FIG. 3 is a diagram illustrating the dialogue database 132 according to Embodiment 1.
FIG. 4 is a diagram illustrating a specific example of the storage unit 203 according to Embodiment 1.
FIG. 5 is a diagram illustrating the evaluation table registered in the evaluation database 231 according to Embodiment 1.
FIG. 6 is a sequence diagram showing the flow of processing in the dialogue system 1 according to Embodiment 1.
FIG. 7 is a flowchart of the dialogue evaluation processing executed by the evaluation device 20 according to Embodiment 1.
FIG. 8 is a diagram illustrating the concept of evaluating the dialogue with the user according to Embodiment 2.

The present embodiment is described below with reference to the drawings. In the description of the embodiments, when a number, an amount, or the like is mentioned, the scope of the present invention is not necessarily limited to that number or amount unless otherwise specified. Identical and corresponding parts are given the same reference numerals, and redundant descriptions may not be repeated. Unless otherwise restricted, it is intended from the outset that the configurations shown in the embodiments may be combined as appropriate.

<Embodiment 1>
(Configuration of dialogue system 1)
FIG. 1 is a diagram illustrating the dialogue system 1 according to Embodiment 1.

Referring to FIG. 1, the dialogue system 1 according to Embodiment 1 includes a dialogue device 10 and an evaluation device 20.

The dialogue device 10 is provided so as to be able to communicate with the evaluation device 20. In this example, the dialogue device 10 and the evaluation device 20 communicate directly, but they may instead communicate via a network.

In the dialogue system 1, as an example, the dialogue device 10 outputs a voice to a human (user). When speech uttered by the user in reply is input to the dialogue device 10, the speech is recognized, and the dialogue device 10 outputs a voice representing the response to the input speech (hereinafter also referred to as a "voice response"). By repeating this processing, the dialogue system 1 according to the present embodiment realizes simulated communication (conversation or dialogue) between the user and the dialogue device 10.

In the present embodiment, the dialogue device 10 may be any device that can recognize speech and output a voice response to the user. For example, a cleaning robot having a cleaning function, a doll having a dialogue function, or another home appliance (for example, a television or a microwave oven) can be employed as the dialogue device.

The evaluation device 20 analyzes and evaluates the manner of the dialogue between the dialogue device 10 and the human. The evaluation device 20 raises the evaluation when it determines that the dialogue between the dialogue device 10 and the human is proceeding as smooth communication, and lowers the evaluation when it determines that it is not.

As an index for evaluating whether smooth communication is being achieved, a comfort level (or preference level) for the dialogue with the user is defined, as an example. The comfort level or preference level quantifies the degree to which the user feels that the dialogue is comfortable or agreeable.

For example, the comfort level or preference level is set to be high when the dialogue processing is thoughtful and attentive, and low when the dialogue processing is off the mark and inattentive. The comfort level or preference level can be calculated by various methods based on the result of analyzing the dialogue; in this example, it is determined based on at least one of the following parameters obtained from that analysis: response content, response speed, and tone.

This example mainly describes dialogue using voice, but any format that permits dialogue may be used; for example, the same approach is applicable to text-based dialogue using an SNS (social networking service).

Any dialogue format may be used as long as the human and the dialogue device 10 can communicate; the two sides need not both use voice, and may use different formats. Specifically, one side may use voice while the other uses text, gestures, or a combination of these.

In the present embodiment, the evaluation device 20 is described as being realized by a single device, but it is not limited to this; at least some of the units (functions) of the evaluation device 20 may be realized by another device, such as a server.

In this example, the dialogue device 10 and the evaluation device 20 are described as separate devices, but they are not limited to this and can naturally be realized as a single device.

(Main configuration of dialogue system 1)
FIG. 2 is a diagram illustrating the main configuration of the dialogue system 1 according to Embodiment 1.

Referring to FIG. 2, the configuration of the dialogue device 10 is described first.
The dialogue device 10 according to the present embodiment includes a communication unit 101, a control unit 102, a microphone 103, a speaker 104, and a storage unit 109.

The communication unit 101 is a means for communicating with external devices. Specifically, the communication unit 101 communicates with the communication unit 201 of the evaluation device 20. The communication may be either wireless or wired.

The microphone 103 receives sound input from outside. In the present embodiment, the sound data representing the sound received by the microphone 103 is described as mainly consisting of sound in the frequency band of human speech (also referred to as voice data), but it may also include sound in frequency bands outside that of voice data. The microphone 103 outputs the voice data representing the input sound to the control unit 102.

The speaker 104 outputs a voice response representing the response content output from the control unit 102. Hereinafter, the output of a voice response by the dialogue device 10 via the speaker 104 is also referred to as an "utterance". The response content is described in detail later.

The storage unit 109 is a storage device such as a RAM (Random Access Memory) or a flash memory, and stores programs and the like for realizing the various functions of the dialogue device 10.

The control unit 102 is mainly composed of a CPU (Central Processing Unit), and the functions of its units are realized by the CPU executing a program stored in the storage unit 109.

The control unit 102 comprehensively controls each unit of the dialogue device 10.
The main functional configuration of the control unit 102 is described next.

The control unit 102 includes a dialogue processing execution unit 112 and a voice input reception unit 114.
The voice input reception unit 114 detects (extracts) voice data from the sound input from outside through the microphone 103. In other words, the voice input reception unit 114 detects voice data by extracting the frequency band of human speech from the sound data received from outside.

One way for the voice input reception unit 114 to detect voice data from sound data is, for example, to extract the frequency band of human speech (for example, 100 Hz or higher and 1 kHz or lower) from the sound data. In this case, the voice input reception unit 114 may include, for example, a band-pass filter, or a filter combining a high-pass filter and a low-pass filter, to extract that band from the sound data.
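The band extraction described above can be sketched as a high-pass filter followed by a low-pass filter. The following is a minimal illustration only: the first-order RC filter design, the function names, and the 16 kHz sample rate used in the usage note are assumptions made for this sketch, not details given in the specification. Only the 100 Hz to 1 kHz band comes from the text.

```python
import math

def high_pass(samples, cutoff_hz, sample_rate):
    """First-order RC high-pass filter: attenuates content below cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out

def low_pass(samples, cutoff_hz, sample_rate):
    """First-order RC low-pass filter: attenuates content above cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = dt / (rc + dt)
    out, prev_y = [], 0.0
    for x in samples:
        prev_y = prev_y + alpha * (x - prev_y)
        out.append(prev_y)
    return out

def extract_voice_band(samples, sample_rate, low_hz=100.0, high_hz=1000.0):
    """Keep roughly the 100 Hz to 1 kHz band by chaining the two filters."""
    return low_pass(high_pass(samples, low_hz, sample_rate), high_hz, sample_rate)

def sine(freq_hz, sample_rate, seconds=1.0):
    """Test tone of unit amplitude (helper for the usage note, hypothetical)."""
    n = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def rms(samples):
    """Root-mean-square level of a sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

For instance, comparing `rms(extract_voice_band(sine(300, 16000), 16000))` with `rms(sine(300, 16000))` shows that a 300 Hz tone passes with little loss, while tones at 10 Hz or 6 kHz are strongly attenuated. A first-order filter has a gentle roll-off, so a practical device would likely use a steeper design.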

The voice input reception unit 114 transmits the voice data detected from the sound data to the evaluation device 20 via the communication unit 101.

The voice input reception unit 114 also outputs the detected voice data to the dialogue processing execution unit 112.

The dialogue processing execution unit 112 executes dialogue processing based on the voice data detected by the voice input reception unit 114. Specifically, the dialogue processing execution unit 112 performs speech recognition on the voice data and sets the response content based on the recognition result. The dialogue processing execution unit 112 then outputs a voice representing the response content to the user via the speaker 104. It also outputs the response content to the evaluation device 20.

Next, the configuration of the evaluation device 20 according to Embodiment 1 is described.
The evaluation device 20 according to Embodiment 1 includes a communication unit 201, a control unit 202, and a storage unit 203.

The communication unit 201 is a means for communicating with external devices. Specifically, the communication unit 201 communicates with the communication unit 101 of the dialogue device 10. The communication may be either wireless or wired.

The storage unit 203 is a storage device such as a RAM (Random Access Memory) or a flash memory, and stores programs and the like for realizing the various functions of the evaluation device 20. As an example, the storage unit 203 includes an evaluation database 231 that stores the tables, formulas, and the like needed to calculate the comfort level for the dialogue between the human and the dialogue device 10, a speech dictionary 232 for recognizing voice data, a voice data storage unit 234 that stores voice data, and an analysis data storage unit 235 that stores the results of analyzing the voice data.

The control unit 202 is mainly composed of a CPU (Central Processing Unit), and its functions are realized by the CPU executing a program stored in the storage unit 203.

The control unit 202 comprehensively controls each unit of the evaluation device 20. Specifically, the control unit 202 calculates the comfort level for the dialogue with the user based on the voice data received from the dialogue device 10 via the communication unit 201.

Next, the main functional configuration of the control unit 202 of the evaluation device 20 is described.
The control unit 202 includes a voice input receiving unit 221, an evaluation unit 222, a voice analysis unit 225, and an output unit 226.

The voice input receiving unit 221 receives the voice data transmitted from the dialogue device 10 via the communication unit 201 and stores it in the voice data storage unit 234. History information about the received voice data is also registered in the history table 233.

The voice analysis unit 225 analyzes the voice data stored in the voice data storage unit 234 and stores the analysis result in the analysis data storage unit 235. In this example, the voice analysis unit 225 uses the speech dictionary 232 to recognize the content of the speech (speech content) as an analysis result, and also measures the volume, speaking rate, response time, and the like. This example describes speech recognition performed by the evaluation device 20, but the evaluation device 20 may instead receive the content recognized by the dialogue device 10 from the dialogue device 10.

The evaluation unit 222 calculates the comfort level for the dialogue with the user based on the analysis result stored in the analysis data storage unit 235. Specifically, the evaluation unit 222 uses the formulas and the like stored in the evaluation database 231 to calculate the comfort level from the analysis result stored in the analysis data storage unit 235.
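As a rough illustration of how an evaluation unit might combine the analyzed parameters (response content, response speed, tone) into a single comfort level, the sketch below uses a weighted average. Everything in it is hypothetical: the 0 to 1 scales, the weights, the 10-second limit, the assumption that quicker replies score higher, and all names are chosen for this sketch. The formulas actually stored in the evaluation database 231 are not given in this part of the description.

```python
from dataclasses import dataclass

@dataclass
class AnalysisResult:
    content_score: float    # 0..1, how favorable the recognized reply is (hypothetical scale)
    response_time_s: float  # seconds the user took to reply
    tone_score: float       # 0..1, e.g. derived from volume and speaking rate (hypothetical)

def speed_score(response_time_s, slowest_s=10.0):
    """Map response time to 0..1; quicker replies score higher (assumed mapping)."""
    clipped = min(max(response_time_s, 0.0), slowest_s)
    return 1.0 - clipped / slowest_s

def comfort_level(result, weights=(0.5, 0.25, 0.25)):
    """Weighted average of content, speed, and tone scores, in the range 0..1."""
    w_content, w_speed, w_tone = weights
    total = w_content + w_speed + w_tone
    score = (w_content * result.content_score
             + w_speed * speed_score(result.response_time_s)
             + w_tone * result.tone_score)
    return score / total
```

With these assumed weights, a favorable, quick, warm reply such as `AnalysisResult(0.9, 1.0, 0.8)` scores higher than a slow, flat one such as `AnalysisResult(0.2, 9.0, 0.3)`. Because the text only requires at least one of the three parameters, a weight can be set to zero to drop the others.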

The output unit 226 outputs the comfort level calculated by the evaluation unit 222. The form of output is not particularly limited: the comfort level may be displayed as a numerical value on a display unit (not shown), output as voice, or recorded on a print medium and output.

(Dialogue database)
In the present embodiment, dialogue is realized, for example, by referring to a table that holds response content and selecting the response content corresponding to the recognized content, or by performing natural language analysis and automatically generating a response sentence.

Referring to FIG. 3, the case of selecting response content by referring to a table is described. FIG. 3 is a diagram illustrating the dialogue database 132 according to Embodiment 1. As an example, the dialogue database 132 is stored in the storage unit 109 of the dialogue device 10 according to the present embodiment.

Specifically, a plurality of pieces of response information are registered in the dialogue database 132; each associates recognition content (a recognition phrase) with response content (an answer phrase). In this example, an identification number (response ID) is assigned to each combination of recognition phrase and answer phrase. As an example, the recognition phrases registered in the dialogue database 132 in this example are assumed to be registered likewise in the dictionary used for speech recognition.

As an example, answer phrases are stored here in association with the recognition phrases "おはよう" ("Good morning"), "ただいま" ("I'm home"), and so on.

For example, for response ID "1", the answer phrase "おはよう！今日も１日頑張ろう！" ("Good morning! Let's do our best again today!") is registered in association with the recognition phrase "おはよう".

For response ID "2", the answer phrase "おはよう" ("Good morning") is registered in association with the recognition phrase "おはよう".

For response ID "3", the answer phrase "ふわぁー。まだ眠いよぉ" ("Yawn. I'm still sleepy.") is registered in association with the recognition phrase "おはよう".

For response ID "4", the answer phrase "おかえり。今日もお仕事大変だった？" ("Welcome back. Was work hard again today?") is registered in association with the recognition phrase "ただいま".

For response ID "5", the answer phrase "おかえりなさい" ("Welcome home") is registered in association with the recognition phrase "ただいま".

In addition, answer phrases that ask the user to respond again (re-response answer phrases) are provided for the case where there is no recognition phrase (null). Here, the case where there is no recognition phrase means the case where speech recognition has failed. The case where a recognition phrase registered in the dictionary used for speech recognition is not registered in the dialogue database 132, that is, where speech recognition succeeded but no corresponding recognition phrase is registered in the dialogue database 132, may also be handled as the case where there is no recognition phrase.

Specifically, for response ID "100", the answer phrase "なになに" ("What was that?") is registered for the case where there is no recognition phrase (null).

For response ID "101", the answer phrase "もう一度言って" ("Say that again") is registered for the case where there is no recognition phrase (null). Providing multiple answer phrases for the null case makes it possible to avoid stereotyped responses.
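The table lookup described above can be sketched as follows. The rows reproduce the response IDs and phrases given in the text; the tuple layout, the function name `select_response`, and the choice of picking at random among candidate answers are assumptions made for illustration, since the description does not state how one of several matching answer phrases is chosen.

```python
import random

# (response ID, recognition phrase, answer phrase); None stands for "null"
DIALOGUE_DB = [
    (1, "おはよう", "おはよう！今日も１日頑張ろう！"),
    (2, "おはよう", "おはよう"),
    (3, "おはよう", "ふわぁー。まだ眠いよぉ"),
    (4, "ただいま", "おかえり。今日もお仕事大変だった？"),
    (5, "ただいま", "おかえりなさい"),
    (100, None, "なになに"),
    (101, None, "もう一度言って"),
]

def select_response(recognized, rng=random):
    """Return (response ID, answer phrase) for a recognized phrase.

    `recognized` is None when speech recognition failed. A phrase that was
    recognized but is not registered in the table is handled like null.
    """
    candidates = [row for row in DIALOGUE_DB if row[1] == recognized]
    if not candidates:
        candidates = [row for row in DIALOGUE_DB if row[1] is None]
    # Assumed policy: pick one candidate at random to avoid stereotyped replies.
    response_id, _, answer = rng.choice(candidates)
    return response_id, answer
```

Calling `select_response("ただいま")` returns response ID 4 or 5 with its answer phrase, while `select_response(None)` returns one of the re-response phrases (ID 100 or 101).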

This example describes a method in which, when the dialogue device 10 responds to a user utterance with an answer or the like, whether that response is agreeable to the user is evaluated as a comfort level on the basis of voice input from the user.

For example, when it is determined that the user reacted favorably to a response such as an answer given to the user, the comfort level for the dialogue with the user is raised.

Accurately grasping the user's comfort level with respect to such responses makes it possible to promote smooth communication with the user.

(Storage unit 203)
FIG. 4 is a diagram illustrating a specific example of the storage unit 203 according to Embodiment 1.

FIG. 4(A) shows the history table 233 of the voice data storage unit 234. As an example, it is stored in the storage unit 203 of the evaluation device 20 according to Embodiment 1.

The voice data storage unit 234 stores the voice data received by the voice input receiving unit 221. The history table 233 stores history information on the dialogue of the dialogue device 10.

In this example, when the voice input reception unit 114 receives voice data, the dialogue device 10 outputs the voice data to the evaluation device 20 via the communication unit 101. When the voice input receiving unit 221 of the evaluation device 20 receives voice data from the dialogue device 10, it stores the data in the voice data storage unit 234 and registers history information in the history table 233.

In addition, when the dialogue processing execution unit 112 outputs a voice representing the response content to the user via the speaker 104, the dialogue device 10 outputs the voice data of that response content to the evaluation device 20. As above, when the voice input receiving unit 221 of the evaluation device 20 receives this voice data from the dialogue device 10, it stores the data in the voice data storage unit 234 and registers history information in the history table 233.

Through this processing, the voice data storage unit 234 stores the voice data in association with the date and time at which the voice input receiving unit 221 received it. In addition, the history table 233 is generated, associating the identification number (ID) issued when the voice data is stored in the voice data storage unit 234 with the reception date and time.

As an example, “time” and “voice data ID” are associated with each other.
“Time” means the time at which the evaluation device 20 received the voice data from the dialogue device 10.

In this example, each entry is associated with the time at which the evaluation device 20 received the voice data; however, the association is not limited to this. The entry may instead be associated with the date and time at which the dialogue device 10 accepted the voice data input, or with the date and time at which the dialogue device 10 spoke. In that case, the dialogue device 10 may transmit that information to the evaluation device 20 together with the voice data.

「音声データＩＤ」は、音声データ記憶部２３４に記憶されている音声データを特定する情報であり、音声データを記憶する際に発行されるＩＤに対応するものである。 The “voice data ID” is information for specifying the voice data stored in the voice data storage unit 234, and corresponds to an ID issued when the voice data is stored.

本例においては、一例として、時刻「２０１３−０９−１２０６：３０：４２」に対応して音声データＩＤ「１００Ａ」が登録されている。時刻「２０１３−０９−１２０６：３１：００」に対応して音声データＩＤ「１００Ｂ」が登録されている。時刻「２０１３−０９−１２０６：３１：３０」に対応して音声データＩＤ「１０１Ａ」が登録されている。時刻「２０１３−０９−１２０６：３２：１０」に対応して音声データＩＤ「１０１Ｂ」が登録されている。 In this example, as an example, the audio data ID “100A” is registered corresponding to the time “2013-09-12 06:30:42”. The audio data ID “100B” is registered corresponding to the time “2013-09-12 06:31:00”. The audio data ID “101A” is registered corresponding to the time “2013-09-12 06:31:30”. The audio data ID “101B” is registered corresponding to the time “2013-09-12 06:32:10”.

なお、ここで、説明を簡易にするべく音声データＩＤの「Ａ」は対話装置１０から発話した音声データを意味するものととする。また、音声データＩＤの「Ｂ」は対話装置１０に対して入力された音声データを意味するものとする。 Here, in order to simplify the explanation, it is assumed that “A” of the voice data ID means voice data uttered from the dialogue apparatus 10. Further, the voice data ID “B” means voice data input to the dialogue apparatus 10.
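The history table and the “A”/“B” suffix convention can be sketched as follows. This is a minimal illustration only: the list-of-tuples layout and the names `HISTORY` and `speaker_of` are assumptions made for this sketch, not structures defined in the specification. The rows are the four example entries given above.

```python
from datetime import datetime

# History table 233 modelled as (reception time, voice data ID) rows.
# The values are the example entries from the description.
HISTORY = [
    (datetime(2013, 9, 12, 6, 30, 42), "100A"),
    (datetime(2013, 9, 12, 6, 31, 0), "100B"),
    (datetime(2013, 9, 12, 6, 31, 30), "101A"),
    (datetime(2013, 9, 12, 6, 32, 10), "101B"),
]


def speaker_of(voice_data_id: str) -> str:
    """Map the ID suffix to its source: 'A' = dialogue device 10, 'B' = user."""
    suffix = voice_data_id[-1]
    if suffix == "A":
        return "device"
    if suffix == "B":
        return "user"
    raise ValueError(f"unknown suffix in voice data ID: {voice_data_id!r}")


# IDs of the entries that were input by the user
user_ids = [vid for _, vid in HISTORY if speaker_of(vid) == "user"]
```

Filtering on the suffix in this way is what later allows only user input to be passed to analysis.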

図４（Ｂ）を参照して、分析データ記憶部２３５が示されている。一例として本実施形態に基づく評価装置２０の備える記憶部２０３に格納されている。 With reference to FIG. 4B, an analysis data storage unit 235 is shown. As an example, it is stored in the storage unit 203 provided in the evaluation device 20 based on this embodiment.

具体的には、分析データ記憶部２３５は、音声データ記憶部２３４に記憶されている音声データを分析した分析結果を格納している。本例においては、音声入力受信部２２１が当該音声データ記憶部２３４に音声データを格納するものとする。そして、音声分析部２２５は、音声データ記憶部２３４に格納されている音声データを抽出して分析し、分析結果を分析データ記憶部２３５に格納する。 Specifically, the analysis data storage unit 235 stores an analysis result obtained by analyzing the voice data stored in the voice data storage unit 234. In this example, it is assumed that the voice input receiving unit 221 stores voice data in the voice data storage unit 234. Then, the voice analysis unit 225 extracts and analyzes the voice data stored in the voice data storage unit 234 and stores the analysis result in the analysis data storage unit 235.

分析データ記憶部２３５は、対話装置１０との対話に関するユーザの快適度を算出するための音声態様のパラメータを格納する。 The analysis data storage unit 235 stores audio mode parameters for calculating the user's comfort level related to the dialogue with the dialogue apparatus 10.

As an example, “recognition phrase”, “volume”, “speech speed”, and “response time” are shown.
The “recognition phrase” is the content of the speech uttered by the user.

「音量」は、音の大きさのレベルを意味する。音声データの振幅を計測することにより取得することが可能である。 “Volume” means the level of loudness. It can be obtained by measuring the amplitude of the audio data.
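One common way to turn measured amplitude into a level in decibels is to take the root-mean-square (RMS) of the samples and express it relative to a reference amplitude. The sketch below assumes normalized samples and a full-scale reference of 1.0. The specification does not state which reference or averaging method is used, so `volume_db` and `full_scale` are illustrative assumptions.

```python
import math


def volume_db(samples, full_scale=1.0):
    """Return the RMS level of `samples` in dB relative to `full_scale`."""
    if not samples:
        raise ValueError("no samples")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(rms / full_scale)


# A signal whose amplitude is one tenth of full scale
level = volume_db([0.1, -0.1, 0.1, -0.1])
```

With this convention, quieter speech gives more negative values, which matches the sign of the example volumes in the analysis data.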

“Speech speed” means the number of words spoken per minute.
“Response time” means the time from the response processing of the dialogue device 10 until the user's reply is input to the microphone 103. It can be calculated from the time of the response processing of the dialogue device 10 and the time at which the user's voice data was input, both taken from the history table 233.
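The response-time calculation can be sketched as a lookup over time-ordered history rows. The function name, the row layout, and the timestamps below are hypothetical and chosen only for illustration. The sketch relies on the convention that IDs ending in “A” denote utterances of the dialogue device 10.

```python
from datetime import datetime


def response_time_seconds(history, user_id):
    """Seconds between the device utterance preceding `user_id` and that user input.

    `history` is a time-ordered list of (timestamp, voice_data_id) rows.
    Returns None when no device utterance precedes the user input.
    """
    last_device_time = None
    for timestamp, voice_id in history:
        if voice_id == user_id:
            if last_device_time is None:
                return None
            return (timestamp - last_device_time).total_seconds()
        if voice_id.endswith("A"):
            last_device_time = timestamp
    raise KeyError(user_id)


# Hypothetical timestamps, not taken from the specification
history = [
    (datetime(2014, 7, 14, 9, 0, 0), "200A"),
    (datetime(2014, 7, 14, 9, 0, 2), "200B"),
]
elapsed = response_time_seconds(history, "200B")
```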

ここでは、音声データＩＤ「１００Ｂ」について、「認識フレーズ」、「音量」、「話速」、「返答時間」が対応付けられている。 Here, “recognition phrase”, “volume”, “speaking speed”, and “response time” are associated with the voice data ID “100B”.

Specifically, the recognition phrase “XXX”, the volume “−35.3 dB”, the speech speed “80 words/min”, and the response time “8 sec” are associated with the voice data ID “100B”.

ここでは、音声データＩＤ「１０１Ｂ」について、「認識フレーズ」、「音量」、「話速」、「返答時間」が対応付けられている。 Here, “recognition phrase”, “volume”, “speech speed”, and “response time” are associated with the voice data ID “101B”.

Specifically, the recognition phrase “YYY”, the volume “−31.9 dB”, the speech speed “100 words/min”, and the response time “2 sec” are associated with the voice data ID “101B”.

なお、音声データＩＤ「１００Ａ」、「１０１Ａ」を分析していないのは、対話装置１０が発話した内容だからである。 The reason why the voice data IDs “100A” and “101A” are not analyzed is because the content is spoken by the dialogue apparatus 10.

当該得られた音声態様のパラメータに基づいてユーザとの対話に関する快適度を算出する。算出の方式については後述する。 A comfort level related to the dialogue with the user is calculated based on the obtained parameters of the voice mode. The calculation method will be described later.

図５は、本実施形態１に基づく評価データベース２３１に登録されている評価テーブルを説明する図である。 FIG. 5 is a diagram for explaining an evaluation table registered in the evaluation database 231 based on the first embodiment.

本例においてはユーザとの対話に関する快適度を算出するにあたり、一例として３つの評価値を用いる。具体的には、音声データの応答内容に対する評価値Ｘと、応答速度に対する評価値Ｙと、口調に対する評価値Ｚとを用いる。 In this example, three evaluation values are used as an example in calculating the comfort level related to the dialogue with the user. Specifically, an evaluation value X for response contents of voice data, an evaluation value Y for response speed, and an evaluation value Z for tone are used.

図５（Ａ）を参照して、音声データの応答内容に対する評価値Ｘの評価テーブルが示されている。当該評価テーブルは、応答内容の意味に従って評価値Ｘを算出するテーブルである。 With reference to FIG. 5A, an evaluation table of evaluation values X with respect to response contents of audio data is shown. The said evaluation table is a table which calculates the evaluation value X according to the meaning of the response content.

“Recognition phrase” and “evaluation value X” are associated with each other.
For words indicating that the user has a good impression of the utterance of the dialogue device 10, or for positive words, the degree of interest is judged to be high, so the evaluation value is set high. For example, “すごい” (“great”) is set to evaluation value X = 10, “いいね” (“nice”) to X = 10, and “わかった” (“got it”) to X = 7.

For words indicating that the user has a neutral impression of the utterance of the dialogue device 10, the evaluation value is set to a medium level. For example, “うん” (“yeah”) is set to evaluation value X = 5, and “はいはい” (“yes, yes”) to X = 5.

Conversely, for words indicating that the user has a bad impression of the utterance of the dialogue device 10, or for negative words, the evaluation value is set low. For example, “だめだよ” (“that's no good”) is set to evaluation value X = 0.1.

For words from which the impression of the utterance of the dialogue device 10 is hard to judge, and for cases where the input could not be recognized, that is, “(not registered)” and “(no response)”, the evaluation value is set to X = 1.

Note that the above is only an example, and other evaluation schemes may be used. For example, words indicating a good impression or positive words may be assigned positive numbers, and words indicating a bad impression or negative words may be assigned negative numbers.
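The evaluation table for X behaves like a dictionary lookup with a fallback value. The sketch below uses the example phrases and values given in the description. The names `X_TABLE`, `X_DEFAULT`, and `evaluation_x`, and the use of `None` to represent “(no response)”, are assumptions made for illustration.

```python
# Example entries of the evaluation table for X (FIG. 5(A))
X_TABLE = {
    "すごい": 10.0,   # "great"
    "いいね": 10.0,   # "nice"
    "わかった": 7.0,  # "got it"
    "うん": 5.0,      # "yeah"
    "はいはい": 5.0,  # "yes, yes"
    "だめだよ": 0.1,  # "that's no good"
}
X_DEFAULT = 1.0  # used for "(not registered)" and "(no response)"


def evaluation_x(recognized_phrase):
    """Return X for a recognized phrase; None stands for '(no response)'."""
    if recognized_phrase is None:
        return X_DEFAULT
    return X_TABLE.get(recognized_phrase, X_DEFAULT)
```

Keeping the minimum positive (0.1 rather than 0) means a single negative phrase lowers a multiplicative comfort score without collapsing it to zero.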

図５（Ｂ）を参照して、応答速度に対する評価値の算出の一例について説明する。図５（Ｂ）には、一実施の形態に従う、音声データの応答速度に対する評価値Ｙの評価テーブルが示されている。当該評価テーブルは、応答速度の速さに応じて評価値Ｙを算出するテーブルである。 With reference to FIG. 5B, an example of calculation of the evaluation value with respect to the response speed will be described. FIG. 5B shows an evaluation table of evaluation value Y with respect to the response speed of voice data according to one embodiment. The evaluation table is a table for calculating the evaluation value Y according to the response speed.

Specifically, for example, a table of “Y = 1 / response time” is provided.
When the response speed is fast, that is, when the response time is short, the degree of interest in the utterance of the dialogue device 10 is considered high, so the evaluation value Y becomes large. When the response speed is slow, that is, when the response time is long, the degree of interest is considered low, so the evaluation value Y becomes small. The way the evaluation value Y is calculated or set is not limited to the above. For example, in another aspect, a discrete distribution may be used.
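As a small worked sketch of “Y = 1 / response time”: the function name is hypothetical, and rejecting non-positive response times is an assumption added here, since the specification does not say how a zero response time is handled.

```python
def evaluation_y(response_time_sec: float) -> float:
    """Evaluation value Y as the reciprocal of the response time in seconds."""
    if response_time_sec <= 0:
        # Assumption: a zero or negative response time is treated as invalid input.
        raise ValueError("response time must be positive")
    return 1.0 / response_time_sec


y_fast = evaluation_y(2.0)  # reply after 2 seconds
y_slow = evaluation_y(8.0)  # reply after 8 seconds
```

For the example response times of 2 sec and 8 sec, this gives 0.5 and 0.125 respectively, so the quicker reply scores higher.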

図５（Ｃ）を参照して、口調に対する評価値の算出の一例について説明する。図５（Ｃ）には、一実施の形態に従う、音声データの口調に対する評価値Ｚの評価テーブルが示されている。当該評価テーブルは、口調の興奮の度合として評価値Ｚを算出するテーブルである。 With reference to FIG. 5C, an example of calculating the evaluation value for the tone will be described. FIG. 5C shows an evaluation table of evaluation values Z for the tone of voice data according to one embodiment. The evaluation table is a table for calculating an evaluation value Z as the degree of tone excitement.

Specifically, as an example, when the voice is loud and the speech is fast, the degree of interest in the utterance of the dialogue device 10 is judged to be high, so the evaluation value is set high. For example, “volume ≥ P, speech speed ≥ Q” is set to evaluation value Z = 5. Here, “P” is the reference value for judging whether the volume is loud or quiet, and “Q” is the reference value for judging whether the speech is fast.

When the voice is quiet and the speech is fast, or when the voice is loud and the speech is not fast, the degree of interest in the utterance of the dialogue device 10 is judged to be normal, so the evaluation value is set to a medium level. For example, “volume < P, speech speed ≥ Q” is set to evaluation value Z = 1, and “volume ≥ P, speech speed < Q” is also set to Z = 1.

また、声が小さくて早口でない場合には、対話装置１０の発話に対する関心度合は低いと判断されるため評価値は低くなるように設定されている。例えば、「音量＜Ｐ、話速＜Ｑ」は評価値Ｚ＝「０．１」に設定されている。 In addition, when the voice is low and the speech is not fast, it is determined that the degree of interest in the utterance of the dialogue apparatus 10 is low, so the evaluation value is set to be low. For example, “sound volume <P, speech speed <Q” is set to evaluation value Z = “0.1”.
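The four volume/speed cases reduce to two threshold comparisons. In the sketch below, the function name and the threshold values passed for P and Q are hypothetical, because the specification gives no concrete values for them. The returned Z values are the ones stated in the description.

```python
def evaluation_z(volume_db: float, speech_rate: float, p: float, q: float) -> float:
    """Evaluation value Z from volume (threshold P) and speech speed (threshold Q)."""
    loud = volume_db >= p
    fast = speech_rate >= q
    if loud and fast:
        return 5.0   # loud and fast: high interest
    if loud or fast:
        return 1.0   # exactly one of the two: normal interest
    return 0.1       # quiet and not fast: low interest


# Hypothetical thresholds: P = -33.0 dB, Q = 90 words/min
P, Q = -33.0, 90.0
z_high = evaluation_z(-31.9, 100.0, P, Q)
z_low = evaluation_z(-35.3, 80.0, P, Q)
```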

なお、口調として興奮の度合を音量、話速で評価する場合について説明したが特にこれに限られず、他のパラメータを用いることも可能である。例えば、口調を評価する際に、音の高低、抑揚の有無等も含めて評価値を算出することも可能である。 Although the case where the degree of excitement is evaluated by the volume and the speech speed has been described as the tone, the present invention is not limited to this, and other parameters can also be used. For example, when evaluating the tone, it is also possible to calculate the evaluation value including the pitch of the sound and the presence or absence of inflection.

本実施形態においては、音声データの応答内容に対する評価値Ｘと、応答速度に対する評価値Ｙと、口調に対する評価値Ｚとに基づいて快適度を算出する。 In the present embodiment, the comfort level is calculated based on the evaluation value X for the response content of the voice data, the evaluation value Y for the response speed, and the evaluation value Z for the tone.

具体的には、評価値Ｘ、Ｙ、Ｚを変数とした所定の関数を設けて快適度を算出するようにしても良い。一例として、評価値Ｘ、Ｙ、Ｚをそれぞれ乗算することにより得られる値を快適度に設定することが可能である。 Specifically, the comfort level may be calculated by providing a predetermined function with the evaluation values X, Y, and Z as variables. As an example, it is possible to set the value obtained by multiplying the evaluation values X, Y, and Z as the comfort level.
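The multiplicative form mentioned as an example can be written in one line. The function name and the input values below are hypothetical, chosen only to show the arithmetic.

```python
def comfort_level(x: float, y: float, z: float) -> float:
    """Comfort level as the product of the three evaluation values."""
    return x * y * z


# Hypothetical inputs: X = 10 (positive phrase), Y = 0.5 (2-second reply),
# Z = 5 (loud and fast speech)
example = comfort_level(10.0, 0.5, 5.0)
```

Because the values are multiplied, a low score on any one axis pulls the overall comfort level down sharply, which is a property of this example function rather than a requirement of the embodiment.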

(Process flow)
FIG. 6 is a sequence diagram showing the flow of processing in the dialogue system 1 according to Embodiment 1.

図６に示されるように、ユーザは、対話装置１０に対して発話（ユーザ発話とも称する）する（シーケンスｓｑ０）。 As shown in FIG. 6, the user utters (also referred to as user utterance) to interactive apparatus 10 (sequence sq0).

対話装置１０は、ユーザ発話に対して音声の入力を受け付ける（シーケンスｓｑ１）。具体的には、音声入力受付部１１４は、マイク１０３を介して外部からの音の入力を受け付ける。 Interactive device 10 accepts an input of voice in response to the user utterance (sequence sq1). Specifically, the voice input receiving unit 114 receives a sound input from the outside via the microphone 103.

次に、対話装置１０は、音声データを評価装置２０に出力する（シーケンスｓｑ２）。具体的には、音声入力受付部１１４は、通信部１０１を介して評価装置２０に出力する。 Next, dialogue apparatus 10 outputs the voice data to evaluation apparatus 20 (sequence sq2). Specifically, the voice input reception unit 114 outputs to the evaluation device 20 via the communication unit 101.

評価装置２０は、対話装置１０からの音声データを受信して音声データ記憶部２３４に記憶する（シーケンスｓｑ３）。 Evaluation device 20 receives the voice data from dialog device 10 and stores it in voice data storage unit 234 (sequence sq3).

そして、音声分析部２２５は、音声データ記憶部２３４に記憶された音声データに対して分析して分析結果を分析データ記憶部２３５に格納する（シーケンスｓｑ４）。 Then, voice analysis unit 225 analyzes the voice data stored in voice data storage unit 234 and stores the analysis result in analysis data storage unit 235 (sequence sq4).

一方で、対話装置１０は、音声入力受付部１１４で受け付けた音声データに対して音声認識を実行する（シーケンスｓｑ５）。具体的には、対話処理実行部１１２は、音声データに対して認識フレーズを取得する。 On the other hand, dialogue apparatus 10 performs voice recognition on the voice data received by voice input receiving unit 114 (sequence sq5). Specifically, the dialogue processing execution unit 112 acquires a recognition phrase for the audio data.

また、対話装置１０は、認識フレーズに応じた音声応答出力を実行する（シーケンスｓｑ６）。具体的には、対話処理実行部１１２は、対話データベース１３２を利用して認識フレーズに対応する回答フレーズを取得する。そして、対話処理実行部１１２は、取得した回答フレーズの音声データをスピーカ１０４に出力する。 In addition, dialog device 10 executes a voice response output corresponding to the recognized phrase (sequence sq6). Specifically, the dialogue processing execution unit 112 acquires an answer phrase corresponding to the recognized phrase using the dialogue database 132. Then, the dialogue process execution unit 112 outputs the acquired voice data of the answer phrase to the speaker 104.

The dialogue device 10 reproduces the voice from the speaker 104 (sequence sq7).
When outputting the voice data to the speaker 104, the dialogue processing execution unit 112 of the dialogue device 10 also transmits the voice data to the evaluation device 20 via the communication unit 101 (sequence sq8).

The evaluation device 20 receives the voice data from the dialogue device 10 and stores it in the voice data storage unit 234 (sequence sq9).

次に、ユーザは、対話装置１０からの音声の再生を受けて発話（ユーザ発話）する（シーケンスｓｑ１０）。 Next, the user utters (user utterance) in response to the reproduction of the voice from the dialogue apparatus 10 (sequence sq10).

対話装置１０は、ユーザ発話に対して音声の入力を受け付ける（シーケンスｓｑ１１）。具体的には、音声入力受付部１１４は、マイク１０３を介して外部からの音の入力を受け付ける。 Interactive device 10 accepts an input of voice in response to the user utterance (sequence sq11). Specifically, the voice input receiving unit 114 receives a sound input from the outside via the microphone 103.

次に、対話装置１０は、音声データを評価装置２０に出力する（シーケンスｓｑ１２）。具体的には、音声入力受付部１１４は、通信部１０１を介して評価装置２０に出力する。 Next, dialogue apparatus 10 outputs the voice data to evaluation apparatus 20 (sequence sq12). Specifically, the voice input reception unit 114 outputs to the evaluation device 20 via the communication unit 101.

評価装置２０は、対話装置１０からの音声データを受信して音声データ記憶部２３４に記憶する（シーケンスｓｑ１３）。 Evaluation device 20 receives the voice data from dialog device 10 and stores it in voice data storage unit 234 (sequence sq13).

そして、音声分析部２２５は、音声データ記憶部２３４に記憶された音声データに対して分析して分析結果を分析データ記憶部２３５に格納する（シーケンスｓｑ１４）。 Then, voice analysis unit 225 analyzes the voice data stored in voice data storage unit 234 and stores the analysis result in analysis data storage unit 235 (sequence sq14).

そして、評価部は、分析データ記憶部２３５に格納された分析結果に基づいてユーザとの対話に関する快適度を算出する（シーケンスｓｑ１４Ａ）。当該処理については後述する。 Then, the evaluation unit calculates a comfort level related to the dialogue with the user based on the analysis result stored in analysis data storage unit 235 (sequence sq14A). This process will be described later.

一方で、対話装置１０は、音声入力受付部１１４で受け付けた音声データに対して音声認識を実行する（シーケンスｓｑ１５）。具体的には、対話処理実行部１１２は、音声データに対して認識フレーズを取得する。 On the other hand, dialog device 10 performs voice recognition on the voice data received by voice input receiving unit 114 (sequence sq15). Specifically, the dialogue processing execution unit 112 acquires a recognition phrase for the audio data.

次に、対話装置１０は、認識フレーズに応じた音声応答出力を実行する（シーケンスｓｑ１６）。具体的には、対話処理実行部１１２は、対話データベース１３２を利用して認識フレーズに対応する回答フレーズを取得する。そして、対話処理実行部１１２は、取得した回答フレーズの音声データをスピーカ１０４に出力する。 Next, dialogue apparatus 10 executes voice response output corresponding to the recognized phrase (sequence sq16). Specifically, the dialogue processing execution unit 112 acquires an answer phrase corresponding to the recognized phrase using the dialogue database 132. Then, the dialogue process execution unit 112 outputs the acquired voice data of the answer phrase to the speaker 104.

The dialogue device 10 reproduces the voice from the speaker 104 (sequence sq17).
When outputting the voice data to the speaker 104, the dialogue processing execution unit 112 of the dialogue device 10 also transmits the voice data to the evaluation device 20 via the communication unit 101 (sequence sq18).

The evaluation device 20 receives the voice data from the dialogue device 10 and stores it in the voice data storage unit 234 (sequence sq19).

次に、ユーザは、対話装置１０からの音声の再生を受けて発話（ユーザ発話）する（シーケンスｓｑ２０）。 Next, the user utters (user utterance) in response to the reproduction of the voice from the dialogue apparatus 10 (sequence sq20).

次に、対話装置１０は、音声データを評価装置２０に出力する（シーケンスｓｑ２１）。具体的には、音声入力受付部１１４は、通信部１０１を介して評価装置２０に出力する。 Next, dialogue apparatus 10 outputs the voice data to evaluation apparatus 20 (sequence sq21). Specifically, the voice input reception unit 114 outputs to the evaluation device 20 via the communication unit 101.

評価装置２０は、対話装置１０からの音声データを受信して音声データ記憶部２３４に記憶する（シーケンスｓｑ２２）。 Evaluation device 20 receives the voice data from dialog device 10 and stores it in voice data storage unit 234 (sequence sq22).

そして、音声分析部２２５は、音声データ記憶部２３４に記憶された音声データに対して分析して分析結果を分析データ記憶部２３５に格納する（シーケンスｓｑ２３）。 Then, voice analysis unit 225 analyzes the voice data stored in voice data storage unit 234 and stores the analysis result in analysis data storage unit 235 (sequence sq23).

そして、評価部は、分析データ記憶部２３５に格納された分析結果に基づいてユーザとの対話に関する快適度を算出する（シーケンスｓｑ２４）。当該処理については後述する。 Then, the evaluation unit calculates a comfort level related to the dialogue with the user based on the analysis result stored in analysis data storage unit 235 (sequence sq24). This process will be described later.

(Dialogue evaluation process)
FIG. 7 is a flowchart of the dialogue evaluation process executed by the evaluation device 20 according to Embodiment 1.

Referring to FIG. 7, the flowchart shows processing performed by the evaluation unit 222 of the control unit 202 by executing a program stored in the storage unit 203.

まず、ユーザ応答があったか否かを判断する（ステップＳ１０）。具体的には、評価部２２２は、図４で説明した履歴テーブル２３３に格納されているデータに基づいてユーザ応答の有無を判断する。この点で評価部２２２は、履歴テーブル２３３に対話装置１０に対して入力された音声データが登録されているか否かにより判断することが可能である。例えば音声データＩＤに「Ｂ」の識別子が付されて登録されているか否かにより判断することも可能である。 First, it is determined whether or not there is a user response (step S10). Specifically, the evaluation unit 222 determines the presence / absence of a user response based on the data stored in the history table 233 described with reference to FIG. In this regard, the evaluation unit 222 can determine whether or not the voice data input to the interactive device 10 is registered in the history table 233. For example, it is possible to determine whether or not the voice data ID is registered with the identifier “B”.

If it is determined in step S10 that there has been a user response (YES in step S10), it is next determined whether the response is an evaluation target (step S11). Specifically, it is determined whether the voice data corresponding to the user response is to be evaluated. For example, based on the times in the history table 233, the determination can be made according to whether there is voice data uttered by the dialogue device 10 immediately before the voice data corresponding to the user response. For example, if a predetermined period (about 30 seconds) has elapsed between the utterance of the dialogue device 10 and the user response, the user response can be judged to be unrelated to what the dialogue device 10 said and therefore not an evaluation target. If the user response falls within the predetermined period, it can be judged to be related to what the dialogue device 10 said and therefore an evaluation target. Although this example determines whether a response is an evaluation target from the user's response time, the determination is not limited to this and may instead be made according to the content of the user response.
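The time-window check of step S11 can be sketched as follows. The function name and the treatment of a missing device utterance are assumptions. The 30-second window is the approximate value given in the description, and the timestamps are the example entries of the history table 233.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(seconds=30)  # "about 30 seconds" in the description


def is_evaluation_target(device_time, user_time, max_gap=MAX_GAP):
    """Return True when the user response follows a device utterance within max_gap."""
    if device_time is None:
        # Assumption: a response with no preceding device utterance is not evaluated.
        return False
    gap = user_time - device_time
    return timedelta(0) <= gap <= max_gap


# Example history entries: 100A -> 100B and 101A -> 101B
first = is_evaluation_target(datetime(2013, 9, 12, 6, 30, 42),
                             datetime(2013, 9, 12, 6, 31, 0))
second = is_evaluation_target(datetime(2013, 9, 12, 6, 31, 30),
                              datetime(2013, 9, 12, 6, 32, 10))
```

Under this rule, a reply 18 seconds after the device utterance is evaluated, while one 40 seconds after it is treated as unrelated.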

なお、ステップＳ１０において、ユーザ応答が無いと判断した場合（ステップＳ１０においてＮＯ）あるいはステップＳ１１において、評価対象ではないと判断した場合（ステップＳ１１においてＮＯ）には、処理を終了する（エンド）。 If it is determined in step S10 that there is no user response (NO in step S10) or if it is determined in step S11 that it is not an evaluation target (NO in step S11), the process ends (END).

一方、ステップＳ１１において、評価対象であると判断した場合（ステップＳ１１においてＹＥＳ）には、応答内容を取得する(ステップＳ１２）。具体的には、評価部２２２は、図４（Ｂ）で説明した分析データ記憶部２３５に記憶されているユーザが発話した音声内容である認識フレーズを取得する。 On the other hand, if it is determined in step S11 that it is an evaluation target (YES in step S11), response content is acquired (step S12). Specifically, the evaluation unit 222 acquires a recognition phrase that is the voice content spoken by the user stored in the analysis data storage unit 235 described with reference to FIG.

Next, the volume, speech speed, and response time are acquired (step S13). Specifically, the evaluation unit 222 acquires the volume, speech speed, and response time from the analysis data storage unit 235 as the voice mode of the voice data.

次に、各評価値を算出する（ステップＳ１４）。具体的には、評価部２２２は、図５の評価データベース２３１に登録されている評価テーブルに基づいて、音声データの応答内容、応答速度、口調に対する各評価値を算出する。 Next, each evaluation value is calculated (step S14). Specifically, the evaluation unit 222 calculates each evaluation value for the response content, response speed, and tone of the voice data based on the evaluation table registered in the evaluation database 231 of FIG.

そして、次に、各評価値に応じた快適度を算出する（ステップＳ１６）。具体的には、評価部２２２は、各評価値をそれぞれ乗算した快適度を算出する。 Next, the comfort level corresponding to each evaluation value is calculated (step S16). Specifically, the evaluation unit 222 calculates a comfort level obtained by multiplying each evaluation value.

そして、算出した快適度を出力する（ステップＳ１８）。具体的には、評価部２２２は、算出した快適度を一例として数値化して表示する。 Then, the calculated comfort level is output (step S18). Specifically, the evaluation unit 222 displays the calculated comfort level as a numerical value as an example.
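Steps S10 to S18 can be summarized in a single function. This is a sketch under stated assumptions, not the implementation of the evaluation unit 222: the dictionary layout of `turn`, the parameter names, and the thresholds `p` and `q` are hypothetical, and the response time is assumed to be positive. The evaluation values follow the example tables of FIG. 5.

```python
def evaluate_turn(turn, x_table, p, q, max_gap_sec=30.0):
    """Return the comfort level for one user response, or None if processing ends early."""
    # S10: was there a user response?
    if turn is None:
        return None
    # S11: is it an evaluation target (reply within the predetermined period)?
    if turn["response_time"] > max_gap_sec:
        return None
    # S12: acquire the response content (recognition phrase)
    phrase = turn["phrase"]
    # S13: acquire volume, speech speed, and response time
    volume, rate, response_time = turn["volume"], turn["rate"], turn["response_time"]
    # S14: calculate each evaluation value
    x = x_table.get(phrase, 1.0)        # unregistered phrases default to 1
    y = 1.0 / response_time             # assumes response_time > 0
    loud, fast = volume >= p, rate >= q
    z = 5.0 if (loud and fast) else 1.0 if (loud or fast) else 0.1
    # S16: comfort level as the product of the evaluation values
    comfort = x * y * z
    # S18: output; here the value is simply returned to the caller
    return comfort


turn = {"phrase": "すごい", "volume": -31.9, "rate": 100.0, "response_time": 2.0}
result = evaluate_turn(turn, {"すごい": 10.0}, p=-33.0, q=90.0)
```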

In this example, the evaluation unit 222 calculates evaluation values for the response content, the response speed, and the tone based on the evaluation tables stored in the evaluation database 231 of FIG. 5, and calculates the comfort level from them. However, the calculation is not limited to this, and the comfort level may be calculated from an evaluation value based on at least one of these items of information. For example, only the “response content” may be evaluated, or the “response content” and the “response speed” may be evaluated in combination. Combining several items of information makes it possible to grasp the nuance of the user's response to the dialogue with the dialogue device 10 more accurately and to evaluate the comfort level of the dialogue with the user appropriately.

The method of calculating the comfort level is not limited to the above, and various methods can be adopted. For example, the comfort level may be calculated by weighting each evaluation value and adding the weighted values.
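The weighted-sum alternative can be sketched as below. The function name, the default weights, and the example weights are hypothetical, since the specification does not give weight values.

```python
def comfort_weighted_sum(x, y, z, weights=(1.0, 1.0, 1.0)):
    """Comfort level as a weighted sum of the evaluation values X, Y, and Z."""
    wx, wy, wz = weights
    return wx * x + wy * y + wz * z


# Hypothetical weights that emphasize response speed over response content
weighted = comfort_weighted_sum(10.0, 0.5, 5.0, weights=(0.5, 2.0, 1.0))
```

Unlike a product, a weighted sum lets one axis be de-emphasized without a low value on that axis dominating the result.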

Although this example describes the dialogue system 1, in which the dialogue device 10 and the evaluation device 20 operate in cooperation, the functions of the evaluation device 20 may be included in the dialogue device 10 to realize a dialogue device that operates stand-alone.

＜Embodiment 2＞
Embodiment 1 above described the case where what the dialogue device 10 said is evaluated based on the user's response to it; however, an entire series of dialogues may be evaluated instead.

FIG. 8 is a diagram explaining the concept of evaluating a dialogue with a user according to Embodiment 2.
Referring to FIG. 8, the user speaks (“暇だな”, “I'm bored”) to the dialogue device 10 (sequence sq30).

対話装置１０は、ユーザ発話に対して音声の入力を受け付けて、評価装置２０に音声データを出力する（シーケンスｓｑ３１）。 Dialog device 10 accepts a voice input in response to a user utterance and outputs voice data to evaluation device 20 (sequence sq31).

評価装置２０は、対話装置１０から送信された音声データを受信して格納する（シーケンスｓｑ３２）。 Evaluation device 20 receives and stores the audio data transmitted from interactive device 10 (sequence sq32).

The dialogue device 10 executes a voice response output (“サッカー好き？”, “Do you like soccer?”) in reply to the received user utterance (“暇だな”, “I'm bored”) (sequence sq33). The dialogue device 10 also outputs the voice data to the evaluation device 20 (sequence sq34). The content of the response output from the dialogue device 10 for a user utterance is assumed to be stored in the dialogue database 132. The same applies to the cases below.

評価装置２０は、対話装置１０から送信された音声データを受信して格納する（シーケンスｓｑ３５）。 Evaluation device 20 receives and stores the audio data transmitted from interactive device 10 (sequence sq35).

In response to the voice response output (“サッカー好き？”, “Do you like soccer?”) from the dialogue device 10, the user speaks (“すきだよ”, “I like it”) to the dialogue device 10 (sequence sq36).

対話装置１０は、ユーザ発話に対して音声の入力を受け付けて、評価装置２０に音声データを出力する（シーケンスｓｑ３７）。 Dialog device 10 accepts an input of voice in response to a user utterance, and outputs voice data to evaluation device 20 (sequence sq37).

評価装置２０は、対話装置１０から送信された音声データを受信して格納する（シーケンスｓｑ３８）。 Evaluation device 20 receives and stores the voice data transmitted from interactive device 10 (sequence sq38).

Then, the evaluation device 20 evaluates the dialogue with the user (sequence sq39).
As an example, the evaluation device 20 evaluates the user's utterance “すきだよ” (“I like it”) made in reply to the voice response output “サッカー好き？” (“Do you like soccer?”) from the dialogue device 10.

また、対話装置１０は、受け付けたユーザ発話に対して音声応答出力（「僕は○○選手が好きだな」）を実行する（シーケンスｓｑ４０）。また、対話装置１０は、評価装置２０に音声データを出力する（シーケンスｓｑ４１）。 In addition, dialogue apparatus 10 executes voice response output (“I like XX player”) for the received user utterance (sequence sq40). In addition, dialogue apparatus 10 outputs audio data to evaluation apparatus 20 (sequence sq41).

評価装置２０は、対話装置１０から送信された音声データを受信して格納する（シーケンスｓｑ４２）。 Evaluation device 20 receives and stores the voice data transmitted from interactive device 10 (sequence sq42).

In response to the voice response output (“僕は○○選手が好きだな”, “I like player XX”) from the dialogue device 10, the user speaks (“良く知ってるね”, “You know a lot”) to the dialogue device 10 (sequence sq43).

対話装置１０は、ユーザ発話に対して音声の入力を受け付けて、評価装置２０に音声データを出力する（シーケンスｓｑ４４）。 Dialog device 10 accepts an input of voice in response to a user utterance, and outputs voice data to evaluation device 20 (sequence sq44).

評価装置２０は、対話装置１０から送信された音声データを受信して格納する（シーケンスｓｑ４５）。 Evaluation device 20 receives and stores the audio data transmitted from interactive device 10 (sequence sq45).

Then, the evaluation device 20 evaluates the dialogue with the user (sequence sq46).
As an example, the evaluation device 20 evaluates the user's utterance “良く知ってるね” (“You know a lot”) made in reply to the voice response output “僕は○○選手が好きだな” (“I like player XX”) from the dialogue device 10.

The evaluation device 20 also evaluates the dialogue with the user as a whole (sequence sq47).
As an example, the evaluation device 20 evaluates the dialogue based on the content of the entire series of exchanges with the user.

Specifically, the dialogue on the topic of “soccer” formed by the voice response outputs “サッカー好き？” (“Do you like soccer?”) and “僕は○○選手が好きだな” (“I like player XX”) from the dialogue device 10 is evaluated. For example, the comfort level may be calculated by adding or otherwise accumulating the individual evaluations. As for the topic, when a voice response output is associated with “topic information”, the topic may be extracted from that associated “topic information”; alternatively, it can be extracted by a known program that estimates topic information from keywords or the like contained in the voice response output.
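Aggregating per-exchange evaluations into a score for a whole topic can be sketched as follows. The function name, the `mode` parameter, and the per-exchange scores are hypothetical. The `"product"` mode is an assumption that mirrors the multiplication used in Embodiment 1; the description itself only says that the evaluations are added or accumulated.

```python
import math


def dialogue_comfort(turn_scores, mode="sum"):
    """Combine per-exchange comfort levels into one score for the whole dialogue."""
    if mode == "sum":
        return sum(turn_scores)
    if mode == "product":
        # Assumption: accumulation by multiplication, as in Embodiment 1
        return math.prod(turn_scores)
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical comfort levels for the two exchanges on the "soccer" topic
scores = [25.0, 4.0]
total = dialogue_comfort(scores)
combined = dialogue_comfort(scores, mode="product")
```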

また、話題に対する対話を評価するのみならず、対話の意図や経緯に対して評価するようにしても良い。 Further, not only the dialogue with respect to the topic but also the intention and history of the dialogue may be evaluated.

This evaluation makes it possible to grasp the user's level of interest in a particular dialogue topic or the like.

The above describes evaluating the dialogue based on two user utterances, but the number is not limited to two; a larger number of consecutive exchanges may be evaluated.

Alternatively, the exchanges falling within each fixed, predetermined period may be evaluated period by period.
This evaluation makes it possible to evaluate not only individual exchanges but also the dialogue as a whole.
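As a rough sketch of the period-based evaluation, the snippet below groups evaluations into fixed-length windows and sums each window. It assumes every exchange carries a timestamp in seconds and a numeric evaluation. The function name `comfort_by_period` and the sample values are hypothetical.

```python
from collections import defaultdict


def comfort_by_period(records, period_seconds):
    """Group (timestamp, evaluation) pairs into fixed-length periods and sum each group.

    The key of the returned dict is the zero-based index of the period.
    """
    totals = defaultdict(float)
    for timestamp, evaluation in records:
        period_index = int(timestamp // period_seconds)
        totals[period_index] += evaluation
    return dict(sorted(totals.items()))


# Made-up records: (seconds since the start of the session, evaluation).
records = [(10, 1.0), (50, 2.0), (70, 0.5), (130, 1.5)]
print(comfort_by_period(records, period_seconds=60))
```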

This example describes evaluation device 20 evaluating the dialogue with a user based on voice data from a single dialogue device 10. However, by providing a plurality of dialogue devices 10 and evaluating the dialogues with a plurality of users, it is also possible to grasp the tendencies of users in general from the statistical distribution. Grasping these tendencies makes it possible to build a dialogue database capable of realizing dialogue that is comfortable for a typical user.
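One way to summarize such a distribution is to compute the mean and spread of the comfort level per topic across users. The sketch below assumes per-user comfort levels are already available as a nested mapping. The function name `topic_statistics` and the data are hypothetical, and the specification does not prescribe which statistics to use.

```python
from statistics import mean, pstdev


def topic_statistics(comfort_by_user):
    """Return {topic: (mean, population standard deviation)} over all users."""
    per_topic = {}
    for user_scores in comfort_by_user.values():
        for topic, score in user_scores.items():
            per_topic.setdefault(topic, []).append(score)
    return {topic: (mean(scores), pstdev(scores)) for topic, scores in per_topic.items()}


# Made-up comfort levels collected from two dialogue devices.
comfort_by_user = {
    "user_a": {"soccer": 4.0, "weather": 1.0},
    "user_b": {"soccer": 6.0, "weather": 1.0},
}
stats = topic_statistics(comfort_by_user)
print(stats)
```

Topics with a high mean and a small spread would be candidates for dialogue content that most users find comfortable.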

When building the dialogue database, an unregistered term (one not registered in the dictionary) may be input by voice. If the resulting comfort level shows a tendency similar to that observed when a registered term is input by voice, the unregistered term may be judged to be similar to that registered term and registered in the dictionary in the same way.
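The dictionary extension described above might look like the following sketch. The specification does not define how similarity of comfort tendencies is measured, so the mean absolute difference between equal-length comfort profiles and the threshold of 0.5 are assumptions. All names (`register_similar_terms`, `comfort_profiles`, the category labels) are hypothetical.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two equal-length comfort profiles."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def register_similar_terms(dictionary, comfort_profiles, unregistered, threshold=0.5):
    """Register each unregistered term under the entry of its closest registered term.

    dictionary:       registered term -> dictionary entry (here, a category label)
    comfort_profiles: term -> list of comfort levels observed in the same contexts
    A term is registered only if its closest profile is within `threshold` (assumed value).
    """
    updated = dict(dictionary)
    for term in unregistered:
        best_term, best_dist = None, None
        for registered in dictionary:
            dist = mean_abs_diff(comfort_profiles[term], comfort_profiles[registered])
            if best_dist is None or dist < best_dist:
                best_term, best_dist = registered, dist
        if best_term is not None and best_dist <= threshold:
            updated[term] = dictionary[best_term]
    return updated


dictionary = {"soccer": "sports", "rain": "weather"}
comfort_profiles = {
    "soccer": [4.0, 5.0, 3.0],
    "rain": [1.0, 1.0, 2.0],
    "futsal": [4.0, 4.5, 3.5],  # close to "soccer"
    "xyz": [9.0, 9.0, 9.0],     # close to nothing
}
result = register_similar_terms(dictionary, comfort_profiles, ["futsal", "xyz"])
print(result)
```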

<Embodiment 3>
The above mainly describes a scheme for evaluating voice dialogue, but the scheme is also applicable to dialogue by text (messages) or the like, for example over an SNS (social networking service).

Specifically, as one example, evaluation value X1 is calculated based on whether the message that the user enters in reply to information sent from dialogue device 10 contains words that convey a good impression or positive words.

Evaluation value Y1 is calculated based on how quickly the user replies or responds to the information sent from dialogue device 10.

An evaluation value is also calculated based on the format and tone of the reply to the information sent from dialogue device 10. For example, evaluation value Z1 is calculated as the degree of excitement based on the presence and number of symbols such as “?” and “!” attached to the message.

The value obtained by multiplying evaluation values X1, Y1, and Z1 together may then be set as the comfort level.
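The text-message scoring can be illustrated with the sketch below. Only the overall shape comes from the description: a content score X1, a reply-speed score Y1, an excitement score Z1, and their product as the comfort level. The word list, the 60-second cutoff, and the numeric scales are all assumptions chosen for illustration.

```python
POSITIVE_WORDS = {"great", "love", "nice", "fun"}  # hypothetical word list
MARKS = "?!？！"  # ASCII and full-width question and exclamation marks


def content_score(message):
    """X1: higher when the message contains a positive word (assumed scale 1.0 or 2.0)."""
    words = {w.strip(MARKS + ".,").lower() for w in message.split()}
    return 2.0 if words & POSITIVE_WORDS else 1.0


def speed_score(reply_delay_seconds):
    """Y1: higher for a quick reply (the 60-second cutoff is an assumption)."""
    return 2.0 if reply_delay_seconds <= 60 else 1.0


def excitement_score(message):
    """Z1: grows with the number of '?' and '!' marks, capped at four marks."""
    marks = sum(message.count(m) for m in MARKS)
    return 1.0 + 0.5 * min(marks, 4)


def comfort_level(message, reply_delay_seconds):
    """Comfort level as the product X1 * Y1 * Z1."""
    return (content_score(message)
            * speed_score(reply_delay_seconds)
            * excitement_score(message))


print(comfort_level("I love soccer!!", 30))
print(comfort_level("ok", 600))
```

Because the scores are multiplied, each one is kept at 1.0 or above in this sketch so that a neutral value leaves the other two factors unchanged.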

<Embodiment 4>
The control blocks of dialogue device 10, evaluation device 20, and the like may be realized by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or by software using a CPU (Central Processing Unit).

In the latter case, dialogue device 10 and evaluation device 20 each include a CPU that executes the instructions of a program, which is software realizing each function; a ROM (Read Only Memory) or storage device (these are referred to as a “recording medium”) on which the program and various data are recorded so as to be readable by a computer (or CPU); a RAM (Random Access Memory) into which the program is loaded; and the like. The object of the present invention is achieved when the computer (or CPU) reads the program from the recording medium and executes it. As the recording medium, a “non-transitory tangible medium” such as a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used. The program may also be supplied to the computer via any transmission medium (a communication network, a broadcast wave, or the like) capable of transmitting it. The present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission.

The embodiments disclosed here should be considered illustrative in all respects and not restrictive. The scope of the present invention is indicated by the claims rather than by the above description of the embodiments, and is intended to include all modifications within the meaning and scope equivalent to the claims.

DESCRIPTION OF SYMBOLS: 1 dialogue system, 10 dialogue device, 20 evaluation device, 101, 201 communication unit, 102, 202 control unit, 103 microphone, 104 speaker, 109, 203 storage unit, 112 dialogue processing execution unit, 114 voice input reception unit, 132 dialogue database, 221 voice input receiving unit, 222 evaluation unit, 225 voice analysis unit, 226 output unit, 231 evaluation database, 232 voice dictionary, 233 history table, 234 voice data storage unit, 235 analysis data storage unit.

Claims

A dialogue evaluation apparatus comprising:
a storage unit that stores dialogue information for dialogue with a user;
a dialogue unit that executes dialogue processing with the user based on the dialogue information stored in the storage unit;
an input receiving unit that receives an input from the user in the dialogue processing; and
an evaluation unit that calculates a comfort level related to the dialogue with the user based on the mode of the input received by the input receiving unit.

The dialogue evaluation apparatus according to claim 1, wherein the dialogue unit executes dialogue processing with the user based on the dialogue information stored in the storage unit, and
the input receiving unit receives a voice input from the user in the dialogue processing.

The dialogue evaluation apparatus according to claim 2, wherein the evaluation unit calculates the comfort level related to the dialogue with the user based on at least one of the response content, the response speed, and the tone of the voice input from the user in the dialogue processing.

The dialogue evaluation apparatus according to claim 2, wherein the evaluation unit calculates the comfort level related to the dialogue with the user based on the mode of the voice inputs from the user over a plurality of rounds of the dialogue processing.

A dialogue evaluation system comprising:
a storage unit that stores dialogue information for dialogue with a user;
a dialogue unit that executes dialogue processing with the user based on the dialogue information stored in the storage unit;
an input receiving unit that receives an input from the user in the dialogue processing; and
an evaluation unit that calculates a comfort level related to the dialogue with the user based on the mode of the input received by the input receiving unit.

A dialogue evaluation method comprising:
a step of executing dialogue processing with a user based on dialogue information for dialogue with the user;
a step of receiving an input from the user in the dialogue processing; and
a step of calculating a comfort level related to the dialogue with the user based on the mode of the received input.

A dialogue evaluation program executed on a computer,
the dialogue evaluation program causing the computer to execute processing comprising:
a step of executing dialogue processing with a user based on dialogue information for dialogue with the user;
a step of receiving an input from the user in the dialogue processing; and
a step of calculating a comfort level related to the dialogue with the user based on the mode of the received input.