JP2010157081A

JP2010157081A - Response generation device and program

Info

Publication number: JP2010157081A
Application number: JP2008334824A
Authority: JP
Inventors: Kazuya Shimooka; 和也下岡; Yusuke Nakano; 雄介中野; Katsuji Yamashita; 勝司山下
Original assignee: Toyota Motor Corp; Toyota Central R&D Labs Inc
Current assignee: Toyota Motor Corp; Toyota Central R&D Labs Inc
Priority date: 2008-12-26
Filing date: 2008-12-26
Publication date: 2010-07-15
Anticipated expiration: 2028-12-26
Also published as: JP5195414B2

Abstract

PROBLEM TO BE SOLVED: To carry on a natural dialogue by generating an appropriate response sentence that corresponds to the content input by a user.

SOLUTION: A first user utterance input from a microphone 12 is speech-recognized and morphologically analyzed. Based on the analysis result, a situation determination unit 22 determines whether the first user utterance includes a situation. When it does, an emotion polarity estimation unit 24 estimates the emotion polarity expressed by the first user utterance, and a question generation unit 26 generates and outputs a question sentence asking the user about his or her emotion. An emotion polarity extraction unit 28 extracts an emotion polarity from the analysis result of a second user utterance made in reply to the question sentence, and a polarity match determination unit 30 determines whether the estimated emotion polarity and the extracted emotion polarity match. When the polarities match, a response generation unit 32 generates a response sentence indicating "agreement". When they do not match, it generates a response sentence indicating "surprise", or a fixed response sentence is selected from among back-channel responses and responses that prompt the user to continue.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a response generation device and program, and more particularly to a response generation device and program for carrying on a smooth dialogue with a user.

Conventionally, a response generation device has been proposed that extracts a concept from an input user utterance, generates a plurality of response sentences using the extracted concept or its related words, and then, based on a "topic richness" and an "emotion" predetermined for each concept, determines and outputs a high-priority response sentence from among the generated response sentences (see, for example, Patent Document 1).
Patent Document 1: JP 2007-219149 A

However, because the response generation device of Patent Document 1 generates response sentences using the concept contained in the user utterance or its related words, it generates the same response sentence whenever the utterance content is the same, even if the utterances were made with different intentions. As a result, a natural dialogue may not be possible. For example, in both Dialogue Example 1 and Dialogue Example 2 below, the device of Patent Document 1 would generate a response sentence such as "So it was fun, huh" (「楽しかったんだぁ」) based on user utterance 2, "It was fun" (「楽しかったよ」).

(Dialogue Example 1)
User utterance 1: I went to an amusement park. (遊園地に行ったよ。)
System response: How was it? (どうだった？)
User utterance 2: It was fun. (楽しかったよ。)
(Dialogue Example 2)
User utterance 1: It was an athletic meet in the rain. (雨の中の運動会だったよ。)
System response: How was it? (どうだった？)
User utterance 2: It was fun. (楽しかったよ。)

The present invention has been made to solve the above problem, and its object is to provide a response generation device and program that can generate an appropriate response sentence corresponding to the content input by the user and thereby carry on a natural dialogue.

To achieve the above object, a response generation device according to the present invention includes: input means for inputting an input sentence from a user; situation determination means for treating an input sentence input through the input means before a question sentence is output as a first input sentence and determining, based on an analysis result obtained by analyzing the structure of the first input sentence, whether the first input sentence includes a word, or a combination of words, that represents a situation; first control means for performing control so that, when the situation determination means determines that the first input sentence includes a situation, a question sentence prepared in advance for asking the user about his or her emotion is output; estimation means for estimating, when the situation determination means determines that the first input sentence includes a situation, the emotion polarity expressed by the first input sentence based on the analysis result of the first input sentence; extraction means for treating an input sentence that the user inputs through the input means in reply to the question sentence output under the control of the first control means as a second input sentence and extracting the emotion polarity of the second input sentence from an analysis result obtained by analyzing the structure of the second input sentence; and second control means for performing control so that a first response sentence prepared in advance is generated and output when the emotion polarity estimated by the estimation means matches the emotion polarity extracted by the extraction means, and a second response sentence prepared in advance is generated and output when they do not match.

A response generation program according to the present invention causes a computer to function as: situation determination means for treating an input sentence input, before a question sentence is output, through input means for inputting an input sentence from a user as a first input sentence and determining, based on an analysis result obtained by analyzing the structure of the first input sentence, whether the first input sentence includes a word, or a combination of words, that represents a situation; first control means for performing control so that, when the situation determination means determines that the first input sentence includes a situation, a question sentence prepared in advance for asking the user about his or her emotion is output; estimation means for estimating, when the situation determination means determines that the first input sentence includes a situation, the emotion polarity expressed by the first input sentence based on the analysis result of the first input sentence; extraction means for treating an input sentence that the user inputs through the input means in reply to the question sentence output under the control of the first control means as a second input sentence and extracting the emotion polarity of the second input sentence from an analysis result obtained by analyzing the structure of the second input sentence; and second control means for performing control so that a first response sentence prepared in advance is generated and output when the emotion polarity estimated by the estimation means matches the emotion polarity extracted by the extraction means, and a second response sentence prepared in advance is generated and output when they do not match.

According to the response generation device and program of the present invention, the situation determination means treats an input sentence input, before a question sentence is output, through the input means for inputting an input sentence from the user as a first input sentence and determines, based on an analysis result obtained by analyzing the structure of the first input sentence, whether the first input sentence includes a word, or a combination of words, that represents a situation. A "situation" is an action, event, or occurrence with which some emotion is associated.

When the situation determination means determines that the first input sentence includes a situation, the first control means performs control so that a question sentence prepared in advance for asking the user about his or her emotion is output. The input sentence that the user inputs through the input means in reply to this question sentence is treated as a second input sentence, and the extraction means extracts the emotion polarity of the second input sentence from an analysis result obtained by analyzing its structure. In addition, when the situation determination means determines that the first input sentence includes a situation, the estimation means estimates the emotion polarity expressed by the first input sentence based on the analysis result of the first input sentence. The second control means then performs control so that a first response sentence prepared in advance is generated and output when the emotion polarity estimated by the estimation means matches the emotion polarity extracted by the extraction means, and a second response sentence prepared in advance is generated and output when they do not match.

Because the response sentence can thus be varied depending on whether the emotion polarity estimated from the first input sentence matches the emotion polarity extracted from the second input sentence, an appropriate response sentence corresponding to the content input by the user can be generated and a natural dialogue can be carried on.

The first response sentence may be a response sentence that indicates agreement with the second input sentence, and the second response sentence may be a response sentence that indicates surprise at the second input sentence, a response sentence that prompts the user to input a further input sentence, or a back-channel response (aizuchi). A natural dialogue can then be carried on by outputting a response sentence indicating agreement when the emotion polarity estimated from the first input sentence matches the emotion polarity extracted from the second input sentence, and a response sentence indicating surprise when the polarities do not match. Alternatively, when the polarities do not match, the device can assume that a speech recognition error or a user input error may have occurred. Rather than generating a response sentence based on an analysis result that may have been misrecognized, it then outputs a response sentence that prompts the user for input or a back-channel response, so that the dialogue can proceed without breaking down.

The response generation device of the present invention may further include third control means for performing control so that, when the situation determination means determines that the first input sentence does not include a situation, a response sentence that prompts the user to input an input sentence, or a back-channel response, is output. When the first input sentence does not include a situation, the emotion polarity cannot be estimated. Outputting a response sentence that prompts the user for input, or a back-channel response, therefore allows the dialogue to proceed without breaking down.

As described above, the response generation device and program of the present invention have the effect that an appropriate response sentence corresponding to the content input by the user can be generated and a natural dialogue can be carried on.

Embodiments of the present invention are described in detail below with reference to the drawings. The present embodiment describes a case in which the present invention is applied to a response generation device that takes an utterance from a user as input, executes predetermined processing, and outputs speech.

As shown in FIG. 1, the response generation device 10 according to the first embodiment includes a microphone 12 that picks up user utterances and generates an audio signal, a speaker 14 that outputs audio, and a computer 16 that is connected to the microphone 12 and the speaker 14 and executes predetermined processing for generating an appropriate response sentence.

The computer 16 includes a CPU that controls the response generation device 10 as a whole, a ROM serving as a storage medium that stores various programs including the response generation program described later, a RAM that temporarily stores data as a work area, an HDD serving as storage means in which various information is stored, an I/O (input/output) port, and a bus that connects these components. The microphone 12 and the speaker 14 are connected to the I/O port.

When the computer 16 is described in terms of functional blocks, divided by the function-realizing means determined by its hardware and software, it can be represented, as shown in FIG. 1, by a configuration that includes: a language analysis unit 20 that performs speech recognition on the audio signal input from the microphone 12 and then uses a general morphological analyzer to perform morphological analysis on the character string information representing the recognized user utterance; a situation determination unit 22 that determines, based on the analysis result of the language analysis unit 20, whether the user utterance includes a situation; an emotion polarity estimation unit 24 that estimates the emotion polarity expressed by the user utterance when the utterance includes a situation; a question generation unit 26 that generates a question sentence asking the user about his or her emotion when the utterance includes a situation; an emotion polarity extraction unit 28 that extracts an emotion polarity from the result of analyzing, in the language analysis unit 20, the user utterance made in reply to the question sentence; a polarity match determination unit 30 that determines whether the emotion polarity estimated by the emotion polarity estimation unit 24 matches the emotion polarity extracted by the emotion polarity extraction unit 28; a response generation unit 32 that generates different response sentences depending on the determination result of the polarity match determination unit 30; a fixed response unit 34 that selects a fixed response sentence when the user utterance does not include a situation; and an output unit 36 that converts the question or response sentence generated or selected by the question generation unit 26, the response generation unit 32, or the fixed response unit 34 into an audio signal and outputs it from the speaker 14.

In the present embodiment, as described later, the user utterance input before the question sentence is output must be distinguished from the user utterance input after it is output. The former is therefore called the "first user utterance" and the latter the "second user utterance".

The situation determination unit 22 determines whether the result of analyzing the first user utterance in the language analysis unit 20 includes a word, or a combination of words, that indicates a situation. A "situation" is an action, event, or occurrence with which some emotion is associated. Here, a "situation" is determined to be included when the analysis result contains (i) a verb, or (ii) a situational noun followed by "だ" (the copula "da"). A "situational noun" is a noun that denotes an occurrence, such as "雨" (rain) or "運動会" (athletic meet). A situational noun dictionary such as the one shown in FIG. 2 is created in advance, and the determination is made by referring to this dictionary.
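The two conditions above can be illustrated with a minimal Python sketch. It assumes that the morphological analyzer returns a surface form, a base form, and a part of speech for each morpheme. The `Morpheme` type, the part-of-speech labels, the use of the base form to detect "だ" in inflected forms such as "だった", and the two-entry `SITUATIONAL_NOUNS` set are all hypothetical stand-ins, not the dictionary of FIG. 2 or the analyzer actually used.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Morpheme:
    surface: str  # form as it appears in the utterance
    base: str     # dictionary (base) form
    pos: str      # part of speech, e.g. "noun", "verb", "auxiliary"


# Hypothetical stand-in for the situational noun dictionary of FIG. 2.
SITUATIONAL_NOUNS = {"雨", "運動会"}


def contains_situation(morphemes: List[Morpheme]) -> bool:
    """Return True if the utterance has a verb or 'situational noun + だ'."""
    for i, m in enumerate(morphemes):
        # Condition (i): any verb counts as a situation.
        if m.pos == "verb":
            return True
        # Condition (ii): a situational noun immediately followed by "だ".
        if m.pos == "noun" and m.surface in SITUATIONAL_NOUNS:
            nxt = morphemes[i + 1] if i + 1 < len(morphemes) else None
            if nxt is not None and nxt.base == "だ":
                return True
    return False


# 「雨の中の運動会だったよ」, segmented by hand for illustration.
athletic_meet = [
    Morpheme("雨", "雨", "noun"), Morpheme("の", "の", "particle"),
    Morpheme("中", "中", "noun"), Morpheme("の", "の", "particle"),
    Morpheme("運動会", "運動会", "noun"), Morpheme("だっ", "だ", "auxiliary"),
    Morpheme("た", "た", "auxiliary"), Morpheme("よ", "よ", "particle"),
]
# 「机だ」: "机" (desk) is not a situational noun.
desk = [Morpheme("机", "机", "noun"), Morpheme("だ", "だ", "auxiliary")]

print(contains_situation(athletic_meet), contains_situation(desk))
```

In this sketch "雨" alone does not satisfy condition (ii) because it is followed by "の", whereas "運動会" is followed by a form of "だ".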

When the situation determination unit 22 determines that the first user utterance includes a situation, the emotion polarity estimation unit 24 estimates whether the emotion polarity expressed by the first user utterance is positive or negative. The emotion polarity is estimated, for example, by training on learning data of known emotion polarity with the SVM (Support Vector Machine) technique to build an emotion polarity model, and then comparing this model with the analysis result of the first user utterance.
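A trained SVM with a linear kernel classifies by the sign of a weighted sum of features. The sketch below shows only that final decision step, as a rough illustration of what "comparing the model with the analysis result" could look like. It does not train anything, and the assumption of a linear kernel, the bag-of-words features, the `WEIGHTS` and `BIAS` values, and the function name are all hypothetical, not the emotion polarity model of the embodiment.

```python
from typing import Dict, Iterable

# Hypothetical weights, standing in for what a linear SVM might learn
# from utterances labeled positive (+) or negative (-).
WEIGHTS: Dict[str, float] = {"雨": -1.2, "遊園地": 1.5, "運動会": 0.4}
BIAS = 0.0


def estimate_polarity(base_forms: Iterable[str]) -> str:
    """Classify by the sign of the linear decision value w.x + b."""
    score = BIAS + sum(WEIGHTS.get(word, 0.0) for word in base_forms)
    return "positive" if score >= 0 else "negative"


# 「雨の中の運動会だったよ」 as base forms: -1.2 + 0.4 < 0
print(estimate_polarity(["雨", "の", "中", "の", "運動会", "だ", "た", "よ"]))
# 「遊園地に行ったよ」 as base forms: 1.5 >= 0
print(estimate_polarity(["遊園地", "に", "行く", "た", "よ"]))
```

With these illustrative weights the first utterance is classified as negative and the second as positive, which matches the expectations assumed in the dialogue examples.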

When the situation determination unit 22 determines that the first user utterance includes a situation, the question generation unit 26 generates a question sentence asking the user about his or her emotion. Basic question sentences such as "What do you think?" (「どう思う？」) and "How does it feel?" (「どんな感じ？」) are created in advance. One of them is selected, and its tense and register (for example, polite or casual) are adjusted based on the analysis result of the user utterance to generate the question sentence. Alternatively, question sentence examples such as those shown in FIG. 3 may be created in advance and one of them selected at random. The generated question sentence is converted into an audio signal by the output unit 36 and output from the speaker 14.
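As a rough sketch of the tense adjustment described above, the following Python fragment stores a present-tense and a past-tense variant of each basic question and picks the variant that matches the first utterance. The variant table, the rule that the base form "た" marks past tense, and the function name are assumptions made for illustration. Register adjustment is omitted.

```python
import random
from typing import Dict, List, Optional

# Basic questions from the text, each with a hypothetical past-tense variant.
QUESTION_VARIANTS: List[Dict[str, str]] = [
    {"present": "どう思う？", "past": "どう思った？"},
    {"present": "どんな感じ？", "past": "どんな感じだった？"},
]


def generate_question(base_forms: List[str],
                      rng: Optional[random.Random] = None) -> str:
    """Pick a basic question at random and match the utterance's tense."""
    rng = rng or random.Random()
    # Assumption: the past auxiliary "た" in the analysis result means
    # the user is talking about something that already happened.
    tense = "past" if "た" in base_forms else "present"
    return rng.choice(QUESTION_VARIANTS)[tense]


# 「雨の中の運動会だったよ」 contains "た", so a past-tense question results.
print(generate_question(["雨", "の", "中", "の", "運動会", "だ", "た", "よ"]))
```

Passing a seeded `random.Random` makes the selection reproducible, which is convenient when testing dialogue flows.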

The emotion polarity extraction unit 28 extracts an emotion polarity from the result of analyzing, in the language analysis unit 20, the second user utterance input after the question sentence generated by the question generation unit 26 has been output. Because that question sentence asks about the user's emotion, the second user utterance is expected to contain a word that directly expresses an emotion. The emotion polarity of such a word in the second user utterance is therefore extracted by referring to an emotion polarity dictionary such as the one shown in FIG. 4. The emotion polarity dictionary is created in advance, for example by the technique described in the non-patent document "Emotion polarity extraction of words by spin model" (「スピンモデルによる単語の感情極性抽出」, Takamura et al., IPSJ Journal, vol. 47, no. 2, pp. 627-637, 2006).
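The dictionary lookup described above can be sketched in a few lines of Python. The four-entry `EMOTION_POLARITY` table is a hypothetical stand-in for the dictionary of FIG. 4, and looking up base forms and returning the first emotion word found are assumptions of this sketch.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical stand-in for the emotion polarity dictionary of FIG. 4.
EMOTION_POLARITY: Dict[str, str] = {
    "楽しい": "positive",
    "嬉しい": "positive",
    "悲惨": "negative",
    "悲しい": "negative",
}


def extract_polarity(base_forms: Iterable[str]) -> Optional[Tuple[str, str]]:
    """Return (emotion word, polarity) for the first emotion word found."""
    for word in base_forms:
        if word in EMOTION_POLARITY:
            return word, EMOTION_POLARITY[word]
    return None  # the utterance contains no known emotion word


# 「楽しかったよ」 as base forms
print(extract_polarity(["楽しい", "た", "よ"]))
# 「机だ」 as base forms: no emotion word
print(extract_polarity(["机", "だ"]))
```

Returning `None` when no emotion word is found leaves the caller free to fall back to a fixed response.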

When the polarity match determination unit 30 determines that the emotion polarity estimated by the emotion polarity estimation unit 24 matches the emotion polarity extracted by the emotion polarity extraction unit 28, the response generation unit 32 generates a response sentence with a nuance of "agreement". When the polarities are determined not to match, it generates a response sentence with a nuance of "surprise". The generated response sentence is converted into an audio signal by the output unit 36 and output from the speaker 14.

For response sentences with a nuance of "agreement", response sentence examples and response sentence formats such as those shown in FIG. 5 are prepared in advance, and a response is either selected at random from the examples or generated from a format. A format may be, for example, "I knew it was (user's emotion)" (「やっぱり（ユーザの感情）だよねぇ」), and a response sentence is generated by inserting the emotion word extracted from the second user utterance into the "(user's emotion)" slot.

For response sentences with a nuance of "surprise", response sentence examples and response sentence formats such as those shown in FIG. 6 are prepared in advance, and a response is either selected at random from the examples or generated from a format. A format may be, for example, "What? It was (user's emotion)?" (「え？（ユーザの感情）の」), and a response sentence is generated by inserting the emotion word extracted from the second user utterance into the "(user's emotion)" slot. A response sentence selected or generated in this way may also be combined with a response sentence that asks for the reason, such as "Why was it (user's emotion)?" (「なんで（ユーザの感情）の？」).
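The branching on polarity match and the slot filling for both kinds of response can be sketched as follows. The format strings follow the examples quoted in the text, with `{emotion}` standing for the "(user's emotion)" slot. The trailing "？" on the surprise format, the function name, and the `ask_reason` flag are assumptions of this sketch, and random selection among several formats is omitted.

```python
AGREEMENT_FORMAT = "やっぱり{emotion}だよねぇ"
SURPRISE_FORMAT = "え？{emotion}の？"   # trailing "？" is an assumption
REASON_FORMAT = "なんで{emotion}の？"


def generate_response(estimated: str, extracted: str,
                      emotion_phrase: str, ask_reason: bool = False) -> str:
    """Fill an agreement or surprise format with the user's emotion phrase."""
    if estimated == extracted:
        # Polarities match: the answer was as expected, so agree.
        return AGREEMENT_FORMAT.format(emotion=emotion_phrase)
    # Polarities differ: the answer was unexpected, so express surprise.
    response = SURPRISE_FORMAT.format(emotion=emotion_phrase)
    if ask_reason:
        # Optionally append a follow-up asking for the reason.
        response += REASON_FORMAT.format(emotion=emotion_phrase)
    return response


print(generate_response("negative", "positive", "楽しかった"))
print(generate_response("negative", "negative", "悲惨"))
```

The emotion phrase is passed in as it should appear in the output, so any inflection needed to make the filled sentence grammatical is left to the caller.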

When the situation determination unit 22 determines that the first user utterance does not include a situation, the fixed response unit 34 selects a fixed response sentence at random from among predetermined fixed response sentences. The selected fixed response sentence is converted into an audio signal by the output unit 36 and output from the speaker 14. As shown in FIG. 7, for example, the fixed response sentences are determined in advance and consist of response sentences that prompt the user to speak and back-channel responses.

Next, the response generation processing routine of the response generation device 10 of the first embodiment is described with reference to FIG. 8. This routine is carried out by the CPU executing the response generation program stored in the ROM.

In step 100, it is determined whether a first user utterance has been input from the microphone 12. If one has been input, the routine proceeds to step 102. If not, the determination of this step is repeated until one is input. Here, the first user utterance "It was an athletic meet in the rain" (「雨の中の運動会だったよ」) is assumed to have been input.

In step 102, speech recognition is performed on the audio signal representing the input first user utterance to obtain character string information, and morphological analysis is performed on this character string information.

Next, in step 104, it is determined, based on the result of the morphological analysis, whether the first user utterance includes a "situation". If it does, the routine proceeds to step 106. If it does not, the routine proceeds to step 122. Here, the utterance contains "運動会" (athletic meet), which is defined as a situational noun in the situational noun dictionary shown in FIG. 2, followed by "だ". Condition (ii), a situational noun followed by "だ", is therefore satisfied, a "situation" is determined to be included, and the routine proceeds to step 106.

In step 106, the emotion polarity of the first user utterance is estimated based on the analysis result and the predetermined emotion polarity model. Here, the emotion polarity is assumed to be estimated as "negative".

Next, in step 108, a question sentence for asking the user about his or her emotion is generated, converted into an audio signal, and output. Here, "What did you think?" (「どう思った？」) is assumed to be selected from the question sentence examples shown in FIG. 3 and output.

Next, in step 110, it is determined whether a second user utterance has been input from the microphone 12. If one has been input, the routine proceeds to step 112, where speech recognition is performed on the audio signal representing the input user utterance to obtain character string information, and morphological analysis is performed on this character string information. If no utterance has been input, the determination of step 110 is repeated until one is input. Here, the second user utterance "It was fun" (「楽しかったよ」) is assumed to have been input.

Next, in step 114, the emotion polarity expressed by the second user utterance is extracted based on the analysis result and the emotion polarity dictionary shown in FIG. 4. Here, the emotion polarity "positive" is extracted, by reference to the emotion polarity dictionary, from "楽しかった" (was fun) in the analysis result.

Next, in step 116, it is determined whether the emotion polarity estimated in step 106 matches the emotion polarity extracted in step 114. If they match, the routine proceeds to step 118. If they do not match, the routine proceeds to step 120. Here, the emotion polarity estimated in step 106 is "negative" and the emotion polarity extracted in step 114 is "positive". They do not match, so the determination is negative and the routine proceeds to step 120.

In step 120, a response sentence with a nuance of "surprise" is generated. For example, the format "What? It was (user's emotion)?" (「え？（ユーザの感情）の」) is selected from the response sentence examples and formats shown in FIG. 6, and the emotion word extracted from the second user utterance is inserted into the "(user's emotion)" slot to generate a response sentence such as "What? It was fun?" (「え？楽しかったの？」).

If, instead, the second user utterance "It was miserable" (「悲惨だったよ」) is input in step 110, the emotion polarity "negative" is extracted based on the analysis result of step 112 and the emotion polarity dictionary, the emotion polarities are determined to match in step 116, and the routine proceeds to step 118. In step 118, a response sentence with a nuance of "agreement" is generated. For example, the response sentence example "Of course it was" (「そりゃそうだよね」) is selected from the response sentence examples and formats shown in FIG. 5.

If, for example, the first user utterance is "It's a desk" (「机だ」), it is determined in step 104 that the first user utterance does not include a "situation", and the routine proceeds to step 122. In step 122, a fixed response such as "Oh, and then?" (「へー、それで」) is selected from the fixed response sentence examples shown in FIG. 7.

Next, in step 124, the response sentence generated or selected in step 118, step 120, or step 122 is converted into an audio signal and output, and the process ends.

Note that the user utterance judged to have been input in step 110 is treated as the second user utterance because it is input after the question sentence is output in step 108, but this utterance does not necessarily contain an emotion. Therefore, if the emotion polarity cannot be extracted from the second user utterance in step 114, the second user utterance may be regarded as a first user utterance and the process may return to step 100, or the process may proceed to step 122 to select and output a fixed-form response sentence.
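The branching after step 114, including the two alternatives for an utterance with no extractable polarity, can be sketched as below. The function name `next_step` and the `on_missing` parameter are hypothetical; the step numbers are those used in the description.

```python
from typing import Optional


def next_step(estimated: str, extracted: Optional[str], on_missing: str = "fixed") -> int:
    """Return the step number that follows the polarity comparison of step 116.

    extracted is None when no emotion polarity could be extracted in step 114.
    on_missing selects between the two alternatives given in the description:
    "restart" treats the utterance as a new first utterance (step 100),
    anything else selects a fixed-form response (step 122).
    """
    if extracted is None:
        return 100 if on_missing == "restart" else 122
    # Match leads to "agreement" (step 118), mismatch to "surprise" (step 120).
    return 118 if extracted == estimated else 120


print(next_step("negative", "positive"))
print(next_step("negative", None, on_missing="restart"))
```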

As described above, the response generation device of the first embodiment estimates the emotion polarity from the first user utterance and determines whether the estimated emotion polarity matches the emotion polarity extracted from the second user utterance. Consequently, even for the same second user utterance, a response sentence indicating "agreement" is generated when the emotion polarities match, and a response sentence indicating "surprise" is generated when they do not. In other words, by estimating in advance the emotion polarity expressed by the first user utterance, the response can be varied depending on whether the content of the second user utterance is as expected or contrary to expectation, so a natural dialogue can be conducted.

In the first embodiment, the case has been described in which the emotion polarity expressed by the first user utterance is estimated by comparison with an emotion polarity model generated in advance from learning data, but the invention is not limited to this method. For example, a word-emotion polarity dictionary that defines an emotion polarity for each word may be created in advance, the emotion polarity of each word in the first user utterance may be obtained by referring to this dictionary, and the emotion polarity may be estimated according to rules based on the number of words in the first user utterance whose emotion polarity is "positive", the number of words whose emotion polarity is "negative", whether a negation expression is present, and so on. Nor is the estimation limited to deriving the emotion polarity directly from the first user utterance: the emotion expressed by the first user utterance may be estimated first, and the emotion polarity may then be estimated from the estimated emotion by referring to the emotion polarity dictionary shown in FIG. 4. As with the estimation of emotion polarity, the emotion can be estimated using well-known techniques, such as comparison with a plurality of emotion models generated in advance from learning data for each emotion.
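The dictionary-and-rule alternative can be illustrated with a small sketch. The description does not specify the rules, so the ones used here (majority count, with the result flipped when a negation word is present) are assumptions, as are the dictionary entries, the English tokens, and the names `WORD_POLARITY`, `NEGATIONS`, and `estimate_polarity`. The tokens are assumed to come from a prior morphological analysis.

```python
from typing import List, Optional

# Hypothetical word-emotion polarity dictionary.
WORD_POLARITY = {
    "fun": "positive",
    "happy": "positive",
    "sad": "negative",
    "miserable": "negative",
}
# Hypothetical negation expressions.
NEGATIONS = {"not", "never"}


def estimate_polarity(tokens: List[str]) -> Optional[str]:
    """Estimate polarity from word counts; return None when no polarity dominates."""
    positive = sum(1 for t in tokens if WORD_POLARITY.get(t) == "positive")
    negative = sum(1 for t in tokens if WORD_POLARITY.get(t) == "negative")
    if positive == negative:
        return None
    polarity = "positive" if positive > negative else "negative"
    # Assumed rule: a negation expression flips the majority polarity.
    if any(t in NEGATIONS for t in tokens):
        polarity = "negative" if polarity == "positive" else "positive"
    return polarity


print(estimate_polarity(["it", "was", "not", "fun"]))
```

A rule set like this is easy to inspect and adjust, whereas the model-based estimation of the first embodiment depends on the learning data used to build the model.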

Next, a second embodiment will be described. The second embodiment differs from the first embodiment in that, when the emotion polarities of the first user utterance and the second user utterance do not match, a fixed-form response sentence is selected instead of generating a response sentence indicating "surprise". Configurations and processes that are the same as in the first embodiment are given the same reference numerals, and their description is omitted.

The configuration of the response generation device 210 according to the second embodiment is the same as that of the response generation device 10 according to the first embodiment shown in FIG. 1. In the response generation device 210 of the second embodiment, the response generation unit 132 generates a response sentence carrying a nuance of "agreement" when the polarity match determination unit 30 determines that the emotion polarity estimated by the emotion polarity estimation unit 24 matches the emotion polarity extracted by the emotion polarity extraction unit 28, and selects a fixed-form response sentence when it is determined that they do not match. The generated or selected response sentence is converted into an audio signal by the output unit 36 and output from the speaker 14. Generation of the response sentence carrying a nuance of "agreement" is the same as in the first embodiment. Selection of the fixed-form response sentence follows the processing of the fixed response unit 34 of the first embodiment; for example, a response sentence is selected at random from fixed-form response sentence examples such as those shown in FIG. 7.

Next, the response generation processing routine of the response generation device 210 of the second embodiment will be described with reference to FIG. 9. This routine is carried out by the CPU executing a response generation program stored in the ROM.

In step 100, it is determined whether a first user utterance has been input from the microphone 12. If it has, the process proceeds to step 102, where the audio signal representing the input first user utterance is converted into character string information by speech recognition, and morphological analysis is performed on this character string information. If no utterance has been input, the determination of this step is repeated until one is input. Here, it is assumed that the first user utterance "I went to an amusement park" has been input.

Next, in step 104, it is determined, based on the result of the morphological analysis, whether the first user utterance includes a "situation". If a "situation" is included, the process proceeds to step 106; if not, the process proceeds to step 122. Here, because the verb "go" is included, it is determined that a "situation" is included, and the process proceeds to step 106.
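A rough sketch of the determination in step 104 is given below. It assumes that the morphological analysis yields (base form, part of speech) pairs; the hand-written pairs, the simplified part-of-speech tags, the entries of `SITUATION_NOUNS`, and the name `has_situation` are all hypothetical. Treating a noun listed in a situation noun dictionary as a "situation" is also an assumption of this sketch; the passage above only states that a verb counts.

```python
from typing import List, Tuple

# Hypothetical entries standing in for a situation noun dictionary.
SITUATION_NOUNS = {"旅行", "試験"}


def has_situation(morphemes: List[Tuple[str, str]]) -> bool:
    """Return True if the analyzed utterance contains a verb or a listed situation noun."""
    for base_form, pos in morphemes:
        if pos == "verb":
            return True
        if pos == "noun" and base_form in SITUATION_NOUNS:
            return True
    return False


# "遊園地に行ったよ" (I went to an amusement park), analyzed by hand.
went = [("遊園地", "noun"), ("に", "particle"), ("行く", "verb"), ("た", "aux"), ("よ", "particle")]
# "机だ" (It's a desk), analyzed by hand.
desk = [("机", "noun"), ("だ", "aux")]

print(has_situation(went), has_situation(desk))
```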

In step 106, the emotion polarity of the first user utterance is estimated based on the analysis result and a predetermined emotion polarity model. Here, it is assumed that the emotion polarity is estimated to be "positive". Next, in step 108, a response sentence for asking the user about their emotion, for example "How was it?", is selected from the question sentence examples shown in FIG. 3 and output.

Next, in step 110, it is determined whether a second user utterance has been input from the microphone 12. If it has, the process proceeds to step 112, where the audio signal representing the input second user utterance is converted into character string information by speech recognition, and morphological analysis is performed on this character string information. If no utterance has been input, the determination of this step is repeated until one is input. Here, it is assumed that the second user utterance "It was fun" has been input and has been misrecognized as "It was sad" by the speech recognition.

Next, in step 114, the emotion polarity of the second user utterance is extracted based on the analysis result and the emotion polarity dictionary shown in FIG. 4. Here, the emotion polarity "negative" is extracted by referring to the emotion polarity dictionary on the basis of the misrecognized analysis result "It was sad".

Next, in step 116, it is determined whether the emotion polarity estimated in step 106 matches the emotion polarity extracted in step 114. If they match, the process proceeds to step 118. If they do not match, it is assumed that there was an error in the analysis result, including the speech recognition, of the first user utterance, and the process proceeds to step 122. Here, the emotion polarity estimated in step 106 is "positive" and the emotion polarity extracted in step 114 is "negative", so they do not match; the determination is negative and the process proceeds to step 122.

In step 122, a response such as "Is that so?" is selected from the fixed-form response sentence examples shown in FIG. 7. Then, in step 124, the selected response sentence is converted into an audio signal and output, and the process ends.
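The second-embodiment flow from step 114 to step 122 can be sketched end to end as follows. All tables are hypothetical English stand-ins for the emotion polarity dictionary of FIG. 4 and the response examples of the figures, and the names `EMOTION_POLARITY`, `AGREE_RESPONSES`, `FIXED_RESPONSES`, and `respond` are not from the source. Falling back to a fixed-form response when no polarity can be extracted is one of the alternatives mentioned for the first embodiment and is assumed here.

```python
import random
from typing import List

# Hypothetical stand-ins for the emotion polarity dictionary and response examples.
EMOTION_POLARITY = {"fun": "positive", "sad": "negative"}
AGREE_RESPONSES = ["That makes sense."]
FIXED_RESPONSES = ["Is that so?", "Oh, and then?"]


def respond(estimated: str, recognized_tokens: List[str], rng: random.Random) -> str:
    """Pick an "agreement" response on a polarity match, otherwise a random fixed-form response."""
    # Step 114: extract the polarity of the first emotion word found, if any.
    extracted = next(
        (EMOTION_POLARITY[t] for t in recognized_tokens if t in EMOTION_POLARITY),
        None,
    )
    # Step 116: compare with the polarity estimated from the first utterance.
    if extracted is not None and extracted == estimated:
        return rng.choice(AGREE_RESPONSES)
    # Step 122: a mismatch is treated as a possible recognition error.
    return rng.choice(FIXED_RESPONSES)


rng = random.Random(0)
# "It was fun" misrecognized as "It was sad": the reply stays neutral.
print(respond("positive", ["it", "was", "sad"], rng))
```

Because the fixed-form responses commit to neither polarity, a misrecognized "sad" does not produce a reply that contradicts what the user actually said.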

As described above, the response generation device of the second embodiment estimates the emotion polarity expressed by the first user utterance and determines whether the estimated emotion polarity matches the emotion polarity extracted from the second user utterance. If they do not match, it assumes, for example, that there was an error in the speech recognition, and can output a fixed-form response sentence such as a backchannel response or a response that prompts the user to speak. This prevents an inappropriate response sentence from being generated on the basis of a misrecognized analysis result and allows the dialogue to proceed without breaking down.

In the first and second embodiments above, the case of audio output through a speaker has been described as an example, but the invention is not limited to this, and the response sentence may instead be shown on a display. Likewise, the case where the user's speech is input through a microphone has been described as an example, but the user may instead enter text as the input sentence using a keyboard or the like.

A block diagram showing the schematic configuration of the response generation device according to the present embodiment.
A diagram showing an example of a situation noun dictionary.
A diagram showing an example of question sentence examples.
A diagram showing an example of an emotion polarity dictionary.
A diagram showing an example of response sentence examples and response sentence formats for response sentences indicating "agreement".
A diagram showing an example of response sentence examples and response sentence formats for response sentences indicating "surprise".
A diagram showing an example of response sentence examples for fixed-form response sentences.
A flowchart showing the response generation processing routine of the first embodiment.
A flowchart showing the response generation processing routine of the second embodiment.

Explanation of symbols

10, 210 Response generation device
12 Microphone
14 Speaker
16 Computer
20 Language analysis unit
22 Situation determination unit
24 Emotion polarity estimation unit
26 Question generation unit
28 Emotion polarity extraction unit
30 Polarity match determination unit
32, 132 Response generation unit
34 Fixed response unit
36 Output unit

Claims

A response generation device comprising:
input means for inputting an input sentence from a user;
situation determination means for taking, as a first input sentence, an input sentence input by the input means before a question sentence is output, and determining, based on an analysis result obtained by analyzing the structure of the first input sentence, whether the first input sentence includes a word, or a combination of words, indicating a situation;
first control means for controlling so that a question sentence prepared in advance for asking the user about an emotion is output when the situation determination means determines that the first input sentence includes a situation;
estimation means for estimating the emotion polarity expressed by the first input sentence, based on the analysis result of the first input sentence, when the situation determination means determines that the first input sentence includes a situation;
extraction means for taking, as a second input sentence, an input sentence input by the user through the input means in response to the question sentence output under the control of the first control means, and extracting the emotion polarity of the second input sentence from an analysis result obtained by analyzing the structure of the second input sentence; and
second control means for controlling so that a first response sentence prepared in advance is generated and output when the emotion polarity estimated by the estimation means matches the emotion polarity extracted by the extraction means, and a second response sentence prepared in advance is generated and output when they do not match.

The response generation device according to claim 1, wherein the first response sentence is a response sentence indicating agreement with the second input sentence, and the second response sentence is a response sentence indicating surprise at the second input sentence, a response sentence prompting the user to input an input sentence, or a backchannel response sentence.

The response generation device according to claim 1, further comprising third control means for controlling so that a response sentence prompting the user to input an input sentence, or a backchannel response sentence, is output when the situation determination means determines that the first input sentence does not include a situation.

A response generation program for causing a computer to function as:
situation determination means for taking, as a first input sentence, an input sentence input, before a question sentence is output, by input means for inputting an input sentence from a user, and determining, based on an analysis result obtained by analyzing the structure of the first input sentence, whether the first input sentence includes a word, or a combination of words, indicating a situation;
first control means for controlling so that a question sentence prepared in advance for asking the user about an emotion is output when the situation determination means determines that the first input sentence includes a situation;
estimation means for estimating the emotion polarity expressed by the first input sentence, based on the analysis result of the first input sentence, when the situation determination means determines that the first input sentence includes a situation;
extraction means for taking, as a second input sentence, an input sentence input by the user through the input means in response to the question sentence output under the control of the first control means, and extracting the emotion polarity of the second input sentence from an analysis result obtained by analyzing the structure of the second input sentence; and
second control means for controlling so that a first response sentence prepared in advance is generated and output when the emotion polarity estimated by the estimation means matches the emotion polarity extracted by the extraction means, and a second response sentence prepared in advance is generated and output when they do not match.