JP2016090891A

JP2016090891A - Response generation apparatus, response generation method, and response generation program

Info

Publication number: JP2016090891A
Application number: JP2014227215A
Authority: JP
Inventors: 生聖渡部; Seisho Watabe
Original assignee: Toyota Motor Corp
Current assignee: Toyota Motor Corp
Priority date: 2014-11-07
Filing date: 2014-11-07
Publication date: 2016-05-23
Anticipated expiration: 2034-11-07
Also published as: JP6299563B2

Abstract

PROBLEM TO BE SOLVED: To alleviate the unnaturalness in a dialogue that is caused by a uniform response pattern. SOLUTION: A method includes: recognizing a user's voice; analyzing the structure of the recognized voice; storing information in which each of a plurality of keywords is associated with an associative word and an additional word related to that keyword; extracting a noun or verb from the recognized voice; selecting a keyword that matches the extracted noun or verb; generating a repeated response sentence by adding the additional word corresponding to the selected keyword to the extracted noun or verb and replacing the noun or verb with the related word corresponding to the selected keyword; generating an optional response sentence on the basis of the analyzed voice structure; outputting the optional response sentence after outputting the repeated response sentence; extracting, from predetermined sentence set information, a first noun or verb and a second noun or verb related to the first; and asking the user whether the two extracted words are the same and, if the answer is affirmative, storing the first noun or verb as a keyword and the second noun or verb as an associative word. SELECTED DRAWING: Figure 1

Description

The present invention relates to a response generation method, a response generation apparatus, and a response generation program for responding to a user.

A known response generation apparatus includes voice recognition means for recognizing a user's voice, structure analysis means for analyzing the structure of the voice recognized by the voice recognition means, and response output means for generating a response sentence to the user's voice based on the structure analyzed by the structure analysis means and outputting the generated response sentence (see, for example, Patent Document 1).

Patent Document 1: JP 2010-157081 A

A response generation apparatus such as the one described above takes time to analyze the structure of the voice and to generate the response sentence, so the user has to wait for a response. This wait may make the dialogue feel unnatural. One conceivable remedy is to respond simply during the wait, using the user's voice recognized by the voice recognition means as a repeated response sentence. This shortens the wait and eases the unnaturalness of the dialogue, but the response pattern becomes uniform and the dialogue still feels unnatural.
The present invention has been made to solve this problem. Its main object is to provide a response generation method, a response generation apparatus, and a response generation program that can alleviate the unnaturalness of a dialogue caused by a uniform response pattern.

To achieve the above object, one aspect of the present invention is a response generation method including: a step of recognizing a user's voice; a step of analyzing the structure of the recognized voice; a step of storing information in which each of a plurality of keywords is associated with an associative word and an additional word related to that keyword; a step of extracting a noun or verb from the recognized voice; a step of generating a repeated response sentence for repeating the user's voice by selecting a keyword of the stored information that matches the extracted noun or verb, adding the additional word corresponding to the selected keyword to the extracted noun or verb, and replacing the noun or verb with the related word corresponding to the selected keyword; and a step of generating an optional response sentence to the user's voice based on the analyzed voice structure and outputting the optional response sentence after outputting the repeated response sentence. The method is characterized by further including: a step of extracting, from predetermined sentence set information, a first noun or verb and a second noun or verb related to the first noun or verb; a step of asking the user whether the extracted first noun or verb and second noun or verb are the same; and a step of storing, when the user's answer to the question is affirmative, the first noun or verb as the keyword and the second noun or verb as the associative word.
In this aspect, the first noun or verb and the second noun or verb may be extracted from the predetermined sentence set information existing on a network, and the similarity between the first noun or verb and the second noun or verb may be calculated using their co-occurrence frequency. When the calculated similarity is equal to or greater than a threshold, the user may be asked whether the extracted first noun or verb and second noun or verb are the same, and when the user's answer to the question is affirmative, the first noun or verb may be stored as the keyword and the second noun or verb as the associative word.
In this aspect, the method may further include a step of limiting, under a predetermined condition, the number of associative words and additional words associated with the stored keyword.
In this aspect, the method may further include a step of analyzing the phonemes of the user's voice and a step of generating a backchannel response to the user's voice based on the result of the phoneme analysis, and the generated backchannel response may be output before the generated repeated response sentence is output.
To achieve the above object, one aspect of the present invention may be a response generation apparatus including: voice recognition means for recognizing a user's voice; structure analysis means for analyzing the structure of the voice recognized by the voice recognition means; storage means for storing information in which each of a plurality of keywords is associated with an associative word and an additional word related to that keyword; part-of-speech extraction means for extracting a noun or verb from the voice recognized by the voice recognition means; repetition generation means for generating a repeated response sentence for repeating the user's voice by selecting a keyword in the storage means that matches the noun or verb extracted by the part-of-speech extraction means, adding the additional word corresponding to the selected keyword to the extracted noun or verb, and replacing the noun or verb with the related word corresponding to the selected keyword; and response output means for generating an optional response sentence to the user's voice based on the voice structure analyzed by the structure analysis means and outputting the optional response sentence after outputting the repeated response sentence. The apparatus is characterized by further including: extraction means for extracting, from predetermined sentence set information, a first noun or verb and a second noun or verb related to the first noun or verb; question means for asking the user whether the first noun or verb and the second noun or verb extracted by the extraction means are the same; and determination means for registering in the storage means, when the user's answer to the question is affirmative, the first noun or verb as the keyword and the second noun or verb as the associative word.
To achieve the above object, one aspect of the present invention may be a response generation program that causes a computer to execute: a process of recognizing a user's voice; a process of analyzing the structure of the recognized voice; a process of extracting a noun or verb from the recognized voice; a process of generating a repeated response sentence for repeating the user's voice, in a state where information associating each of a plurality of keywords with an associative word and an additional word related to that keyword is stored, by selecting the keyword that matches the extracted noun or verb, adding the additional word corresponding to the selected keyword to the extracted noun or verb, and replacing the noun or verb with the related word corresponding to the selected keyword; and a process of generating an optional response sentence to the user's voice based on the analyzed voice structure and outputting the optional response sentence after outputting the repeated response sentence. The program is characterized by further causing the computer to execute: a process of extracting, from predetermined sentence set information, a first noun or verb and a second noun or verb related to the first noun or verb; a process of asking the user whether the extracted first noun or verb and second noun or verb are the same; and a process of storing, when the user's answer to the question is affirmative, the first noun or verb as the keyword and the second noun or verb as the associative word.

According to the present invention, it is possible to provide a response generation method, a response generation apparatus, and a response generation program that can alleviate the unnaturalness of a dialogue caused by a uniform response pattern.

A block diagram showing the schematic system configuration of the response generation apparatus according to Embodiment 1 of the present invention.
A block diagram showing the schematic hardware configuration of the response generation apparatus according to Embodiment 1 of the present invention.
A flowchart showing the processing flow of the response generation method according to Embodiment 1 of the present invention.
A block diagram showing the schematic system configuration of the response generation apparatus according to Embodiment 2 of the present invention.
A schematic diagram showing an example of a configuration in which the related word information is stored and updated in a network device.

Embodiment 1
Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing the schematic system configuration of a response generation apparatus according to Embodiment 1 of the present invention. The response generation apparatus 1 according to Embodiment 1 includes a voice recognition unit 2 that recognizes a user's voice, a structure analysis unit 3 that analyzes the structure of the voice, a response output unit 4 that generates and outputs a response sentence to the user's voice, a repetition generation unit 5 that generates a repeated response sentence, a part-of-speech extraction unit 9, a related word extraction unit 10, a user question unit 11, and a storage determination unit 12.

The hardware of the response generation apparatus 1 is built around a microcomputer that includes, for example, a CPU (Central Processing Unit) 1a that performs arithmetic processing and the like, a memory 1b consisting of a ROM (Read Only Memory) and a RAM (Random Access Memory) that store the arithmetic programs, control programs, and the like executed by the CPU 1a, and an interface unit (I/F) 1c that exchanges signals with the outside (FIG. 2). The CPU 1a, the memory 1b, and the interface unit 1c are connected to one another via a data bus 1d or the like.

The voice recognition unit 2 performs voice recognition processing on the user's voice information acquired by the microphone 6, converts the voice information into text, and recognizes it as character string information. The voice recognition unit 2 is a specific example of the voice recognition means. The voice recognition unit 2 detects an utterance section in the user's voice information output from the microphone 6 and recognizes the voice by, for example, pattern matching the voice information of the detected utterance section against a statistical language model. Here, the statistical language model is a probability model for calculating the occurrence probability of a linguistic expression, such as the distribution of word occurrences or the distribution of words that follow a given word, and is obtained by learning connection probabilities in units of morphemes. The statistical language model is stored in advance in the memory 1b or the like. The voice recognition unit 2 also generates morpheme information with part-of-speech information by attaching a part-of-speech type (noun, adjective, verb, adverb, and so on) to each morpheme of the user's voice information. The voice recognition unit 2 outputs the recognized voice information of the user to the structure analysis unit 3 and the part-of-speech extraction unit 9.

The structure analysis unit 3 analyzes the structure of the voice information recognized by the voice recognition unit 2. The structure analysis unit 3 is a specific example of the structure analysis means. For example, the structure analysis unit 3 uses a general-purpose morphological analyzer to perform morphological analysis and the like on the character string information representing the recognized voice information of the user, and interprets the meaning of the character string information. The structure analysis unit 3 outputs the analysis result of the character string information to the response output unit 4.

The response output unit 4 generates a response sentence to the user's voice information (hereinafter referred to as an optional response sentence) based on the structure of the voice information analyzed by the structure analysis unit 3, and outputs the generated optional response sentence. The response output unit 4 is a specific example of the response output means. For example, the response output unit 4 generates the optional response sentence to the user's voice information based on the analysis result of the character string information output from the structure analysis unit 3. The response output unit 4 then outputs the generated response sentence through the speaker 7.

More specifically, for the character string information "トンカツを食べる" ("eat tonkatsu"), the structure analysis unit 3 extracts the predicate-argument structure and identifies the predicate "食べる" ("eat") and the case particle "を" (the object marker). The response output unit 4 then extracts the types of case particles that can attach to the predicate "食べる" identified by the structure analysis unit 3 from the deficient-case dictionary database 8, which stores the correspondence between predicates and case particles. The deficient-case dictionary database 8 is built in, for example, the memory 1b.

The response output unit 4 generates, for example, the predicate-argument structures "何を食べる" ("eat what"), "どこで食べる" ("eat where"), "いつに食べる" ("eat when"), and "誰と食べる" ("eat with whom") as candidate optional response sentences. The response output unit 4 then sets aside the candidate with the surface case "を", which the user's utterance already contains, randomly selects one of the remaining predicate-argument structures, which do not match the user's voice, and uses the selected structure as the optional response sentence. For example, the response output unit 4 selects the predicate-argument structure "誰と食べたの？" ("Who did you eat with?") and outputs it as the optional response sentence. The generation method described above is only an example; the optional response sentence may be generated by any method.
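The candidate generation and selection described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary contents, the name `DEFICIENT_CASE_DICT`, and the function `generate_optional_response` are assumptions made for the example, and the candidate strings are taken from the passage above.

```python
import random

# Hypothetical stand-in for the deficient-case dictionary database 8:
# predicate -> {case particle: candidate predicate-argument structure}.
DEFICIENT_CASE_DICT = {
    "食べる": {
        "を": "何を食べる",    # eat what
        "で": "どこで食べる",  # eat where
        "に": "いつに食べる",  # eat when
        "と": "誰と食べる",    # eat with whom
    },
}

def generate_optional_response(predicate, used_particles, rng=random):
    """Pick one candidate whose case particle the user did not already use."""
    candidates = DEFICIENT_CASE_DICT.get(predicate, {})
    missing = [text for particle, text in candidates.items()
               if particle not in used_particles]
    if not missing:
        return None  # nothing left to ask about (assumed fallback)
    return rng.choice(missing)

# The user said "トンカツを食べる", so the particle "を" is already filled.
print(generate_optional_response("食べる", {"を"}))
```

Turning the selected structure into a natural question such as "誰と食べたの？" would need an additional surface-realization step that is not shown here.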

The part-of-speech extraction unit 9 extracts nouns and/or verbs from the recognized voice information of the user based on the morpheme information with part-of-speech information output from the voice recognition unit 2. The part-of-speech extraction unit 9 is a specific example of the part-of-speech extraction means. For example, from the morpheme information with part-of-speech information "トンカツ (noun) を (particle) 食べた (verb) よ (particle)" ("I ate tonkatsu") output from the voice recognition unit 2, the part-of-speech extraction unit 9 extracts "トンカツ (noun)" or "食べた (verb)". As nouns, the part-of-speech extraction unit 9 extracts, for example, トンカツ (tonkatsu, a common noun), 矢場トン (Yabaton, a proper noun), and 投票 (vote, a sa-hen verbal noun derived from 投票する), while excluding some nouns such as numerals. As verbs, the part-of-speech extraction unit 9 extracts, for example, 投票する (to vote, a sa-hen verb) and 泳ぐ (to swim). The part-of-speech extraction unit 9 outputs the extracted noun or verb to the repetition generation unit 5 and the related word extraction unit 10.
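As a rough sketch of this extraction step, the snippet below filters part-of-speech-tagged morphemes down to nouns and verbs while skipping numerals. The `Morpheme` type, the tag names, and the function name are hypothetical and chosen only for illustration; a real system would take its tags from the morphological analyzer it uses.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Morpheme:
    surface: str        # the word as it appears in the utterance
    pos: str            # coarse part of speech, e.g. "noun", "verb", "particle"
    subtype: str = ""   # finer category, e.g. "common", "proper", "numeral"

TARGET_POS = {"noun", "verb"}
EXCLUDED_SUBTYPES = {"numeral"}  # the text excludes some nouns such as numerals

def extract_nouns_and_verbs(morphemes):
    """Return the surface forms of nouns and verbs, skipping excluded subtypes."""
    return [m.surface for m in morphemes
            if m.pos in TARGET_POS and m.subtype not in EXCLUDED_SUBTYPES]

utterance = [
    Morpheme("トンカツ", "noun", "common"),
    Morpheme("を", "particle"),
    Morpheme("食べた", "verb"),
    Morpheme("よ", "particle"),
]
print(extract_nouns_and_verbs(utterance))  # ['トンカツ', '食べた']
```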

The structure analysis of voice information and the generation of the response sentence described above take time (for example, about 3 seconds) and have a high processing cost. The user therefore has to wait for a response, which may make the dialogue feel unnatural.

To address this, in the response generation apparatus 1 according to Embodiment 1, the repetition generation unit 5 simply generates a repeated response sentence from the user's voice recognized by the voice recognition unit 2. The response output unit 4 outputs the repeated response sentence generated by the repetition generation unit 5 and then outputs the optional response sentence based on the structure of the voice.

Because the repeated response sentence merely parrots back the recognized voice of the user, it takes little time to generate (for example, about 1 second) and has a low processing cost. The low-cost repeated response sentence can therefore be output during the wait before the high-cost optional response sentence is ready. This alleviates the unnaturalness caused by a long pause in the dialogue while the user waits for a response.
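The output ordering described here, a cheap repeated response first and the expensive optional response afterwards, can be sketched with a background worker. Everything in this snippet is an assumption made for illustration: the function names, the fixed reply text, and the use of `time.sleep` as a stand-in for slow structure analysis. The source does not say how the two processes are scheduled.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def make_repeat_response(text):
    """Cheap path: simply echo the recognized utterance."""
    return text

def make_optional_response(text, delay=0.2):
    """Expensive path: the sleep stands in for slow structure analysis."""
    time.sleep(delay)
    return "誰と食べたの？"  # placeholder reply for the example

def respond(text, output):
    """Start the slow analysis, emit the quick echo, then emit the slow reply."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(make_optional_response, text)
        output(make_repeat_response(text))  # fills the waiting time
        output(pending.result())            # blocks until analysis finishes

spoken = []
respond("トンカツを食べたよ", spoken.append)
print(spoken)
```

The design point is only that the echo does not wait on the analysis; any mechanism that runs the two concurrently would serve.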

As described above, the repetition generation unit 5 turns the voice information recognized by the voice recognition unit 2 into a repeated response sentence for parroting. A dialogue sounds more natural when the parroted response attaches a specific additional word to a noun or verb in the user's voice information than when it repeats the user's voice unchanged. For example, in response to the user's utterance "海に行ったよ" ("I went to the sea"), the dialogue sounds more natural if the response generation apparatus 1 answers "海かぁ" ("The sea, huh") or "お、海かぁ" ("Oh, the sea, huh") than if it simply answers "海に行ったよ".

The repetition generation unit 5 according to Embodiment 1 therefore generates the repeated response sentence by attaching a specific additional word to the noun or verb extracted by the part-of-speech extraction unit 9. This varies the wording of the parroted response, so the response pattern does not become uniform and the unnaturalness of the dialogue is further alleviated.

The memory 1b stores, for example, additional information (table information or the like) that associates a plurality of keywords with a plurality of additional words (word-initial and word-final additions and the like). Based on the noun or verb output from the part-of-speech extraction unit 9 and the additional information stored in the memory 1b, the repetition generation unit 5 selects the keyword in the additional information that matches the noun or verb. The repetition generation unit 5 then selects the additional word corresponding to the selected keyword and generates the repeated response sentence by attaching the selected additional word to the noun or verb output from the part-of-speech extraction unit 9.

More specifically, based on the noun "ラーメン" ("ramen") output from the part-of-speech extraction unit 9 and the additional information stored in the memory 1b, the repetition generation unit 5 selects the additional word "かぁ" ("huh") corresponding to the noun "ラーメン". The repetition generation unit 5 attaches the selected additional word "かぁ" to the noun "ラーメン" output from the part-of-speech extraction unit 9 and thereby generates the repeated response sentence "ラーメンかぁ" ("Ramen, huh").
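A minimal sketch of the additional-word lookup might look like this. The table layout (separate prefix and suffix lists per keyword), the name `ADDITIONAL_INFO`, and the fallback of echoing an unknown word unchanged are assumptions for the example; the source only states that keywords are associated with additional words such as word-initial and word-final additions.

```python
import random

# Hypothetical additional information: keyword -> possible prefixes and suffixes.
# An empty string means "attach nothing" at that position.
ADDITIONAL_INFO = {
    "ラーメン": {"prefix": [""], "suffix": ["かぁ"]},
    "海": {"prefix": ["", "お、"], "suffix": ["かぁ"]},
}

def add_additional_word(word, rng=random):
    """Attach an additional word to the extracted noun or verb, if one is registered."""
    entry = ADDITIONAL_INFO.get(word)
    if entry is None:
        return word  # assumed fallback: plain parroting for unknown words
    return rng.choice(entry["prefix"]) + word + rng.choice(entry["suffix"])

print(add_additional_word("ラーメン"))  # ラーメンかぁ
print(add_additional_word("海"))
```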

Furthermore, to improve on the uniform response pattern and make the dialogue sound more natural, the repetition generation unit 5 of the response generation apparatus 1 according to Embodiment 1 not only attaches a specific additional word to the noun or verb but also replaces the noun or verb with a related word when generating the repeated response sentence. This gives the wording of the parroted response still more variety and further alleviates the unnaturalness of the dialogue.

The memory 1b stores, for example, related word information (table information or the like) that associates each of a plurality of keywords with a related word for that keyword, as shown below.

Keyword "トンカツ" (tonkatsu): related word "豚肉" (pork)
Keyword "ステーキ" (steak): related word "牛肉" (beef)
Keyword "Ａ型" (blood type A): related word "慎重型" (cautious type)
Keyword "Ｏ型" (blood type O): related word "おおらか型" (easygoing type)
Keyword "牛肉" (beef): related word "お肉" (meat)
Keyword "矢場トン" (Yabaton): related word "味噌カツ" (miso katsu)
Keyword "投票する" (to vote): related word "国民の義務" (duty of citizens)
Keyword "泳ぐ" (to swim): related word "スイミング" (swimming)
...

A related word is a word that the user associates with the keyword, such as a word similar to the keyword or a word corresponding to a broader concept of the keyword. The related word information above is table information in which keywords and related words are associated one to one, but it is not limited to this form. For example, the related word information may be table information in which a single keyword is associated with a plurality of related words, or it may be tree-structured ontology information. The related word information and the additional information described above may also be combined into one body of information (for example, table information in which the two are associated).

Based on the noun or verb output from the part-of-speech extraction unit 9 and the related word information stored in the memory 1b, the repetition generation unit 5 selects the keyword in the related word information that matches the noun or verb. The repetition generation unit 5 then selects the related word corresponding to the selected keyword. When a plurality of related words are associated with a single keyword, the repetition generation unit 5 may, for example, select one of them at random or in the order of registration. The repetition generation unit 5 replaces the noun or verb extracted by the part-of-speech extraction unit 9 with the selected related word to generate the repeated response sentence.
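The lookup and replacement can be sketched as below, using a few entries from the related word table in this section. The function name, the dictionary-of-lists layout, and the reading of "registration order" as "take the first registered word" are assumptions for the example.

```python
import random

# Related word information as keyword -> list of related words in registration order.
# A list allows a single keyword to carry several related words.
RELATED_WORDS = {
    "トンカツ": ["豚肉"],
    "ステーキ": ["牛肉"],
    "牛肉": ["お肉"],
    "泳ぐ": ["スイミング"],
}

def replace_with_related_word(word, rng=None):
    """Replace the extracted noun or verb with a related word when one is registered.

    With rng=None the first registered related word is used; passing a
    random.Random instance selects among the registered words at random.
    """
    related = RELATED_WORDS.get(word)
    if not related:
        return word  # assumed fallback: keep the original word
    if rng is None:
        return related[0]
    return rng.choice(related)

print(replace_with_related_word("トンカツ"))  # 豚肉
```

Combined with an additional word such as "かぁ", the utterance "トンカツを食べたよ" could then yield a repeated response along the lines of "豚肉かぁ".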

The words that users use change constantly as trends come and go. The related word information in the memory 1b therefore needs maintenance, such as adding new keywords and related words. At the same time, extracting new keywords and related words involves intellectual judgment and is difficult to reduce to simple rules. If a person performs this maintenance, adding the new keywords and related words by hand, it places a heavy burden on that person.

To address this, the response generation apparatus 1 according to Embodiment 1 has a function of automatically updating the related word information in the memory 1b by adding new keywords and related words to it. This greatly reduces the burden on the user of maintaining the related word information in the memory 1b. In addition, the related words increase over time, which gives the repeated response sentence more variety and further alleviates the unnaturalness of the dialogue.

The related word extraction unit 10 extracts, from predetermined sentence set information, a first noun or verb and a second noun or verb related to the first noun or verb. The related word extraction unit 10 is a specific example of the word extraction means. For example, the related word extraction unit 10 extracts the first noun or verb (hereinafter, the first word) and the second noun or verb (hereinafter, the second word) from sentence set information stored in a server, terminal device, database, or the like on a network. The sentence set information includes, for example, text information from newspapers, magazines, novels, conversations, and the like, as well as hierarchically organized corpus information.

Note that the related word extraction unit 10 may extract the first and second words from document set information that is input directly to the response generation device 1 or stored in advance in the memory 1b. For example, the related word extraction unit 10 may extract the first and second words from the user's voice information input via the microphone 6.

For example, the related word extraction unit 10 calculates the similarity between the first word and the second word extracted from the sentence set information, based on their co-occurrence frequency, using the following formula.
sim("first word", "second word")
= {co-occurrence frequency of "first word" and "second word"}^2 / {frequency of "first word" × frequency of "second word"}

The related word extraction unit 10 determines that the extracted first word and second word are related when the calculated similarity is equal to or greater than a threshold. Although the related word extraction unit 10 calculates the similarity from the co-occurrence frequency of the first word and the second word, this is not a limitation. The related word extraction unit 10 may calculate the similarity between the first word and the second word by any calculation method.
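The similarity formula and threshold test above can be sketched as follows. This is a minimal illustration, not the implementation described in the publication: it assumes the sentence set has already been tokenized, that frequencies are counted once per sentence, and that the threshold value is chosen by the implementer. The function names `cooccurrence_similarity` and `is_related` are hypothetical.

```python
def cooccurrence_similarity(sentences, first_word, second_word):
    """Return cooc(w1, w2)^2 / (freq(w1) * freq(w2)).

    Assumption: each sentence is a list of tokens, and a word is counted
    at most once per sentence.
    """
    freq_first = freq_second = cooc = 0
    for tokens in sentences:
        has_first = first_word in tokens
        has_second = second_word in tokens
        freq_first += has_first
        freq_second += has_second
        cooc += has_first and has_second
    if freq_first == 0 or freq_second == 0:
        return 0.0  # a word that never occurs cannot be related
    return cooc ** 2 / (freq_first * freq_second)


def is_related(sentences, first_word, second_word, threshold):
    """The words are treated as related when the similarity reaches the threshold."""
    return cooccurrence_similarity(sentences, first_word, second_word) >= threshold


corpus = [
    ["tonkatsu", "pork"],
    ["tonkatsu", "pork", "lunch"],
    ["pork", "stew"],
    ["tonkatsu", "sauce"],
]
similarity = cooccurrence_similarity(corpus, "tonkatsu", "pork")  # 2^2 / (3 * 3)
```

With this toy corpus the two words co-occur in two sentences and each appears in three, so the similarity is 4/9; whether that counts as "related" depends entirely on the assumed threshold.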

When the similarity is equal to or greater than the threshold and the related word extraction unit 10 determines that the first word and the second word are related, it outputs the first word and the second word to the user question unit 11.

The user question unit 11 asks the user whether the first word and the second word output from the related word extraction unit 10 are the same. The user question unit 11 is a specific example of question means. For example, the user question unit 11 asks the user by voice using the speaker 7, or by displaying text on a display device. More specifically, the user question unit 11 asks the user, "Tonkatsu is pork, right?"

When the user's answer to the question from the user question unit 11 is affirmative, the storage determination unit 12 registers the first word as a keyword and the second word as an associative word in the related word information in the memory 1b. The storage determination unit 12 is a specific example of determination means.

The storage determination unit 12 determines that the user's answer to the question from the user question unit 11 is affirmative when the answer matches a predetermined affirmative pattern such as "sou da yo" ("that's right"), "sou" ("right"), or "hai" ("yes"). Conversely, the storage determination unit 12 determines that the answer is negative when it is anything other than a predetermined affirmative pattern.
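The affirmative/negative decision above amounts to a membership test against a fixed pattern list. The sketch below assumes English glosses of the example patterns and a simple normalization step (lowercasing and stripping trailing punctuation); both the pattern set and the normalization are illustrative assumptions, and `is_affirmative` is a hypothetical name.

```python
# Illustrative pattern set (assumption): English glosses of the examples in the text.
AFFIRMATIVE_PATTERNS = {"that's right", "right", "yes"}


def is_affirmative(answer: str) -> bool:
    """Return True only when the answer matches a predetermined affirmative pattern.

    Any answer outside the pattern set is treated as negative, as in the text.
    """
    normalized = answer.strip().lower().rstrip(".!?")
    return normalized in AFFIRMATIVE_PATTERNS
```

Treating everything outside the list as negative is conservative: an unrecognized answer never causes a registration.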

More specifically, when the user answers "That's right" to the question "Tonkatsu is pork, right?" from the user question unit 11, the storage determination unit 12 recognizes this as an affirmative pattern and determines that the answer is affirmative. The storage determination unit 12 then associates the first word "tonkatsu" as a keyword with the second word "pork" as an associative word and registers them in the related word information in the memory 1b.

Here, the storage determination unit 12 registers the first word "tonkatsu" as a keyword and the second word "pork" as an associative word in the related word information, but it does not register the pair in the reverse direction. That is, the storage determination unit 12 does not register "pork" as a keyword with "tonkatsu" as its associative word.

Furthermore, when the storage determination unit 12 registers new first and second words in the related word information, the same keyword as the new first word may already be registered there. In this case, the storage determination unit 12 may associate both the new second word and the already registered related word with that keyword in the related word information. For example, the keyword "tonkatsu" is registered with the corresponding related words "pork" and "meat". In this way, a plurality of related words may be associated with a single keyword in the related word information. This helps save storage capacity in the memory 1b.
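The registration rules above (one direction only, several related words under one keyword) can be sketched with a dictionary of lists. The data layout and the name `register_related_word` are assumptions made for illustration; the publication does not specify how the related word information is stored.

```python
def register_related_word(related_word_info: dict, first_word: str, second_word: str) -> None:
    """Register first_word as a keyword with second_word as a related word.

    The reverse pair (second_word -> first_word) is deliberately not registered.
    A keyword that already exists gains an additional related word instead of
    a second entry.
    """
    related_words = related_word_info.setdefault(first_word, [])
    if second_word not in related_words:
        related_words.append(second_word)


related_word_info = {}
register_related_word(related_word_info, "tonkatsu", "pork")
register_related_word(related_word_info, "tonkatsu", "meat")
register_related_word(related_word_info, "tonkatsu", "pork")  # duplicate, ignored
```

After these calls the table holds a single keyword, "tonkatsu", mapped to "pork" and "meat", and no entry keyed by "pork".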

As described above, when the storage determination unit 12 registers a plurality of related words for a single keyword in the related word information, it may execute or cancel that multiple registration according to the user's selection.

For example, when registering a plurality of related words for a single keyword, the storage determination unit 12 uses the speaker 7 or the like to ask a question that lets the user choose whether to execute or cancel the multiple registration. The storage determination unit 12 then executes or cancels the multiple registration according to the user's selection input via the microphone 6 or the like.

FIG. 3 is a flowchart showing the processing flow of the response generation method according to the first embodiment.
The related word extraction unit 10 extracts a first word and a second word related to the first word from corpus information on a network or the like (step S101).

The related word extraction unit 10 calculates the similarity between the extracted first word and second word (step S102). The related word extraction unit 10 then determines whether the calculated similarity is equal to or greater than the threshold, that is, whether the extracted first word and second word are related (step S103).

When the similarity is equal to or greater than the threshold and the related word extraction unit 10 determines that the first word and the second word are related (YES in step S103), it outputs the first word and the second word to the user question unit 11.

The user question unit 11 uses the speaker 7 or the like to ask whether the first word and the second word output from the related word extraction unit 10 are the same (step S104). The storage determination unit 12 determines whether the user's answer to the question from the user question unit 11 is affirmative (step S105). When the storage determination unit 12 determines that the user's answer is affirmative (YES in step S105), it registers the first word as a keyword and the second word as an associative word in the related word information in the memory 1b (step S106).

The voice recognition unit 2 performs voice recognition on the user's voice information acquired by the microphone 6 (step S107) and outputs the recognized voice information to the structure analysis unit 3 and the part-of-speech extraction unit 9.

The part-of-speech extraction unit 9 extracts a noun or verb from the recognized voice information of the user, based on the morpheme information with part-of-speech tags contained in the voice information output from the voice recognition unit 2 (step S108). The part-of-speech extraction unit 9 outputs the extracted noun or verb to the repetition generation unit 5.

Based on the noun or verb output from the part-of-speech extraction unit 9 and the additional information in the memory 1b, the repetition generation unit 5 selects the keyword in the additional information that matches that noun or verb. The repetition generation unit 5 then appends the additional word corresponding to the selected keyword to the extracted noun or verb (step S109).

Based on the noun or verb output from the part-of-speech extraction unit 9 and the related word information in the memory 1b, the repetition generation unit 5 selects the keyword in the related word information that matches that noun or verb. The repetition generation unit 5 then replaces the noun or verb output from the part-of-speech extraction unit 9 with the related word corresponding to the selected keyword (step S110) and generates a repeated response sentence (step S111).

The repetition generation unit 5 outputs the generated repeated response sentence to the response output unit 4. The response output unit 4 outputs the repeated response sentence received from the repetition generation unit 5 through the speaker 7 (step S112).
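Steps S109 to S111 can be sketched as two dictionary lookups followed by string assembly. This is a minimal illustration under several assumptions: the additional information and the related word information are both modeled as dictionaries keyed by keyword, one related word is picked at random when several are registered, and the romanized strings are stand-ins for the Japanese originals. The name `generate_repeated_response` is hypothetical.

```python
import random


def generate_repeated_response(word, additional_info, related_word_info, rng=random):
    """Build a repeated response from an extracted noun or verb.

    S109: look up the additional word whose keyword matches the extracted word.
    S110: replace the extracted word with a related word, if one is registered.
    S111: join the two parts into the repeated response sentence.
    """
    additional_word = additional_info.get(word, "")
    related_words = related_word_info.get(word)
    spoken_word = rng.choice(related_words) if related_words else word
    return f"{spoken_word} {additional_word}".strip()


additional_info = {"tonkatsu": "dane"}
related_word_info = {"tonkatsu": ["pork"]}
with_replacement = generate_repeated_response("tonkatsu", additional_info, related_word_info)
without_replacement = generate_repeated_response("tonkatsu", additional_info, {})
```

When no keyword matches, the function falls back to plain repetition of the extracted word, which corresponds to the unvaried parroting that the additional and related words are meant to soften.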

In parallel with steps S108 through S112 above, the structure analysis unit 3 analyzes the structure of the voice information recognized by the voice recognition unit 2 (step S113) and outputs the analysis result of the character string information to the response output unit 4.

The response output unit 4 generates a voluntary response sentence based on the analysis result of the character string information output from the structure analysis unit 3 (step S114) and outputs the generated voluntary response sentence through the speaker 7 (step S115). As a result, the response output unit 4 outputs the repeated response sentence, which takes less processing time, before the voluntary response sentence, which takes more.

In the description above, steps S101 to S106 (updating the related word information) are executed before steps S107 to S115 (the actual dialogue with the user), but this is not a limitation, and they can be executed at any timing. For example, steps S101 to S106 may be executed after steps S107 to S115, during them, or both before and after them.

As described above, in the response generation device 1 according to the first embodiment, the repetition generation unit 5 selects the keyword in the memory 1b that matches the noun or verb extracted by the part-of-speech extraction unit 9, and appends the additional word corresponding to the selected keyword to the extracted noun or verb. The repetition generation unit 5 also replaces the extracted noun or verb with the related word corresponding to the selected keyword in the memory 1b to generate the repeated response sentence. This gives variety to the wording of the parroted repeated response sentence and further alleviates the unnaturalness of the dialogue. In addition, the related word information needed for this replacement can be updated automatically. The repeated response sentences can therefore be made more varied, and the dialogue less unnatural, while the user's burden is reduced.

Embodiment 2
FIG. 4 is a block diagram showing a schematic system configuration of the response generation device according to the second embodiment of the present invention. In addition to the configuration of the response generation device 1 according to the first embodiment, the response generation device 20 according to the second embodiment further includes a phonological analysis unit 21 that analyzes the phonology of the user's voice information and a backchannel generation unit 22 that generates a backchannel response to the user's voice information.

The phonological analysis unit 21 analyzes the phonology of the user's voice information acquired by the microphone 6. The phonological analysis unit 21 is a specific example of phonological analysis means. For example, the phonological analysis unit 21 estimates breaks in the user's speech by detecting changes in the volume level or frequency (such as the fundamental frequency) of the voice information. The phonological analysis unit 21 outputs the analysis result to the backchannel generation unit 22.

Based on the analysis result output from the phonological analysis unit 21, the backchannel generation unit 22 generates a backchannel response (a short acknowledgment) to the user's voice. The backchannel generation unit 22 is a specific example of backchannel generation means. For example, when the volume level of the voice information falls to or below a threshold, the backchannel generation unit 22 searches the fixed-form response database 23, in which backchannel patterns are stored, and randomly selects a backchannel response from it. The fixed-form response database 23 stores a plurality of patterns used as backchannels, such as "Uh-huh. Uh-huh." The fixed-form response database 23 is built in the memory 1b or the like. The backchannel generation unit 22 outputs the generated backchannel response to the response output unit 4.
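The backchannel behavior above can be sketched as a threshold check followed by a random pick from a fixed list. The pattern list, the volume scale, and the name `maybe_backchannel` are assumptions for illustration; the publication only states that a pattern is chosen at random once the volume level falls to or below a threshold.

```python
import random

# Stand-in for the fixed-form response database 23 (contents are assumptions).
BACKCHANNEL_PATTERNS = ["Uh-huh. Uh-huh.", "I see.", "Is that so."]


def maybe_backchannel(volume_level, threshold, patterns=BACKCHANNEL_PATTERNS, rng=random):
    """Return a randomly chosen backchannel when the volume level is at or below
    the threshold (a likely break in speech); otherwise return None."""
    if volume_level <= threshold:
        return rng.choice(patterns)
    return None
```

Because the check involves no language understanding, it is cheap enough to answer before the repeated response sentence is ready.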

The response output unit 4 causes the speaker 7 to output the backchannel response generated by the backchannel generation unit 22 before the repeated response sentence generated by the repetition generation unit 5. The phonological analysis unit 21 performs its analysis using features with a low processing cost. The backchannel response therefore takes less time to generate than the repeated response sentence and has a lower processing cost.

Accordingly, a backchannel response with a lower processing cost can be output before the repeated response sentence is output. This makes the transitions within the dialogue smoother and further alleviates its unnaturalness. Moreover, a larger number of responses and response sentences with different processing costs are generated in parallel and output in the order in which they are generated. This keeps the dialogue more continuous and realizes a more natural dialogue that does not lose its tempo.

The backchannel generation unit 22 generates backchannel responses from fixed forms, and the repetition generation unit 5 generates the repeated response sentence from only a surface-level interpretation of the voice recognition result. The response output unit 4 can therefore be expected to generate voluntary response candidates that are the same as the backchannel response generated by the backchannel generation unit 22 or the repeated response generated by the repetition generation unit 5.

To address this, the response output unit 4 excludes from the voluntary response candidates any candidate that duplicates the backchannel response generated by the backchannel generation unit 22 or the repeated response generated by the repetition generation unit 5. The response output unit 4 then selects the optimal candidate from the voluntary response candidates that remain after the exclusion and uses it as the voluntary response sentence. This eliminates redundant wording and realizes a more natural dialogue.

For example, in response to the user's utterance "It's hot today, isn't it," the backchannel generation unit 22 generates the backchannel response "Yeah." Next, the repetition generation unit 5 generates the repeated response sentence "Hot, isn't it." Meanwhile, the response output unit 4 generates voluntary response candidates such as "I hate it," "I wonder how long it will stay hot?", "Hot, isn't it," and "That's right." From the generated candidates, the response output unit 4 excludes "Hot, isn't it," which duplicates the repeated response sentence generated by the repetition generation unit 5. The response output unit 4 then selects, for example, "I wonder how long it will stay hot?" from the remaining candidates and uses it as the voluntary response sentence.
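The exclusion step in this example can be sketched as a filter over the candidate list. The text does not say how the "optimal" remaining candidate is chosen, so the scoring function below is a placeholder assumption supplied by the caller; the strings are English glosses of the example utterances, and `select_voluntary_response` is a hypothetical name.

```python
def select_voluntary_response(candidates, already_output, score=None):
    """Drop candidates that duplicate responses already output, then pick one.

    `score` stands in for whatever ranking the device uses (an assumption);
    with no score, the first remaining candidate is returned.
    """
    spoken = set(already_output)
    remaining = [c for c in candidates if c not in spoken]
    if not remaining:
        return None
    return max(remaining, key=score) if score else remaining[0]


candidates = [
    "I hate it",
    "I wonder how long it will stay hot?",
    "Hot, isn't it",
    "That's right",
]
already_output = ["Yeah", "Hot, isn't it"]  # backchannel and repeated response
chosen = select_voluntary_response(candidates, already_output, score=len)
```

Here `len` is used as the score purely so the example is deterministic; it happens to select the follow-up question from the example.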

In the response generation device 20 according to the second embodiment, the same parts as in the response generation device 1 according to the first embodiment are denoted by the same reference signs, and their detailed description is omitted.

An example of a dialogue between the response generation device 20 and a user is shown below. In this example, M denotes a response or response sentence of the response generation device 20, and U denotes a user utterance.
M (topic offer): What did you have for lunch?
U: I had tonkatsu.
M (backchannel response): Uh-huh. Uh-huh.
M (repeated response sentence): Tonkatsu, huh. (additional word "kaa" appended)
M (determination processing): the first word "tonkatsu" and the second word "pork" are determined to be related
M (voluntary response sentence): Tonkatsu is pork, right?
U: (affirmative answer) That's right.
M (backchannel response): Is that so.
M (voluntary response sentence): Where did you eat it?
...
U: I had tonkatsu.
M (backchannel response): I see.
M (repeated response sentence): Pork, pork, right. (additional word "dane" appended; "tonkatsu" replaced with "pork")
M (voluntary response sentence): So you ate it.

As this example shows, when the user speaks, the backchannel response, repeated response sentence, and voluntary response sentence of the response generation device 20 follow the utterance in quick succession, and the transitions within the dialogue become smoother. The example also shows that the related word information can be updated automatically as appropriate (here, by adding the related word "pork" for the keyword "tonkatsu") without imposing a burden on a person. Furthermore, appending additional words to verbs or nouns and replacing them with related words gives the repeated response sentences variety, which makes the dialogue more natural.

Embodiment 3
In the first and second embodiments, when the user's answer to the question from the user question unit 11 is affirmative, the storage determination unit 12 successively registers the first word as a keyword and the second word as an associative word in the related word information in the memory 1b. If the storage determination unit 12 registers keywords and associative words in the related word information without limit, the memory 1b may run out of storage capacity.

The storage determination unit 12 according to the third embodiment therefore limits the number of keywords in the related word information stored in the memory 1b under a predetermined condition. In the related word information, for example, each keyword is associated with its related word, the number of times it has been used, and its registration time, as shown below.

Keyword       Related word   Number of uses   Registration time
"Tonkatsu"    "Pork"         1                YY:MM:DD
"Steak"       "Beef"         0                YY:MM:DD
...

For example, the storage determination unit 12 may keep only a predetermined number (such as 50 words) of keywords and related words in the related word information stored in the memory 1b, in order of most recent registration time, and delete the rest. The storage determination unit 12 may instead keep only a predetermined number of keywords and related words in order of highest number of uses and delete the rest. The storage determination unit 12 may also delete keywords and related words for which a predetermined period (such as N days) has elapsed since the registration time and whose number of uses is equal to or less than a predetermined number (such as one).
The storage determination unit 12 may also delete keywords and related words from the related word information stored in the memory 1b based on association strength or user profile information (such as the user's likes and preference patterns).
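The three count- and time-based pruning rules above can be sketched as follows. The `Entry` record mirrors the columns of the related word table; the record type, the function names, and the use of `datetime` for the registration time are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Entry:
    keyword: str
    related_word: str
    uses: int
    registered: datetime


def keep_newest(entries, limit):
    """Keep the `limit` most recently registered entries."""
    return sorted(entries, key=lambda e: e.registered, reverse=True)[:limit]


def keep_most_used(entries, limit):
    """Keep the `limit` entries with the highest number of uses."""
    return sorted(entries, key=lambda e: e.uses, reverse=True)[:limit]


def drop_stale(entries, now, max_age, max_uses):
    """Delete entries that are at least `max_age` old and used `max_uses` times or fewer."""
    return [
        e for e in entries
        if not (now - e.registered >= max_age and e.uses <= max_uses)
    ]


now = datetime(2024, 1, 10)
entries = [
    Entry("tonkatsu", "pork", 1, datetime(2024, 1, 1)),
    Entry("steak", "beef", 0, datetime(2024, 1, 5)),
    Entry("ramen", "noodles", 3, datetime(2023, 12, 1)),
]
fresh = drop_stale(entries, now, max_age=timedelta(days=7), max_uses=1)
```

In this toy data only "tonkatsu" is both old enough and rarely used, so `drop_stale` removes it while keeping the recent "steak" entry and the frequently used "ramen" entry.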

The related word information may also be stored and updated in a network device 101, such as a server or database, on a network 100 (FIG. 5). In this case, for example, a plurality of response generation devices 1 are connected to the network 100.

Each response generation device 1 transmits the keyword (first word) and the associative word (second word) to the network device 101 on the network 100, which updates the related word information stored in the network device 101. The network device 101 also limits the number of keywords and related words in the related word information under a predetermined condition such as those described above. The network device 101 then transmits the limited related word information to each response generation device 1, and each response generation device 1 stores the received related word information in the memory 1b. Updating the related word information collectively in the network device, based on information from a plurality of response generation devices 1, gives the repeated response sentences still more variety. In the third embodiment, the same parts as in the first and second embodiments are denoted by the same reference signs, and their detailed description is omitted.

The present invention is not limited to the above embodiments and can be modified as appropriate without departing from its spirit.
In the above embodiments, the repetition generation unit 5 appends the additional word from the additional information in the memory 1b to the noun or verb output from the part-of-speech extraction unit 9 and then replaces that noun or verb with the related word from the related word information in the memory 1b, but this is not a limitation. The repetition generation unit 5 may first replace the noun or verb output from the part-of-speech extraction unit 9 with the related word from the related word information in the memory 1b and then append the additional word from the additional information in the memory 1b to the substituted related word.

For example, based on the noun or verb output from the part-of-speech extraction unit 9 and the related word information in the memory 1b, the repetition generation unit 5 selects the keyword in the related word information that matches that noun or verb, and replaces the noun or verb with the related word corresponding to the selected keyword. Then, based on the noun or verb output from the part-of-speech extraction unit 9 and the additional information in the memory 1b, the repetition generation unit 5 selects the keyword in the additional information that matches that noun or verb. The repetition generation unit 5 generates the repeated response sentence by appending the additional word corresponding to the selected keyword to the substituted related word.

In the above embodiments, the response output unit 4 causes the speaker 7 to output the backchannel response generated by the backchannel generation unit 22, but this is not a limitation. The response output unit 4 may make any response with a low processing load based on the backchannel response generated by the backchannel generation unit 22. For example, the response output unit 4 may vibrate a vibration device, light or blink a light device, show something on a display device, move parts of a robot such as its limbs, head, or torso, or perform any combination of these.

In the above embodiments, the response output unit 4 causes the speaker 7 to output the repeated response sentence generated by the repetition generation unit 5, but this is not a limitation. The response output unit 4 may output any repeated response sentence with a low processing load based on the repeated response sentence generated by the repetition generation unit 5. For example, the response output unit 4 may output the repeated response sentence on a display device or the like, or through any combination of output means. In this case, the output mode of the response output unit 4 may be, for example, a setting of character size, brightness, or shape.

The present invention can also be realized, for example, by causing the CPU 1a to execute a computer program that performs the processing shown in FIG. 3.
The program can be stored on various types of non-transitory computer readable media and supplied to a computer. Non-transitory computer readable media include various types of tangible storage media. Examples of non-transitory computer readable media include magnetic recording media (for example, flexible disks, magnetic tapes, and hard disk drives), magneto-optical recording media (for example, magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memories (for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (random access memory)).

The program may also be supplied to a computer via various types of transitory computer readable media. Examples of transitory computer readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer readable medium can supply the program to the computer via a wired communication path, such as an electric wire or an optical fiber, or via a wireless communication path.

DESCRIPTION OF SYMBOLS: 1 response generation device, 2 voice recognition unit, 3 structure analysis unit, 4 response output unit, 5 repetition generation unit, 6 microphone, 7 speaker, 8 missing-case dictionary database, 9 part-of-speech extraction unit, 10 related word extraction unit, 11 user question unit, 12 storage determination unit, 21 phoneme analysis unit, 22 backchannel generation unit, 23 fixed-form response database

Claims

A response generation method including:
recognizing a user's voice;
analyzing the structure of the recognized voice;
storing information in which each of a plurality of keywords is associated with a related word and an additional word related to that keyword;
extracting a noun or verb from the recognized voice;
selecting a keyword of the stored information that matches the extracted noun or verb, adding the additional word corresponding to the selected keyword to the extracted noun or verb, and replacing the noun or verb with the related word corresponding to the selected keyword, thereby generating a repeated response sentence that repeats the user's voice; and
generating an optional response sentence for the user's voice based on the analyzed structure of the voice, and outputting the optional response sentence after outputting the repeated response sentence,
the response generation method being characterized by comprising:
extracting, from predetermined sentence set information, a first noun or verb and a second noun or verb related to the first noun or verb;
asking the user whether the extracted first noun or verb and second noun or verb are the same; and
storing the first noun or verb as the keyword and the second noun or verb as the related word when the user's answer to the question is affirmative.

The response generation method according to claim 1, wherein
the first noun or verb and the second noun or verb are extracted from the predetermined sentence set information existing on a network,
a similarity between the first noun or verb and the second noun or verb is calculated using the co-occurrence frequency of the extracted first noun or verb and second noun or verb, and when the calculated similarity is equal to or greater than a threshold, the user is asked whether the extracted first noun or verb and second noun or verb are the same, and
when the user's answer to the question is affirmative, the first noun or verb is stored as the keyword and the second noun or verb is stored as the related word.
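
The threshold-gated confirmation recited above can be illustrated with a short sketch. The claim does not specify how the similarity is derived from the co-occurrence frequency, so the Dice coefficient over sentence-level co-occurrence used here is purely an assumption, as are the function names, the `ask_user` callback, and the sample sentences.

```python
from typing import Callable, Dict, List, Set


def cooccurrence_similarity(first: str, second: str,
                            sentences: List[Set[str]]) -> float:
    """Dice coefficient over sentence-level co-occurrence (assumed measure)."""
    n_first = n_second = n_both = 0
    for words in sentences:
        has_first = first in words
        has_second = second in words
        n_first += has_first
        n_second += has_second
        n_both += has_first and has_second
    total = n_first + n_second
    return 2 * n_both / total if total else 0.0


def maybe_register(first: str, second: str, sentences: List[Set[str]],
                   threshold: float,
                   ask_user: Callable[[str, str], bool],
                   related: Dict[str, str]) -> bool:
    """Register (keyword, related word) only if similar enough AND confirmed."""
    # Skip the question entirely when the similarity is below the threshold.
    if cooccurrence_similarity(first, second, sentences) < threshold:
        return False
    # Ask the user whether the two words mean the same thing.
    if not ask_user(first, second):
        return False
    # Affirmative answer: first word becomes the keyword,
    # second word becomes its related word.
    related[first] = second
    return True


# Hypothetical sentence set, each sentence reduced to its set of words.
sentences = [{"ramen", "noodles"}, {"ramen", "noodles", "soup"},
             {"ramen"}, {"car"}]
related: Dict[str, str] = {}
maybe_register("ramen", "noodles", sentences, 0.5,
               lambda a, b: True, related)
print(related)
```

Gating the question on the similarity keeps the device from asking the user about word pairs that rarely appear together; the threshold value itself is left open by the claim.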

The response generation method according to claim 1 or 2,
further comprising limiting, under a predetermined condition, the number of related words and additional words associated with the stored keywords.

The response generation method according to any one of claims 1 to 3, further comprising:
analyzing the phonemes of the user's voice; and
generating a backchannel (aizuchi) response to the user's voice based on the result of the phoneme analysis,
wherein the generated backchannel response is output before the generated repeated response sentence is output.

A response generation device including:
voice recognition means for recognizing a user's voice;
structure analysis means for analyzing the structure of the voice recognized by the voice recognition means;
storage means for storing information in which each of a plurality of keywords is associated with a related word and an additional word related to that keyword;
part-of-speech extraction means for extracting a noun or verb from the voice recognized by the voice recognition means;
repetition generation means for selecting a keyword in the storage means that matches the noun or verb extracted by the part-of-speech extraction means, adding the additional word corresponding to the selected keyword to the extracted noun or verb, and replacing the noun or verb with the related word corresponding to the selected keyword, thereby generating a repeated response sentence that repeats the user's voice; and
response output means for generating an optional response sentence for the user's voice based on the structure of the voice analyzed by the structure analysis means, and outputting the optional response sentence after outputting the repeated response sentence,
the response generation device being characterized by comprising:
extraction means for extracting, from predetermined sentence set information, a first noun or verb and a second noun or verb related to the first noun or verb;
question means for asking the user whether the first noun or verb and the second noun or verb extracted by the extraction means are the same; and
determination means for storing, in the storage means, the first noun or verb as the keyword and the second noun or verb as the related word when the user's answer to the question is affirmative.

A response generation program causing a computer to execute:
processing of recognizing a user's voice;
processing of analyzing the structure of the recognized voice;
processing of extracting a noun or verb from the recognized voice;
processing of storing information in which each of a plurality of keywords is associated with a related word and an additional word related to that keyword, selecting the keyword that matches the extracted noun or verb, adding the additional word corresponding to the selected keyword to the extracted noun or verb, and replacing the noun or verb with the related word corresponding to the selected keyword, thereby generating a repeated response sentence that repeats the user's voice; and
processing of generating an optional response sentence for the user's voice based on the analyzed structure of the voice, and outputting the optional response sentence after outputting the repeated response sentence,
the response generation program being characterized by causing the computer to further execute:
processing of extracting, from predetermined sentence set information, a first noun or verb and a second noun or verb related to the first noun or verb;
processing of asking the user whether the extracted first noun or verb and second noun or verb are the same; and
processing of storing the first noun or verb as the keyword and the second noun or verb as the related word when the user's answer to the question is affirmative.