JP2015075952A

JP2015075952A - Speech generation device, method, and program

Info

Publication number: JP2015075952A
Application number: JP2013212221A
Authority: JP
Inventors: Ryuichiro Higashinaka; Toshiaki Makino; Yoshihiro Matsuo; Kenji Imamura; Nozomi Kobayashi; Toru Hirano; Chiaki Miyazaki; 豊美目黒
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2013-10-09
Filing date: 2013-10-09
Publication date: 2015-04-20
Anticipated expiration: 2033-10-09
Also published as: JP6126965B2

Abstract

PROBLEM TO BE SOLVED: To stably generate natural utterance candidates in response to a user's utterance. SOLUTION: An input unit 7 accepts a word indicating the topic of the dialogue and a user's utterance. A predicate-argument structure search unit 112 searches a predicate-argument structure database for case elements corresponding to the topic word and extracts the matching predicate-argument structures. A sentence generation unit 114 generates an utterance sentence from each extracted predicate-argument structure. A sentence determination unit 116 removes the generated utterance sentences that are not natural sentences and outputs the remaining ones as utterance candidates for the accepted user's utterance. When the sentence determination unit 116 outputs no utterance candidate, an utterance search unit 118 searches an utterance database; if an utterance is found, it is output as an utterance candidate, and if none is found, information indicating that no utterance can be produced is output.

Description

The present invention relates to an utterance generation device, method, and program.

Dialogue systems fall broadly into two types: task-oriented and non-task-oriented. A task-oriented dialogue system accomplishes a specific task through dialogue with the user and is used, for example, for flight reservation and weather information retrieval. Because the content of the conversation can be anticipated in advance, such systems generate utterances either from a manually built database of utterances or by filling manually written templates with information, such as weather data, extracted from a database (Non-Patent Document 1).

A non-task-oriented dialogue system handles dialogue that has no particular goal, that is, so-called chat. Because chat ranges over many topics, its content cannot be anticipated in advance, which makes utterance generation a very difficult problem. To cope with a wide range of user input, some recent techniques store text from the web or Twitter in a database and produce a system utterance by selecting an entry similar to the user's utterance (Non-Patent Document 2).

Non-Patent Document 1: Ryuichiro Higashinaka, Katsuhito Sudoh, Mikio Nakano, "Incorporating Discourse Features into Confidence Scoring of Intention Recognition Results in Spoken Dialogue Systems", Speech Communication, 2006, Volume 48, Issues 3-4, p.417-436

Non-Patent Document 2: Shibata, M., Nishiguchi, T., and Tomiura, Y., "Dialog system for open-ended conversation using web documents", Informatica, 2009, 33 (3), p.277-284

However, in chat dialogue, generating utterances by selecting them from an utterance database built from the web or Twitter may fail to produce an utterance that matches the system's utterance intention. For example, when the system judges that it should ask the user a question about some topic, it cannot do so unless a question sentence on that topic exists on the web or Twitter. Because the system needs to ask questions or give back-channel responses as the situation demands, the inability to generate situation-appropriate utterances lowers the quality of the dialogue.

One possible countermeasure is utterance generation, in which the system itself produces an utterance matching its utterance intention on the spot. For example, the constituents of sentences (information such as who did what) could be held in a database; constituents relevant to the current topic would be extracted and then converted into a sentence that matches the system's utterance intention. However, constituents relevant to the current topic cannot always be found, and the conversion into a sentence does not always succeed.

The present invention has been made in view of the above circumstances, and an object thereof is to provide an utterance generation device, method, and program capable of stably generating natural utterance candidates in response to a user's utterance.

To achieve the above object, an utterance generation device according to the present invention includes: an input unit that accepts a word indicating the topic of a dialogue and a user's utterance; a predicate-argument structure search unit that, based on the topic word accepted by the input unit, searches a predicate-argument structure database, which stores a plurality of predicate-argument structures each being a combination of a predicate and case elements (the elements filling the cases of that predicate), for predicate-argument structures containing a case element corresponding to the topic word; a sentence generation unit that generates an utterance sentence from each of the predicate-argument structures retrieved by the predicate-argument structure search unit; a sentence determination unit that removes, from the utterance sentences generated by the sentence generation unit, those that are not natural sentences according to a predetermined rule for determining whether a sentence is natural, and outputs each remaining utterance sentence as an utterance candidate for the user's utterance accepted by the input unit; and an utterance search unit that, when the sentence determination unit outputs no utterance candidate, searches an utterance database storing a plurality of utterances for an utterance corresponding to the user's utterance, outputs the retrieved utterance as the utterance candidate when one is found, and outputs information indicating that no utterance can be produced when none is found.

An utterance generation method according to the present invention is a method performed in an utterance generation device including an input unit, a predicate-argument structure search unit, a sentence generation unit, a sentence determination unit, and an utterance search unit. The method includes: a step in which the input unit accepts a word indicating the topic of a dialogue and a user's utterance; a step in which the predicate-argument structure search unit, based on the topic word accepted by the input unit, searches a predicate-argument structure database, which stores a plurality of predicate-argument structures each being a combination of a predicate and case elements (the elements filling the cases of that predicate), for case elements corresponding to the topic word and extracts the predicate-argument structures corresponding to the retrieved case elements; a step in which the sentence generation unit generates an utterance sentence from each of the extracted predicate-argument structures; a step in which the sentence determination unit removes, from the generated utterance sentences, those that are not natural sentences according to a predetermined rule for determining whether a sentence is natural, and outputs each remaining utterance sentence as an utterance candidate for the user's utterance accepted by the input unit; and a step in which the utterance search unit, when the sentence determination unit outputs no utterance candidate, searches an utterance database storing a plurality of utterances for an utterance corresponding to the user's utterance, outputs the retrieved utterance as the utterance candidate when one is found, and outputs information indicating that no utterance can be produced when none is found.

The input unit may further accept a dialogue act representing the intention of the utterance. In that case, the sentence generation unit generates a declarative sentence by arranging the predicate, the case elements, and the cases of the case elements of each retrieved predicate-argument structure in a predetermined order, and then generates the utterance sentence by converting the declarative sentence into a sentence expressing the accepted dialogue act, using a conversion rule predetermined for that dialogue act.
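As a rough illustration of the two-stage generation just described, the following Python sketch first builds a declarative sentence from a predicate-argument structure and then applies a conversion rule selected by dialogue act. The case ordering in `CASE_ORDER`, the dialogue act names, and the rules in `CONVERSION_RULES` are all assumptions made for this example; the source does not specify them.

```python
# Hypothetical fixed ordering of cases and their particles (an assumption,
# not the ordering used in the patent).
CASE_ORDER = [("ga", "が"), ("kara", "から"), ("made", "まで"), ("de", "で"),
              ("to", "と"), ("ni", "に"), ("wo", "を")]

# Hypothetical conversion rules: dialogue act -> rewrite of a declarative sentence.
CONVERSION_RULES = {
    "statement": lambda s: s,
    "question": lambda s: s + "のですか",
}


def make_declarative(pas):
    """Join each present case element with its particle, then append the predicate."""
    parts = [pas[case] + particle for case, particle in CASE_ORDER if pas.get(case)]
    return "".join(parts) + pas["predicate"]


def generate_utterance(pas, dialogue_act):
    """Return the converted sentence, or None when no rule exists for the act."""
    rule = CONVERSION_RULES.get(dialogue_act)
    if rule is None:
        return None
    return rule(make_declarative(pas))


pas = {"predicate": "会う", "ga": "太郎", "ni": "花子"}
sentence = make_declarative(pas)
question = generate_utterance(pas, "question")
```

Returning `None` for an unknown dialogue act mirrors the case where sentence conversion does not succeed, which the device handles by falling back to the utterance database.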

The sentence determination unit may calculate, for each utterance sentence generated by the sentence generation unit, a perplexity value indicating how unlikely the sentence is to be generated under an N-gram language model obtained in advance. Under the rule that a sentence is natural if its perplexity value is less than or equal to a threshold, the unit removes the utterance sentences that are not natural and outputs each remaining utterance sentence as an utterance candidate.
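The perplexity-based rule can be sketched as follows. This is a minimal example that assumes a bigram model with add-one smoothing trained on a toy, pre-tokenized corpus; the source only says that an N-gram language model obtained in advance is used, so the model order, the smoothing, the corpus, and the threshold value are all illustrative assumptions.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Count bigram contexts and bigrams over tokenized sentences with boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for tokens in corpus:
        padded = ["<s>"] + tokens + ["</s>"]
        contexts.update(padded[:-1])
        bigrams.update(zip(padded, padded[1:]))
    vocab = {w for tokens in corpus for w in tokens} | {"<s>", "</s>"}
    return contexts, bigrams, len(vocab)


def perplexity(tokens, model):
    """Perplexity of a token sequence under the add-one-smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    padded = ["<s>"] + tokens + ["</s>"]
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(padded) - 1))


def filter_natural(candidates, model, threshold):
    """Keep only candidates whose perplexity is at or below the threshold."""
    return [c for c in candidates if perplexity(c, model) <= threshold]


corpus = [["太郎", "が", "花子", "に", "会う"], ["花子", "が", "餌", "を", "探す"]]
model = train_bigram(corpus)
seen = ["太郎", "が", "花子", "に", "会う"]
scrambled = ["会う", "に", "が", "太郎", "花子"]
kept = filter_natural([seen, scrambled], model, threshold=8.0)
```

With this toy corpus the scrambled word order receives a higher perplexity than the attested sentence, so only the attested sentence survives the filter.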

A program according to the present invention is a program for causing a computer to function as each unit of the utterance generation device according to the present invention.

As described above, the utterance generation device, method, and program of the present invention search a predicate-argument structure database storing a plurality of predicate-argument structures for those containing a case element corresponding to the word indicating the topic of the dialogue, and output the natural sentences among the utterance sentences generated from the retrieved structures as utterance candidates for the user's utterance. When no utterance candidate is output, an utterance database storing a plurality of utterances is searched for an utterance corresponding to the user's utterance, and any utterance found is output as the utterance candidate. This yields the effect that natural utterance candidates for the user's utterance can be generated stably.

FIG. 1 is a block diagram showing a configuration example of the utterance database construction device according to the embodiment of the present invention.
FIG. 2 is an explanatory diagram showing an example of a list of dialogue acts.
FIG. 3 is a block diagram showing a configuration example of the predicate-argument structure database construction device according to the embodiment of the present invention.
FIG. 4 is an explanatory diagram showing an example of table data of predicate-argument structures.
FIG. 5 is a block diagram showing a configuration example of the utterance generation device according to the embodiment of the present invention.
FIG. 6 is a diagram showing an example of retrieved predicate-argument structures.
FIG. 7 is a diagram for explaining the perplexity value used to determine whether a sentence is natural.
FIG. 8 is a diagram showing an example of retrieved utterances.
FIG. 9 is a flowchart showing the utterance database construction processing routine according to the embodiment of the present invention.
FIG. 10 is a flowchart showing the predicate-argument structure database construction processing routine according to the embodiment of the present invention.
FIG. 11 is a flowchart showing the utterance generation processing routine according to the embodiment of the present invention.

<Overview>
First, an outline of the embodiment of the present invention will be described.

The embodiment of the present invention uses two types of database together. One is an utterance database built from utterances on microblogs such as Twitter (registered trademark); the other is a predicate-argument structure database obtained by analyzing large-scale text data. The utterance database assigns to each of a large number of microblog utterances an automatically estimated dialogue act representing the intention of the utterance, and stores each utterance paired with its dialogue act. The predicate-argument structure database stores the predicate-argument structures obtained by applying predicate-argument structure analysis to a large amount of text data. A predicate-argument structure is a structure representing the combination of a predicate and the case elements corresponding to that predicate, which together are the constituents of a sentence. A processing example of predicate-argument structure analysis is given in the reference (Hirotoshi Taira (平博順), Masaaki Nagata (永田昌明), "Predicate-argument structure analysis using structured learning", Proceedings of the 14th Annual Meeting of the Association for Natural Language Processing, 2008, p.556-559).

Because the utterance database stores utterances written by microblog users, it contains many natural utterances. However, it does not necessarily contain many utterances that match the dialogue act the system wants to perform.

The predicate-argument structure database, on the other hand, stores sentence constituents (predicate-argument structures), from which sentences corresponding to a variety of dialogue acts can be generated. However, automatically generated sentences may be overly simple, unnatural, or ungrammatical. Since each database has strengths and weaknesses, using the two complementarily makes it possible to generate utterances that both match the system's dialogue act and are natural.

When generating an utterance for a word indicating the topic of the dialogue (hereinafter, the focus) and a system dialogue act supplied by a higher-level module of the dialogue system, predicate-argument structures are first retrieved from the predicate-argument structure database and sentences are generated according to predetermined templates. If no sentence can be generated, or the generated sentences do not satisfy a predetermined condition, the utterance database is searched and any matching utterance is used as the system utterance. Alternatively, the order may be reversed: the utterance database is searched first, and if no matching utterance is found, predicate-argument structures are retrieved from the predicate-argument structure database and, if any are found, a sentence is generated according to a predetermined template and uttered.
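The generate-first, retrieve-on-failure flow can be sketched as below. The four callables passed in are hypothetical stand-ins for the predicate-argument structure search, sentence generation, naturalness check, and utterance database search; their names and signatures are assumptions for this example, and the stub implementations exist only to make the sketch runnable.

```python
def generate_candidates(focus, user_utterance, dialogue_act,
                        search_pas, generate, is_natural, search_utterances):
    """Return (candidates, can_speak), trying generation first, then retrieval."""
    structures = search_pas(focus)
    sentences = [generate(pas, dialogue_act) for pas in structures]
    candidates = [s for s in sentences if s is not None and is_natural(s)]
    if candidates:
        return candidates, True
    # Generation produced nothing usable: fall back to the utterance database.
    retrieved = search_utterances(user_utterance)
    if retrieved:
        return retrieved, True
    # Neither source yielded a candidate: report that no utterance is possible.
    return [], False


# Toy stand-ins (hypothetical) so the flow can be exercised.
def search_pas(focus):
    return [{"predicate": "探す", "ga": "セキレイ", "wo": focus}] if focus == "餌" else []


def generate(pas, dialogue_act):
    return pas["ga"] + "が" + pas["wo"] + "を" + pas["predicate"]


def is_natural(sentence):
    return len(sentence) < 20


def search_utterances(user_utterance):
    return ["こんにちは"] if "こんにちは" in user_utterance else []


stubs = (search_pas, generate, is_natural, search_utterances)
generated = generate_candidates("餌", "餌の話をしよう", "statement", *stubs)
retrieved = generate_candidates("未知", "こんにちは", "statement", *stubs)
failed = generate_candidates("未知", "さようなら", "statement", *stubs)
```

The reversed ordering mentioned in the text would simply swap the two stages inside `generate_candidates`.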

Hereinafter, the embodiment of the present invention will be described in detail with reference to the drawings.

<System configuration of the utterance database construction device>
FIG. 1 is a block diagram showing an utterance database construction device 100 according to the embodiment of the present invention. The utterance database construction device 100 is a computer including a CPU, a RAM, and a ROM storing a program for executing the utterance database construction processing routine described later, and is functionally configured as follows.

As shown in FIG. 1, the utterance database construction device 100 according to the present embodiment includes an utterance data input unit 1, an utterance database construction unit 2, and an utterance database 3.

The utterance data input unit 1 accepts input of an utterance set containing a plurality of utterances acquired from a microblog site. In the present embodiment, several million utterances were crawled from a microblog site to form the utterance set.

The utterance database construction unit 2 assigns a dialogue act to each of the utterances in the utterance set accepted by the utterance data input unit 1, using the dialogue act estimation unit 22 described later. It then registers each pair of an utterance and its dialogue act as one record in the utterance database 3 described later. The utterance database 3 can be built by following the usual procedure for building the database of a full-text search engine, using free software such as Lucene or Namazu. The present embodiment uses Lucene. The utterance database construction unit 2 includes an utterance candidate database 20 and a dialogue act estimation unit 22.

The utterance candidate database 20 stores the utterance set accepted by the utterance data input unit 1.

The dialogue act estimation unit 22 estimates the dialogue act of each of the utterances in the utterance set stored in the utterance candidate database 20, assigns the estimated dialogue act to the utterance, and stores the result in the utterance database 3. Specifically, for each utterance, the dialogue act estimation unit 22 pairs the utterance with its dialogue act and stores the pair in the utterance database 3. The present embodiment handles 33 types of dialogue act in total. FIG. 2 shows the list of dialogue acts.

The dialogue act estimation unit 22 may use an estimator that extracts feature values from the words in an utterance and estimates the dialogue act from those feature values; the estimator is built in advance by machine learning. For example, it can be built with a support vector machine, a technique commonly used in document classification. The feature values may be, for example, a frequency vector of the words in the utterance. In the present embodiment, tens of thousands of separately prepared utterances were manually annotated with dialogue acts, and this data was used as training data to train, with a support vector machine, a multi-class classifier that estimates the dialogue act of an utterance.
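The word-frequency feature mentioned above can be sketched in a few lines of Python. The example assumes utterances are already tokenized (the source does not say which tokenizer feeds the classifier) and covers only feature extraction; the resulting vectors would then be passed to a multi-class support vector machine, which is not shown here. The function names are hypothetical.

```python
from collections import Counter


def build_vocabulary(tokenized_utterances):
    """Map each distinct word to a fixed vector index."""
    words = sorted({w for tokens in tokenized_utterances for w in tokens})
    return {word: index for index, word in enumerate(words)}


def frequency_vector(tokens, vocab):
    """Count how often each vocabulary word occurs; unknown words are ignored."""
    counts = Counter(tokens)
    # The vocabulary dict was filled in index order, so iterating it yields
    # the words in the same order as their vector positions.
    return [counts.get(word, 0) for word in vocab]


training = [["こんにちは"], ["私", "は", "ラーメン", "が", "好き", "です"]]
vocab = build_vocabulary(training)
vec = frequency_vector(["私", "は", "ラーメン", "が", "好き", "です"], vocab)
unknown_vec = frequency_vector(["未知語"], vocab)
```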

For example, the dialogue act estimation unit 22 estimates the dialogue act "greeting" for the utterance 「こんにちは」 ("Hello") and the dialogue act "self-disclosure evaluation +" for the utterance 「私はラーメンが好きです」 ("I like ramen").

The utterance database 3 stores a plurality of pairs, each consisting of an utterance and the dialogue act estimated for it by the dialogue act estimation unit 22. The utterance database 3 also holds an inverted index over the pairs so that full-text search is possible.
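To make the role of the inverted index concrete, here is a toy in-memory version in Python. It is only a stand-in for illustration and does not reflect Lucene's API or index format; the record layout (token list plus dialogue act label), the label strings, and the AND-style matching are assumptions for this sketch.

```python
from collections import defaultdict


def build_inverted_index(records):
    """records: list of (tokens, dialogue_act). Maps each token to the ids of records containing it."""
    index = defaultdict(set)
    for record_id, (tokens, _act) in enumerate(records):
        for token in tokens:
            index[token].add(record_id)
    return index


def search(index, records, query_tokens, dialogue_act=None):
    """Return records containing every query token, optionally restricted to one dialogue act."""
    ids = None
    for token in query_tokens:
        postings = index.get(token, set())
        ids = postings if ids is None else ids & postings
    return [records[i] for i in sorted(ids or set())
            if dialogue_act is None or records[i][1] == dialogue_act]


records = [
    (["こんにちは"], "greeting"),
    (["私", "は", "ラーメン", "が", "好き", "です"], "self-disclosure"),
    (["ラーメン", "は", "好き", "ですか"], "question"),
]
index = build_inverted_index(records)
hits = search(index, records, ["ラーメン", "好き"])
question_hits = search(index, records, ["ラーメン", "好き"], dialogue_act="question")
```

Storing the dialogue act alongside each utterance lets a lookup filter on it, which is what makes the utterance and dialogue act pair useful as a single record.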

<System configuration of the predicate-argument structure database construction device>
FIG. 3 is a block diagram showing a predicate-argument structure database construction device 200 according to the embodiment of the present invention. The predicate-argument structure database construction device 200 is a computer including a CPU, a RAM, and a ROM storing a program for executing the predicate-argument structure database construction processing routine described later, and is functionally configured as follows.

As shown in FIG. 3, the predicate-argument structure database construction device 200 according to the present embodiment includes a text data input unit 4, a predicate-argument structure database construction unit 5, and a predicate-argument structure database 6.

The text data input unit 4 accepts input of a large-scale text set containing a plurality of texts. The present embodiment uses text collected from blog sites as the large-scale text set, but the source is not limited to blog sites; any text may be used.

The predicate-argument structure database construction unit 5 extracts predicate-argument structures from each of the texts in the large-scale text set accepted by the text data input unit 4 and stores them in the predicate-argument structure database 6. The predicate-argument structure database construction unit 5 includes a large-scale text database 50, a morphological analysis unit 52, a dependency analysis unit 54, and a predicate-argument structure extraction unit 56.

The large-scale text database 50 stores the large-scale text set accepted by the text data input unit 4.

The morphological analysis unit 52 performs morphological analysis on each sentence of the texts in the large-scale text set stored in the large-scale text database 50.

For each sentence of the texts in the large-scale text set, the dependency analysis unit 54 performs dependency analysis based on the morphological analysis result produced by the morphological analysis unit 52, identifies the phrases (bunsetsu), and determines the dependency structure between phrases.

Freely available tools, such as ChaSen and CaboCha, may be used for the morphological analysis by the morphological analysis unit 52 and the dependency analysis by the dependency analysis unit 54. The present embodiment uses JTAG and JDEP, developed by the applicant, for morphological analysis and dependency analysis, respectively.

For each sentence of the texts in the large-scale text set, the predicate-argument structure extraction unit 56 identifies, based on the dependency structure determined by the dependency analysis unit 54, the predicates of the sentence and the case elements filling the cases of each predicate, and extracts them as predicate-argument structures. It then stores the extracted predicate-argument structures in the predicate-argument structure database 6. In the present embodiment, for each predicate in a sentence, the predicate-argument structure extraction unit 56 extracts the elements of the ga, wo, ni, de, to, kara, and made cases (ガ格, ヲ格, ニ格, デ格, ト格, カラ格, マデ格) as the case elements of that predicate, and extracts the combination of the predicate and one or more case elements as a unit. Identical predicate-argument structures are merged and registered in the predicate-argument structure database 6 together with their frequency.

The processing of the predicate-argument structure database construction unit 5 is explained with a concrete example. For the sentence 「太郎が花子に会う」 ("Taro meets Hanako"), 「会う」 ("meet") is the predicate, and the phrases that depend on the predicate's phrase show that the ga-case element is 「太郎」 (Taro) and the ni-case element is 「花子」 (Hanako). From this, the predicate-argument structure "predicate: 会う, ga case: 太郎, ni case: 花子" is extracted. Such predicate-argument structures are extracted from every sentence of the texts in the large-scale text set.
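The extraction and frequency aggregation can be sketched as follows. The input format is a hypothetical simplification of dependency parser output (one dict per phrase, with its content word, trailing particle, head index, and a predicate flag); it is not the output format of JTAG, JDEP, ChaSen, or CaboCha, and the particle-to-case mapping is an assumption for this example.

```python
from collections import Counter

# Hypothetical mapping from case particles to the seven case labels.
PARTICLE_TO_CASE = {"が": "ga", "を": "wo", "に": "ni", "で": "de",
                    "と": "to", "から": "kara", "まで": "made"}


def extract_pas(chunks):
    """Return (predicate, sorted case items) for each predicate with at least one case element."""
    structures = []
    for index, chunk in enumerate(chunks):
        if not chunk["is_predicate"]:
            continue
        cases = {}
        for dependent in chunks:
            case = PARTICLE_TO_CASE.get(dependent["particle"])
            if dependent["head"] == index and case:
                cases[case] = dependent["word"]
        if cases:
            structures.append((chunk["word"], tuple(sorted(cases.items()))))
    return structures


# 「太郎が花子に会う」 as pre-parsed phrases; head -1 marks the root.
sentence = [
    {"word": "太郎", "particle": "が", "head": 2, "is_predicate": False},
    {"word": "花子", "particle": "に", "head": 2, "is_predicate": False},
    {"word": "会う", "particle": "", "head": -1, "is_predicate": True},
]

extracted = extract_pas(sentence)

# Identical structures are merged and stored with their frequency.
frequencies = Counter()
for parsed in [sentence, sentence]:
    frequencies.update(extract_pas(parsed))
```

Representing each structure as a hashable tuple is what lets `Counter` merge identical structures and keep a count.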

The predicate-argument structure database 6 stores the predicate-argument structures extracted by the predicate-argument structure extraction unit 56. Because the predicate-argument structure data is tabular, the predicate-argument structure database 6 may be an ordinary RDB (relational database). A database supporting full-text search may also be used. The present embodiment uses an RDB, implemented with PostgreSQL.

FIG. 4 shows an example of the table data of predicate-argument structures stored in the predicate-argument structure database 6. "ID" in FIG. 4 is the serial number of the predicate-argument structure, and the last column is the frequency. For example, the first row is the predicate-argument structure corresponding to the sentence 「セキレイが餌を探す」 ("A wagtail looks for food").
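A table of this kind, and a lookup by focus word, might look like the following sketch. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so that it runs without a server. The table name, column names, frequency values, and the ordering by frequency are assumptions for illustration; the actual columns of FIG. 4 are not reproduced in this text.

```python
import sqlite3

# Hypothetical column names, one per case handled in the embodiment.
CASE_COLUMNS = ["case_ga", "case_wo", "case_ni", "case_de",
                "case_to", "case_kara", "case_made"]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pas (id INTEGER PRIMARY KEY, predicate TEXT NOT NULL, "
    + ", ".join(f"{col} TEXT" for col in CASE_COLUMNS)
    + ", freq INTEGER NOT NULL)"
)

# Illustrative rows; the frequency values are made up.
conn.execute(
    "INSERT INTO pas (id, predicate, case_ga, case_wo, freq) VALUES (?, ?, ?, ?, ?)",
    (1, "探す", "セキレイ", "餌", 3),
)
conn.execute(
    "INSERT INTO pas (id, predicate, case_ga, case_ni, freq) VALUES (?, ?, ?, ?, ?)",
    (2, "会う", "太郎", "花子", 1),
)


def search_by_focus(connection, focus):
    """Return rows in which any case element equals the focus word, most frequent first."""
    where = " OR ".join(f"{col} = ?" for col in CASE_COLUMNS)
    sql = (f"SELECT id, predicate, {', '.join(CASE_COLUMNS)}, freq "
           f"FROM pas WHERE {where} ORDER BY freq DESC")
    return connection.execute(sql, [focus] * len(CASE_COLUMNS)).fetchall()


rows = search_by_focus(conn, "餌")
```

Keeping one column per case makes the focus lookup a plain equality test over the case columns, which is the kind of query an RDB handles well.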

＜発話生成装置のシステム構成＞ <System configuration of utterance generation device>
図５は、本発明の実施の形態に係る発話生成装置３００を示すブロック図である。この発話生成装置３００は、ＣＰＵと、ＲＡＭと、後述する発話生成処理ルーチンを実行するためのプログラムを記憶したＲＯＭとを備えたコンピュータで構成され、機能的には次に示すように構成されている。 FIG. 5 is a block diagram showing the utterance generation device 300 according to the embodiment of the present invention. The utterance generation device 300 is composed of a computer including a CPU, a RAM, and a ROM that stores a program for executing an utterance generation processing routine to be described later, and is functionally configured as follows.

本実施の形態に係る発話生成装置３００は、図５に示すように、入力部７と、述語項構造データベース８と、変換ルールデータベース９と、言語モデルデータベース１０と、発話データベース１１と、演算部１２と、出力部１３とを備えている。 As shown in FIG. 5, the utterance generation device 300 according to the present embodiment includes an input unit 7, a predicate term structure database 8, a conversion rule database 9, a language model database 10, an utterance database 11, a computing unit 12, and an output unit 13.

入力部７は、焦点と、ユーザの発話と、対話行為とを受け付ける。 The input unit 7 receives a focus, a user's utterance, and a dialogue act.

述語項構造データベース８には、上記述語項構造データベース構築装置２００の述語項構造データベース６に格納されている述語項構造のテーブルデータと同じ述語項構造のテーブルデータが格納されている。 The predicate term structure database 8 stores the same predicate term structure table data as that stored in the predicate term structure database 6 of the predicate term structure database construction device 200 described above.

変換ルールデータベース９には、対話行為毎に、平叙文を、当該対話行為を表現する文に変換するための変換ルールが予め格納されている。 The conversion rule database 9 stores in advance, for each dialogue act, conversion rules for converting a declarative sentence into a sentence expressing that dialogue act.

言語モデルデータベース１０には、予め求められたＮ−ｇｒａｍ言語モデルが格納されている。 The language model database 10 stores an N-gram language model obtained in advance.

発話データベース１１には、上記発話データベース構築装置１００の発話データベース３に格納されている発話及び対話行為の複数ペアと同じ、発話及び対話行為の複数ペアが格納されている。 The utterance database 11 stores a plurality of pairs of utterances and dialogue actions that are the same as a plurality of pairs of utterances and dialogue actions stored in the utterance database 3 of the utterance database construction apparatus 100.

演算部１２は、述語項構造データベース８をまず選択し、発話候補（一つ以上の発話）の取得を試みる。そして、発話候補が取得できない場合は、発話データベース１１を参照し発話候補の取得を試みる。それでも発話が取得できない場合は発話不可を表す情報を出力部１３に返す。発話候補の取得ができていたら、発話候補を出力部１３に返す。なお、演算部１２は、制御部１１０と、述語項構造検索部１１２と、文生成部１１４と、文判定部１１６と、発話検索部１１８とを備えている。 The computing unit 12 first selects the predicate term structure database 8 and attempts to acquire utterance candidates (one or more utterances). If no utterance candidate can be acquired, it refers to the utterance database 11 and attempts to acquire utterance candidates from there. If an utterance still cannot be acquired, it returns information indicating that no utterance is possible to the output unit 13. If utterance candidates have been acquired, it returns them to the output unit 13. The computing unit 12 includes a control unit 110, a predicate term structure search unit 112, a sentence generation unit 114, a sentence determination unit 116, and an utterance search unit 118.

制御部１１０には、述語項構造データベース８から優先して発話に用いる発話候補を検索するという情報が設定値として記録されている。制御部１１０は、当該設定値に従って、述語項構造データベース８を選択する。本実施の形態では、述語項構造データベース８から優先して発話候補の検索をする場合について説明する。 In the control unit 110, information for searching for a speech candidate to be used for speech preferentially from the predicate term structure database 8 is recorded as a set value. The control unit 110 selects the predicate term structure database 8 according to the set value. In the present embodiment, a description will be given of a case where an utterance candidate is searched with priority from the predicate term structure database 8.

述語項構造検索部１１２は、入力部７によって受け付けた焦点に基づいて、述語項構造データベース８から、焦点と一致する格要素を含む述語項構造を検索する。 Based on the focus received by the input unit 7, the predicate term structure search unit 112 searches the predicate term structure database 8 for predicate term structures that include a case element matching the focus.
この際、検索された述語項構造を、頻度の降順でソートし、その上位Ｊ件を出力してもよい。なお、ユーザの発話を入力とし、全文検索可能な述語項構造データベース８を検索してもよい。その場合は、ユーザ発話の単語をなるべく多く含む述語項構造を、述語項構造データベース８から検索してもよい。また、後述する発話検索部１１８と同様のベクトル空間モデルを用いてランキングを行い、上位Ｊ件を出力すればよい。 At this time, the retrieved predicate term structures may be sorted in descending order of frequency, and the top J items may be output. Alternatively, the user's utterance may be used as the input to search a full-text-searchable predicate term structure database 8. In that case, predicate term structures containing as many words of the user's utterance as possible may be retrieved from the predicate term structure database 8. Ranking may then be performed using the same vector space model as the utterance search unit 118 described later, and the top J items may be output.
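The focus-based search with frequency ranking can be sketched as below. This is only an in-memory illustration under stated assumptions: the record layout, the function name `search_by_focus`, and the frequency values are invented for the example, whereas the embodiment stores the data in a relational database.

```python
def search_by_focus(records, focus, top_j=3):
    """Return up to top_j records with a case element equal to the focus, most frequent first."""
    hits = [r for r in records if focus in r["args"].values()]
    hits.sort(key=lambda r: r["freq"], reverse=True)
    return hits[:top_j]

# Toy records; the frequencies are made up for illustration.
records = [
    {"pred": "歌う", "args": {"ga": "ミスチル"}, "freq": 12},
    {"pred": "歌う", "args": {"ga": "ミスチル", "wo": "主題歌"}, "freq": 30},
    {"pred": "食べる", "args": {"wo": "ラーメン"}, "freq": 50},
    {"pred": "好き", "args": {"ga": "ミスチル"}, "freq": 7},
]
top = search_by_focus(records, "ミスチル", top_j=2)
```

Here `top` holds the two most frequent structures that mention the focus, and a focus that appears in no record yields an empty list.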

図６は「ミスチル」が焦点として入力され、当該焦点に基づいて検索された述語項構造である。頻度の降順で順位が付けられている。上記図６では、述語や格要素に０や１という番号を付与している。当該番号は、本実施の形態で用いる述語項構造の簡易な記法であり、述語は０番、ガ格、ヲ格、ニ格、デ格、ト格、カラ格、マデ格の要素はそれぞれ１番〜７番の番号で表す。すなわち、このデータでは、「０＿歌う」は述語が「歌う」であること、「２＿主題歌」はヲ格が「主題歌」であることを表している。また、述語項構造検索部１１２は、述語項構造についての検索結果の上位Ｊ件を文生成部１１４へ出力する。 FIG. 6 shows the predicate term structures retrieved when 「ミスチル」 (Mr. Children) is input as the focus. They are ranked in descending order of frequency. In FIG. 6, numbers such as 0 and 1 are attached to the predicates and case elements. These numbers are a simple notation for predicate term structures used in the present embodiment: the predicate is number 0, and the elements of the ga case, wo case, ni case, de case, to case, kara case, and made case are numbers 1 to 7, respectively. That is, in this data, 「０＿歌う」 indicates that the predicate is 「歌う」 ("sing"), and 「２＿主題歌」 indicates that the wo-case element is 「主題歌」 ("theme song"). The predicate term structure search unit 112 outputs the top J search results to the sentence generation unit 114.
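The numbered notation can be decoded mechanically. The sketch below assumes that the items are separated by whitespace and written with an ASCII underscore and ASCII digits; the source text does not show the separator, so this is an assumption, as are the name `parse_compact` and the romanized case labels.

```python
# Index-to-case mapping as described in the text: 0 is the predicate, 1 to 7 are the cases.
CASE_BY_INDEX = {1: "ga", 2: "wo", 3: "ni", 4: "de", 5: "to", 6: "kara", 7: "made"}

def parse_compact(text):
    """Parse e.g. '0_歌う 2_主題歌' into (predicate, {case: element})."""
    predicate = None
    args = {}
    for token in text.split():
        index, _, value = token.partition("_")
        number = int(index)
        if number == 0:
            predicate = value
        else:
            args[CASE_BY_INDEX[number]] = value
    if predicate is None:
        raise ValueError("no predicate (index 0) in: " + text)
    return predicate, args
```

For instance, `parse_compact("0_歌う 2_主題歌")` gives the predicate 「歌う」 with 「主題歌」 as its wo-case element.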

文生成部１１４は、述語項構造検索部１１２によって検索された述語項構造の各々について、述語、格要素、及び格要素の格について予め定められた順番に従って、当該述語項構造の述語、前記格要素、及び格要素の格を並べた平叙文を生成する。また、文生成部１１４は、述語項構造の各々について、入力部７で受け付けた対話行為に対応して変換ルールデータベース９に格納された変換ルールに基づいて、当該述語構造に基づいて生成された平叙文を、当該対話行為を表現する文に変換して、発話文を生成する。 For each of the predicate term structures retrieved by the predicate term structure search unit 112, the sentence generation unit 114 generates a declarative sentence in which the predicate, the case elements, and the cases of the case elements of that predicate term structure are arranged in a predetermined order. In addition, for each of the predicate term structures, the sentence generation unit 114 converts the declarative sentence generated from that predicate term structure into a sentence expressing the dialogue act received by the input unit 7, based on the conversion rule stored in the conversion rule database 9 for that dialogue act, thereby generating an utterance sentence.

例えば、まず、文生成部１１４は、述語をｐｒｅｄ、格をｃａｓｅ、格要素をａｒｇとして、「ａｒｇｃａｓｅｐｒｅｄ」という語順で平叙文を生成する。述語項構造が「０＿歌う２＿曲」であれば、「曲を歌う」という平叙文を生成する。なお、日本語の特性から、ヲ格、ニ格、ガ格、デ格、ト格、マデ格、カラ格の順で述語に近くなるように格要素を並べて配置する。複数の格要素があれば、「ｃａｓｅｐｒｅｄ」の部分を繰り返す。述語項構造が「０＿歌う２＿曲４＿テレビ」であれば、「テレビで曲を歌う」という平叙文を生成する。 For example, the sentence generation unit 114 first generates a declarative sentence in the word order "arg case pred", where pred is the predicate, case is the case, and arg is the case element. If the predicate term structure is 「０＿歌う２＿曲」, the declarative sentence 「曲を歌う」 ("sing a song") is generated. Reflecting the characteristics of Japanese, the case elements are arranged so that they come closer to the predicate in the order wo case, ni case, ga case, de case, to case, made case, kara case. If there are multiple case elements, the "case pred" portion is repeated. If the predicate term structure is 「０＿歌う２＿曲４＿テレビ」, the declarative sentence 「テレビで曲を歌う」 ("sing a song on television") is generated.
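The ordering rule for the declarative (plain) sentence can be written down directly. In this sketch the particle table and the name `generate_declarative` are assumptions added for illustration; the proximity order is the one stated in the text (wo nearest to the predicate, kara farthest).

```python
# Surface particles for each case (assumed mapping for illustration).
PARTICLE = {"ga": "が", "wo": "を", "ni": "に", "de": "で", "to": "と", "kara": "から", "made": "まで"}
# Cases listed from nearest to the predicate to farthest, as stated in the text.
NEAREST_FIRST = ["wo", "ni", "ga", "de", "to", "made", "kara"]

def generate_declarative(predicate, args):
    """Concatenate 'element + particle' pairs, farthest case first, then the predicate."""
    parts = []
    for case in reversed(NEAREST_FIRST):
        if case in args:
            parts.append(args[case] + PARTICLE[case])
    return "".join(parts) + predicate

one_arg = generate_declarative("歌う", {"wo": "曲"})
two_args = generate_declarative("歌う", {"wo": "曲", "de": "テレビ"})
```

With the two structures from the text, `one_arg` is 「曲を歌う」 and `two_args` is 「テレビで曲を歌う」. A fixed order of this kind can also yield odd sentences such as 「興味がラーメンにある」, which is why a later naturalness check is needed.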

次に、文生成部１１４は、生成された平叙文について、変換ルールデータベース９に格納された対話行為に対応する変換ルールとして、文末の単語系列及び品詞系列に基づく書き換えルールを用いて、当該対話行為を表す文に変換する。 Next, the sentence generation unit 114 converts the generated declarative sentence into a sentence expressing the dialogue act, using, as the conversion rule stored in the conversion rule database 9 for that dialogue act, a rewrite rule based on the word sequence and part-of-speech sequence at the end of the sentence.

例えば、平叙文「ミスチルが好き」のように形容動詞が最後の単語で、対話行為が「質問評価」の場合、「ミスチルが好きですか？」のように「ですか」を付与することにより、対話行為「質問評価」を表す文に変換する。平叙文「ミスチルが歌う」のように最後の単語が動詞で、対話行為が「質問＿情報提供要求」であれば、「のですか？」を追加し、「ミスチルは歌うのですか？」に変換する。もしくは、所与の活用辞書を参照し、最後の動詞を連用形（「歌い」）にし、「ますか？」を付与することで「ミスチルは歌いますか？」という文に変換する。対話行為を表す文の変換候補が複数ある場合はランダムにいずれかを選択し、変換結果とする。対話行為のそれぞれについて変換ルールを準備しておくことで、変換ルールに合致する平叙文であれば、所与の対話行為を表す文に変換することができる。 For example, when the last word is a na-adjective (形容動詞), as in the declarative sentence 「ミスチルが好き」 ("I like Mr. Children"), and the dialogue act is "question evaluation", appending 「ですか」 yields 「ミスチルが好きですか？」 ("Do you like Mr. Children?"), a sentence expressing the dialogue act "question evaluation". When the last word is a verb, as in the declarative sentence 「ミスチルが歌う」 ("Mr. Children sing"), and the dialogue act is "question_information request", 「のですか？」 is appended and the sentence is converted to 「ミスチルは歌うのですか？」 ("Do Mr. Children sing?"). Alternatively, a given conjugation dictionary is consulted, the final verb is changed to its continuative form (「歌い」), and 「ますか？」 is appended, converting the sentence to 「ミスチルは歌いますか？」. When there are multiple conversion candidates for a sentence expressing the dialogue act, one of them is selected at random as the conversion result. By preparing conversion rules for each dialogue act, any declarative sentence that matches a conversion rule can be converted into a sentence expressing the given dialogue act.

もし、生成した平叙文に合致する変換ルールが見つからない場合は、当該平叙文を棄却する。変換ルールに従って変換できた場合は変換済みの文を、発話文として文判定部１１６に送る。入力がＪ発話であれば、Ｍ件（Ｍ＜＝Ｊ）の変換済み発話文が文判定部１１６に送られる。 If no conversion rule matching the generated declarative sentence is found, the declarative sentence is rejected. If the sentence can be converted according to a conversion rule, the converted sentence is sent to the sentence determination unit 116 as an utterance sentence. If J sentences are input, M converted utterance sentences (M <= J) are sent to the sentence determination unit 116.
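A minimal sketch of rule-based conversion with rejection follows. It makes several simplifying assumptions: the part of speech of the last word is passed in by the caller (a morphological analyzer would normally supply it), each rule only appends an ending, and the table `RULES` contains just the two endings quoted in the text. The particle change from 「が」 to 「は」 seen in the example and the conjugation-dictionary variant are not modeled.

```python
import random

# (dialogue act, part of speech of the last word) -> candidate sentence endings.
# Only the two endings quoted in the text are listed; real rule sets would be larger.
RULES = {
    ("質問評価", "形容動詞"): ["ですか？"],
    ("質問_情報提供要求", "動詞"): ["のですか？"],
}

def convert(sentence, last_pos, dialogue_act, rng=random):
    """Return the converted sentence, or None when no rule matches (the sentence is rejected)."""
    endings = RULES.get((dialogue_act, last_pos))
    if not endings:
        return None
    # When several candidates exist, one is chosen at random, as in the text.
    return sentence + rng.choice(endings)

def convert_all(items, dialogue_act):
    """Convert J (sentence, last_pos) pairs and keep the M <= J that match a rule."""
    converted = (convert(s, pos, dialogue_act) for s, pos in items)
    return [c for c in converted if c is not None]
```

A declarative sentence whose final part of speech has no rule for the requested dialogue act simply drops out of the result, which is why the output count M can be smaller than J.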

ここで、文生成部１１４によって生成された発話文は、自動で生成された文であるため、人間が発話したものよりも不自然になっている可能性がある。例えば、述語項構造によっては文生成部１１４が決定した語順が不自然になることがある。例えば、「ラーメンに興味がある」という文の方が「興味がラーメンにある」よりも自然であろう。また、対話行為に沿うよう変換ルールによって変換した結果の文が不自然になる場合もある。例えば、平叙文「桜が咲く」を、対話行為「提案」を表す文に変換すると「〜しましょう」という文末を付与するため「桜が咲きましょう」となるが、この文は不自然な文である。 Because the utterance sentences generated by the sentence generation unit 114 are generated automatically, they may be less natural than sentences uttered by a human. For example, depending on the predicate term structure, the word order determined by the sentence generation unit 114 may be unnatural: the sentence 「ラーメンに興味がある」 ("I am interested in ramen") would be more natural than 「興味がラーメンにある」, which contains the same words in a different order. A sentence converted by a conversion rule to fit the dialogue act may also be unnatural. For example, converting the declarative sentence 「桜が咲く」 ("cherry blossoms bloom") into a sentence expressing the dialogue act "proposal" appends the sentence ending 「〜しましょう」 ("let's ..."), yielding 「桜が咲きましょう」, which is an unnatural sentence.

このような不自然な文を除外するため、本実施の形態では、Ｎ−ｇｒａｍ言語モデルを用いたパープレキシティによるフィルタリングを行う。 To exclude such unnatural sentences, the present embodiment performs perplexity-based filtering using an N-gram language model.

すなわち、文判定部１１６は、文生成部１１４によって生成された発話文の各々について、言語モデルデータベース１０に格納されたＮ−ｇｒａｍ言語モデルに基づいて、文の生成されにくさを示すパープレキシティ値を算出し、当該パープレキシティ値が閾値以下であれば自然な文であると判定するルールに基づいて、発話文の各々から、自然な文でない発話文を除去し、除去されなかった発話文の各々を発話候補として出力部１３に出力する。 That is, for each of the utterance sentences generated by the sentence generation unit 114, the sentence determination unit 116 calculates, based on the N-gram language model stored in the language model database 10, a perplexity value indicating how unlikely the sentence is to be generated. Based on a rule that judges a sentence to be natural if its perplexity value is equal to or less than a threshold, the sentence determination unit 116 removes the utterance sentences that are not natural and outputs each remaining utterance sentence to the output unit 13 as an utterance candidate.

具体的には、大規模なテキストデータからＮ−ｇｒａｍ言語モデルを構築しておく。例えば、Ｎは５である。５−ｇｒａｍ言語モデルには、与えられた５つの単語の並びがどの程度テキストデータに現れるかという確率が含まれている。当該言語モデルに照らし合わせたとき、ある文がどれほど出現しにくいかを示す値がパープレキシティである。パープレキシティは言語の複雑さ（生成されにくさ）を表す言語処理で一般的な指標である。パープレキシティ値が閾値未満の発話文が複数ある場合には、複数の発話文を発話候補として出力部１３に送る。 Specifically, an N-gram language model is constructed from large-scale text data. For example, N is 5. The 5-gram language model includes the probability of how much a given sequence of five words appears in text data. Perplexity is a value that indicates how hard a sentence appears when compared to the language model. Perplexity is a general index in language processing that represents the complexity (difficulty of being generated) of a language. When there are a plurality of utterance sentences whose perplexity value is less than the threshold, the plurality of utterance sentences are sent to the output unit 13 as utterance candidates.
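The perplexity filter can be illustrated with a toy language model. Note the substitution: the embodiment uses a 5-gram model built from large-scale text, whereas this sketch uses a bigram model with add-one smoothing trained on two hand-written sentences, purely to keep the example short. The class name, the corpus, and the threshold value are all assumptions.

```python
import math
from collections import Counter

class BigramModel:
    """Toy bigram model with add-one smoothing (a stand-in for the 5-gram model)."""

    def __init__(self, corpus):
        self.context = Counter()   # how often each word occurs as a left context
        self.bigrams = Counter()
        self.vocab = set()
        for words in corpus:
            tokens = ["<s>"] + list(words) + ["</s>"]
            self.vocab.update(tokens)
            for prev, cur in zip(tokens, tokens[1:]):
                self.context[prev] += 1
                self.bigrams[(prev, cur)] += 1

    def prob(self, prev, cur):
        # The extra 1 in the vocabulary size reserves probability mass for unseen words.
        v = len(self.vocab) + 1
        return (self.bigrams[(prev, cur)] + 1) / (self.context[prev] + v)

    def perplexity(self, words):
        """exp of the average negative log probability per predicted token."""
        tokens = ["<s>"] + list(words) + ["</s>"]
        log_sum = sum(math.log(self.prob(p, c)) for p, c in zip(tokens, tokens[1:]))
        return math.exp(-log_sum / (len(tokens) - 1))

def filter_natural(model, sentences, threshold):
    """Keep the sentences whose perplexity is at or below the threshold."""
    return [s for s in sentences if model.perplexity(s) <= threshold]

model = BigramModel([["ラーメン", "に", "興味", "が", "ある"], ["曲", "を", "歌う"]])
natural = ["ラーメン", "に", "興味", "が", "ある"]
odd = ["興味", "が", "ラーメン", "に", "ある"]
kept = filter_natural(model, [natural, odd], threshold=7.0)
```

On this toy corpus the reordered sentence receives the higher perplexity and is filtered out, while the sentence in natural order is kept. With a real model the threshold would have to be tuned on data.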

図７は、システムが生成した発話文についてパープレキシティを計算した結果である。最初の文のパープレキシティは５００以上であり非常に高い。実際に文を読むと日本語として不自然だと考えられるものである。最後の文はパープレキシティが非常に低く日本語としても自然な文であることが分かる。 FIG. 7 shows the result of calculating perplexity for the utterance sentence generated by the system. The first sentence has a very high perplexity of over 500. If you actually read the sentence, it will be considered unnatural as Japanese. The last sentence has a very low perplexity and is understood to be a natural sentence even in Japanese.

制御部１１０は、文判定部１１６によって発話候補が出力されなかった場合に、発話データベース１１を選択する。 The control unit 110 selects the utterance database 11 when the utterance candidate is not output by the sentence determination unit 116.

発話検索部１１８は、制御部１１０によって発話データベース１１が選択された場合に、入力部７によって受け付けたユーザの発話及び対話行為に基づいて、発話データベース１１から、対話行為が一致し、かつ、当該ユーザの発話に対応する発話を検索し、該当する発話が検索された場合に検索された発話を発話候補として出力し、該当する発話が検索されなかった場合に発話不可であることを示す情報を出力する。 When the utterance database 11 is selected by the control unit 110, the utterance search unit 118 searches the utterance database 11, based on the user's utterance and the dialogue act received by the input unit 7, for utterances whose dialogue act matches and which correspond to the user's utterance. If such an utterance is found, the utterance search unit 118 outputs it as an utterance candidate; if none is found, it outputs information indicating that no utterance is possible.

例えば、発話検索部１１８は、入力部７によって受け付けた対話行為と対話行為が一致し、かつ、ユーザの発話中の単語をなるべく多く含むような発話を、発話データベース１１から検索する。なお、発話検索部１１８は、入力部７によって受け付けた、対話行為と焦点とに基づいて、発話データベース１１を検索し、対話行為が一致し、かつ、焦点の単語を含む発話を検索してもよい。 For example, the utterance search unit 118 searches the utterance database 11 for utterances whose dialogue act matches the dialogue act received by the input unit 7 and which contain as many of the words in the user's utterance as possible. Alternatively, the utterance search unit 118 may search the utterance database 11 based on the dialogue act and the focus received by the input unit 7, retrieving utterances whose dialogue act matches and which contain the focus word.

発話検索部１１８での検索結果のランキングでは、全文検索エンジンでよく用いられる単語のＴＦ・ＩＤＦで重み付けをしたベクトル空間モデルを用いてもよいし、同じくよく用いられるＯＫＡＰＩ／ＢＭ２５などを用いてもよい。本実施の形態では、Ｌｕｃｅｎｅのデフォルトである、ＴＦ・ＩＤＦで重み付けをしたベクトル空間モデルによるランキングを用いる。該当する発話が検索されたら、上位Ｋ件を発話候補として出力部１３へ出力する。該当する発話が検索されなかったら検索結果が空の旨の情報を出力部１３へ出力する。 To rank the search results, the utterance search unit 118 may use a vector space model weighted by word TF-IDF, which is commonly used in full-text search engines, or the similarly common Okapi BM25. The present embodiment uses ranking by the TF-IDF-weighted vector space model, which is the default in Lucene. If matching utterances are found, the top K are output to the output unit 13 as utterance candidates. If no matching utterance is found, information indicating that the search result is empty is output to the output unit 13.
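The retrieval step can be sketched with a small TF-IDF vector space model in plain Python. This is a generic textbook formulation, not Lucene's actual scoring formula, and it assumes the utterances are already split into words. The database entries and the dialogue act label 「情報提供」 are invented for the example; 「自己開示評価+」 is the label that appears in the text.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one TF-IDF vector (dict) per document, plus the IDF table."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    idf = {w: math.log(n / df[w]) + 1.0 for w in df}
    vectors = [{w: tf * idf[w] for w, tf in Counter(d).items()} for d in docs]
    return vectors, idf

def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search_utterances(db, query_words, dialogue_act, top_k=2):
    """Rank utterances with a matching dialogue act by cosine similarity to the query."""
    pool = [e for e in db if e["act"] == dialogue_act]
    if not pool:
        return []
    vectors, idf = tfidf_vectors([e["words"] for e in pool])
    query = {w: tf * idf[w] for w, tf in Counter(query_words).items() if w in idf}
    scored = [(cosine(query, v), e["text"]) for v, e in zip(vectors, pool)]
    scored = [s for s in scored if s[0] > 0.0]   # drop utterances sharing no word
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored[:top_k]

db = [
    {"text": "ラーメンが好き", "words": ["ラーメン", "が", "好き"], "act": "自己開示評価+"},
    {"text": "ラーメンを食べる", "words": ["ラーメン", "を", "食べる"], "act": "情報提供"},
    {"text": "寿司が好き", "words": ["寿司", "が", "好き"], "act": "自己開示評価+"},
]
result = search_utterances(db, ["ラーメン"], "自己開示評価+")
```

An empty list plays the role of the "search result is empty" signal mentioned in the text.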

例えば、図８は対話行為として「自己開示評価+」、焦点として「ラーメン」を入力としたときの検索結果である。スコアはベクトル空間モデルに基づく検索結果のスコアである。 For example, FIG. 8 shows search results when “self-disclosure evaluation +” is input as the dialogue action and “ramen” is input as the focus. The score is a search result score based on the vector space model.

出力部１３は、文判定部１１６又は発話検索部１１８から出力された発話候補を出力する。出力された発話候補は、対話システム等の上位モジュールによって発話に使用される。上位モジュールでは、たとえば、複数発話候補があればその中からランダムに一つを発話する。 The output unit 13 outputs the utterance candidate output from the sentence determination unit 116 or the utterance search unit 118. The output utterance candidate is used for utterance by an upper module such as a dialogue system. In the upper module, for example, if there are a plurality of utterance candidates, one of them is uttered at random.

＜発話データベース構築装置の作用＞ <Operation of utterance database construction device>
次に、本実施の形態に係る発話データベース構築装置１００の作用について説明する。まず、マイクロブログサイトから取得された、複数の発話を含む発話集合が発話データベース構築装置１００に入力されると、発話データベース構築装置１００によって、図９に示す発話データベース構築処理ルーチンが実行される。 Next, the operation of the utterance database construction device 100 according to the present embodiment will be described. First, when an utterance set including a plurality of utterances acquired from a microblog site is input to the utterance database construction device 100, the utterance database construction device 100 executes the utterance database construction processing routine shown in FIG. 9.

まず、ステップＳ１００において、発話データ入力部１によって、入力された発話集合を受け付け、発話候補データベース２０に格納する。 First, in step S100, the input utterance set is received by the utterance data input unit 1 and stored in the utterance candidate database 20.

ステップＳ１０２において、対話行為推定部２２によって、上記ステップＳ１００で発話候補データベース２０に格納された発話集合に含まれる複数の発話のうち、１つの発話を設定する。 In step S102, the dialogue action estimation unit 22 sets one utterance among a plurality of utterances included in the utterance set stored in the utterance candidate database 20 in step S100.

ステップＳ１０４において、対話行為推定部２２によって、上記ステップＳ１０２で設定された発話について、当該発話の対話行為を推定し、当該発話に、推定された対話行為を付与する。 In step S104, the dialogue action estimation unit 22 estimates the dialogue action of the utterance for the utterance set in step S102, and assigns the estimated dialogue action to the utterance.

ステップＳ１０６において、対話行為推定部２２によって、上記ステップＳ１０２で設定された発話について、当該発話と、上記ステップＳ１０４で付与された対話行為とのペアを、発話データベース３に格納する。 In step S106, the dialogue action estimation unit 22 stores a pair of the utterance and the dialogue action given in step S104 in the utterance database 3 for the utterance set in step S102.

ステップＳ１０８において、発話候補データベース２０に格納された全ての発話について、上記ステップＳ１０２〜Ｓ１０６の処理を実行したか否かを判定する。上記ステップＳ１０２〜Ｓ１０６の処理を実行していない発話が存在する場合には、ステップＳ１０２へ戻る。一方、発話候補データベース２０に格納された全ての発話について、上記ステップＳ１０２〜Ｓ１０６の処理を実行した場合には、発話データベース構築処理ルーチンを終了する。 In step S108, it is determined whether or not the processes in steps S102 to S106 have been executed for all utterances stored in the utterance candidate database 20. If there is an utterance for which the processes of steps S102 to S106 are not executed, the process returns to step S102. On the other hand, when the processes in steps S102 to S106 are executed for all utterances stored in the utterance candidate database 20, the utterance database construction process routine is terminated.

＜述語項構造データベース構築装置の作用＞ <Operation of predicate term structure database construction device>
次に、本実施の形態に係る述語項構造データベース構築装置２００の作用について説明する。まず、複数のテキストを含む大規模テキスト集合が述語項構造データベース構築装置２００に入力されると、述語項構造データベース構築装置２００によって、図１０に示す述語項構造データベース構築処理ルーチンが実行される。 Next, the operation of the predicate term structure database construction device 200 according to the present embodiment will be described. First, when a large-scale text set including a plurality of texts is input to the predicate term structure database construction device 200, the predicate term structure database construction device 200 executes the predicate term structure database construction processing routine shown in FIG. 10.

まず、ステップＳ２００において、テキストデータ入力部４によって、大規模テキスト集合を受け付け、大規模テキストデータベース５０に格納する。 First, in step S200, the text data input unit 4 receives a large-scale text set and stores it in the large-scale text database 50.

ステップＳ２０２において、述語項構造データベース構築部５によって、上記ステップＳ２００で大規模テキストデータベース５０に格納された大規模テキスト集合に含まれる複数のテキストのうち、１つのテキストを設定する。 In step S202, the predicate term structure database construction unit 5 sets one text among a plurality of texts included in the large-scale text set stored in the large-scale text database 50 in step S200.

ステップＳ２０４において、上記ステップＳ２０２で設定されたテキストに含まれる文のうち、１つの文を設定する。 In step S204, one sentence is set out of the sentences included in the text set in step S202.

ステップＳ２０６において、形態素解析部５２によって、上記ステップＳ２０４で設定された文について、形態素解析を行う。 In step S206, the morpheme analysis unit 52 performs morpheme analysis on the sentence set in step S204.

ステップＳ２０８において、係り受け解析部５４によって、上記ステップＳ２０４で設定された文について、上記ステップＳ２０６によって解析された形態素解析結果に基づいて、係り受け解析を行い、文節を同定し、文節間の係り受け構造を決定する。 In step S208, the dependency analysis unit 54 performs dependency analysis on the sentence set in step S204 based on the morphological analysis result analyzed in step S206, identifies the phrase, and determines the relation between phrases. Determine the receiving structure.

ステップＳ２１０において、述語項構造抽出部５６によって、上記ステップＳ２０４で設定された文について、上記ステップＳ２０８で決定された係り受け構造に基づいて、上記ステップＳ２０４で設定された文の述語と当該述語に対応する格の要素である格要素とを同定し、述語項構造として抽出する。そして、ステップＳ２１０において、抽出された述語項構造を、メモリ（図示省略）に一時的に格納する。 In step S210, for the sentence set in step S204 by the predicate term structure extraction unit 56, the predicate of the sentence set in step S204 and the predicate are set based on the dependency structure determined in step S208. Case elements that are corresponding case elements are identified and extracted as predicate term structures. In step S210, the extracted predicate term structure is temporarily stored in a memory (not shown).

ステップＳ２１２において、上記ステップＳ２０２で設定されたテキストに含まれる全ての文について、上記ステップＳ２０４〜Ｓ２１０の処理を実行したか否かを判定する。上記ステップＳ２０４〜Ｓ２１０の処理を実行していない文が存在する場合には、ステップＳ２０４へ戻る。一方、上記ステップＳ２０２で設定されたテキストに含まれる全ての文について、上記ステップＳ２０４〜Ｓ２１０の処理を実行した場合には、ステップＳ２１４へ進む。 In step S212, it is determined whether or not the processes in steps S204 to S210 have been executed for all sentences included in the text set in step S202. If there is a sentence for which the processes in steps S204 to S210 have not been executed, the process returns to step S204. On the other hand, if the processes in steps S204 to S210 have been executed for all sentences included in the text set in step S202, the process proceeds to step S214.

ステップＳ２１４において、大規模テキストデータベース５０に格納された全てのテキストについて、上記ステップＳ２０２〜Ｓ２１２の処理を実行したか否かを判定する。上記ステップＳ２０２〜Ｓ２１２の処理を実行していないテキストが存在する場合には、ステップＳ２０２へ戻る。一方、大規模テキストデータベース５０に格納された全てのテキストについて、上記ステップＳ２０２〜Ｓ２１２の処理を実行した場合には、ステップＳ２１６へ進む。 In step S214, it is determined whether or not the processes in steps S202 to S212 have been executed for all the texts stored in the large-scale text database 50. If there is a text that has not been subjected to the processes in steps S202 to S212, the process returns to step S202. On the other hand, if the processes in steps S202 to S212 are executed for all texts stored in the large-scale text database 50, the process proceeds to step S216.

ステップＳ２１６において、上記ステップＳ２１０でメモリ（図示省略）に格納された述語項構造と、当該述語項構造の頻度とを述語項構造データベース６へ格納する。 In step S216, the predicate term structure stored in the memory (not shown) in step S210 and the frequency of the predicate term structure are stored in the predicate term structure database 6.

＜発話生成装置の作用＞ <Operation of utterance generation device>
次に、本実施の形態に係る発話生成装置３００の作用について説明する。まず、発話データベース構築装置１００の発話データベース３に記憶されている発話及び対話行為の複数ペアが、発話生成装置３００に入力されると、発話データベース１１に格納される。次に、述語項構造データベース構築装置２００の述語項構造データベース６に記憶されている述語項構造のテーブルデータが、発話生成装置３００に入力されると、述語項構造データベース８に格納される。そして、対話システム等の上位モジュールから、ユーザの発話と、焦点と、対話行為とが発話生成装置３００に入力されると、発話生成装置３００によって、図１１に示す発話生成処理ルーチンが実行される。 Next, the operation of the utterance generation device 300 according to the present embodiment will be described. First, when the plurality of pairs of utterances and dialogue acts stored in the utterance database 3 of the utterance database construction device 100 are input to the utterance generation device 300, they are stored in the utterance database 11. Next, when the predicate term structure table data stored in the predicate term structure database 6 of the predicate term structure database construction device 200 is input to the utterance generation device 300, it is stored in the predicate term structure database 8. Then, when a user's utterance, a focus, and a dialogue act are input to the utterance generation device 300 from a higher-level module such as a dialogue system, the utterance generation device 300 executes the utterance generation processing routine shown in FIG. 11.

まず、ステップＳ３００において、入力部７によって、焦点と、ユーザの発話と、対話行為とを受け付ける。 First, in step S300, the input unit 7 accepts a focus, a user's speech, and a dialogue act.

ステップＳ３０１において、制御部１１０によって、述語項構造データベース８を選択する。 In step S301, the control unit 110 selects the predicate term structure database 8.

ステップＳ３０２において、述語項構造検索部１１２によって、上記ステップＳ３００で受け付けた焦点に基づいて、述語項構造データベース８から、焦点と一致する格要素を含む述語項構造を検索する。 In step S302, the predicate term structure search unit 112 searches the predicate term structure database 8 for a predicate term structure including a case element that matches the focus, based on the focus received in step S300.

ステップＳ３０４において、文生成部１１４によって、上記ステップＳ３０２で検索された述語項構造の各々について、述語、格要素、及び格要素の格について予め定められた順番に従って、当該述語項構造の述語、前記格要素、及び格要素の格を並べた平叙文を生成する。 In step S304, for each of the predicate term structures retrieved in step S302, the sentence generation unit 114 generates a declarative sentence in which the predicate, the case elements, and the cases of the case elements of that predicate term structure are arranged in a predetermined order.

ステップＳ３０６において、文生成部１１４によって、述語項構造の各々について、上記ステップＳ３００で受け付けた対話行為に対応して変換ルールデータベース９に格納された変換ルールに基づいて、上記ステップＳ３０４で当該述語構造に基づいて生成された平叙文を、当該対話行為を表現する文に変換して、発話文を生成する。 In step S306, for each of the predicate term structures, the sentence generation unit 114 converts the declarative sentence generated from that predicate term structure in step S304 into a sentence expressing the dialogue act received in step S300, based on the conversion rule stored in the conversion rule database 9 for that dialogue act, thereby generating an utterance sentence.

ステップＳ３０８において、文判定部１１６によって、上記ステップＳ３０６で生成された発話文の各々について、言語モデルデータベース１０に格納されたＮ−ｇｒａｍ言語モデルに基づいて、パープレキシティ値を算出し、当該パープレキシティ値が閾値以下であれば自然な文であると判定するルールに基づいて、上記ステップＳ３０６で生成された発話文の各々から、不自然な文を除去し、残った自然な発話文があるか否かを判定する。そして、上記ステップＳ３０６で生成された発話文の各々から、自然な発話文が少なくとも１つ残った場合には、ステップＳ３１４へ進む。一方、上記ステップＳ３０６で生成された発話文の全てが、不自然な文として除去された場合には、ステップＳ３０９へ進む。 In step S308, the sentence determination unit 116 calculates a perplexity value based on the N-gram language model stored in the language model database 10 for each of the uttered sentences generated in step S306. Based on the rule for determining that the sentence is a natural sentence if the plexity value is less than or equal to the threshold value, the unnatural sentence is removed from each of the utterance sentences generated in step S306, and the remaining natural utterance sentence is determined. It is determined whether or not there is. If at least one natural utterance remains from each of the utterances generated in step S306, the process proceeds to step S314. On the other hand, if all of the utterance sentences generated in step S306 are removed as unnatural sentences, the process proceeds to step S309.

ステップＳ３０９において、制御部１１０によって、発話データベース１１を選択する。 In step S309, the control unit 110 selects the utterance database 11.

ステップＳ３１０において、発話検索部１１８によって、上記ステップＳ３００で受け付けたユーザの発話及び対話行為に基づいて、発話データベース１１から、対話行為が一致し、かつ、当該ユーザの発話に対応する発話を検索する。 In step S310, the utterance search unit 118 searches the utterance database 11, based on the user's utterance and the dialogue act received in step S300, for utterances whose dialogue act matches and which correspond to the user's utterance.

ステップＳ３１２において、上記ステップＳ３１０で対話行為が一致し、かつ、当該ユーザの発話に対応する発話が検索されたか否かを判定する。該当する発話が検索された場合には、ステップＳ３１４へ進む。一方、該当する発話が検索されなかった場合には、ステップＳ３１６へ進む。 In step S312, it is determined whether or not the dialog act matches in step S310 and an utterance corresponding to the user's utterance has been searched. If the corresponding utterance is searched, the process proceeds to step S314. On the other hand, if the corresponding utterance has not been searched, the process proceeds to step S316.

ステップＳ３１４において、出力部１３によって、上記ステップＳ３０８で自然な文であると判定された発話文の各々、又は上記ステップＳ３１２で検索された該当する発話の各々を、発話候補として出力して、発話生成処理ルーチンを終了する。 In step S314, the output unit 13 outputs each utterance sentence determined to be a natural sentence in step S308 or each corresponding utterance searched in step S312 as an utterance candidate, The generation processing routine is terminated.

ステップＳ３１６において、出力部１３によって、発話不可であることを示す情報を出力して、発話生成処理ルーチンを終了する。 In step S316, the output unit 13 outputs information indicating that the utterance is impossible, and the utterance generation processing routine is terminated.

なお、ステップＳ３１４において出力された発話候補は、対話システム等の上位モジュールによって発話に使用される。上位モジュールでは、たとえば、複数発話候補があればその中からランダムに一つを発話する。 Note that the utterance candidate output in step S314 is used for utterance by a higher-level module such as a dialogue system. In the upper module, for example, if there are a plurality of utterance candidates, one of them is uttered at random.
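The overall control flow of steps S300 to S316 (generate first, fall back to retrieval, otherwise report that no utterance is possible) can be summarized as below. The function name, the callable parameters, and the shape of the returned dictionary are assumptions made for this sketch; the two callables stand in for the generation path (S301 to S308) and the retrieval path (S309 to S312).

```python
def generate_candidates(focus, user_utterance, dialogue_act,
                        generate_from_structures, search_utterance_db):
    """Try sentence generation first, then utterance retrieval, then give up."""
    # S301-S308: search, generate, convert, and filter by naturalness.
    candidates = generate_from_structures(focus, dialogue_act)
    if candidates:
        return {"status": "ok", "source": "predicate_argument", "candidates": candidates}
    # S309-S312: fall back to the utterance database.
    candidates = search_utterance_db(user_utterance, dialogue_act)
    if candidates:
        return {"status": "ok", "source": "utterance_db", "candidates": candidates}
    # S316: neither path produced a candidate.
    return {"status": "cannot_utter", "source": None, "candidates": []}
```

A higher-level module could then pick one element of `candidates` at random when the status is `"ok"`. Swapping the two calls would give the variant in which the utterance database is searched first.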

以上説明したように、本実施の形態に係る発話生成装置によれば、述語項構造を複数記憶した述語項構造データベースから、焦点と一致する格要素を含む述語項構造を検索し、述語項構造の各々に基づいて生成された発話文の各々のうちの自然な文を、ユーザの発話に対する発話候補として出力し、発話候補が出力されなかった場合に、ユーザの発話に基づいて、複数の発話と対話行為のペアを記憶した発話データベースから、対話行為が一致し、かつ、ユーザの発話中の単語を含む発話を検索し、発話が検索された場合に当該発話を発話候補として出力することにより、安定して、ユーザ発話に対する自然な発話候補を生成することができる。 As described above, the utterance generation device according to the present embodiment searches a predicate term structure database storing a plurality of predicate term structures for predicate term structures that include a case element matching the focus, and outputs the natural sentences among the utterance sentences generated from those predicate term structures as utterance candidates for the user's utterance. If no utterance candidate is output, the device searches, based on the user's utterance, an utterance database storing a plurality of pairs of utterances and dialogue acts for utterances whose dialogue act matches and which contain words from the user's utterance, and outputs any utterance found as an utterance candidate. Natural utterance candidates for the user's utterance can thereby be generated stably.

また、上位モジュールである対話システムが、対話システム自身の対話行為（発話意図）に沿った自然な発話を行うことができるようになり、システムの情報伝達の正確性が高まる。 In addition, the dialogue system, which is a higher-level module, can perform natural utterance in accordance with the dialogue action (utterance intention) of the dialogue system itself, and the accuracy of information transmission of the system is improved.

また、ユーザとの意思疎通がしやすくなり、システムとユーザとのインタラクションが円滑になる。 In addition, communication with the user is facilitated, and the interaction between the system and the user becomes smooth.

なお、本発明は、上述した実施形態に限定されるものではなく、この発明の要旨を逸脱しない範囲内で様々な変形や応用が可能である。 Note that the present invention is not limited to the above-described embodiment, and various modifications and applications are possible without departing from the gist of the present invention.

例えば、本実施の形態の発話生成装置３００では、述語項構造データベースを優先して検索し、述語項構造データベースの検索結果に基づいて発話候補が生成されなかった場合に、発話データベースを検索する場合について説明したが、これに限定されるものではなく、発話データベースを優先して検索し、発話データベースで発話候補が検索されなかった場合に、述語項構造データベースを検索するように構成してもよい。 For example, in the utterance generation device 300 according to the present embodiment, the predicate term structure database is preferentially searched, and the utterance database is searched when no utterance candidate is generated based on the search result of the predicate term structure database. However, the present invention is not limited to this, and the utterance database may be preferentially searched, and if no utterance candidate is searched in the utterance database, the predicate term structure database may be searched. .

また、上述の発話データベース構築装置１００は、発話候補データベース２０及び発話データベース３を備えている場合について説明したが、例えば発話候補データベース２０及び発話データベース３が発話データベース構築装置１００の外部装置に設けられ、発話データベース構築装置１００は、外部装置と通信手段を用いて通信することにより、発話候補データベース２０及び発話データベース３を参照するようにしてもよい。 Although the utterance database construction device 100 has been described as including the utterance candidate database 20 and the utterance database 3, the utterance candidate database 20 and the utterance database 3 may instead be provided in a device external to the utterance database construction device 100, and the utterance database construction device 100 may refer to them by communicating with the external device using communication means.

Likewise, the predicate argument structure database construction device 200 may refer to the large-scale text database 50 and the predicate argument structure database 6 by communicating, via communication means, with an external device in which those databases are provided.

The utterance generation device 300 may also refer to the predicate argument structure database 8, the conversion rule database 9, the language model database 10, and the utterance database 11 by communicating, via communication means, with an external device in which those databases are provided.

The above embodiment has been described for the case where the utterance database construction device 100, the predicate argument structure database construction device 200, and the utterance generation device 300 are configured as separate devices. However, at least two of the utterance database construction device 100, the predicate argument structure database construction device 200, and the utterance generation device 300 may be configured as a single device.

The utterance database construction device 100, the predicate argument structure database construction device 200, and the utterance generation device 300 described above each contain a computer system. When a WWW system is used, the term "computer system" also includes the homepage-providing environment (or display environment).

Although this specification describes an embodiment in which the program is installed in advance, the program can also be provided stored on a computer-readable recording medium.

Description of Symbols
1 Utterance data input unit
2 Utterance database construction unit
3 Utterance database
4 Text data input unit
5 Predicate argument structure database construction unit
6 Predicate argument structure database
7 Input unit
8 Predicate argument structure database
9 Conversion rule database
10 Language model database
11 Utterance database
12 Computation unit
13 Output unit
20 Utterance candidate database
22 Dialogue act estimation unit
50 Large-scale text database
52 Morphological analysis unit
54 Dependency analysis unit
56 Predicate argument structure extraction unit
100 Utterance database construction device
110 Control unit
112 Predicate argument structure search unit
114 Sentence generation unit
116 Sentence determination unit
118 Utterance search unit
200 Predicate argument structure database construction device
300 Utterance generation device

Claims

An utterance generation device comprising:
an input unit that accepts a word indicating a topic of dialogue and a user's utterance;
a predicate argument structure search unit that, based on the word indicating the topic of dialogue accepted by the input unit, searches a predicate argument structure database, which stores a plurality of predicate argument structures each being a combination of a predicate and a case element corresponding to the predicate, for predicate argument structures that include the case element corresponding to the word indicating the topic of dialogue;
a sentence generation unit that generates an utterance sentence based on each of the predicate argument structures retrieved by the predicate argument structure search unit;
a sentence determination unit that removes, from the utterance sentences generated by the sentence generation unit, utterance sentences that are not natural sentences, based on a predetermined rule for determining whether a sentence is a natural sentence, and outputs each utterance sentence that was not removed as an utterance candidate for the user's utterance accepted by the input unit; and
an utterance search unit that, when no utterance candidate is output by the sentence determination unit, searches an utterance database storing a plurality of utterances for an utterance corresponding to the user's utterance, based on the user's utterance, outputs the retrieved utterance as the utterance candidate when an utterance is found, and outputs information indicating that no utterance can be made when no utterance is found.

The utterance generation device according to claim 1, wherein
the input unit further accepts a dialogue act representing the intention of an utterance, and
the sentence generation unit generates, for each predicate argument structure retrieved by the predicate argument structure search unit, a plain sentence in which the predicate, the case element, and the case of the case element are arranged in an order predetermined for the predicate, the case element, and the case of the case element, and generates the utterance sentence by converting the generated plain sentence into a sentence expressing the dialogue act, based on a conversion rule that is predetermined for the dialogue act accepted by the input unit and that converts a plain sentence into a sentence expressing the dialogue act.

The utterance generation device according to claim 1, wherein the sentence determination unit calculates, for each of the utterance sentences generated by the sentence generation unit, a perplexity value indicating how difficult the sentence is to generate, based on an N-gram language model obtained in advance, removes utterance sentences that are not natural sentences from the utterance sentences based on the rule that a sentence is determined to be a natural sentence when its perplexity value is equal to or less than a threshold, and outputs each utterance sentence that was not removed as the utterance candidate.

An utterance generation method in an utterance generation device including an input unit, a predicate argument structure search unit, a sentence generation unit, a sentence determination unit, and an utterance search unit, the method comprising:
a step in which the input unit accepts a word indicating a topic of dialogue and a user's utterance;
a step in which the predicate argument structure search unit, based on the word indicating the topic of dialogue accepted by the input unit, searches a predicate argument structure database, which stores a plurality of predicate argument structures each being a combination of a predicate and a case element corresponding to the predicate, for the case element corresponding to the word indicating the topic of dialogue, and extracts the predicate argument structures corresponding to the retrieved case element;
a step in which the sentence generation unit generates an utterance sentence based on each of the predicate argument structures extracted by the predicate argument structure search unit;
a step in which the sentence determination unit removes, from the utterance sentences generated by the sentence generation unit, utterance sentences that are not natural sentences, based on a predetermined rule for determining whether a sentence is a natural sentence, and outputs each utterance sentence that was not removed as an utterance candidate for the user's utterance accepted by the input unit; and
a step in which the utterance search unit, when no utterance candidate is output by the sentence determination unit, searches an utterance database storing a plurality of utterances for an utterance corresponding to the user's utterance, based on the user's utterance, outputs the retrieved utterance as the utterance candidate when an utterance is found, and outputs information indicating that no utterance can be made when no utterance is found.

A program for causing a computer to function as each unit of the utterance generation device according to any one of claims 1 to 3.
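The perplexity-based naturalness rule recited in the claims above can be illustrated with a small sketch. The claims specify only an N-gram language model obtained in advance and a threshold; the choice of a bigram model, add-one smoothing, the class and function names, and the threshold value used in any call are assumptions made for illustration, not details taken from this specification.

```python
import math
from collections import Counter
from typing import Iterable, List


class BigramModel:
    """Toy bigram language model with add-one smoothing (an illustrative assumption)."""

    def __init__(self, corpus: Iterable[List[str]]) -> None:
        self.context_counts: Counter = Counter()
        self.bigram_counts: Counter = Counter()
        vocab = set()
        for tokens in corpus:
            padded = ["<s>"] + tokens + ["</s>"]
            vocab.update(padded)
            for prev, cur in zip(padded, padded[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1
        # One extra slot reserves probability mass for unseen words.
        self.vocab_size = len(vocab) + 1

    def prob(self, prev: str, cur: str) -> float:
        # Add-one smoothed estimate of P(cur | prev).
        return (self.bigram_counts[(prev, cur)] + 1) / (
            self.context_counts[prev] + self.vocab_size
        )

    def perplexity(self, tokens: List[str]) -> float:
        # Exponential of the average negative log-probability per predicted token.
        padded = ["<s>"] + tokens + ["</s>"]
        log_sum = sum(
            math.log(self.prob(prev, cur)) for prev, cur in zip(padded, padded[1:])
        )
        return math.exp(-log_sum / (len(padded) - 1))


def filter_natural(
    model: BigramModel, sentences: List[List[str]], threshold: float
) -> List[List[str]]:
    """Keep only sentences whose perplexity is at or below the threshold."""
    return [s for s in sentences if model.perplexity(s) <= threshold]
```

A sentence whose word order resembles the training text receives a lower perplexity than a scrambled one, so the threshold separates the two. In practice the threshold would have to be tuned against the language model actually used.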