JP5728527B2 - Utterance candidate generation device, utterance candidate generation method, and utterance candidate generation program

Info

Publication number: JP5728527B2
Application number: JP2013101382A
Authority: JP
Inventors: 東中 竜一郎; 松尾 義博; 牧野 俊朗; 小林 のぞみ; 平野 徹; 目黒 豊美; 宮崎 千明
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2013-05-13
Filing date: 2013-05-13
Publication date: 2015-06-03
Anticipated expiration: 2033-05-13
Also published as: JP2014222402A

Description

The present invention relates to an utterance candidate generation device, an utterance candidate generation method, and an utterance candidate generation program, and more particularly to such a device, method, and program for generating system utterances in a dialogue system that responds to user utterances.

In general, dialogue systems that converse with a user often achieve human-like responses by manually constructing a large number of utterance pairs, each consisting of a user utterance and a response to it, and using them as the responses of the dialogue system. An example of an utterance pair is the user utterance “Thank you” with the response “You're welcome”. There are systems that generate responses in this way (see, for example, Non-Patent Document 1).

There are also dialogue systems that use a web search engine to extract, from the Internet, sentences related to the current topic of the dialogue with the user, and that respond using one or more of the extracted sentences as utterances (see, for example, Non-Patent Document 2).

Non-Patent Document 1: R. S. Wallace, “The Anatomy of A.L.I.C.E.”, A.L.I.C.E. Artificial Intelligence Foundation, Inc., 2004.

Non-Patent Document 2: Shibata, M., Nishiguchi, T., and Tomiura, Y. (2009). “Dialog system for open-ended conversation using web documents.” Informatica, 33(3), pp. 277-284.

The techniques described in Non-Patent Document 1 and Non-Patent Document 2 each generate responses with a single algorithm. User utterances, however, are varied, and it is difficult for a dialogue system to respond appropriately using a single algorithm alone. For example, the technique of Non-Patent Document 1 can answer accurately when the user utterance is one the developer anticipated, but it cannot respond to user utterances the developer did not anticipate. The technique of Non-Patent Document 2 uses an Internet search engine and can therefore return responses on a wide variety of topics, but search results from the Internet, which contains miscellaneous content, are of low quality for use as utterances, and it is difficult to handle formulaic responses such as answering a greeting with a greeting. Moreover, even if the response methods of Non-Patent Documents 1 and 2 are combined, the system utterances with which each can respond appropriately are limited, so an appropriate response to a user utterance cannot always be given.

The present invention has been made in view of the above problems, and its object is to provide an utterance candidate generation device, an utterance candidate generation method, and an utterance candidate generation program that enable a dialogue system to respond appropriately to user utterances.

To achieve the above object, the utterance candidate generation device of the present invention generates utterance candidates used for responses in a dialogue system that converses with a user by responding to user utterances. The device comprises: a first module that selects, from a plurality of predetermined utterance pairs each consisting of a user utterance and a response to that user utterance, an utterance pair judged to match the input user utterance, and generates utterance candidates using the response of the selected utterance pair; a second module that selects, from utterances that are predetermined for each word representing a topic of a user utterance and that use related words associated with that word, the utterances using the related words associated with the topic word extracted from the input user utterance, and generates utterance candidates using the selected utterances; a third module that searches an utterance database storing a plurality of utterances with a search query using words extracted from the input user utterance, and generates utterance candidates using the retrieved utterances; a fourth module that generates utterance candidates using at least one of a plurality of predetermined utterances judged to be utterable regardless of the topic of the user utterance; and a fifth module that generates utterance candidates using at least one of a plurality of predetermined utterances for changing the topic of the user utterance. When the first module generates no utterance candidate, processing proceeds to the second module; when the second module generates none, processing proceeds to the third module; and when the third module generates none, processing proceeds to the fourth module. The device thereby outputs utterance candidates with priority given, in order, to those generated by the first module, the second module, the third module, the fourth module, and the fifth module.
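The priority order described above can be illustrated with a minimal Python sketch. It assumes each module is a function that returns a possibly empty list of candidates, and that the first non-empty list is output. The passage does not spell out how the fifth module is reached, so treating it as the last fallback is an assumption, and the five stand-in modules and their toy data are hypothetical.

```python
from typing import Callable, List, Sequence

# A module maps a user utterance to a (possibly empty) list of candidates.
Module = Callable[[str], List[str]]


def generate_candidates(user_utterance: str, modules: Sequence[Module]) -> List[str]:
    """Try the modules in priority order and return the first non-empty result."""
    for module in modules:
        candidates = module(user_utterance)
        if candidates:
            return candidates
    return []


# Hypothetical stand-ins for the five modules (toy data, not from the patent).
def pair_module(utterance: str) -> List[str]:
    return {"Thank you": ["You're welcome"]}.get(utterance, [])


def related_module(utterance: str) -> List[str]:
    return ["Tonkotsu is rich, isn't it?"] if "ramen" in utterance else []


def search_module(utterance: str) -> List[str]:
    return []  # pretend the database search found nothing


def versatile_module(utterance: str) -> List[str]:
    return ["I see."]  # usable regardless of topic


def new_topic_module(utterance: str) -> List[str]:
    return ["By the way, do you like movies?"]  # changes the topic


MODULES = [pair_module, related_module, search_module,
           versatile_module, new_topic_module]
```

With this ordering, a candidate that matches the user utterance itself always takes precedence over topic-level or generic candidates.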

Preferably, at least one of the first to fifth modules of the utterance candidate generation device of the present invention generates the utterance candidates based on a dialogue act of the dialogue system, which is determined based on dialogue act estimation information obtained by estimating the dialogue act that represents the intention of the user's utterance.

The utterance candidate generation method of the present invention generates utterance candidates used for responses in a dialogue system that converses with a user by responding to user utterances. The method comprises: a step in which a first module selects, from a plurality of predetermined utterance pairs each consisting of a user utterance and a response to that user utterance, an utterance pair that matches the input user utterance, and generates utterance candidates using the response of the selected utterance pair; a step in which a second module selects, from utterances that are predetermined for each word representing a topic of a user utterance and that use related words associated with that word, the utterances using the related words associated with the topic word extracted from the input user utterance, and generates utterance candidates using the selected utterances; a step in which a third module searches an utterance database storing a plurality of utterances with a search query using words extracted from the input user utterance, and generates utterance candidates using the retrieved utterances; a step in which a fourth module generates utterance candidates using at least one of a plurality of predetermined utterances that can be uttered regardless of the topic of the user utterance; and a step in which a fifth module generates utterance candidates using at least one of a plurality of predetermined utterances for changing the topic of the user utterance. When the first module generates no utterance candidate, processing proceeds to the second module; when the second module generates none, processing proceeds to the third module; and when the third module generates none, processing proceeds to the fourth module. The method thereby outputs utterance candidates with priority given, in order, to those generated by the first module, the second module, the third module, the fourth module, and the fifth module.

The utterance candidate generation program of the present invention causes a computer to function as each module of the utterance candidate generation device of the present invention.

The utterance candidate generation device, utterance candidate generation method, and utterance candidate generation program of the present invention have the effect of enabling a dialogue system to respond appropriately to user utterances.

A block diagram showing an outline of an example of the dialogue system of the present embodiment.
An explanatory diagram showing a table of the relationship between the set of dialogue acts used in the dialogue system of the present embodiment and corresponding example utterances.
An explanatory diagram showing a specific example of the utterance pairs of the present embodiment.
An explanatory diagram showing a specific example of the word-utterance pairs of the present embodiment.
An explanatory diagram showing a specific example of the utterance list obtained from the utterance database by the utterance search unit of the present embodiment using “ramen” as the search query.
An explanatory diagram showing a specific example of the utterance list obtained from the utterance database by the utterance search unit of the present embodiment using “ramen” and the system dialogue act “5: Self-disclosure evaluation +” as the search query.
An explanatory diagram showing a specific example of an utterance list of the present embodiment.
An explanatory diagram showing a specific example of an utterance list of the present embodiment.
A flowchart of an example of the utterance candidate generation processing executed by the utterance candidate generation device of the present embodiment.
A flowchart of an example of the utterance candidate generation processing executed by the utterance candidate generation device of the present embodiment.
An explanatory diagram for describing a specific example of a dialogue between the dialogue system of the present embodiment and a user.

Embodiments of the present invention are described in detail below with reference to the drawings. The present embodiment does not limit the present invention.

The dialogue system 10 of the present embodiment converses with a user, for example in casual chat, by producing a system utterance in response to each user utterance. It has a function of generating utterance candidates 46 based on the user utterance and on a dialogue act of the dialogue system 10 that is determined from the dialogue act estimated from the user utterance, and of responding to the user with an utterance selected from the generated utterance candidates 46.

First, the configuration of the dialogue system 10 of the present embodiment is described. FIG. 1 is a configuration diagram showing an example of the schematic configuration of the dialogue system 10 of the present embodiment. The dialogue system 10 includes an input unit 12, a dialogue control unit 14, an output unit 18, and an utterance candidate generation device 20.

The input unit 12 has a function of outputting the user utterance entered by the user to the utterance candidate generation device 20. The input unit 12 of the present embodiment also outputs the entered user utterance to the dialogue control unit 14. In addition, it has a function of outputting the history of the dialogue with the user to the utterance candidate generation device 20.

The dialogue control unit 14 includes a dialogue act estimator 16. It has a function of estimating the dialogue act of the input user utterance with the dialogue act estimator 16 and determining the utterance intention of the dialogue system according to the estimated dialogue act. Information representing the utterance intention on the dialogue system 10 side is called the dialogue act of the system. In the present embodiment, a “dialogue act” is a symbol that represents the intention of an utterance.

A dialogue act in the present embodiment consists of a pair of a dialogue act ID and a dialogue act name. As a specific example, FIG. 2 shows a table relating the set of dialogue acts used in the present embodiment to example utterances for each. The set of dialogue acts is not limited to the one shown in FIG. 2; for example, the DAMSL tag set commonly used in discourse research may be used.

The dialogue act estimator 16 estimates a dialogue act from the words in an utterance and is built by machine learning. For example, it can be built using a support vector machine, a technique commonly used for document classification. In the present embodiment, dialogue acts were manually assigned to several tens of thousands of utterances, and with this data as training data, a multi-class classifier that estimates the dialogue act of an utterance was trained using a support vector machine.

The dialogue act of the system may be determined using, for example, manually created rules such as those described in the literature (see Toyomi Meguro, Yasuhiro Minami, Ryuichiro Higashinaka, and Koji Dosaka, “Evaluation of a listener-role dialogue control unit using POMDP by a Wizard of Oz experiment”, 26th Annual Conference of the Japanese Society for Artificial Intelligence, Organized Session “OS-18 Intelligent Dialogue Systems”, 2012). Alternatively, the dialogue act of the system may be determined using statistical dialogue control techniques known as MDP (Markov decision process) and POMDP (partially observable Markov decision process), which have become common in recent years.

The dialogue act of the system determined by the dialogue control unit 14 is output to the utterance candidate generation device 20. The dialogue act of the system is not essential for generating utterance candidates in the utterance candidate generation device 20. When the utterance candidate generation device 20 generates utterance candidates without using dialogue acts, the dialogue control unit 14 need not include the dialogue act estimator 16 and need not have the function of outputting the estimated dialogue act to the utterance candidate generation device 20.

The output unit 18 has a function of selecting one or more utterances from the utterance candidates 46 output by the utterance candidate generation device 20 and presenting them to the user as the system utterance in response.

The utterance candidate generation device 20 of the present embodiment has a function of using a plurality of modules to generate utterance candidates 46 for responding to the user utterance input from the input unit 12, and of outputting them to the output unit 18. The utterance candidate generation device 20 includes five modules: the utterance pair selection unit 22 corresponds to the first module, the related utterance selection unit 26 to the second module, the utterance search unit 32 to the third module, the versatile utterance selection unit 38 to the fourth module, and the new topic utterance selection unit 42 to the fifth module. In the following, the utterance pair selection unit 22, the related utterance selection unit 26, the utterance search unit 32, the versatile utterance selection unit 38, and the new topic utterance selection unit 42 are collectively referred to as “modules”.

The utterance candidate generation device 20 of the present embodiment also includes, as databases used by the modules, utterance pairs 24, word-utterance pairs 30, an utterance database 36, an utterance list 40, and an utterance list 44.

These processing units in the utterance candidate generation device 20, as well as the input unit 12, the dialogue control unit 14, and the output unit 18 described above, are implemented by a computer equipped with a CPU (Central Processing Unit), RAM (Random Access Memory), ROM (Read Only Memory), and the like. The CPU executes a program stored in the ROM, thereby carrying out the processing of each processing unit described in detail below.

The utterance pair selection unit 22 has a function of generating utterance candidates by selecting, from the utterance pairs 24 provided in advance, responses to the user utterance itself.

The utterance pairs 24 store one or more pairs (utterance pairs), each consisting of a user utterance and a response to that user utterance. The data of the utterance pairs 24 is created in advance, for example by hand. FIG. 3 shows a specific example of the utterance pairs 24 of the present embodiment. In the present embodiment, the dialogue act of the response of each utterance pair is attached to the pair and stored in the utterance pairs 24. The dialogue act of the response may be one estimated with a dialogue act estimator similar to the dialogue act estimator 16 described above. In the case of “0: greeting” shown in FIG. 3, “0” is the dialogue act ID and “greeting” is the dialogue act name. Because the present embodiment also uses the dialogue act of the system to generate the utterance candidates 46, as described later, the dialogue act of each utterance pair is stored in the utterance pairs 24 in this way; when the utterance candidates 46 are generated without using dialogue acts, the dialogue acts need not be stored.

The utterance pair selection unit 22 extracts from the utterance pairs 24 the response of each utterance pair whose user utterance matches the input user utterance, and generates an utterance list consisting of the extracted responses as the utterance candidates 46. The number of responses extracted by the utterance pair selection unit 22 is not particularly limited. When the dialogue act of the system is input in addition to the user utterance, the utterance pair selection unit 22 of the present embodiment extracts from the utterance pairs 24 the response of each utterance pair that matches the input user utterance and whose dialogue act matches the input dialogue act, and generates an utterance list consisting of the extracted responses as the utterance candidates 46. Using the dialogue act makes it possible to generate utterance candidates 46 that are in line with the utterance intention of the dialogue system 10.
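The selection just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: matching is taken to be exact string equality, the `UtterancePair` type and the `select_responses` function are hypothetical names, and the sample pairs and the act label “1:question” are toy data rather than entries from FIG. 3.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class UtterancePair:
    user_utterance: str
    response: str
    dialogue_act: str  # dialogue act of the response, e.g. "0:greeting"


# Toy data for illustration only.
PAIRS = [
    UtterancePair("Thank you", "You're welcome", "0:greeting"),
    UtterancePair("Hello", "Hello", "0:greeting"),
    UtterancePair("Hello", "How are you today?", "1:question"),
]


def select_responses(user_utterance: str,
                     pairs: Sequence[UtterancePair],
                     system_act: Optional[str] = None) -> List[str]:
    """Return the responses of all pairs that match the user utterance.

    When system_act is given, only pairs whose dialogue act equals it are kept.
    """
    return [
        pair.response
        for pair in pairs
        if pair.user_utterance == user_utterance
        and (system_act is None or pair.dialogue_act == system_act)
    ]
```

Calling `select_responses("Hello", PAIRS)` returns both stored responses, while passing `system_act="0:greeting"` narrows the list to the greeting response only.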

Because the utterance candidates 46 generated by the utterance pair selection unit 22 are utterances that correspond to the user utterance itself, they can be expected to be appropriate responses to the user utterance.

The related utterance selection unit 26 has a function of generating utterance candidates 46 by extracting, from the input user utterance, a word indicating the current topic of the dialogue with the user and selecting utterances corresponding to the extracted word from the word-utterance pairs 30. In the present embodiment, the “input user utterance” is not limited to the utterance entered immediately before; a plurality (for example, a predetermined number) of user utterances entered so far may be used.

The word-utterance pairs 30 store one or more word-utterance entries, each combining a word indicating a topic with an utterance that uses a related word associated with that word. The word-utterance data is created in advance, for example by hand. FIG. 4 shows a specific example of the word-utterance pairs 30 of the present embodiment. In the present embodiment, the dialogue act of the utterance in each entry is attached to the entry and stored in the word-utterance pairs 30. The dialogue act of the utterance may be one estimated with a dialogue act estimator similar to the dialogue act estimator 16 described above.

The related utterance selection unit 26 performs morphological analysis on the input user utterance with the morphological analyzer 28 and, based on the result, extracts a word indicating the topic of the dialogue with the user. The word indicating the topic may be, for example, a noun contained in the user utterance. As a specific example, when the user utterance “I like ramen” is input, morphological analysis of the utterance by the morphological analyzer 28 identifies “ramen” as a noun, and the related utterance selection unit 26 extracts “ramen” as the word indicating the topic. The related utterance selection unit 26 then extracts from the word-utterance pairs 30 each utterance associated with a word that matches the extracted topic word, and generates an utterance list consisting of the extracted utterances as the utterance candidates 46. In the present embodiment, as a specific example, an utterance list consisting of the utterances shown in FIG. 4 is generated as the utterance candidates 46. The number of utterances extracted by the related utterance selection unit 26 is not particularly limited. When the dialogue act of the system is input in addition to the user utterance, the related utterance selection unit 26 of the present embodiment extracts from the word-utterance pairs 30 the utterance of each entry whose word matches the extracted topic word and whose dialogue act matches the input dialogue act, and generates an utterance list consisting of the extracted utterances as the utterance candidates 46. Using the dialogue act makes it possible to generate utterance candidates 46 that are in line with the utterance intention of the dialogue system 10.
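As a rough illustration of this lookup, the Python sketch below replaces the morphological analyzer with a hypothetical noun lexicon (`KNOWN_NOUNS`) and whitespace tokenization of English text; a real system would rely on part-of-speech output from a morphological analyzer. The table contents and the act label “2:question” are invented for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for the morphological analyzer: any token found in
# this small lexicon is treated as a noun, and therefore as a topic word.
KNOWN_NOUNS = {"ramen", "movie"}


def extract_topic_words(utterance: str) -> List[str]:
    tokens = [token.strip(".,!?").lower() for token in utterance.split()]
    return [token for token in tokens if token in KNOWN_NOUNS]


# topic word -> list of (utterance using a related word, dialogue act); toy data.
WORD_UTTERANCES: Dict[str, List[Tuple[str, str]]] = {
    "ramen": [
        ("Tonkotsu broth is rich, isn't it?", "5:self-disclosure evaluation +"),
        ("Which noodle shop do you go to?", "2:question"),
    ],
}


def select_related(utterance: str,
                   table: Dict[str, List[Tuple[str, str]]],
                   system_act: Optional[str] = None) -> List[str]:
    """Collect utterances registered for each topic word in the user utterance."""
    selected = []
    for word in extract_topic_words(utterance):
        for text, act in table.get(word, []):
            if system_act is None or act == system_act:
                selected.append(text)
    return selected
```

For the input “I like ramen.”, both registered utterances are returned; supplying a system dialogue act keeps only the entries carrying that act.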

The utterance candidates 46 generated by the related utterance selection unit 26 correspond not to the user utterance itself but to the topic of the dialogue with the user. They therefore correspond less directly to the user utterance than those of the utterance pair selection unit 22 (first module), but they make it possible to respond in keeping with the topic of the dialogue.

The utterance search unit 32 has a function of generating utterance candidates 46 by extracting words from the input user utterance and searching the utterance database 36 for utterances that contain the extracted words.

As a specific example, the utterance database 36 of the present embodiment was built by independently crawling data from the Twitter (registered trademark) site, splitting the text into sentences at sentence-ending punctuation and symbols, and registering each sentence (utterance) as one record; however, the database is not limited to this. The utterance database 36 holds an inverted index so that full-text search is possible. The method of building the utterance database 36 is not particularly limited; for example, the usual procedure for building a full-text search engine database may be followed, using free software such as Lucene or Namazu. In the present embodiment, as a specific example, an utterance database 36 built with Lucene is used. In the present embodiment, the dialogue act of each utterance is attached to the utterance and stored in the utterance database 36. The dialogue act of the utterance may be one estimated with a dialogue act estimator similar to the dialogue act estimator 16 described above.
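To make the role of the inverted index concrete, here is a toy pure-Python sketch. It is not how Lucene stores its index; it only shows the idea of mapping each word to the records (one sentence per record) that contain it. Whitespace tokenization of English text and the function names are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_inverted_index(utterances: List[str]) -> Dict[str, Set[int]]:
    """Map each lower-cased token to the set of record ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for record_id, text in enumerate(utterances):
        for raw_token in text.lower().split():
            token = raw_token.strip(".,!?")
            if token:  # skip tokens that were punctuation only
                index[token].add(record_id)
    return dict(index)


def lookup(index: Dict[str, Set[int]], utterances: List[str], word: str) -> List[str]:
    """Return the utterances containing the word, in record order."""
    return [utterances[i] for i in sorted(index.get(word.lower(), set()))]


# One sentence per record, as in the description above (toy data).
UTTERANCES = ["I had ramen for lunch.", "Ramen is great!", "Nice weather today."]
INDEX = build_inverted_index(UTTERANCES)
```

A lookup for “ramen” then touches only the two records listed under that key instead of scanning every stored sentence.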

The utterance search unit 32 performs morphological analysis on the input user utterance using the morphological analyzer 34, extracts words from the user utterance based on the analysis result, constructs a search query from the extracted words, and searches the utterance database 36. As a specific example, when the user utterance "I like ramen" is input, morphological analysis by the morphological analyzer 34 yields the words contained in the utterance. From this utterance, "ramen" and "like" can be extracted as content words (nouns, verbs, and adjectives). A search query may be built from both words, or from only one of "ramen" and "like". The utterance database 36 is searched with the resulting query, the top N utterances in the search results are obtained as an utterance list, and utterance candidates 46 are generated from it. The number of utterances obtained by the utterance search unit 32 is not particularly limited. In addition, when a system dialogue act is input together with the user utterance, the utterance search unit 32 of the present embodiment includes the system dialogue act in the search query, so that only utterances whose dialogue act matches the system dialogue act are returned as search results. As a specific example, FIG. 5 shows an utterance list obtained from the utterance database 36 with "ramen" as the search query. FIG. 6 shows, as a specific example, an utterance list obtained from the utterance database 36 with "ramen" and the system dialogue act "5: self-disclosure evaluation +" as the search query.
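The search described above can be sketched with a toy in-memory inverted index. This is only an illustration under stated assumptions: the embodiment uses Lucene, whereas the class `UtteranceIndex`, the function `content_words`, the part-of-speech labels, and the match-count ranking below are hypothetical stand-ins. Tokens are assumed to be already produced by a morphological analyzer.

```python
from collections import defaultdict

# Assumed part-of-speech labels for content words (nouns, verbs, adjectives).
CONTENT_POS = {"noun", "verb", "adjective"}


def content_words(tagged_tokens):
    """Keep only content words from (surface, part_of_speech) pairs."""
    return [word for word, pos in tagged_tokens if pos in CONTENT_POS]


class UtteranceIndex:
    """Toy inverted index over (text, tokens, dialogue_act) records."""

    def __init__(self, records):
        self.records = records
        self.postings = defaultdict(set)
        for doc_id, (_, tokens, _) in enumerate(records):
            for token in tokens:
                self.postings[token].add(doc_id)

    def search(self, words, dialogue_act=None, top_n=10):
        """Return up to top_n utterances, ranked by number of matched words.

        If dialogue_act is given, only utterances with that act are returned.
        """
        scores = defaultdict(int)
        for word in words:
            for doc_id in self.postings.get(word, ()):
                scores[doc_id] += 1
        hits = [
            d for d in scores
            if dialogue_act is None or self.records[d][2] == dialogue_act
        ]
        hits.sort(key=lambda d: (-scores[d], d))
        return [self.records[d][0] for d in hits[:top_n]]
```

Passing no dialogue act corresponds to the FIG. 5 style of query, and passing one corresponds to the FIG. 6 style; a production index would use a proper relevance score rather than a raw match count.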

Compared with the utterance candidates 46 generated by the related utterance selection unit 26 (second module), the utterance candidates 46 generated by the utterance search unit 32 cannot always provide a response that corresponds directly to the topic of the dialogue with the user. However, when appropriate search results are obtained, a response in line with the topic can be realized.

The versatile utterance selection unit 38 has a function of generating utterance candidates 46 by selecting utterances from the utterance list 40. In general, "versatile" means diverse, in the sense of carrying various meanings. Accordingly, in the present embodiment, an utterance that can be interpreted in any sense and is unlikely to cause a serious contextual problem whenever it is uttered is called a "versatile utterance".

The utterance list 40 is a list of utterances that, regardless of the current topic of the dialogue with the user, do not greatly affect the content of the dialogue whenever they are uttered. By storing versatile utterances in the utterance list 40, the dialogue system 10 can use them to keep the conversation going when it is at a loss for what to say. The utterance list 40 stores, for example, backchannels such as "yes" and "uh-huh", and utterances that express empathy without specific content, such as "That's amazing" and "That sounds nice". The utterance list 40 is created in advance, for example by hand. FIG. 7 shows a specific example of the utterance list 40 of the present embodiment. In the present embodiment, each utterance is stored in the utterance list 40 together with its dialogue act. The dialogue act of an utterance may be one estimated by a dialogue act estimator similar to the dialogue act estimator 16 described above.

The versatile utterance selection unit 38 generates utterance candidates 46 by selecting utterances from the utterance list 40. The versatile utterance selection unit 38 may use the entire utterance list 40 as the utterance candidates 46, or may select one or more utterances from the utterance list 40 as appropriate. In the present embodiment, each utterance is stored in the utterance list 40 together with its dialogue act. When a system dialogue act is input, the versatile utterance selection unit 38 of the present embodiment selects from the utterance list 40 the utterances whose dialogue act matches the system dialogue act, and generates the utterance candidates 46 from them.
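Selection from a pre-built utterance list can be sketched as below. The function name `select_from_list` and the dialogue act labels in the usage example are assumptions for illustration; the list is modelled as (utterance, dialogue act) pairs.

```python
def select_from_list(utterance_list, system_act=None):
    """Return candidate utterances from a list of (text, dialogue_act) pairs.

    With no system dialogue act, the whole list is returned; otherwise only
    utterances whose dialogue act matches the system dialogue act.
    """
    if system_act is None:
        return [text for text, _ in utterance_list]
    return [text for text, act in utterance_list if act == system_act]
```

For example, with a hypothetical list `[("Yes", "backchannel"), ("That sounds nice", "empathy")]`, calling `select_from_list(lst, "empathy")` keeps only the second utterance. The same routine would apply to any list that stores a dialogue act alongside each utterance.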

Unlike the candidates of the utterance pair selection unit 22 (first module), the related utterance selection unit 26 (second module), and the utterance search unit 32 (third module), the utterance candidates 46 generated by the versatile utterance selection unit 38 cannot provide a response tailored to the user utterance or to the current topic of the dialogue with the user. They can, however, provide a response that allows the current topic to continue.

The new topic utterance selection unit 42 has a function of generating utterance candidates 46 by selecting utterances from the utterance list 44.

The utterance list 44 is a list of utterances that guide the user to a new topic in order to change the current topic of the dialogue with the user. The utterance list 44 is created in advance, for example by hand. FIG. 8 shows a specific example of the utterance list 44 of the present embodiment. In the present embodiment, each utterance is stored in the utterance list 44 together with its dialogue act. The dialogue act of an utterance may be one estimated by a dialogue act estimator similar to the dialogue act estimator 16 described above.

The new topic utterance selection unit 42 generates utterance candidates 46 by selecting utterances from the utterance list 44. The new topic utterance selection unit 42 may use the entire utterance list 44 as the utterance candidates 46, or may select one or more utterances from the utterance list 44 as appropriate. When a system dialogue act is input, the new topic utterance selection unit 42 of the present embodiment selects from the utterance list 44 the utterances whose dialogue act matches the input system dialogue act, and generates the utterance candidates 46 from them.

Unlike the candidates of the utterance pair selection unit 22 (first module), the related utterance selection unit 26 (second module), the utterance search unit 32 (third module), and the versatile utterance selection unit 38 (fourth module), the utterance candidates 46 generated by the new topic utterance selection unit 42 guide the user to a new topic. They thereby provide a response that can break out of a situation in which no response fitting the user utterance or the current topic of the dialogue is possible.

Next, the flow of operation of the utterance candidate generation device 20 of the present embodiment will be described. The utterance candidate generation device 20 of the present embodiment uses a plurality of modules (first to fifth) selectively so that it always gives an appropriate, high-quality response to the user utterance. The device generates utterance candidates 46 sequentially, in order from the first module (utterance pair selection unit 22) to the fifth module (new topic utterance selection unit 42). In this order, candidates are produced first by the modules that can respond more appropriately to the user utterance, which makes a more accurate response to the user utterance possible. The order in which the modules generate utterance candidates 46 is not limited to that of the present embodiment, but for the reason given above it is preferable to proceed from the lowest-numbered module to the highest-numbered module, as in the present embodiment.

FIGS. 9 and 10 show a flowchart of an example of the flow of utterance candidate generation processing in the utterance candidate generation device 20. This processing is executed when a user utterance and a dialogue history are input from the input unit 12 to the utterance candidate generation device 20. Input of the dialogue history is not essential. However, when a module cannot generate utterance candidates 46, for example because no word indicating a topic could be extracted from the single input user utterance, the module can instead generate utterance candidates 46 from user utterances made earlier than the input one. It is therefore preferable to input the dialogue history, as in the present embodiment.

First, the utterance pair selection unit 22 generates utterance candidates 46. To this end, in step S100, the user utterance is input to the utterance pair selection unit 22. In the next step S102, the utterance pair selection unit 22 determines whether an utterance pair containing a user utterance that matches the input user utterance is stored in the utterance pairs 24. In the present embodiment, the utterance pair selection unit 22 checks not only for exact matches but also for utterance pairs whose user utterance is regarded as matching. Specifically, for example, it uses the edit distance or the similarity between word sets and decides whether the utterances match according to whether a preset threshold is exceeded. It is preferable for the utterance pair selection unit 22 to consider, in this way, not only utterance pairs whose user utterance exactly matches the input user utterance but also those whose user utterance is regarded as matching.
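The approximate matching in step S102 can be sketched as follows. The source names edit distance and word-set similarity but gives no formulas or threshold, so the normalisation in `edit_similarity`, the Jaccard coefficient in `word_set_similarity`, the threshold value 0.8, and the use of `>=` are all assumptions made for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def edit_similarity(a, b):
    """Map edit distance to a similarity in [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def word_set_similarity(words_a, words_b):
    """Jaccard coefficient between two word sets."""
    set_a, set_b = set(words_a), set(words_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def find_matching_pairs(user_utterance, utterance_pairs, threshold=0.8):
    """Return (user_utterance, response) pairs regarded as matching the input."""
    return [
        (stored, response)
        for stored, response in utterance_pairs
        if edit_similarity(user_utterance, stored) >= threshold
    ]
```

An empty result corresponds to the case where no candidate can be generated and processing moves to the next module.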

If no utterance pair containing a user utterance that matches the input user utterance is stored in the utterance pairs 24, no utterance candidates 46 can be generated, so the process proceeds to step S112 in order for the next module (the related utterance selection unit 26 in the present embodiment) to generate utterance candidates 46. On the other hand, if such an utterance pair is stored in the utterance pairs 24, the process proceeds to step S104.

In step S104, it is determined whether a dialogue act has been input from the dialogue control unit 14. If there is a dialogue act, the process proceeds to step S106, where utterance candidates 46 are generated from the utterance pairs that match both the input user utterance and the dialogue act, and then proceeds to step S110. Specifically, utterance pairs that contain the input user utterance and to which the input dialogue act is assigned are extracted from the utterance pairs 24, and the responses of the extracted pairs are generated as the utterance candidates 46. If there is no dialogue act, the process proceeds from step S104 to step S108. In step S108, utterance candidates 46 are generated from the utterance pairs that match the input user utterance, and the process then proceeds to step S110. Specifically, utterance pairs containing a user utterance that matches the input user utterance are extracted from the utterance pairs 24, and the responses of the extracted pairs are generated as the utterance candidates 46.

In step S110, it is determined whether the next module is to generate utterance candidates 46. In the present embodiment, as described above, utterance candidates 46 are generated sequentially from the first module (utterance pair selection unit 22) to the fifth module (new topic utterance selection unit 42), but not every module has to generate utterance candidates 46. For example, one or more modules may be skipped, or the later modules may be prevented from generating utterance candidates 46 once utterance candidates 46 containing at least a predetermined number of responses or utterances have been generated. In these cases, settings for the generation of utterance candidates 46 are stored in the utterance candidate generation device 20 in advance, and whether each module generates utterance candidates 46 is determined according to those settings.

In the present embodiment, when the utterance pair selection unit 22 has generated as utterance candidates 46 only responses whose dialogue act is "backchannel", it is determined that the next module is to generate utterance candidates 46.

In step S110, if, according to the settings or the like, the next module is not to generate utterance candidates 46, the process proceeds to step S162. If the next module is to generate utterance candidates 46, the process proceeds to step S112. In the utterance candidate generation device 20 of the present embodiment, the later modules generate utterance candidates 46 one after another, so the module with the next number generates them; however, when the device is set to skip modules as described above, the module designated by the setting generates the utterance candidates 46.
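The module-by-module control flow, including the "generate in the next module?" decision, can be sketched as follows. This is a simplified illustration: each module is modelled as a hypothetical callable returning (utterance, dialogue act) pairs, `skip` and `max_candidates` stand in for the pre-stored settings, and the backchannel rule is applied to all candidates gathered so far rather than only to the first module's output.

```python
def generate_candidates(modules, user_utterance, system_act=None,
                        skip=frozenset(), max_candidates=None):
    """Run candidate-generation modules in order and collect their output.

    modules        : callables taking (user_utterance, system_act) and
                     returning a list of (text, dialogue_act) pairs
    skip           : 1-based module numbers to skip (assumed setting)
    max_candidates : stop once this many candidates exist (assumed setting)
    """
    candidates = []
    for number, module in enumerate(modules, start=1):
        if number in skip:
            continue
        candidates.extend(module(user_utterance, system_act))
        # If only backchannels were produced so far, keep going so that a
        # response with actual content becomes available.
        only_backchannels = bool(candidates) and all(
            act == "backchannel" for _, act in candidates
        )
        enough = max_candidates is not None and len(candidates) >= max_candidates
        if enough and not only_backchannels:
            break
    return candidates
```

A module that cannot produce anything simply returns an empty list, which has the same effect as falling through to the next module in the flowchart.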

Next, the related utterance selection unit 26 generates utterance candidates 46. To this end, in step S112, the user utterance is input to the related utterance selection unit 26. In the next step S114, the related utterance selection unit 26 performs morphological analysis of the user utterance and extracts a word indicating the topic from the input user utterance.

In the next step S116, it is determined whether the word-utterance pairs 30 contain an utterance associated with a word that matches the extracted word. If they do not, the process proceeds to step S126. If they do, the process proceeds to step S118.

In step S118, it is determined whether a dialogue act has been input from the dialogue control unit 14. If there is a dialogue act, the process proceeds to step S120, where utterance candidates 46 are generated from the utterances that match both the extracted word and the input dialogue act, and then proceeds to step S124. Specifically, utterances that are associated with a word matching the extracted word and to which a dialogue act matching the input dialogue act is assigned are extracted from the word-utterance pairs 30, and the extracted utterances are generated as the utterance candidates 46. If there is no dialogue act, the process proceeds from step S118 to step S122. In step S122, utterance candidates 46 are generated from the utterances associated with a word that matches the extracted word, and the process then proceeds to step S124. Specifically, utterances associated with a word matching the extracted word are extracted from the word-utterance pairs 30, and the utterance candidates 46 are generated from the extracted utterances.
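Steps S116 to S122 amount to a keyed lookup with an optional dialogue act filter, which might look like the sketch below. The dictionary layout of `word_utterance_pairs` and the function name `select_related` are assumptions for illustration, not the stored format of the word-utterance pairs 30.

```python
def select_related(topic_words, word_utterance_pairs, system_act=None):
    """Collect utterances associated with the extracted topic words.

    word_utterance_pairs maps a word to a list of (text, dialogue_act) pairs.
    An empty result means this module could not generate any candidate.
    """
    selected = []
    for word in topic_words:
        for text, act in word_utterance_pairs.get(word, []):
            if system_act is None or act == system_act:
                selected.append(text)
    return selected
```

For instance, with a hypothetical entry `{"Italy": [("Are you going to Italy?", "question")]}`, the topic word "Italy" yields that utterance, while an unknown word yields an empty list and control passes to the next module.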

In the next step S124, as in step S110 described above, it is determined whether the next module is to generate utterance candidates 46. If not, the process proceeds to step S162. If so, the process proceeds to step S126.

Next, the utterance search unit 32 generates utterance candidates 46. To this end, in step S126, the user utterance is input to the utterance search unit 32. In the next step S128, the utterance search unit 32 performs morphological analysis of the user utterance and extracts words from it.

In the next step S130, it is determined whether a dialogue act has been input from the dialogue control unit 14. If there is a dialogue act, the process proceeds to step S132, where a search query is constructed from the extracted words and the input dialogue act. If no dialogue act has been input from the dialogue control unit 14, the process proceeds to step S134, where a search query is constructed from the extracted words. In the following step S136, the utterance database 36 is searched with the search query and the search results are obtained.

In the next step S138, it is determined whether the search results contain any utterances. If they contain none, that is, if there are zero search results, the process proceeds to step S144. If they contain utterances, that is, if the number of search results is not zero, the process proceeds to step S140. In step S140, utterance candidates 46 are generated from the utterance list made up of the utterances in the search results, and the process then proceeds to step S142.

In step S142, as in step S110 described above, it is determined whether the next module is to generate utterance candidates 46. If not, the process proceeds to step S162. If so, the process proceeds to step S144.

Next, the versatile utterance selection unit 38 generates utterance candidates 46. To this end, in step S144, the user utterance is input to the versatile utterance selection unit 38. In the next step S146, it is determined whether a dialogue act has been input from the dialogue control unit 14. If there is a dialogue act, the process proceeds to step S148, where utterance candidates 46 are generated from the utterance list 40 and the input dialogue act. Specifically, utterances to which a dialogue act matching the input dialogue act is assigned are selected from the utterance list 40 to generate the utterance candidates 46. If no dialogue act has been input from the dialogue control unit 14, the process proceeds to step S150, where utterance candidates 46 are generated from the utterance list 40.

In the next step S152, as in step S110 described above, it is determined whether the next module is to generate utterance candidates 46. If not, the process proceeds to step S162. If so, the process proceeds to step S154.

Next, the new topic utterance selection unit 42 generates utterance candidates 46. To this end, in step S154, the user utterance is input to the new topic utterance selection unit 42. In the next step S156, it is determined whether a dialogue act has been input from the dialogue control unit 14. If there is a dialogue act, the process proceeds to step S158, where utterance candidates 46 are generated from the utterance list 44 and the input dialogue act. Specifically, utterances to which a dialogue act matching the input dialogue act is assigned are selected from the utterance list 44 to generate the utterance candidates 46. If no dialogue act has been input from the dialogue control unit 14, the process proceeds to step S160, where utterance candidates 46 are generated from the utterance list 44.

In the next step S162, the utterance candidates 46 generated by the modules are output to the output unit 18, and the processing ends. In the present embodiment, the utterance candidates 46 generated by the modules are output to the output unit 18 together in this way, but this is not a limitation; the utterance candidates 46 generated by each module may instead be output to the output unit 18 one after another.

The output unit 18 selects an utterance from the utterance candidates 46 generated and output by the utterance candidate generation device 20, and responds to the user with it.

In the dialogue system 10 of the present embodiment, to avoid replying with a contentless utterance (response) alone, a reply is never made using only an utterance (response) whose dialogue act, such as "backchannel", is regarded as having insufficient content for a dialogue. Specifically, when the output unit 18 of the dialogue system 10 selects from the utterance candidates 46 an utterance (response) whose dialogue act is "backchannel" and responds to the user with it, it then selects another utterance from the utterance candidates 46 and responds to the user with that as well. The method by which the dialogue system 10 avoids replying only with an utterance whose dialogue act is "backchannel" or the like is not particularly limited. For example, the utterance candidates 46 may be generated with a flag attached to such utterances (responses); when the output unit 18 selects a flagged utterance from the utterance candidates 46 and responds to the user with it, it then selects another utterance from the utterance candidates 46 and responds to the user with that.
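The follow-up rule for backchannels can be sketched as below. The function name `respond`, the `choose` callback (a stand-in for whatever selection policy the output unit uses), and the decision to pick the follow-up only from non-backchannel candidates are assumptions; the source only says that "another utterance" is selected next.

```python
def respond(candidates, choose=lambda options: options[0]):
    """Build the system reply from (text, dialogue_act) candidates.

    If the chosen utterance is a backchannel, a second utterance with actual
    content is appended so that the reply is never a backchannel alone
    (when such a candidate exists).
    """
    if not candidates:
        return []
    first_text, first_act = choose(candidates)
    reply = [first_text]
    if first_act == "backchannel":
        with_content = [c for c in candidates if c[1] != "backchannel"]
        if with_content:
            reply.append(choose(with_content)[0])
    return reply
```

Here the dialogue act label itself plays the role of the flag mentioned above; a separate boolean flag on each candidate would work the same way.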

The dialogue system 10 carries out dialogues such as the example shown in FIG. 11.

FIG. 11 shows a specific example of a dialogue that can be realized between the user and the dialogue system 10. In the dialogue example shown in FIG. 11, the dialogue system 10 ("system") first presents the system prompt (initial utterance) "Hello" to the user.

In response to this prompt, when the user ("you") makes the user utterance "Thanks", the user utterance is input from the input unit 12 to the utterance candidate generation device 20. In response to the user utterance, the output unit 18 utters "Not at all" to the user, selected from the utterance candidates 46 that the utterance pair selection unit 22 generated from the responses selected from the utterance pairs 24.

When the user then responds with the user utterance "I want to go to Italy", that user utterance is input from the input unit 12 to the utterance candidate generation device 20. Each time the user speaks in this way, the user utterance is input from the input unit 12 to the utterance candidate generation device 20, and each module generates utterance candidates 46 by referring to the input user utterance and the dialogue history as described above. The output unit 18 selects an utterance from the generated utterance candidates 46 and responds to the user.

In the dialogue example shown in FIG. 11, in response to the user utterance "I want to go to Italy", the output unit 18 responds with "Are you going to Italy?", selected from the utterance candidates 46 generated by the related utterance selection unit 26 (second module). In response to the user utterance "I want to go", the output unit 18 responds with "I want to go", selected from the utterance candidates 46 generated by the utterance pair selection unit 22 (first module). In response to the user utterance "Right", the output unit 18 responds with "Yeah", selected from the utterance candidates 46 generated by the utterance pair selection unit 22 (first module). In this case, because the dialogue act "backchannel" is assigned to "Yeah", the output unit 18 continues by selecting from among the utterance candidates 46 generated by the related utterance selection unit 26 (second module) and the utterance search unit 32 (third module), and responds with "Italian wine is nice", an utterance candidate 46 generated by the utterance search unit 32. In response to the user utterance "It's delicious, isn't it", the output unit 18 responds with "Yeah", selected from the utterance candidates 46 generated by the utterance pair selection unit 22 (first module). Here too, because the dialogue act "backchannel" is assigned to "Yeah", the output unit 18 continues by selecting from among the utterance candidates 46 generated by the related utterance selection unit 26 (second module) and the utterance search unit 32 (third module), and responds with "Italian food is delicious", an utterance candidate 46 generated by the utterance search unit 32. In response to the user utterance "Pasta too", the output unit 18 selects from among the utterance candidates 46 generated by the related utterance selection unit 26 (second module) and the utterance search unit 32 (third module), and responds with "I suppose the typical Italian item is pasta after all?", an utterance candidate 46 generated by the utterance search unit 32. In response to the user utterance "I think so", the output unit 18 responds with "I see", selected from the utterance candidates 46 generated by the versatile utterance selection unit 38 (fourth module). Finally, in response to the user utterance "What do you mean, 'I see'?", the output unit 18 responds with "By the way, what did you eat today?", selected from the utterance candidates 46 generated by the new topic utterance selection unit 42 (fifth module).
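The backchannel behaviour in the dialogue example of FIG. 11 can be illustrated with a minimal Python sketch. It assumes that each candidate carries its text and a dialogue act label, and that the output unit simply takes the first candidate in each list; the function name `respond`, the label strings, and this selection policy are hypothetical and are not specified in the description.

```python
from typing import List, Tuple

# (utterance text, dialogue act label); the label strings are hypothetical.
Candidate = Tuple[str, str]


def respond(primary: List[Candidate], follow_up: List[Candidate]) -> List[str]:
    """Return the utterances the output unit would speak for one user turn.

    primary:   candidates from which the first response is selected
    follow_up: candidates from the second and third modules, used only
               when the first response is a backchannel
    """
    if not primary:
        return []
    text, act = primary[0]  # assumption: pick the first candidate
    responses = [text]
    # A backchannel alone carries little content, so a second utterance
    # is appended when one is available.
    if act == "backchannel" and follow_up:
        responses.append(follow_up[0][0])
    return responses


turn = respond([("Yeah", "backchannel")], [("Italian wine is nice", "opinion")])
print(turn)
```

With these inputs the output unit speaks "Yeah" and then "Italian wine is nice", while a non-backchannel first response is returned on its own.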

In FIG. 11 above, each time the user speaks, the user utterance is input from the input unit 12 to the utterance candidate generation device 20, each module generates utterance candidates 46, and the output unit 18 responds to the user with an utterance selected from the generated utterance candidates 46. Alternatively, the output unit 18 may repeatedly select utterances from the utterance candidates 46 generated by the utterance candidate generation device 20 and respond to the user until a predetermined timing in the dialogue (for example, a timing at which the topic can be regarded as having changed).

As described above, in the dialogue system 10 of the present embodiment, a user utterance made during a dialogue with the user is input from the input unit 12 to the utterance candidate generation device 20. The dialogue control unit 14 estimates the dialogue act of the user utterance with the dialogue act estimator 16, determines the dialogue act of the dialogue system 10 according to the estimation result, and outputs it to the utterance candidate generation device 20. In the utterance candidate generation device 20, the utterance pair selection unit 22, the related utterance selection unit 26, the utterance search unit 32, the versatile utterance selection unit 38, and the new topic utterance selection unit 42 generate, in that order, utterance candidates 46 according to the input user utterance and dialogue act. The utterance pair selection unit 22 generates utterance candidates 46 from a response selected from the utterance pairs 24 according to the input user utterance. The related utterance selection unit 26 performs morphological analysis on the input user utterance with the morphological analyzer 28, extracts a word indicating the topic, acquires from the word-utterance pairs 30 an utterance that corresponds to the extracted word and to which the input dialogue act is assigned, and generates utterance candidates 46. The utterance search unit 32 performs morphological analysis on the input user utterance with the morphological analyzer 34, extracts words, searches the utterance database 36 using the extracted words and the input dialogue act as a search query, and generates utterance candidates 46 based on the search results. The versatile utterance selection unit 38 selects from the utterance list 40 an utterance whose assigned dialogue act matches the input dialogue act, and generates utterance candidates 46. The new topic utterance selection unit 42 selects from the utterance list 44 an utterance whose assigned dialogue act matches the input dialogue act, and generates utterance candidates 46. The output unit 18 selects an utterance from the utterance candidates 46 generated and output by the utterance candidate generation device 20, and responds to the user.
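The flow summarized above can be sketched in Python as a cascade that tries each module in order and stops at the first one that yields candidates. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the data stores, the dialogue act labels, and all function names are hypothetical; whitespace tokenization stands in for the morphological analyzers 28 and 34; the search is a simple word-overlap match; and falling through to the fifth module when the fourth yields nothing is an assumption.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for utterance pairs 24, word-utterance pairs 30,
# utterance database 36, and utterance lists 40 and 44.
UTTERANCE_PAIRS: Dict[str, str] = {"thank you": "you are welcome"}
WORD_UTTERANCES: Dict[str, List[Tuple[str, str]]] = {
    "italy": [("question", "Are you going to Italy?")],
}
UTTERANCE_DB: List[Tuple[str, str]] = [("opinion", "Italian wine is nice")]
VERSATILE: List[Tuple[str, str]] = [("backchannel", "I see")]
NEW_TOPIC: List[Tuple[str, str]] = [("question", "By the way, what did you eat today?")]


def tokenize(text: str) -> List[str]:
    # Assumption: whitespace splitting replaces morphological analysis.
    return [w.strip("?!.,").lower() for w in text.split()]


def pair_module(utt: str, act: str) -> List[str]:
    # First module: exact match against the utterance pairs.
    response = UTTERANCE_PAIRS.get(utt.strip().lower())
    return [response] if response else []


def related_module(utt: str, act: str) -> List[str]:
    # Second module: utterances keyed by a topic word, filtered by dialogue act.
    return [u for w in tokenize(utt) for a, u in WORD_UTTERANCES.get(w, []) if a == act]


def search_module(utt: str, act: str) -> List[str]:
    # Third module: query the database with the extracted words and the act.
    words = set(tokenize(utt))
    return [u for a, u in UTTERANCE_DB if a == act and words & set(tokenize(u))]


def versatile_module(utt: str, act: str) -> List[str]:
    # Fourth module: topic-independent utterances with a matching act.
    return [u for a, u in VERSATILE if a == act]


def new_topic_module(utt: str, act: str) -> List[str]:
    # Fifth module: topic-changing utterances with a matching act.
    return [u for a, u in NEW_TOPIC if a == act]


MODULES: List[Callable[[str, str], List[str]]] = [
    pair_module, related_module, search_module, versatile_module, new_topic_module,
]


def generate_candidates(utt: str, act: str) -> List[str]:
    """Try each module in order; return the first non-empty candidate list."""
    for module in MODULES:
        candidates = module(utt, act)
        if candidates:
            return candidates
    return []


print(generate_candidates("I want to go to Italy", "question"))
```

Ordering the modules from most specific to most generic means that a topic-independent or topic-changing utterance is produced only when nothing closer to the user utterance is available.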

In this way, the utterance candidate generation device 20 of the dialogue system 10 of the present embodiment can generate utterance candidates 46 in order, starting from the module that generates utterance candidates 46 corresponding to the user utterance and more appropriate to it. In addition, by including a plurality of modules, the utterance candidate generation device 20 can generate utterance candidates 46 of differing utterance quality. The dialogue system 10 can therefore respond appropriately to user utterances.

This makes it possible to realize a dialogue system 10 that gives high-quality responses. Among humans, chat and similar dialogue serves as a lubricant for conversation. Making high-quality dialogue feasible makes exchanges between humans and dialogue systems smoother and can raise the efficiency of collaborative work between the computer and the user.

Although the present embodiment includes five modules (the utterance pair selection unit 22, the related utterance selection unit 26, the utterance search unit 32, the versatile utterance selection unit 38, and the new topic utterance selection unit 42), the configuration is not limited to this; it suffices to include at least two of them.

Furthermore, without being limited to the present embodiment, the utterance pairs 24, the word-utterance pairs 30, the utterance database 36, the utterance list 40, and the utterance list 44 may be provided outside the utterance candidate generation device 20.

Although the present embodiment has been described in detail with reference to the drawings, it is only an example. The specific configuration is not limited to this embodiment; designs and the like within a scope that does not depart from the gist of the present invention are also included, and it goes without saying that changes may be made according to circumstances.

DESCRIPTION OF SYMBOLS
10 Dialogue system
22 Utterance pair selection unit
24 Utterance pairs
26 Related utterance selection unit
30 Word-utterance pairs
32 Utterance search unit
36 Utterance database
38 Versatile utterance selection unit
40 Utterance list
42 New topic utterance selection unit
44 Utterance list

Claims

An utterance candidate generation device that generates utterance candidates for use in responses in a dialogue system that interacts with a user by responding to user utterances, the device comprising:
a first module that selects, from a plurality of predetermined utterance pairs each consisting of a user utterance and a response to that user utterance, an utterance pair determined to match the input user utterance, and generates an utterance candidate using the response of the selected utterance pair;
a second module that selects, from utterances that use related words predetermined for each word representing a topic of a user utterance, an utterance that uses a related word related to the word representing the topic extracted from the input user utterance, and generates an utterance candidate using the selected utterance;
a third module that searches an utterance database storing a plurality of utterances with a search query that uses a word extracted from the input user utterance, and generates an utterance candidate using the retrieved utterance;
a fourth module that generates an utterance candidate using at least one of a plurality of predetermined utterances that can be uttered regardless of the topic of the user utterance; and a fifth module that generates an utterance candidate using at least one of a plurality of predetermined utterances for changing the topic of the user utterance,
wherein, when no utterance candidate is generated by the first module, processing proceeds to the second module; when no utterance candidate is generated by the second module, processing proceeds to the third module; and when no utterance candidate is generated by the third module, processing proceeds to the fourth module, whereby the utterance candidates generated by the first module, the second module, and the third module are output in preference to the utterance candidates generated by the fourth module and the fifth module.

The utterance candidate generation device according to claim 1, wherein at least one of the first to fifth modules generates the utterance candidate based on a dialogue act of the dialogue system, the dialogue act being determined based on dialogue act estimation information obtained by estimating a dialogue act that represents the intention of the user utterance.

An utterance candidate generation method for generating utterance candidates for use in responses in a dialogue system that interacts with a user by responding to user utterances, the method comprising:
a step in which a first module selects, from a plurality of predetermined utterance pairs each consisting of a user utterance and a response to that user utterance, an utterance pair determined to match the input user utterance, and generates an utterance candidate using the response of the selected utterance pair;
a step in which a second module selects, from utterances that use related words predetermined for each word representing a topic of a user utterance, an utterance that uses a related word related to the word representing the topic extracted from the input user utterance, and generates an utterance candidate using the selected utterance;
a step in which a third module searches an utterance database storing a plurality of utterances with a search query that uses a word extracted from the input user utterance, and generates an utterance candidate using the retrieved utterance;
a step in which a fourth module generates an utterance candidate using at least one of a plurality of predetermined utterances that can be uttered regardless of the topic of the user utterance; and a step in which a fifth module generates an utterance candidate using at least one of a plurality of predetermined utterances for changing the topic of the user utterance,
wherein, when no utterance candidate is generated by the first module, processing proceeds to the second module; when no utterance candidate is generated by the second module, processing proceeds to the third module; and when no utterance candidate is generated by the third module, processing proceeds to the fourth module, whereby the utterance candidates generated by the first module, the second module, and the third module are output in preference to the utterance candidates generated by the fourth module and the fifth module.

An utterance candidate generation program for causing a computer to function as each module of the utterance candidate generation device according to claim 1.