JP2010009471A - Query reply retrieval system, and method and program therefor

Info

Publication number: JP2010009471A
Application number: JP2008170555A
Authority: JP
Inventors: Kenji Tateishi; 健二立石; Masaru Kusui; 大久寿居
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2008-06-30
Filing date: 2008-06-30
Publication date: 2010-01-14
Anticipated expiration: 2028-06-30
Also published as: JP5311002B2

Abstract

PROBLEM TO BE SOLVED: To provide a query reply retrieval system exhibiting a proper narrowing keyword, when incapable of narrowing a reply from an input query sentence, and capable of supporting specification of the reply, and a method and a program therefor.

SOLUTION: This query reply retrieval system includes a case database storing a case including a pair of the query sentence and a reply sentence, and a narrowing keyword selecting means for calculating a significance of the keyword included in the query sentence of the case, using the similarity between the paired reply sentences of the query sentences, and for selecting the narrowing keyword for narrowing a reply for an inquiry sentence from the keywords, based on the significance.

COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to a question-answer search system, and to a method and a program therefor.

Operators at a company's contact center must give appropriate answers to customer inquiries received by telephone or e-mail. The operators' response records are accumulated as a case database that records pairs of a customer's question and the operator's answer. A question answering system that can retrieve useful reference answers from such a case database can be expected to improve the quality of operators' answers and shorten response times.

Question answering systems that use a case database have been proposed before. When a user inputs a question sentence, such a system finds the question sentences stored in the case database that are highly similar to it and presents their answer sentences to the user as answer candidates. For example, Patent Document 1 is characterized in that, when calculating the similarity, it compares the input question sentence with the question sentences in the case database using precise linguistic analysis.
Patent Document 1: JP 2006-244262 A

The problem with conventional question answering systems is that the input question sentence alone does not always identify the answer sufficiently. For example, suppose that, as shown in FIG. 1, a failure case database is searched with “paper jam” as the question sentence, and the answer sentences of five cases whose question sentences are similar (that is, contain “paper jam”) are retrieved as answer candidates. Here the failure case database stores pairs of a failure phenomenon and its cause: the search is run against the phenomenon sentences, and the cause sentences are presented as answer candidates. As this example shows, the phenomenon “paper jam” alone does not reveal whether the cause of the failure lies in the “LD unit” or in the “pick roller”, so the answer to the question sentence cannot be identified sufficiently.

The present invention has been made in view of the above problem, and its object is to provide a question-answer search system, and a method and a program therefor, that present appropriate narrowing keywords and support identification of the answer when the answers cannot be narrowed down from the input question sentence alone.

The present invention, which solves the above problem, is a question-answer search system comprising: a case database storing cases each including a pair of a question sentence and an answer sentence; and narrowing keyword selection means that calculates the importance of the keywords contained in the question sentences of the cases using the similarity between the answer sentences paired with those question sentences, and that selects from the keywords, based on the importance, narrowing keywords for narrowing down the answers to a query sentence.

The present invention, which solves the above problem, is also a question-answer search method that calculates the importance of the keywords contained in the question sentences of cases, each case including a pair of a question sentence and an answer sentence, using the similarity between the answer sentences paired with those question sentences, and that selects from the keywords, based on the importance, narrowing keywords for narrowing down the answers to a query sentence.

The present invention, which solves the above problem, is also a program that causes an information processing apparatus to execute: a process of calculating the importance of the keywords contained in the question sentences of cases, each case including a pair of a question sentence and an answer sentence, using the similarity between the answer sentences paired with those question sentences; and a process of selecting from the keywords, based on the importance, narrowing keywords for narrowing down the answers to a query sentence.

When the answer cannot be sufficiently identified (narrowed down) from the input question sentence, the present invention can support identification of the answer by presenting keywords that narrow down the answers.

The best mode for carrying out the present invention will be described in detail with reference to the drawings.

First, an outline of an embodiment of the present invention will be described.

For example, suppose that “paper jam” is input as the question sentence. The case search means retrieves the cases whose question sentence contains “paper jam”, as shown in FIG. 1. Suppose that five cases, ID1 to ID5, are obtained as a result.

Next, the narrowing keyword selection means calculates the importance of each keyword contained in the question sentences of the cases obtained by the case search means, using the similarity between the answer sentences paired with those question sentences. Here the content words (independent words) contained in the question sentences are taken as keywords; in the five cases these include “paper jam”, “frequent”, “print”, “light”, “recently”, and “think”. The importance of each of these keywords is the average similarity between the answer sentences of the cases that contain the keyword. For example, as shown in FIG. 2, the importance of the keyword “light” is the average of sim(1,2), sim(1,3), and sim(2,3), the similarities between the answer sentences of ID1 to ID3, which contain “light”. Similarly, as shown in FIG. 3, the importance of “print” is the average similarity between the answer sentences of ID1 to ID5, which contain “print”. The similarity calculation is described later; here, suppose that the average similarity between answer sentences (the importance) is 0.78 for “light” and 0.43 for “print”.

The narrowing keyword selection means selects, as narrowing keywords, either a predetermined number of keywords in descending order of importance or the keywords whose importance is at or above a threshold, and presents them to the user.

The user looks at the presented narrowing keywords and checks whether, besides “paper jam”, a failure matching a narrowing keyword such as “(print is) light” or “abnormal sound” is also occurring. If so, searching again with a question sentence that includes that narrowing keyword makes it possible to identify (narrow down) the answer. FIGS. 4 and 5 show narrowing keywords together with their corresponding cases: FIG. 4 shows the narrowing keyword “(print is) light” and its cases, and FIG. 5 shows the narrowing keyword “abnormal sound” and its cases.

The method of calculating the importance of a keyword from the similarity between the documents that contain it is called Term Strength and was proposed in Reference 1 (Wilbur, J.W. and Sirotkin, K., The automatic identification of stop words, Journal of Information Science, 18, pp.45-55, 1992.). It is an index of how readily a keyword becomes the subject of a document (roughly, how important the keyword's role in the document is, or how unlikely it is to be a stop word), and it rests on the property that the closer a keyword is to the subject of a document, the more similar the contents of the documents containing it tend to be. The difference from the present embodiment is this: applying Reference 1 as is would calculate the importance of a keyword contained in question sentences from the similarity between those question sentences, whereas the present embodiment calculates it from the similarity between the corresponding answer sentences. This difference yields a new effect: when the answer cannot be sufficiently identified (narrowed down) from the input question sentence, narrowing keywords that readily identify the answer (that is, for which the similarity between answers is high) can be presented.

Intuitively, because the content of a case's question sentence correlates with that of its answer sentence, one might expect that cases with similar question sentences also have similar answer sentences, so that the method of Reference 1 would achieve the same effect as the present embodiment. However, as in the example of FIG. 1, question sentences describe the failure phenomenon in great detail and contain many parts (content words) that are irrelevant to identifying the answer. As a result, answer sentences may be similar while the question sentences are not. Conversely, even when question sentences are similar, the answer sentences are not necessarily similar if the question sentences are very abstract. The present embodiment is effective in such cases, where the content of the question sentences and that of the answer sentences are not correlated.

Next, a specific first embodiment will be described.

Referring to FIG. 6, the question-answer search system of the present embodiment includes a data processing device 1 that operates under program control, a storage device 2 that stores information, an input device 3 such as a keyboard, and an output device 4 such as a display.

The question-answer search system may be implemented either with all four devices on a single piece of hardware, or with the data processing device 1 and the storage device 2 on a server and the input device 3 and the output device 4 on a separate client device. In the latter case, information entered through the input device 3 on the client is sent over a network to the data processing device 1 on the server, and information output by the data processing device 1 on the server is sent over the network to the output device 4 on the client.

The data processing device 1 includes a case search unit 10, a narrowing keyword selection unit 11, and a display unit 12.

The case search unit 10 searches the case database 20 for cases whose question sentence is similar to the question sentence entered through the input device 3, and stores the result in the case ID storage unit 21.

The narrowing keyword selection unit 11 refers to the case ID storage unit 21 and the case database 20 and calculates the importance of each keyword contained in the question sentences of the cases obtained by the case search unit 10, using the similarity between the corresponding answer sentences. It then selects, as narrowing keywords, either a predetermined number of keywords in descending order of importance or the keywords whose importance is at or above a threshold, and stores them in the narrowing keyword storage unit 22.

The display unit 12 refers to the narrowing keyword storage unit 22 and the case database 20 and sends to the output device 4 the narrowing keywords obtained by the narrowing keyword selection unit 11, together with those cases obtained by the case search unit 10 that contain them.

The storage device 2 includes the case database 20, the case ID storage unit 21, and the narrowing keyword storage unit 22. The storage device 2 is usually implemented as an auxiliary storage device such as an HDD, but may be a memory. The case database 20, the case ID storage unit 21, and the narrowing keyword storage unit 22 need not all be contained in the storage device 2; they may reside in different locations.

In the case database 20, each case includes a pair of a question sentence and an answer sentence, and each case is assigned a case ID that identifies it.
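As a rough illustration of this record layout, a case can be modeled as an ID plus a question/answer pair. The sketch below is an assumption made for illustration only: the class name `Case`, its field names, and the sample tokens are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Case:
    """One case: a question/answer pair identified by a case ID (hypothetical layout)."""
    case_id: int
    question: Tuple[str, ...]  # content words of the question sentence
    answer: Tuple[str, ...]    # content words of the answer sentence


# Hypothetical sample records; the tokens are illustrative, not taken from FIG. 8.
case_database: Dict[int, Case] = {
    case.case_id: case
    for case in (
        Case(1, ("paper-jam", "print", "light"), ("LD-unit", "failure")),
        Case(2, ("paper-jam", "abnormal-sound"), ("pick-roller", "defect")),
    )
}
```

Keying the mapping by case ID mirrors the way the other storage units refer to cases by ID rather than by content.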

The case ID storage unit 21 stores the case IDs obtained by the case search unit 10.

The narrowing keyword storage unit 22 stores the narrowing keywords selected by the narrowing keyword selection unit 11.

Next, the operation of the present embodiment will be described in detail with reference to FIG. 7. The description takes as an example the situation in which the case database 20 stores failure case data, each case consisting of a pair of a “failure phenomenon”, which corresponds to the question sentence, and a “failure cause”, which corresponds to the answer sentence. It is also assumed that the question sentences and answer sentences in the case database 20 have been split into words in advance by morphological analysis and that only the content words (independent words) have been extracted and stored in the storage device 2.

First, the case search unit 10 searches the case database 20 for cases whose question sentence is similar to the question sentence entered through the input device 3, and stores the result in the case ID storage unit 21 (step S1 in FIG. 7).

A question sentence may be entered as words, such as “paper jam”, or as a sentence, such as “a paper jam occurs”. For word input, the cases whose question sentence contains the word are retrieved. When several words are entered, as in “paper jam occurrence” (a space separates the words), the cases containing all of the words are retrieved. For sentence input, only the content words are extracted by morphological analysis, and those words are then handled in the same way as word input. Here it is assumed that “paper jam” was entered as word input through the input device 3, and that ID1 to ID5 of FIG. 8 were retrieved from the case database 20 as the cases whose question sentence contains “paper jam” and were stored in the case ID storage unit 21.
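The word-input search described above can be sketched as an all-words containment test. This is a minimal sketch under stated assumptions: the function name `search_cases`, the dictionary layout, and the hyphenated sample tokens are hypothetical, the question sentences are assumed to be already reduced to content words, and morphological analysis for sentence input is out of scope.

```python
from typing import Dict, Iterable, List


def search_cases(query_words: Iterable[str], questions: Dict[int, List[str]]) -> List[int]:
    """Return the IDs of cases whose question contains every query word."""
    wanted = set(query_words)
    return sorted(cid for cid, words in questions.items() if wanted <= set(words))


# Hypothetical, pre-tokenized question sentences keyed by case ID.
questions = {
    1: ["paper-jam", "print", "light"],
    2: ["paper-jam", "occurrence", "abnormal-sound"],
    3: ["power", "abnormal-sound"],
}

single_word_hits = search_cases(["paper-jam"], questions)
# A space-separated query is split into words, all of which must match.
multi_word_hits = search_cases("paper-jam occurrence".split(), questions)
```

In this sketch the returned ID list plays the role of the contents of the case ID storage unit 21.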

Next, the narrowing keyword selection unit 11 refers to the case ID storage unit 21 and the case database 20, calculates the importance of each keyword contained in the question sentences of the cases obtained by the case search unit 10 using the similarity between the corresponding answer sentences, and stores in the narrowing keyword storage unit 22, as narrowing keywords, either a predetermined number of keywords in descending order of importance or the keywords whose importance is at or above a threshold (step S2 in FIG. 7).

Here, a keyword contained in a question sentence means a content word (independent word) contained in that question sentence. FIG. 9 shows the content words contained in the question sentences and answer sentences of the cases ID1 to ID5 obtained by the case search unit 10. The keywords subject to the importance calculation are “abnormal sound”, “light”, “print”, “paper jam”, “occurrence”, “think”, “midway”, “power”, “frequent”, “loud”, “UNIT”, “recently”, “happen”, and “turn on”.

The importance of a keyword is calculated using the similarity between answer sentences. Specifically, as shown in (Equation 1) of FIG. 10, the importance Score(N) of keyword N is obtained by calculating the similarity sim(d_i,d_j) for every pair of answer sentences chosen from D_N, the set of cases that were obtained by the case search means and whose question sentence contains N, and taking the average. For |D_N| cases there are |D_N|(|D_N|-1)/2 such pairs. There are various other ways to calculate keyword importance, shown in (Equation 2) onward; these are described later.
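The pairwise averaging described for (Equation 1) can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the names `score_pairwise` and `jaccard`, the sample cases, and the use of a Jaccard coefficient as a stand-in for sim(d_i,d_j) are all assumptions, and returning 0 when fewer than two cases contain the keyword is a choice made here because the average is undefined in that situation.

```python
from itertools import combinations
from typing import Callable, List, Set, Tuple

CasePair = Tuple[Set[str], Set[str]]  # (question words, answer words)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Stand-in similarity between two answer sentences (sets of content words)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_pairwise(keyword: str, cases: List[CasePair],
                   sim: Callable[[Set[str], Set[str]], float]) -> float:
    """Average similarity over all pairs of answers whose question has `keyword`."""
    answers = [answer for question, answer in cases if keyword in question]
    pairs = list(combinations(answers, 2))  # |D_N| * (|D_N| - 1) / 2 pairs
    if not pairs:  # fewer than two cases: the average is undefined, so return 0
        return 0.0
    return sum(sim(a, b) for a, b in pairs) / len(pairs)


# Hypothetical retrieved cases (not the contents of FIG. 8).
cases = [
    ({"paper-jam", "print", "light"}, {"LD-unit", "failure"}),
    ({"paper-jam", "print", "light", "frequent"}, {"LD-unit", "replace"}),
    ({"paper-jam", "print", "abnormal-sound"}, {"pick-roller", "failure"}),
]
light_score = score_pairwise("light", cases, jaccard)
print_score = score_pairwise("print", cases, jaccard)
```

With this toy data, “light” scores higher than “print” because the answers of the cases containing “light” agree with each other more closely.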

(Equation 5) of FIG. 11 gives the formula for sim(d_i,d_j), a calculation generally known as cosine similarity. In (Equation 5), the weight w(d,t) of a word (content word) t contained in case d is calculated by the method of (Equation 6) or (Equation 7). In (Equation 6) the weight of every word is a constant 1; (Equation 7) takes the product of tf(d,t), the number of times t occurs in d, and the inverse of df(t), the number of cases in which t occurs. Besides cosine similarity, any method of calculating the similarity between two documents can be used for sim(d_i,d_j), for example the Jaccard coefficient or the Dice coefficient.
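A cosine similarity over weighted content words can be sketched as below. The helper names are hypothetical. `binary_weights` follows the constant-weight scheme described for (Equation 6); `tf_idf_weights` is only an assumed form of a term-frequency and inverse-case-frequency weight, using a natural logarithm of n_cases/df without smoothing, and the exact form of (Equation 7) shown in FIG. 11 may differ.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def binary_weights(words: Iterable[str]) -> Dict[str, float]:
    """Every distinct word gets weight 1."""
    return {t: 1.0 for t in set(words)}


def tf_idf_weights(words: Iterable[str], doc_freq: Dict[str, int],
                   n_cases: int) -> Dict[str, float]:
    """Assumed tf * log(n_cases / df) weighting; log base and smoothing are guesses."""
    tf = Counter(words)
    return {t: count * math.log(n_cases / doc_freq[t]) for t, count in tf.items()}


def cosine(wa: Dict[str, float], wb: Dict[str, float]) -> float:
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(w * wb.get(t, 0.0) for t, w in wa.items())
    norm = (math.sqrt(sum(w * w for w in wa.values()))
            * math.sqrt(sum(w * w for w in wb.values())))
    return dot / norm if norm else 0.0


# Hypothetical answer sentences sharing one of their two content words.
sim_12 = cosine(binary_weights(["LD-unit", "failure"]),
                binary_weights(["LD-unit", "replace"]))
```

Swapping `cosine` for a Jaccard or Dice coefficient only requires a function with the same two-argument shape.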

An example of calculating the importance of keyword N is shown in FIG. 12. FIG. 12(a) calculates Score(light), the score of the keyword “light” contained in the question sentences of the cases ID1 to ID5 of FIG. 8 obtained by the case search unit 10. Because “light” is contained in the question sentences of three cases, ID1 to ID3, there are three similarities, sim(1,2), sim(1,3), and sim(2,3), and their average is taken. FIG. 12(b) calculates sim(1,2) as an example of sim(d_i,d_j). The individual w(d,t) values are calculated as shown in FIG. 12(c). For example, “LD” appears once in case ID1 and appears in three of the five cases, so its weight is 1*log(5/3)=1.42.

FIG. 13 shows the calculated importance of the keywords contained in the question sentences of the cases of FIG. 8 obtained by the case search unit 10. (Equation 1) of FIG. 10 cannot be calculated when |D_N|, the number of cases containing the keyword, is 1, because the denominator becomes 0. The importance of keywords with |D_N|=1 is therefore set to 0. The narrowing keyword selection unit 11 stores in the narrowing keyword storage unit 22, as narrowing keywords, either a predetermined number of keywords in descending order of the calculated importance or the keywords whose importance is at or above a threshold. For example, if keywords with an importance of 0.7 or more are taken as narrowing keywords, the two keywords “abnormal sound” and “light” are stored in the narrowing keyword storage unit 22. Each narrowing keyword is registered in the narrowing keyword storage unit 22 in association with the IDs of the cases that contain it, as shown in FIG. 14.
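The two selection rules, a fixed number of top keywords or an importance threshold, can be sketched as follows. The function name `select_keywords` and its parameters are hypothetical. The values 0.78 and 0.43 echo the overview's example figures for “light” and “print”, while the other scores are made up for illustration and are not the values of FIG. 13.

```python
from typing import Dict, List, Optional


def select_keywords(scores: Dict[str, float], top_k: Optional[int] = None,
                    threshold: Optional[float] = None) -> List[str]:
    """Rank keywords by importance, then apply a threshold and/or a top-k cutoff."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(kw, s) for kw, s in ranked if s >= threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [kw for kw, _ in ranked]


# Illustrative importance values; only 0.78 and 0.43 come from the description.
scores = {"light": 0.78, "print": 0.43, "abnormal sound": 0.9, "frequent": 0.0}

by_threshold = select_keywords(scores, threshold=0.7)
by_count = select_keywords(scores, top_k=1)
```

Allowing both parameters at once is a design convenience of this sketch; the description presents the two rules as alternatives.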

Finally, the display unit 12 refers to the narrowing keyword storage unit 22 and the case database 20 and sends to the output device 4 the narrowing keywords obtained by the narrowing keyword selection unit 11 and the cases obtained by the case search unit 10 (step S3 in FIG. 7).

As shown in FIG. 14, the narrowing keyword storage unit 22 stores “abnormal sound” and “light” together with the IDs of the cases, among those obtained by the case search unit 10, that contain each of them, so it suffices to send these contents to the output device 4.

This concludes the description of the operation of the present embodiment. In step S2 of FIG. 7, the importance of each keyword contained in the question sentences of the cases obtained by the case search unit 10 was calculated using the similarity between the corresponding answer sentences. The description above used (Equation 1) of FIG. 10 as the specific way of doing this, but other schemes also exist.

In the scheme of (Equation 2) of FIG. 10, the importance Score(N) of keyword N is obtained by first finding the centroid of the answer sentences of D_N, the cases that were obtained by the case search unit 10 and whose question sentence contains the keyword, and then averaging the similarity sim(d_i,d_DN) between each answer sentence and that centroid. The centroid is obtained by concatenating all the answer sentences of D_N. For example, the centroid of the answer sentences of ID1 to ID5 in FIG. 8 is “The LD unit had failed / Replaced the LD unit. / LD unit failure. Replaced / Pick roller defective. / Apparently it was a pick roller failure” (“/” marks the boundary between answer sentences). (Equation 1) requires the similarity of every pair of answer sentences in D_N, whereas (Equation 2) requires only the similarity between each answer sentence and the centroid, so it can be computed faster than (Equation 1).
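The centroid-based variant can be sketched as below, with the centroid formed by concatenating the answers into one bag of words. The names `score_centroid` and `cosine` and the sample data are hypothetical, term counts are used as weights for simplicity, and returning 0 when fewer than two cases contain the keyword is an assumption carried over from the handling of (Equation 1).

```python
import math
from collections import Counter
from typing import List, Tuple

CasePair = Tuple[List[str], List[str]]  # (question words, answer words)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[t] for t, count in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0


def score_centroid(keyword: str, cases: List[CasePair]) -> float:
    """Average similarity between each matching answer and the concatenated answers."""
    answers = [answer for question, answer in cases if keyword in question]
    if len(answers) < 2:  # assumption: treat a single case like the pairwise scheme
        return 0.0
    centroid = Counter(word for answer in answers for word in answer)
    return sum(cosine(Counter(a), centroid) for a in answers) / len(answers)


# Hypothetical retrieved cases (not the contents of FIG. 8).
cases = [
    (["paper-jam", "light"], ["LD-unit", "failure"]),
    (["paper-jam", "light"], ["LD-unit", "replace"]),
    (["paper-jam"], ["pick-roller", "failure"]),
]
light_score = score_centroid("light", cases)
```

For n matching cases this makes n similarity calls instead of n(n-1)/2, which is the speed advantage the description attributes to (Equation 2).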

In the scheme of (Equation 3) of FIG. 10, the importance Score(N) of keyword N is obtained by subtracting from the value of (Equation 1) the expected importance E[Score(N)] of keywords that have the same number of appearance cases. For example, in the cases of FIG. 8 obtained by the case search means, “light” appears in three question sentences, ID1 to ID3. The score of “light” is calculated by (Equation 1), the expected (Equation 1) score of keywords appearing in three question sentences is calculated, and the difference between the two is taken. By the nature of (Equation 1), scores tend overall to be higher for keywords that appear in fewer cases and lower for keywords that appear in more cases. Applying (Equation 1) as is therefore overrates keywords with few appearance cases. Subtracting the expected value, as in (Equation 3), resolves this problem.
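The expected-value correction can be sketched as below. The description does not state how E[Score(N)] is estimated, so this sketch assumes one simple possibility: the empirical mean of the raw scores of all keywords that appear in the same number of cases. The function name, parameter names, and sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict


def adjust_by_expectation(raw_scores: Dict[str, float],
                          case_counts: Dict[str, int]) -> Dict[str, float]:
    """Subtract, from each raw score, the mean raw score of keywords with equal case count."""
    buckets = defaultdict(list)
    for keyword, score in raw_scores.items():
        buckets[case_counts[keyword]].append(score)
    # Assumed estimator of E[Score(N)]: the per-count empirical mean.
    expected = {count: sum(values) / len(values) for count, values in buckets.items()}
    return {kw: score - expected[case_counts[kw]] for kw, score in raw_scores.items()}


# Made-up raw scores and appearance-case counts for three placeholder keywords.
raw_scores = {"kw_a": 0.8, "kw_b": 0.4, "kw_c": 0.5}
case_counts = {"kw_a": 2, "kw_b": 2, "kw_c": 5}
adjusted = adjust_by_expectation(raw_scores, case_counts)
```

Under this estimator a keyword that is alone in its case-count bucket always adjusts to 0, so a practical system would likely need a smoother estimate of the expectation.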

The scheme of (Equation 4) of FIG. 10 is the same as (Equation 3) except that it uses (Equation 2) in place of (Equation 1).

(Equation 9) to (Equation 12) of FIG. 15 differ from (Equation 1) to (Equation 4) in that they calculate the importance of the keywords contained in the question sentences of the cases obtained by the case search unit 10 using the number of appearance cases in addition to the similarity between the corresponding answer sentences. This approach makes it possible to present, as response keywords, keywords that both readily identify the answer and are likely to make the user's question more specific. In (Equation 9) to (Equation 12) a logarithm is applied to the term for the number of appearance cases |D_N| in order to reduce its influence. The logarithm may of course be omitted.

The question-answer search system presents narrowing keywords to the user, and if a presented narrowing keyword can make the initially input question sentence more specific, the user searches again with a question sentence that includes it. For example, if “light” and “abnormal sound” are presented as narrowing keywords for the initial question sentence “paper jam”, the user checks whether, in addition to the paper jam, a phenomenon such as “(print is) light” or “abnormal sound” is occurring. If “(print is) light” is found to be occurring, the user runs a new case search with “light” and “paper jam”. Here, if three cases contain “light” and one case contains “abnormal sound”, “light” can be said to be more likely than “abnormal sound” to be actually occurring. Taking the number of appearance cases into account in this way makes it possible to present, as response keywords, keywords with which the user is likely to be able to make the question more specific.

(Equation 13) and (Equation 14) of FIG. 16 differ from (Equation 1) to (Equation 4) and (Equation 9) to (Equation 12) in that they calculate the importance of the keywords contained in the question sentences of the cases obtained by the case search unit 10 using the similarity between the question sentences in addition to the similarity between the corresponding answer sentences. QScore(N) in (Equation 13) and (Equation 14) is the Score(N) of (Equation 1) to (Equation 4) or (Equation 9) to (Equation 12) calculated over question sentences. That is, D_N is taken as the set of cases whose question sentence contains N, and sim(d_1,d_2) as the similarity between question sentence d_1 and question sentence d_2. AScore(N) is the same as the Score(N) of (Equation 1) to (Equation 4) and (Equation 9) to (Equation 12).

When a question has several correct answers, using only the similarity between answer sentences does not necessarily give a high importance to the keywords that identify those answers. On the other hand, the content of a question sentence and the content of its answer sentence are correlated to some degree, so if the similarity between question sentences is high, the keyword is likely to be one that identifies the answers even when the similarity between answer sentences is not high. For this reason, the similarity between question sentences is also used.
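The idea of scoring a keyword from both question-side and answer-side similarity can be sketched as below. This is a minimal sketch under stated assumptions, not the formulas of FIG. 16: the word-overlap (Jaccard) similarity, the function names, and the weighted-sum combination in `combined_score` are all hypothetical. The averaging over all pairs follows the all-combinations variant of the score.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """ASSUMPTION: similarity is word-set overlap; the source leaves sim() open."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def avg_pairwise(texts):
    """Average similarity over all pairs of texts (0.0 if fewer than two)."""
    pairs = list(combinations(texts, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(x, y) for x, y in pairs) / len(pairs)


def q_and_a_scores(cases, keyword):
    """Return (QScore, AScore) for a keyword over (question, answer) cases."""
    hits = [(q, a) for q, a in cases if keyword in q.split()]
    q_score = avg_pairwise([q for q, _ in hits])
    a_score = avg_pairwise([a for _, a in hits])
    return q_score, a_score


def combined_score(q_score, a_score, weight=0.5):
    # ASSUMPTION: a weighted sum; the actual combination is defined in FIG. 16.
    return weight * q_score + (1 - weight) * a_score


# Hypothetical cases: (question sentence, answer sentence).
cases = [
    ("paper jam faint print", "replace toner cartridge"),
    ("paper jam faint output", "replace toner cartridge"),
    ("paper jam abnormal sound", "clean the roller"),
]
q_score, a_score = q_and_a_scores(cases, "faint")
score = combined_score(q_score, a_score)
```

A keyword whose cases have dissimilar answers but similar questions still receives a nonzero combined score, which is the situation the paragraph above is concerned with.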

Another embodiment will now be described.

So far, cases consisting of a pair of one question sentence and one answer sentence have been used, but some cases consist of one question sentence paired with multiple answer sentences. For example, on a QA site on the Web, as shown in FIG. 17, when a user registers a question sentence via the Web, other users register answer sentences to it. One of the registered answer sentences is chosen as the best answer sentence. Well-known QA sites at present include "Yahoo Chiebukuro" (http://chiebukuro.yahoo.co.jp/), "Oshiete goo" (http://oshiete.goo.ne.jp/), and "OKWave" (http://okwave.jp/).

A method of applying the present invention to cases that contain one question sentence paired with multiple answer sentences is described below. The simplest method is to run the narrowing keyword selection unit 11 using the best answer sentence, selected in advance from among the multiple answer sentences. In this case, each case can in effect be regarded as consisting of a pair of one question sentence and one answer sentence.

Another method is to expand a case containing one question sentence paired with n answer sentences into n cases, each containing the question sentence and one answer sentence, and then apply the narrowing keyword selection unit 11. For example, the case {Q1,{A1,A2}} is expanded into {Q1,A1} and {Q1,A2}. Here, {Q1,{A1,A2}} denotes a single case consisting of question sentence Q1 and two answer sentences A1 and A2, and the expansion produces two cases: question sentence Q1 with answer sentence A1, and question sentence Q1 with answer sentence A2.
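The expansion of {Q1,{A1,A2}} into {Q1,A1} and {Q1,A2} can be sketched as follows. The representation of a case as a `(question, [answers])` tuple and the name `expand_cases` are assumptions made for illustration.

```python
def expand_cases(cases):
    """Expand each (question, [answers]) case into one (question, answer) case per answer."""
    return [(question, answer) for question, answers in cases for answer in answers]


# The example from the text: one case with question Q1 and answers A1, A2.
expanded = expand_cases([("Q1", ["A1", "A2"])])
```

The expanded list can then be passed to any routine that expects one answer sentence per case.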

Yet another method is to obtain the center of the multiple answer sentences for one question sentence and then run the narrowing keyword selection unit 11. In this case too, each case can in effect be regarded as consisting of a pair of one question sentence and one answer sentence. The center can be obtained by concatenating all of the answer sentences, as in (Equation 2) of FIG. 10.
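Obtaining the center by concatenation can be sketched as follows. The function names and the choice of a single space as the separator are assumptions; the text only states that the answer sentences are all concatenated.

```python
def answer_center(answers):
    """Center of several answer sentences, obtained by concatenating them.

    ASSUMPTION: sentences are joined with a single space.
    """
    return " ".join(answers)


def collapse_to_center(cases):
    """Turn each (question, [answers]) case into a (question, center) pair."""
    return [(question, answer_center(answers)) for question, answers in cases]


collapsed = collapse_to_center([("Q1", ["replace toner", "clean roller"])])
```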

Yet another method applies when the narrowing keyword selection unit 11 calculates the importance of a keyword contained in the question sentences of the cases obtained by the case search unit 10 using the similarity between the answer sentences of the cases: the similarity is obtained for every combination of an answer sentence of one case with an answer sentence of the other case, and the minimum, maximum, or average of these values is used as the similarity between the answer sentences of the two cases. For example, for case A {Q1, {A1,A2}} and case B {Q2, {A3,A4}}, the similarity between the answer sentences of case A and case B is obtained as the minimum, maximum, or average of sim(A1,A3), sim(A1,A4), sim(A2,A3), and sim(A2,A4).
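The reduction over all answer combinations of two cases can be sketched as follows. The word-overlap (Jaccard) similarity and the names `jaccard` and `case_answer_similarity` are assumptions; the text does not fix a particular sim().

```python
from itertools import product
from statistics import mean


def jaccard(a: str, b: str) -> float:
    """ASSUMPTION: similarity is word-set overlap; the source leaves sim() open."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def case_answer_similarity(answers_a, answers_b, mode="mean"):
    """Similarity between two cases' answers: min, max, or mean over all cross pairs."""
    sims = [jaccard(x, y) for x, y in product(answers_a, answers_b)]
    if not sims:
        return 0.0
    reducers = {"min": min, "max": max, "mean": mean}
    return reducers[mode](sims)


# Hypothetical answers for case A {Q1,{A1,A2}} and case B {Q2,{A3,A4}}.
case_a = ["replace toner", "clean roller"]
case_b = ["replace toner", "replace drum"]
sim_min = case_answer_similarity(case_a, case_b, "min")
sim_max = case_answer_similarity(case_a, case_b, "max")
sim_mean = case_answer_similarity(case_a, case_b, "mean")
```

Using the maximum treats two cases as similar when any pair of their answers agrees, while the minimum requires all pairs to agree; the average sits between the two.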

The effect of this embodiment is that, when the answer cannot be sufficiently identified (narrowed down) from the input question sentence, presenting response keywords that make the answer easy to identify supports the user in identifying the answer. This is because the embodiment has the narrowing keyword selection unit 11, which calculates the importance of the keywords contained in the question sentences of the cases obtained by the case search unit 10 using the similarity between their answer sentences, and selects as narrowing keywords either a predetermined number of keywords in order of importance or the keywords whose importance is at or above a threshold.
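The two selection rules mentioned above (a predetermined number of keywords in order of importance, or all keywords at or above a threshold) can be sketched as follows. The function name, the parameter names `top_k` and `threshold`, and the sample scores are assumptions for illustration.

```python
def select_narrowing_keywords(scores, top_k=None, threshold=None):
    """Select narrowing keywords from a {keyword: importance} mapping.

    ASSUMPTION: either rule may be used alone; if both are given, the
    threshold is applied first and the top_k limit second.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if threshold is not None:
        ranked = [(kw, s) for kw, s in ranked if s >= threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [kw for kw, _ in ranked]


# Hypothetical importance values.
scores = {"faint": 0.9, "abnormal sound": 0.7, "tray": 0.2}
by_count = select_narrowing_keywords(scores, top_k=2)
by_threshold = select_narrowing_keywords(scores, threshold=0.8)
```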

The question answer search system of the present invention can be used by contact center operators to search an accumulated case database for reference answers, so that customer inquiries can be answered quickly and accurately. It can also be used on Web QA sites when users search for reference question-and-answer pairs.

FIG. 1 is a diagram for explaining the problems of the prior art.
FIG. 2 is a diagram for explaining the outline of the present invention.
FIG. 3 is a diagram for explaining the outline of the present invention.
FIG. 4 is a diagram for explaining the outline of the present invention.
FIG. 5 is a diagram for explaining the outline of the present invention.
FIG. 6 is a diagram showing the configuration of the embodiment of the present invention.
FIG. 7 is a flowchart showing the operation of the embodiment of the present invention.
FIG. 8 is a diagram for explaining the operation of the embodiment of the present invention.
FIG. 9 is a diagram for explaining the operation of the narrowing keyword selection unit 11 according to the embodiment of the present invention.
FIG. 10 is a diagram for explaining the operation of the narrowing keyword selection unit 11 according to the embodiment of the present invention.
FIG. 11 is a diagram for explaining the operation of the narrowing keyword selection unit 11 according to the embodiment of the present invention.
FIG. 12 is a diagram for explaining the operation of the narrowing keyword selection unit 11 according to the embodiment of the present invention.
FIG. 13 is a diagram for explaining the operation of the narrowing keyword selection unit 11 according to the embodiment of the present invention.
FIG. 14 is a storage example of the narrowing keyword storage unit 22 according to the embodiment of the present invention.
FIG. 15 is a diagram for explaining the operation of the narrowing keyword selection unit 11 according to the embodiment of the present invention.
FIG. 16 is a diagram for explaining the operation of the narrowing keyword selection unit 11 according to the embodiment of the present invention.
FIG. 17 is a diagram for explaining the operation of the narrowing keyword selection unit 11 according to another embodiment of the present invention.

Explanation of symbols

1 Data processing device
2 Storage device
3 Input device
4 Output device
10 Case search unit
11 Narrowing keyword selection unit
12 Display unit
20 Case database
21 Case ID storage unit
22 Narrowing keyword storage unit

Claims

A question answer search system comprising: a case database that stores cases each including a pair of a question sentence and an answer sentence; and
narrowing keyword selection means for calculating the importance of keywords included in the question sentences of the cases using the similarity between the answer sentences paired with those question sentences, and for selecting, from the keywords and based on the importance, a narrowing keyword for narrowing down answers to an inquiry sentence.

The question answer search system according to claim 1, further comprising case search means for searching the case database for cases where the inputted inquiry sentence is similar to the question sentence in the case database.

The question answer search system according to claim 2, wherein the narrowing keyword selection means calculates the importance of keywords included in the question sentences of the cases retrieved by the case search means using the similarity between the answer sentences paired with those question sentences, and selects, from the keywords and based on the importance, a narrowing keyword for narrowing down answers to the inquiry sentence.

The question answer search system according to claim 1, wherein the narrowing keyword selection means selects a predetermined number of keywords as narrowing keywords in order of importance.

The question answer search system according to claim 1, wherein the narrowing keyword selection means selects keywords whose importance is equal to or higher than a threshold as narrowing keywords.

The question answer search system according to claim 1, further comprising display means for transmitting the narrowing keyword and the cases including the narrowing keyword to an output device.

The narrowed-down keyword selection unit calculates the importance of the keyword included in the question sentence of the case using the similarity between the answer sentences of the case and the number of appearance cases. Question answer search system described in.

The question answer search system according to claim 6, wherein the narrowing keyword selection means calculates the importance of a keyword included in the question sentences of the cases using the similarity between the answer sentences of the cases and the similarity between the question sentences of the cases.

The question answer search system according to any one of claims 1 to 6, wherein the narrowing keyword selection means calculates the similarity between answer sentences for all combinations of the answer sentences of the cases whose question sentence includes the keyword, and uses the average value as the importance of the keyword.

The question answer search system according to any one of claims 1 to 6, wherein the narrowing keyword selection means obtains the center of the answer sentences of the cases whose question sentence includes the keyword, and uses the average value of the similarity between each of those answer sentences and the center as the importance of the keyword.

7. The narrowed-down keyword selection means uses a value obtained by subtracting an expected value of the importance of a keyword having the number of appearance cases of the keyword from the obtained importance of the keyword as the importance of the keyword. Question answer search system described in any one.

The question answer search system according to any one of claims 1 to 11, wherein, when a case consists of one question sentence paired with a plurality of answer sentences, the narrowing keyword selection means regards the case as a pair of one question sentence and one answer sentence.

The question answer search system according to claim 12, wherein, when a case consists of one question sentence paired with a plurality of answer sentences, the narrowing keyword selection means selects the best answer sentence from among the plurality of answer sentences and regards the case as a pair of the question sentence and the selected answer sentence.

The question answer search system according to claim 12, wherein, when a case consists of one question sentence paired with n answer sentences, the narrowing keyword selection means expands the case into n cases each including the question sentence and one answer sentence.

The question answer search system according to claim 12, wherein, when a case consists of one question sentence paired with a plurality of answer sentences, the narrowing keyword selection means obtains the center of the plurality of answer sentences for the question sentence and regards the case as a pair of the question sentence and the center of the answer sentences.

The question answer search system according to claim 12, wherein, when a case consists of one question sentence paired with a plurality of answer sentences, the narrowing keyword selection means obtains the similarities for all combinations of the plurality of answer sentences of the cases concerned, and uses the minimum value, the maximum value, or the average value as the similarity between the answer sentences of the cases.

A question answer search method comprising: calculating the importance of keywords included in the question sentences of cases, each case including a pair of a question sentence and an answer sentence, using the similarity between the answer sentences paired with those question sentences; and selecting, from the keywords and based on the importance, a narrowing keyword for narrowing down answers to an inquiry sentence.

The question answer search method according to claim 17, wherein cases whose question sentence is similar to the inquiry sentence are retrieved from a case database storing cases each including a pair of a question sentence and an answer sentence.

The question answer search method according to claim 18, wherein the importance of keywords included in the question sentences of the retrieved cases is calculated using the similarity between the answer sentences paired with those question sentences, and a narrowing keyword for narrowing down answers to the inquiry sentence is selected from the keywords based on the importance.

The question answer search method according to claim 17, wherein a predetermined number of keywords are selected as narrowing keywords in order of importance.

The question answer search method according to claim 17, wherein keywords whose importance is equal to or higher than a threshold are selected as narrowing keywords.

The question answer search method according to claim 17, wherein the narrowing keyword and the cases including the narrowing keyword are displayed.

The question answer search method according to any one of claims 17 to 22, wherein the importance of a keyword included in the question sentences of the cases is calculated using the similarity between the answer sentences of the cases and the number of appearance cases.

The importance of the keyword included in the question sentence of the case is calculated using the similarity between the answer sentences of the case and the similarity between the question sentences of the case. The method for searching for answers to questions.

18. The degree of similarity between the answer sentences of cases including a keyword in the question sentence is calculated for all combinations of the answer sentences of cases containing the keyword in the question sentence, and the average value is set as the importance of the keyword. Item 23. The question answer search method according to Item 22.

The question answer search method according to any one of claims 17 to 22, wherein the center of the answer sentences of the cases whose question sentence includes the keyword is obtained, and the average value of the similarity between each of those answer sentences and the center is used as the importance of the keyword.

The question answer search method according to any one of claims 17 to 22, wherein a value obtained by subtracting, from the obtained importance of the keyword, the expected value of the importance of a keyword having the same number of appearance cases as the keyword is used as the importance of the keyword.

Claims 17 to 17 in which one example is regarded as a pair of one question sentence and one answer sentence when one example is composed of a pair of one question sentence and a plurality of answer sentences. Item 28. The question answer search method according to any one of Items 27.

The question answer search method according to claim 28, wherein, when a case consists of one question sentence paired with a plurality of answer sentences, the best answer sentence is selected from among the plurality of answer sentences, and the case is regarded as a pair of the question sentence and the selected answer sentence.

The question answer search method according to claim 28, wherein, when a case consists of one question sentence paired with n answer sentences, the case is expanded into n cases each including the question sentence and one answer sentence.

The question answer search method according to claim 28, wherein, when a case consists of one question sentence paired with a plurality of answer sentences, the center of the plurality of answer sentences for the question sentence is obtained, and the case is regarded as a pair of the question sentence and the center of the answer sentences.

The question answer search method according to claim 28, wherein, when a case consists of one question sentence paired with a plurality of answer sentences, the similarities for all combinations of the plurality of answer sentences of the cases concerned are obtained, and the minimum value, the maximum value, or the average value is used as the similarity between the answer sentences of the cases.

A program for causing an information processing apparatus to execute: a process of calculating the importance of keywords included in the question sentences of cases, each case including a pair of a question sentence and an answer sentence, using the similarity between the answer sentences paired with those question sentences; and
a process of selecting, from the keywords and based on the importance, a narrowing keyword for narrowing down answers to an inquiry sentence.