JP7009911B2

JP7009911B2 - Answer output program, answer output method and information processing device

Info

Publication number: JP7009911B2
Application number: JP2017207623A
Authority: JP
Inventors: 章文中浜
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2017-10-26
Filing date: 2017-10-26
Publication date: 2022-01-26
Anticipated expiration: 2037-10-26
Also published as: JP2019079437A

Description

本発明は、回答出力プログラム、回答出力方法および情報処理装置に関する。 The present invention relates to an answer output program, an answer output method, and an information processing apparatus.

近年、オペレータのスキル平準化や回答リードタイムの短縮化のため、ＦＡＱ（ＦｒｅｑｕｅｎｔｌｙＡｓｋｅｄＱｕｅｓｔｉｏｎ）システムを導入するコールセンタが増えている。また、外部にＦＡＱを公開して、コールセンタへの問い合わせ削減を図ることも行われている。 In recent years, an increasing number of call centers have introduced FAQ (Frequently Asked Question) systems in order to level operator skills and shorten answer lead times. FAQs are also published externally in order to reduce the number of inquiries to the call center.

先行技術としては、例えば、複数の識別器各々の識別結果および真のクラスを２次学習データとして用いて、統合識別器の学習を行うものがある。また、追加素性情報候補が追加された新たな学習モデル情報の回答候補抽出の精度と、元の学習モデル情報の回答候補抽出の精度とを比較し、回答候補抽出の精度が所定値以上向上している場合には、新たな学習モデル情報をＤＢに保持させたままとする技術がある。また、第１の処理部による、第１の質問に対する第１の回答を取得し、第２の処理部による、第１の質問に対する回答および回答の評価を参照して得られた、第１の回答の評価を取得し、第１の回答と第１の回答の評価とを併せて提示する技術がある。また、問い合わせの対象を用いて行動履歴情報を検索する範囲を決定し、絞込みモデルを用いて、範囲内の行動履歴情報によって示される行動の履歴から、問い合わせの回答の候補ごとに可能性を判定し、この可能性に応じて回答の候補を出力する技術がある。 Prior art includes, for example, a technique that trains an integrated classifier using the classification results of each of a plurality of classifiers and the true class as secondary training data. Another technique compares the answer candidate extraction accuracy of new learning model information, to which an additional feature information candidate has been added, with that of the original learning model information, and keeps the new learning model information in the DB when the accuracy has improved by a predetermined value or more. Another technique acquires a first answer to a first question from a first processing unit, acquires an evaluation of the first answer obtained by referring to a second processing unit's answer to the first question and the evaluation of that answer, and presents the first answer together with the evaluation of the first answer. Yet another technique determines the range of behavior history information to be searched using the target of an inquiry, uses a narrowing model to determine, from the behavior history indicated by the behavior history information within that range, a likelihood for each candidate answer to the inquiry, and outputs candidate answers according to that likelihood.

特開２０１５－１１６８６号公報 Japanese Unexamined Patent Publication No. 2015-11686
特開２００７－２１９９５５号公報 Japanese Unexamined Patent Publication No. 2007-219955
特開２０１７－９７５６１号公報 Japanese Unexamined Patent Publication No. 2017-97561
特開２０１２－１２８５２５号公報 Japanese Unexamined Patent Publication No. 2012-128525

しかしながら、従来技術では、入力された質問に対して出力する回答候補を最適化することが難しい。例えば、ＦＡＱへのアクセス数の多い順や、入力されたキーワードや文章との類似度が高い順にＦＡＱを並び替えただけでは、ユーザが期待する並びになっていないことがある。 However, with the prior art, it is difficult to optimize the answer candidates that are output for an input question. For example, simply sorting the FAQs in descending order of access count, or in descending order of similarity to the entered keywords or sentences, may not produce the order the user expects.

一つの側面では、本発明は、質問に対して出力する回答候補を最適化することを目的とする。 In one aspect, the present invention aims to optimize the answer candidates to be output for a question.

１つの実施態様では、質問の入力を受け付け、受け付けた前記質問に対応する複数の回答候補を特定し、前記質問に対する前記回答候補のうち選択操作を受け付けた１または複数の第１の回答候補と、前記質問に対する前記回答候補のうち選択操作を受け付けなかった１または複数の第２の回答候補とを記憶する記憶部の記憶内容に基づく機械学習を行って、特定した前記複数の回答候補それぞれについて、受け付けた前記質問に対して表示された際に選択される第１の確率と、受け付けた前記質問に対して表示された際に選択されない第２の確率とを算出し、算出した前記第１の確率と前記第２の確率とに基づいて、前記複数の回答候補のうち前記第２の確率が閾値以上の回答候補を除外した回答候補を、前記第１の確率が高い順にソートした１または複数の回答候補を、受け付けた前記質問に対する１または複数の回答候補に決定し、決定した前記１または複数の回答候補を出力し、前記質問に対する選択操作を受け付けた前記回答候補を示すログ情報を蓄積し、蓄積した前記ログ情報に基づいて、前記質問に対して表示される上位所定数の前記回答候補のいずれかが選択される確率を算出し、算出した前記確率の時系列変化に基づいて、前記閾値を調整し、前記調整する処理は、算出した前記確率の時系列変化に基づいて、前記確率が下降傾向にあると判断した場合、前記閾値を所定値下げる、回答出力プログラムが提供される。 In one embodiment, an answer output program is provided that: accepts input of a question; identifies a plurality of answer candidates corresponding to the accepted question; performs machine learning based on the stored contents of a storage unit that stores one or more first answer candidates that received a selection operation among the answer candidates for a question and one or more second answer candidates that did not receive a selection operation among the answer candidates for the question, and thereby calculates, for each of the identified answer candidates, a first probability of being selected when displayed for the accepted question and a second probability of not being selected when displayed for the accepted question; based on the calculated first and second probabilities, excludes from the plurality of answer candidates those whose second probability is equal to or greater than a threshold, sorts the remaining answer candidates in descending order of the first probability, and determines the result as the one or more answer candidates for the accepted question; outputs the determined one or more answer candidates; accumulates log information indicating the answer candidates that received a selection operation for a question; calculates, based on the accumulated log information, the probability that any of the top predetermined number of answer candidates displayed for a question is selected; and adjusts the threshold based on the time-series change of the calculated probability, wherein the adjusting lowers the threshold by a predetermined value when the probability is determined, based on that time-series change, to be on a downward trend.
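The decision and threshold-adjustment steps of this embodiment can be illustrated with a minimal sketch. The names ScoredCandidate, decide_answers and adjust_threshold, the step size of 0.05, and the rule used to detect a "downward trend" (three strictly decreasing observations) are all assumptions made for illustration; the text above only states that the threshold is lowered by a predetermined value when the probability is judged to be falling.

```python
from dataclasses import dataclass


@dataclass
class ScoredCandidate:
    faq_id: int
    p_selected: float      # first probability: selected when displayed
    p_not_selected: float  # second probability: not selected when displayed


def decide_answers(candidates, threshold):
    """Drop candidates whose second probability is >= threshold,
    then sort the rest by first probability, highest first."""
    kept = [c for c in candidates if c.p_not_selected < threshold]
    return sorted(kept, key=lambda c: c.p_selected, reverse=True)


def adjust_threshold(threshold, hit_rates, step=0.05, floor=0.0):
    """Lower the threshold by `step` when the top-N hit rate is trending down.

    Assumption: a downward trend is approximated as a strictly decreasing
    run over the last three observations; the source does not define the test.
    """
    recent = hit_rates[-3:]
    falling = len(recent) == 3 and recent[0] > recent[1] > recent[2]
    return max(floor, threshold - step) if falling else threshold
```

Lowering the threshold makes the exclusion stricter, so more candidates with a high probability of being ignored are removed from the output when users are selecting first-page results less often.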

本発明の一側面によれば、質問に対して出力する回答候補を最適化することができる。 According to one aspect of the present invention, it is possible to optimize the answer candidates to be output to the question.

図１は、実施の形態にかかる回答出力方法の一実施例を示す説明図である。 FIG. 1 is an explanatory diagram showing an example of the answer output method according to the embodiment.
図２は、回答出力システム２００のシステム構成例を示す説明図である。 FIG. 2 is an explanatory diagram showing a system configuration example of the answer output system 200.
図３は、情報処理装置１０１のハードウェア構成例を示すブロック図である。 FIG. 3 is a block diagram showing a hardware configuration example of the information processing apparatus 101.
図４は、ＦＡＱマスタ２２０の記憶内容の一例を示す説明図である。 FIG. 4 is an explanatory diagram showing an example of the stored contents of the FAQ master 220.
図５は、アクセスログＤＢ２３０の記憶内容の一例を示す説明図である。 FIG. 5 is an explanatory diagram showing an example of the stored contents of the access log DB 230.
図６は、教師データ（よゐこ）ＤＢ２４０の記憶内容の一例を示す説明図である。 FIG. 6 is an explanatory diagram showing an example of the stored contents of the teacher data (Yoiko) DB 240.
図７は、教師データ（わるいこ）ＤＢ２５０の記憶内容の一例を示す説明図である。 FIG. 7 is an explanatory diagram showing an example of the stored contents of the teacher data (waruiko) DB 250.
図８は、教師データ（まいご）ＤＢ２６０の記憶内容の一例を示す説明図である。 FIG. 8 is an explanatory diagram showing an example of the stored contents of the teacher data (maigo) DB 260.
図９は、ＦＡＱ画面９００の画面例を示す説明図である。 FIG. 9 is an explanatory diagram showing a screen example of the FAQ screen 900.
図１０は、情報処理装置１０１の機能的構成例を示すブロック図である。 FIG. 10 is a block diagram showing a functional configuration example of the information processing apparatus 101.
図１１は、アクセスログの具体例を示す説明図（その１）である。 FIG. 11 is an explanatory diagram (No. 1) showing a specific example of the access log.
図１２は、教師データの具体例を示す説明図（その１）である。 FIG. 12 is an explanatory diagram (No. 1) showing a specific example of teacher data.
図１３は、アクセスログの具体例を示す説明図（その２）である。 FIG. 13 is an explanatory diagram (No. 2) showing a specific example of the access log.
図１４は、教師データの具体例を示す説明図（その２）である。 FIG. 14 is an explanatory diagram (No. 2) showing a specific example of teacher data.
図１５は、分析ＤＢ１５００の記憶内容の一例を示す説明図である。 FIG. 15 is an explanatory diagram showing an example of the stored contents of the analysis DB 1500.
図１６は、ＦＡＱの決定例を示す説明図である。 FIG. 16 is an explanatory diagram showing an example of determining FAQs.
図１７は、アクセスログの具体例を示す説明図（その３）である。 FIG. 17 is an explanatory diagram (No. 3) showing a specific example of the access log.
図１８は、１ページ目ヒット率の時系列変化の一例を示す説明図である。 FIG. 18 is an explanatory diagram showing an example of time-series changes in the first-page hit rate.
図１９は、形態素を推定する方法の一実施例を示す説明図である。 FIG. 19 is an explanatory diagram showing an example of a method for estimating morphemes.
図２０は、情報処理装置１０１の教師データ生成処理手順の一例を示すフローチャート（その１）である。 FIG. 20 is a flowchart (No. 1) showing an example of the teacher data generation processing procedure of the information processing apparatus 101.
図２１は、情報処理装置１０１の教師データ生成処理手順の一例を示すフローチャート（その２）である。 FIG. 21 is a flowchart (No. 2) showing an example of the teacher data generation processing procedure of the information processing apparatus 101.
図２２は、情報処理装置１０１の回答出力処理手順の一例を示すフローチャート（その１）である。 FIG. 22 is a flowchart (No. 1) showing an example of the answer output processing procedure of the information processing apparatus 101.
図２３は、情報処理装置１０１の回答出力処理手順の一例を示すフローチャート（その２）である。 FIG. 23 is a flowchart (No. 2) showing an example of the answer output processing procedure of the information processing apparatus 101.

以下に図面を参照して、本発明にかかる回答出力プログラム、回答出力方法および情報処理装置の実施の形態を詳細に説明する。 Hereinafter, embodiments of the answer output program, the answer output method, and the information processing apparatus according to the present invention will be described in detail with reference to the drawings.

（実施の形態） (Embodiment)
図１は、実施の形態にかかる回答出力方法の一実施例を示す説明図である。図１において、情報処理装置１０１は、質問に対応する回答候補を出力するコンピュータである。質問は、何らかの問題の解決方法を問いただすためのものである。質問は、例えば、商品やサービスについての質問である。質問は、単語または複数の単語の組み合わせによって表現されてもよいし、１または複数の文章によって表現されてもよい。 FIG. 1 is an explanatory diagram showing an example of the answer output method according to the embodiment. In FIG. 1, the information processing apparatus 101 is a computer that outputs answer candidates corresponding to a question. A question asks how to solve some problem. The question is, for example, a question about a product or service. The question may be expressed as a word or a combination of words, or as one or more sentences.

また、回答候補は、問題を解決するための情報である。すなわち、質問に対応する回答候補は、質問された問題の解決方法を示す回答の候補である。回答候補は、例えば、ＦＡＱである。ＦＡＱは、「頻繁に尋ねられる質問」の略であり、あらかじめ予想される質問に対して、質問と回答をまとめたものである。 In addition, the answer candidate is information for solving the problem. That is, the answer candidate corresponding to the question is the answer candidate indicating the solution method of the questioned problem. The answer candidate is, for example, FAQ. FAQ is an abbreviation for "Frequently Asked Questions" and is a collection of questions and answers to expected questions in advance.

ここで、コールセンタやサポートセンタでは、オペレータのスキル平準化や回答リードタイムの短縮化のため、ＦＡＱシステムが導入されていることが多い。ＦＡＱシステムを適切に運用すると、一次回答率や顧客満足度の向上につながるため、コールセンタ等においてＦＡＱシステムは重要な位置付けにある。 Here, in call centers and support centers, FAQ systems are often introduced in order to level the skills of operators and shorten the response lead time. Proper operation of the FAQ system leads to improvements in the primary response rate and customer satisfaction, so the FAQ system is in an important position in call centers and the like.

また、企業の中には、インターネット上に自社のＦＡＱを公開しているところもある。社外にＦＡＱサイトを公開することで、顧客自身がＦＡＱを閲覧できるようになるため、顧客の利便性の向上やコールセンタ等への問い合わせの削減が期待できる。 In addition, some companies publish their FAQs on the Internet. By disclosing the FAQ site outside the company, customers can browse the FAQ themselves, which can be expected to improve customer convenience and reduce inquiries to call centers and the like.

一方で、ＦＡＱの内容は陳腐化しやすいため、定期的な見直しを行うことが望ましいが、ＦＡＱの見直しにかかる運用負荷は大きい。例えば、ＦＡＱマスタに登録されるＦＡＱの数は、数千～数万個程度となることもあり、それらＦＡＱの内容を短期間に人手で見直すのは非常に負荷がかかる。 On the other hand, since the contents of the FAQ tend to become obsolete, it is desirable to review it regularly, but the operational load required for reviewing the FAQ is heavy. For example, the number of FAQs registered in the FAQ master may be several thousand to tens of thousands, and it is extremely burdensome to manually review the contents of those FAQs in a short period of time.

したがって、質問に対して、現状のＦＡＱの中からより適切なＦＡＱを厳選して提示することは重要である。しかし、ＦＡＱへのアクセス数の多い順や、入力されたキーワードや文章との類似度が高い順にＦＡＱを並び替えただけでは、利用者が期待する並びになっていない、すなわち、利用者が知りたいＦＡＱが上位に表示されないことが多い。 Therefore, it is important to carefully select and present the more appropriate FAQs from among the current FAQs in response to a question. However, simply sorting the FAQs in descending order of access count, or in descending order of similarity to the entered keywords or sentences, often does not produce the order the user expects; that is, the FAQ the user wants to see is often not displayed near the top.

また、チャットボットのようなサービスでは、入力された質問に対して一度に表示する検索結果（ＦＡＱ）の数は少なく、より検索精度の高いＦＡＱの絞り込みや並び替えが求められる。例えば、コールセンタのオペレータが使用するＦＡＱシステムや、社外に公開されるＦＡＱサイトの場合、一度に表示する検索結果の数は３０個程度である。これに対して、チャットボットの場合、一度に表示する検索結果の数は５個程度である。 Further, in a service such as a chatbot, the number of search results (FAQs) displayed at one time for an input question is small, and it is required to narrow down and sort FAQs with higher search accuracy. For example, in the case of an FAQ system used by a call center operator or an FAQ site published outside the company, the number of search results displayed at one time is about 30. On the other hand, in the case of a chatbot, the number of search results displayed at one time is about five.

そこで、本実施の形態では、質問に対して出力する回答候補を最適化する回答出力方法について説明する。以下、情報処理装置１０１の処理例について説明する。 Therefore, in the present embodiment, an answer output method for optimizing the answer candidates to be output for the question will be described. Hereinafter, a processing example of the information processing apparatus 101 will be described.

（１）情報処理装置１０１は、質問の入力を受け付ける。入力される質問は、単語または複数の単語の組み合わせであってもよいし、１または複数の文章であってもよい。質問を入力するユーザは、例えば、コールセンタ等のオペレータや、ＦＡＱサイトを利用する利用者である。図１の例では、質問「ＸＸＸ」の入力を受け付けた場合を想定する。 (1) The information processing apparatus 101 accepts the input of a question. The question entered may be a word or a combination of words, or may be one or more sentences. The user who inputs the question is, for example, an operator such as a call center or a user who uses the FAQ site. In the example of FIG. 1, it is assumed that the input of the question "XXX" is accepted.

（２）情報処理装置１０１は、受け付けた質問に対応する複数の回答候補を特定する。回答候補は、例えば、ＦＡＱである。質問に対応する回答候補は、例えば、ＦＡＱシステムやＦＡＱサイトにおいて採用されている既存の検索アルゴリズムを用いて検索される、質問に対する検索結果（ＦＡＱ）である。 (2) The information processing apparatus 101 identifies a plurality of answer candidates corresponding to the received questions. The answer candidate is, for example, FAQ. The answer candidate corresponding to the question is, for example, a search result (FAQ) for the question, which is searched by using an existing search algorithm adopted in the FAQ system or the FAQ site.

具体的には、例えば、情報処理装置１０１は、受け付けた質問を形態素解析して形態素に分解する。つぎに、情報処理装置１０１は、ＦＡＱマスタ（例えば、後述の図２に示すＦＡＱマスタ２２０）から、所定の検索条件にしたがって、分解した形態素に対応するＦＡＱを検索する。そして、情報処理装置１０１は、検索した検索結果（ＦＡＱ）を、受け付けた質問に対応する複数の回答候補として特定する。 Specifically, for example, the information processing apparatus 101 performs morphological analysis on the received question to decompose it into morphemes. Next, the information processing apparatus 101 searches the FAQ master (for example, the FAQ master 220 shown in FIG. 2 described later) for FAQs corresponding to the decomposed morphemes according to a predetermined search condition. Then, the information processing apparatus 101 identifies the retrieved search results (FAQs) as the plurality of answer candidates corresponding to the received question.
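As a rough illustration of this identification step, the following sketch tokenizes a question and retrieves matching FAQs. It rests on several assumptions: lowercase word tokens stand in for morphemes (Japanese text would need an actual morphological analyzer), the "predetermined search condition" is taken to be "shares at least one token", and FAQ_MASTER, to_morphemes and search_faqs are hypothetical names with made-up sample data.

```python
import re

# Hypothetical miniature FAQ master: FAQ number -> question text.
FAQ_MASTER = {
    1: "how to reset the pc to its initial state",
    2: "how to connect the pc to a wireless network",
    3: "how to change the login password",
}


def to_morphemes(question):
    """Stand-in for morphological analysis: lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", question.lower())


def search_faqs(question, faq_master):
    """Return FAQ numbers whose question shares at least one token with the
    input, ordered by the number of shared tokens (ties by FAQ number)."""
    query = set(to_morphemes(question))
    hits = []
    for faq_id, text in faq_master.items():
        overlap = len(query & set(to_morphemes(text)))
        if overlap > 0:
            hits.append((faq_id, overlap))
    hits.sort(key=lambda pair: (-pair[1], pair[0]))
    return [faq_id for faq_id, _ in hits]


print(search_faqs("reset PC", FAQ_MASTER))  # [1, 2]
```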

図１の例では、受け付けた質問「ＸＸＸ」に対応する複数の回答候補として、回答候補１～１０が特定された場合を想定する。なお、情報処理装置１０１は、受け付けた質問に対応する複数の回答候補を、他のコンピュータから取得することにしてもよい。 In the example of FIG. 1, it is assumed that answer candidates 1 to 10 are specified as a plurality of answer candidates corresponding to the received question "XXX". The information processing apparatus 101 may acquire a plurality of answer candidates corresponding to the received questions from another computer.

（３）情報処理装置１０１は、記憶部１１０を参照して、特定した複数の回答候補の中から、受け付けた質問に対する１または複数の回答候補を決定する。具体的には、例えば、情報処理装置１０１は、記憶部１１０の記憶内容に基づく機械学習を行って、複数の回答候補の中から、質問に対する１または複数の回答候補を決定する。 (3) The information processing apparatus 101 refers to the storage unit 110 and determines one or a plurality of answer candidates for the received question from the specified plurality of answer candidates. Specifically, for example, the information processing apparatus 101 performs machine learning based on the stored contents of the storage unit 110, and determines one or a plurality of answer candidates for the question from the plurality of answer candidates.

ここで、機械学習とは、コンピュータが、データから学習して、パターンや傾向を導き出し、将来予測や意思決定を可能にすることである。記憶部１１０は、質問に対する選択操作を受け付けた１または複数の第１の回答候補と、質問に対する選択操作を受け付けなかった１または複数の第２の回答候補とを記憶する。すなわち、記憶部１１０は、機械学習に用いる教師データを記憶する。 Here, machine learning means that a computer learns from data to derive patterns and trends, and enables future prediction and decision making. The storage unit 110 stores one or a plurality of first answer candidates that have accepted the selection operation for the question, and one or a plurality of second answer candidates that have not accepted the selection operation for the question. That is, the storage unit 110 stores the teacher data used for machine learning.

より詳細に説明すると、記憶部１１０は、質問に対する選択操作を受け付けた第１の回答候補を当該質問に対応付けた教師データ（よゐこ）を記憶する。また、記憶部１１０は、質問に対する選択操作を受け付けなかった第２の回答候補を当該質問に対応付けた教師データ（わるいこ）を記憶する。 More specifically, the storage unit 110 stores teacher data (yoiko) in which a first answer candidate that received a selection operation for a question is associated with that question. The storage unit 110 also stores teacher data (waruiko) in which a second answer candidate that did not receive a selection operation for a question is associated with that question.

第１の回答候補は、例えば、質問に対する上位Ｎ件の回答候補のうち選択操作を受け付けた回答候補であってもよい。また、第２の回答候補は、例えば、質問に対する上位Ｎ件の回答候補のうち選択操作を受け付けなかった回答候補であってもよい。 The first answer candidate may be, for example, an answer candidate that received a selection operation among the top N answer candidates for a question. The second answer candidate may be, for example, an answer candidate that did not receive a selection operation among the top N answer candidates for the question.

所定数Ｎは、任意に設定可能である。所定数Ｎは、例えば、質問に対する回答候補を表示する画面の１ページ目に表示される回答候補の上限数に設定される。ＦＡＱシステムやＦＡＱサイトでは、所定数Ｎは、３０程度の値に設定される。チャットボットでは、所定数Ｎは、５程度の値に設定される。 The predetermined number N can be arbitrarily set. The predetermined number N is set to, for example, the upper limit number of answer candidates displayed on the first page of the screen for displaying the answer candidates for the question. In the FAQ system and FAQ site, the predetermined number N is set to a value of about 30. In the chatbot, the predetermined number N is set to a value of about 5.

すなわち、第１の回答候補は、ユーザからの質問に対する検索結果の１ページ目に表示され、かつ、ユーザによって照会（選択）された回答候補である。一方、第２の回答候補は、ユーザからの質問に対する検索結果の１ページ目に表示されたにもかかわらず、ユーザによって照会（選択）されなかった回答候補である。 That is, the first answer candidate is an answer candidate displayed on the first page of the search result for the question from the user and inquired (selected) by the user. On the other hand, the second answer candidate is an answer candidate that is displayed on the first page of the search result for the question from the user but is not inquired (selected) by the user.

第１の回答候補を質問に対応付けた教師データを用いて機械学習することで、１ページ目に表示された際に選択される可能性が高い回答候補の特徴を導き出す。また、第２の回答候補を質問に対応付けた教師データを用いて機械学習することで、１ページ目に表示されても選択されない可能性が高い回答候補の特徴を導き出す。 By machine learning the first answer candidate using the teacher data associated with the question, the characteristics of the answer candidate that are likely to be selected when displayed on the first page are derived. Further, by machine learning the second answer candidate using the teacher data associated with the question, the characteristics of the answer candidate that are likely not to be selected even if they are displayed on the first page are derived.
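The two kinds of teacher data can each be turned into a per-candidate score. The passage above does not name a learning algorithm, so the sketch below substitutes a simple frequency count purely for illustration: class_probabilities is a hypothetical helper, teacher records are assumed to be (FAQ number, morphemes) pairs, and a record is assumed to "match" when it shares at least one morpheme with the query. Running it on the yoiko data gives a stand-in for the first probability, and on the waruiko data a stand-in for the second.

```python
from collections import Counter


def class_probabilities(query_morphemes, teacher_records):
    """Estimate P(faq | query) from (faq_id, morphemes) teacher records.

    A record counts as matching when it shares at least one morpheme with
    the query. Returns {faq_id: share of matching records}.
    """
    query = set(query_morphemes)
    counts = Counter(
        faq_id for faq_id, morphemes in teacher_records if query & set(morphemes)
    )
    total = sum(counts.values())
    return {faq_id: n / total for faq_id, n in counts.items()} if total else {}


# Hypothetical teacher data: selected (yoiko) and not selected (waruiko).
yoiko = [(1, ["pc", "reset"]), (1, ["pc", "initial"]), (3, ["pc", "slow"]), (2, ["password"])]
waruiko = [(2, ["pc", "reset"]), (5, ["pc"]), (2, ["pc", "slow"]), (5, ["printer"])]

first = class_probabilities(["pc"], yoiko)     # FAQ 1 dominates the selected data
second = class_probabilities(["pc"], waruiko)  # FAQ 2 dominates the ignored data
```

Keeping the two estimates separate mirrors the two viewpoints in the text: a candidate can score well on "selected" data and still be flagged by the "not selected" data.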

図１の例では、質問「ＸＸＸ」に対応する回答候補１～１０のうち、回答候補１～５が、１ページ目に表示された際に選択される可能性が高い回答候補として特定されたとする。また、回答候補２，５が、１ページ目に表示されても選択されない可能性が高い回答候補として特定されたとする。この場合、情報処理装置１０１は、例えば、回答候補１～５から回答候補２，５を除いた回答候補１，３，４を、質問「ＸＸＸ」に対する回答候補に決定する。 In the example of FIG. 1, suppose that, among the answer candidates 1 to 10 corresponding to the question "XXX", answer candidates 1 to 5 are identified as answer candidates that are likely to be selected when displayed on the first page. Suppose also that answer candidates 2 and 5 are identified as answer candidates that are likely not to be selected even if displayed on the first page. In this case, the information processing apparatus 101 determines, for example, answer candidates 1, 3, and 4, obtained by removing answer candidates 2 and 5 from answer candidates 1 to 5, as the answer candidates for the question "XXX".

（４）情報処理装置１０１は、決定した１または複数の回答候補を出力する。図１の例では、情報処理装置１０１は、受け付けた質問「ＸＸＸ」に対する回答候補として、決定した回答候補１，３，４を出力する。この際、情報処理装置１０１は、例えば、回答候補１，３，４を、１ページ目に表示された際に選択される可能性が高い順にソートして出力することにしてもよい。 (4) The information processing apparatus 101 outputs one or a plurality of determined answer candidates. In the example of FIG. 1, the information processing apparatus 101 outputs the determined answer candidates 1, 3 and 4 as answer candidates for the received question "XXX". At this time, the information processing apparatus 101 may sort and output, for example, the answer candidates 1, 3 and 4 in the order in which they are likely to be selected when they are displayed on the first page.
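The FIG. 1 example reduces to an order-preserving set difference, which the following sketch reproduces. The function name decide_fig1 is hypothetical; the candidate numbers are the ones given in the example.

```python
def decide_fig1(likely_selected, likely_not_selected):
    """Keep likely-selected candidates that are not also likely to be
    ignored, preserving their original order."""
    excluded = set(likely_not_selected)
    return [c for c in likely_selected if c not in excluded]


answers = decide_fig1([1, 2, 3, 4, 5], [2, 5])
print(answers)  # [1, 3, 4]
```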

このように、情報処理装置１０１によれば、過去のユーザの操作履歴をもとに、質問に対して表示された際に選択（照会）される可能性の高さと、質問に対して表示された際に選択（照会）されない可能性の高さという異なる２つの観点から、質問との関連性の強い回答候補と、質問との関連性が弱い回答候補とを特定することができる。例えば、質問に対して表示された際に選択（照会）される可能性が高ければ、質問との関連性が強く、問題解決につながる可能性が高い回答候補といえる。一方、質問に対して表示された際に選択（照会）されない可能性が高ければ、質問との関連性が弱く、問題解決につながる可能性が低い回答候補といえる。これにより、質問に対して表示された際に照会されて問題解決につながるような回答候補を精度よく絞り込んで、質問に対して出力する回答候補を最適化することができる。 In this way, the information processing apparatus 101 can use the past operation history of users to identify answer candidates that are strongly related to a question and answer candidates that are weakly related to it, from two different viewpoints: how likely a candidate is to be selected (inquired) when displayed for the question, and how likely it is not to be selected (inquired) when displayed for the question. For example, an answer candidate that is likely to be selected (inquired) when displayed for a question can be regarded as strongly related to the question and likely to lead to solving the problem. Conversely, an answer candidate that is likely not to be selected (inquired) when displayed for a question can be regarded as weakly related to the question and unlikely to lead to solving the problem. This makes it possible to accurately narrow the output down to answer candidates that will be inquired when displayed and lead to solving the problem, thereby optimizing the answer candidates output for the question.

（回答出力システム２００のシステム構成例） (System configuration example of answer output system 200)
つぎに、図１に示した情報処理装置１０１を含む回答出力システム２００のシステム構成例について説明する。回答出力システム２００は、例えば、コールセンタ等のＦＡＱシステムに適用されてもよく、また、インターネット上に公開されるＦＡＱサイトに適用されてもよい。 Next, a system configuration example of the answer output system 200 including the information processing apparatus 101 shown in FIG. 1 will be described. The answer output system 200 may be applied to, for example, an FAQ system of a call center or the like, or to an FAQ site published on the Internet.

図２は、回答出力システム２００のシステム構成例を示す説明図である。図２において、回答出力システム２００は、情報処理装置１０１と、複数の端末２０１（図２の例では、３台）と、を含む。回答出力システム２００において、情報処理装置１０１および複数の端末２０１は、有線または無線のネットワーク２１０を介して接続される。ネットワーク２１０は、例えば、ＬＡＮ（ＬｏｃａｌＡｒｅａＮｅｔｗｏｒｋ）、ＷＡＮ（ＷｉｄｅＡｒｅａＮｅｔｗｏｒｋ）、インターネットなどである。 FIG. 2 is an explanatory diagram showing a system configuration example of the answer output system 200. In FIG. 2, the answer output system 200 includes the information processing apparatus 101 and a plurality of terminals 201 (three in the example of FIG. 2). In the answer output system 200, the information processing apparatus 101 and the plurality of terminals 201 are connected via a wired or wireless network 210. The network 210 is, for example, a LAN (Local Area Network), a WAN (Wide Area Network), the Internet, or the like.

ここで、情報処理装置１０１は、質問に対応する回答候補を出力する。以下の説明では、回答候補として「ＦＡＱ」を例に挙げて説明する場合がある。また、情報処理装置１０１は、ＦＡＱマスタ２２０、アクセスログＤＢ（Ｄａｔａｂａｓｅ）２３０、教師データ（よゐこ）ＤＢ２４０、教師データ（わるいこ）ＤＢ２５０および教師データ（まいご）ＤＢ２６０を有する。情報処理装置１０１は、例えば、サーバである。 Here, the information processing apparatus 101 outputs answer candidates corresponding to a question. In the following description, "FAQ" may be used as an example of an answer candidate. The information processing apparatus 101 has an FAQ master 220, an access log DB (Database) 230, a teacher data (Yoiko) DB 240, a teacher data (waruiko) DB 250, and a teacher data (Maigo) DB 260. The information processing apparatus 101 is, for example, a server.

なお、各種ＤＢ２２０，２３０，２４０，２５０，２６０等の記憶内容については、図４～図８を用いて後述する。 The stored contents of various DBs 220, 230, 240, 250, 260 and the like will be described later with reference to FIGS. 4 to 8.

端末２０１は、回答出力システム２００のユーザが使用するコンピュータである。回答出力システム２００のユーザは、例えば、コールセンタのオペレータや、ＦＡＱサイトの利用者である。端末２０１は、例えば、ＰＣ（ＰｅｒｓｏｎａｌＣｏｍｐｕｔｅｒ）、スマートフォン、タブレット型ＰＣなどである。 The terminal 201 is a computer used by the user of the answer output system 200. The user of the answer output system 200 is, for example, a call center operator or a FAQ site user. The terminal 201 is, for example, a PC (Personal Computer), a smartphone, a tablet-type PC, or the like.

（情報処理装置１０１のハードウェア構成例） (Hardware configuration example of information processing apparatus 101)
図３は、情報処理装置１０１のハードウェア構成例を示すブロック図である。図３において、情報処理装置１０１は、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）３０１と、メモリ３０２と、Ｉ／Ｆ（Ｉｎｔｅｒｆａｃｅ）３０３と、ディスクドライブ３０４と、ディスク３０５と、を有する。また、各構成部は、バス３００によってそれぞれ接続される。 FIG. 3 is a block diagram showing a hardware configuration example of the information processing apparatus 101. In FIG. 3, the information processing apparatus 101 includes a CPU (Central Processing Unit) 301, a memory 302, an I/F (Interface) 303, a disk drive 304, and a disk 305. The components are connected to one another by a bus 300.

ここで、ＣＰＵ３０１は、情報処理装置１０１の全体の制御を司る。メモリ３０２は、例えば、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）およびフラッシュＲＯＭなどを有する。具体的には、例えば、フラッシュＲＯＭがＯＳ（ＯｐｅｒａｔｉｎｇＳｙｓｔｅｍ）のプログラムを記憶し、ＲＯＭがアプリケーションプログラムを記憶し、ＲＡＭがＣＰＵ３０１のワークエリアとして使用される。メモリ３０２に記憶されるプログラムは、ＣＰＵ３０１にロードされることで、コーディングされている処理をＣＰＵ３０１に実行させる。 Here, the CPU 301 controls the entire information processing apparatus 101. The memory 302 includes, for example, a ROM (Read Only Memory), a RAM (Random Access Memory), a flash ROM, and the like. Specifically, for example, the flash ROM stores the OS (Operating System) program, the ROM stores the application program, and the RAM is used as the work area of the CPU 301. The program stored in the memory 302 is loaded into the CPU 301 to cause the CPU 301 to execute the coded process.

Ｉ／Ｆ３０３は、通信回線を通じてネットワーク２１０に接続され、ネットワーク２１０を介して外部のコンピュータ（例えば、図２に示した端末２０１）に接続される。そして、Ｉ／Ｆ３０３は、ネットワーク２１０と装置内部とのインターフェースを司り、外部のコンピュータからのデータの入出力を制御する。Ｉ／Ｆ３０３には、例えば、モデムやＬＡＮアダプタなどを採用することができる。 The I / F 303 is connected to the network 210 through a communication line, and is connected to an external computer (for example, the terminal 201 shown in FIG. 2) via the network 210. The I / F 303 controls the interface between the network 210 and the inside of the device, and controls the input / output of data from an external computer. For the I / F 303, for example, a modem, a LAN adapter, or the like can be adopted.

ディスクドライブ３０４は、ＣＰＵ３０１の制御に従ってディスク３０５に対するデータのリード／ライトを制御する。ディスク３０５は、ディスクドライブ３０４の制御で書き込まれたデータを記憶する。ディスク３０５としては、例えば、磁気ディスク、光ディスクなどが挙げられる。 The disk drive 304 controls data read / write to the disk 305 according to the control of the CPU 301. The disk 305 stores the data written under the control of the disk drive 304. Examples of the disk 305 include a magnetic disk and an optical disk.

なお、情報処理装置１０１は、上述した構成部のほかに、例えば、ＳＳＤ（ＳｏｌｉｄＳｔａｔｅＤｒｉｖｅ）、入力装置、ディスプレイ等を有することにしてもよい。また、図２に示した端末２０１についても、情報処理装置１０１と同様のハードウェア構成により実現することができる。ただし、端末２０１は、上述した構成部のほかに、入力装置、ディスプレイ等を有する。 In addition to the above-mentioned components, the information processing device 101 may include, for example, an SSD (Solid State Drive), an input device, a display, and the like. Further, the terminal 201 shown in FIG. 2 can also be realized by the same hardware configuration as the information processing apparatus 101. However, the terminal 201 has an input device, a display, and the like in addition to the above-mentioned components.

（各種ＤＢ２２０，２３０，２４０，２５０，２６０等の記憶内容） (Stored contents of the various DBs 220, 230, 240, 250, 260, etc.)
つぎに、図４～図８を用いて、情報処理装置１０１が有する各種ＤＢ２２０，２３０，２４０，２５０，２６０等の記憶内容について説明する。 Next, the stored contents of the various DBs 220, 230, 240, 250, 260 and the like included in the information processing apparatus 101 will be described with reference to FIGS. 4 to 8.

図４は、ＦＡＱマスタ２２０の記憶内容の一例を示す説明図である。図４において、ＦＡＱマスタ２２０は、ＦＡＱ番号、質問および回答のフィールドを有し、各フィールドに情報を設定することで、ＦＡＱ（例えば、ＦＡＱ４００－１，４００－２）をレコードとして記憶する。 FIG. 4 is an explanatory diagram showing an example of the stored contents of the FAQ master 220. In FIG. 4, the FAQ master 220 has fields for FAQ number, question, and answer, and stores FAQ (for example, FAQ400-1,400-2) as a record by setting information in each field.

ここで、ＦＡＱ番号は、ＦＡＱを一意に識別する識別子である。ＦＡＱ４００－＃の「＃」は、ＦＡＱ番号に対応する。質問は、あらかじめ予想される質問である。回答は、あらかじめ予想される質問に対する回答である。例えば、ＦＡＱ４００－１は、質問「ＰＣを初期状態に戻す方法を教えてください。」と、回答「トラブルが発生し正常に動作しなくなった場合などに、リカバリする方法は、次のとおりです。・・・」と、をまとめたものである。 Here, the FAQ number is an identifier that uniquely identifies an FAQ. The "#" in FAQ 400-# corresponds to the FAQ number. The question is a question anticipated in advance. The answer is the answer to that anticipated question. For example, FAQ 400-1 combines the question "Please tell me how to return the PC to its initial state." with the answer "If a problem occurs and the PC does not operate normally, the recovery method is as follows. ...".
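One way to picture a record of the FAQ master 220 is as a small immutable structure keyed by FAQ number. The sketch below is an assumption about representation only: the class name Faq and the dictionary faq_master are hypothetical, and the answer text is left truncated exactly as it is in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Faq:
    faq_number: int  # identifier that uniquely identifies the FAQ
    question: str    # question anticipated in advance
    answer: str      # answer to that anticipated question


# FAQ 400-1 from the description; the answer is elided in the source.
faq_400_1 = Faq(
    faq_number=1,
    question="Please tell me how to return the PC to its initial state.",
    answer="If a problem occurs and the PC does not operate normally, "
           "the recovery method is as follows. ...",
)

# Keying by FAQ number gives direct lookup of a record.
faq_master = {faq_400_1.faq_number: faq_400_1}
```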

図５は、アクセスログＤＢ２３０の記憶内容の一例を示す説明図である。図５において、アクセスログＤＢ２３０は、セッション番号、日時、タイプ、検索ワード、照会ＦＡＱ、順位および検索リストのフィールドを有する。各フィールドに情報を設定することで、アクセスログ、例えば、アクセスログ５００－１～５００－３がレコードとして記憶される。 FIG. 5 is an explanatory diagram showing an example of the stored contents of the access log DB 230. In FIG. 5, the access log DB 230 has fields for session number, date and time, type, search word, inquiry FAQ, rank, and search list. By setting information in each field, access logs, for example, access logs 500-1 to 500-3 are stored as records.

ここで、セッション番号は、情報処理装置１０１と端末２０１とのセッションを一意に識別する識別子である。セッションは、２点間（装置間）の通信において、情報をやり取りするために設定する論理的な接続関係である。日時は、端末２０１において操作が行われた日時である。 Here, the session number is an identifier that uniquely identifies the session between the information processing apparatus 101 and the terminal 201. A session is a logical connection relationship set for exchanging information in communication between two points (devices). The date and time is the date and time when the operation was performed on the terminal 201.

タイプは、端末２０１において行われた操作のタイプである。タイプとしては、例えば、検索、照会などが挙げられる。タイプ「検索」は、検索ワードに対応するＦＡＱを検索する操作を表す。タイプ「照会」は、検索ワードに対するＦＡＱを選択する操作、すなわち、ＦＡＱの内容を照会する操作を表す。 The type is the type of operation performed at the terminal 201. Types include, for example, search, query, and the like. The type "search" represents an operation to search for the FAQ corresponding to the search word. The type "query" represents an operation of selecting an FAQ for a search word, that is, an operation of querying the contents of the FAQ.

検索ワードは、端末２０１において入力される質問に相当する。検索ワードは、単語または複数の単語の組み合わせであってもよいし、１または複数の文章であってもよい。照会ＦＡＱは、検索ワードに対する選択操作を受け付けたＦＡＱ、すなわち、照会されたＦＡＱのＦＡＱ番号である。 The search word corresponds to a question input in the terminal 201. The search word may be a word or a combination of a plurality of words, or may be one or a plurality of sentences. The inquiry FAQ is a FAQ that has received a selection operation for a search word, that is, the FAQ number of the inquired FAQ.

順位は、検索ワードに対してＦＡＱが表示された際に、当該ＦＡＱが上から何番目に表示されるかを示す順位である。検索リストは、検索ワードに基づきＦＡＱマスタ２２０（例えば、図４参照）から検索されたＦＡＱのＦＡＱ番号をリスト化したものである。 The ranking is a ranking indicating the order in which the FAQ is displayed from the top when the FAQ is displayed for the search word. The search list is a list of FAQ numbers of FAQs searched from the FAQ master 220 (see, for example, FIG. 4) based on the search word.
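The fields of the access log DB 230 map naturally onto a record type. The sketch below is illustrative only: AccessLog and its attribute names are hypothetical, the type labels follow the English rendering used above ("search" and "query"), and every sample value (session number, timestamps, FAQ numbers) is invented for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class AccessLog:
    session_number: str                # identifies the apparatus-terminal session
    timestamp: datetime                # when the operation was performed
    op_type: str                       # "search" or "query"
    search_word: str                   # the question entered at the terminal
    queried_faq: Optional[int] = None  # FAQ number, set only for "query" records
    rank: Optional[int] = None         # display position of the queried FAQ
    search_list: List[int] = field(default_factory=list)  # FAQ numbers found


# Hypothetical session: one search followed by a query of the second result.
logs = [
    AccessLog("S001", datetime(2017, 10, 1, 9, 0, 0), "search", "reset PC",
              search_list=[7, 3, 9]),
    AccessLog("S001", datetime(2017, 10, 1, 9, 0, 20), "query", "reset PC",
              queried_faq=3, rank=2),
]

queried = [log.queried_faq for log in logs if log.op_type == "query"]
```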

図６は、教師データ（よゐこ）ＤＢ２４０の記憶内容の一例を示す説明図である。図６において、教師データ（よゐこ）ＤＢ２４０は、日時、照会ＦＡＱおよび形態素解析後検索ワードのフィールドを有する。各フィールドに情報を設定することで、教師データ（よゐこ）、例えば、教師データ（よゐこ）６００－１，６００－２がレコードとして記憶される。 FIG. 6 is an explanatory diagram showing an example of the stored contents of the teacher data (Yoiko) DB 240. In FIG. 6, the teacher data (Yoiko) DB 240 has fields for date and time, inquiry FAQ, and search word after morphological analysis. By setting information in each field, teacher data (Yoiko), for example, teacher data (Yoiko) 600-1,600-2 is stored as a record.

ここで、日時は、照会ＦＡＱが照会された日時である。照会ＦＡＱは、検索ワードに対して１ページ目に表示された際に選択操作を受け付けたＦＡＱ、すなわち、照会されたＦＡＱのＦＡＱ番号である。１ページ目に表示されるＦＡＱは、検索ワードに対するＦＡＱのうち上位Ｎ件のＦＡＱである。形態素解析後検索ワードは、照会ＦＡＱが照会されたときの検索ワードを形態素解析して検出された１または複数の形態素である。 Here, the date and time is the date and time when the inquiry FAQ is inquired. The inquiry FAQ is a FAQ that has received a selection operation when it is displayed on the first page of the search word, that is, the FAQ number of the inquired FAQ. The FAQ displayed on the first page is the top N FAQs among the FAQs for the search word. The search word after morphological analysis is one or more morphemes detected by morphological analysis of the search word when the inquiry FAQ is queried.

図７は、教師データ（わるいこ）ＤＢ２５０の記憶内容の一例を示す説明図である。図７において、教師データ（わるいこ）ＤＢ２５０は、日時、非照会ＦＡＱおよび形態素解析後検索ワードのフィールドを有する。各フィールドに情報を設定することで、教師データ（わるいこ）、例えば、教師データ（わるいこ）７００－１～７００－３がレコードとして記憶される。 FIG. 7 is an explanatory diagram showing an example of the stored contents of the teacher data (wariko) DB 250. In FIG. 7, the teacher data (wariko) DB 250 has fields for date and time, non-query FAQ, and search word after morphological analysis. By setting information in each field, teacher data (wariko), for example, teacher data (wariko) 700-1 to 700-3 are stored as records.

ここで、日時は、非照会ＦＡＱが照会されなかった日時、例えば、非照会ＦＡＱが照会されずに情報処理装置１０１と端末２０１とのセッションが切断された日時である。非照会ＦＡＱは、検索ワードに対して１ページ目に表示された際に選択操作を受け付けなかったＦＡＱ、すなわち、照会されなかったＦＡＱのＦＡＱ番号である。形態素解析後検索ワードは、非照会ＦＡＱが照会されなかったときの検索ワードを形態素解析して検出された１または複数の形態素である。 Here, the date and time is the date and time when the non-inquiry FAQ was not inquired, for example, the date and time when the session between the information processing apparatus 101 and the terminal 201 was disconnected without inquiring the non-inquiry FAQ. The non-inquiry FAQ is a FAQ that did not accept the selection operation when it was displayed on the first page for the search word, that is, the FAQ number of the FAQ that was not inquired. The post-morphological analysis search word is one or more morphemes detected by morphological analysis of the search word when the non-query FAQ is not queried.

図８は、教師データ（まいご）ＤＢ２６０の記憶内容の一例を示す説明図である。図８において、教師データ（まいご）ＤＢ２６０は、日時、代替ワードおよび検索ワードのフィールドを有する。各フィールドに情報を設定することで、教師データ（まいご）、例えば、教師データ（まいご）８００－１がレコードとして記憶される。 FIG. 8 is an explanatory diagram showing an example of the stored contents of the teacher data (maigo) DB 260. In FIG. 8, the teacher data (maigo) DB 260 has fields for date and time, alternative words, and search words. By setting information in each field, teacher data (maigo), for example, teacher data (maigo) 800-1 is stored as a record.

ここで、日時は、検索ワードに対してＦＡＱマスタ２２０から１件もＦＡＱが検索されなかった日時、例えば、当該検索ワードに対応するＦＡＱを検索する操作が行われた日時である。代替ワードは、１件もＦＡＱが表示されなかった検索ワードに後続して入力を受け付けた他の検索ワードであって、当該他の検索ワードに対して１ページ目に表示されたいずれかのＦＡＱが選択された検索ワードである。検索ワードは、ＦＡＱマスタ２２０から１件もＦＡＱが検索されなかった質問である。検索ワードは、例えば、検索ワードを形態素解析して検出された１または複数の形態素によって表現されてもよい。 Here, the date and time is the date and time when no FAQ was retrieved from the FAQ master 220 for the search word, for example, the date and time when the operation of searching for the FAQ corresponding to the search word was performed. The alternative word is another search word whose input was accepted after a search word for which no FAQ was displayed, and for which one of the FAQs displayed on the first page was selected. The search word is a question for which no FAQ was retrieved from the FAQ master 220. The search word may be represented by, for example, one or more morphemes detected by morphological analysis of the search word.

（ＦＡＱ画面９００の画面例） (Screen example of FAQ screen 900)
つぎに、端末２０１に表示されるＦＡＱ画面９００の画面例について説明する。以下の説明では、端末２０１に表示される操作画面のボックス、ボタン等をユーザが選択する操作として、不図示の入力装置を用いたクリック操作を行う場合を例に挙げて説明する。 Next, a screen example of the FAQ screen 900 displayed on the terminal 201 will be described. In the following description, a case where a click operation using an input device (not shown) is performed as an operation for the user to select a box, a button, or the like on the operation screen displayed on the terminal 201 will be described as an example.

図９は、ＦＡＱ画面９００の画面例を示す説明図である。図９において、ＦＡＱ画面９００は、ＦＡＱを検索したり、ＦＡＱを照会したりする操作画面の一例である。ＦＡＱ画面９００において、ボックス９０１をクリックすると、検索ワードを入力することができる。図９の例では、検索ワード「ＦＪ１０初期化」が入力されている。 FIG. 9 is an explanatory diagram showing a screen example of the FAQ screen 900. In FIG. 9, the FAQ screen 900 is an example of an operation screen for searching the FAQ or inquiring about the FAQ. If you click the box 901 on the FAQ screen 900, you can enter a search word. In the example of FIG. 9, the search word "FJ10 initialization" is input.

また、ＦＡＱ画面９００において、ボタン９０２をクリックすると、ボックス９０１に入力された検索ワードに対応するＦＡＱを検索することができる。具体的には、ボタン９０２をクリックすると、ボックス９０１に入力された検索ワードが、端末２０１から情報処理装置１０１に送信される。 Further, by clicking the button 902 on the FAQ screen 900, the FAQ corresponding to the search word input in the box 901 can be searched. Specifically, when the button 902 is clicked, the search word entered in the box 901 is transmitted from the terminal 201 to the information processing apparatus 101.

この結果、情報処理装置１０１から端末２０１に検索ワードに対する検索結果が送信され、検索されたＦＡＱが表示エリア９１０にリスト化されて表示される。この際、上位Ｎ件のＦＡＱは、１ページ目に表示される。それ以外のＦＡＱは、２ページ以降に表示される。Ｎは、任意に設定可能であり、５～３０程度の値に設定される。ここでは、Ｎは「Ｎ＝５」である。 As a result, the search result for the search word is transmitted from the information processing apparatus 101 to the terminal 201, and the searched FAQs are listed and displayed in the display area 910. At this time, the top N FAQs are displayed on the first page. Other FAQs are displayed on the second and subsequent pages. N can be arbitrarily set and is set to a value of about 5 to 30. Here, N is "N = 5".

図９の例では、検索ワード「ＦＪ１０初期化」に対応するＦＡＱが検索された結果、上位５件のＦＡＱ９１１～９１５が、１ページ目の検索結果として表示エリア９１０に表示されている。なお、ＦＡＱ画面９００において、ボタン９０３をクリックすると、２ページ目の検索結果を表示することができる。 In the example of FIG. 9, as a result of searching the FAQ corresponding to the search word "FJ10 initialization", the top five FAQs 911 to 915 are displayed in the display area 910 as the search result of the first page. If you click the button 903 on the FAQ screen 900, you can display the search results on the second page.

また、ＦＡＱ画面９００において、表示エリア９１０に表示されたいずれかのＦＡＱをクリックすると、当該ＦＡＱの内容を照会することができる。例えば、ＦＡＱ９１１をクリックすると、ＦＡＱ９１１の内容（質問と回答）を照会することができる。また、表示エリア９１０に表示されたいずれかのＦＡＱがクリックされると、当該ＦＡＱが選択されたことを示す選択結果が、端末２０１から情報処理装置１０１に送信される。 Further, by clicking any FAQ displayed in the display area 910 on the FAQ screen 900, the contents of the FAQ can be inquired. For example, by clicking FAQ911, the contents (question and answer) of FAQ911 can be inquired. Further, when any of the FAQs displayed in the display area 910 is clicked, the selection result indicating that the FAQ is selected is transmitted from the terminal 201 to the information processing apparatus 101.

（情報処理装置１０１の機能的構成例） (Example of functional configuration of information processing apparatus 101)
図１０は、情報処理装置１０１の機能的構成例を示すブロック図である。図１０において、情報処理装置１０１は、受付部１００１と、検索部１００２と、決定部１００３と、出力制御部１００４と、生成部１００５と、判定部１００６と、特定部１００７と、記憶部１０１０と、を含む。受付部１００１～特定部１００７は制御部となる機能であり、具体的には、例えば、図３に示したメモリ３０２、ディスク３０５などの記憶装置に記憶されたプログラムをＣＰＵ３０１に実行させることにより、または、Ｉ／Ｆ３０３により、その機能を実現する。各機能部の処理結果は、例えば、メモリ３０２、ディスク３０５などの記憶装置に記憶される。 FIG. 10 is a block diagram showing a functional configuration example of the information processing apparatus 101. In FIG. 10, the information processing apparatus 101 includes a reception unit 1001, a search unit 1002, a determination unit 1003, an output control unit 1004, a generation unit 1005, a determination unit 1006, a specific unit 1007, and a storage unit 1010. The reception unit 1001 to the specific unit 1007 are functions that serve as a control unit. Specifically, for example, each function is realized by causing the CPU 301 to execute a program stored in a storage device such as the memory 302 or the disk 305 shown in FIG. 3, or by the I/F 303. The processing result of each functional unit is stored in a storage device such as the memory 302 or the disk 305.

また、記憶部１０１０は、メモリ３０２、ディスク３０５などの記憶装置により実現される。例えば、記憶部１０１０は、各種ＤＢ２２０，２３０，２４０，２５０，２６０，１９００（後述の図１９参照）等を記憶する。なお、記憶部１０１０は、情報処理装置１０１とは異なる他のコンピュータが有することにしてもよい。この場合、情報処理装置１０１は、他のコンピュータにアクセスして、記憶部１０１０の記憶内容を参照することができる。 Further, the storage unit 1010 is realized by a storage device such as a memory 302 and a disk 305. For example, the storage unit 1010 stores various DBs 220, 230, 240, 250, 260, 1900 (see FIG. 19 described later) and the like. The storage unit 1010 may be provided by another computer different from the information processing device 101. In this case, the information processing apparatus 101 can access another computer and refer to the stored contents of the storage unit 1010.

受付部１００１は、質問の入力を受け付ける。質問は、何らかの問題の解決方法を問いただすための検索ワードであり、単語または複数の単語の組み合わせであってもよいし、１または複数の文章であってもよい。質問の入力は、例えば、図９に示したＦＡＱ画面９００において行われる。具体的には、例えば、受付部１００１は、端末２０１から、ＦＡＱ画面９００に入力された検索ワードを受信することにより、検索ワードの入力を受け付ける。以下の説明では、質問を「検索ワード」と表記する場合がある。 The reception unit 1001 accepts the input of a question. The question is a search word for asking how to solve some problem, and may be a word or a combination of a plurality of words, or may be one or more sentences. The question is input, for example, on the FAQ screen 900 shown in FIG. 9. Specifically, for example, the reception unit 1001 accepts the input of the search word by receiving, from the terminal 201, the search word entered on the FAQ screen 900. In the following description, the question may be referred to as a "search word".

検索部１００２は、検索ワードに対応するＦＡＱを検索する。具体的には、例えば、検索部１００２は、検索ワードを形態素解析して形態素に分解する。つぎに、検索部１００２は、図４に示したＦＡＱマスタ２２０から、所定の検索条件にしたがって、分解した形態素に対応するＦＡＱを検索する。 The search unit 1002 searches for the FAQ corresponding to the search word. Specifically, for example, the search unit 1002 analyzes the search word into morphemes and decomposes them into morphemes. Next, the search unit 1002 searches the FAQ master 220 shown in FIG. 4 for the FAQ corresponding to the decomposed morpheme according to a predetermined search condition.

より詳細に説明すると、例えば、検索部１００２は、検索ワードを形態素解析して検出された形態素についてのＡＮＤ条件またはＯＲ条件を設定して、ＦＡＱマスタ２２０からＦＡＱを検索してもよい。この際、検索部１００２は、例えば、各ＦＡＱの過去のアクセス数（照会数）や検索ワードとの類似度を考慮して、アクセス数が多いＦＡＱや類似度が高いＦＡＱを検索することにしてもよい。 More specifically, for example, the search unit 1002 may search the FAQ master 220 for FAQs by setting an AND condition or an OR condition on the morphemes detected by morphological analysis of the search word. At this time, the search unit 1002 may, for example, take into account the number of past accesses (number of inquiries) of each FAQ and its degree of similarity to the search word, and retrieve FAQs with a large number of accesses or a high degree of similarity.
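The AND/OR search over morphemes can be sketched as follows. This is an illustration under stated assumptions, not the search unit's actual implementation: the FAQ master is modeled as a hypothetical mapping from FAQ number to a set of index terms, morphological analysis is assumed to have been done already, and ordering by past access count is only one of the options the description mentions.

```python
def search_faqs(morphemes, faq_master, access_counts, mode="AND"):
    """Return FAQ numbers matching the morphemes, most-accessed first.

    faq_master:    {faq_no: set of index terms}  (hypothetical layout)
    access_counts: {faq_no: number of past inquiries}
    mode:          "AND" requires every morpheme, "OR" requires at least one
    """
    query = set(morphemes)
    hits = []
    for faq_no, terms in faq_master.items():
        if mode == "AND":
            matched = query <= terms       # every morpheme appears in the FAQ
        else:
            matched = bool(query & terms)  # at least one morpheme appears
        if matched:
            hits.append(faq_no)
    # Higher access count first; FAQ number breaks ties deterministically.
    return sorted(hits, key=lambda f: (-access_counts.get(f, 0), f))


faq_master = {
    "F001": {"FJ10", "initialization", "procedure"},
    "F002": {"FJ10", "battery"},
    "F003": {"printer"},
}
access_counts = {"F001": 3, "F002": 10}
and_hits = search_faqs(["FJ10", "initialization"], faq_master, access_counts, "AND")
or_hits = search_faqs(["FJ10", "initialization"], faq_master, access_counts, "OR")
```

With the sample data, the AND condition keeps only F001, while the OR condition returns F002 ahead of F001 because of its larger access count.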

これにより、検索ワードに対応するＦＡＱを特定することができる。検索結果は、例えば、タイプ「検索」のアクセスログとして、図５に示したアクセスログＤＢ２３０に記憶される。具体的には、アクセスログのセッション番号には、検索ワードを受け付けた端末２０１と情報処理装置１０１とのセッションのセッション番号が設定される。アクセスログの日時には、検索ワードを受け付けた日時が設定される。アクセスログの検索ワードには、受け付けた検索ワードが設定される。アクセスログの照会ＦＡＱおよび順位は「空白」である。アクセスログの検索リストには、検索されたＦＡＱのＦＡＱ番号が設定される。ただし、検索ワードに対応するＦＡＱが１件も検索されなかった場合、アクセスログの検索リストには「－（ｎｕｌｌ）」が設定される。 Thereby, the FAQ corresponding to the search word can be specified. The search result is stored in the access log DB 230 shown in FIG. 5, for example, as an access log of type “search”. Specifically, the session number of the session between the terminal 201 that received the search word and the information processing apparatus 101 is set in the session number of the access log. The date and time when the search word is accepted is set in the date and time of the access log. The accepted search word is set as the search word of the access log. The access log inquiry FAQ and rank are "blank". The FAQ number of the searched FAQ is set in the search list of the access log. However, if no FAQ corresponding to the search word is searched, "-(null)" is set in the search list of the access log.

決定部１００３は、教師データ（よゐこ）と教師データ（わるいこ）とを記憶する記憶部１０１０の記憶内容に基づく機械学習を行って、検索された複数のＦＡＱの中から、検索ワードに対する１または複数のＦＡＱを決定する。ここで、教師データ（よゐこ）は、質問に対する選択操作を受け付けたＦＡＱを当該質問に対応付けた情報である。教師データ（わるいこ）は、質問に対する選択操作を受け付けなかったＦＡＱを当該質問に対応付けた情報である。 The determination unit 1003 performs machine learning based on the stored contents of the storage unit 1010, which stores the teacher data (yoiko) and the teacher data (wariko), and determines one or more FAQs for the search word from among the plurality of retrieved FAQs. Here, the teacher data (yoiko) is information that associates a question with an FAQ that received a selection operation for that question. The teacher data (wariko) is information that associates a question with an FAQ that did not receive a selection operation for that question.

具体的には、例えば、決定部１００３は、図６に示した教師データ（よゐこ）ＤＢ２４０を参照して、ナイーブベイズ分類器による機械学習を行って、検索された複数のＦＡＱそれぞれについて、第１の確率を算出する。第１の確率は、入力を受け付けた検索ワードに対してＦＡＱが表示された際に、当該ＦＡＱが選択される確率である。 Specifically, for example, the determination unit 1003 refers to the teacher data (yoiko) DB 240 shown in FIG. 6, performs machine learning with a naive Bayes classifier, and calculates a first probability for each of the plurality of retrieved FAQs. The first probability is the probability that an FAQ will be selected when it is displayed for the search word whose input was accepted.

これにより、検索された複数のＦＡＱそれぞれが、１ページ目に表示された際に照会される可能性の高さを判断する指標値、換言すれば、検索ワードとの関連性の強さを表す指標値を得ることができる。 This yields, for each of the plurality of retrieved FAQs, an index value for judging how likely the FAQ is to be inquired when displayed on the first page, in other words, an index value representing the strength of its relevance to the search word.

また、決定部１００３は、図７に示した教師データ（わるいこ）ＤＢ２５０を参照して、ナイーブベイズ分類器による機械学習を行って、検索された複数のＦＡＱそれぞれについて、第２の確率を算出する。第２の確率は、入力を受け付けた検索ワードに対してＦＡＱが表示された際に、当該ＦＡＱが選択されない確率である。 Further, the determination unit 1003 refers to the teacher data (wariko) DB 250 shown in FIG. 7, performs machine learning with a naive Bayes classifier, and calculates a second probability for each of the plurality of retrieved FAQs. The second probability is the probability that an FAQ will not be selected when it is displayed for the search word whose input was accepted.

これにより、検索された複数のＦＡＱそれぞれが、１ページ目に表示された際に照会されない可能性の高さを判断する指標値、換言すれば、検索ワードとの関連性の弱さを表す指標値を得ることができる。 As a result, each of the plurality of searched FAQs is an index value that determines the high possibility that the FAQ will not be inquired when displayed on the first page, in other words, an index that indicates the weakness of the relevance to the search word. You can get the value.
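As a rough illustration of how a naive Bayes classifier could turn teacher data into a per-FAQ probability, the sketch below trains on (FAQ number, morphemes) pairs and scores the candidate FAQs for a new search word. The specification defers the actual calculation to FIG. 15, so everything here is an assumption: the function names, the Laplace smoothing, and the normalization over the candidate set are illustrative choices only. Trained on teacher data (yoiko) it would give a value in the role of the first probability; trained on teacher data (wariko), a value in the role of the second probability.

```python
import math
from collections import Counter, defaultdict


def train(records):
    """records: iterable of (faq_no, [morphemes]) taken from one teacher-data DB."""
    faq_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for faq_no, morphemes in records:
        faq_counts[faq_no] += 1
        word_counts[faq_no].update(morphemes)
        vocab.update(morphemes)
    return faq_counts, word_counts, vocab


def probabilities(model, morphemes, candidates):
    """Posterior over the candidate FAQs given the morphemes (normalization is an assumption)."""
    if not candidates:
        return {}
    faq_counts, word_counts, vocab = model
    total = sum(faq_counts.values())
    log_scores = {}
    for faq in candidates:
        # Laplace-smoothed prior P(faq)
        score = math.log((faq_counts[faq] + 1) / (total + len(candidates)))
        # Laplace-smoothed likelihoods P(morpheme | faq); +1 reserves mass for unseen words
        denom = sum(word_counts[faq].values()) + len(vocab) + 1
        for m in morphemes:
            score += math.log((word_counts[faq][m] + 1) / denom)
        log_scores[faq] = score
    # Convert log scores to probabilities that sum to 1 over the candidates.
    top = max(log_scores.values())
    weights = {f: math.exp(s - top) for f, s in log_scores.items()}
    z = sum(weights.values())
    return {f: w / z for f, w in weights.items()}


records = [
    ("F001", ["FJ10", "initialization"]),
    ("F001", ["FJ10", "reset"]),
    ("F002", ["printer", "jam"]),
]
probs = probabilities(train(records), ["FJ10", "initialization"], ["F001", "F002"])
```

In this toy data, F001 was selected twice for queries containing "FJ10", so it receives the larger share of the probability mass for the new query.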

そして、決定部１００３は、検索された複数のＦＡＱそれぞれについて算出した第１の確率と第２の確率とに基づいて、検索された複数のＦＡＱの中から、検索ワードに対する１または複数のＦＡＱを決定する。これにより、第１の確率および第２の確率の両方を考慮して、検索ワードに対して表示するＦＡＱを絞り込むことができる。 Then, the determination unit 1003 determines one or a plurality of FAQs for the search word from the searched plurality of FAQs based on the first probability and the second probability calculated for each of the searched plurality of FAQs. decide. Thereby, the FAQ displayed for the search word can be narrowed down in consideration of both the first probability and the second probability.

より詳細に説明すると、例えば、決定部１００３は、検索された複数のＦＡＱのうち第２の確率が閾値α以上のＦＡＱを除外したＦＡＱを、第１の確率が高い順にソートした１または複数のＦＡＱを、検索ワードに対する１または複数のＦＡＱに決定する。閾値αは、任意に設定可能である。閾値αは、例えば、単位を［％］とすると、８０程度の値に設定される。 More specifically, for example, the determination unit 1003 sorts one or a plurality of FAQs obtained by excluding the FAQs having a second probability of α or more from the searched FAQs in descending order of the first probability. The FAQ is determined to be one or more FAQs for the search word. The threshold value α can be set arbitrarily. The threshold value α is set to a value of about 80, for example, when the unit is [%].

また、例えば、決定部１００３は、第１の確率が高い上位Ｎ件のＦＡＱを選択し、選択したＦＡＱのうち第２の確率が閾値α以上のＦＡＱを、検索ワードに対する１または複数のＦＡＱに決定することにしてもよい。すなわち、決定部１００３は、第１の確率が高い上位Ｎ件のＦＡＱを絞り込んだ上で、第２の確率が閾値α以上のＦＡＱを除外する。 Further, for example, the determination unit 1003 selects the upper N FAQs having a high first probability, and among the selected FAQs, the FAQs having a second probability of the threshold value α or more are set as one or a plurality of FAQs for the search word. You may decide. That is, the determination unit 1003 narrows down the FAQs of the top N cases having a high first probability, and then excludes the FAQs having a second probability of the threshold value α or more.
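The two narrowing strategies described above differ only in the order of the steps. A minimal sketch follows, assuming the first and second probabilities are already available as dictionaries keyed by FAQ number and expressed as fractions rather than percentages; the function names and default values are hypothetical.

```python
def decide_exclude_then_sort(candidates, p_selected, p_not_selected, alpha=0.8):
    """Drop FAQs whose second probability is alpha or more, then sort by first probability."""
    kept = [f for f in candidates if p_not_selected.get(f, 0.0) < alpha]
    return sorted(kept, key=lambda f: p_selected.get(f, 0.0), reverse=True)


def decide_top_n_then_exclude(candidates, p_selected, p_not_selected, alpha=0.8, n=5):
    """Take the top-N FAQs by first probability, then drop those at or above alpha."""
    top = sorted(candidates, key=lambda f: p_selected.get(f, 0.0), reverse=True)[:n]
    return [f for f in top if p_not_selected.get(f, 0.0) < alpha]


candidates = ["F001", "F002", "F003"]
p1 = {"F001": 0.5, "F002": 0.9, "F003": 0.7}    # first probability (selected)
p2 = {"F001": 0.1, "F002": 0.85, "F003": 0.3}   # second probability (not selected)

result_a = decide_exclude_then_sort(candidates, p1, p2)
result_b = decide_top_n_then_exclude(candidates, p1, p2, n=2)
```

In the sample data, F002 has the highest first probability but is excluded by both strategies because its second probability exceeds α. The second strategy can return fewer than N FAQs, since exclusion happens after the top-N cut.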

なお、第１および第２の確率を算出する具体的な処理内容については、図１５を用いて後述する。また、検索ワードに対するＦＡＱの決定例については、図１６を用いて後述する。 The specific processing for calculating the first and second probabilities will be described later with reference to FIG. 15. An example of determining the FAQs for a search word will be described later with reference to FIG. 16.

出力制御部１００４は、検索ワードに対する１または複数のＦＡＱを出力する。具体的には、例えば、出力制御部１００４は、検索ワードに対して、決定された１または複数のＦＡＱを出力する。各ＦＡＱには、各ＦＡＱの順位（表示順序）を特定可能な情報が付与される。より詳細に説明すると、例えば、出力制御部１００４は、端末２０１に対して、決定された１または複数のＦＡＱを送信することにより、ＦＡＱ画面９００の表示エリア９１０に、受け付けた検索ワードに対する１または複数のＦＡＱを表示する制御を行う。 The output control unit 1004 outputs one or more FAQs for the search word. Specifically, for example, the output control unit 1004 outputs the one or more FAQs determined for the search word. Each FAQ is given information from which its rank (display order) can be identified. More specifically, for example, the output control unit 1004 transmits the determined one or more FAQs to the terminal 201, thereby performing control to display the one or more FAQs for the accepted search word in the display area 910 of the FAQ screen 900.

また、受付部１００１は、検索ワードに対するＦＡＱの選択操作を受け付ける。ＦＡＱの選択操作は、ＦＡＱの内容を照会するための操作であり、例えば、図９に示したＦＡＱ画面９００において行われる。具体的には、例えば、受付部１００１は、端末２０１から、ＦＡＱ画面９００において選択（クリック）されたＦＡＱを示す選択結果を受信することにより、当該ＦＡＱの選択操作を受け付ける。 Further, the reception unit 1001 accepts an FAQ selection operation for the search word. The FAQ selection operation is an operation for inquiring about the contents of the FAQ, and is performed, for example, on the FAQ screen 900 shown in FIG. Specifically, for example, the reception unit 1001 receives a selection operation of the FAQ by receiving a selection result indicating the FAQ selected (clicked) on the FAQ screen 900 from the terminal 201.

ＦＡＱの選択操作を受け付けた場合、タイプ「照会」のアクセスログがアクセスログＤＢ２３０に記憶される。具体的には、アクセスログのセッション番号には、検索ワードを受け付けた端末２０１と情報処理装置１０１とのセッションのセッション番号が設定される。アクセスログの日時には、ＦＡＱの選択操作を受け付けた日時が設定される。アクセスログの検索ワードには、受け付けた検索ワードが設定される。アクセスログの照会ＦＡＱには、選択操作を受け付けたＦＡＱのＦＡＱ番号が設定される。アクセスログの順位には、選択操作を受け付けたＦＡＱの順位が設定される。アクセスログの検索リストは「空白」である。これにより、検索ワードに対して表示された際に選択操作を受け付けたＦＡＱを示すアクセスログを蓄積することができる。 When the FAQ selection operation is accepted, the access log of type "inquiry" is stored in the access log DB 230. Specifically, the session number of the session between the terminal 201 that received the search word and the information processing apparatus 101 is set in the session number of the access log. The date and time when the FAQ selection operation is accepted is set as the date and time of the access log. The accepted search word is set as the search word of the access log. In the access log inquiry FAQ, the FAQ number of the FAQ that accepted the selection operation is set. In the order of the access log, the order of the FAQ that accepted the selection operation is set. The access log search list is "blank". As a result, it is possible to accumulate an access log indicating the FAQ that accepted the selection operation when displayed for the search word.

生成部１００５は、教師データ（よゐこ）を生成する。また、生成部１００５は、教師データ（わるいこ）を生成する。また、生成部１００５は、教師データ（まいご）を生成する。ここで、教師データ（まいご）は、検索ワードに対応するＦＡＱがないときに後続して入力を受け付けた他の検索ワードを、当該検索ワードに対応付けた情報である。ただし、他の検索ワードは、例えば、他の検索ワードに対して１ページ目に表示されたいずれかのＦＡＱが選択された検索ワードである。 The generation unit 1005 generates teacher data (yoiko). In addition, the generation unit 1005 generates teacher data (wariko). In addition, the generation unit 1005 generates teacher data (maigo). Here, the teacher data (maigo) is information in which another search word for which input is subsequently accepted when there is no FAQ corresponding to the search word is associated with the search word. However, the other search word is, for example, a search word in which any FAQ displayed on the first page is selected with respect to the other search word.

具体的には、例えば、生成部１００５は、アクセスログＤＢ２３０から、セッション番号が同一のアクセスログを取得する。つぎに、生成部１００５は、取得したアクセスログの日時に基づいて、取得したアクセスログを時系列にソートする。そして、生成部１００５は、取得したアクセスログの中に、タイプ「照会」のアクセスログがあるか否かを判断する。 Specifically, for example, the generation unit 1005 acquires an access log having the same session number from the access log DB 230. Next, the generation unit 1005 sorts the acquired access logs in chronological order based on the date and time of the acquired access logs. Then, the generation unit 1005 determines whether or not there is an access log of type "inquiry" in the acquired access log.

ここで、タイプ「照会」のアクセスログがある場合、生成部１００５は、当該アクセスログから特定される照会ＦＡＱの順位に基づいて、照会ＦＡＱが、１ページ目に表示されたＦＡＱ、すなわち、上位Ｎ件のＦＡＱであるか否かを判断する。ここで、１ページ目に表示されたＦＡＱである場合、タイプ「照会」のアクセスログから特定される照会ＦＡＱと形態素解析後検索ワードとを対応付けて表す教師データ（よゐこ）を生成する。形態素解析後検索ワードは、タイプ「照会」のアクセスログから特定される検索ワードを形態素解析して検出される１または複数の形態素である。 Here, if there is an access log of type "inquiry", the generation unit 1005 determines, based on the rank of the inquiry FAQ identified from that access log, whether the inquiry FAQ is an FAQ displayed on the first page, that is, one of the top N FAQs. If it is an FAQ displayed on the first page, the generation unit 1005 generates teacher data (yoiko) that associates the inquiry FAQ identified from the access log of type "inquiry" with the post-morphological-analysis search word. The post-morphological-analysis search word is one or more morphemes detected by morphological analysis of the search word identified from the access log of type "inquiry".

また、生成部１００５は、タイプ「検索」のアクセスログの検索リストを参照して、検索ワードに対して表示された際に照会されなかったＦＡＱ、すなわち、検索ワードに対する選択操作を受け付けなかったＦＡＱがあるか否かを判断する。ここで、照会されなかったＦＡＱがある場合、生成部１００５は、照会されなかったＦＡＱを非照会ＦＡＱとする教師データ（わるいこ）を生成する。教師データ（わるいこ）の形態素解析後検索ワードは、例えば、タイプ「照会」のアクセスログから特定される検索ワードから得られる。 Further, the generation unit 1005 refers to the search list of the access log of the type "search", and the FAQ that was not queried when displayed for the search word, that is, the FAQ that did not accept the selection operation for the search word. Determine if there is. Here, if there is an FAQ that has not been inquired, the generation unit 1005 generates teacher data (wariko) in which the FAQ that has not been inquired is used as a non-inquiry FAQ. The post-morphological analysis search word for teacher data (wariko) is obtained, for example, from the search word identified from the access log of type "query".

また、生成部１００５は、タイプ「検索」のアクセスログの検索リストを参照して、ＦＡＱが１件も検索されなかった検索ワードがあるか否かを判断する。ここで、ＦＡＱが１件も検索されなかった検索ワードがある場合、生成部１００５は、取得したアクセスログの日時に基づいて、当該検索ワードに後続して入力を受け付けた他の検索ワードがあるか否かを判断する。ただし、他の検索ワードは、当該他の検索ワードに対して１ページ目に表示されたいずれかのＦＡＱが照会（選択）された検索ワードである。ここで、他の検索ワードがある場合、生成部１００５は、他の検索ワードを代替ワードとする教師データ（まいご）を生成する。教師データ（まいご）の形態素解析後検索ワードは、例えば、他の検索ワードを形態素解析して検出される１または複数の形態素である。 Further, the generation unit 1005 refers to the search list of the access log of the type "search" and determines whether or not there is a search word for which no FAQ has been searched. Here, if there is a search word for which no FAQ has been searched, the generation unit 1005 has another search word for which input is accepted following the search word based on the date and time of the acquired access log. Judge whether or not. However, the other search word is a search word in which any FAQ displayed on the first page is queried (selected) for the other search word. Here, if there is another search word, the generation unit 1005 generates teacher data (maigo) using the other search word as an alternative word. The search word after morphological analysis of the teacher data (maigo) is, for example, one or a plurality of morphemes detected by morphological analysis of another search word.

生成された教師データ（よゐこ）は、例えば、図６に示した教師データ（よゐこ）ＤＢ２４０に記憶される。生成された教師データ（わるいこ）は、例えば、図７に示した教師データ（わるいこ）ＤＢ２５０に記憶される。生成された教師データ（まいご）は、例えば、図８に示した教師データ（まいご）ＤＢ２６０に記憶される。 The generated teacher data (yoiko) is stored in, for example, the teacher data (yoiko) DB 240 shown in FIG. 6. The generated teacher data (wariko) is stored in, for example, the teacher data (wariko) DB 250 shown in FIG. 7. The generated teacher data (maigo) is stored in, for example, the teacher data (maigo) DB 260 shown in FIG. 8.
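The generation flow of the three kinds of teacher data from the access logs of one session can be sketched as below. This is a simplified reading of the description, not the procedure of FIGS. 11 to 14: the dictionary keys, the whitespace tokenizer standing in for morphological analysis, the use of the search log's own date and search word for the teacher data (wariko), and the rule for picking the subsequent search word are all assumptions.

```python
def generate_teacher_data(session_logs, tokenize=str.split, n=5):
    """Build (yoiko, wariko, maigo) records from the logs of a single session."""
    logs = sorted(session_logs, key=lambda r: r["datetime"])  # chronological order
    searches = [r for r in logs if r["type"] == "search"]
    # Inquiries on FAQs that were shown on the first page (rank within the top N).
    hits = [r for r in logs if r["type"] == "inquiry" and r["rank"] <= n]

    yoiko = [{"datetime": r["datetime"], "faq": r["faq"],
              "morphemes": tokenize(r["word"])} for r in hits]
    wariko, maigo = [], []
    for s in searches:
        if s["search_list"]:
            # First-page FAQs for this search word that were never selected.
            selected = {r["faq"] for r in hits if r["word"] == s["word"]}
            for faq in s["search_list"][:n]:
                if faq not in selected:
                    wariko.append({"datetime": s["datetime"], "faq": faq,
                                   "morphemes": tokenize(s["word"])})
        else:
            # 0 hits: look for a later, different search word whose first-page FAQ was selected.
            later = next((r for r in hits
                          if r["datetime"] > s["datetime"] and r["word"] != s["word"]), None)
            if later is not None:
                maigo.append({"datetime": s["datetime"],
                              "alternative_word": later["word"],
                              "search_word": s["word"]})
    return yoiko, wariko, maigo


session = [
    {"type": "search", "datetime": "2017-10-01T10:00:00",
     "word": "FJ10 reset", "search_list": None},
    {"type": "search", "datetime": "2017-10-01T10:01:00",
     "word": "FJ10 initialization", "search_list": ["F001", "F002"]},
    {"type": "inquiry", "datetime": "2017-10-01T10:01:30",
     "word": "FJ10 initialization", "faq": "F001", "rank": 1},
]
yoiko, wariko, maigo = generate_teacher_data(session)
```

In the sample session, the first search returns nothing, the second returns two FAQs, and the user opens F001. The sketch therefore produces one yoiko record for F001, one wariko record for F002, and one maigo record mapping "FJ10 reset" to "FJ10 initialization".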

教師データ（よゐこ）、教師データ（わるいこ）および教師データ（まいご）の生成例については、図１１～図１４を用いて後述する。 Examples of generation of teacher data (yoiko), teacher data (wariko), and teacher data (maigo) will be described later with reference to FIGS. 11 to 14.

なお、教師データ（よゐこ）と教師データ（わるいこ）とが蓄積されていない場合、情報処理装置１０１は、上述した第１および第２の確率を用いたＦＡＱの絞り込みを行うことができない。このため、運用開始からある程度の期間は、情報処理装置１０１は、検索ワードに対して、検索されたＦＡＱを出力することにしてもよい。この際、情報処理装置１０１は、例えば、検索された複数のＦＡＱを、過去のアクセス数（照会数）が多い順にソートして出力することにしてもよい。より詳細に説明すると、例えば、出力制御部１００４は、端末２０１に対して、検索されたＦＡＱを送信することにより、ＦＡＱ画面９００の表示エリア９１０に、受け付けた検索ワードに対するＦＡＱを表示する制御を行う。これにより、教師データ（よゐこ）と教師データ（わるいこ）とを蓄積することができる。 If the teacher data (yoiko) and the teacher data (wariko) have not yet been accumulated, the information processing apparatus 101 cannot narrow down the FAQs using the first and second probabilities described above. Therefore, for a certain period after the start of operation, the information processing apparatus 101 may output the retrieved FAQs as they are for the search word. At this time, the information processing apparatus 101 may, for example, sort the plurality of retrieved FAQs in descending order of the number of past accesses (number of inquiries) and output them. More specifically, for example, the output control unit 1004 transmits the retrieved FAQs to the terminal 201, thereby performing control to display the FAQs for the accepted search word in the display area 910 of the FAQ screen 900. As a result, teacher data (yoiko) and teacher data (wariko) can be accumulated.

また、決定部１００３は、検索ワードに対する選択操作を受け付けたＦＡＱを示すアクセスログに基づいて、閾値αを調整することにしてもよい。具体的には、例えば、決定部１００３は、アクセスログＤＢ２３０内のタイプ「照会」のアクセスログに基づいて、検索ワードに対する上位Ｎ件のＦＡＱ、すなわち、１ページ目に表示されるＦＡＱのいずれかが選択される確率を算出する。 Further, the determination unit 1003 may adjust the threshold value α based on the access logs indicating the FAQs that received a selection operation for a search word. Specifically, for example, the determination unit 1003 calculates, based on the access logs of type "inquiry" in the access log DB 230, the probability that any one of the top N FAQs for a search word, that is, the FAQs displayed on the first page, is selected.

以下の説明では、１ページ目に表示されるＦＡＱのいずれかが選択される確率を「１ページ目ヒット率」と表記することがある。 In the following description, the probability that any of the FAQs displayed on the first page will be selected may be referred to as "first page hit rate".

１ページ目ヒット率の算出は、例えば、所定期間ごとに行う。所定期間は、任意に設定可能であり、例えば、数週間～数ヶ月程度の期間に設定される。この際、決定部１００３は、例えば、直近所定期間分のアクセスログに基づいて、１ページ目ヒット率を算出する。これにより、直近のアクセス傾向が反映された１ページ目ヒット率を算出することができる。 The hit rate for the first page is calculated, for example, at predetermined intervals. The predetermined period can be arbitrarily set, and is set to, for example, a period of several weeks to several months. At this time, the determination unit 1003 calculates the first page hit rate based on, for example, the access log for the latest predetermined period. This makes it possible to calculate the first page hit rate that reflects the latest access tendency.

そして、決定部１００３は、算出した１ページ目ヒット率の時系列変化に基づいて、閾値αを調整する。より詳細に説明すると、例えば、決定部１００３は、１ページ目ヒット率が下降傾向にある場合に、閾値αを下げる。閾値αの下げ分は、任意に設定可能であり、例えば、単位を［％］とすると、１程度の値に設定される。 Then, the determination unit 1003 adjusts the threshold value α based on the calculated time-series change in the hit rate of the first page. More specifically, for example, the determination unit 1003 lowers the threshold value α when the hit rate of the first page tends to decrease. The amount of reduction of the threshold value α can be arbitrarily set. For example, when the unit is [%], the value is set to about 1.
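The adjustment of the threshold value α can be sketched as a small function. The description does not define how a "downward trend" is detected, so the rule used here (the latest first-page hit rate is lower than the previous one) is an assumption, as are the function name, the parameter names, and the lower bound.

```python
def adjust_alpha(alpha, hit_rates, step=1.0, floor=0.0):
    """Lower alpha (in percent) by `step` when the first-page hit rate is falling.

    hit_rates: first-page hit rates per predetermined period, oldest first.
    The trend test (latest < previous) is an illustrative assumption.
    """
    if len(hit_rates) >= 2 and hit_rates[-1] < hit_rates[-2]:
        return max(floor, alpha - step)
    return alpha


lowered = adjust_alpha(80.0, [0.62, 0.58])    # hit rate fell, so alpha drops by 1 point
unchanged = adjust_alpha(80.0, [0.58, 0.62])  # hit rate rose, so alpha is kept
```

Lowering α makes the exclusion based on the second probability stricter, so more weakly related FAQs are dropped from the first page while the hit rate is falling.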

閾値αの調整例については、図１７および図１８を用いて後述する。 An example of adjusting the threshold value α will be described later with reference to FIGS. 17 and 18.

判定部１００６は、検索ワードに対応するＦＡＱの有無を判定する。具体的には、例えば、判定部１００６は、ＦＡＱマスタ２２０から検索ワードに対応するＦＡＱが１件も検索されなかった場合、いわゆる、０件ヒットの場合に、検索ワードに対応するＦＡＱがないと判定する。 The determination unit 1006 determines whether or not there is an FAQ corresponding to the search word. Specifically, for example, when no FAQ corresponding to the search word is retrieved from the FAQ master 220, that is, in the case of so-called 0 hits, the determination unit 1006 determines that there is no FAQ corresponding to the search word.

また、判定部１００６は、決定部１００３によって検索ワードに対するＦＡＱが１件も決定されなかった場合に、検索ワードに対応するＦＡＱがないと判定することにしてもよい。より詳細に説明すると、例えば、判定部１００６は、上述した第１および第２の確率を用いたＦＡＱの絞り込みを行った結果、ＦＡＱが１件も残らなかった場合に、検索ワードに対応するＦＡＱがないと判定することにしてもよい。 Further, the determination unit 1006 may determine that there is no FAQ corresponding to the search word when the determination unit 1003 has not determined any FAQ for the search word. More specifically, for example, the determination unit 1006 may determine that there is no FAQ corresponding to the search word when no FAQ remains after the narrowing down using the first and second probabilities described above.

特定部１００７は、検索ワードに対応するＦＡＱがないと判定された場合、教師データ（まいご）を記憶する記憶部１０１０の記憶内容に基づく機械学習を行って、当該検索ワードに後続して入力される他の検索ワードを特定する。以下の説明では、判定部１００６によってＦＡＱがないと判定された検索ワードを「０件ヒット検索ワード」と表記する場合がある。 When it is determined that there is no FAQ corresponding to the search word, the specific unit 1007 performs machine learning based on the stored contents of the storage unit 1010 that stores the teacher data (maigo), and is subsequently input to the search word. Identify other search words. In the following description, the search word determined by the determination unit 1006 to have no FAQ may be referred to as "0 hit search word".

具体的には、例えば、特定部１００７は、教師データ（まいご）ＤＢ２６０を参照して、ナイーブベイズ分類器による機械学習を行って、他の検索ワード（代替ワード）それぞれについて、０件ヒット検索ワードに後続して入力される第３の確率を算出する。そして、特定部１００７は、他の検索ワードそれぞれについて算出した第３の確率に基づいて、０件ヒット検索ワードに後続して入力される代替ワードを特定する。 Specifically, for example, the specific unit 1007 refers to the teacher data (maigo) DB 260, performs machine learning with a naive Bayes classifier, and calculates, for each of the other search words (alternative words), a third probability that the word is input following the 0-hit search word. Then, the specific unit 1007 identifies the alternative word to be input following the 0-hit search word based on the third probability calculated for each of the other search words.

より詳細に説明すると、例えば、特定部１００７は、第３の確率が最大の他の検索ワードを、代替ワードとして特定することにしてもよい。また、特定部１００７は、第３の確率が閾値β以上の他の検索ワードを、代替ワードとして特定することにしてもよい。閾値βは、任意に設定可能である。また、特定部１００７は、第３の確率が閾値β以上で、かつ、最大の他の検索ワードを、代替ワードとして特定することにしてもよい。 More specifically, for example, the specific unit 1007 may specify another search word having the maximum third probability as an alternative word. Further, the specifying unit 1007 may specify another search word having a third probability of the threshold value β or more as an alternative word. The threshold value β can be set arbitrarily. Further, the specifying unit 1007 may specify another search word having a third probability of the threshold value β or more and having the maximum value as an alternative word.
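The three selection rules for the alternative word (maximum only, threshold β only, or both) can be expressed with one small function. This is a sketch under the assumption that the third probabilities are already available as a dictionary keyed by candidate word; the function and parameter names are hypothetical.

```python
def pick_alternative_words(third_probs, beta=None, only_max=True):
    """Select alternative words from {word: third probability}.

    beta=None, only_max=True   -> the single most probable word
    beta=x,    only_max=False  -> every word with probability >= x
    beta=x,    only_max=True   -> the most probable word, if it reaches x
    """
    eligible = {w: p for w, p in third_probs.items() if beta is None or p >= beta}
    if not eligible:
        return []
    if only_max:
        return [max(eligible, key=eligible.get)]
    return sorted(eligible, key=eligible.get, reverse=True)


third_probs = {"FJ10 initialization": 0.7, "FJ10 factory reset": 0.2}
best = pick_alternative_words(third_probs)
above_beta = pick_alternative_words(third_probs, beta=0.1, only_max=False)
none_found = pick_alternative_words(third_probs, beta=0.9)
```

Returning a list in every case keeps the caller uniform when several alternative words are identified and an FAQ search is run for each of them.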

第３の確率を算出する具体的な処理内容については、図１５を用いて後述する。 The specific processing for calculating the third probability will be described later with reference to FIG. 15.

出力制御部１００４は、特定された代替ワードに対応するＦＡＱを出力する。具体的には、例えば、まず、検索部１００２は、代替ワードに対応するＦＡＱを検索する。なお、複数の代替ワードが特定された場合、検索部１００２は、複数の代替ワードそれぞれについて、ＦＡＱを検索することにしてもよい。 The output control unit 1004 outputs the FAQ corresponding to the specified alternative word. Specifically, for example, first, the search unit 1002 searches for the FAQ corresponding to the alternative word. When a plurality of alternative words are specified, the search unit 1002 may search the FAQ for each of the plurality of alternative words.

つぎに、決定部１００３は、記憶部１０１０の記憶内容に基づく機械学習を行って、検索された複数のＦＡＱの中から、代替ワードに対する１または複数のＦＡＱを決定する。そして、出力制御部１００４は、代替ワードに対して、決定された１または複数のＦＡＱを出力する。なお、出力制御部１００４は、検索された代替ワードに対応するＦＡＱを出力することにしてもよい。 Next, the determination unit 1003 performs machine learning based on the stored contents of the storage unit 1010, and determines one or a plurality of FAQs for the alternative word from the plurality of searched FAQs. Then, the output control unit 1004 outputs one or a plurality of determined FAQs to the alternative word. The output control unit 1004 may output the FAQ corresponding to the searched alternative word.

これにより、入力された検索ワードに対応するＦＡＱがなくても、過去のユーザの操作履歴（アクセスログ）をもとに、その検索ワード、すなわち、０件ヒット検索ワードに置き換わる代替ワードを推定してＦＡＱを出力することができる。 As a result, even when no FAQ corresponds to the input search word, an alternative word that replaces that search word (the 0-hit search word) can be estimated from past users' operation history (access logs), and the corresponding FAQs can be output.

（教師データの生成例） (Example of generating teacher data)
つぎに、図１１～図１４を用いて、教師データ（わるいこ）および教師データ（まいご）の生成例について説明する。ここでは、上位５件のＦＡＱが１ページ目に表示されるＦＡＱである場合を想定する（Ｎ＝５）。 Next, an example of generating teacher data (wariko) and teacher data (maigo) will be described with reference to FIGS. 11 to 14. Here, it is assumed that the top five FAQs are the FAQs displayed on the first page (N = 5).

図１１は、アクセスログの具体例を示す説明図（その１）である。図１２は、教師データの具体例を示す説明図（その１）である。図１１において、アクセスログ１１０１～１１０３は、時系列にソートされたセッション番号が同一のアクセスログの集合である。 FIG. 11 is an explanatory diagram (No. 1) showing a specific example of the access log. FIG. 12 is an explanatory diagram (No. 1) showing a specific example of teacher data. In FIG. 11, the access logs 1101 to 1103 are a set of access logs having the same session number sorted in chronological order.

この場合、生成部１００５は、アクセスログ１１０１～１１０３の中に、タイプ「照会」のアクセスログがあるか否かを判断する。ここで、アクセスログ１１０２，１１０３はタイプ「照会」のアクセスログである。このため、生成部１００５は、タイプ「照会」のアクセスログがあると判断する。 In this case, the generation unit 1005 determines whether or not there is an access log of type "inquiry" among the access logs 1101 to 1103. Here, the access logs 1102 and 1103 are access logs of type "inquiry". Therefore, the generation unit 1005 determines that there is an access log of type "inquiry".

つぎに、生成部１００５は、タイプ「照会」のアクセスログ１１０２の順位に基づいて、照会ＦＡＱが、１ページ目に表示されたＦＡＱであるか否かを判断する。ここで、アクセスログ１１０２の順位は、「１」であり、５以下である。このため、生成部１００５は、アクセスログ１１０２の照会ＦＡＱが、１ページ目に表示されたＦＡＱであると判断する。 Next, the generation unit 1005 determines whether or not the inquiry FAQ is the FAQ displayed on the first page, based on the order of the access log 1102 of the type "inquiry". Here, the order of the access log 1102 is "1", which is 5 or less. Therefore, the generation unit 1005 determines that the inquiry FAQ of the access log 1102 is the FAQ displayed on the first page.

この場合、生成部１００５は、図１２に示すように、アクセスログ１１０２から特定される照会ＦＡＱ「０１１」と形態素解析後検索ワード「拡大、印刷」とを対応付けて表す教師データ（よゐこ）１２０１を生成する。 In this case, as shown in FIG. 12, the generation unit 1005 generates teacher data (Yoiko) 1201 that associates the inquiry FAQ "011" identified from the access log 1102 with the search word after morphological analysis, "enlarge, print".

同様に、生成部１００５は、タイプ「照会」のアクセスログ１１０３の順位に基づいて、照会ＦＡＱが、１ページ目に表示されたＦＡＱであるか否かを判断する。ここで、アクセスログ１１０３の順位は、「２」であり、５以下である。このため、生成部１００５は、アクセスログ１１０３の照会ＦＡＱが、１ページ目に表示されたＦＡＱであると判断する。 Similarly, the generation unit 1005 determines whether or not the inquiry FAQ is the FAQ displayed on the first page, based on the order of the access log 1103 of the type "inquiry". Here, the order of the access log 1103 is "2", which is 5 or less. Therefore, the generation unit 1005 determines that the inquiry FAQ of the access log 1103 is the FAQ displayed on the first page.

この場合、生成部１００５は、図１２に示すように、アクセスログ１１０３から特定される照会ＦＡＱ「０１２」と形態素解析後検索ワード「拡大、印刷」とを対応付けて表す教師データ（よゐこ）１２０２を生成する。 In this case, as shown in FIG. 12, the generation unit 1005 generates teacher data (Yoiko) 1202 that associates the inquiry FAQ "012" identified from the access log 1103 with the search word after morphological analysis, "enlarge, print".

また、生成部１００５は、タイプ「検索」のアクセスログ１１０１の検索リストを参照して、検索ワードに対して表示された際に照会されなかったＦＡＱ、すなわち、検索ワードに対する選択操作を受け付けなかったＦＡＱがあるか否かを判断する。ここで、ＦＡＱ番号「０１３」のＦＡＱは照会されていない。 Further, the generation unit 1005 refers to the search list of the access log 1101 of type "search" and determines whether or not there is an FAQ that was displayed for the search word but not inquired, that is, an FAQ for which no selection operation was accepted for the search word. Here, the FAQ with FAQ number "013" has not been inquired.

このため、生成部１００５は、照会されなかったＦＡＱがあると判断する。この場合、生成部１００５は、図１２に示すように、照会されなかったＦＡＱ番号「０１３」のＦＡＱを非照会ＦＡＱとする教師データ（わるいこ）１２０３を生成する。 Therefore, the generation unit 1005 determines that there is an FAQ that has not been inquired. In this case, as shown in FIG. 12, the generation unit 1005 generates teacher data (wariko) 1203 in which the FAQ of the FAQ number “013” that has not been inquired is used as the non-inquiry FAQ.

また、生成部１００５は、タイプ「検索」のアクセスログの検索リストを参照して、ＦＡＱが１件も検索されなかった検索ワードがあるか否かを判断する。ここでは、ＦＡＱが１件も検索されなかった検索ワードは存在しない。この場合、生成部１００５は、教師データ（まいご）を生成しない。 Further, the generation unit 1005 refers to the search list of the access log of the type "search" and determines whether or not there is a search word for which no FAQ has been searched. Here, there is no search word for which no FAQ was searched. In this case, the generation unit 1005 does not generate the teacher data (maigo).

図１３は、アクセスログの具体例を示す説明図（その２）である。図１４は、教師データの具体例を示す説明図（その２）である。図１３において、アクセスログ１３０１～１３０４は、時系列にソートされたセッション番号が同一のアクセスログの集合である。 FIG. 13 is an explanatory diagram (No. 2) showing a specific example of the access log. FIG. 14 is an explanatory diagram (No. 2) showing a specific example of teacher data. In FIG. 13, access logs 1301 to 1304 are a set of access logs having the same session number sorted in chronological order.

この場合、生成部１００５は、アクセスログ１３０１～１３０４の中に、タイプ「照会」のアクセスログがあるか否かを判断する。ここで、アクセスログ１３０３，１３０４はタイプ「照会」のアクセスログである。このため、生成部１００５は、タイプ「照会」のアクセスログがあると判断する。 In this case, the generation unit 1005 determines whether or not there is an access log of type "inquiry" among the access logs 1301 to 1304. Here, the access logs 1303 and 1304 are access logs of type "inquiry". Therefore, the generation unit 1005 determines that there is an access log of type "inquiry".

つぎに、生成部１００５は、タイプ「照会」のアクセスログ１３０３の順位に基づいて、照会ＦＡＱが、１ページ目に表示されたＦＡＱであるか否かを判断する。ここで、アクセスログ１３０３の順位は、「１」であり、５以下である。このため、生成部１００５は、アクセスログ１３０３の照会ＦＡＱが、１ページ目に表示されたＦＡＱであると判断する。 Next, the generation unit 1005 determines whether or not the inquiry FAQ is the FAQ displayed on the first page, based on the order of the access log 1303 of the type "inquiry". Here, the order of the access log 1303 is "1", which is 5 or less. Therefore, the generation unit 1005 determines that the inquiry FAQ of the access log 1303 is the FAQ displayed on the first page.

この場合、生成部１００５は、図１４に示すように、アクセスログ１３０３から特定される照会ＦＡＱ「０１１」と形態素解析後検索ワード「拡大、印刷」とを対応付けて表す教師データ（よゐこ）１４０１を生成する。 In this case, as shown in FIG. 14, the generation unit 1005 generates teacher data (Yoiko) 1401 that associates the inquiry FAQ "011" identified from the access log 1303 with the search word after morphological analysis, "enlarge, print".

同様に、生成部１００５は、タイプ「照会」のアクセスログ１３０４の順位に基づいて、照会ＦＡＱが、１ページ目に表示されたＦＡＱであるか否かを判断する。ここで、アクセスログ１３０４の順位は、「２」であり、５以下である。このため、生成部１００５は、アクセスログ１３０４の照会ＦＡＱが、１ページ目に表示されたＦＡＱであると判断する。 Similarly, the generation unit 1005 determines whether or not the inquiry FAQ is the FAQ displayed on the first page, based on the order of the access log 1304 of the type "inquiry". Here, the order of the access log 1304 is "2", which is 5 or less. Therefore, the generation unit 1005 determines that the inquiry FAQ of the access log 1304 is the FAQ displayed on the first page.

この場合、生成部１００５は、図１４に示すように、アクセスログ１３０４から特定される照会ＦＡＱ「０１２」と形態素解析後検索ワード「拡大、印刷」とを対応付けて表す教師データ（よゐこ）１４０２を生成する。 In this case, as shown in FIG. 14, the generation unit 1005 generates teacher data (Yoiko) 1402 that associates the inquiry FAQ "012" identified from the access log 1304 with the search word after morphological analysis, "enlarge, print".

また、生成部１００５は、タイプ「検索」のアクセスログ１３０１，１３０２の検索リストを参照して、検索ワードに対して表示された際に照会されなかったＦＡＱ、すなわち、検索ワードに対する選択操作を受け付けなかったＦＡＱがあるか否かを判断する。ここで、ＦＡＱ番号「０１３」のＦＡＱは照会されていない。 Further, the generation unit 1005 refers to the search lists of the access logs 1301 and 1302 of type "search" and determines whether or not there is an FAQ that was displayed for the search word but not inquired, that is, an FAQ for which no selection operation was accepted for the search word. Here, the FAQ with FAQ number "013" has not been inquired.

このため、生成部１００５は、照会されなかったＦＡＱがあると判断する。この場合、生成部１００５は、図１４に示すように、照会されなかったＦＡＱ番号「０１３」のＦＡＱを非照会ＦＡＱとする教師データ（わるいこ）１４０３を生成する。 Therefore, the generation unit 1005 determines that there is an FAQ that has not been inquired. In this case, as shown in FIG. 14, the generation unit 1005 generates teacher data (wariko) 1403 in which the FAQ of the FAQ number “013” that has not been inquired is used as the non-inquiry FAQ.

また、生成部１００５は、タイプ「検索」のアクセスログ１３０１，１３０２の検索リストを参照して、ＦＡＱが１件も検索されなかった検索ワードがあるか否かを判断する。ここでは、アクセスログ１３０１の検索リストは「－」であり、検索ワード「大きく印刷」はＦＡＱが１件も検索されなかった検索ワードである。 Further, the generation unit 1005 refers to the search list of the access logs 1301 and 1302 of the type "search", and determines whether or not there is a search word for which no FAQ has been searched. Here, the search list of the access log 1301 is "-", and the search word "large print" is a search word in which no FAQ is searched.

この場合、生成部１００５は、検索ワード「大きく印刷」に後続して入力を受け付けた他の検索ワードがあるか否かを判断する。ただし、他の検索ワードは、当該他の検索ワードに対して１ページ目に表示されたいずれかのＦＡＱが照会（選択）された検索ワードである。 In this case, the generation unit 1005 determines whether or not there is another search word for which input is accepted following the search word "large print". However, the other search word is a search word in which any FAQ displayed on the first page is queried (selected) for the other search word.

ここで、検索ワード「拡大、印刷」は、検索ワード「大きく印刷」に後続して入力を受け付けた他の検索ワードである。また、検索ワード「拡大、印刷」は、当該他の検索ワードに対して１ページ目に表示されたＦＡＱ番号「００１，００２」のＦＡＱが照会（選択）された検索ワードである。このため、生成部１００５は、他の検索ワードがあると判断する。 Here, the search word "enlarge, print" is another search word that accepts input following the search word "large print". Further, the search word "enlarge, print" is a search word in which the FAQ of the FAQ number "001,002" displayed on the first page is inquired (selected) for the other search words. Therefore, the generation unit 1005 determines that there is another search word.

この場合、生成部１００５は、図１４に示すように、他の検索ワード「拡大、印刷」を代替ワードとして、代替ワード「拡大、印刷」と検索ワード「大きく印刷」とを対応付けた教師データ（まいご）１４０４を生成する。 In this case, as shown in FIG. 14, the generation unit 1005 treats the other search word "enlarge, print" as an alternative word and generates teacher data (maigo) 1404 that associates the alternative word "enlarge, print" with the search word "large print".
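
The generation steps walked through above for one session can be sketched in Python as follows. This is a simplified illustration, not the implementation of the generation unit 1005: the `AccessLog` record layout, the field names, and the function name `generate_teacher_data` are assumptions, and the search word is treated as an already morphologically analyzed string.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

N = 5  # number of FAQs shown on the first page, as assumed in the text


@dataclass
class AccessLog:  # hypothetical record layout
    type: str                                             # "search" or "inquiry"
    word: str                                             # search word after morphological analysis
    search_list: List[str] = field(default_factory=list)  # FAQ numbers displayed
    faq: Optional[str] = None                             # inquired FAQ number
    rank: Optional[int] = None                            # rank of the inquired FAQ


def generate_teacher_data(session: List[AccessLog]):
    """Return (yoiko, waruiko, maigo) lists for one time-sorted session."""
    yoiko: List[Tuple[str, str]] = []
    waruiko: List[Tuple[str, str]] = []
    maigo: List[Tuple[str, str]] = []
    # Inquiries of FAQs shown on the first page become positive examples.
    hits = [l for l in session
            if l.type == "inquiry" and l.rank is not None and l.rank <= N]
    for l in hits:
        yoiko.append((l.faq, l.word))
    inquired = {(l.word, l.faq) for l in session if l.type == "inquiry"}
    good_words = {l.word for l in hits}
    for i, l in enumerate(session):
        if l.type != "search":
            continue
        # FAQs that were displayed but never inquired become negative examples.
        for faq in l.search_list:
            if (l.word, faq) not in inquired:
                waruiko.append((faq, l.word))
        # A 0-hit word is paired with a later word that led to a first-page inquiry.
        if not l.search_list:
            for later in session[i + 1:]:
                if later.type == "search" and later.word in good_words:
                    maigo.append((later.word, l.word))
                    break
    return yoiko, waruiko, maigo
```

Feeding the sketch a session shaped like the one in FIG. 13 (a 0-hit search for "large print", then a search for "enlarge, print" followed by two first-page inquiries) yields two positive records, one negative record, and one (alternative word, 0-hit word) pair.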

（第１、第２および第３の確率を算出する具体的な処理内容） (Specific processing contents for calculating the first, second, and third probabilities)
つぎに、図１５を用いて、ナイーブベイズ分類器による機械学習を行って、第１、第２および第３の確率を算出する具体的な処理内容について説明する。 Next, with reference to FIG. 15, specific processing contents for calculating the first, second, and third probabilities by performing machine learning with a naive Bayes classifier will be described.

図１５は、分析ＤＢ１５００の記憶内容の一例を示す説明図である。ここでは、第１の確率を例に挙げて、計算例について説明する。まず、決定部１００３は、教師データ（よゐこ）ＤＢ２４０を参照して、分析ＤＢ１５００を作成する。 FIG. 15 is an explanatory diagram showing an example of the stored contents of the analysis DB 1500. Here, a calculation example will be described by taking the first probability as an example. First, the determination unit 1003 creates the analysis DB 1500 with reference to the teacher data (Yoiko) DB 240.

分析ＤＢ１５００は、縦軸に「ＦＡＱ番号」、横軸に「キーワード」を設定し、キーワードごとに各ＦＡＱの照会回数を集計したものである。ただし、ＦＡＱ番号は、検索部１００２によって検索された複数のＦＡＱそれぞれのＦＡＱ番号である。キーワードは、教師データ（よゐこ）ＤＢ２４０内の教師データ（よゐこ）の形態素解析後検索ワードに含まれる形態素である。 In the analysis DB 1500, "FAQ number" is set on the vertical axis and "keyword" is set on the horizontal axis, and the number of inquiries for each FAQ is totaled for each keyword. However, the FAQ number is the FAQ number of each of the plurality of FAQs searched by the search unit 1002. The keyword is a morpheme included in the search word after the morphological analysis of the teacher data (Yoiko) in the teacher data (Yoiko) DB 240.

ここでは、検索ワードを「キーワード１，キーワード２」とする。また、検索部１００２によって検索された検索ワード「キーワード１，キーワード２」に対応する複数のＦＡＱそれぞれのＦＡＱ番号を「１，２，３，４，５」とする。また、教師データ（よゐこ）ＤＢ２４０内の教師データ（よゐこ）の形態素解析後検索ワードに含まれる形態素を「キーワード１，キーワード２，キーワード３，キーワード４，キーワード５，キーワード６」とする。 Here, the search word is "keyword 1, keyword 2". Further, the FAQ numbers of the plurality of FAQs corresponding to the search words "keyword 1, keyword 2" searched by the search unit 1002 are set to "1, 2, 3, 4, 5". Further, the morpheme included in the search word after the morphological analysis of the teacher data (Yoiko) in the teacher data (Yoiko) DB 240 is defined as "keyword 1, keyword 2, keyword 3, keyword 4, keyword 5, keyword 6".

具体的には、例えば、決定部１００３は、教師データ（よゐこ）ＤＢ２４０から未選択の教師データ（よゐこ）を選択する。ここで、選択された教師データ（よゐこ）の照会ＦＡＱを「１」とし、形態素解析後検索ワードにキーワード１、キーワード２が含まれるとする。この場合、決定部１００３は、ＦＡＱ番号「１」のキーワード１の照会回数をインクリメントする。また、決定部１００３は、ＦＡＱ番号「１」のキーワード２の照会回数をインクリメントする。決定部１００３は、教師データ（よゐこ）ＤＢ２４０から選択されていない未選択の教師データ（よゐこ）がなくなるまで、上述した一連の処理を繰り返す。 Specifically, for example, the determination unit 1003 selects unselected teacher data (Yoiko) from the teacher data (Yoiko) DB 240. Here, it is assumed that the inquiry FAQ of the selected teacher data (Yoiko) is set to "1", and the search word after the morphological analysis includes the keyword 1 and the keyword 2. In this case, the determination unit 1003 increments the number of inquiries for the keyword 1 of the FAQ number "1". Further, the determination unit 1003 increments the number of inquiries for the keyword 2 of the FAQ number “1”. The determination unit 1003 repeats the above-mentioned series of processes until there is no unselected teacher data (Yoiko) selected from the teacher data (Yoiko) DB 240.
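
The counting loop described above can be sketched in Python as follows. This is a minimal illustration: the function name `build_analysis_db`, the tuple layout of a teacher data record, and the nested-dictionary representation of the analysis DB are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_analysis_db(teacher_data: Iterable[Tuple[str, List[str]]],
                      searched_faqs: Iterable[str]) -> Dict[str, Dict[str, int]]:
    """Count, per FAQ number and keyword, how often the FAQ was inquired."""
    rows = set(searched_faqs)
    # One row per retrieved FAQ; missing keywords count as 0.
    db: Dict[str, Dict[str, int]] = {faq: defaultdict(int) for faq in rows}
    for faq, keywords in teacher_data:
        if faq not in rows:
            continue  # only FAQs returned by the search are tabulated
        for kw in keywords:
            db[faq][kw] += 1  # increment the inquiry count for this keyword
    return db
```

Each record increments one cell per morpheme, so a record for FAQ "1" with "keyword 1" and "keyword 2" raises both cells in the row for FAQ "1" by one.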

つぎに、決定部１００３は、分析ＤＢ１５００を参照して、検索された複数のＦＡＱそれぞれについて、第１の確率を算出する。具体的には、例えば、決定部１００３は、例えば、下記式（１）を用いて、検索された複数のＦＡＱそれぞれについて、第１の確率を算出することができる。 Next, the determination unit 1003 refers to the analysis DB 1500 and calculates the first probability for each of the plurality of FAQs searched. Specifically, for example, the determination unit 1003 can calculate the first probability for each of the plurality of FAQs searched by using the following equation (1), for example.

Ｐ（ＦＡＱ｜検索ワード）＝Ｐ（ＦＡＱ）Ｐ（検索ワード｜ＦＡＱ）・・・（１） P (FAQ | Search word) = P (FAQ) P (Search word | FAQ) ... (1)

事後確率Ｐ（ＦＡＱ｜検索ワード）は、検索ワードを指定してＦＡＱを検索した場合、該当ＦＡＱが選択（照会）される確率を示す。すなわち、事後確率Ｐ（ＦＡＱ｜検索ワード）は、第１の確率に相当する。例えば、Ｐ（ＦＡＱ１｜事故）は、検索ワード「事故」を指定してＦＡＱを検索した場合、ＦＡＱ１が選択（照会）される確率を示す。 The posterior probability P (FAQ | search word) indicates the probability that the corresponding FAQ is selected (inquired) when the FAQ is searched by designating the search word. That is, the posterior probability P (FAQ | search word) corresponds to the first probability. For example, P (FAQ1 | Accident) indicates the probability that FAQ1 will be selected (inquired) when the search word "accident" is specified and the FAQ is searched.

なお、検索ワードが複数のキーワードで構成される場合は、事後確率をそれぞれの確率の積で示す。例えば、Ｐ（ＦＡＱ１｜事故、車）は、検索ワード「事故」、「車」を指定してＦＡＱを検索した場合、ＦＡＱ１が選択される確率が、Ｐ（ＦＡＱ１）Ｐ（事故｜ＦＡＱ１）×Ｐ（ＦＡＱ１）Ｐ（車｜ＦＡＱ１）であることを示す。 When the search word consists of a plurality of keywords, the posterior probability is expressed as the product of the per-keyword probabilities. For example, P (FAQ1 | accident, car), the probability that FAQ1 is selected when the FAQ is searched with the search words "accident" and "car", is P (FAQ1) P (accident | FAQ1) × P (FAQ1) P (car | FAQ1).

事前確率Ｐ（ＦＡＱ）は、該当ＦＡＱが選ばれる確率（割合）を示す。例えば、ＦＡＱ総数が５０００で、検索ワード「事故」で検索する場合、Ｐ（ＦＡＱ１）は「１／５０００」である。なお、事前確率Ｐ（ＦＡＱ）は、推定における精度を向上させるため、検索ワードに含まれるＦＡＱの総数に置き換える。ＦＡＱ総数が５０００で、検索ワード「事故」でヒットしたことのあるＦＡＱの総数が１００とすると、検索ワード「事故」で検索する場合、Ｐ（ＦＡＱ１）は「１／１００」である。 The prior probability P (FAQ) indicates the probability (ratio) that the corresponding FAQ is selected. For example, when the total number of FAQs is 5000 and the search word "accident" is used, P (FAQ1) is "1/5000". To improve estimation accuracy, however, the total number of FAQs in the prior probability P (FAQ) is replaced with the number of FAQs that the search word has hit. If the total number of FAQs is 5000 and the number of FAQs that have been hit by the search word "accident" is 100, P (FAQ1) is "1/100" when searching with the search word "accident".

尤度Ｐ（検索ワード｜ＦＡＱ）は、ＦＡＱが決まったとき、検索ワードが作成される確率を示す。例えば、Ｐ（事故｜ＦＡＱ１）は、ＦＡＱ１から検索ワード「事故」が作成される確率を示す。ＦＡＱ１をアクセスしたキーワードのアクセス数が１０００で、検索ワード「事故」が１０の場合は、Ｐ（事故｜ＦＡＱ１）は「１０／１０００」となる。 The likelihood P (search word | FAQ) indicates the probability that the search word is produced given the FAQ. For example, P (accident | FAQ1) indicates the probability that the search word "accident" is produced from FAQ1. If FAQ1 has been accessed 1000 times in total across all keywords and 10 of those accesses came from the search word "accident", P (accident | FAQ1) is "10/1000".

図１５に示した分析ＤＢ１５００をもとに算出される各ＦＡＱ１～５の第１の確率（Ｐ（ＦＡＱ｜キーワード１，２））は、以下の通りである。なお、「０」となる部分は「１」に置き換える（ラプラススムージング）。 The first probabilities (P (FAQ | keywords 1, 2)) of each FAQ 1 to 5 calculated based on the analysis DB 1500 shown in FIG. 15 are as follows. The part that becomes "0" is replaced with "1" (Laplace smoothing).

Ｐ（ＦＡＱ１｜キーワード１，２）
＝Ｐ（キーワード１｜ＦＡＱ１）×Ｐ（ＦＡＱ１）×Ｐ（キーワード２｜ＦＡＱ１）×Ｐ（ＦＡＱ１）＝（１／２１）（１／２３）×（１２／２１）（１２／９６）
＝０．０００１５
P (FAQ1 | keyword 1, keyword 2)
= P (keyword 1 | FAQ1) × P (FAQ1) × P (keyword 2 | FAQ1) × P (FAQ1) = (1/21)(1/23) × (12/21)(12/96)
= 0.00015

Ｐ（ＦＡＱ２｜キーワード１，２）
＝Ｐ（キーワード１｜ＦＡＱ２）×Ｐ（ＦＡＱ２）×Ｐ（キーワード２｜ＦＡＱ２）×Ｐ（ＦＡＱ２）＝（１０／８５）（１０／２３）×（１／８５）（１／９６）
＝０．００００１
P (FAQ2 | keyword 1, keyword 2)
= P (keyword 1 | FAQ2) × P (FAQ2) × P (keyword 2 | FAQ2) × P (FAQ2) = (10/85)(10/23) × (1/85)(1/96)
= 0.00001

Ｐ（ＦＡＱ３｜キーワード１，２）
＝Ｐ（キーワード１｜ＦＡＱ３）×Ｐ（ＦＡＱ３）×Ｐ（キーワード２｜ＦＡＱ３）×Ｐ（ＦＡＱ３）＝（２／７）（２／２３）×（１／７）（１／９６）
＝０．００００４
P (FAQ3 | keyword 1, keyword 2)
= P (keyword 1 | FAQ3) × P (FAQ3) × P (keyword 2 | FAQ3) × P (FAQ3) = (2/7)(2/23) × (1/7)(1/96)
= 0.00004

Ｐ（ＦＡＱ４｜キーワード１，２）
＝Ｐ（キーワード１｜ＦＡＱ４）×Ｐ（ＦＡＱ４）×Ｐ（キーワード２｜ＦＡＱ４）×Ｐ（ＦＡＱ４）＝（６／１２４）（６／２３）×（７７／１２４）（７７／９６）
＝０．００６２９
P (FAQ4 | keyword 1, keyword 2)
= P (keyword 1 | FAQ4) × P (FAQ4) × P (keyword 2 | FAQ4) × P (FAQ4) = (6/124)(6/23) × (77/124)(77/96)
= 0.00629

Ｐ（ＦＡＱ５｜キーワード１，２）
＝Ｐ（キーワード１｜ＦＡＱ５）×Ｐ（ＦＡＱ５）×Ｐ（キーワード２｜ＦＡＱ５）×Ｐ（ＦＡＱ５）＝（４／４４）（４／２３）×（６／４４）（６／９６）
＝０．０００１３
P (FAQ5 | keyword 1, keyword 2)
= P (keyword 1 | FAQ5) × P (FAQ5) × P (keyword 2 | FAQ5) × P (FAQ5) = (4/44)(4/23) × (6/44)(6/96)
= 0.00013

ただし、コンピュータ上では、桁数が小さくなりすぎると計算できなくなる可能性がある（アンダーフロー）。このため、決定部１００３は、算出した各値を対数に変換して、第１の確率をそれぞれ算出する。対数に変換した第１の確率は、以下の通りである。０に近いほど、確率が高いことを示す。 However, on a computer, a value that becomes too small may no longer be representable, and the calculation may fail (underflow). Therefore, the determination unit 1003 converts each calculated value into a logarithm to obtain the first probability. The first probabilities converted into logarithms are as follows. The closer the value is to 0, the higher the probability.

Ｐ（ＦＡＱ１｜キーワード１，２）
＝Ｌｏｇ（１／２１）（１／２３）＋Ｌｏｇ（１２／２１）（１２／９６）
＝－３．８３
P (FAQ1 | keyword 1, keyword 2)
= Log((1/21)(1/23)) + Log((12/21)(12/96))
= -3.83

Ｐ（ＦＡＱ２｜キーワード１，２）
＝Ｌｏｇ（１０／８５）（１０／２３）＋Ｌｏｇ（１／８５）（１／９６）
＝－５．２０
P (FAQ2 | keyword 1, keyword 2)
= Log((10/85)(10/23)) + Log((1/85)(1/96))
= -5.20

Ｐ（ＦＡＱ３｜キーワード１，２）
＝Ｌｏｇ（２／７）（２／２３）＋Ｌｏｇ（１／７）（１／９６）
＝－４．４３
P (FAQ3 | keyword 1, keyword 2)
= Log((2/7)(2/23)) + Log((1/7)(1/96))
= -4.43

Ｐ（ＦＡＱ４｜キーワード１，２）
＝Ｌｏｇ（６／１２４）（６／２３）＋Ｌｏｇ（７７／１２４）（７７／９６）
＝－２．２
P (FAQ4 | keyword 1, keyword 2)
= Log((6/124)(6/23)) + Log((77/124)(77/96))
= -2.2

Ｐ（ＦＡＱ５｜キーワード１，２）
＝Ｌｏｇ（４／４４）（４／２３）＋Ｌｏｇ（６／４４）（６／９６）
＝－３．８７
P (FAQ5 | keyword 1, keyword 2)
= Log((4/44)(4/23)) + Log((6/44)(6/96))
= -3.87
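
The reason for switching to logarithms can be illustrated with a short Python sketch. It assumes a base-10 logarithm, which is consistent with the values listed above; the helper name `log_score` is hypothetical.

```python
import math


def log_score(factors):
    """Sum of base-10 logarithms of the per-keyword factors."""
    return sum(math.log10(f) for f in factors)


# Multiplying many small probabilities directly collapses to zero in floating point...
tiny = math.prod([1e-5] * 100)
# ...whereas the sum of logarithms stays well within range (about -500).
safe = log_score([1e-5] * 100)

# Log-scale first probability of FAQ1 from the worked example.
faq1 = log_score([(1 / 21) * (1 / 23), (12 / 21) * (12 / 96)])
print(tiny, round(safe), round(faq1, 2))
```

Because the logarithm is monotonic, ranking FAQs by the log-scale value gives the same order as ranking them by the original product.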

つぎに、決定部１００３は、ソフトマックス関数を利用して、対数表記した各ＦＡＱ１～５の第１の確率を、０～１００［％］の表記に変換する。具体的には、例えば、決定部１００３は、対数表記した各ＦＡＱ１～５の第１の確率のうち、絶対値の最大値Ｃ「－５．２０」を取得する。そして、決定部１００３は、以下のように、各ＦＡＱ１～５について、（第１の確率－最大値Ｃ）の値の指数を計算する。 Next, the determination unit 1003 uses the softmax function to convert the log-scale first probabilities of FAQs 1 to 5 into values from 0 to 100 [%]. Specifically, for example, the determination unit 1003 obtains, from the log-scale first probabilities of FAQs 1 to 5, the value C with the largest absolute value, "-5.20". The determination unit 1003 then calculates, for each of FAQs 1 to 5, the exponential of (first probability − C), as follows.

Ｐ（ＦＡＱ１）＝ｅｘｐ（－３．８３－（－５．２０））＝３．９３
Ｐ（ＦＡＱ２）＝ｅｘｐ（－５．２０－（－５．２０））＝１．０
Ｐ（ＦＡＱ３）＝ｅｘｐ（－４．４３－（－５．２０））＝２．１６
Ｐ（ＦＡＱ４）＝ｅｘｐ（－２．２０－（－５．２０））＝２０．０９
Ｐ（ＦＡＱ５）＝ｅｘｐ（－３．８７－（－５．２０））＝３．４２
P (FAQ1) = exp (-3.83 - (-5.20)) = 3.93
P (FAQ2) = exp (-5.20 - (-5.20)) = 1.0
P (FAQ3) = exp (-4.43 - (-5.20)) = 2.16
P (FAQ4) = exp (-2.20 - (-5.20)) = 20.09
P (FAQ5) = exp (-3.87 - (-5.20)) = 3.42

つぎに、決定部１００３は、各ＦＡＱ１～５の（第１の確率－最大値Ｃ）の値の指数の合計値Ｓを計算する。ここでは、指数の合計値Ｓは、「３０．０６」となる。 Next, the determination unit 1003 calculates the sum S of the exponentials of (first probability − C) over FAQs 1 to 5. Here, the sum S of the exponentials is "30.06".

つぎに、決定部１００３は、以下のように、各ＦＡＱ１～５について、（第１の確率－最大値Ｃ）の値の指数を、指数の合計値Ｓで割った値を算出する。 Next, the determination unit 1003 calculates, for each of FAQs 1 to 5, the exponential of (first probability − C) divided by the sum S of the exponentials, as follows.

Ｐ（ＦＡＱ１）＝３．９３／３０．０６＝０．１３
Ｐ（ＦＡＱ２）＝１．０／３０．０６＝０．０３
Ｐ（ＦＡＱ３）＝２．１６／３０．０６＝０．０７
Ｐ（ＦＡＱ４）＝２０．０９／３０．０６＝０．６７
Ｐ（ＦＡＱ５）＝３．４２／３０．０６＝０．１１
P (FAQ1) = 3.93 / 30.06 = 0.13
P (FAQ2) = 1.0 / 30.06 = 0.03
P (FAQ3) = 2.16 / 30.06 = 0.07
P (FAQ4) = 20.09 / 30.06 = 0.67
P (FAQ5) = 3.42 / 30.06 = 0.11

つぎに、決定部１００３は、以下のように、各ＦＡＱ１～５について、算出した値に１００を掛けることにより、第１の確率を０～１００［％］の表記に変換する。 Next, the determination unit 1003 converts the first probability into the notation of 0 to 100 [%] by multiplying the calculated value for each FAQ 1 to 5 by 100 as follows.

Ｐ（ＦＡＱ１）＝１３［％］
Ｐ（ＦＡＱ２）＝３［％］
Ｐ（ＦＡＱ３）＝７［％］
Ｐ（ＦＡＱ４）＝６７［％］
Ｐ（ＦＡＱ５）＝１１［％］
P (FAQ1) = 13 [%]
P (FAQ2) = 3 [%]
P (FAQ3) = 7 [%]
P (FAQ4) = 67 [%]
P (FAQ5) = 11 [%]
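
The normalization steps above (subtract a reference value, exponentiate, divide by the sum, multiply by 100) can be sketched in Python as follows. The function name `to_percent` and the dictionary layout are assumptions. The sketch follows the text in using the value with the largest absolute value as the reference C; subtracting any constant before exponentiating leaves the normalized result unchanged, so the choice of C only affects the numerical range. Because the inputs are rounded log values, the output may differ by a point or two from the figures listed above.

```python
import math
from typing import Dict


def to_percent(log_scores: Dict[str, float]) -> Dict[str, float]:
    """Convert log-scale scores into percentages that sum to 100."""
    c = max(log_scores.values(), key=abs)  # reference value C (-5.20 in the text)
    exps = {k: math.exp(v - c) for k, v in log_scores.items()}
    s = sum(exps.values())                 # sum S of the exponentials
    return {k: 100.0 * e / s for k, e in exps.items()}


percent = to_percent({"FAQ1": -3.83, "FAQ2": -5.20, "FAQ3": -4.43,
                      "FAQ4": -2.20, "FAQ5": -3.87})
```

In this example FAQ4 receives by far the largest share, which matches the ordering in the worked example.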

このようにして、各ＦＡＱ１～５について、検索ワード「キーワード１，キーワード２」に対して各ＦＡＱ１～５が表示された際に、各ＦＡＱ１～５が選択（照会）される第１の確率を算出することができる。なお、１回も選択（照会）されたことがないＦＡＱについては、第１の確率は「０［％］」とする。 In this way, for each of FAQs 1 to 5, the first probability that the FAQ is selected (inquired) when FAQs 1 to 5 are displayed for the search word "keyword 1, keyword 2" can be calculated. For an FAQ that has never been selected (inquired), the first probability is "0 [%]".

また、ここでは、第１の確率を例に挙げて説明したが、第２および第３の確率についても、同様にして求めることができる。 Further, although the first probability has been described here as an example, the second and third probabilities can be obtained in the same manner.

例えば、第２の確率の場合、決定部１００３は、教師データ（わるいこ）ＤＢ２５０を参照して、分析ＤＢ１５００を作成する。この場合、分析ＤＢ１５００は、縦軸に「ＦＡＱ番号」、横軸に「キーワード」を設定し、キーワードごとに各ＦＡＱが照会されなかった非照会回数を集計したものである。ただし、ＦＡＱ番号は、検索部１００２によって検索された複数のＦＡＱそれぞれのＦＡＱ番号である。キーワードは、教師データ（わるいこ）ＤＢ２５０内の教師データ（わるいこ）の形態素解析後検索ワードに含まれる形態素である。 For example, in the case of the second probability, the determination unit 1003 creates the analysis DB 1500 with reference to the teacher data (wariko) DB 250. In this case, in the analysis DB 1500, "FAQ number" is set on the vertical axis and "keyword" is set on the horizontal axis, and the number of non-inquiries in which each FAQ is not inquired is totaled for each keyword. However, the FAQ number is the FAQ number of each of the plurality of FAQs searched by the search unit 1002. The keyword is a morpheme included in the search word after morphological analysis of the teacher data (wariko) in the teacher data (wariko) DB250.

具体的には、例えば、決定部１００３は、教師データ（わるいこ）ＤＢ２５０から未選択の教師データ（わるいこ）を選択する。ここで、選択された教師データ（わるいこ）の非照会ＦＡＱを「１」とし、形態素解析後検索ワードにキーワード１、キーワード２が含まれるとする。この場合、決定部１００３は、ＦＡＱ番号「１」のキーワード１の非照会回数をインクリメントする。また、決定部１００３は、ＦＡＱ番号「１」のキーワード２の非照会回数をインクリメントする。決定部１００３は、教師データ（よゐこ）ＤＢ２４０から選択されていない未選択の教師データ（よゐこ）がなくなるまで、上述した一連の処理を繰り返す。 Specifically, for example, the determination unit 1003 selects unselected teacher data (wariko) from the teacher data (wariko) DB 250. Here, it is assumed that the non-inquiry FAQ of the selected teacher data (wariko) is set to "1", and the search word after the morphological analysis includes the keyword 1 and the keyword 2. In this case, the determination unit 1003 increments the number of non-inquiries of the keyword 1 of the FAQ number "1". Further, the determination unit 1003 increments the number of non-inquiries of the keyword 2 of the FAQ number “1”. The determination unit 1003 repeats the above-mentioned series of processes until there is no unselected teacher data (Yoiko) selected from the teacher data (Yoiko) DB 240.

これにより、キーワードごとに各ＦＡＱの非照会回数が記憶された分析ＤＢ１５００を作成することができる。そして、決定部１００３は、例えば、分析ＤＢ１５００を参照して、上記式（１）を用いて、検索された複数のＦＡＱそれぞれについて、第２の確率を算出する。ただし、事後確率Ｐ（ＦＡＱ｜検索ワード）は、検索ワードを指定してＦＡＱを検索した場合、該当ＦＡＱが選択（照会）されない確率を示す。事前確率Ｐ（ＦＡＱ）は、該当ＦＡＱが選ばれない確率（割合）を示す。尤度Ｐ（検索ワード｜ＦＡＱ）は、ＦＡＱが決まったとき、検索ワードが作成される確率を示す。なお、以降の第２の確率の具体的な計算手順は、第１の確率と同様のため、詳細な説明は省略する。 As a result, it is possible to create an analysis DB 1500 in which the number of non-inquiries of each FAQ is stored for each keyword. Then, the determination unit 1003 calculates a second probability for each of the plurality of FAQs searched by using the above equation (1) with reference to, for example, the analysis DB 1500. However, the posterior probability P (FAQ | search word) indicates the probability that the corresponding FAQ is not selected (inquired) when the FAQ is searched by designating the search word. The prior probability P (FAQ) indicates the probability (ratio) that the corresponding FAQ is not selected. Likelihood P (search word | FAQ) indicates the probability that a search word will be created when the FAQ is determined. Since the specific calculation procedure of the second probability thereafter is the same as that of the first probability, detailed description thereof will be omitted.

また、第３の確率の場合、特定部１００７は、教師データ（まいご）ＤＢ２６０を参照して、分析ＤＢ１５００を作成する。この場合、分析ＤＢ１５００は、縦軸に「代替ワード」、横軸に「形態素」を設定し、形態素ごとに各代替ワードが入力された入力回数を集計したものである。ただし、代替ワードは、教師データ（まいご）ＤＢ２６０内の教師データ（まいご）の代替ワードである。すなわち、代替ワードは、０件ヒット検索ワードに後続して入力された他の検索ワードに相当する。形態素は、０件ヒット検索ワードを形態素解析して検出される形態素である。０件ヒット検索ワードから形態素が検出されなかった場合には、例えば、０件ヒット検索ワードそのものを、横軸の形態素としてもよい。 Further, in the case of the third probability, the specific unit 1007 creates the analysis DB 1500 with reference to the teacher data (maigo) DB 260. In this case, the analysis DB 1500 sets "alternative words" on the vertical axis and "morphemes" on the horizontal axis, and totals the number of times each alternative word is input for each morpheme. However, the alternative word is an alternative word for the teacher data (maigo) in the teacher data (maigo) DB 260. That is, the alternative word corresponds to another search word input after the 0 hit search word. The morpheme is a morpheme detected by morphological analysis of 0 hit search words. When the morpheme is not detected from the 0-hit search word, for example, the 0-hit search word itself may be used as the morpheme on the horizontal axis.

具体的には、例えば、特定部１００７は、教師データ（まいご）ＤＢ２６０から未選択の教師データ（まいご）を選択する。そして、特定部１００７は、選択した教師データ（まいご）の検索ワードに、分析ＤＢ１５００内のいずれかの形態素が含まれる場合、その形態素について、選択した教師データ（まいご）の代替ワードの入力回数をインクリメントする。特定部１００７は、教師データ（まいご）ＤＢ２６０から選択されていない未選択の教師データ（まいご）がなくなるまで、上述した一連の処理を繰り返す。 Specifically, for example, the specifying unit 1007 selects unselected teacher data (maigo) from the teacher data (maigo) DB 260. Then, when the search word of the selected teacher data (maigo) contains any of the morphemes in the analysis DB 1500, the specifying unit 1007 increments, for that morpheme, the input count of the alternative word of the selected teacher data (maigo). The specifying unit 1007 repeats this series of processes until no unselected teacher data (maigo) remains in the teacher data (maigo) DB 260.

これにより、形態素ごとに各代替ワードが入力された入力回数が記憶された分析ＤＢ１５００を作成することができる。そして、特定部１００７は、例えば、分析ＤＢ１５００を参照して、上記式（１）を用いて、分析ＤＢ１５００内の代替ワードそれぞれについて、第３の確率を算出する。 This makes it possible to create an analysis DB 1500 that stores, for each morpheme, the number of times each alternative word was input. Then, the specifying unit 1007 refers to, for example, the analysis DB 1500 and uses the above equation (1) to calculate the third probability for each of the alternative words in the analysis DB 1500.

ただし、事後確率は、Ｐ（代替ワード｜検索ワード）となる。事前確率は、Ｐ（代替ワード）となる。尤度は、Ｐ（検索ワード｜代替ワード）となる。事後確率Ｐ（代替ワード｜検索ワード）は、検索ワードに後続して代替ワードが入力される確率を示す。事前確率Ｐ（代替ワード）は、該当代替ワードがある確率（割合）を示す。尤度Ｐ（検索ワード｜代替ワード）は、代替ワードが決まったとき、検索ワードが作成される確率を示す。なお、以降の第３の確率の具体的な計算手順は、第１の確率と同様のため、詳細な説明は省略する。 However, the posterior probability is P (alternative word | search word), the prior probability is P (alternative word), and the likelihood is P (search word | alternative word). The posterior probability P (alternative word | search word) indicates the probability that the alternative word is input following the search word. The prior probability P (alternative word) indicates the probability (ratio) that the alternative word occurs. The likelihood P (search word | alternative word) indicates the probability that the search word is produced given the alternative word. Since the subsequent calculation procedure for the third probability is the same as that for the first probability, a detailed description is omitted.

（ＦＡＱの決定例） (Example of determining FAQs)
つぎに、図１６を用いて、検索ワードに対するＦＡＱの決定例について説明する。 Next, an example of determining the FAQs for a search word will be described with reference to FIG. 16.

図１６は、ＦＡＱの決定例を示す説明図である。図１６において、第１のテーブル１６０１は、検索部１００２によって検索された複数のＦＡＱそれぞれについて算出された第１の確率を記憶する。第２のテーブル１６０２は、検索部１００２によって検索された複数のＦＡＱそれぞれについて算出された第２の確率を記憶する。 FIG. 16 is an explanatory diagram showing an example of determining FAQ. In FIG. 16, the first table 1601 stores the first probabilities calculated for each of the plurality of FAQs searched by the search unit 1002. The second table 1602 stores the second probabilities calculated for each of the plurality of FAQs searched by the search unit 1002.

ここでは、検索ワードに対応するＦＡＱとして、ＦＡＱ番号が「００１～０１０」の１０件のＦＡＱが検索された場合を想定する。また、閾値αを「α＝８０」とする。また、ＦＡＱ番号が若いほど、既存の検索アルゴリズムで検索された際の優先度が高いものとする。例えば、既存の検索アルゴリズムでは、過去のアクセス数（照会数）が多いＦＡＱほど高い優先度が設定される。 Here, it is assumed that 10 FAQs with FAQ numbers "001 to 010" are retrieved as the FAQs corresponding to the search word. The threshold α is set to "α = 80". It is also assumed that the smaller the FAQ number, the higher the priority assigned by the existing search algorithm. For example, the existing search algorithm assigns a higher priority to FAQs with a larger number of past accesses (inquiries).

この場合、決定部１００３は、例えば、第２のテーブル１６０２を参照して、第２の確率が閾値α以上のＦＡＱを特定する。ここでは、ＦＡＱ番号「００３，００５，００６，００８」のＦＡＱが特定される。つぎに、決定部１００３は、第１のテーブル１６０１から、特定したＦＡＱ番号「００３，００５，００６，００８」のＦＡＱのレコードを削除する。 In this case, the determination unit 1003 identifies the FAQ whose second probability is equal to or greater than the threshold value α with reference to, for example, the second table 1602. Here, the FAQ of the FAQ number "003,005,006,008" is specified. Next, the determination unit 1003 deletes the FAQ record of the specified FAQ number "003,005,006,008" from the first table 1601.

そして、決定部１００３は、第１のテーブル１６０１を参照して、第１の確率が高い順にソートした残余のＦＡＱ、すなわち、ＦＡＱ番号「００１，００２，００７，００４，００９，０１０」のＦＡＱを、検索ワードに対する１または複数のＦＡＱに決定する。 Then, the determination unit 1003 refers to the first table 1601 and determines the remaining FAQs sorted in descending order of the first probability, that is, the FAQs with FAQ numbers "001, 002, 007, 004, 009, 010", as the one or more FAQs for the search word.

この場合、出力制御部１００４は、端末２０１に対して、決定されたＦＡＱ番号「００１，００２，００７，００４，００９，０１０」のＦＡＱを送信する。ここで、ＦＡＱ画面９００において、上位５件のみ１ページ目に表示される場合（Ｎ＝５）、ＦＡＱ番号「００１，００２，００７，００４，００９」のＦＡＱは、この順番で１ページ目に表示され、ＦＡＱ番号「０１０」のＦＡＱは２ページ目に表示される。 In this case, the output control unit 1004 transmits the FAQ of the determined FAQ number "001,002,007,004,009,010" to the terminal 201. Here, in the FAQ screen 900, when only the top five items are displayed on the first page (N = 5), the FAQ of the FAQ number "001,002,007,004,009" is displayed on the first page in this order. The FAQ with the FAQ number "010" is displayed on the second page.

例えば、過去のアクセス数（照会数）が多い順にＦＡＱを表示する場合に比べると、第１および第２の確率を用いてＦＡＱが絞り込まれた結果、ＦＡＱ番号「００３，００５，００６，００８」のＦＡＱが除外され、ＦＡＱ番号「００７，００９，０１０」のＦＡＱが繰り上がって１ページ目に表示される。 For example, compared with displaying FAQs in descending order of the number of past accesses (inquiries), narrowing down the FAQs using the first and second probabilities excludes the FAQs with FAQ numbers "003, 005, 006, 008", and the FAQs with FAQ numbers "007, 009, 010" move up and are displayed on the first page.
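
The decision procedure of this example (drop FAQs whose second probability is at least α, then sort the rest by first probability) can be sketched in Python as follows. The function name `decide_faqs` is hypothetical, and the probability values below are invented for illustration because the contents of the tables 1601 and 1602 in FIG. 16 are not reproduced in the text.

```python
from typing import Dict, List


def decide_faqs(first: Dict[str, float], second: Dict[str, float],
                alpha: float = 80.0) -> List[str]:
    """Exclude FAQs with second probability >= alpha, sort the rest by first probability."""
    kept = [faq for faq in first if second.get(faq, 0.0) < alpha]
    return sorted(kept, key=lambda faq: first[faq], reverse=True)


# Invented probabilities [%], chosen only so that the result matches the order in the text.
first = {"001": 90, "002": 85, "003": 80, "004": 60, "005": 55,
         "006": 50, "007": 70, "008": 40, "009": 30, "010": 20}
second = {"001": 10, "002": 10, "003": 85, "004": 10, "005": 90,
          "006": 80, "007": 10, "008": 95, "009": 10, "010": 10}
ranking = decide_faqs(first, second)
first_page = ranking[:5]  # N = 5 FAQs fit on the first page
```

With these values, `ranking` is "001, 002, 007, 004, 009, 010", and the first five entries form the first page.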

（閾値αの調整例） (Example of adjusting the threshold α)
つぎに、図１７および図１８を用いて、閾値αの調整例について説明する。ここでは、上位５件のＦＡＱが１ページ目に表示されるＦＡＱである場合を想定する（Ｎ＝５）。 Next, an example of adjusting the threshold α will be described with reference to FIGS. 17 and 18. Here, it is assumed that the top five FAQs are the FAQs displayed on the first page (N = 5).

図１７は、アクセスログの具体例を示す説明図（その３）である。図１７において、アクセスログ１７０１～１７０５は、時系列にソートされたセッション番号が同一のアクセスログの集合である。以下、アクセスログ１７０１～１７０５から、１ページ目ヒット率を算出する場合について説明する。 FIG. 17 is an explanatory diagram (No. 3) showing a specific example of the access log. In FIG. 17, access logs 1701 to 1705 are a set of access logs having the same session number sorted in chronological order. Hereinafter, a case where the hit rate of the first page is calculated from the access logs 1701 to 1705 will be described.

具体的には、例えば、決定部１００３は、下記式（２）を用いて、アクセスログ１７０１～１７０５から１ページ目ヒット率を算出する。ただし、Ｘは、照会されたＦＡＱが検索結果の１ページ目に存在する回数である。Ｙは、ＦＡＱの照会回数である。また、１ページ目ヒット率の単位は［％］である。 Specifically, for example, the determination unit 1003 calculates the hit rate of the first page from the access logs 1701 to 1705 using the following formula (2). However, X is the number of times that the inquired FAQ exists on the first page of the search result. Y is the number of FAQ inquiries. The unit of the hit rate on the first page is [%].

First-page hit rate = X / Y × 100 ... (2)

More specifically, the determination unit 1003 refers to the access logs 1702 to 1705 of type "inquiry" and calculates the number of times X that an inquired FAQ was on the first page of the search results. Here, the ranks in the access logs 1702 and 1703 are "1" and "2", which are 5 or less. On the other hand, the ranks in the access logs 1704 and 1705 are "40" and "45", which are greater than 5. Therefore, X = 2.

また、決定部１００３は、タイプ「照会」のアクセスログ１７０２～１７０５を参照して、ＦＡＱの照会回数Ｙを算出する。ここでは、照会回数Ｙは、タイプ「照会」のアクセスログ１７０２～１７０５の数であり、「Ｙ＝４」である。このため、１ページ目ヒット率は、「５０［％］（＝２／４×１００）」となる。 Further, the determination unit 1003 calculates the number of inquiry times Y of the FAQ with reference to the access logs 1702 to 1705 of the type "inquiry". Here, the number of inquiries Y is the number of access logs 1702 to 1705 of the type "inquiry", and "Y = 4". Therefore, the hit rate on the first page is "50 [%] (= 2/4 x 100)".
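As a non-limiting illustration, the calculation of equation (2) can be sketched as follows. The `AccessLog` class and its field names are assumptions for this sketch, as is the treatment of access log 1701 as a log of type "search"; only the ranks and the result of 50 [%] come from the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessLog:
    log_type: str         # "search" or "inquiry" (hypothetical field name)
    rank: Optional[int]   # rank of the inquired FAQ in the search results


def first_page_hit_rate(logs, n=5):
    """Return X / Y * 100, or None when there is no inquiry log (Y = 0)."""
    inquiries = [log for log in logs if log.log_type == "inquiry"]
    y = len(inquiries)
    if y == 0:
        return None
    # X: inquiries whose FAQ was ranked within the first page (rank <= n).
    x = sum(1 for log in inquiries if log.rank is not None and log.rank <= n)
    return x / y * 100


# Access logs 1701 to 1705 of the example (1701 assumed to be the search log).
session = [
    AccessLog("search", None),
    AccessLog("inquiry", 1),
    AccessLog("inquiry", 2),
    AccessLog("inquiry", 40),
    AccessLog("inquiry", 45),
]
rate = first_page_hit_rate(session)
print(rate)
```

Returning `None` for a session without inquiries is a design choice of this sketch that avoids division by zero; the embodiment does not state how that case is handled.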

The determination unit 1003 calculates the first-page hit rate for each predetermined period, for example, every few weeks to several months. Next, the determination unit 1003 performs regression analysis on the transition of the first-page hit rate calculated for each period, and calculates the slope of the straight line representing the time-series change of the first-page hit rate. Then, the determination unit 1003 adjusts the threshold α according to the calculated slope.

図１８は、１ページ目ヒット率の時系列変化の一例を示す説明図である。図１８において、直線１８０１，１８０２は、１ページ目ヒット率の時系列変化を表す直線である。直線１８０１は、傾きが０より大きく、１ページ目ヒット率が上昇傾向にあることを示している。一方、直線１８０２は、傾きが０より小さく、１ページ目ヒット率が下降傾向にあることを示している。 FIG. 18 is an explanatory diagram showing an example of time-series changes in the hit rate on the first page. In FIG. 18, the straight lines 1801 and 1802 are straight lines representing the time-series changes in the hit rate of the first page. The straight line 1801 indicates that the slope is larger than 0 and the hit rate on the first page tends to increase. On the other hand, the straight line 1802 indicates that the slope is smaller than 0 and the hit rate on the first page tends to decrease.

例えば、決定部１００３は、１ページ目ヒット率の時系列変化を表す直線の傾きが０より大きい場合、１ページ目ヒット率が上昇傾向にあるため、閾値αを調整しない。また、決定部１００３は、１ページ目ヒット率の時系列変化を表す直線の傾きが０の場合、１ページ目ヒット率に変化がないため、閾値αを調整しない。 For example, when the slope of the straight line representing the time-series change of the first page hit rate is larger than 0, the determination unit 1003 does not adjust the threshold value α because the first page hit rate tends to increase. Further, when the slope of the straight line representing the time-series change of the hit rate of the first page is 0, the determination unit 1003 does not adjust the threshold value α because there is no change in the hit rate of the first page.

一方、決定部１００３は、１ページ目ヒット率の時系列変化を表す直線の傾きが０より小さい場合、１ページ目ヒット率が下降傾向にあるため、閾値αを調整する。具体的には、例えば、決定部１００３は、１ページ目ヒット率の時系列変化を表す直線の傾きが０より小さい場合、閾値αを１［％］下げる。 On the other hand, when the slope of the straight line representing the time-series change of the hit rate of the first page is smaller than 0, the determination unit 1003 adjusts the threshold value α because the hit rate of the first page tends to decrease. Specifically, for example, when the slope of the straight line representing the time-series change of the hit rate of the first page is smaller than 0, the determination unit 1003 lowers the threshold value α by 1 [%].

For example, when the threshold α before adjustment is "80 [%]", the determination unit 1003 changes the threshold α to "79 [%]". When the first-page hit rate is on a downward trend, it is assumed that the FAQs on the first page are inquired less often and that learning of the teacher data (waruiko) progresses. Therefore, to make the teacher data (waruiko) take effect more readily, the threshold α is lowered so that more FAQs are removed.

Note that the determination unit 1003 may change the amount by which the threshold α is lowered according to the magnitude of the slope of the straight line representing the time-series change of the first-page hit rate. For example, when the slope is so strongly negative that the first-page hit rate can be judged to have dropped significantly, the determination unit 1003 may change the amount of decrease of the threshold α to about 5 [%].
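As a non-limiting illustration, the adjustment of the threshold α from the trend of the first-page hit rate can be sketched as follows. The slope is obtained here by an ordinary least-squares fit over equally spaced periods, which is one possible form of the regression analysis. The function names and the cutoff `steep_slope` at which the larger decrease applies are assumptions for this sketch; the embodiment gives only the decreases of 1 [%] and about 5 [%].

```python
def trend_slope(hit_rates):
    """Least-squares slope of hit rates observed at periods 0, 1, 2, ..."""
    n = len(hit_rates)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(hit_rates) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(hit_rates))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def adjust_alpha(alpha, hit_rates, step=1.0, big_step=5.0, steep_slope=-5.0):
    """Lower alpha only when the first-page hit rate is on a downward trend."""
    slope = trend_slope(hit_rates)
    if slope >= 0:
        return alpha  # rising or flat: alpha is not adjusted
    # steep_slope is a hypothetical cutoff for a "significant" drop.
    return alpha - (big_step if slope <= steep_slope else step)


print(adjust_alpha(80.0, [50, 48, 47, 45]))  # gentle decline
print(adjust_alpha(80.0, [45, 47, 50]))      # rising trend
print(adjust_alpha(80.0, [60, 50, 40]))      # steep decline
```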

(Estimation of morphemes)
Next, the process of estimating a morpheme will be described with reference to FIG. 19.

FIG. 19 is an explanatory diagram showing an example of a method for estimating a morpheme. When the judgment unit 1006 determines that there is no FAQ corresponding to the search word, the specifying unit 1007 performs machine learning based on the stored contents of the teacher data (maigo) DB 260 and identifies an alternative word that is input following the zero-hit search word. However, the alternative word cannot be identified until teacher data (maigo) has been accumulated.

For example, when a nonsensical search word is input due to a typing mistake or the like, no morpheme may be detected from the search word, resulting in zero hits. In this case, it is also highly likely that no teacher data (maigo) corresponding to the search word has been accumulated. That is, when a nonsensical search word is input due to a typing mistake or the like, the alternative word may not be identified, resulting in zero hits.

そこで、情報処理装置１０１は、代替ワードが特定されなかった場合に、当該検索ワードに対応する形態素を推定して、推定した形態素に対応するＦＡＱを検索することにしてもよい。 Therefore, when the alternative word is not specified, the information processing apparatus 101 may estimate the morpheme corresponding to the search word and search for the FAQ corresponding to the estimated morpheme.

Specifically, for example, the generation unit 1005 first generates teacher data (morpheme). More specifically, for example, the generation unit 1005 extracts morphemes from the morphologically analyzed search words of the teacher data (yoiko) in the teacher data (yoiko) DB 240. Then, the generation unit 1005 generates teacher data (morpheme) in which each extracted morpheme is associated with the characters included in that morpheme.

As an example, when the morpheme is "FILE", teacher data (morpheme) 1901 is generated in which the morpheme "FILE" is associated with the elements (characters) "F", "I", "L", "E" included in the morpheme "FILE". The generated teacher data (morpheme) is stored in the teacher data (morpheme) DB 1900.

Further, the generation unit 1005 may, for example, extract morphemes from the morphologically analyzed search words of the teacher data (waruiko) in the teacher data (waruiko) DB 250, and generate teacher data (morpheme) in which each extracted morpheme is associated with the characters included in that morpheme. The generation unit 1005 may also, for example, extract morphemes from the alternative words and search words of the teacher data (maigo) in the teacher data (maigo) DB 260, and generate teacher data (morpheme) in which each extracted morpheme is associated with the characters included in that morpheme.
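As a non-limiting illustration, the generation of teacher data (morpheme) can be sketched as follows. The function name and the representation of one record as a pair of a morpheme and its list of characters are assumptions for this sketch.

```python
def make_morpheme_teacher_data(morphemes):
    """Associate each extracted morpheme with the characters it contains."""
    return [(morpheme, list(morpheme)) for morpheme in morphemes if morpheme]


# The morpheme "FILE" yields a record like teacher data (morpheme) 1901.
records = make_morpheme_teacher_data(["FILE"])
print(records)
```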

つぎに、特定部１００７は、検索ワードから形態素が検出されなかった場合、教師データ（形態素）ＤＢ１９００の記憶内容に基づく機械学習を行って、検索ワードに対応する形態素を特定する。ただし、検索ワードから形態素が検出されなかった場合であっても、ＦＡＱマスタ２２０からＦＡＱが検索されることもある。 Next, when the morpheme is not detected from the search word, the specifying unit 1007 performs machine learning based on the stored contents of the teacher data (morpheme) DB1900 to specify the morpheme corresponding to the search word. However, even if the morpheme is not detected from the search word, the FAQ may be searched from the FAQ master 220.

That is, even when no morpheme is detected from the search word, an FAQ corresponding to the search word can sometimes be output to the terminal 201. For this reason, the specifying unit 1007 may identify the morpheme corresponding to the search word when, for example, no morpheme is detected from the search word and it is determined that there is no FAQ corresponding to the search word.

More specifically, for example, the specifying unit 1007 refers to the teacher data (morpheme) DB 1900, performs machine learning with a naive Bayes classifier, and calculates a fourth probability for each of a plurality of morphemes. The fourth probability indicates how similar the morpheme is to the search word from which no morpheme was detected.

Note that the fourth probability can be obtained in the same manner as the first probability. First, the specifying unit 1007 creates the analysis DB 1500 by referring to the teacher data (morpheme) DB 1900. In this case, the analysis DB 1500 has "morpheme" on the vertical axis and "character" on the horizontal axis, and tallies the number of appearances of each character. Here, the morphemes are the morphemes of the teacher data (morpheme) in the teacher data (morpheme) DB 1900, and the characters are the characters included in those morphemes.

Specifically, for example, the specifying unit 1007 selects teacher data (morpheme) that has not yet been selected from the teacher data (morpheme) DB 1900. Here, assume that the morpheme of the selected teacher data (morpheme) is "FILE" and its elements are "F", "I", "L", "E". In this case, the specifying unit 1007 increments the number of appearances of each of the elements "F", "I", "L", "E" of the morpheme "FILE" in the analysis DB 1500. The specifying unit 1007 repeats this series of processes until, for example, no unselected teacher data (morpheme) remains in the teacher data (morpheme) DB 1900.

This makes it possible to create an analysis DB 1500 that stores the number of appearances of each element for each morpheme. Then, the determination unit 1003, for example, refers to the analysis DB 1500 and uses the above equation (1) to calculate the fourth probability for each of the plurality of morphemes. Here, the posterior probability is P(morpheme | character), the prior probability is P(morpheme), and the likelihood is P(character | morpheme). The posterior probability P(morpheme | character) indicates the probability that a morpheme containing the character is input. The prior probability P(morpheme) indicates the proportion that the character occupies among all elements included in the morpheme (vertical axis). The likelihood P(character | morpheme) indicates the probability that the character is included once the morpheme is determined. Since the subsequent calculation procedure for the fourth probability is the same as that for the first probability, a detailed description is omitted.

Then, the specifying unit 1007 identifies the morpheme corresponding to the search word based on the fourth probability calculated for each of the plurality of morphemes in the analysis DB 1500. Specifically, for example, the specifying unit 1007 identifies the morpheme with the largest fourth probability as the morpheme corresponding to the search word. This makes it possible to estimate the morpheme corresponding to a search word from which no morpheme was detected by morphological analysis.
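As a non-limiting illustration, the tallying of characters per morpheme and the selection of the morpheme with the largest score can be sketched as follows. The sketch uses a standard multinomial naive Bayes formulation with add-one smoothing, computed in log space; it is not necessarily identical to equation (1) of the embodiment. The function names and the sample morphemes "MAIL" and "PRINT" are assumptions for this sketch.

```python
import math
from collections import Counter, defaultdict


def build_analysis_db(teacher_data):
    """Tally, per morpheme, how often each character appears."""
    db = defaultdict(Counter)
    for morpheme, chars in teacher_data:
        db[morpheme].update(chars)
    return db


def estimate_morpheme(db, search_word):
    """Return the morpheme with the highest naive Bayes score, or None."""
    vocab = {ch for counts in db.values() for ch in counts}
    total = sum(sum(counts.values()) for counts in db.values())
    best, best_score = None, -math.inf
    for morpheme, counts in db.items():
        n = sum(counts.values())
        score = math.log(n / total)  # prior of the morpheme
        for ch in search_word:
            # Add-one smoothing keeps unseen characters from zeroing the score.
            score += math.log((counts[ch] + 1) / (n + len(vocab)))
        if score > best_score:
            best, best_score = morpheme, score
    return best


teacher_data = [("FILE", list("FILE")), ("MAIL", list("MAIL")),
                ("PRINT", list("PRINT"))]
db = build_analysis_db(teacher_data)
guess = estimate_morpheme(db, "FLIE")  # mistyped search word
print(guess)
```

Because the score depends only on which characters occur and not on their order, a transposition such as "FLIE" still scores highest for "FILE" in this sample.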

In this case, the specifying unit 1007, for example, performs machine learning based on the stored contents of the storage unit 1010 that stores the teacher data (maigo), and identifies an alternative word that is input following the identified morpheme. Then, the search unit 1002 searches for FAQs corresponding to the identified alternative word.

As a result, even when a nonsensical search word is input due to a typing mistake or the like, zero hits can be prevented. In the above description, the specifying unit 1007 identifies the alternative word that is input following the identified morpheme, and the search unit 1002 searches for FAQs corresponding to the identified alternative word, but the configuration is not limited to this. For example, the search unit 1002 may search for FAQs corresponding to the identified morpheme.

Note that each item of teacher data (yoiko) and teacher data (waruiko) consists of an FAQ number and a morphologically analyzed search word. Therefore, when no morpheme is detected from the search word, learning is not possible and the first and second probabilities cannot be calculated. That is, the FAQs cannot be narrowed down using the first and second probabilities. So that the FAQs can be narrowed down using the first and second probabilities even in such a case, the information processing apparatus 101 may estimate the morpheme corresponding to the search word and search for FAQs again.

(Teacher data generation processing procedure of the information processing apparatus 101)
Next, the teacher data generation processing procedure of the information processing apparatus 101 will be described with reference to FIGS. 20 and 21. The teacher data generation process of the information processing apparatus 101 may be executed periodically, for example, about every 24 hours, or may be executed at a predetermined timing specified by an administrator or the like of the answer output system 200.

図２０および図２１は、情報処理装置１０１の教師データ生成処理手順の一例を示すフローチャートである。図２０のフローチャートにおいて、まず、情報処理装置１０１は、アクセスログＤＢ２３０から選択されていない未選択のセッション番号を選択する（ステップＳ２００１）。 20 and 21 are flowcharts showing an example of the teacher data generation processing procedure of the information processing apparatus 101. In the flowchart of FIG. 20, first, the information processing apparatus 101 selects an unselected session number that has not been selected from the access log DB 230 (step S2001).

つぎに、情報処理装置１０１は、アクセスログＤＢ２３０から、選択したセッション番号のアクセスログを取得する（ステップＳ２００２）。そして、情報処理装置１０１は、取得したアクセスログの日時に基づいて、取得したアクセスログを時系列にソートする（ステップＳ２００３）。 Next, the information processing apparatus 101 acquires the access log of the selected session number from the access log DB 230 (step S2002). Then, the information processing apparatus 101 sorts the acquired access logs in chronological order based on the date and time of the acquired access logs (step S2003).

つぎに、情報処理装置１０１は、取得したアクセスログの中に、タイプ「照会」のアクセスログがあるか否かを判断する（ステップＳ２００４）。ここで、タイプ「照会」のアクセスログがない場合（ステップＳ２００４：Ｎｏ）、情報処理装置１０１は、図２１に示すステップＳ２１０１に移行する。 Next, the information processing apparatus 101 determines whether or not there is an access log of type "inquiry" in the acquired access log (step S2004). Here, if there is no access log of type "inquiry" (step S2004: No), the information processing apparatus 101 shifts to step S2101 shown in FIG.

一方、タイプ「照会」のアクセスログがある場合（ステップＳ２００４：Ｙｅｓ）、情報処理装置１０１は、当該アクセスログから特定される照会ＦＡＱが、１ページ目に表示されたＦＡＱであるか否かを判断する（ステップＳ２００５）。ここで、１ページ目に表示されたＦＡＱではない場合（ステップＳ２００５：Ｎｏ）、情報処理装置１０１は、図２１に示すステップＳ２１０１に移行する。 On the other hand, when there is an access log of type "inquiry" (step S2004: Yes), the information processing apparatus 101 determines whether or not the inquiry FAQ specified from the access log is the FAQ displayed on the first page. Determine (step S2005). Here, if the FAQ is not displayed on the first page (step S2005: No), the information processing apparatus 101 shifts to step S2101 shown in FIG.

On the other hand, when the FAQ was displayed on the first page (step S2005: Yes), the information processing apparatus 101 generates teacher data (yoiko) that associates the inquired FAQ identified from the access log of type "inquiry" with the morphologically analyzed search word (step S2006). Then, the information processing apparatus 101 registers the generated teacher data (yoiko) in the teacher data (yoiko) DB 240 (step S2007), and shifts to step S2101 shown in FIG. 21.

図２１のフローチャートにおいて、まず、情報処理装置１０１は、タイプ「検索」のアクセスログの検索リストを参照して、０件ヒットの検索ワードがあるか否かを判断する（ステップＳ２１０１）。ここで、０件ヒットの検索ワードがない場合（ステップＳ２１０１：Ｎｏ）、情報処理装置１０１は、ステップＳ２１０５に移行する。 In the flowchart of FIG. 21, first, the information processing apparatus 101 refers to the search list of the access log of the type "search" and determines whether or not there is a search word with 0 hits (step S2101). Here, when there is no search word for 0 hits (step S2101: No), the information processing apparatus 101 shifts to step S2105.

一方、０件ヒットの検索ワードがある場合（ステップＳ２１０１：Ｙｅｓ）、情報処理装置１０１は、取得したアクセスログの日時に基づいて、０件ヒットの検索ワードに後続して入力を受け付けた他の検索ワードがあるか否かを判断する（ステップＳ２１０２）。ただし、他の検索ワードは、当該他の検索ワードに対して１ページ目に表示されたいずれかのＦＡＱが照会（選択）された検索ワードである。 On the other hand, when there is a search word with 0 hits (step S2101: Yes), the information processing apparatus 101 accepts input following the search word with 0 hits based on the acquired date and time of the access log. It is determined whether or not there is a search word (step S2102). However, the other search word is a search word in which any FAQ displayed on the first page is queried (selected) for the other search word.

ここで、他の検索ワードがない場合（ステップＳ２１０２：Ｎｏ）、情報処理装置１０１は、ステップＳ２１０５に移行する。一方、他の検索ワードがある場合（ステップＳ２１０２：Ｙｅｓ）、情報処理装置１０１は、他の検索ワードを代替ワードとする教師データ（まいご）を生成する（ステップＳ２１０３）。 Here, when there is no other search word (step S2102: No), the information processing apparatus 101 shifts to step S2105. On the other hand, when there is another search word (step S2102: Yes), the information processing apparatus 101 generates teacher data (maigo) using the other search word as an alternative word (step S2103).

そして、情報処理装置１０１は、生成した教師データ（まいご）を、教師データ（まいご）ＤＢ２６０に登録する（ステップＳ２１０４）。つぎに、情報処理装置１０１は、タイプ「検索」のアクセスログの検索リストを参照して、検索ワードに対して表示された際に照会されなかったＦＡＱがあるか否かを判断する（ステップＳ２１０５）。 Then, the information processing apparatus 101 registers the generated teacher data (maigo) in the teacher data (maigo) DB 260 (step S2104). Next, the information processing apparatus 101 refers to the search list of the access log of the type "search" and determines whether or not there is an FAQ that was not queried when displayed for the search word (step S2105). ).

Here, when there is no FAQ that was not inquired (step S2105: No), the information processing apparatus 101 shifts to step S2108. On the other hand, when there is an FAQ that was not inquired (step S2105: Yes), the information processing apparatus 101 generates teacher data (waruiko) in which that FAQ is set as the non-inquired FAQ (step S2106).

Then, the information processing apparatus 101 registers the generated teacher data (waruiko) in the teacher data (waruiko) DB 250 (step S2107). Next, the information processing apparatus 101 determines whether or not there is an unselected session number that has not yet been selected from the access log DB 230 (step S2108).

ここで、未選択のセッション番号がある場合（ステップＳ２１０８：Ｙｅｓ）、情報処理装置１０１は、図２０に示したステップＳ２００１に戻る。一方、未選択のセッション番号がない場合（ステップＳ２１０８：Ｎｏ）、情報処理装置１０１は、本フローチャートによる一連の処理を終了する。 Here, if there is an unselected session number (step S2108: Yes), the information processing apparatus 101 returns to step S2001 shown in FIG. On the other hand, when there is no unselected session number (step S2108: No), the information processing apparatus 101 ends a series of processes according to this flowchart.

This makes it possible to generate teacher data (yoiko), teacher data (waruiko), and teacher data (maigo) from the past operation history (access logs) of users. Note that the stored contents of the access log DB 230 are reset, for example, each time the teacher data generation process of the information processing apparatus 101 is completed.
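As a non-limiting illustration, the processing of steps S2003 to S2107 for the access logs of one session can be sketched as follows. The `Log` class, its field names, and the tuple layout of each teacher data record are assumptions for this sketch. The sketch also assumes that only FAQs listed within the first page count as non-inquired FAQs, and it treats a search log with an empty result list as a zero-hit search.

```python
from dataclasses import dataclass, field


@dataclass
class Log:
    time: int      # timestamp used for chronological sorting
    kind: str      # "search" or "inquiry"
    word: str      # search word after morphological analysis
    faq: str = ""  # inquired FAQ number (inquiry logs only)
    rank: int = 0  # rank of the inquired FAQ (inquiry logs only)
    results: list = field(default_factory=list)  # listed FAQs (search logs only)


def generate_teacher_data(session_logs, n=5):
    logs = sorted(session_logs, key=lambda log: log.time)  # S2003
    searches = [log for log in logs if log.kind == "search"]
    inquiries = [log for log in logs if log.kind == "inquiry"]

    # S2004 to S2007: inquired FAQs that were shown on the first page.
    first_page_hits = [log for log in inquiries if 1 <= log.rank <= n]
    yoiko = [(log.faq, log.word) for log in first_page_hits]

    # S2101 to S2104: a zero-hit word followed by a word that led to a hit.
    hit_words = {log.word for log in first_page_hits}
    maigo = []
    for i, search in enumerate(searches):
        if search.results:
            continue
        later = next((s.word for s in searches[i + 1:] if s.word in hit_words),
                     None)
        if later is not None:
            maigo.append((search.word, later))

    # S2105 to S2107: first-page FAQs that were displayed but not inquired.
    inquired = {(log.word, log.faq) for log in inquiries}
    waruiko = [(faq, s.word) for s in searches for faq in s.results[:n]
               if (s.word, faq) not in inquired]
    return yoiko, waruiko, maigo


session = [
    Log(3, "inquiry", "print", faq="002", rank=2),
    Log(1, "search", "prnt"),
    Log(2, "search", "print", results=["001", "002", "003"]),
]
yoiko, waruiko, maigo = generate_teacher_data(session)
print(yoiko, waruiko, maigo)
```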

(Answer output processing procedure of the information processing apparatus 101)
Next, the answer output processing procedure of the information processing apparatus 101 will be described with reference to FIGS. 22 and 23.

FIGS. 22 and 23 are flowcharts showing an example of the answer output processing procedure of the information processing apparatus 101. In the flowchart of FIG. 22, first, the information processing apparatus 101 determines whether or not input of a search word has been accepted (step S2201). Here, the information processing apparatus 101 waits until input of a search word is accepted (step S2201: No).

Then, when input of a search word has been accepted (step S2201: Yes), the information processing apparatus 101 performs morphological analysis on the search word (step S2202). Next, the information processing apparatus 101 searches the FAQ master 220 for FAQs based on the result of the morphological analysis of the search word (step S2203).

Next, the information processing apparatus 101 refers to the teacher data (yoiko) DB 240, performs machine learning with a naive Bayes classifier, and calculates the first probability for each of the plurality of retrieved FAQs (step S2204). Next, the information processing apparatus 101 refers to the teacher data (waruiko) DB 250, performs machine learning with a naive Bayes classifier, and calculates the second probability for each of the plurality of retrieved FAQs (step S2205).

Then, based on the first probability and the second probability calculated for each of the plurality of retrieved FAQs, the information processing apparatus 101 determines one or more FAQs for the search word from among the plurality of retrieved FAQs (step S2206).

Next, the information processing apparatus 101 determines whether or not there is an FAQ corresponding to the search word (step S2207). Specifically, for example, the information processing apparatus 101 determines that there is no FAQ corresponding to the search word when no FAQ was retrieved in step S2203 or when no FAQ was determined in step S2206.

Here, when there is an FAQ corresponding to the search word (step S2207: Yes), the information processing apparatus 101 outputs the determined one or more FAQs for the search word (step S2208), and ends the series of processes according to this flowchart. On the other hand, when there is no FAQ corresponding to the search word (step S2207: No), the information processing apparatus 101 shifts to step S2301 shown in FIG. 23.

In the flowchart of FIG. 23, first, the information processing apparatus 101 refers to the teacher data (maigo) DB 260, performs machine learning with a naive Bayes classifier, and calculates, for each of a plurality of alternative words, a third probability that the alternative word is input following the search word (step S2301).

そして、情報処理装置１０１は、複数の代替ワードそれぞれについて算出した第３の確率に基づいて、検索ワードに後続して入力される代替ワードを特定する（ステップＳ２３０２）。つぎに、情報処理装置１０１は、代替ワードが特定されたか否かを判断する（ステップＳ２３０３）。例えば、図２２に示したステップＳ２２０２において、検索ワードから形態素が検出されなかった場合に、代替ワードは特定されないことがある。 Then, the information processing apparatus 101 identifies the alternative word to be input after the search word based on the third probability calculated for each of the plurality of alternative words (step S2302). Next, the information processing apparatus 101 determines whether or not the alternative word has been specified (step S2303). For example, in step S2202 shown in FIG. 22, when the morpheme is not detected from the search word, the alternative word may not be specified.

ここで、代替ワードが特定された場合（ステップＳ２３０３：Ｙｅｓ）、情報処理装置１０１は、図２２に示したステップＳ２２０２に戻る。この場合、情報処理装置１０１は、特定した代替ワードの形態素解析を実施する。 Here, when the alternative word is specified (step S2303: Yes), the information processing apparatus 101 returns to step S2202 shown in FIG. In this case, the information processing apparatus 101 performs morphological analysis of the specified alternative word.

一方、代替ワードが特定されなかった場合（ステップＳ２３０３：Ｎｏ）、情報処理装置１０１は、教師データ（形態素）ＤＢ１９００を参照して、ナイーブベイズ分類器による機械学習を行って、複数の形態素それぞれについて、第４の確率を算出する（ステップＳ２３０４）。第４の確率は、検索ワードに類似する度合いの高さを示す。 On the other hand, when the alternative word is not specified (step S2303: No), the information processing apparatus 101 refers to the teacher data (morpheme) DB1900 and performs machine learning by the naive Bayes classifier for each of the plurality of morphemes. , The fourth probability is calculated (step S2304). The fourth probability indicates a high degree of similarity to the search word.

そして、情報処理装置１０１は、複数の形態素それぞれについて算出した第４の確率に基づいて、検索ワードに対応する形態素を特定して（ステップＳ２３０５）、図２２に示したステップＳ２２０３に戻る。この場合、情報処理装置１０１は、特定した形態素に基づいて、ＦＡＱマスタ２２０からＦＡＱを検索する。 Then, the information processing apparatus 101 identifies the morpheme corresponding to the search word based on the fourth probability calculated for each of the plurality of morphemes (step S2305), and returns to step S2203 shown in FIG. 22. In this case, the information processing apparatus 101 searches the FAQ from the FAQ master 220 based on the specified morpheme.

なお、ステップＳ２２０３において、ＦＡＱが１件も検索されなかった場合、情報処理装置１０１は、ステップＳ２３０４に移行することにしてもよい。また、ステップＳ２３０５からステップＳ２２０３への移行は、２回以上行わないこととする。このため、移行後のステップＳ２２０７において、検索ワードに対応するＦＡＱがない場合には（ステップＳ２２０７：Ｎｏ）、情報処理装置１０１は、例えば、検索結果が０件である旨の情報を出力する。 If no FAQ is searched in step S2203, the information processing apparatus 101 may shift to step S2304. Further, the transition from step S2305 to step S2203 shall not be performed more than once. Therefore, in step S2207 after the transition, if there is no FAQ corresponding to the search word (step S2207: No), the information processing apparatus 101 outputs, for example, information that the search result is 0.
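As a non-limiting illustration, the overall control flow of FIGS. 22 and 23 can be sketched as follows. All function names and the callables passed in (`analyse`, `search`, `narrow`, `guess_alternative`, `guess_morpheme`) are hypothetical stand-ins for the corresponding steps. As a simplification made for this sketch, each fallback is tried at most once, and a failed retry with an alternative word ends with zero results.

```python
def output_answer(word, analyse, search, narrow, guess_alternative,
                  guess_morpheme):
    """Return the FAQs for word, falling back as in FIGS. 22 and 23."""
    def lookup(morphemes):
        # S2203 to S2206: search the FAQ master, then narrow the result.
        return narrow(search(morphemes))

    faqs = lookup(analyse(word))       # S2202 to S2206
    if faqs:                           # S2207: Yes
        return faqs                    # S2208

    alternative = guess_alternative(word)  # S2301 to S2303
    if alternative is not None:
        return lookup(analyse(alternative))  # back to S2202 with the new word

    morpheme = guess_morpheme(word)    # S2304 to S2305
    if morpheme is None:
        return []                      # zero search results
    return lookup([morpheme])          # back to S2203, only once


# Hypothetical stand-ins for the FAQ master and the individual steps.
faq_index = {"print": ["001", "002"], "file": ["003"]}

def analyse(word):
    return [word] if word in faq_index else []

def search(morphemes):
    return [faq for m in morphemes for faq in faq_index.get(m, [])]

def narrow(faqs):
    return faqs[:5]

alternatives = {"prnt": "print"}
print(output_answer("prnt", analyse, search, narrow, alternatives.get,
                    lambda w: None))
```

Passing the steps in as callables keeps the sketch independent of any particular morphological analyzer or classifier.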

This makes it possible to optimize the FAQs displayed for a search word based on the past operation history (access logs) of users. In addition, even when there is no FAQ corresponding to the search word, an alternative word that replaces that search word, that is, the zero-hit search word, can be estimated from the past operation history of users and FAQs can be output.

As described above, the information processing apparatus 101 according to the embodiment can accept input of a search word and search for FAQs corresponding to the accepted search word. The information processing apparatus 101 can also perform machine learning based on the stored contents of the storage unit 1010, which stores the teacher data (yoiko) and the teacher data (waruiko), and determine one or more FAQs for the search word from among the plurality of retrieved FAQs. The teacher data (yoiko) is, for example, information that associates with a search word (question) an FAQ for which a selection operation was accepted when the FAQ was displayed on the first page for that search word. The teacher data (waruiko) is, for example, information that associates with a search word an FAQ for which no selection operation was accepted when the FAQ was displayed on the first page for that search word. The information processing apparatus 101 can then output the determined one or more FAQs.

これにより、過去のユーザの操作履歴（アクセスログ）をもとに、異なる２つの観点から検索ワードと関連性の強いＦＡＱ、および、検索ワードと関連性が弱いＦＡＱを特定して、問題解決につながるＦＡＱを精度よく絞り込むことができる。 As a result, based on past users' operation history (access log), FAQs that are strongly related to the search word and FAQs that are weakly related to the search word can be identified from two different viewpoints, and the FAQs that lead to problem resolution can be narrowed down accurately.
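
The split of first-page results into the two kinds of teacher data can be sketched as follows. This is only an illustration: the record layout (the keys `search_word`, `first_page`, and `selected`) and the function name `build_teacher_data` are assumptions made for the example and do not come from the embodiment.

```python
def build_teacher_data(access_log):
    """Split first-page FAQs into Yoiko (selected) and Waruiko (not selected) pairs.

    Each log record is assumed (hypothetically) to look like:
        {"search_word": str, "first_page": [faq_id, ...], "selected": {faq_id, ...}}
    """
    yoiko, waruiko = [], []
    for record in access_log:
        for faq_id in record["first_page"]:
            # An FAQ shown on the first page is a positive example if it was
            # selected for this search word, and a negative example otherwise.
            target = yoiko if faq_id in record["selected"] else waruiko
            target.append((record["search_word"], faq_id))
    return yoiko, waruiko


log = [
    {"search_word": "password reset", "first_page": ["FAQ1", "FAQ2"], "selected": {"FAQ1"}},
    {"search_word": "invoice", "first_page": ["FAQ3"], "selected": set()},
]
yoiko, waruiko = build_teacher_data(log)
```

Both lists hold (search word, FAQ) pairs, which mirrors the description of teacher data as FAQs associated with the search word for which they were displayed.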

また、情報処理装置１０１によれば、記憶部１０１０を参照して、ナイーブベイズ分類器による機械学習を行って、検索した複数のＦＡＱそれぞれについて、第１の確率と第２の確率とを算出することができる。第１の確率は、例えば、検索ワードに対して１ページ目に表示された際に選択される確率である。第２の確率は、例えば、検索ワードに対して１ページ目に表示された際に選択されない確率である。そして、情報処理装置１０１によれば、検索した複数のＦＡＱそれぞれについて算出した第１の確率と第２の確率とに基づいて、複数のＦＡＱの中から、受け付けた検索ワードに対する１または複数のＦＡＱを決定することができる。 Further, the information processing apparatus 101 can refer to the storage unit 1010, perform machine learning using a naive Bayes classifier, and calculate a first probability and a second probability for each of the plurality of retrieved FAQs. The first probability is, for example, the probability that the FAQ is selected when displayed on the first page for the search word. The second probability is, for example, the probability that the FAQ is not selected when displayed on the first page for the search word. The information processing apparatus 101 can then determine, from among the plurality of FAQs, one or more FAQs for the accepted search word based on the first probability and the second probability calculated for each of the retrieved FAQs.

これにより、検索ワードと関連性の強いＦＡＱ、および、検索ワードと関連性が弱いＦＡＱの特定精度を向上させることができる。また、ナイーブベイズ分類器による機械学習を利用することで、教師データ（よゐこ、わるいこ）の質と量により特定精度をコントロールすることができる。 This makes it possible to improve the accuracy of identifying FAQs that are strongly related to the search word and FAQs that are weakly related to the search word. In addition, by using machine learning with a naive Bayes classifier, the identification accuracy can be controlled through the quality and quantity of the teacher data (Yoiko, Waruiko).
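
One possible reading of the two probabilities is sketched below. It assumes two separate multinomial naive Bayes models with FAQ IDs as classes, one trained on the teacher data (Yoiko) and one on the teacher data (Waruiko), and it assumes that search words have already been split into tokens. The class name `FaqNaiveBayes`, the Laplace smoothing, and the normalization over the candidate set are illustrative choices, not details taken from the embodiment.

```python
import math
from collections import Counter, defaultdict


class FaqNaiveBayes:
    """Multinomial naive Bayes: classes are FAQ IDs, features are search-word tokens."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing               # Laplace smoothing constant (assumed)
        self.class_counts = Counter()            # training records per FAQ
        self.token_counts = defaultdict(Counter)  # token frequencies per FAQ
        self.vocab = set()

    def fit(self, records):
        """records: iterable of (tokens, faq_id) pairs."""
        for tokens, faq_id in records:
            self.class_counts[faq_id] += 1
            for token in tokens:
                self.token_counts[faq_id][token] += 1
                self.vocab.add(token)
        return self

    def predict_proba(self, tokens, candidates):
        """Return a probability per candidate FAQ, normalized over the candidates."""
        if not candidates:
            return {}
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocab) or 1
        log_scores = {}
        for faq_id in candidates:
            prior = (self.class_counts[faq_id] + self.smoothing) / (
                total + self.smoothing * len(candidates))
            denom = sum(self.token_counts[faq_id].values()) + self.smoothing * vocab_size
            score = math.log(prior)
            for token in tokens:
                score += math.log((self.token_counts[faq_id][token] + self.smoothing) / denom)
            log_scores[faq_id] = score
        # Subtract the maximum before exponentiating for numerical stability.
        peak = max(log_scores.values())
        weights = {faq_id: math.exp(s - peak) for faq_id, s in log_scores.items()}
        norm = sum(weights.values())
        return {faq_id: w / norm for faq_id, w in weights.items()}


yoiko = [(["password", "reset"], "FAQ1"), (["password", "reset"], "FAQ1"),
         (["password", "expired"], "FAQ2")]
waruiko = [(["password", "reset"], "FAQ3"), (["password", "reset"], "FAQ3"),
           (["password", "expired"], "FAQ1")]
candidates = ["FAQ1", "FAQ2", "FAQ3"]
first_prob = FaqNaiveBayes().fit(yoiko).predict_proba(["password", "reset"], candidates)
second_prob = FaqNaiveBayes().fit(waruiko).predict_proba(["password", "reset"], candidates)
```

Training the two models independently keeps the first and second probabilities from being simple complements of each other, which is what allows one to be used for ranking and the other for exclusion.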

また、情報処理装置１０１によれば、検索した複数のＦＡＱのうち第２の確率が閾値α以上のＦＡＱを除外したＦＡＱを、第１の確率が高い順にソートした１または複数のＦＡＱを、検索ワードに対する１または複数のＦＡＱに決定することができる。これにより、検索ワードと関連性の強いＦＡＱを検索上位に表示しつつ、検索ワードと関連性が弱いＦＡＱを表示対象から除外することができる。具体的には、１ページ目に表示された際に照会される確度の高いＦＡＱを選別しつつ、１ページ目に表示されても照会されない確度が高いＦＡＱを除外することができる。このため、例えば、１ページ目に頻繁に表示されるものの問題解決につながらないようなＦＡＱの表示を抑制することができる。また、例えば、過去はよく回答として照会されていたが、ＦＡＱの陳腐化が進み、最近は照会されなくなってきているＦＡＱの表示を抑制することができる。 Further, the information processing apparatus 101 can determine, as the one or more FAQs for the search word, the FAQs that remain after excluding from the plurality of retrieved FAQs those whose second probability is equal to or greater than a threshold α, sorted in descending order of the first probability. As a result, FAQs that are strongly related to the search word can be displayed at the top of the search results, while FAQs that are weakly related to the search word can be excluded from display. Specifically, FAQs that are likely to be viewed when displayed on the first page can be selected, while FAQs that are likely not to be viewed even when displayed on the first page can be excluded. This makes it possible, for example, to suppress the display of an FAQ that frequently appears on the first page but does not lead to problem resolution. It also makes it possible, for example, to suppress the display of an FAQ that used to be viewed often as an answer but has become outdated and is no longer viewed.
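
The exclusion and sorting step described above can be sketched in a few lines. The function name `decide_faqs`, the dictionary inputs, and the optional `top_n` cutoff are assumptions made for this illustration.

```python
def decide_faqs(first_prob, second_prob, alpha, top_n=None):
    """Drop FAQs whose second probability is >= alpha, then sort by first probability.

    first_prob / second_prob: dicts mapping FAQ ID to probability (hypothetical layout).
    """
    # An FAQ with no second probability is treated as 0.0, i.e. never excluded (assumption).
    kept = [faq_id for faq_id in first_prob if second_prob.get(faq_id, 0.0) < alpha]
    kept.sort(key=lambda faq_id: first_prob[faq_id], reverse=True)
    return kept if top_n is None else kept[:top_n]


ranked = decide_faqs(
    first_prob={"A": 0.5, "B": 0.3, "C": 0.2},
    second_prob={"A": 0.1, "B": 0.7, "C": 0.2},
    alpha=0.6,
)
```

In this example FAQ "B" is removed because its second probability (0.7) is at or above α = 0.6, even though its first probability would otherwise rank it second.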

また、情報処理装置１０１によれば、検索ワードに対する選択操作を受け付けたＦＡＱを示すアクセスログに基づいて、検索ワードに対して１ページ目に表示されるＦＡＱのいずれかが選択（照会）される確率を算出することができる。そして、情報処理装置１０１によれば、算出した確率の時系列変化に基づいて、閾値αを調整することができる。これにより、１ページ目（上位Ｎ件）に表示されたＦＡＱがどの程度照会されているかという運用中の照会実績をもとに、閾値αの大きさを最適化することができる。 Further, the information processing apparatus 101 can calculate, based on an access log indicating the FAQs for which a selection operation for a search word was received, the probability that any of the FAQs displayed on the first page for a search word is selected (viewed). The information processing apparatus 101 can then adjust the threshold α based on the time-series change in the calculated probability. As a result, the value of the threshold α can be optimized based on viewing results during operation, that is, how often the FAQs displayed on the first page (top N items) are actually viewed.
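
A minimal sketch of the threshold adjustment follows. The text states only that α is adjusted from the time-series change in the first-page selection probability, with the threshold lowered by a predetermined value on a downward trend. The trend test used here (a strict decrease over the last `window` periods), the step size, and all function names are assumptions for illustration.

```python
def first_page_selection_rate(searches):
    """searches: iterable of booleans, True if any first-page FAQ was selected."""
    searches = list(searches)
    return sum(searches) / len(searches) if searches else 0.0


def is_downward_trend(rates, window=3):
    """Assumed trend test: the last `window` rates are strictly decreasing."""
    recent = rates[-window:]
    return len(recent) == window and all(a > b for a, b in zip(recent, recent[1:]))


def adjust_alpha(alpha, rates, step=0.05, floor=0.0, window=3):
    """Lower alpha by `step` on a downward trend; otherwise leave it unchanged."""
    if is_downward_trend(rates, window):
        return max(floor, alpha - step)
    return alpha


weekly_rates = [0.5, 0.4, 0.3]           # hypothetical per-period selection rates
new_alpha = adjust_alpha(0.6, weekly_rates)
```

Lowering α makes the exclusion condition (second probability equal to or greater than α) easier to satisfy, so more weakly related FAQs are removed from the first page when the selection rate is falling.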

これらのことから、情報処理装置１０１によれば、ＦＡＱシステムやＦＡＱサイトにおける一次回答率や顧客満足度の向上を図ることができる。また、ＦＡＱを見直すことなく検索結果の最適化を図ることができるため、ＦＡＱシステムやＦＡＱサイトの運用コストを削減することができる。 From these facts, according to the information processing apparatus 101, it is possible to improve the primary response rate and customer satisfaction in the FAQ system and the FAQ site. In addition, since the search results can be optimized without reviewing the FAQ, the operating cost of the FAQ system and the FAQ site can be reduced.

なお、本実施の形態で説明した回答出力方法は、あらかじめ用意されたプログラムをパーソナル・コンピュータやワークステーション等のコンピュータで実行することにより実現することができる。本回答出力プログラムは、ハードディスク、フレキシブルディスク、ＣＤ（ＣｏｍｐａｃｔＤｉｓｃ）－ＲＯＭ、ＭＯ（Ｍａｇｎｅｔｏ－Ｏｐｔｉｃａｌｄｉｓｋ）、ＤＶＤ（ＤｉｇｉｔａｌＶｅｒｓａｔｉｌｅＤｉｓｋ）、ＵＳＢ（ＵｎｉｖｅｒｓａｌＳｅｒｉａｌＢｕｓ）メモリ等のコンピュータで読み取り可能な記録媒体に記録され、コンピュータによって記録媒体から読み出されることによって実行される。また、本回答出力プログラムは、インターネット等のネットワークを介して配布してもよい。 The answer output method described in this embodiment can be realized by executing a program prepared in advance on a computer such as a personal computer or a workstation. The answer output program is recorded on a computer-readable recording medium such as a hard disk, a flexible disk, a CD (Compact Disc)-ROM, an MO (Magneto-Optical disk), a DVD (Digital Versatile Disk), or a USB (Universal Serial Bus) memory, and is executed by being read from the recording medium by the computer. The answer output program may also be distributed via a network such as the Internet.

上述した実施の形態に関し、さらに以下の付記を開示する。 The following additional notes are further disclosed with respect to the above-described embodiment.

（付記１）質問の入力を受け付け、
受け付けた前記質問に対応する複数の回答候補を特定し、
質問に対する選択操作を受け付けた１または複数の第１の回答候補と、質問に対する選択操作を受け付けなかった１または複数の第２の回答候補とを記憶する記憶部の記憶内容に基づく機械学習を行って、特定した前記複数の回答候補の中から、受け付けた前記質問に対する１または複数の回答候補を決定し、
決定した前記１または複数の回答候補を出力する、
処理をコンピュータに実行させることを特徴とする回答出力プログラム。 (Appendix 1) An answer output program that causes a computer to execute a process comprising:
receiving input of a question;
identifying a plurality of answer candidates corresponding to the received question;
performing machine learning based on the stored contents of a storage unit that stores one or more first answer candidates for which a selection operation for a question was received and one or more second answer candidates for which no selection operation for a question was received, and determining, from among the identified plurality of answer candidates, one or more answer candidates for the received question; and
outputting the determined one or more answer candidates.

（付記２）前記決定する処理は、
前記記憶部を参照して、ナイーブベイズ分類器による機械学習を行って、特定した前記複数の回答候補それぞれについて、受け付けた前記質問に対して表示された際に選択される第１の確率と、受け付けた前記質問に対して表示された際に選択されない第２の確率とを算出し、
前記複数の回答候補それぞれについて算出した前記第１の確率と前記第２の確率とに基づいて、前記複数の回答候補の中から、受け付けた前記質問に対する１または複数の回答候補を決定する、ことを特徴とする付記１に記載の回答出力プログラム。 (Appendix 2) The answer output program according to Appendix 1, wherein the determining process:
refers to the storage unit and performs machine learning using a naive Bayes classifier to calculate, for each of the identified plurality of answer candidates, a first probability of being selected when displayed for the received question and a second probability of not being selected when displayed for the received question; and
determines, from among the plurality of answer candidates, one or more answer candidates for the received question based on the first probability and the second probability calculated for each of the plurality of answer candidates.

（付記３）前記決定する処理は、
前記複数の回答候補のうち前記第２の確率が閾値以上の回答候補を除外した回答候補を、前記第１の確率が高い順にソートした１または複数の回答候補を、受け付けた前記質問に対する１または複数の回答候補に決定する、ことを特徴とする付記２に記載の回答出力プログラム。 (Appendix 3) The answer output program according to Appendix 2, wherein the determining process
determines, as the one or more answer candidates for the received question, the answer candidates that remain after excluding from the plurality of answer candidates those whose second probability is equal to or greater than a threshold, sorted in descending order of the first probability.

（付記４）質問に対する選択操作を受け付けた回答候補を示すログ情報を蓄積し、
蓄積した前記ログ情報に基づいて、質問に対して表示される上位所定数の回答候補のいずれかが選択される確率を算出し、
算出した前記確率の時系列変化に基づいて、前記閾値を調整する、
処理を前記コンピュータに実行させることを特徴とする付記３に記載の回答出力プログラム。 (Appendix 4) Accumulate log information indicating answer candidates that have accepted the selection operation for the question.
Based on the accumulated log information, the probability that one of the top predetermined number of answer candidates displayed for the question will be selected is calculated.
Adjusting the threshold value based on the calculated time-series change of the probability.
The answer output program according to Appendix 3, wherein the processing is executed by the computer.

（付記５）前記調整する処理は、算出した前記確率の時系列変化に基づいて、前記確率が下降傾向にあると判断した場合、前記閾値を所定値下げる、ことを特徴とする付記４に記載の回答出力プログラム。 (Appendix 5) The adjustment process is described in Appendix 4, characterized in that the threshold value is lowered by a predetermined value when it is determined that the probability is on a downward trend based on the calculated time-series change of the probability. Answer output program.

（付記６）前記第１の回答候補は、質問に対する上位所定数の回答候補のうち選択操作を受け付けた回答候補であり、
前記第２の回答候補は、質問に対する上位所定数の回答候補のうち選択操作を受け付けなかった回答候補である、
ことを特徴とする付記１～５のいずれか一つに記載の回答出力プログラム。 (Appendix 6) The answer output program according to any one of Appendices 1 to 5, wherein
the first answer candidate is an answer candidate, among a predetermined number of top-ranked answer candidates for a question, for which a selection operation was received, and
the second answer candidate is an answer candidate, among the predetermined number of top-ranked answer candidates for the question, for which no selection operation was received.

（付記７）前記回答候補は、ＦＡＱ（ＦｒｅｑｕｅｎｔｌｙＡｓｋｅｄＱｕｅｓｔｉｏｎ）である、ことを特徴とする付記１～６のいずれか一つに記載の回答出力プログラム。 (Appendix 7) The answer output program according to any one of the appendices 1 to 6, wherein the answer candidate is a FAQ (Frequently Asked Question).

（付記８）質問の入力を受け付け、
受け付けた前記質問に対応する複数の回答候補を特定し、
質問に対する選択操作を受け付けた１または複数の第１の回答候補と、質問に対する選択操作を受け付けなかった１または複数の第２の回答候補とを記憶する記憶部の記憶内容に基づく機械学習を行って、特定した前記複数の回答候補の中から、受け付けた前記質問に対する１または複数の回答候補を決定し、
決定した前記１または複数の回答候補を出力する、
処理をコンピュータが実行することを特徴とする回答出力方法。 (Appendix 8) An answer output method in which a computer executes a process comprising:
receiving input of a question;
identifying a plurality of answer candidates corresponding to the received question;
performing machine learning based on the stored contents of a storage unit that stores one or more first answer candidates for which a selection operation for a question was received and one or more second answer candidates for which no selection operation for a question was received, and determining, from among the identified plurality of answer candidates, one or more answer candidates for the received question; and
outputting the determined one or more answer candidates.

（付記９）質問の入力を受け付け、
受け付けた前記質問に対応する複数の回答候補を特定し、
質問に対する選択操作を受け付けた１または複数の第１の回答候補と、質問に対する選択操作を受け付けなかった１または複数の第２の回答候補とを記憶する記憶部の記憶内容に基づく機械学習を行って、特定した前記複数の回答候補の中から、受け付けた前記質問に対する１または複数の回答候補を決定し、
決定した前記１または複数の回答候補を出力する、
制御部を有することを特徴とする情報処理装置。 (Appendix 9) An information processing device comprising a control unit configured to:
receive input of a question;
identify a plurality of answer candidates corresponding to the received question;
perform machine learning based on the stored contents of a storage unit that stores one or more first answer candidates for which a selection operation for a question was received and one or more second answer candidates for which no selection operation for a question was received, and determine, from among the identified plurality of answer candidates, one or more answer candidates for the received question; and
output the determined one or more answer candidates.

１０１情報処理装置
１１０，１０１０記憶部
２００回答出力システム
２０１端末
２１０ネットワーク
２２０ＦＡＱマスタ
２３０アクセスログＤＢ
２４０教師データ（よゐこ）ＤＢ
２５０教師データ（わるいこ）ＤＢ
２６０教師データ（まいご）ＤＢ
３００バス
３０１ＣＰＵ
３０２メモリ
３０３Ｉ／Ｆ
３０４ディスクドライブ
３０５ディスク
９００ＦＡＱ画面
１００１受付部
１００２検索部
１００３決定部
１００４出力制御部
１００５生成部
１００６判定部
１００７特定部
１１０１，１１０２，１１０３，１３０１，１３０２，１３０３，１３０４，１７０１，１７０２，１７０３，１７０４，１７０５アクセスログ
１５００分析ＤＢ
１９００教師データ（形態素）ＤＢ 101 Information processing device
110, 1010 Storage unit
200 Answer output system
201 Terminal
210 Network
220 FAQ master
230 Access log DB
240 Teacher data (Yoiko) DB
250 Teacher data (Waruiko) DB
260 Teacher data (Maigo) DB
300 Bus
301 CPU
302 Memory
303 I/F
304 Disk drive
305 Disk
900 FAQ screen
1001 Reception unit
1002 Search unit
1003 Determination unit
1004 Output control unit
1005 Generation unit
1006 Judgment unit
1007 Identification unit
1101, 1102, 1103, 1301, 1302, 1303, 1304, 1701, 1702, 1703, 1704, 1705 Access log
1500 Analysis DB
1900 Teacher data (morpheme) DB

Claims

An answer output program that causes a computer to execute a process comprising:
receiving input of a question;
identifying a plurality of answer candidates corresponding to the received question;
performing machine learning based on the stored contents of a storage unit that stores one or more first answer candidates, among the answer candidates for a question, for which a selection operation was received and one or more second answer candidates, among the answer candidates for the question, for which no selection operation was received, and calculating, for each of the identified plurality of answer candidates, a first probability of being selected when displayed for the received question and a second probability of not being selected when displayed for the received question;
determining, based on the calculated first probability and second probability, one or more answer candidates for the received question by excluding from the plurality of answer candidates those whose second probability is equal to or greater than a threshold and sorting the remaining answer candidates in descending order of the first probability;
outputting the determined one or more answer candidates;
accumulating log information indicating the answer candidates for which a selection operation for a question was received;
calculating, based on the accumulated log information, a probability that any of a predetermined number of top-ranked answer candidates displayed for a question is selected; and
adjusting the threshold based on a time-series change in the calculated probability,
wherein the adjusting lowers the threshold by a predetermined value when the probability is determined, based on the time-series change in the calculated probability, to be on a downward trend.

The answer output program according to claim 1, wherein
the first answer candidate is an answer candidate, among a predetermined number of top-ranked answer candidates for a question, for which a selection operation was received, and
the second answer candidate is an answer candidate, among the predetermined number of top-ranked answer candidates for the question, for which no selection operation was received.

An answer output method in which a computer executes a process comprising:
receiving input of a question;
identifying a plurality of answer candidates corresponding to the received question;
performing machine learning based on the stored contents of a storage unit that stores one or more first answer candidates, among the answer candidates for a question, for which a selection operation was received and one or more second answer candidates, among the answer candidates for the question, for which no selection operation was received, and calculating, for each of the identified plurality of answer candidates, a first probability of being selected when displayed for the received question and a second probability of not being selected when displayed for the received question;
determining, based on the calculated first probability and second probability, one or more answer candidates for the received question by excluding from the plurality of answer candidates those whose second probability is equal to or greater than a threshold and sorting the remaining answer candidates in descending order of the first probability;
outputting the determined one or more answer candidates;
accumulating log information indicating the answer candidates for which a selection operation for a question was received;
calculating, based on the accumulated log information, a probability that any of a predetermined number of top-ranked answer candidates displayed for a question is selected; and
adjusting the threshold based on a time-series change in the calculated probability,
wherein the adjusting lowers the threshold by a predetermined value when the probability is determined, based on the time-series change in the calculated probability, to be on a downward trend.

An information processing device comprising a control unit configured to:
receive input of a question;
identify a plurality of answer candidates corresponding to the received question;
perform machine learning based on the stored contents of a storage unit that stores one or more first answer candidates, among the answer candidates for a question, for which a selection operation was received and one or more second answer candidates, among the answer candidates for the question, for which no selection operation was received, and calculate, for each of the identified plurality of answer candidates, a first probability of being selected when displayed for the received question and a second probability of not being selected when displayed for the received question;
determine, based on the calculated first probability and second probability, one or more answer candidates for the received question by excluding from the plurality of answer candidates those whose second probability is equal to or greater than a threshold and sorting the remaining answer candidates in descending order of the first probability;
output the determined one or more answer candidates;
accumulate log information indicating the answer candidates for which a selection operation for a question was received;
calculate, based on the accumulated log information, a probability that any of a predetermined number of top-ranked answer candidates displayed for a question is selected; and
adjust the threshold based on a time-series change in the calculated probability,
wherein, in adjusting the threshold, the control unit lowers the threshold by a predetermined value when the probability is determined, based on the time-series change in the calculated probability, to be on a downward trend.