JP7475977B2 - Knowledge collection support system

Info

Publication number: JP7475977B2
Application number: JP2020102109A
Authority: JP
Inventors: 良介齊藤; 美紀石川; 肖太鈴木; 美季 ▲高▼桑; 莞月濱野; 美奈増田
Original assignee: EXA CO Ltd
Current assignee: EXA CO Ltd
Priority date: 2020-06-12
Filing date: 2020-06-12
Publication date: 2024-04-30
Anticipated expiration: 2040-06-12
Also published as: JP2021196795A

Description

The present invention relates to a knowledge collection support system that supports the collection of knowledge required in an information system development project.

A project that develops an information system collects a wide range of information about the development work. Examples include information about the client company that will own the information system, the characteristics of the data and processing the system will handle, and the technologies used in development. This information can be collected, for example, from the materials of similar past projects or from experts. However, this work generally places a heavy burden on personnel and hinders the smooth progress of the development project. In particular, knowledge that is held only by particular individuals is difficult to pass on to other workers, and it may be lost when an expert leaves the development project.

One way to pass on such knowledge is to build a knowledge search system and aggregate the knowledge in it. However, it is difficult to obtain appropriate answers unless the system is searched with appropriate keywords. One conceivable approach is therefore to use machine learning to learn the correspondence between questions written as natural language strings and search keywords for the knowledge search system, and to use the learning result to convert a question into appropriate search keywords or the like.

Patent Document 1 below addresses the problem that "in the prior art, it is necessary to 'intentionally prepare specific information in order to identify the data to be obtained,' and a dedicated procedure is required for this." It discloses a technology that "provides master data, pattern data, and aggregation tables as means for reducing the burden on both the inquirer and the answerer when extracting an answer during information search, for providing advice information to the inquirer, and for retrieving an answer that fits exactly. It also has a knowledge DB with a temporary area for accumulating useful questions even when an answer could not be given" (see abstract).

Patent Document 2 below concerns "a learning target extraction device, method, and program for extracting learning target data for a search system that enables natural language search." It discloses a technology that "has a natural language storage unit that stores natural language text including known searchable questions; a natural language classification unit that compares question data for which a search was requested with that text and classifies the question data; a question-answer storage unit that stores known questions and answers; a natural language search unit that runs the natural language classification unit on the question data and outputs search results; an answer extraction unit that extracts an answer from the search results; an answer output unit that searches the question-answer storage unit based on keyword data and outputs an answer; a learning target determination unit that determines, from the outputs of the answer extraction unit and the answer output unit, whether the extraction result containing the answer to the question data, together with the question data, is a learning target; and a learning target extraction unit that extracts the extraction result and the question data as learning target data" (see abstract).

Patent Document 1: JP 2009-042950 A
Patent Document 2: JP 2019-175203 A

In a conventional knowledge search system that uses machine learning to process questions, training data must be collected and training must be carried out in advance. However, the workload of collecting the training data needed to generate a machine learning model is high, and this can be an obstacle to introducing the system.

The present invention was made in view of the above problems, and its object is to provide a knowledge collection support system that makes it easy to contact experts who have the knowledge required in an information system development project.

The knowledge collection support system according to the present invention searches past conversation logs based on question data sent via a chat tool. During that search it also searches for expert IDs that correspond to the classification result of the question data, and it displays the search results in a GUI. By selecting an expert from the search results in the GUI, the questioner can start a conversation with that expert on the chat tool.

The knowledge collection support system according to the present invention makes it easy to contact experts who have the knowledge required in an information system development project.

FIG. 1 is a configuration diagram of a knowledge collection support system 10 according to Embodiment 1.
FIG. 2 is a sequence diagram illustrating the operation of each unit when a user of the chat tool 300 sends a question.
FIG. 3 is a sequence diagram illustrating the operation of each unit when a user of the chat tool 300 sends a question.
FIG. 4 is an example of a graph showing the relationship between development projects and experts.
FIG. 5 is a sequence diagram illustrating the operation of each unit after a questioner starts a conversation with an expert on the chat tool 300.
FIG. 6 is a schematic diagram illustrating the search operation when search recall is prioritized.
FIG. 7 is a schematic diagram illustrating the search operation when search precision is prioritized.

<Embodiment 1>
FIG. 1 is a configuration diagram of a knowledge collection support system 10 according to Embodiment 1 of the present invention. The knowledge collection support system 10 supports the collection of knowledge required in an information system development project. It includes a server 100, a database 200, and a chat tool 300.

The chat tool 300 is a tool that two or more users use to converse with each other. It can be implemented, for example, as a desktop application or a web application. A user uses the chat tool 300 to ask about knowledge required in a development project. The question is sent to the server 100 over a network as question data written as a natural language string.

The server 100 is a computer that provides answers to question data. The server 100 includes an utterance data collection unit 110, a classifier 120, a question determiner 130, a full-text search unit 140, and a UI generation unit 150.

The utterance data collection unit 110 collects question data from the chat tool 300. The question determiner 130 identifies which parts of the character string in the question data are question sentences and which are not. The determination may be made for each message sent by the chat tool or for each sentence. For convenience, the description below assumes that it is made per message. Likewise for convenience, one message sent by the chat tool 300 is treated as one piece of question data.

The classifier 120 classifies the question data by machine learning. Any known machine learning algorithm can be used. Examples of the items that serve as classification attributes (classification labels) include: (a) the type of information system to which the question data relates (e.g., the business operations the system targets); (b) the characteristics of the development project for that information system (e.g., whether it is a new development project or a maintenance project for the system, and what the project covers); and (c) the technology used in the development project to which the question data relates. When a classifier with these three classification labels is used, for example, the classifier 120 is trained in advance so that it assigns question data to one of the classes defined in the classification space.
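The three-label output described above can be pictured with a small data structure. The following is a minimal sketch only: the source leaves the learning algorithm open, so a keyword lookup stands in for the trained classifier 120 here, and the names `ClassificationResult`, `KEYWORDS`, and `classify` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassificationResult:
    """One value per classification label (a) to (c)."""
    system_type: str    # (a) type of information system
    project_trait: str  # (b) characteristics of the development project
    technology: str     # (c) technology used in the project

    def as_query_terms(self) -> list[str]:
        return [self.system_type, self.project_trait, self.technology]


# Hypothetical keyword table standing in for a trained model.
KEYWORDS = {
    "system_type": {"insurance": "insurance system", "payroll": "payroll system"},
    "project_trait": {"new": "new development", "maintenance": "maintenance"},
    "technology": {"cloud": "in-house cloud", "mainframe": "mainframe"},
}


def classify(question: str) -> ClassificationResult:
    """Pick, for each label, the first class whose keyword occurs in the text."""
    text = question.lower()
    picked = {}
    for label, table in KEYWORDS.items():
        picked[label] = next(
            (value for key, value in table.items() if key in text), "unknown"
        )
    return ClassificationResult(**picked)
```

A real deployment would replace `classify` with a model trained on labeled questions; only the shape of the result (one class per label) is taken from the description.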

The full-text search unit 140 uses the question data itself and the result of classifying the question data with the classifier 120 to run a full-text search on the conversation log 210 described below. If the search returns a hit, the UI generation unit 150 generates a UI (typically a GUI: Graphical User Interface) that presents the search results and sends it to the user of the chat tool 300. For example, it generates a web page presenting the search results and returns it to the chat tool 300, which displays the page on the user's terminal. The case where the search returns no hit is described later.

The database 200 holds the data tables and other data used by the server 100. Specifically, the database 200 holds a conversation log 210, an expert table 220, a development project table 230, an expert-project association table 240, and a learning data table 250.

The conversation log 210 is a record of past conversations on the chat tool 300. In principle it records all conversations of all users, but it may instead record, for example, only conversations with the experts described below.

The expert table 220 is a data table that records the IDs of experts who participated in past development projects. Every participant in a development project is recorded as an expert. An "expert" here can therefore also be understood as a candidate who may be treated as an expert.

The development project table 230 is a data table that records the IDs of past development projects and the classification attributes of each project. These classification attributes are defined so that they match the classification attributes the classifier 120 uses when classifying question data. The reason is explained later.

The expert-project association table 240 is a data table that associates past development projects with the experts who participated in them. For example, it records expert IDs from the expert table 220 in association with development project IDs from the development project table 230.

The learning data table 250 is a data table for temporarily storing conversations between users and experts on the chat tool 300 as training data for the classifier 120.
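For orientation, the five stores held by the database 200 could be laid out as relational tables along the following lines. This is a sketch under assumptions: the source names the tables and their purpose but not their columns, so every table and column name below is hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical layout; the source does not specify columns.
SCHEMA = """
CREATE TABLE conversation_log (          -- conversation log 210
    message_id      INTEGER PRIMARY KEY,
    conversation_id INTEGER NOT NULL,
    speaker_id      TEXT NOT NULL,
    sent_at         TEXT NOT NULL,
    body            TEXT NOT NULL
);
CREATE TABLE expert (                    -- expert table 220
    expert_id  TEXT PRIMARY KEY,
    name       TEXT NOT NULL,
    department TEXT,
    specialty  TEXT
);
CREATE TABLE development_project (       -- development project table 230
    project_id    TEXT PRIMARY KEY,
    system_type   TEXT NOT NULL,         -- label (a)
    project_trait TEXT NOT NULL,         -- label (b)
    technology    TEXT NOT NULL          -- label (c)
);
CREATE TABLE expert_project (            -- expert-project association table 240
    expert_id         TEXT NOT NULL REFERENCES expert(expert_id),
    project_id        TEXT NOT NULL REFERENCES development_project(project_id),
    role              TEXT,
    months_on_project INTEGER,
    PRIMARY KEY (expert_id, project_id)
);
CREATE TABLE learning_data (             -- learning data table 250
    row_id        INTEGER PRIMARY KEY,
    body          TEXT NOT NULL,
    system_type   TEXT NOT NULL,
    project_trait TEXT NOT NULL,
    technology    TEXT NOT NULL
);
"""


def create_database(path: str = ":memory:") -> sqlite3.Connection:
    """Create an empty database with the five tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

The project table carries the same three label columns as the classifier output, which reflects the requirement that the project attributes match the classifier's classification attributes.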

<Embodiment 1: When the conversation log search returns a hit>
FIG. 2 is a sequence diagram illustrating the operation of each unit when a user of the chat tool 300 sends a question. The knowledge collection support system 10 first searches the conversation log 210 for the knowledge the questioner needs. This section describes the operation when the search of the conversation log 210 by the full-text search unit 140 returns a hit.

On the chat tool 300, the questioner asks a question in natural language about a past development project. The chat tool 300 sends a message containing the question (question data) to the server 100. The utterance data collection unit 110 acquires the question data.

The classifier 120 classifies the question data according to the result of the machine learning performed in advance and outputs the classification result to the full-text search unit 140. The classification result is the outcome of determining which class the question data belongs to for each classification attribute. For example, when (a) to (c) above are used as classification labels, the result might be (a) information system type = insurance system, (b) development project characteristics = new development project for an insurance product system, and (c) technology used = in-house cloud. The classification labels are not limited to (a) to (c); other labels may be used as long as they appropriately describe the attributes of experts who have the knowledge required in a development project. The description below assumes (a) to (c).

The full-text search unit 140 uses the question data and the classification result to run a full-text search on the conversation log 210. How the question data and the classification result are used is described in detail in Embodiment 2 below; in the simplest case, the search can be run on the string obtained by simply concatenating the question data string and the classification result string.
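The simplest case mentioned above, concatenating the question string with the classification result, can be sketched as follows. The function names are hypothetical, and `search_log` is a naive term-matching stand-in for a real full-text engine, not the search method of the system. It splits on whitespace, so Japanese text would first need a tokenizer.

```python
def build_query(question: str, classification: list[str]) -> str:
    """Concatenate the question text and the classification labels into one query."""
    parts = [question.strip(), *(label.strip() for label in classification)]
    return " ".join(part for part in parts if part)


def search_log(query: str, log: list[str]) -> list[str]:
    """Return logged messages containing at least one query term, best match first."""
    terms = set(query.lower().split())
    scored = []
    for message in log:
        text = message.lower()
        score = sum(1 for term in terms if term in text)
        if score > 0:
            scored.append((score, message))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [message for _, message in scored]
```

An empty list from `search_log` corresponds to the "no hit" case, in which the system falls back to searching development projects.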

If the search returns a hit, the UI generation unit 150 generates a GUI that presents the search results and returns it to the chat tool 300. The chat tool 300 (or the user terminal running it) displays the GUI on screen. Specifically, the past conversation logs obtained by the search (the speakers and what they said) are displayed in chronological order. If multiple conversations were hit, they may be grouped by conversation.

Through the above operation, the questioner can refer to question-and-answer conversations from past development projects that correspond to the question and collect the knowledge needed.

<Embodiment 1: When the conversation log search returns no hit>
FIG. 3 is a sequence diagram illustrating the operation of each unit when a user of the chat tool 300 sends a question. This section describes the operation when the search of the conversation log 210 by the full-text search unit 140 returns no hit. When no useful question-and-answer conversation is obtained from the conversation log 210, the knowledge collection support system 10 presents to the questioner, by the following procedure, experts who are expected to have knowledge about the question.

The process is the same as in FIG. 2 up to the step in which the full-text search unit 140 searches the conversation log 210. The full-text search unit 140 then uses the question data and the classification result to run a full-text search on the classification attributes of the development projects held in the development project table 230. The question data and the classification result are used in the same way as when searching the conversation log 210; the details are described in Embodiment 2. When the conversation log search returns no hit, the development projects may be searched automatically, or the chat tool 300 may first display a message that nothing was found together with a button or similar control that prompts the questioner to go on and search the development projects.

The full-text search unit 140 searches the expert-project association table 240 using each hit development project as the search key, and thereby obtains the IDs of the participants in that project. It then searches the expert table 220 with those participant IDs to obtain details about each participant (name, affiliation, area of expertise, and so on). At the same time, the full-text search unit 140 also obtains the relationship between development projects and experts (for example, which development projects each expert participated in).
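The two-step lookup above (hit project, then participant IDs, then participant details) can be sketched with in-memory data. The `Expert` fields, the sample rows, and the function name are assumptions for illustration; a real system would query tables 240 and 220 in the database 200.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expert:
    expert_id: str
    name: str
    department: str


# Hypothetical contents of the expert table 220.
EXPERTS = {
    "E1": Expert("E1", "Expert 1", "Insurance Dept."),
    "E2": Expert("E2", "Expert 2", "Platform Dept."),
}

# Hypothetical contents of the association table 240: (project ID, expert ID).
PARTICIPATION = [("P1", "E1"), ("P1", "E2"), ("P2", "E2")]


def experts_for_projects(
    project_ids: list[str],
    participation: list[tuple[str, str]],
    experts: dict[str, Expert],
) -> dict[str, list[Expert]]:
    """Map each hit project to the details of the experts who took part in it."""
    result: dict[str, list[Expert]] = {pid: [] for pid in project_ids}
    for project_id, expert_id in participation:
        if project_id in result and expert_id in experts:
            result[project_id].append(experts[expert_id])
    return result
```

The returned mapping keeps the project-to-expert relationship, which is what the GUI needs in order to draw the graph of projects and experts.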

The UI generation unit 150 generates a GUI that includes a graph showing the relationship between the development projects and the experts obtained by the search, and returns it to the chat tool 300. The chat tool 300 (or the user terminal running it) displays the GUI on screen. An example of the graph is described below.

FIG. 4 is an example of a graph showing the relationship between development projects and experts. Searching the development project table 230 extracts the development projects that are expected to be related to the question data and its classification result. The participants in those projects can be regarded as candidate experts with knowledge related to the question. This GUI presents those candidate experts.

Each expert ID is configured so that a conversation with that expert can be started via the chat tool 300. When the questioner selects (clicks) an expert ID, a conversation with that expert begins. Through the conversation, the questioner tries to collect knowledge about the question. If the selected expert does not have the necessary knowledge, the questioner may end the conversation with that expert and select another expert ID.

FIG. 4 shows an example in which the graph is rendered as a tree view. The root of the tree corresponds to the information system type. The second level of the tree corresponds to the development project characteristics within that information system. The links connecting experts and development projects in the graph can also be drawn with different strengths. In FIG. 4, Expert 3 is highlighted. Possible criteria for link strength include, for example, the following.

(Link strength example 1) The expert-project association table 240 can define each participant's role in the development project and the period of participation in it. The link between an expert and a development project can be made stronger or weaker according to the importance of the role and the length of participation. For example, the link for a participant in an important role such as project manager can be drawn thicker than those of other participants, and the link for a participant with a short participation period can be drawn thinner.

(Link strength example 2) The expert table 220 or the expert-project association table 240 can record how much an expert has contributed to questioners' questions. For example, the number of times an expert has been selected on the graph is recorded in association with the ID of the development project in which the expert participated. An expert who has contributed heavily to questions about a development project can be given a thicker link than other participants. Parameters other than the selection count may also be used to express contribution, for example the number of times the expert's past conversations were hit in a search.

(Link strength example 3) When the full-text search unit 140 searches the development project table 230, it can obtain a score that indicates how well each project matches the search conditions. Experts who belong to a development project with a high score can be given thicker links than experts who belong to other development projects.
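The three criteria above could be combined into a single link weight, for instance as in the sketch below. The source does not say how, or whether, the criteria are combined; the role weights, the caps of 24 months and 10 selections, and the plain sum are all assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass

# Hypothetical importance of each role (example 1).
ROLE_WEIGHT = {"project manager": 3.0, "lead": 2.0, "member": 1.0}


@dataclass
class Participation:
    role: str             # role in the project (example 1)
    months: int           # length of participation (example 1)
    times_selected: int   # times chosen by questioners (example 2)
    search_score: float   # project match score in the range 0..1 (example 3)


def link_strength(p: Participation) -> float:
    """Sum the three criteria; a larger value means a thicker link."""
    role = ROLE_WEIGHT.get(p.role, 1.0)
    tenure = min(p.months, 24) / 24               # capped at two years
    contribution = min(p.times_selected, 10) / 10  # capped at ten selections
    return role + tenure + contribution + p.search_score
```

A GUI could map the resulting value onto line width, so that a long-serving project manager on a well-matching project stands out from a short-term member.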

FIG. 5 is a sequence diagram illustrating the operation of each unit after the questioner starts a conversation with an expert on the chat tool 300. That the questioner started a conversation with an expert means the questioner judged that the expert has the knowledge needed for the question. It follows that the subsequent conversation between the questioner and the expert is related to the classification result of the initial question. In other words, the conversation can be used as training data for the classifier 120 when it classifies question data. In Embodiment 1, training data for the classifier 120 is therefore collected according to the procedure in FIG. 5.

The questioner converses with the expert on the chat tool 300. The chat tool 300 sends each message containing part of the conversation (utterance data) to the server 100. The utterance data collection unit 110 acquires the utterance data. As with question data, one message is treated as one piece of utterance data.

Unlike the situations described with FIGS. 2 and 3, the conversation between the questioner and the expert is two-way. Its content is therefore not necessarily limited to question sentences and may include general conversation. The question determiner 130 therefore determines, for each message, whether the utterance data is a question. This may be determined according to grammatical rules, by a learner trained in advance by machine learning to decide whether a message is a question, or by any other suitable method.
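As an illustration of the rule-based option mentioned above, a per-message check might look like the following. This is a rough heuristic assumed for the sketch, not the determiner of the system: it looks only at a trailing question mark, a trailing Japanese question particle, or a leading English question word, and it will misjudge some messages.

```python
import re

# Hypothetical list of leading words that usually open an English question.
QUESTION_WORDS = ("who", "what", "when", "where", "why", "how", "which")


def is_question(message: str) -> bool:
    """Heuristically decide whether one chat message is a question."""
    text = message.strip()
    if not text:
        return False
    if text.endswith(("?", "？")):       # half-width or full-width question mark
        return True
    if text.endswith(("か", "か。")):     # Japanese sentence-final question particle
        return True
    first_word = re.split(r"\W+", text.lower(), maxsplit=1)[0]
    return first_word in QUESTION_WORDS
```

A trained classifier could replace this function without changing its callers, since both take one message and return a yes/no decision.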

When the question determiner 130 determines that the utterance data is a question, it stores the utterance data in the learning data table 250. At this point, by selecting an expert, the questioner has indirectly selected the development project in which that expert participated. In other words, the question and that development project have a certain degree of relevance. Learning this relevance is expected to improve search accuracy in later development project searches. The question determiner 130 therefore stores the classification attributes of the development project in which the selected expert participated in the learning data table 250 in association with the utterance data. For example, each message is stored together with its classification result.

When the question determiner 130 determines that the utterance data is not a question, it does not process that utterance data. As a result, utterance data is collected as training data only when the questioner is requesting knowledge about a development project, so the collected data is narrowed down to utterances that are useful as training data.

When the conversation between the questioner and the expert ends, the classifier 120 performs further training using the messages and classification results stored in the learning data table 250. This further improves classification accuracy. In addition, because no separate work is needed to collect training data, the associated workload is reduced.
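The collection step in FIG. 5 amounts to filtering the conversation by the question check and attaching the project's classification attributes to each kept message. A minimal sketch follows, assuming a hypothetical row format and a question predicate passed in by the caller; the retraining itself is omitted because the source does not fix the learning algorithm.

```python
from typing import Callable


def collect_training_rows(
    messages: list[str],
    project_labels: dict[str, str],
    is_question: Callable[[str], bool],
) -> list[dict[str, str]]:
    """Keep only question messages and pair each with the project's labels."""
    rows = []
    for message in messages:
        if is_question(message):
            # Non-questions are skipped, mirroring the behavior of the
            # question determiner 130 described above.
            rows.append({"text": message, **project_labels})
    return rows
```

Each returned row is a labeled example (text plus one value per classification label), which is the form a supervised classifier needs for further training once the conversation has ended.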

<Embodiment 1: Summary>
When searching the conversation log 210 based on question data, the knowledge collection support system 10 according to Embodiment 1 also searches for expert IDs that correspond to the classification result of the question data, and displays the search results in a GUI. By selecting an expert from the search results in the GUI, the questioner can start a conversation with that expert on the chat tool 300. This allows the questioner to acquire knowledge about past development projects smoothly even when that knowledge is held only by particular individuals and is not recorded in the conversation log 210 or elsewhere.

In the knowledge collection support system 10 according to Embodiment 1, the development project table 230 stores the classification attributes of past development projects. The system searches the development project table 230 for development projects whose classification attributes match the classification result of the question data, and presents the participants in those projects on the GUI as candidate experts. This makes it possible to search efficiently for experts who are expected to have knowledge related to the question data.

本実施形態１に係る知識収集支援システム１０は、検索により得られた開発プロジェクトと有識者との間の関連性を表すグラフを、検索結果として提示する。これにより質問者は、必要な知識を有していると想定される有識者を効率的に選択できる。また選択した有識者が必要な知識を有していない場合であっても、他の有識者を速やかに選択することができる。 The knowledge collection support system 10 according to the first embodiment presents a graph showing the relevance between the development projects and experts obtained by the search as a search result. This allows the questioner to efficiently select an expert who is assumed to have the necessary knowledge. Even if the selected expert does not have the necessary knowledge, another expert can be quickly selected.

本実施形態１に係る知識収集支援システム１０は、質問者と有識者との間のチャットツール３００上の会話を、その有識者が参加した開発プロジェクトの分類結果と関連付けて学習データテーブル２５０に格納しておき、分類器１２０は学習データテーブル２５０が格納しているこれらデータを学習データとして用いる。これにより、分類器１２０が学習を実施するための学習データを収集する作業負荷を抑制できる。 The knowledge collection support system 10 according to the first embodiment stores the conversation between the questioner and the expert on the chat tool 300 in the learning data table 250 in association with the classification results of the development project in which the expert participated, and the classifier 120 uses the data stored in the learning data table 250 as learning data. This reduces the workload of collecting the learning data that the classifier 120 needs for learning.

本実施形態１に係る知識収集支援システム１０は、質問者と有識者との間のチャットツール３００上の会話を、有識者が参加した過去の開発プロジェクトの分類属性と関連付けて学習データテーブル２５０に格納しておき、分類器１２０は学習データテーブル２５０が格納しているこれらデータを学習データとして用いる。これにより、発話データと開発プロジェクトとの間の関係を学習できるので、以後の有識者検索において適切な有識者を検索することができる。 The knowledge collection support system 10 according to the first embodiment stores conversations between the questioner and the expert on the chat tool 300 in the learning data table 250 in association with classification attributes of past development projects in which the expert participated, and the classifier 120 uses the data stored in the learning data table 250 as learning data. This makes it possible to learn the relationship between the speech data and the development projects, making it possible to search for appropriate experts in future expert searches.

本実施形態１に係る知識収集支援システム１０において、質問判断器１３０は、質問者と有識者との間のチャットツール３００上の会話において、質問文と判断した発話データのみを学習データテーブル２５０に格納する。これにより、学習データとして有用な発話データのみを絞り込んで学習することができる。 In the knowledge collection support system 10 according to the first embodiment, the question determiner 130 stores only speech data that is determined to be a question in a conversation between a questioner and an expert on the chat tool 300 in the learning data table 250. This makes it possible to narrow down learning to only speech data that is useful as learning data.
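The filtering step can be sketched as below. The text does not say how the question determiner 130 decides that an utterance is a question, so the question-mark heuristic in `looks_like_question` is purely a placeholder assumption, as are the function names and the row layout.

```python
def looks_like_question(message: str) -> bool:
    """Placeholder heuristic (assumption): a message ending in '?' or '？'."""
    return message.strip().endswith(("?", "？"))


def collect_training_rows(messages, project_attributes):
    """Keep only utterances judged to be questions, paired with the
    classification attributes of the expert's project."""
    return [(m, project_attributes) for m in messages if looks_like_question(m)]


rows = collect_training_rows(
    ["Which scheduler did the payroll batch use?", "Thanks, that helps."],
    ("payroll", "batch"),
)
```

Only the first utterance is stored, which mirrors the idea of narrowing the learning data down to utterances that are useful for training.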

本実施形態１に係る知識収集支援システム１０は、質問データを用いて会話ログ２１０を検索することにより検索結果が得られた場合はその検索結果を質問者へ提示し、検索結果が得られなかった場合は質問データに合致する開発プロジェクトおよび有識者を改めて検索してその結果を質問者へ提示する。これにより、会話ログ２１０内に必要な情報が存在する場合はこれをそのまま提示して効率的な知識収集を実現するとともに、必要な知識が属人化している場合などであってもその知識を有識者から収集できる。 The knowledge collection support system 10 according to the first embodiment searches the conversation log 210 using question data, and if a search result is obtained, presents the search result to the questioner. If no search result is obtained, the system searches again for development projects and experts that match the question data and presents the results to the questioner. This allows for efficient knowledge collection by presenting the necessary information as is when it exists in the conversation log 210, and also makes it possible to collect the necessary knowledge from experts even if that knowledge is held only by particular individuals.
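The two-stage flow in this paragraph can be sketched as a simple fallback. The function name, the callables `search_logs` and `search_experts`, and the shape of the returned dictionary are all hypothetical and stand in for the full-text search unit 140 and the database lookups.

```python
def answer_question(question, classification, search_logs, search_experts):
    """Return conversation-log hits if any exist; otherwise fall back to
    searching for matching development projects and experts."""
    hits = search_logs(question, classification)
    if hits:
        return {"kind": "conversation_log", "results": hits}
    return {"kind": "experts", "results": search_experts(question, classification)}


# Stub search functions used only to exercise both branches.
found = answer_question(
    "batch timeout", ["payroll"],
    search_logs=lambda q, c: ["log-17"],
    search_experts=lambda q, c: ["E01"],
)
fallback = answer_question(
    "batch timeout", ["payroll"],
    search_logs=lambda q, c: [],
    search_experts=lambda q, c: ["E01"],
)
```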

＜実施の形態２＞ <Embodiment 2>
本発明の実施形態２では、全文検索部１４０が質問データと分類結果を用いて会話ログ２１０を検索する際における詳細動作を説明する。知識収集支援システム１０の構成は実施形態１と同じである。 In the second embodiment of the present invention, a detailed operation will be described when the full-text search unit 140 uses question data and classification results to search the conversation log 210. The configuration of the knowledge collection support system 10 is the same as that of the first embodiment.

図６は、検索の再現率を重視する場合における検索動作を説明する模式図である。再現率は、検索の網羅性を表す指標であり、（ＴｒｕｅＰｏｓｉｔｉｖｅ／（ＴｒｕｅＰｏｓｉｔｉｖｅ＋ＦａｌｓｅＮｅｇａｔｉｖｅ））によって表される。換言すると再現率は、検索キーワードと真に合致する検索結果のうち、実際に検索にヒットしたものの割合を表す。 Figure 6 is a schematic diagram explaining the search operation when emphasis is placed on the recall of the search. Recall is an index that indicates the comprehensiveness of the search, and is expressed as (True Positive/(True Positive+False Negative)). In other words, recall indicates the proportion of the items that truly match the search keywords that the search actually returns.

再現率を高めるためには、質問データと関連すると想定される検索結果をできる限り広く収集することが望ましい。そこでこの場合、全文検索部１４０は、質問データの文字列と分類結果の文字列を単純に文字列結合したものを、検索キーワードとして用いる。これにより、質問データと分類結果のうちいずれかに合致する会話ログを、幅広く収集することができる。これは、上記計算式のうちＦａｌｓｅＮｅｇａｔｉｖｅが減少することによると考えられる。 To increase recall, it is desirable to collect as many search results as possible that are assumed to be related to the question data. In this case, the full-text search unit 140 uses a simple string combination of the question data string and the classification result string as the search keyword. This makes it possible to collect a wide range of conversation logs that match either the question data or the classification results. This is thought to be due to the reduction in false negatives in the above formula.
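A minimal sketch of the recall-oriented search follows. It assumes whitespace tokenization, case-insensitive substring matching, and OR semantics over the terms of the concatenated keyword; none of these details are stated in the text, and a real full-text engine would behave differently.

```python
def recall_oriented_search(question: str, classification: list, logs: list) -> list:
    """Search with the question string and classification strings simply
    concatenated into one keyword."""
    keyword = " ".join([question, *classification])  # simple string concatenation
    terms = {t.lower() for t in keyword.split()}
    # OR semantics (assumption): a log is a hit if it contains any term.
    return [log for log in logs if any(t in log.lower() for t in terms)]


logs = [
    "Batch job failed on the payroll system",
    "Lunch menu for Friday",
    "Oracle migration notes",
]
wide = recall_oriented_search("batch timeout", ["payroll", "Oracle"], logs)
narrow = recall_oriented_search("batch timeout", [], logs)
```

Here the question alone hits one log, while the concatenated keyword also picks up the log that matches only the classification result, which is the widening effect the paragraph describes.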

図７は、検索の適合率を重視する場合における検索動作を説明する模式図である。適合率は、検索の正確性を表す指標であり、（ＴｒｕｅＰｏｓｉｔｉｖｅ／（ＴｒｕｅＰｏｓｉｔｉｖｅ＋ＦａｌｓｅＰｏｓｉｔｉｖｅ））によって表される。換言すると適合率は、検索結果のうち検索キーワードと真に合致するものの割合を表す。 Figure 7 is a schematic diagram explaining the search operation when the search precision is important. Precision is an index that indicates the accuracy of the search, and is expressed as (True Positive/(True Positive+False Positive)). In other words, precision indicates the proportion of search results that truly match the search keyword.
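The recall and precision formulas given for Figures 6 and 7 can be computed from a set of retrieved items and a set of truly relevant items, as in this short sketch. The function name and the convention of returning 0.0 when a denominator is zero are choices made for the example.

```python
def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Compute (precision, recall) from retrieved and relevant item sets."""
    tp = len(retrieved & relevant)   # true positives: retrieved and relevant
    fp = len(retrieved - relevant)   # false positives: retrieved but not relevant
    fn = len(relevant - retrieved)   # false negatives: relevant but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


precision, recall = precision_recall({"a", "b", "c"}, {"a", "b", "d", "e"})
```

In this example two of the three retrieved items are relevant (precision 2/3) and two of the four relevant items were retrieved (recall 1/2).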

適合率を高めるためには、検索を実施する前に、検索対象とする会話ログをあらかじめヒット可能性が高いものに絞り込むことが有用である。そこでこの場合、全文検索部１４０は、分類結果に合致する会話ログ２１０を検索することによって、検索対象とする会話ログをあらかじめ絞り込み、その絞り込んだ会話ログに対して、質問データを用いて全文検索を実施する。これにより、質問データに合致する可能性が高い会話ログを得ることができる。これは、上記計算式のうちＦａｌｓｅＰｏｓｉｔｉｖｅが減少することによると考えられる。 To increase precision, it is useful to narrow down the conversation logs to be searched, before conducting the search, to those with a high probability of being a hit. In this case, the full-text search unit 140 first narrows down the conversation logs to be searched by searching for conversation logs 210 that match the classification results, and then performs a full-text search on the narrowed-down conversation logs using the question data. This makes it possible to obtain conversation logs that are highly likely to match the question data. This is thought to be due to the reduction in false positives in the above formula.
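The narrow-then-search procedure can be sketched in two steps. The matching rules here (case-insensitive substring matching, any classification term for the narrowing step, all question terms for the final search) are assumptions chosen for illustration; the text only fixes the order of the two steps.

```python
def precision_oriented_search(question: str, classification: list, logs: list) -> list:
    """Narrow the logs by the classification result, then search the
    narrowed set with the question data only."""
    labels = [c.lower() for c in classification]
    # Step 1: keep only logs that match the classification result.
    candidates = [log for log in logs if any(c in log.lower() for c in labels)]
    # Step 2: full-text search of the narrowed set using the question terms.
    terms = [t.lower() for t in question.split()]
    return [log for log in candidates if all(t in log.lower() for t in terms)]


logs = [
    "Payroll batch timeout was fixed by raising the limit",
    "Inventory batch timeout is still open",
    "Payroll release schedule",
]
hits = precision_oriented_search("batch timeout", ["payroll"], logs)
```

The inventory log matches the question terms but is excluded in step 1, which illustrates how narrowing first removes a likely false positive.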

図６と図７で説明した検索動作は、開発プロジェクトテーブル２３０を検索する際にも適用することができる。すなわち、（ａ）再現率を高める場合は質問データと分類結果を文字列結合して開発プロジェクトテーブル２３０を全文検索し、（ｂ）適合率を高める場合は分類結果によって開発プロジェクトテーブル２３０内のレコードをあらかじめ絞り込んだ上で質問データによって改めて開発プロジェクトテーブル２３０を全文検索する。 The search operations described in Figures 6 and 7 can also be applied when searching the development project table 230. That is, (a) to increase recall, the question data and the classification results are combined into a string and a full-text search is performed on the development project table 230, and (b) to increase precision, the records in the development project table 230 are first narrowed down using the classification results, and then the development project table 230 is again full-text searched using the question data.

＜本発明の変形例について＞ <Modifications of the present invention>
以上の実施形態において、サーバ１００が備える各機能部は、その機能を実装した回路デバイスなどのハードウェアによって構成することもできるし、その機能を実装したソフトウェアを演算装置が実行することにより構成することもできる。 In the above embodiments, each functional unit of the server 100 can be configured by hardware such as a circuit device that implements that function, or can be configured by a computing device executing software that implements that function.

以上の実施形態において、分類器１２０は、任意の機械学習アルゴリズム（例えばニューラルネットワーク）を用いるソフトウェアと、その学習結果を記録するデータとによって構成することができる。 In the above embodiments, the classifier 120 can be configured with software that uses any machine learning algorithm (e.g., a neural network) and data that records the learning results.

以上の実施形態において、全文検索部１４０が開発プロジェクトテーブル２３０を検索してもヒットしなかった場合は、検索結果なしである旨をチャットツール３００へ返信してもよいし、併せて質問内容を変更するように促すメッセージを返信してもよい。 In the above embodiment, if the full-text search unit 140 searches the development project table 230 but does not find any hits, it may reply to the chat tool 300 that there were no search results, and may also reply with a message encouraging the user to change the content of the question.

図４において、開発プロジェクトとその参加者との間の関連性を、ツリー形式のグラフで表示する例を説明したが、関連性を提示する方式はこれに限るものではなく、同様の関連性を提示できればその他の表示形式を用いてもよい。 In Figure 4, an example is described in which the relationships between development projects and their participants are displayed in a tree-type graph, but the method of presenting the relationships is not limited to this, and other display formats may be used as long as they can present similar relationships.

１０：知識収集支援システム 10: Knowledge collection support system
１００：サーバ 100: Server
１１０：発話データ収集部 110: Speech data collection unit
１２０：分類器 120: Classifier
１３０：質問判断器 130: Question determiner
１４０：全文検索部 140: Full-text search unit
１５０：ＵＩ生成部 150: UI generation unit
２００：データベース 200: Database
２１０：会話ログ 210: Conversation log
２２０：有識者テーブル 220: Expert table
２３０：開発プロジェクトテーブル 230: Development project table
２４０：有識者・プロジェクト関連テーブル 240: Expert-project relation table
２５０：学習データテーブル 250: Learning data table
３００：チャットツール 300: Chat tool

Claims

A knowledge collection support system that supports collection of knowledge required in an information system development project, comprising:
a speech data collection unit that receives, as a message transmitted by a chat tool, question data describing a natural language character string including a question about a development project implemented in the past;
a classifier configured to classify the question data using a machine learning model;
a full-text search unit that searches past conversation logs in the chat tool using a classification result by the classifier and the question data; and
a user interface for displaying the search results on a screen,
wherein the knowledge collection support system further includes a database for recording the conversation log and for recording IDs of experts who are treated in the conversation log as having knowledge required for the development project related to the conversation content,
when searching the conversation log, the full-text search unit also searches for experts in the conversation log that correspond to the classification result,
the user interface displays, on a screen, an ID of the expert as a result of the search, and
the user interface is configured to enable a conversation with an expert in the chat tool to be started by selecting the ID of the expert obtained as a result of the search.

The database records classification attributes of development projects implemented in the past in association with IDs of the experts who participated in the development projects,
the full-text search unit searches the database for the development projects having classification attributes matching the classification attributes obtained by the classifier classifying the question data, and searches for IDs of the experts who participated in the matching development projects;
The knowledge collection support system according to claim 1, wherein the user interface displays on a screen the ID of the expert who participated in the matching development project.

The knowledge collection support system according to claim 2, characterized in that the user interface displays on a screen, as a search result by the full-text search unit, a graph showing the relevance between the development projects that match the classification attributes obtained by the classifier classifying the question data and the experts who participated in the development projects.

The database records at least one of the role that each expert played in a development project implemented in the past or the period during which the expert participated in that development project;
The knowledge collection support system according to claim 3, characterized in that the user interface is configured to represent the strength of the connection between the development project and the expert on the graph according to the importance of the role or the length of the participation period.

the database records data representing the degree of contribution of the expert to questions related to development projects implemented in the past for each ID of the expert;
The knowledge collection support system according to claim 3, wherein the user interface is configured to represent, on the graph, the strength of the connection between the development project and the expert according to the degree of contribution.

when searching the database for the development project, the full-text search unit obtains a score indicating a degree of match with a search condition together with the search result;
The knowledge collection support system according to claim 3, wherein the user interface is configured to represent, on the graph, the strength of the connection between the development project and the expert according to the score.

the database records, in association with each other, classification attributes of the development projects in which the expert selected on the user interface participated, and speech data in conversations with the expert;
The knowledge collection support system according to claim 2, characterized in that the classifier classifies the speech data according to the classification attributes of the development project in which the selected expert participated by learning the classification attributes of the development project in which the selected expert participated and the speech data as learning data.

the knowledge collection support system further includes a question determiner that determines whether or not a natural language character string included in the question data includes a question for each message sent through the chat tool;
when the question determiner determines that the natural language character string included in the question data includes a question, the classifier learns the classification result of the question data by the classifier and the question data as learning data;
The knowledge collection support system according to claim 1, wherein the classifier does not use the question data as learning data when the question determiner determines that the natural language character string contained in the question data does not contain a question.

The classifier is configured to classify the question data according to at least one of the following classification attributes:
a type of information system to which the question data pertains;
characteristics of the development project to which the question data relates;
the technology used in the development project to which the question data relates,
The knowledge collection support system according to claim 1, wherein the full-text search unit searches the conversation log using a result of classifying the question data according to the classification attribute.

The database records at least one of the following as classification attributes of past development projects:
a type of information system;
characteristics of the development project;
the technologies used in the development project,
The classifier classifies the question data according to the classification attributes recorded in the database;
The knowledge collection support system according to claim 2, characterized in that the full-text search unit searches the conversation log using the result of classifying the question data according to the classification attribute, thereby searching for the ID of the expert related to the development project that is the subject of the question in the question data.

When the conversation log that matches the classification result by the classifier and the question data is found, the full-text search unit outputs the found conversation log as a search result;
The knowledge collection support system according to claim 1, characterized in that if the full-text search unit does not find the conversation log that matches the classification result by the classifier and the question data, it outputs the ID of the expert as a search result.

The knowledge collection support system described in claim 1, characterized in that the full-text search unit searches the conversation log using a keyword that combines the classification result by the classifier and the question data as a string, thereby improving the search recall rate more than when the conversation log is searched using the classification result by the classifier and the question data individually.

The knowledge collection support system described in claim 1, characterized in that the full-text search unit narrows down the conversation log using the classification results by the classifier and performs a search on the narrowed-down conversation log using the question data, thereby improving the search precision compared to a case in which the classification results by the classifier and the question data are used in parallel in combination.