JP2022187109A

JP2022187109A - Dialog knowledge generation apparatus and dialog knowledge generation method

Info

Publication number: JP2022187109A
Application number: JP2021094945A
Authority: JP
Inventors: 健一横手; Kenichi Yokote; 康嗣森本; Yasutsugu Morimoto
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2021-06-07
Filing date: 2021-06-07
Publication date: 2022-12-19

Abstract

PROBLEM TO BE SOLVED: To automatically construct a flow model of a dialogue that includes clarifying questions (asking back), using business documents and past dialogue records. SOLUTION: A dialogue knowledge generation device includes: a keyword selection unit that selects, from target data including business documents and past dialogue records, attribute values that are keywords of a dialogue and attribute names indicating the categories of the attribute values, using clue phrases that indicate the respective selection conditions; an association unit that determines the correspondence between the attribute values and the attribute names based on their respective acquisition positions in the target data; and a graph generation unit that generates a knowledge graph by extracting a hierarchical relationship among a plurality of items having the attribute values or the attribute names, and combining the plurality of items based on the extracted hierarchical relationship. SELECTED DRAWING: Figure 2

Description

The present invention relates to a dialogue knowledge generation device and a dialogue knowledge generation method, and is suitably applied to a dialogue knowledge generation device and a dialogue knowledge generation method that generate dialogue knowledge through computer-based text processing.

Patent Literature 1 and Patent Literature 2 are examples of conventional techniques for generating a flowchart-type data structure using a computer. Patent Literature 1 discloses a system that generates a set of vectors from a corpus and generates a data structure (for example, a cluster formation tree or a phylogenetic tree). Patent Literature 2 discloses a dialogue management server that manages dialogue knowledge, accepts text data of a question sentence from a user, and calculates a vector from it.

Patent Literature 1: JP 2019-169148 A
Patent Literature 2: JP 2020-184294 A

In a dialogue-based response service, for an inquiry such as "I received a notice that my contract is about to expire, but I don't know what to do, so please tell me," it is preferable to present the final answer after repeatedly asking clarifying questions that confirm unclear points, such as "Do you know your customer number?" In this case, if a flowchart-type data structure (flow model) representing dialogue knowledge is prepared in advance and the dialogue is managed with reference to it, even a person who is not an expert in the business can follow the procedure leading to the answer. However, with the conventional techniques described above, although the tree structure of a dialogue flow model can be generated, the data of the clarifying questions and answers in the dialogue cannot be assigned to it.

The present invention has been made in view of the above points, and proposes a dialogue knowledge generation method and a dialogue knowledge generation device capable of automatically constructing a dialogue flow model that includes clarifying questions (dialogue knowledge with clarifying-question text) by utilizing business documents (for example, business manuals) and past dialogue records (for example, dialogue logs).

To solve this problem, the present invention provides a dialogue knowledge generation device that automatically generates a knowledge graph representing a flow model of a dialogue, the device including: a keyword selection unit (for example, an attribute value keyword selection unit 142 and an attribute name keyword selection unit 143, described later) that selects, from target data including business documents and past dialogue records, attribute values that are keywords of the dialogue and attribute names indicating the categories of the attribute values, using clue phrases that indicate the respective selection conditions; an association unit (for example, an attribute value-attribute name association unit 146, described later) that determines the correspondence between the attribute values and the attribute names selected by the keyword selection unit based on their respective acquisition positions in the target data; and a graph generation unit (for example, a knowledge graph generation unit 147 and a clarifying question generation unit 148, described later) that generates the knowledge graph by extracting, based on the correspondence determined by the association unit, a hierarchical relationship among a plurality of items having the attribute values or the attribute names, and combining the plurality of items based on the extracted hierarchical relationship.

To solve this problem, the present invention also provides a dialogue knowledge generation method performed by a dialogue knowledge generation device that automatically generates a knowledge graph representing a flow model of a dialogue, the method including: a keyword selection step in which the dialogue knowledge generation device selects, from target data including business documents and past dialogue records, attribute values that are keywords of the dialogue and attribute names indicating the categories of the attribute values, using clue phrases that indicate the respective selection conditions; an association step in which the dialogue knowledge generation device determines the correspondence between the attribute values and the attribute names selected in the keyword selection step based on their respective acquisition positions in the target data; and a graph generation step in which the dialogue knowledge generation device generates the knowledge graph by extracting, based on the correspondence determined in the association step, a hierarchical relationship among a plurality of items having the attribute values or the attribute names, and combining the plurality of items based on the extracted hierarchical relationship.

According to the present invention, a dialogue flow model that includes clarifying questions can be constructed automatically by utilizing business documents and past dialogue records.

FIG. 1 is a block diagram showing an example hardware configuration of a computer 10 that implements a dialogue knowledge generation device 100 according to an embodiment of the present invention.
FIG. 2 is a block diagram showing an example functional configuration of the dialogue knowledge generation device 100.
FIG. 3 is a diagram showing a data configuration example of clue phrase information.
FIG. 4 is a diagram showing a data configuration example of attribute value keyword information.
FIG. 5 is a diagram showing a data configuration example of attribute name keyword information.
FIG. 6 is a diagram showing a data configuration example of final answer keyword information.
FIG. 7 is a diagram showing a data configuration example of attribute value-final answer correspondence information.
FIG. 8 is a diagram showing a data configuration example of attribute value-attribute name correspondence information.
FIG. 9 is a diagram showing a data configuration example of knowledge graph node information.
FIG. 10 is a diagram showing a data configuration example of knowledge graph edge information.
FIG. 11 is a flowchart showing an example of the overall procedure of the dialogue knowledge generation processing.
FIG. 12 is a flowchart showing an example procedure of the clue phrase estimation processing.
FIG. 13 is a flowchart showing an example procedure of the attribute value keyword selection processing.
FIG. 14 is a flowchart showing an example procedure of the attribute name keyword selection processing.
FIG. 15 is a flowchart showing an example procedure of the final answer keyword selection processing.
FIG. 16 is a flowchart showing an example procedure of the attribute value-final answer association processing.
FIG. 17 is a flowchart showing an example procedure of the attribute value-attribute name association processing.
FIG. 18 is a flowchart showing an example procedure of the knowledge graph generation processing.
FIG. 19 is a diagram for visually explaining the knowledge graph generation processing.
FIG. 20 is a flowchart showing an example procedure of the clarifying question generation processing.
FIG. 21 is a diagram for visually explaining the clarifying question generation processing.
FIG. 22 is a diagram showing an example (part 1) of visualization of a knowledge graph.
FIG. 23 is a diagram showing an example (part 1) of visualization of a knowledge graph.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

(1) Configuration
FIG. 1 is a block diagram showing an example hardware configuration of a computer 10 that implements a dialogue knowledge generation device 100 according to an embodiment of the present invention.

The computer 10 is configured with a CPU 11, a memory 12, a storage device 13, an input device 14, an output device 15, and a network device 16 connected to one another via a bus 17.

The CPU 11 is a processor that executes programs, for example a CPU (Central Processing Unit). The memory 12 is a main storage device such as a RAM (Random Access Memory) that stores programs and the data referenced when the programs are executed. The storage device 13 is an auxiliary storage device that stores programs and the data referenced when the programs are executed, for example an HDD (Hard Disk Drive) or an SSD (Solid State Drive). The input device 14 is a device that accepts input from an operator, for example a keyboard, a mouse, or a touch panel. The output device 15 is a device that outputs the results of processing executed by the CPU 11 or the data held in the memory 12 or the storage device 13, for example a liquid crystal display or an organic EL (Electro Luminescence) display. The network device 16 comprises various interfaces such as a communication interface and an input/output interface. The bus 17 is an internal communication line that connects the devices within the computer 10.

Each function of the dialogue knowledge generation device 100 shown in FIG. 2, described later, is realized when a processing program corresponding to that function is executed by the hardware resources of the computer 10 described above. Specifically, each functional unit in the control unit 140 shown in FIG. 2 is realized by the CPU 11 executing a predetermined program held in the memory 12 or the storage device 13 while referring to related data. The knowledge graph visualization unit 150 is realized by the CPU 11 executing a predetermined program and outputting the execution result to the output device 15 (or to the user terminal 200). The text information DB 110, the relationship information DB 120, and the graph information DB 130 shown in FIG. 2 are realized mainly by the storage device 13.

Some or all of the programs executed by the CPU 11 in the computer 10, and of the data referenced by those programs, may be stored in advance in the memory 12 or the storage device 13, or may, as necessary, be loaded into the storage device 13 of the computer 10 via the network device 16 from a non-transitory storage device of another apparatus connected over a network or from a non-transitory storage medium. In the following description, for simplicity, the dialogue knowledge generation device 100 is assumed to be implemented by a single computer 10. However, the configuration of the dialogue knowledge generation device 100 according to this embodiment is not limited to this; it may be implemented by a plurality of computers 10, and may include a database device, cloud storage, and the like.

FIG. 2 is a block diagram showing an example functional configuration of the dialogue knowledge generation device 100. Although details are given later, the dialogue knowledge generation device 100 accepts business documents, past dialogue records (hereinafter also called inquiry records), and initial clue phrases as input from the user terminal 200; generates attribute value keywords, attribute name keywords, and final answer keywords; identifies the correspondences among them; generates a knowledge graph and clarifying questions; and visualizes the knowledge graph with the clarifying-question text attached (a knowledge flow model). In this example, a business manual (also simply called a manual) is used as an example of a business document in text form, and a dialogue log is used as an example of a past dialogue record (inquiry record) in text form.

As shown in FIG. 2, the dialogue knowledge generation device 100 includes a text information DB 110, a relationship information DB 120, a graph information DB 130, a control unit 140, and a knowledge graph visualization unit 150.

The text information DB 110 stores, among the data needed to generate the knowledge graph and the clarifying questions, the data associated with text. As shown in FIG. 2, the text information DB 110 consists of a clue phrase DB 111, an attribute value DB 112, an attribute name DB 113, and a final answer DB 114.

The clue phrase DB 111 stores clue phrase information about "clue phrases." A clue phrase is a word or phrase that serves as a selection condition for selecting, from the target data (business documents and past dialogue records), an "attribute value" that is a keyword of the dialogue, an "attribute name" that indicates the category of an attribute value, and a "final answer" of the dialogue. In this embodiment, a clue phrase initially supplied from outside (the user terminal 200) is called an initial clue phrase. In addition, as detailed later, the clue phrase estimation unit 141 can estimate clue phrases that complement the initial clue phrases (which may be read as target expressions), based on the target expressions extracted from the target data using the initial clue phrases. The clue phrase information stored in the clue phrase DB 111 includes both the initial clue phrases and the complemented clue phrases. A data configuration example of the clue phrase information is described later with reference to FIG. 3.

The attribute value DB 112 stores attribute value keyword information about "attribute values," a type of keyword extracted from business documents (for example, business manuals) or inquiry records (for example, dialogue logs). An attribute value is information used to obtain the "answer text" in the knowledge graph described later. The attribute value keyword information is extracted by the attribute value keyword selection unit 142, and a data configuration example is described later with reference to FIG. 4.

The attribute name DB 113 stores attribute name keyword information about "attribute names," which represent the categories of the attribute values. The relationship between an attribute name and an attribute value corresponds to the relationship between a key and a value. An attribute name is information used to obtain the "clarifying-question text" in the knowledge graph described later. The attribute name keyword information is extracted by the attribute name keyword selection unit 143, and a data configuration example is described later with reference to FIG. 5.

The final answer DB 114 stores final answer keyword information about "final answers," a type of keyword extracted from business documents or inquiry records. A final answer corresponds to the final answer in dialogue knowledge that includes clarifying questions; in other words, it is information used to obtain the "final answer text" in the knowledge graph described later. The final answer keyword information is extracted by the final answer keyword selection unit 144, and a data configuration example is described later with reference to FIG. 6.

The relationship information DB 120 stores, among the data needed to generate the knowledge graph and the clarifying questions, the data associated with relationships between texts. As shown in FIG. 2, the relationship information DB 120 consists of an attribute value-final answer correspondence DB 121 and an attribute value-attribute name correspondence DB 122.

The attribute value-final answer correspondence DB 121 stores attribute value-final answer correspondence information indicating the correspondence between attribute value keywords and final answer keywords. The attribute value-final answer correspondence information is generated by the attribute value-final answer association unit 145, and a data configuration example is described later with reference to FIG. 7.

The attribute value-attribute name correspondence DB 122 stores attribute value-attribute name correspondence information indicating the correspondence between attribute value keywords and attribute name keywords. The attribute value-attribute name correspondence information is generated by the attribute value-attribute name association unit 146, and a data configuration example is described later with reference to FIG. 8.

The graph information DB 130 stores, among the data needed to generate the knowledge graph representing dialogue knowledge and to generate the clarifying questions attached to the knowledge graph as text data, the data associated with graph structure information. As shown in FIG. 2, the graph information DB 130 consists of a knowledge graph node DB 131 and a knowledge graph edge DB 132.

The knowledge graph node DB 131 stores knowledge graph node information about nodes, which are constituent elements of the knowledge graph representing dialogue knowledge. The knowledge graph node information is supplied by the knowledge graph generation unit 147 and the clarifying question generation unit 148, and a data configuration example is described later with reference to FIG. 9.

The knowledge graph edge DB 132 stores knowledge graph edge information about edges, which are constituent elements of the knowledge graph representing dialogue knowledge. The knowledge graph edge information is supplied by the knowledge graph generation unit 147 and the clarifying question generation unit 148, and a data configuration example is described later with reference to FIG. 10.

In this embodiment, the knowledge graph is one form of a "dialogue flow model," a flowchart-type data structure to which the text data of clarifying questions and answers is assigned. It is represented as a structure in which nodes (knowledge graph nodes), each holding the text of a question or a final answer in the dialogue, and edges (knowledge graph edges), each holding the text of the answer given by the party who received the question, are connected hierarchically.
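The node and edge structure described above can be sketched in Python as follows. This is a minimal illustration, not the data layout of FIG. 9 or FIG. 10: the class names, field names, the `next_node` helper, and the sample final answers are all assumptions introduced here. Only the sample question comes from the description.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    text: str               # clarifying-question text or final answer text
    is_final: bool = False  # True when the node holds a final answer


@dataclass
class Edge:
    source: str       # node_id of the node that asks the question
    target: str       # node_id of the node reached by this answer
    answer_text: str  # answer given by the party who received the question


@dataclass
class KnowledgeGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, edge):
        self.edges.append(edge)

    def next_node(self, node_id, answer):
        """Follow the edge from node_id whose answer text equals the reply."""
        for edge in self.edges:
            if edge.source == node_id and edge.answer_text == answer:
                return self.nodes[edge.target]
        return None


# Hypothetical two-level flow: one question node and two final answer nodes.
graph = KnowledgeGraph()
graph.add_node(Node("n0", "Do you know your customer number?"))
graph.add_node(Node("n1", "Please renew from the contract page.", is_final=True))
graph.add_node(Node("n2", "Please contact the support desk.", is_final=True))
graph.add_edge(Edge("n0", "n1", "yes"))
graph.add_edge(Edge("n0", "n2", "no"))
```

Managing a dialogue then amounts to starting at the root node and calling `next_node` with each reply until a node with `is_final` set is reached.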

The control unit 140 includes a clue phrase estimation unit 141, an attribute value keyword selection unit 142, an attribute name keyword selection unit 143, a final answer keyword selection unit 144, an attribute value-final answer association unit 145, an attribute value-attribute name association unit 146, a knowledge graph generation unit 147, and a clarifying question generation unit 148, and provides the functions of each of these functional units.

The clue phrase estimation unit 141 receives business documents (business manuals in this example), inquiry records (dialogue logs in this example), and initial clue phrases as input from the user terminal 200, and uses the initial clue phrases to extract new clue phrases from the target data (the business documents and inquiry records). The clue phrase estimation unit 141 creates clue phrase information about the extracted clue phrases and stores it in the clue phrase DB 111. A detailed example procedure for this processing (called the clue phrase estimation processing) is described later with reference to FIG. 12.

The attribute value keyword selection unit 142 uses the clue phrase information stored in the clue phrase DB 111 to extract attribute value keywords together with their position information within the target data, and stores attribute value keyword information containing these extraction results in the attribute value DB 112. A detailed example procedure for this processing (called the attribute value keyword selection processing) is described later with reference to FIG. 13.
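As a rough illustration of selecting keywords with clue phrases and recording where they were found, the following Python sketch may help. It assumes that a clue phrase can be written as a regular expression with a named slot `kw` for the keyword; that representation, the function name `select_keywords`, and the sample manual sentences are hypothetical, and the actual procedure is the one described with reference to FIG. 13.

```python
import re


def select_keywords(lines, clue_patterns):
    """Return (keyword, line_number, char_offset) for every clue-phrase match."""
    hits = []
    for line_no, line in enumerate(lines):
        for pattern in clue_patterns:
            for match in re.finditer(pattern, line):
                # Keep the position so later steps can associate keywords.
                hits.append((match.group("kw"), line_no, match.start("kw")))
    return hits


# Hypothetical target data and clue phrase (assumptions for illustration).
manual = [
    "If the contract type is annual, renew online.",
    "If the contract type is monthly, no action is needed.",
]
patterns = [r"contract type is (?P<kw>\w+)"]
values = select_keywords(manual, patterns)
```

Here `values` holds one tuple per matched keyword, and the stored line number and character offset play the role of the position information that the later association steps rely on.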

The attribute name keyword selection unit 143 uses the clue phrase information stored in the clue phrase DB 111 to extract attribute name keywords together with their position information within the target data, and stores attribute name keyword information containing these extraction results in the attribute name DB 113. A detailed example procedure for this processing (called the attribute name keyword selection processing) is described later with reference to FIG. 14.

The final answer keyword selection unit 144 uses the clue phrase information stored in the clue phrase DB 111 to extract final answer keywords together with their position information within the target data, and stores final answer keyword information containing these extraction results in the final answer DB 114. A detailed example procedure for this processing (called the final answer keyword selection processing) is described later with reference to FIG. 15.

The attribute value-final answer association unit 145 uses the attribute value keyword information stored in the attribute value DB 112 and the final answer keyword information stored in the final answer DB 114 to identify the correspondence between attribute value records and final answer records, and stores attribute value-final answer correspondence information containing the result in the attribute value-final answer correspondence DB 121. A detailed example procedure for this processing (called the attribute value-final answer association processing) is described later with reference to FIG. 16.

The attribute value-attribute name association unit 146 uses the attribute value keyword information stored in the attribute value DB 112 and the attribute name keyword information stored in the attribute name DB 113 to identify the correspondence between attribute value records and attribute name records, and stores attribute value-attribute name correspondence information containing the result in the attribute value-attribute name correspondence DB 122. A detailed example procedure for this processing (called the attribute value-attribute name association processing) is described later with reference to FIG. 17.
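The description states that the correspondence is determined from the acquisition positions in the target data, but this passage does not give the rule. The following Python sketch assumes one plausible rule purely for illustration: each attribute value is paired with the attribute name found closest to it in the same document. The function name `associate`, the tuple layout, and the sample records are hypothetical.

```python
def associate(values, names):
    """Pair each attribute value with the nearest attribute name by position.

    values, names: lists of (keyword, doc_id, position) tuples.
    Values with no attribute name in the same document are skipped.
    """
    pairs = []
    for v_kw, v_doc, v_pos in values:
        candidates = [
            (abs(v_pos - n_pos), n_kw)
            for n_kw, n_doc, n_pos in names
            if n_doc == v_doc
        ]
        if candidates:
            # min() compares the distance first, so the nearest name wins.
            pairs.append((v_kw, min(candidates)[1]))
    return pairs


# Hypothetical keyword records (assumptions for illustration).
names = [("contract type", "manual", 3), ("payment method", "manual", 40)]
values = [
    ("annual", "manual", 5),
    ("credit card", "manual", 42),
    ("orphan", "log", 1),
]
pairs = associate(values, names)
```

A nearest-position rule is only one option; the procedure of FIG. 17 may use a different criterion, such as restricting candidates to the same sentence or section.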

The knowledge graph generation unit 147 uses the attribute value-final answer correspondence information stored in the attribute value-final answer correspondence DB 121 to generate knowledge graph node information and knowledge graph edge information, which describe the nodes and edges constituting the knowledge graph representing dialogue knowledge, and stores them in the knowledge graph node DB 131 and the knowledge graph edge DB 132. A detailed example procedure for this processing (called the knowledge graph generation processing) is described later with reference to FIGS. 18 and 19.

聞き返し質問生成部１４８は、属性値－属性名対応関係ＤＢ１２２に格納された属性値－属性名対応関係情報を用いて、知識グラフノードＤＢ１３１に格納された知識グラフノード情報及び知識グラフエッジＤＢ１３２に格納された知識グラフエッジ情報に、適宜テキストを登録することにより、対話知識を示す知識グラフに、聞き返し質問のテキスト（聞き返しテキスト）及び質問への回答のテキスト（回答テキスト）を関連付ける。聞き返し質問生成部１４８による上記の処理（聞き返し質問生成処理と称する）については、後で図２０，図２１を参照しながら詳細な処理手順例を説明する。 The reflection question generation unit 148 uses the attribute value-attribute name correspondence information stored in the attribute value-attribute name correspondence DB 122 to register text, as appropriate, in the knowledge graph node information stored in the knowledge graph node DB 131 and in the knowledge graph edge information stored in the knowledge graph edge DB 132. In this way it associates the text of each reflection question, that is, a question asked back to the user (reflection text), and the text of the answer to that question (answer text) with the knowledge graph representing the dialogue knowledge. A detailed example of this processing by the reflection question generation unit 148 (referred to as reflection question generation processing) will be described later with reference to FIGS. 20 and 21.

知識グラフ可視化部１５０は、知識グラフノードＤＢ１３１に格納された知識グラフノード情報及び知識グラフエッジＤＢ１３２に格納された知識グラフエッジ情報を用いて、知識グラフを可視化することにより、聞き返し質問のテキスト付きの対話知識をユーザから視認可能な形態で出力する。知識グラフ可視化部１５０による知識グラフの可視化の具体例は、図２２，図２３に例示される。 The knowledge graph visualization unit 150 visualizes the knowledge graph using the knowledge graph node information stored in the knowledge graph node DB 131 and the knowledge graph edge information stored in the knowledge graph edge DB 132, thereby outputting the dialogue knowledge, with the reflection question text attached, in a form the user can view. Specific examples of knowledge graph visualization by the knowledge graph visualization unit 150 are shown in FIGS. 22 and 23.

（２）データ構成例 (2) Data Configuration Example
以下では、テキスト情報ＤＢ１１０、関係情報ＤＢ１２０、及びグラフ情報ＤＢ１３０内の各ＤＢに格納される各種情報のデータ構成例について説明する。 The following describes data configuration examples of the various types of information stored in the DBs within the text information DB 110, the relationship information DB 120, and the graph information DB 130.

図３は、手掛かり句情報のデータ構成例を示す図である。図３に示す手掛かり句情報３００は、手掛かり句ＤＢ１１１にリスト形式で格納されるデータであって、１以上の手掛かり句レコード（行データ）から構成される。各手掛かり句レコードには、１つの手掛かり句に関する情報が保持される。そして、手掛かり句レコードは、手掛かり句ＩＤ３０１、対象データカテゴリ３０２、対象表現カテゴリ３０３、生成方法３０４、手掛かり句内容３０５、確信度３０６、及び照合パターン３０７のフィールドから構成される。 FIG. 3 is a diagram showing a data configuration example of clue phrase information. The clue phrase information 300 shown in FIG. 3 is data stored in list form in the clue phrase DB 111 and consists of one or more clue phrase records (row data). Each clue phrase record holds information about one clue phrase and consists of the following fields: clue phrase ID 301, target data category 302, target expression category 303, generation method 304, clue phrase content 305, degree of certainty 306, and matching pattern 307.

手掛かり句ＩＤ３０１は、当該手掛かり句の識別子（手掛かり句ＩＤ）を保持する。対象データカテゴリ３０２は、当該手掛かり句を用いたキーワード抽出の対象とされる対象データのカテゴリを示す情報を保持する。例えば、手掛かり句は、「マニュアル」または「対話ログ」の何れかからキーワード抽出するときに用いられる。 The clue phrase ID 301 holds the identifier of the clue phrase (clue phrase ID). The target data category 302 holds information indicating the category of the target data from which keywords are extracted using the clue phrase. For example, a clue phrase is used to extract keywords from either the "manual" or the "dialogue log".

対象表現カテゴリ３０３は、当該手掛かり句を用いて抽出するキーワードの種類を示す情報を保持する。具体的には、属性名キーワードの場合は「属性名」、属性値キーワードの場合は「属性値」、最終回答キーワードの場合は「最終回答」と記載される。 The target expression category 303 holds information indicating the type of keyword to be extracted using the clue phrase. Specifically, "attribute name" is written for the attribute name keyword, "attribute value" is written for the attribute value keyword, and "final answer" is written for the final answer keyword.

生成方法３０４は、当該手掛かり句の生成方法を示す情報を保持する。具体的には、当該手掛かり句がユーザ端末２００から入力された初期手掛かり句である場合は「初期」、新たな手掛かり句を生成した結果得られた手掛かり句である場合には「補完」と記載される。 The generation method 304 holds information indicating how the clue phrase was generated. Specifically, "initial" is recorded if the clue phrase is an initial clue phrase input from the user terminal 200, and "complement" is recorded if it is a clue phrase obtained by generating a new clue phrase.

手掛かり句内容３０５は、当該手掛かり句の内容をテキスト形式で保持する。確信度３０６は、当該手掛かり句の生成方法３０４が「補完」である場合に、手掛かり句としての確からしさに関する情報（確信度）を保持する。 The clue phrase content 305 holds the content of the clue phrase in text form. When the generation method 304 of the clue phrase is "complement", the degree of certainty 306 holds information on how plausible the phrase is as a clue phrase (degree of certainty).

照合パターン３０７は、当該手掛かり句を用いたキーワード抽出における対象表現の照合パターンを示す情報を保持する。本例では、照合パターン３０７は、固有のテキストと正規表現（［］内には、変数部分として単語の種類を指定）によって、対象表現を指定している。このような照合パターン３０７が保持されることにより、当該照合パターンと正規表現処理による文字列抽出とを利用して、変数部分に該当する語句をキーワードとして抽出するキーワード抽出を実施することができる。なお、図３では、「普通名詞」、「時相名詞」、「固有名詞」等の、形態素解析で得られる形式が照合パターン３０７に指定されているが、照合パターン３０７の表記は必ずしもこれに限定されるものではなく、構文解析で得られる句や、固有表現抽出で得られるＮａｍｅｄＥｎｔｉｔｙであってもよく、さらにはＨＴＭＬパーサーで得られるＨＴＭＬタグの形式を用いてもよい。 The matching pattern 307 holds information indicating the matching pattern of the target expression in keyword extraction using the clue phrase. In this example, the matching pattern 307 specifies the target expression by a combination of fixed text and a regular expression, in which the type of word is given inside [ ] as the variable part. Holding such a matching pattern 307 makes it possible to perform keyword extraction in which the matching pattern and regular-expression string extraction are used to extract the words that fill the variable part as keywords. In FIG. 3, forms obtained by morphological analysis, such as "common noun", "temporal noun", and "proper noun", are specified in the matching pattern 307, but the notation is not limited to these. It may be a phrase obtained by syntactic parsing, a named entity obtained by named entity recognition, or an HTML tag obtained by an HTML parser.
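As a rough illustration of how a matching pattern 307 could drive keyword extraction, the following Python sketch converts a pattern such as 「［普通名詞］のときは」 into a regular expression and returns the words matched by the variable part. It is a minimal sketch under stated assumptions, not the method of the embodiment: the `WORD_TYPE_REGEX` table is a hypothetical stand-in for a morphological analyzer (it approximates each word type with a character class), and all class names, function names, and sample IDs are illustrative.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# ASSUMPTION: a real system would use a morphological analyzer to decide word
# types. Here each word type is crudely approximated by a run of kanji/katakana.
WORD_TYPE_REGEX = {
    "普通名詞": r"[一-龥ァ-ヶー]+",  # common noun (rough approximation)
    "固有名詞": r"[一-龥ァ-ヶー]+",  # proper noun (rough approximation)
}

# Matches a variable part written with ASCII or full-width square brackets.
VARIABLE_PART = re.compile(r"[\[［]([^\]］]+)[\]］]")


@dataclass
class CluePhraseRecord:
    """One row of clue phrase information 300 (fields 301 to 307)."""
    clue_phrase_id: str
    target_data_category: str        # e.g. "manual" or "dialogue log"
    target_expression_category: str  # "attribute name", "attribute value", "final answer"
    generation_method: str           # "initial" or "complement"
    content: str
    confidence: Optional[float]      # only meaningful for "complement"
    matching_pattern: str


def compile_pattern(matching_pattern: str) -> "re.Pattern[str]":
    """Build a regex with one capture group per [word type] variable part."""
    pieces = VARIABLE_PART.split(matching_pattern)
    # split() with one capture group alternates: fixed text, word type, fixed text, ...
    regex = ""
    for index, piece in enumerate(pieces):
        if index % 2 == 0:
            regex += re.escape(piece)
        else:
            regex += "(" + WORD_TYPE_REGEX[piece] + ")"
    return re.compile(regex)


def extract_keywords(record: CluePhraseRecord, text: str) -> List[str]:
    """Return the words that fill the first variable part of the pattern."""
    pattern = compile_pattern(record.matching_pattern)
    return [match.group(1) for match in pattern.finditer(text)]


record = CluePhraseRecord("C1", "manual", "attribute value", "initial",
                          "のときは", None, "［普通名詞］のときは")
keywords = extract_keywords(record, "契約のときは本人確認を行う。")
```

Swapping the character-class table for real part-of-speech tags, named entities, or HTML tags would change only `WORD_TYPE_REGEX` and the way text is tokenized; the pattern-to-extractor step stays the same.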

図４は、属性値キーワード情報のデータ構成例を示す図である。図４に示す属性値キーワード情報４００は、属性値ＤＢ１１２にリスト形式で格納されるデータであって、１以上の属性値レコード（行データ）から構成される。そして属性値レコードは、属性値ＩＤ４０１、属性値キーワード４０２、対象データ内位置情報４０３、確信度４０４、及び選定元手掛かり句ＩＤ４０５のフィールドから構成される。 FIG. 4 is a diagram showing a data configuration example of attribute value keyword information. The attribute value keyword information 400 shown in FIG. 4 is data stored in the attribute value DB 112 in list form, and is composed of one or more attribute value records (line data). The attribute value record is composed of fields of attribute value ID 401 , attribute value keyword 402 , target data position information 403 , certainty 404 , and selection source clue phrase ID 405 .

属性値ＩＤ４０１は、抽出した属性値キーワードの識別子（属性値ＩＤ）を保持する。属性値キーワード４０２は、抽出した属性値キーワードのテキストを保持する。 The attribute value ID 401 holds the identifier (attribute value ID) of the extracted attribute value keyword. The attribute value keyword 402 holds the text of the extracted attribute value keyword.

対象データ内位置情報４０３は、照合パターン３０７に基づいて対象データから属性値キーワードを抽出したときに、対象データにおける属性値キーワードの抽出位置を示す情報を保持する。例えば、図４に示した「１３行目」とは、マニュアルの１３行目から属性値キーワードを抽出したことを意味し、「６番目発話」とは、対話ログの６番目の発話から属性値キーワードを抽出したことを意味する。 The target data position information 403 holds information indicating the extraction position of the attribute value keyword in the target data when the attribute value keyword is extracted from the target data based on the matching pattern 307 . For example, "13th line" shown in FIG. 4 means that the attribute value keyword is extracted from the 13th line of the manual, and "6th utterance" means that the attribute value keyword is extracted from the 6th utterance of the dialogue log. It means that the keyword has been extracted.

確信度４０４は、属性値キーワードとしての確からしさに関する情報（確信度）を保持する。選定元手掛かり句ＩＤ４０５は、属性値キーワードの抽出に用いた手掛かり句レコードのＩＤ情報（図３の手掛かり句ＩＤ３０１に対応）を保持する。 The degree of certainty 404 holds information (degree of certainty) regarding certainty as an attribute value keyword. The selection source clue phrase ID 405 holds the ID information (corresponding to the clue phrase ID 301 in FIG. 3) of the clue phrase record used for extracting the attribute value keyword.

図５は、属性名キーワード情報のデータ構成例を示す図である。図５に示す属性名キーワード情報５００は、属性名ＤＢ１１３にリスト形式で格納されるデータであって、１以上の属性名レコード（行データ）から構成される。そして属性名レコードは、属性名ＩＤ５０１、属性名キーワード５０２、対象データ内位置情報５０３、確信度５０４、及び選定元手掛かり句ＩＤ５０５のフィールドから構成される。 FIG. 5 is a diagram showing a data configuration example of attribute name keyword information. The attribute name keyword information 500 shown in FIG. 5 is data stored in the attribute name DB 113 in list form, and is composed of one or more attribute name records (line data). The attribute name record is composed of fields of attribute name ID 501 , attribute name keyword 502 , target data position information 503 , certainty 504 , and selection source clue phrase ID 505 .

属性名ＩＤ５０１は、抽出した属性名キーワードの識別子（属性名ＩＤ）を保持する。属性名キーワード５０２は、抽出した属性名キーワードのテキストを保持する。 The attribute name ID 501 holds the identifier (attribute name ID) of the extracted attribute name keyword. The attribute name keyword 502 holds the text of the extracted attribute name keyword.

対象データ内位置情報５０３は、照合パターン３０７に基づいて対象データから属性名キーワードを抽出したときに、対象データにおける属性名キーワードの抽出位置を示す情報を保持する。例えば、図５に示した「１２行目」とは、マニュアルの１２行目から属性名キーワードを抽出したことを意味し、「５番目発話」とは、対話ログの５番目の発話から属性名キーワードを抽出したことを意味する。 The target data internal position information 503 holds information indicating the extraction position of the attribute name keyword in the target data when the attribute name keyword is extracted from the target data based on the matching pattern 307 . For example, "12th line" shown in FIG. 5 means that the attribute name keyword is extracted from the 12th line of the manual, and "5th utterance" means that the attribute name keyword is extracted from the 5th utterance of the dialogue log. It means that the keyword has been extracted.

確信度５０４は、属性名キーワードとしての確からしさに関する情報（確信度）を保持する。選定元手掛かり句ＩＤ５０５は、属性名キーワードの抽出に用いた手掛かり句レコードのＩＤ情報（図３の手掛かり句ＩＤ３０１に対応）を保持する。 The degree of certainty 504 holds information (degree of certainty) regarding the likelihood of an attribute name keyword. The selection source clue phrase ID 505 holds the ID information (corresponding to the clue phrase ID 301 in FIG. 3) of the clue phrase record used to extract the attribute name keyword.

図６は、最終回答キーワード情報のデータ構成例を示す図である。図６に示す最終回答キーワード情報６００は、最終回答ＤＢ１１４にリスト形式で格納されるデータであって、１以上の最終回答レコード（行データ）から構成される。そして最終回答レコードは、最終回答ＩＤ６０１、最終回答キーワード６０２、対象データ内位置情報６０３、確信度６０４、及び選定元手掛かり句ＩＤ６０５のフィールドから構成される。 FIG. 6 is a diagram showing a data configuration example of final answer keyword information. The final answer keyword information 600 shown in FIG. 6 is data stored in list form in the final answer DB 114 and consists of one or more final answer records (row data). Each final answer record consists of the following fields: final answer ID 601, final answer keyword 602, target data position information 603, degree of certainty 604, and selection source clue phrase ID 605.

最終回答ＩＤ６０１は、抽出した最終回答キーワードの識別子（最終回答ＩＤ）を保持する。最終回答キーワード６０２は、抽出した最終回答キーワードのテキストを保持する。 The final answer ID 601 holds the identifier of the extracted final answer keyword (final answer ID). The final answer keyword 602 holds the text of the extracted final answer keyword.

対象データ内位置情報６０３は、照合パターン３０７に基づいて対象データから最終回答キーワードを抽出したときに、対象データにおける最終回答キーワードの抽出位置を示す情報を保持する。例えば、図６に示した「１４行目」とは、マニュアルの１４行目から最終回答キーワードを抽出したことを意味し、「７番目発話」とは、対話ログの７番目の発話から最終回答キーワードを抽出したことを意味する。 The target data position information 603 holds information indicating where in the target data the final answer keyword was found when it was extracted based on the matching pattern 307. For example, "14th line" in FIG. 6 means that the final answer keyword was extracted from the 14th line of the manual, and "7th utterance" means that it was extracted from the 7th utterance of the dialogue log.

確信度６０４は、最終回答キーワードとしての確からしさに関する情報（確信度）を保持する。選定元手掛かり句ＩＤ６０５は、最終回答キーワードの抽出に用いた手掛かり句レコードのＩＤ情報（図３の手掛かり句ＩＤ３０１に対応）を保持する。 The degree of certainty 604 holds information (degree of certainty) regarding the certainty of the final answer keyword. The selection source clue phrase ID 605 holds the ID information (corresponding to the clue phrase ID 301 in FIG. 3) of the clue phrase record used to extract the final answer keyword.
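The attribute value, attribute name, and final answer records of FIGS. 4 to 6 share the same five-field layout, so they can be modelled by one record type. The sketch below shows this, together with a helper that splits a position string such as 「１３行目」 or 「６番目発話」 into its data category and index. The class name, the helper, and the sample IDs are hypothetical and chosen for illustration only.

```python
import re
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class KeywordRecord:
    """Common shape of attribute value, attribute name, and final answer records."""
    keyword_id: str             # 401 / 501 / 601
    keyword: str                # 402 / 502 / 602
    position: str               # 403 / 503 / 603, e.g. "13行目" or "6番目発話"
    confidence: float           # 404 / 504 / 604
    source_clue_phrase_id: str  # 405 / 505 / 605


def parse_position(position: str) -> Tuple[str, int]:
    """Split a position string into (data category, 1-based index)."""
    line = re.fullmatch(r"(\d+)行目", position)
    if line:
        return ("manual", int(line.group(1)))
    utterance = re.fullmatch(r"(\d+)番目発話", position)
    if utterance:
        return ("dialogue log", int(utterance.group(1)))
    raise ValueError(f"unrecognized position: {position!r}")


# Hypothetical sample row.
value = KeywordRecord("V1", "契約", "13行目", 0.9, "C1")
```

Keeping the position as a parsed (category, index) pair makes it straightforward to compare where two keywords were found, which is the kind of information the later association steps rely on.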

図７は、属性値－最終回答対応関係情報のデータ構成例を示す図である。図７に示す属性値－最終回答対応関係情報７００は、属性値－最終回答対応関係ＤＢ１２１にリスト形式で格納されるデータであって、１以上の属性値－最終回答対応関係レコード（行データ）から構成される。そして属性値－最終回答対応関係レコードは、対応関係ＩＤ７０１、属性値ＩＤ７０２、最終回答ＩＤ７０３、及び確信度７０４のフィールドから構成される。 FIG. 7 is a diagram showing a data configuration example of attribute value-final answer correspondence information. The attribute value-final answer correspondence information 700 shown in FIG. 7 is data stored in list form in the attribute value-final answer correspondence DB 121 and consists of one or more attribute value-final answer correspondence records (row data). Each attribute value-final answer correspondence record consists of the following fields: correspondence ID 701, attribute value ID 702, final answer ID 703, and degree of certainty 704.

対応関係ＩＤ７０１は、属性値キーワードと最終回答キーワードとの対応関係（属性値－最終回答対応関係）の識別子を保持する。属性値ＩＤ７０２は、当該対応関係を構成する属性値キーワードのＩＤ情報（図４の属性値ＩＤ４０１に対応）を保持する。最終回答ＩＤ７０３は、当該対応関係を構成する最終回答キーワードのＩＤ情報（図６の最終回答ＩＤ６０１に対応）を保持する。確信度７０４は、当該対応関係の対応付けの確からしさに関する情報を保持する。 The correspondence ID 701 holds the identifier of the correspondence between an attribute value keyword and a final answer keyword (attribute value-final answer correspondence). The attribute value ID 702 holds the ID of the attribute value keyword in the correspondence (corresponding to the attribute value ID 401 in FIG. 4). The final answer ID 703 holds the ID of the final answer keyword in the correspondence (corresponding to the final answer ID 601 in FIG. 6). The degree of certainty 704 holds information on how plausible the association is.

図８は、属性値－属性名対応関係情報のデータ構成例を示す図である。図８に示す属性値－属性名対応関係情報８００は、属性値－属性名対応関係ＤＢ１２２にリスト形式で格納されるデータであって、１以上の属性値－属性名対応関係レコード（行データ）から構成される。そして属性値－属性名対応関係レコードは、対応関係ＩＤ８０１、属性値ＩＤ８０２、属性名ＩＤ８０３、及び確信度８０４のフィールドから構成される。 FIG. 8 is a diagram showing a data configuration example of attribute value-attribute name correspondence information. The attribute value-attribute name correspondence information 800 shown in FIG. 8 is data stored in list form in the attribute value-attribute name correspondence DB 122 and consists of one or more attribute value-attribute name correspondence records (row data). Each attribute value-attribute name correspondence record consists of the following fields: correspondence ID 801, attribute value ID 802, attribute name ID 803, and degree of certainty 804.

対応関係ＩＤ８０１は、属性値キーワードと属性名キーワードとの対応関係（属性値－属性名対応関係）の識別子を保持する。属性値ＩＤ８０２は、当該対応関係を構成する属性値キーワードのＩＤ情報（図４の属性値ＩＤ４０１に対応）を保持する。属性名ＩＤ８０３は、当該対応関係を構成する属性名キーワードのＩＤ情報（図５の属性名ＩＤ５０１に対応）を保持する。確信度８０４は、当該対応関係の対応付けの確からしさに関する情報を保持する。 The correspondence ID 801 holds identifiers of correspondences (attribute value-attribute name correspondences) between attribute value keywords and attribute name keywords. The attribute value ID 802 holds ID information (corresponding to the attribute value ID 401 in FIG. 4) of the attribute value keyword forming the corresponding relationship. The attribute name ID 803 holds ID information (corresponding to the attribute name ID 501 in FIG. 5) of the attribute name keyword forming the corresponding relationship. The degree of certainty 804 holds information about the certainty of matching of the corresponding relationship.
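Both correspondence tables (FIGS. 7 and 8) are link tables: each row joins an attribute value ID to a partner ID and carries a degree of certainty. The following sketch shows one way such rows could be resolved back to keyword text. The `min_confidence` threshold and the best-first ordering are assumptions added for illustration, and the sample rows are made up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Correspondence:
    """One row of correspondence information 700 or 800."""
    correspondence_id: str   # 701 / 801
    attribute_value_id: str  # 702 / 802
    partner_id: str          # final answer ID 703 or attribute name ID 803
    confidence: float        # 704 / 804


def partners_of(value_id: str, links: List[Correspondence],
                text_by_id: Dict[str, str],
                min_confidence: float = 0.0) -> List[str]:
    """Resolve the partner keywords linked to one attribute value, best first."""
    matches = [link for link in links
               if link.attribute_value_id == value_id
               and link.confidence >= min_confidence]
    matches.sort(key=lambda link: link.confidence, reverse=True)
    return [text_by_id[link.partner_id] for link in matches]


# Hypothetical sample rows.
attribute_names = {"N1": "contract type", "N2": "payment method"}
value_name_links = [
    Correspondence("R1", "V1", "N1", 0.9),
    Correspondence("R2", "V1", "N2", 0.4),
    Correspondence("R3", "V2", "N2", 0.8),
]
```

The same function works for the attribute value-final answer table by passing final answer text keyed by final answer ID.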

図９は、知識グラフノード情報のデータ構成例を示す図である。図９に示す知識グラフノード情報９００は、知識グラフノードＤＢ１３１にリスト形式で格納されるデータであって、１以上の知識グラフノードレコード（行データ）から構成される。そして知識グラフノードレコードは、ノードＩＤ９０１、テキスト９０２、及びノードカテゴリ９０３のフィールドから構成される。 FIG. 9 is a diagram showing a data configuration example of knowledge graph node information. The knowledge graph node information 900 shown in FIG. 9 is data stored in the knowledge graph node DB 131 in list form, and is composed of one or more knowledge graph node records (line data). The knowledge graph node record is composed of fields of node ID 901 , text 902 , and node category 903 .

ノードＩＤ９０１は、知識グラフを構成するノードの識別子（ノードＩＤ）を保持する。テキスト９０２は、聞き返し質問のテキスト、または最終回答のテキストを保持する。ノードカテゴリ９０３は、知識ノードのカテゴリ（種類）を識別する情報を保持する。具体的には例えば、「最終回答」または「聞き返し質問」のカテゴリが用意される。 The node ID 901 holds the identifiers (node IDs) of the nodes forming the knowledge graph. Text 902 holds the text of the reflection question or the text of the final answer. The node category 903 holds information identifying the category (kind) of the knowledge node. Specifically, for example, a category of "final answer" or "reflection question" is prepared.

図１０は、知識グラフエッジ情報のデータ構成例を示す図である。図１０に示す知識グラフエッジ情報１０００は、知識グラフエッジＤＢ１３２にリスト形式で格納されるデータであって、１以上の知識グラフエッジレコード（行データ）から構成される。そして知識グラフエッジレコードは、エッジＩＤ１００１、流入元ノードＩＤ１００２、流入先ノードＩＤ１００３、及びテキスト１００４のフィールドから構成される。 FIG. 10 is a diagram showing a data configuration example of knowledge graph edge information. The knowledge graph edge information 1000 shown in FIG. 10 is data stored in the knowledge graph edge DB 132 in list format, and is composed of one or more knowledge graph edge records (line data). A knowledge graph edge record is composed of fields of an edge ID 1001 , an inflow source node ID 1002 , an inflow destination node ID 1003 , and a text 1004 .

エッジＩＤ１００１は、知識グラフを構成するエッジの識別子（エッジＩＤ）を保持する。流入元ノードＩＤ１００２は、知識グラフにおいて当該エッジを割り当てた流入元の知識グラフノードのＩＤ情報（図９のノードＩＤ９０１に対応）を保持する。流入先ノードＩＤ１００３は、知識グラフにおいて当該エッジを割り当てた流入先の知識グラフノードのＩＤ情報（図９のノードＩＤ９０１に対応）を保持する。テキスト１００４は、聞き返し質問の回答結果に関するテキストを保持する。 The edge ID 1001 holds identifiers (edge IDs) of edges forming the knowledge graph. The inflow source node ID 1002 holds the ID information (corresponding to the node ID 901 in FIG. 9) of the inflow source knowledge graph node to which the edge is assigned in the knowledge graph. The inflow destination node ID 1003 holds the ID information (corresponding to the node ID 901 in FIG. 9) of the inflow destination knowledge graph node to which the edge is assigned in the knowledge graph. The text 1004 holds the text regarding the answer result of the reflection question.
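To make the node and edge tables of FIGS. 9 and 10 concrete, the sketch below stores them as simple records and walks the graph from a reflection question to a final answer. It assumes, as one reading of the description, that the edge text 1004 is the user's answer that selects the edge; the `follow` function and the sample graph are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Node:
    node_id: str   # 901
    text: str      # 902
    category: str  # 903: "reflection question" or "final answer"


@dataclass(frozen=True)
class Edge:
    edge_id: str         # 1001
    source_node_id: str  # 1002
    target_node_id: str  # 1003
    text: str            # 1004: ASSUMED to be the answer that selects this edge


def follow(nodes: Dict[str, Node], edges: List[Edge],
           start_id: str, answers: List[str]) -> Optional[Node]:
    """Walk from start_id, consuming one answer per reflection question."""
    current = nodes[start_id]
    for answer in answers:
        if current.category == "final answer":
            break
        candidates = [edge for edge in edges
                      if edge.source_node_id == current.node_id
                      and edge.text == answer]
        if not candidates:
            return None  # no edge matches this answer
        current = nodes[candidates[0].target_node_id]
    return current


# Hypothetical sample graph.
nodes = {
    "Q1": Node("Q1", "What is the contract type?", "reflection question"),
    "A1": Node("A1", "Submit form A.", "final answer"),
    "A2": Node("A2", "Submit form B.", "final answer"),
}
edges = [Edge("E1", "Q1", "A1", "new"), Edge("E2", "Q1", "A2", "renewal")]
```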

（３）対話知識生成処理 (3) Dialogue Knowledge Generation Processing
以下では、対話知識生成装置１００が実行する対話知識生成処理について、処理主体となる機能部を明示しながら詳しく説明する。 The dialogue knowledge generation processing executed by the dialogue knowledge generation apparatus 100 is described in detail below, identifying the functional unit that performs each step.

図１１は、対話知識生成処理の全体的な処理手順例を示すフローチャートである。図１１に示す対話知識生成処理において、対話知識生成装置１００がユーザ端末２００からの入力を受けると、手掛かり句推定部１４１が、手掛かり句情報３００を手掛かり句ＤＢ１１１に格納する手掛かり句推定処理を実行する（ステップＳ１１０１）。 FIG. 11 is a flow chart showing an example of an overall processing procedure of dialogue knowledge generation processing. In the dialogue knowledge generation process shown in FIG. 11, when the dialogue knowledge generation device 100 receives an input from the user terminal 200, the clue phrase estimation unit 141 executes the clue phrase estimation process of storing the clue phrase information 300 in the clue phrase DB 111. (step S1101).

次に、属性値キーワード選定部１４２が、手掛かり句情報３００を用いて対象データから属性値キーワードを抽出する属性値キーワード選定処理を実行し（ステップＳ１１０２）、属性名キーワード選定部１４３が、手掛かり句情報３００を用いて対象データから属性名キーワードを抽出する属性名キーワード選定処理を実行し（ステップＳ１１０３）、最終回答キーワード選定部１４４が、手掛かり句情報３００を用いて対象データから最終回答キーワードを抽出する最終回答キーワード選定処理を実行する（ステップＳ１１０４）。なお、ステップＳ１１０２～Ｓ１１０４の各処理は、実行順序を入れ替えてもよく、あるいは並行して実行してもよい。 Next, the attribute value keyword selection unit 142 executes attribute value keyword selection processing for extracting attribute value keywords from the target data using the clue phrase information 300 (step S1102). Attribute name keyword selection processing is executed to extract attribute name keywords from the target data using the information 300 (step S1103), and the final answer keyword selection unit 144 extracts the final answer keywords from the target data using the clue phrase information 300. Final answer keyword selection processing is executed (step S1104). It should be noted that the processing of steps S1102 to S1104 may be performed in a different order or may be performed in parallel.

次に、属性値－最終回答対応付け部１４５が、属性値レコードと最終回答レコードとを対応付ける属性値－最終回答対応付け処理を実行し（ステップＳ１１０５）、属性値－属性名対応付け部１４６が、属性値レコードと属性名レコードとを対応付ける属性値－属性名対応付け処理を実行する（ステップＳ１１０６）。なお、ステップＳ１１０５，Ｓ１１０６の各処理は、実行順序を入れ替えてもよく、あるいは並行して実行してもよい。 Next, the attribute value-final answer associating unit 145 executes attribute value-final answer associating processing for associating the attribute value record with the final answer record (step S1105), and the attribute value-attribute name associating unit 146 , an attribute value-attribute name association process for associating the attribute value record with the attribute name record is executed (step S1106). It should be noted that the processing of steps S1105 and S1106 may be performed in a different order, or may be performed in parallel.
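Because steps S1102 to S1104 may run in parallel, and likewise steps S1105 and S1106, the flow can be viewed as two stages of independent tasks. The sketch below illustrates that structure with a thread pool. The choice of threads is an assumption (the description does not say how parallel execution is realized), and the lambdas are hypothetical stand-ins for the real selection and association processes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_stage(tasks: Dict[str, Callable[[], List[str]]]) -> Dict[str, List[str]]:
    """Run independent steps concurrently and collect each result by step ID."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {step_id: pool.submit(task) for step_id, task in tasks.items()}
        return {step_id: future.result() for step_id, future in futures.items()}


# Hypothetical stand-ins for the three keyword selection processes.
stage_one = run_stage({
    "S1102": lambda: ["attribute value keywords"],
    "S1103": lambda: ["attribute name keywords"],
    "S1104": lambda: ["final answer keywords"],
})

# S1105 and S1106 start only after every keyword selection step has finished,
# because they read the keyword records those steps produce.
stage_two = run_stage({
    "S1105": lambda: ["value-answer links from " + stage_one["S1102"][0]],
    "S1106": lambda: ["value-name links from " + stage_one["S1103"][0]],
})
```

In a real implementation the tasks would write to the respective DBs, so concurrent writes would need to be safe for whatever storage is used.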

その後、知識グラフ生成部１４７が、対話知識を示す知識グラフの構造を生成する知識グラフ生成処理を実行する（ステップＳ１１０７）。知識グラフ生成処理では、具体的には、知識グラフノード情報９００及び知識グラフエッジ情報１０００の一部作成を行うことにより、知識グラフを構成するノードとエッジとの関係性を特定し、知識グラフの構造を完成させる。 After that, the knowledge graph generation unit 147 executes knowledge graph generation processing, which generates the structure of the knowledge graph representing dialogue knowledge (step S1107). Specifically, the knowledge graph generation processing creates part of the knowledge graph node information 900 and the knowledge graph edge information 1000, thereby identifying the relationships between the nodes and edges that make up the knowledge graph and completing its structure.

次に、聞き返し質問生成部１４８が、ステップＳ１１０７で生成された知識グラフに、聞き返し質問及びその回答のテキスト（聞き返しテキスト、回答テキスト）を関連付ける聞き返し質問生成処理を実行する（ステップＳ１１０８）。聞き返し質問生成処理が実行されることにより、知識グラフ生成部１４７によって途中まで生成された知識グラフに聞き返しテキストが追加されるため、聞き返しテキスト付きの対話知識（対話フローモデル）としての知識グラフが完成する。 Next, the reflection question generation unit 148 executes reflection question generation processing, which associates the texts of the reflection questions and their answers (reflection text, answer text) with the knowledge graph generated in step S1107 (step S1108). Because this processing adds reflection text to the knowledge graph that the knowledge graph generation unit 147 has generated only partway, it completes the knowledge graph as dialogue knowledge (a dialogue flow model) with reflection text.

次に、知識グラフ可視化部１５０が、ステップＳ１１０７，Ｓ１１０８を経て生成された聞き返し質問のテキスト付きの知識グラフを可視化する知識グラフ可視化処理を実行する（ステップＳ１１０９）。 Next, the knowledge graph visualization unit 150 executes knowledge graph visualization processing, which visualizes the knowledge graph with reflection question text generated through steps S1107 and S1108 (step S1109).

その後、対話知識生成装置１００は、ユーザ端末２００からの追加入力があるか否かを判定し（ステップＳ１１１０）、追加入力がある場合は（ステップＳ１１１０のＹＥＳ）、ステップＳ１１０９に戻り、追加入力の内容に応じて、知識グラフ及び聞き返し質問を再度可視化する。一方、追加入力がない場合は（ステップＳ１１１０のＮＯ）、対話知識生成処理を終了する。 Thereafter, the dialogue knowledge generation apparatus 100 determines whether there is additional input from the user terminal 200 (step S1110). If there is (YES in step S1110), the processing returns to step S1109, and the knowledge graph and the reflection questions are visualized again according to the content of the additional input. If there is no additional input (NO in step S1110), the dialogue knowledge generation processing ends.
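The overall control flow of FIG. 11 can be summarized as a fixed sequence of steps followed by a loop that repeats only the visualization step while additional input arrives. The following sketch captures that flow with placeholder steps. The function name, the shared `state` dictionary, and the sample additional input are hypothetical; the step IDs are those used in the description.

```python
from typing import Callable, Dict, Iterable, List

STEP_ORDER = ["S1101", "S1102", "S1103", "S1104", "S1105",
              "S1106", "S1107", "S1108", "S1109"]


def run_dialogue_knowledge_generation(
        steps: Dict[str, Callable[[dict], None]],
        additional_inputs: Iterable[dict]) -> List[str]:
    """Run S1101 to S1109 once, then repeat S1109 for each additional input."""
    state: dict = {}
    trace: List[str] = []
    for step_id in STEP_ORDER:
        steps[step_id](state)
        trace.append(step_id)
    for extra in additional_inputs:    # S1110: YES branch
        state["additional_input"] = extra
        steps["S1109"](state)          # visualize again
        trace.append("S1109")
    return trace                       # S1110: NO branch ends the process


# Hypothetical no-op steps, used only to show the control flow.
noop_steps = {step_id: (lambda state: None) for step_id in STEP_ORDER}
trace = run_dialogue_knowledge_generation(noop_steps, [{"focus": "node Q1"}])
```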

以下では、対話知識生成処理における各ステップの処理について詳しく説明する。 The processing of each step in the dialogue knowledge generation processing will be described in detail below.

（３－１）手掛かり句推定処理 (3-1) Clue Phrase Estimation Processing
図１２は、手掛かり句推定処理の処理手順例を示すフローチャートである。手掛かり句推定処理は、図１１のステップＳ１１０１の処理であって、手掛かり句推定部１４１によって実行される。 FIG. 12 is a flowchart showing an example of the procedure of the clue phrase estimation processing. The clue phrase estimation processing corresponds to step S1101 in FIG. 11 and is executed by the clue phrase estimation unit 141.

手掛かり句推定処理ではまず、手掛かり句推定部１４１は、ユーザ端末２００から、業務文書（本例では業務マニュアル）、問い合わせ記録（本例では対話ログ）、及び初期手掛かり句の入力を受信する（ステップＳ１２０１）。 In the clue phrase estimation process, first, the clue phrase estimation unit 141 receives a business document (business manual in this example), an inquiry record (a dialogue log in this example), and an input of an initial clue phrase from the user terminal 200 (step S1201).

次に、手掛かり句推定部１４１は、ステップＳ１２０１で受信した初期手掛かり句に関する手掛かり句レコードを生成し、手掛かり句ＤＢ１１１に格納される手掛かり句情報３００に追加する（ステップＳ１２０２）。ステップＳ１２０２で生成される手掛かり句レコードは、生成方法３０４に「初期」が登録され、その他のフィールドにも適宜、入力された値が登録される。なお、ステップＳ１２０１においてユーザ端末２００から入力される初期手掛かり句は複数であってもよく、その場合、手掛かり句推定部１４１は、ステップＳ１２０２において受信した初期手掛かり句の分だけ、新規に手掛かり句レコードを生成する。また、ステップＳ１２０２において、手掛かり句推定部１４１は、ステップＳ１２０１で受信した業務文書及び問い合わせ記録を、対話知識生成装置１００内の適切な記憶場所（詳細は図示省略）に格納する。 Next, the clue phrase estimation unit 141 generates a clue phrase record for the initial clue phrase received in step S1201 and adds it to the clue phrase information 300 stored in the clue phrase DB 111 (step S1202). In the clue phrase record generated in step S1202, "initial" is registered in the generation method 304, and the input values are registered in the other fields as appropriate. A plurality of initial clue phrases may be input from the user terminal 200 in step S1201; in that case, the clue phrase estimation unit 141 generates, in step S1202, one new clue phrase record for each initial clue phrase received. Also in step S1202, the clue phrase estimation unit 141 stores the business document and inquiry record received in step S1201 in an appropriate storage location (not shown in detail) within the dialogue knowledge generation apparatus 100.

次に、手掛かり句推定部１４１は、新たな手掛かり句を抽出していない初期手掛かり句が存在するか否かを判定する（ステップＳ１２０３）。新たな手掛かり句を抽出していない初期手掛かり句が存在する場合、例えば、ステップＳ１２０２で初期手掛かり句が手掛かり句ＤＢ１１１に登録された直後のケースでは（ステップＳ１２０３のＹＥＳ）、ステップＳ１２０４に進む。一方、全ての初期手掛かり句について新たな手掛かり句を抽出済みである場合は（ステップＳ１２０３のＮＯ）、手掛かり句推定処理を終了する。 Next, the clue phrase estimation unit 141 determines whether or not there is an initial clue phrase from which no new clue phrase has been extracted (step S1203). If there is an initial clue phrase for which no new clue phrase has been extracted, for example, immediately after the initial clue phrase is registered in the clue phrase DB 111 in step S1202 (YES in step S1203), the process proceeds to step S1204. On the other hand, if new clue phrases have already been extracted for all initial clue phrases (NO in step S1203), the clue phrase estimation process ends.

ステップＳ１２０４では、手掛かり句推定部１４１は、ステップＳ１２０２で生成した手掛かり句レコードに基づいて、初期手掛かり句の対象表現を抽出する。そして手掛かり句推定部１４１は、抽出した対象表現から新たな手掛かり句を抽出し（ステップＳ１２０５）、抽出した新たな手掛かり句に関する手掛かり句レコードを生成し、手掛かり句ＤＢ１１１（手掛かり句情報３００）に格納し（ステップＳ１２０６）、ステップＳ１２０３に戻る。 In step S1204, the clue phrase estimation unit 141 extracts the target expression of the initial clue phrase based on the clue phrase record generated in step S1202. The clue phrase estimation unit 141 then extracts a new clue phrase from the extracted target expression (step S1205), generates a clue phrase record for the new clue phrase, stores it in the clue phrase DB 111 (clue phrase information 300) (step S1206), and returns to step S1203.

ステップＳ１２０５で抽出される新たな手掛かり句は、初期手掛かり句を補完する手掛かり句であり、新たな手掛かり句の手掛かり句レコードでは、生成方法３０４に「補完」と記載される。具体的には、初期手掛かり句の照合パターン３０７が「［普通名詞］のときは」とされ、ステップＳ１２０４において、マニュアルから対象表現として「契約のときは」というテキストが抽出されたとする。この場合、ステップＳ１２０５では、機械学習モデルや予め単語ごとに設定されたパラメータを利用する等して、「契約のときは」という表現を補完する表現が探索され、例えば「契約の際は」という表現が抽出されたとすると、「［普通名詞］の際は」といった手掛かり句が、新たな手掛かり句として抽出され、手掛かり句ＤＢ１１１に格納される。 The new clue phrase extracted in step S1205 is a clue phrase that complements the initial clue phrase, and "complement" is recorded in the generation method 304 of its clue phrase record. As a specific example, suppose that the matching pattern 307 of the initial clue phrase is 「［普通名詞］のときは」 ("when [common noun]") and that, in step S1204, the text 「契約のときは」 ("when making a contract") is extracted from the manual as the target expression. In this case, step S1205 searches for an expression that complements 「契約のときは」, for example by using a machine learning model or parameters set in advance for each word. If, for example, the expression 「契約の際は」 ("at the time of a contract") is found, a clue phrase such as 「［普通名詞］の際は」 ("at the time of [common noun]") is extracted as a new clue phrase and stored in the clue phrase DB 111.
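The 「のときは」 to 「の際は」 example can be sketched in Python as follows. This is only an illustration under explicit assumptions: the description leaves the search method open (a machine learning model or per-word parameters), so the sketch substitutes a small hand-written table of candidate connectives with made-up scores, approximates "common noun" by a run of kanji or katakana, and assumes the pattern begins with the variable part written in ASCII brackets. All names are hypothetical.

```python
import re
from typing import Dict, List

NOUN = r"[一-龥ァ-ヶー]+"  # ASSUMPTION: crude stand-in for "common noun"
VARIABLE = "[普通名詞]"

# ASSUMPTION: candidate connectives with made-up scores, standing in for the
# machine learning model or per-word parameters mentioned in the text.
CANDIDATE_SUFFIXES: Dict[str, float] = {"の際は": 0.8, "の場合は": 0.7}


def complement_clue_phrases(initial_pattern: str, corpus: str) -> List[dict]:
    """Derive "complement" clue phrases from one initial clue phrase."""
    suffix = initial_pattern.replace(VARIABLE, "")
    # S1204: find target expressions of the initial clue phrase; keep the nouns.
    nouns = set(re.findall("(" + NOUN + ")" + re.escape(suffix), corpus))
    new_records = []
    for candidate, score in CANDIDATE_SUFFIXES.items():
        # S1205: keep a candidate only if it follows one of those nouns in the corpus.
        if any(noun + candidate in corpus for noun in nouns):
            # S1206: record the generalized pattern as a "complement" clue phrase.
            new_records.append({"matching_pattern": VARIABLE + candidate,
                                "generation_method": "complement",
                                "confidence": score})
    return new_records


corpus = "契約のときは本人確認を行う。契約の際は書類を確認する。"
new_records = complement_clue_phrases("[普通名詞]のときは", corpus)
```

Here the noun 「契約」 found by the initial pattern also appears before 「の際は」 in the corpus, so only that connective is generalized into a new clue phrase.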

（３－２）属性値キーワード選定処理 (3-2) Attribute Value Keyword Selection Processing
図１３は、属性値キーワード選定処理の処理手順例を示すフローチャートである。属性値キーワード選定処理は、図１１のステップＳ１１０２の処理であって、属性値キーワード選定部１４２によって実行される。 FIG. 13 is a flowchart showing an example of the procedure of the attribute value keyword selection processing. The attribute value keyword selection processing corresponds to step S1102 in FIG. 11 and is executed by the attribute value keyword selection unit 142.

属性値キーワード選定処理ではまず、属性値キーワード選定部１４２は、手掛かり句ＤＢ１１１に格納された手掛かり句情報３００を用いて、対象表現カテゴリ３０３が「属性値」である手掛かり句レコードを抽出する（ステップＳ１３０１）。 In the attribute value keyword selection process, first, the attribute value keyword selection unit 142 uses the clue phrase information 300 stored in the clue phrase DB 111 to extract clue phrase records whose target expression category 303 is "attribute value" (step S1301).

次に、属性値キーワード選定部１４２は、ステップＳ１３０１で抽出した手掛かり句レコードについて、照合パターン３０７による属性値キーワードの抽出が実行されていない手掛かり句レコードの存在を確認する（ステップＳ１３０２）。 Next, the attribute value keyword selection unit 142 confirms the presence of clue phrase records for which extraction of attribute value keywords by the collation pattern 307 has not been executed among the clue phrase records extracted in step S1301 (step S1302).

ステップＳ１３０２において属性値キーワードの抽出が実行されていない手掛かり句レコードが存在する場合（ステップＳ１３０２のＹＥＳ）、属性値キーワード選定部１４２は、当該手掛かり句レコードの対象データカテゴリ３０２が示す対象データに対して、照合パターン３０７を検索キーとして検索することにより、該当する属性値キーワードを抽出し（ステップＳ１３０３）、抽出した属性値キーワードの確信度（確信度４０４）を計算し（ステップＳ１３０４）、抽出した属性値キーワードに関する属性値レコードを生成し、当該属性値レコードを属性値ＤＢ１１２の属性値キーワード情報４００に追加する（ステップＳ１３０５）。ステップＳ１３０５の終了後はステップＳ１３０２に戻る。 If there is a clue phrase record for which attribute value keyword extraction has not been performed in step S1302 (YES in step S1302), the attribute value keyword selection unit 142 selects the target data indicated by the target data category 302 of the clue phrase record. Then, by searching using the collation pattern 307 as a search key, the corresponding attribute value keyword is extracted (step S1303), the certainty factor (certainty factor 404) of the extracted attribute value keyword is calculated (step S1304), and extracted. An attribute value record related to the attribute value keyword is generated, and the attribute value record is added to the attribute value keyword information 400 of the attribute value DB 112 (step S1305). After the end of step S1305, the process returns to step S1302.

そして、ステップＳ１３０２において属性値キーワードの抽出が実行されていない手掛かり句レコードが存在しない場合は（ステップＳ１３０２のＮＯ）、属性値キーワード選定処理を終了する。 Then, if there is no clue phrase record for which attribute value keyword extraction has not been executed in step S1302 (NO in step S1302), the attribute value keyword selection process is terminated.
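The loop of steps S1301 to S1305 can be sketched as below. The sketch assumes that each clue phrase record already carries a ready-made regular expression (the `regex` key) in place of the matching pattern 307, and that target data is a list of lines or utterances. The confidence rule is a placeholder, because the description does not state how the degree of certainty is calculated. Field names and sample data are hypothetical.

```python
import re
from typing import Dict, List


def select_keywords(clue_records: List[dict], category: str,
                    documents: Dict[str, List[str]]) -> List[dict]:
    """Extract keywords of one target expression category from the target data."""
    # S1301: keep only clue phrases whose target expression category matches.
    pending = [r for r in clue_records
               if r["target_expression_category"] == category]
    selected: List[dict] = []
    while pending:                      # S1302: any unprocessed record left?
        clue = pending.pop(0)
        units = documents[clue["target_data_category"]]
        for index, unit in enumerate(units, start=1):
            for match in re.finditer(clue["regex"], unit):  # S1303: extract
                # S1304: ASSUMPTION, the formula is not given in the text. Reuse
                # the clue phrase's confidence, or 1.0 for an initial clue phrase.
                confidence = (clue["confidence"]
                              if clue["confidence"] is not None else 1.0)
                selected.append({       # S1305: add a keyword record
                    "keyword_id": f"K{len(selected) + 1}",
                    "keyword": match.group(1),
                    "position": index,
                    "confidence": confidence,
                    "source_clue_phrase_id": clue["clue_phrase_id"],
                })
    return selected


clues = [
    {"clue_phrase_id": "C1", "target_data_category": "manual",
     "target_expression_category": "attribute value",
     "regex": r"([一-龥ァ-ヶー]+)のときは", "confidence": None},
    {"clue_phrase_id": "C2", "target_data_category": "manual",
     "target_expression_category": "attribute name",
     "regex": r"([一-龥ァ-ヶー]+)について", "confidence": None},
]
manual = ["手続きについて説明する。", "契約のときは本人確認を行う。"]
values = select_keywords(clues, "attribute value", {"manual": manual})
```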

（３－３）属性名キーワード選定処理 (3-3) Attribute Name Keyword Selection Processing
図１４は、属性名キーワード選定処理の処理手順例を示すフローチャートである。属性名キーワード選定処理は、図１１のステップＳ１１０３の処理であって、属性名キーワード選定部１４３によって実行される。 FIG. 14 is a flowchart showing an example of the procedure of the attribute name keyword selection processing. The attribute name keyword selection processing corresponds to step S1103 in FIG. 11 and is executed by the attribute name keyword selection unit 143.

属性名キーワード選定処理ではまず、属性名キーワード選定部１４３は、手掛かり句ＤＢ１１１に格納された手掛かり句情報３００を用いて、対象表現カテゴリ３０３が「属性名」である手掛かり句レコードを抽出する（ステップＳ１４０１）。 In the attribute name keyword selection process, first, the attribute name keyword selection unit 143 uses the clue phrase information 300 stored in the clue phrase DB 111 to extract clue phrase records whose target expression category 303 is "attribute name" (step S1401).

次に、属性名キーワード選定部１４３は、ステップＳ１４０１で抽出した手掛かり句レコードについて、照合パターン３０７による属性名キーワードの抽出が実行されていない手掛かり句レコードの存在を確認する（ステップＳ１４０２）。 Next, the attribute name keyword selection unit 143 confirms the presence of clue phrase records for which attribute name keywords have not been extracted by the matching pattern 307 among the clue phrase records extracted in step S1401 (step S1402).

ステップＳ１４０２において属性名キーワードの抽出が実行されていない手掛かり句レコードが存在する場合（ステップＳ１４０２のＹＥＳ）、属性名キーワード選定部１４３は、当該手掛かり句レコードの対象データカテゴリ３０２が示す対象データに対して、照合パターン３０７を検索キーとして検索することにより、該当する属性名キーワードを抽出し（ステップＳ１４０３）、抽出した属性名キーワードの確信度（確信度５０４）を計算し（ステップＳ１４０４）、抽出した属性名キーワードに関する属性名レコードを生成し、当該属性名レコードを属性名ＤＢ１１３の属性名キーワード情報５００に追加する（ステップＳ１４０５）。ステップＳ１４０５の終了後はステップＳ１４０２に戻る。 In step S1402, if there is a clue phrase record for which attribute name keyword extraction has not been performed (YES in step S1402), the attribute name keyword selection unit 143 selects the target data indicated by the target data category 302 of the clue phrase record. Then, by searching using the collation pattern 307 as a search key, the corresponding attribute name keyword is extracted (step S1403), the certainty factor (certainty factor 504) of the extracted attribute name keyword is calculated (step S1404), and extracted. An attribute name record relating to the attribute name keyword is generated, and the attribute name record is added to the attribute name keyword information 500 of the attribute name DB 113 (step S1405). After the end of step S1405, the process returns to step S1402.

If no clue phrase record remains for which attribute name keyword extraction has not been executed (NO in step S1402), the attribute name keyword selection process ends.
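The loop of steps S1401 to S1405 can be sketched as follows. This is only an illustration under stated assumptions: the matching pattern 307 is modeled as a regular expression whose first capture group is the keyword, the position information is a character offset, and the confidence calculation, which this passage does not specify, is replaced by a placeholder function. The names `CluePhrase`, `KeywordRecord`, and `select_keywords` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class CluePhrase:
    target_data_category: str        # stands in for target data category 302
    target_expression_category: str  # stands in for target expression category 303
    matching_pattern: str            # assumed regex form of matching pattern 307


@dataclass
class KeywordRecord:
    keyword: str
    data_category: str
    position: int      # assumed: character offset within the target data
    confidence: float


def select_keywords(clue_phrases, target_data, category,
                    confidence_fn=lambda keyword, clue: 1.0):
    """Extract keywords of one category; confidence_fn is a placeholder."""
    records = []
    # S1401: keep only the clue phrases for the requested category.
    pending = [c for c in clue_phrases
               if c.target_expression_category == category]
    # S1402: loop while an unprocessed clue phrase record remains.
    while pending:
        clue = pending.pop(0)
        text = target_data.get(clue.target_data_category, "")
        # S1403: search the target data with the matching pattern.
        for match in re.finditer(clue.matching_pattern, text):
            keyword = match.group(1)
            # S1404 and S1405: score the keyword and store a record.
            records.append(KeywordRecord(
                keyword=keyword,
                data_category=clue.target_data_category,
                position=match.start(1),
                confidence=confidence_fn(keyword, clue),
            ))
    return records
```

Because the same loop shape is used for final answer keywords (steps S1501 to S1505), calling the function with a different `category` value would cover that case as well in this sketch.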

(3-4) Final Answer Keyword Selection Processing

FIG. 15 is a flowchart showing an example of the processing procedure for the final answer keyword selection process. The final answer keyword selection process corresponds to step S1104 in FIG. 11 and is executed by the final answer keyword selection unit 144.

In the final answer keyword selection process, the final answer keyword selection unit 144 first uses the clue phrase information 300 stored in the clue phrase DB 111 to extract the clue phrase records whose target expression category 303 is "final answer" (step S1501).

Next, the final answer keyword selection unit 144 checks whether, among the clue phrase records extracted in step S1501, any record remains for which final answer keywords have not yet been extracted using the matching pattern 307 (step S1502).

If such a clue phrase record exists (YES in step S1502), the final answer keyword selection unit 144 searches the target data indicated by the target data category 302 of that record, using the matching pattern 307 as the search key, and extracts the matching final answer keywords (step S1503). It then calculates the confidence (confidence 604) of each extracted final answer keyword (step S1504), generates a final answer record for the extracted keyword, and adds that record to the final answer keyword information 600 stored in the final answer DB 114 (step S1505). After step S1505, the process returns to step S1502.

If no clue phrase record remains for which final answer keyword extraction has not been executed (NO in step S1502), the final answer keyword selection process ends.

(3-5) Attribute Value-Final Answer Association Processing

FIG. 16 is a flowchart showing an example of the processing procedure for the attribute value-final answer association process. The attribute value-final answer association process corresponds to step S1105 in FIG. 11 and is executed by the attribute value-final answer association unit 145.

In the attribute value-final answer association process, the attribute value-final answer association unit 145 first refers to the attribute value DB 112 and the final answer DB 114, and extracts the attribute value records that make up the attribute value keyword information 400 stored in the attribute value DB 112 and the final answer records that make up the final answer keyword information 600 stored in the final answer DB 114 (step S1601).

Next, the attribute value-final answer association unit 145 determines whether, among the final answer records extracted in step S1601, there is a final answer record that has not yet been associated with any of the attribute value records extracted in step S1601 (step S1602). This determination can be made, for example, by referring to the attribute value-final answer correspondence information 700 stored in the attribute value-final answer correspondence DB 121.

If there is a final answer record that has not been associated with an attribute value record (YES in step S1602), the attribute value-final answer association unit 145 identifies the attribute value record corresponding to that final answer record (step S1603), generates an attribute value-final answer correspondence record based on the result, adds it to the attribute value-final answer correspondence information 700 in the attribute value-final answer correspondence DB 121 (step S1604), and returns to step S1602.

When identifying the attribute value record corresponding to a final answer record in step S1603, the attribute value-final answer association unit 145 searches for related pairs based on how close the position information of the two keywords is. More specifically, it compares the in-target-data position information 603 of the final answer record being associated with the in-target-data position information 403 of each attribute value record extracted in step S1601, and selects the attribute value record with the closest position as the record to associate. Attribute value records whose target data differs from that of the final answer keyword may be excluded from the search.

If no final answer record remains that has not been associated with an attribute value record (NO in step S1602), the attribute value-final answer association process ends.
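The nearest-position pairing of step S1603 might look like the sketch below. It assumes that the position information 403 and 603 can be reduced to a single integer offset and that each record names the target data it came from. Both are modeling assumptions, and `Keyword` and `associate_nearest` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Keyword:
    record_id: str
    text: str
    data_id: str   # which target data the keyword was extracted from
    position: int  # assumed: a single integer offset in that data


def associate_nearest(targets, candidates):
    """Map each target record ID to the ID of the nearest candidate."""
    pairs = {}
    for target in targets:
        # Optional exclusion: ignore candidates from other target data.
        same_data = [c for c in candidates if c.data_id == target.data_id]
        if not same_data:
            continue  # this sketch leaves such a target unassociated
        nearest = min(same_data,
                      key=lambda c: abs(c.position - target.position))
        pairs[target.record_id] = nearest.record_id
    return pairs
```

The same function would serve the attribute value-attribute name association (step S1703) by passing attribute value records as `targets` and attribute name records as `candidates`.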

(3-6) Attribute Value-Attribute Name Association Processing

FIG. 17 is a flowchart showing an example of the processing procedure for the attribute value-attribute name association process. The attribute value-attribute name association process corresponds to step S1106 in FIG. 11 and is executed by the attribute value-attribute name association unit 146.

In the attribute value-attribute name association process, the attribute value-attribute name association unit 146 first refers to the attribute value DB 112 and the attribute name DB 113, and extracts the attribute value records that make up the attribute value keyword information 400 stored in the attribute value DB 112 and the attribute name records that make up the attribute name keyword information 500 stored in the attribute name DB 113 (step S1701).

Next, the attribute value-attribute name association unit 146 determines whether, among the attribute value records extracted in step S1701, there is an attribute value record that has not yet been associated with any of the attribute name records extracted in step S1701 (step S1702). This determination can be made, for example, by referring to the attribute value-attribute name correspondence information 800 stored in the attribute value-attribute name correspondence DB 122.

If there is an attribute value record that has not been associated with an attribute name record (YES in step S1702), the attribute value-attribute name association unit 146 identifies the attribute name record corresponding to that attribute value record (step S1703), generates an attribute value-attribute name correspondence record based on the result, adds it to the attribute value-attribute name correspondence information 800 in the attribute value-attribute name correspondence DB 122 (step S1704), and returns to step S1702.

When identifying the attribute name record corresponding to an attribute value record in step S1703, the attribute value-attribute name association unit 146 searches for related pairs based on how close the position information of the two keywords is. More specifically, it compares the in-target-data position information 403 of the attribute value record being associated with the in-target-data position information 503 of each attribute name record extracted in step S1701, and selects the attribute name record with the closest position as the record to associate. Attribute name records whose target data differs from that of the attribute value keyword may be excluded from the search.

If no attribute value record remains that has not been associated with an attribute name record (NO in step S1702), the attribute value-attribute name association process ends.

(3-7) Knowledge Graph Generation Processing

FIG. 18 is a flowchart showing an example of the processing procedure for the knowledge graph generation process. The knowledge graph generation process corresponds to step S1107 in FIG. 11 and is executed by the knowledge graph generation unit 147.

In the knowledge graph generation process, the knowledge graph generation unit 147 first refers to the attribute value-final answer correspondence information 700 stored in the attribute value-final answer correspondence DB 121, generates a text (final answer text) from each pair of associated final answer keyword and attribute value keyword, and creates a new knowledge graph node that has the generated text as its value. For each new node, the knowledge graph generation unit 147 generates a knowledge graph node record that holds the generated text in the text 902 and "final answer" in the node category 903, assigns an ID to the node ID 901, and adds the record to the knowledge graph node information 900 of the knowledge graph node DB 131 (step S1801). The text of the final answer keyword can be obtained from the final answer keyword 602 of the final answer keyword information 600, and the text of the attribute value keyword can be obtained from the attribute value keyword 402 of the attribute value keyword information 400.

For example, if the attribute value-final answer correspondence information 700 associates the final answer keyword "telephone procedure" with the attribute value keyword "spouse", the knowledge graph generation unit 147 generates "telephone procedure (spouse)" as the final answer text in step S1801.

Next, the knowledge graph generation unit 147 refers to the knowledge graph edge information 1000 stored in the knowledge graph edge DB 132 and determines whether there are two or more knowledge graph node records whose node ID 901 is not registered in any inflow destination node ID 1003 (step S1802). Viewing the knowledge graph as a tree, step S1802 determines whether there are two or more nodes (child nodes) for which no parent node has been set. Accordingly, the first time the process reaches step S1802, no parent node has been set for any knowledge graph node, so no node has its node ID 901 registered in the inflow destination node ID 1003; that is, none of the knowledge graph node records generated in step S1801 has a registered parent yet.

If there are two or more knowledge graph node records whose node ID 901 is not registered in the inflow destination node ID 1003 (YES in step S1802), the knowledge graph generation unit 147 selects the set of knowledge graph nodes for which a parent node is to be generated (step S1803). Specifically, from the knowledge graph nodes whose node ID 901 is not registered in the inflow destination node ID 1003, it selects a subset of nodes based on the similarity of the character strings in the text 902. Step S1803 corresponds to selecting, from the child nodes that have no parent node, the child nodes that will share a common parent node.

Next, the knowledge graph generation unit 147 generates a knowledge graph node record that holds "clarifying question" in the node category 903, assigns an ID to the node ID 901, and adds the record to the knowledge graph node information 900 of the knowledge graph node DB 131 (step S1804). Step S1804 corresponds to generating a parent node common to the set of child nodes selected in step S1803. In the record created in step S1804, however, the text 902 (clarifying text) of the node whose node category 903 is "clarifying question" is left blank. The clarifying text is registered in the text 902 later, during the clarifying question generation process (see step S2006 in FIG. 20).

Next, for each knowledge graph node selected in step S1803, the knowledge graph generation unit 147 generates a knowledge graph edge record that holds the node ID 901 of that node in the inflow destination node ID 1003, fills in the inflow source node ID 1002 and the edge ID 1001, and adds the record to the knowledge graph edge information 1000 of the knowledge graph edge DB 132 (step S1805). The inflow source node ID 1002 of every such record is set to the same value: the node ID 901 of the knowledge graph node record generated in step S1804 (the node ID of the parent node). The edge ID 1001 is given a different ID for each generated knowledge graph edge record. Step S1805 corresponds to generating the edges that connect each child node selected in step S1803 to the parent node generated in step S1804. After step S1805, the process returns to step S1802.

When only one knowledge graph node record remains whose node ID 901 is not registered in the inflow destination node ID 1003 (NO in step S1802), the knowledge graph generation process ends. To elaborate, the root node of the tree-shaped knowledge graph does not need an incoming edge, so its node ID 901 does not need to be registered in any inflow destination node ID 1003. Therefore, once only one such knowledge graph node record remains, every other knowledge graph node is connected to a parent node, and the structural generation of the knowledge graph can be completed with the remaining node as the root node.

FIG. 19 is a diagram that illustrates the knowledge graph generation process visually. Nodes 1901 to 1907 shown in FIG. 19 are specific examples of knowledge graph nodes determined in step S1802 of FIG. 18 to have no node ID 901 registered in the inflow destination node ID 1003. The character string shown in each node represents the text of that node (corresponding to the text 902).

At the time the determination of step S1802 is made in the knowledge graph generation process, nodes 1901 to 1907 have no parent node set, and node 1908 and edges 1911 to 1913 shown in FIG. 19 do not yet exist.

Next, in step S1803, nodes 1901 to 1903, for example, are selected from the child nodes that have no parent node (that is, nodes 1901 to 1907) as the child nodes that will share a common parent node. In FIG. 19, nodes 1901 to 1903 all contain the character string "telephone procedure", so they are selected, based on the similarity of their character strings, as the set of child nodes to which a common parent node is assigned.

Next, in step S1804, node 1908 is generated as the parent node common to the set of child nodes selected in step S1803 (nodes 1901 to 1903). As described above with reference to FIG. 18, the text 902 of node 1908 is blank at the time of step S1804.

Then, in step S1805, edges 1911 to 1913 are generated as the edges that connect each of the child nodes selected in step S1803 (nodes 1901 to 1903) to the parent node generated in step S1804 (node 1908).

Through the above processing, nodes 1901 to 1903, which had no parent node in the tree shown in FIG. 19, are connected to their parent node 1908 by edges 1911 to 1913. By repeating this processing, the knowledge graph generation unit 147 can generate an edge to a parent node for every node except the single root node, and can thereby complete the structure of the knowledge graph.
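The bottom-up loop of steps S1801 to S1805 can be sketched as follows. The description only says that child nodes are grouped by string similarity, so this sketch makes two assumptions of its own: similarity is measured with `difflib.SequenceMatcher` against a hypothetical threshold of 0.6, and when no two parentless nodes are similar enough, all remaining parentless nodes are grouped under one parent so that the loop terminates. The function name and dictionary layout are illustrative only.

```python
import itertools
from difflib import SequenceMatcher


def build_structure(final_answer_texts, threshold=0.6):
    """Group parentless nodes under new parents until one root remains."""
    node_ids = itertools.count(1)
    edge_ids = itertools.count(1)
    nodes = {}  # node ID -> {"text", "category"}
    edges = []  # each: {"edge_id", "source", "destination", "text"}

    # S1801: one leaf node per final answer text.
    for text in final_answer_texts:
        nodes[next(node_ids)] = {"text": text, "category": "final answer"}

    def parentless():
        has_parent = {e["destination"] for e in edges}
        return [n for n in nodes if n not in has_parent]

    def similar_to(seed, pool):
        # Members of pool whose text resembles the seed's text.
        return [n for n in pool
                if n == seed or SequenceMatcher(
                    None, nodes[seed]["text"], nodes[n]["text"]
                ).ratio() >= threshold]

    while len(parentless()) >= 2:                      # S1802
        pool = parentless()
        # S1803: take the first group with two or more similar nodes;
        # otherwise (assumption) group everything that is left.
        group = next((g for g in (similar_to(s, pool) for s in pool)
                      if len(g) >= 2), pool)
        # S1804: new parent whose question text is still blank.
        parent = next(node_ids)
        nodes[parent] = {"text": "", "category": "clarifying question"}
        # S1805: one edge per selected child, all sharing the parent.
        for child in group:
            edges.append({"edge_id": next(edge_ids), "source": parent,
                          "destination": child, "text": ""})
    return nodes, edges
```

Each pass gives a parent to at least two parentless nodes and adds only one new parentless node, so the number of parentless nodes falls on every pass until a single root remains.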

(3-8) Clarifying Question Generation Processing

FIG. 20 is a flowchart showing an example of the processing procedure for the clarifying question generation process. The clarifying question generation process corresponds to step S1108 in FIG. 11 and is executed by the clarifying question generation unit 148.

In the clarifying question generation process, the clarifying question generation unit 148 first refers to the knowledge graph node information 900 stored in the knowledge graph node DB 131 and extracts the knowledge graph node records whose node category 903 is "clarifying question" (step S2001).

Next, the clarifying question generation unit 148 determines whether, among the knowledge graph node records extracted in step S2001, there is a record whose text 902 does not yet hold the content of a clarifying question (clarifying text) (step S2002). Step S2002 corresponds to searching for a parent node whose clarifying question is empty.

If there is a knowledge graph node record whose text 902 holds no clarifying text (YES in step S2002), the clarifying question generation unit 148 refers to the knowledge graph edge information 1000 stored in the knowledge graph edge DB 132, extracts the knowledge graph edge records in which the node ID 901 of that node record is registered as the inflow source node ID 1002, and collects the inflow destination node ID 1003 of each extracted record into a group of inflow destination node IDs (step S2003). Step S2003 searches for the edges that have the parent node with the empty clarifying question as their inflow source node and extracts the node IDs of the inflow destination nodes (child nodes) those edges connect to. In other words, it extracts the node IDs of the set of knowledge graph nodes selected in step S1803 of FIG. 18.

Next, the clarifying question generation unit 148 refers to the knowledge graph node information 900 stored in the knowledge graph node DB 131 and collects the text 902 of each knowledge graph node record corresponding to the inflow destination node IDs extracted in step S2003 into a text group (step S2004). Step S2004 corresponds to extracting the texts of the inflow destination nodes whose node IDs were extracted in step S2003.

Next, the clarifying question generation unit 148 refers to the text group extracted in step S2004 and the attribute value-attribute name correspondence information 800 stored in the attribute value-attribute name correspondence DB 122 (strictly speaking, it also refers to the attribute value keyword information 400 and the attribute name keyword information 500), and extracts the attribute value keyword used to generate each text (text 902) in the text group together with the attribute name keyword corresponding to that attribute value keyword (step S2005). Each text 902 was generated by the knowledge graph generation unit 147 in step S1801 of the knowledge graph generation process shown in FIG. 18 from an associated pair of final answer keyword and attribute value keyword, and step S2005 extracts the attribute value keyword used in that generation. Step S2005 thus extracts the attribute value keywords contained in the text group extracted in step S2004 and their corresponding attribute name keywords. The extracted attribute value keywords basically correspond to the portions of the character strings that differ within the set of knowledge graph nodes selected by string similarity in step S1803 of FIG. 18, and the extracted attribute name keyword is expected to be the same for all of those attribute value keywords.

Next, the clarifying question generation unit 148 registers the attribute name keyword extracted in step S2005 as the clarifying text in the text 902 of the knowledge graph node record of the parent node found in step S2002 to have an empty clarifying question, and stores it in the knowledge graph node DB 131. The clarifying question generation unit 148 also registers the differences among the attribute value keywords extracted in step S2005, as the answer texts to the clarifying question, in the text 1004 of the knowledge graph edge records used to extract the inflow destination node IDs in step S2003, and stores them in the knowledge graph edge DB 132 (step S2006). The differences among the attribute value keywords extracted in step S2005 can be obtained by a known method such as the Linux (registered trademark) diff command. Step S2006 corresponds to registering, in the text of the parent node (clarifying text), the attribute name keyword that corresponds to the common (or similar) attribute value keywords in the texts of the child nodes, and registering the differences among the attribute value keywords in the texts of the child nodes in the text (answer text) of the edges that connect the parent node to the child nodes. After step S2006, the process returns to step S2002.
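As a rough illustration of what "difference" means here, the sketch below strips the longest common prefix and suffix from the child node texts and keeps what remains. This is a simplified stand-in chosen for brevity, not the diff command mentioned in the description, and `differing_parts` is a hypothetical helper name.

```python
import os


def differing_parts(texts):
    """Return, for each text, the part left after removing the shared
    leading and trailing characters (character-level comparison)."""
    prefix = os.path.commonprefix(texts)
    tails = [t[len(prefix):] for t in texts]
    # Reverse the tails so the shared suffix becomes a shared prefix.
    suffix_len = len(os.path.commonprefix([t[::-1] for t in tails]))
    return [t[:len(t) - suffix_len] for t in tails]


parts = differing_parts([
    "telephone procedure (spouse)",
    "telephone procedure (child)",
    "telephone procedure (parents)",
])
print(parts)
```

Because the comparison is purely character-based, values that happen to share leading or trailing letters would be trimmed too aggressively. Looking the attribute value keyword up from the correspondence information, as step S2005 does, avoids that problem.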

If no knowledge graph node record remains whose text 902 holds no clarifying text (NO in step S2002), all fields have been registered for every knowledge graph node record whose node category 903 is "clarifying question" in the knowledge graph node information 900 and for every knowledge graph edge record in the knowledge graph edge information 1000, so the clarifying question generation process ends. In other words, when the clarifying question generation process ends, the knowledge graph is complete as dialog knowledge with clarifying texts.

In this embodiment, the knowledge graph generation process shown in FIG. 18 and the clarifying question generation process shown in FIG. 20 may be executed alternately. More specifically, for example, the process of FIG. 20 may be executed each time steps S1802 to S1805 of FIG. 18 have completed once, and the process may return to step S1802 of FIG. 18 each time steps S2002 to S2006 of FIG. 20 have completed once.

FIG. 21 is a diagram that illustrates the clarifying question generation process visually. Nodes 1901 to 1903 and 1908 and edges 1911 to 1913 shown in FIG. 21 correspond to the elements with the same reference numerals in FIG. 19. That is, nodes 1901 to 1903 are knowledge graph nodes that are child nodes assigned a common parent node (node 1908), node 1908 is the knowledge graph node that is the parent of nodes 1901 to 1903, and edges 1911 to 1913 are the knowledge graph edges that connect node 1908 to nodes 1901 to 1903.

When the determination of step S2002 is made in the clarifying question generation process, node 1908 in FIG. 21 is found to be a knowledge graph node record whose text 902 holds no clarifying text. At that point, the text "relationship" has not yet been registered in node 1908. Edges 1911 to 1913 were already generated in the preceding knowledge graph generation process (see FIG. 19), but their texts "spouse", "child", and "parents" have not yet been registered either.

Next, in step S2003, the knowledge graph edge information 1000 is referred to in order to find the edges 1911 to 1913 that have the parent node with the empty clarifying question (node 1908) as their inflow source node, and the node IDs of the inflow destination nodes those edges connect to (nodes 1901 to 1903) are extracted.

Next, in step S2004, knowledge graph node information 900 is referenced to extract a text group that collects the texts 902 of the knowledge graph nodes (nodes 1901 to 1903) corresponding to the inflow destination node IDs extracted in step S2003. In the case of FIG. 21, three texts are extracted: "telephone procedure (spouse)", "telephone procedure (child)", and "telephone procedure (parents)".

Next, in step S2005, the attribute value keywords contained in the text group extracted in step S2004 and their corresponding attribute name keywords are extracted. In FIG. 21, the extracted combinations are shown as attribute name keywords 2101 and attribute value keywords 2102. Each attribute name keyword is obtained from attribute name keyword 502 of attribute name keyword information 500, and each attribute value keyword is obtained from attribute value keyword 402 of attribute value keyword information 400. Referring to FIG. 21, the attribute name keyword 2101 is "relationship" in every case, while the attribute value keyword 2102 differs from node to node.

Next, in step S2006, two registrations are performed. First, the attribute name keyword 2101 corresponding to the attribute value keywords that are common (or similar) across the text group of the child nodes (nodes 1901 to 1903) is registered in text 902 of the parent node (node 1908). Second, the differences among the attribute value keywords 2102 in the text group of the child nodes are registered in text 1004 of the edges (edges 1911 to 1913) connecting the parent node to the child nodes. In the case of FIG. 21, the attribute value keywords 2102 differ for every node, so each attribute value keyword 2102 is used as is as the difference.

Through the above processing, in the tree diagram shown in FIG. 21, "relationship" can be registered as the reflection text of node 1908, which had been set as the parent node of nodes 1901 to 1903, and "spouse", "child", and "parents" can be registered as the respective answer texts of edges 1911 to 1913, which connect node 1908 to nodes 1901 to 1903. As a result, every field of every record in knowledge graph node information 900 and knowledge graph edge information 1000 is registered, and the knowledge graph with reflection text is complete.
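Steps S2002 to S2006 can be illustrated with the following Python sketch, using the example of FIG. 21. It is a simplified illustration under stated assumptions, not the embodiment's implementation: the `Node` and `Edge` classes, the `VALUE_TO_NAME` dictionary (a stand-in for the attribute value and attribute name correspondence), and the substring matching used to find keywords are all hypothetical. The sketch also assumes, as in FIG. 21, that every child text contains a different attribute value, so each value is used directly as the difference.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    node_id: int
    text: str = ""  # plays the role of text 902; empty means no reflection text yet


@dataclass
class Edge:
    source_id: int  # inflow source (parent) node ID
    target_id: int  # inflow destination (child) node ID
    text: str = ""  # plays the role of text 1004; empty means no answer text yet


# Hypothetical stand-in for the attribute value / attribute name correspondence.
VALUE_TO_NAME: Dict[str, str] = {
    "spouse": "relationship",
    "child": "relationship",
    "parents": "relationship",
}


def find_attribute(text: str) -> Optional[Tuple[str, str]]:
    """Return the first (attribute value, attribute name) pair found in text (S2005)."""
    for value, name in VALUE_TO_NAME.items():
        if value in text:
            return value, name
    return None


def generate_reflection_questions(nodes: Dict[int, Node], edges: List[Edge]) -> None:
    for parent in nodes.values():
        if parent.text:  # S2002: skip nodes whose text is already registered
            continue
        # S2003: edges whose inflow source is the empty parent node
        out_edges = [e for e in edges if e.source_id == parent.node_id]
        names = set()
        for edge in out_edges:
            # S2004: look at the text of the inflow destination (child) node
            found = find_attribute(nodes[edge.target_id].text)
            if found is None:
                continue
            value, name = found
            edge.text = value  # S2006: the differing value becomes the answer text
            names.add(name)
        if len(names) == 1:  # S2006: the shared name becomes the reflection text
            parent.text = names.pop()


nodes = {
    1901: Node(1901, "telephone procedure (spouse)"),
    1902: Node(1902, "telephone procedure (child)"),
    1903: Node(1903, "telephone procedure (parents)"),
    1908: Node(1908),
}
edges = [Edge(1908, 1901), Edge(1908, 1902), Edge(1908, 1903)]
generate_reflection_questions(nodes, edges)
print(nodes[1908].text, [e.text for e in edges])
```

In this sketch a parent receives a reflection text only when all of its children agree on a single attribute name; how the embodiment treats children with mixed attribute names is not covered here.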

(3-9) Knowledge graph visualization process
FIG. 22 is a diagram showing an example (part 1) of knowledge graph visualization. The display screen 2200 shown in FIG. 22 is an example of a knowledge graph visualization screen that the knowledge graph visualization unit 150 displays on the output device 15 (or alternatively on the user terminal 200) through the process of step S1109 in FIG. 11. It visualizes the knowledge graph in network format.

The display content of the display screen 2200 is formed from knowledge graph node information 900 and knowledge graph edge information 1000. For example, in FIG. 22, node 2201 is the knowledge graph node corresponding to the root node, and node 2202 is one of the knowledge graph nodes corresponding to child nodes of node 2201. The information on nodes 2201 and 2202 can be obtained from knowledge graph node information 900. Specifically, "service subscription time" shown in node 2201 and "payment form" shown in node 2202 are the question texts (reflection texts) of the respective nodes and can be obtained from text 902 of knowledge graph node information 900. The information on edge 2211, which connects node 2201 and node 2202, and on edge 2212, which connects node 2202 to the nodes that follow it, can be obtained from knowledge graph edge information 1000. Specifically, "2019 or earlier" shown on edge 2211 and "monthly" shown on edge 2212 are the answer texts of the respective edges and can be obtained from text 1004 of knowledge graph edge information 1000.

FIG. 23 is a diagram showing an example (part 2) of knowledge graph visualization. The display screen 2300 shown in FIG. 23 is an example of a knowledge graph visualization screen that the knowledge graph visualization unit 150 displays on the output device 15 (or alternatively on the user terminal 200) through the process of step S1109 in FIG. 11. It visualizes the knowledge graph in table format.

As with the display screen 2200 shown in FIG. 22, the display content of the display screen 2300 is formed from knowledge graph node information 900 and knowledge graph edge information 1000. The block diagram shown in the balloon in FIG. 23 is included for reference and shows the structural relationship among the nodes listed in the table of the display screen 2300.

According to the display screen 2300 of FIG. 23, when the question about "relationship" held by node 2301 (node ID "5") is asked, the dialogue branches to edge 2311 or edge 2312 depending on the answer. If the answer is "spouse", the dialogue follows edge 2311 to node 2302 (node ID "3"), and the telephone procedure for a spouse is presented as the final answer. If the answer is "child", the dialogue follows edge 2312 to node 2303 (node ID "4"), and the telephone procedure for a child is presented as the final answer.
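The branching described for FIG. 23 can be sketched as a lookup over a node table and an edge table. The following Python illustration is a minimal sketch under assumptions: the table layout, the function names `next_node` and `guide`, and the exact final answer texts for node IDs "3" and "4" are hypothetical (the texts are borrowed from the FIG. 21 example), and an unrecognized answer simply leaves the dialogue at the current node.

```python
from typing import Dict, List, Optional, Tuple

# Node table: node ID -> question text or final answer text (hypothetical contents).
NODES: Dict[int, str] = {
    5: "relationship",
    3: "telephone procedure (spouse)",
    4: "telephone procedure (child)",
}

# Edge table: (inflow source node ID, inflow destination node ID, answer text).
EDGES: List[Tuple[int, int, str]] = [
    (5, 3, "spouse"),
    (5, 4, "child"),
]


def next_node(current_id: int, answer: str) -> Optional[int]:
    """Return the node reached from current_id by the edge whose text matches answer."""
    for source_id, target_id, text in EDGES:
        if source_id == current_id and text == answer:
            return target_id
    return None


def guide(start_id: int, answers: List[str]) -> str:
    """Follow the answers from start_id and return the text of the node reached."""
    current_id = start_id
    for answer in answers:
        target_id = next_node(current_id, answer)
        if target_id is None:  # no matching edge: stay on the current question
            break
        current_id = target_id
    return NODES[current_id]


print(guide(5, ["spouse"]))
print(guide(5, ["child"]))
```

Keeping nodes and edges in separate tables mirrors the split between knowledge graph node information 900 and knowledge graph edge information 1000, so the same records could drive either the network view or the table view.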

As shown in FIGS. 22 and 23, the dialogue knowledge generation device 100 visualizes a knowledge graph representing dialogue knowledge (a dialogue flow model) with reflection text and provides it to the user. The user can thereby not only easily understand the overall picture of the dialogue knowledge but also identify the reflection questions needed as the dialogue progresses.

As described above, the dialogue knowledge generation device 100 according to this embodiment combines multiple data sources, such as business documents (business manuals) and inquiry records (dialogue logs), to automatically build a dialogue flow model (knowledge graph) with reflection text while reducing manual workload and construction cost. Furthermore, because the dialogue knowledge generation device 100 can visualize the generated knowledge graph and provide it to the user, as illustrated in FIGS. 22 and 23, the person in charge of the dialogue work (the user) can manage the dialogue while referring to the visualized knowledge graph. Even a non-expert can proceed with the dialogue work by asking back about unclear points and can easily present an appropriate final answer to the person using the dialogue.

The present invention is not limited to the embodiment described above and includes various modifications. For example, the extraction of attribute value keywords, attribute name keywords, and final answer keywords need not use the matching pattern 307 of clue phrase information 300. In that case, for example, a prediction model generated in advance by machine learning may be used to extract keywords and calculate confidence scores. Likewise, the estimation of the correspondence between attribute value keywords and attribute name keywords need not use the in-target-data position information 403 and 503. In that case, for example, external open data or an ontology may be used to identify that the superordinate concept of "child" is "relationship", and the estimation may be performed while narrowing down the candidate correspondences.
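The ontology-based variation mentioned above can be sketched as a simple filter. This Python illustration is only a sketch under assumptions: the `HYPERNYMS` dictionary is a hypothetical, hand-written stand-in for external open data or an ontology, and `narrow_attribute_names` is an illustrative name, not part of the embodiment.

```python
from typing import Dict, List, Set

# Hypothetical ontology fragment: each term maps to its superordinate concepts.
HYPERNYMS: Dict[str, Set[str]] = {
    "child": {"relationship"},
    "spouse": {"relationship"},
    "monthly": {"payment form"},
}


def narrow_attribute_names(attribute_value: str, candidates: List[str]) -> List[str]:
    """Keep only the candidate attribute names that are superordinate concepts
    of attribute_value according to the ontology fragment."""
    broader = HYPERNYMS.get(attribute_value, set())
    return [name for name in candidates if name in broader]


print(narrow_attribute_names("child", ["relationship", "payment form"]))
```

A value that is missing from the ontology yields an empty candidate list in this sketch; a practical system might instead fall back to the position-based estimation.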

The embodiment above has been described in detail to make the present invention easy to understand, and the invention is not necessarily limited to configurations that include every element described. Other configurations may be added to, deleted from, or substituted for part of the configuration of the embodiment.

Some or all of the configurations, functions, processing units, processing means, and the like described above may be implemented in hardware, for example by designing them as integrated circuits. The configurations, functions, and the like described above may also be implemented in software, with a processor interpreting and executing programs that implement the respective functions. Information such as the programs, tables, and files that implement each function can be stored in memory, in a recording device such as a hard disk or an SSD (Solid State Drive), or on a recording medium such as an IC card, an SD card, or a DVD.

The drawings show the control lines and information lines considered necessary for explanation and do not necessarily show every control line and information line in a product. In practice, almost all components may be considered to be interconnected.

10 Computer
11 CPU
12 Memory
13 Storage device
14 Input device
15 Output device
16 Network device
17 Bus
100 Dialogue knowledge generation device
110 Text information DB
111 Clue phrase DB
112 Attribute value DB
113 Attribute name DB
114 Final answer DB
120 Relationship information DB
121 Attribute value-final answer correspondence DB
122 Attribute value-attribute name correspondence DB
130 Graph information DB
131 Knowledge graph node DB
132 Knowledge graph edge DB
140 Control unit
141 Clue phrase estimation unit
142 Attribute value keyword selection unit
143 Attribute name keyword selection unit
144 Final answer keyword selection unit
145 Attribute value-final answer association unit
146 Attribute value-attribute name association unit
147 Knowledge graph generation unit
148 Reflection question generation unit
150 Knowledge graph visualization unit
200 User terminal
300 Clue phrase information
400 Attribute value keyword information
500 Attribute name keyword information
600 Final answer keyword information
700 Attribute value-final answer correspondence information
800 Attribute value-attribute name correspondence information
900 Knowledge graph node information
1000 Knowledge graph edge information
2200, 2300 Display screen

Claims

1. A dialogue knowledge generation device that automatically generates a knowledge graph representing a dialogue flow model, the device comprising:
a keyword selection unit that selects, from target data including business documents and past dialogue records, attribute values that are keywords of a dialogue and attribute names that indicate categories of the attribute values, using clue phrases that indicate the respective selection conditions;
an association unit that determines a correspondence relationship between the attribute values and the attribute names selected by the keyword selection unit, based on their respective acquisition positions in the target data; and
a graph generation unit that generates the knowledge graph by extracting, based on the correspondence relationship determined by the association unit, a hierarchical relationship among a plurality of items each having the attribute value or the attribute name, and combining the plurality of items based on the extracted hierarchical relationship.

2. The dialogue knowledge generation device according to claim 1, further comprising a clue phrase estimation unit that extracts a target expression from the target data using an initial clue phrase input from outside and estimates a new clue phrase that complements the extracted target expression,
wherein the keyword selection unit selects the attribute value and the attribute name from the target data using the initial clue phrase and the new clue phrase estimated by the clue phrase estimation unit.

3. The dialogue knowledge generation device according to claim 1, further comprising a knowledge graph visualization unit that outputs the knowledge graph generated by the graph generation unit in a visible form.

4. The dialogue knowledge generation device according to claim 1, wherein the knowledge graph is represented by a structure in which nodes holding the text of a question or a final answer in a dialogue and edges holding the text of an answer to the question are hierarchically connected.

5. The dialogue knowledge generation device according to claim 4, wherein the graph generation unit comprises:
a knowledge graph generation unit that generates the structure of the knowledge graph; and
a reflection question generation unit that adds the content of the question in the dialogue and the content of the answer to the question to the structure of the knowledge graph generated by the knowledge graph generation unit.

6. The dialogue knowledge generation device according to claim 5, wherein, in generating the knowledge graph, the knowledge graph generation unit selects a set of child nodes, based on the similarity of the texts they hold, from among a plurality of the nodes for which a parent node to connect to has not been created, and creates a common parent node for the selected set of child nodes.

7. The dialogue knowledge generation device according to claim 6, wherein the reflection question generation unit assigns, to the parent node as reflection text, the attribute name corresponding to the similar portion of the texts in the set of child nodes.

8. The dialogue knowledge generation device according to claim 6, wherein the reflection question generation unit assigns, as answer text, the attribute value corresponding to a difference in the texts in the set of child nodes to the edge connecting the child node having the difference and the parent node.

9. The dialogue knowledge generation device according to claim 5, wherein the keyword selection unit further selects final answers to the dialogue from the target data using the clue phrases, and
the association unit further determines a correspondence relationship between the attribute values selected by the keyword selection unit and the final answers, based on their respective acquisition positions in the target data.

10. The dialogue knowledge generation device according to claim 9, wherein the knowledge graph generation unit generates a final answer node based on the correspondence relationship between the attribute value and the final answer determined by the association unit, and assigns a combination of the attribute value and the final answer having the correspondence relationship to the final answer node as final answer text.

11. The dialogue knowledge generation device according to claim 1, wherein the target data is text data.

12. A dialogue knowledge generation method performed by a dialogue knowledge generation device that automatically generates a knowledge graph representing a dialogue flow model, the method comprising:
a keyword selection step in which the dialogue knowledge generation device selects, from target data including business documents and past dialogue records, attribute values that are keywords of a dialogue and attribute names that indicate categories of the attribute values, using clue phrases that indicate the respective selection conditions;
an association step in which the dialogue knowledge generation device determines a correspondence relationship between the attribute values and the attribute names selected in the keyword selection step, based on their respective acquisition positions in the target data; and
a graph generation step in which the dialogue knowledge generation device generates the knowledge graph by extracting, based on the correspondence relationship determined in the association step, a hierarchical relationship among a plurality of items each having the attribute value or the attribute name, and combining the plurality of items based on the extracted hierarchical relationship.