WO2024134768A1 - Natural language processing device, natural language processing method, and computer program - Google Patents

Natural language processing device, natural language processing method, and computer program

Info

Publication number
WO2024134768A1
Authority
WO
WIPO (PCT)
Prior art keywords
learning
text
document structure
data
natural language
Prior art date
Application number
PCT/JP2022/046889
Other languages
French (fr)
Japanese (ja)
Inventor
弘毅 中西
公雄 土川
晴夫 大石
Original Assignee
Nippon Telegraph and Telephone Corporation (日本電信電話株式会社)
Filing date
2022-12-20
Publication date
2024-06-27
Application filed by Nippon Telegraph and Telephone Corporation
Publication of WO2024134768A1 publication Critical patent/WO2024134768A1/en

Definitions

  • the present invention relates to a natural language processing device, a natural language processing method, and a computer program.
  • Non-Patent Document 1 describes a domain-specific natural language processing and construction framework developed based on a pre-training method for natural language processing and a general-purpose language model.
  • the present invention was made in consideration of the above circumstances, and aims to provide a natural language processing device, a natural language processing method, and a computer program that accurately grasps context.
  • the natural language processing device includes a document structure data generation unit that acquires table of contents data of a learning procedure manual and generates document structure data showing a tree structure corresponding to the table of contents data, a learning data generation unit that references nodes connected by edges in order from the root node side of the document structure data and, when the node is acquired, combines the text of the learning procedure manual pointed to by the node with learning text, a pre-learning processing unit that constructs a trained evaluation model by deep learning using the learning text, an evaluation target text input unit that inputs evaluation target text, and an evaluation target text evaluation unit that evaluates the input evaluation target text using the trained evaluation model.
  • the natural language processing method includes: acquiring table of contents data of a learning procedure manual, generating document structure data showing a tree structure corresponding to the table of contents data, referencing nodes connected by edges in order from the root node side of the document structure data and, when the node is acquired, combining the text of the learning procedure manual pointed to by the node with a learning text, constructing a trained evaluation model by deep learning using the learning text, inputting a text to be evaluated, and evaluating the input text to be evaluated using the trained evaluation model.
  • a computer program according to a third aspect of the present invention causes a computer to execute the natural language processing method according to the second aspect.
  • the present invention provides a natural language processing device, a natural language processing method, and a computer program that can accurately grasp context.
  • FIG. 1 is a block diagram illustrating an example of a configuration of a natural language processing apparatus according to an embodiment.
  • FIG. 2 is a diagram showing an example of document structure data generated by a document structure data generating unit of the natural language processing apparatus according to an embodiment.
  • FIG. 3 is a flowchart illustrating an example of a learning data generation process of the procedure manual analysis unit of the natural language processing apparatus according to an embodiment.
  • FIG. 4 is a diagram illustrating an example of a natural language processing method of the natural language processing apparatus according to an embodiment.
  • FIG. 1 is a block diagram illustrating an example of the configuration of a natural language processing apparatus 1 according to an embodiment.
  • the natural language processing device 1 of this embodiment is a device that learns using document structure data generated from table of contents data of a learning procedure manual and learning text generated for each of the document structure data, and includes at least one processor and a memory in which a program executed by the processor is recorded, and can realize various functions described below by software or a combination of software and hardware.
  • the natural language processing device 1 includes a learning procedure manual data storage unit 2, a learning procedure manual reading unit 3, a procedure manual analysis unit 4, a document structure data generation unit 5, a learning data generation unit 6, a deep learning type natural language processing unit 7, a pre-learning processing unit 8, an evaluation target text evaluation unit 9, an evaluation result output unit 10, and an evaluation target text input unit 11.
  • the learning procedure manual data storage unit 2 stores data on the learning procedure manual to be learned in advance.
  • the learning procedure manual data storage unit 2 is, for example, a memory, and acquires and stores the learning procedure manual from the outside.
  • the learning procedure manual data storage unit 2 outputs the stored learning procedure manual to the learning procedure manual reading unit 3 as necessary.
  • the learning procedure manual is, for example, an operation manual, and has a table of contents including, for example, the titles and summaries of chapters, sections, and paragraphs.
  • the learning procedure manual reading unit 3 acquires learning procedure manual data from the learning procedure manual data storage unit 2.
  • the learning procedure manual reading unit 3 transmits the acquired learning procedure manual data in response to a request from the procedure manual analysis unit 4.
  • the learning procedure manual reading unit 3 may also periodically acquire learning procedure manual data from the learning procedure manual data storage unit 2 and transmit it to the procedure manual analysis unit 4.
  • the learning procedure manual data storage unit 2 and the learning procedure manual reading unit 3 do not necessarily need to be included in the natural language processing device 1, and may be configured to be connected to the natural language processing device 1 from the outside.
  • FIG. 2 is a diagram showing an example of document structure data generated by the document structure data generating unit 5 of the natural language processing apparatus 1 according to an embodiment.
  • the procedure manual analysis unit 4 includes a document structure data generation unit 5 and a learning data generation unit 6 .
  • the document structure data generation unit 5 generates document structure data showing a tree structure corresponding to the table of contents data included in the learning procedure manual data acquired from the learning procedure manual reading unit 3.
  • in the document structure data, the root node is “table of contents,” and the nodes “Chapter 1,” “Chapter 2,” and “Chapter 3” branch off from the root node by edges (branches).
  • the table of contents data includes, for example, the text described in the chapters, sections, and paragraphs.
  • the document structure data generation unit 5 divides the learning procedure manual according to the number of chapters. For example, if there are three chapters, the document structure data generation unit 5 divides the learning procedure manual into three and generates document structure data.
  • the method of dividing the document structure data is not limited to the above; it may be divided, for example, by the number of sections or the number of paragraphs.
  • the document structure data generation unit 5 obtains the titles of the chapters, sections, and paragraphs, and the summaries described in the chapters, sections, and paragraphs from the data of the learning procedure manual.
  • the document structure data generation unit 5 links the titles of the chapters, sections, and paragraphs of the document structure data with the corresponding summaries.
  • for example, the summary corresponding to chapter n describes that if the OS is Windows (registered trademark), the processing described in chapter n, section 1 must be performed, and if the OS is Mac, the processing described in chapter n, section 2 must be performed.
  • the document structure data generation unit 5 supplies the document structure data and the summary data corresponding to the chapters, sections, and paragraphs of the document structure data to the learning data generation unit 6.
  • the learning data generation unit 6 combines the text of the learning procedure manual to generate learning text.
  • the learning data generation unit 6 acquires the document structure data generated by the document structure data generation unit 5 and summaries corresponding to the chapters, sections, and paragraphs of the document structure data.
  • the learning data generation unit 6 sequentially references the nodes connected by edges (branches) from the root node side of the acquired document structure data.
  • when the learning data generation unit 6 acquires a node of a specific chapter, section, or paragraph of the document structure data (S), it combines the text of the learning procedure manual indicated by the node with the learning text (T). For example, the learning data generation unit 6 acquires a first node included in the nth chapter of the document structure data, and combines the text of the learning procedure manual pointed to by the first node with the learning text. Next, the learning data generation unit 6 acquires a second node connected to an edge branched off from the first node, and combines the text of the learning procedure manual pointed to by the second node with the learning text.
  • continuing this process, the learning data generation unit 6 acquires the nth node connected to the edge branching off from the (n-1)th node, and merges the text of the learning procedure manual pointed to by the nth node into the learning text.
  • the learning data generation unit 6 then sequentially acquires the 1st node to the mth node of the (n+1)th chapter of the document structure data, and sequentially merges the text of the learning procedure manual pointed to by the 1st node to the mth node of the (n+1)th chapter into the learning text.
  • the learning data generation unit 6 also acquires nodes from the document structure data according to the context conditions described in the summary corresponding to the chapter, section, and paragraph of the document structure data, and combines the text of the learning procedure manual indicated by the node with the learning text.
  • the learning data generation unit 6 acquires the context conditions described in the summary corresponding to the nth chapter of the document structure data (n).
  • the context conditions may include, for example, a first condition (e.g., the OS is iOS (registered trademark)) and a second condition (e.g., the OS is Android (registered trademark)).
  • when the first condition described in the context conditions is satisfied, the learning data generation unit 6 acquires a first condition node included in the nth chapter of the document structure data (n) and combines the text of the learning procedure manual indicated by the first condition node (e.g., an operation manual for iOS (registered trademark)) with the learning text (n).
  • when the second condition described in the context conditions is satisfied, the learning data generation unit 6 acquires a second condition node included in the nth chapter of the document structure data (n) and combines the text of the learning procedure manual indicated by the second condition node (e.g., an operation manual for Android (registered trademark)) with the learning text (n).
  • the learning data generation unit 6 references the nodes connected by edges in order from the root node of the document structure data, and if it is unable to obtain a node for a specific chapter, section, or paragraph, it determines that all of the text in the learning procedure manual has been combined into the learning text (T) and outputs the learning text (T) to the deep learning natural language processing unit 7 (pre-learning processing unit 8).
  • the deep learning type natural language processing unit 7 performs deep learning in advance using the learning text and constructs a trained model.
  • the deep learning type natural language processing unit 7 uses the constructed trained model to evaluate the context of the text to be evaluated and outputs the evaluation result.
  • the deep learning type natural language processing unit 7 includes a pre-learning processing unit 8, an evaluation target text evaluation unit 9, and an evaluation result output unit 10.
  • the pre-learning processing unit 8 acquires the learning text output by the learning data generation unit 6 and, by deep learning using the learning text, constructs in advance a trained model (trained evaluation model) that converts an input evaluation target text into text organized in the same structure as the learning text and evaluates the context of the text in order along the document structure of the converted text.
  • the pre-learning processing unit has a trained model that evaluates the context of the input text data based on a known algorithm that evaluates the context of natural language, and may update the trained model by adding a trained model that converts the structure of the text data to be input to the trained model.
  • the pre-learning processing unit 8 may update the trained model every time the training data generation unit 6 outputs training text, or may periodically update the trained model using multiple training texts.
  • the deep learning method in the pre-learning processing unit 8 is not limited, and various deep learning methods such as neural networks can be adopted.
  • the pre-learning processing unit 8 supplies the trained model to the evaluation target text evaluation unit 9.
  • the evaluation target text input unit 11 inputs the evaluation target text acquired from outside to the evaluation target text evaluation unit 9.
  • the evaluation target text is, for example, text such as an operation manual.
  • the evaluation target text evaluation unit 9 evaluates the evaluation target text input from the evaluation target text input unit 11 based on the trained model supplied from the pre-learning processing unit 8.
  • the evaluation result output unit 10 may include, for example, an image output device such as a monitor, and may be configured to output the evaluation results to an image output device.
  • the evaluation result output unit 10 acquires the evaluation results output from the evaluation target text evaluation unit 9, and outputs them to a monitor or the like. Note that the evaluation result output unit 10 may be configured to output the evaluation results not only as images, but also as audio, etc.
  • FIG. 3 is a flowchart illustrating an example of a learning data generation process of the procedure manual analysis unit of the natural language processing apparatus according to an embodiment.
  • FIG. 4 is a diagram for explaining an example of a natural language processing method of the natural language processing apparatus according to an embodiment.
  • the document structure data generation unit 5 of the procedure manual analysis unit 4 acquires the text of the learning procedure manual from the learning procedure manual reading unit 3 (step 21).
  • the document structure data generation unit 5 acquires the table of contents data contained in the text of the learning procedure manual (step 22).
  • the document structure data generation unit 5 generates document structure data (S) showing a tree structure corresponding to the table of contents data (step 23).
  • the learning data generation unit 6 of the procedure manual analysis unit 4 acquires the document structure data (S) and summaries corresponding to the chapters, sections, and paragraphs of the document structure data (S), and generates learning text (T) for studying the text of the learning procedure manual (step 24).
  • the learning data generation unit 6 sequentially references the nodes connected by edges starting from the root node of the document structure data (S) and determines whether a node for a specific chapter, section, or paragraph of the document structure data (S) has been obtained (step 25).
  • the learning data generation unit 6 determines that it has acquired a node for a specific chapter, section, or paragraph of the document structure data (S) (step 25, node present), it acquires the text of the learning procedure manual pointed to by that node (step 26).
  • the learning data generation unit 6 combines the text of the learning procedure manual acquired in step 26 into the learning text (T) (step 27).
  • the learning data generation unit 6 acquires all nodes for specific chapters, sections, and paragraphs in the document structure data (S), and if it determines that there are no nodes (step 25, no nodes), it outputs the combined learning text to the deep learning natural language processing unit 7 (step 28) and ends the process.
  • the deep learning type natural language processing unit 7 includes a pre-learning processing module (pre-learning processing unit 8) and an inference module (evaluation target text evaluation unit 9).
  • the pre-learning processing module performs deep learning using the learning text and generates a trained evaluation model for evaluating the text.
  • the inference module uses the trained evaluation model generated by the pre-learning processing module to evaluate the evaluation target text.
  • document structure data is generated from the table of contents data of the learning procedure manual, pre-learning is performed using learning text generated from the document structure data and the context conditions described in the summaries linked to the chapters, sections, and paragraphs included in the table of contents data, a trained evaluation model is generated, and the text of the learning procedure manual is evaluated; by learning sentences whose contexts are in a parallel relationship as parallel rather than as continuing (serial) text, the context can be grasped accurately.
  • the program according to this embodiment may be transferred in a state where it is stored in an electronic device, or in a state where it is not stored in an electronic device. In the latter case, the program may be transferred via a network, or in a state where it is stored in a storage medium.
  • the storage medium is a non-transitory tangible medium.
  • the storage medium is a computer-readable medium.
  • the storage medium may be in any form, such as a CD-ROM or memory card, as long as it is capable of storing a program and is computer-readable.
  • the present invention is not limited to the above-described embodiments, and can be modified in various ways during implementation without departing from the gist of the invention.
  • the embodiments may also be implemented in appropriate combination, in which case the combined effects can be obtained.
  • the above-described embodiments include various inventions, and various inventions can be extracted by combinations selected from the multiple constituent elements disclosed. For example, if the problem can be solved and an effect can be obtained even if some constituent elements are deleted from all the constituent elements shown in the embodiments, the configuration from which these constituent elements are deleted can be extracted as an invention.
  • Reference Signs List: 1: Natural language processing device; 2: Learning procedure manual data storage unit; 3: Learning procedure manual reading unit; 4: Procedure manual analysis unit; 5: Document structure data generation unit; 6: Learning data generation unit; 7: Deep learning type natural language processing unit; 8: Pre-learning processing unit; 9: Evaluation target text evaluation unit; 10: Evaluation result output unit; 11: Evaluation target text input unit

Abstract

A natural language processing device according to an embodiment comprises: a document structure data generation unit (5) which generates document structure data that represents a tree structure corresponding to table-of-contents data by acquiring the table-of-contents data about a training procedure document; a training data generation unit (6) which refers to nodes connected sequentially from a root node side of the document structure data through edges, and combines, to a training text, a text of the training procedure document indicated by the nodes when the nodes have been acquired; a pre-training processing unit (8) which constructs a trained evaluation model through deep learning that has used the training text; a text-to-be-evaluated input unit (11) which inputs text to be evaluated; and a text-to-be-evaluated evaluation unit (9) which uses the trained evaluation model to evaluate the input text to be evaluated.

Description

Natural language processing device, natural language processing method, and computer program
The present invention relates to a natural language processing device, a natural language processing method, and a computer program.
Manuals and procedure documents that summarize equipment operation, work flows, and how tasks proceed often contain not only passages written in series but also passages written in parallel. When users refer to such manuals or procedure documents, they have had to keep the document structure (table of contents) in mind as they work.
For example, if such manuals could be machine-learned and their context correctly evaluated according to the document structure, users would be spared the trouble of reading the manuals closely, and their equipment operation and work could be supported.
In conventional natural language processing learning methods, when text such as an operation manual is learned, the context is grasped by learning the relationships between adjacent sentences from the beginning toward the end.
For example, Non-Patent Document 1 describes a domain-specific natural language processing and construction framework developed based on a pre-training method for natural language processing and a general-purpose language model.
However, even when the context of text such as an operation manual is a parallel relationship, the context has been learned as a serial relationship, and the context of sentences in a parallel relationship has been grasped incorrectly.
The present invention was made in consideration of the above circumstances, and aims to provide a natural language processing device, a natural language processing method, and a computer program that accurately grasp context.
The natural language processing device according to the first aspect of the present invention includes a document structure data generation unit that acquires table of contents data of a learning procedure manual and generates document structure data showing a tree structure corresponding to the table of contents data, a learning data generation unit that references nodes connected by edges in order from the root node side of the document structure data and, when the node is acquired, combines the text of the learning procedure manual pointed to by the node with learning text, a pre-learning processing unit that constructs a trained evaluation model by deep learning using the learning text, an evaluation target text input unit that inputs evaluation target text, and an evaluation target text evaluation unit that evaluates the input evaluation target text using the trained evaluation model.
The natural language processing method according to the second aspect of the present invention includes: acquiring table of contents data of a learning procedure manual and generating document structure data showing a tree structure corresponding to the table of contents data; referencing nodes connected by edges in order from the root node side of the document structure data and, when the node is acquired, combining the text of the learning procedure manual pointed to by the node with a learning text; constructing a trained evaluation model by deep learning using the learning text; inputting a text to be evaluated; and evaluating the input text to be evaluated using the trained evaluation model.
A computer program according to a third aspect of the present invention causes a computer to execute the natural language processing method according to the second aspect.
According to the present invention, it is possible to provide a natural language processing device, a natural language processing method, and a computer program that accurately grasp context.
FIG. 1 is a block diagram illustrating an example of a configuration of a natural language processing apparatus according to an embodiment.
FIG. 2 is a diagram showing an example of document structure data generated by a document structure data generating unit of the natural language processing apparatus according to an embodiment.
FIG. 3 is a flowchart illustrating an example of a learning data generation process of the procedure manual analysis unit of the natural language processing apparatus according to an embodiment.
FIG. 4 is a diagram illustrating an example of a natural language processing method of the natural language processing apparatus according to an embodiment.
Below, a natural language processing device according to an embodiment of the present invention will be described with reference to the drawings. Note that in the following embodiments, parts with the same reference numbers perform similar operations, and redundant explanations will be omitted.
FIG. 1 is a block diagram illustrating an example of the configuration of a natural language processing apparatus 1 according to an embodiment.
The natural language processing device 1 of this embodiment is a device that learns using document structure data generated from table of contents data of a learning procedure manual and learning text generated for each of the document structure data, and includes at least one processor and a memory in which a program executed by the processor is recorded, and can realize various functions described below by software or a combination of software and hardware.
The natural language processing device 1 includes a learning procedure manual data storage unit 2, a learning procedure manual reading unit 3, a procedure manual analysis unit 4, a document structure data generation unit 5, a learning data generation unit 6, a deep learning type natural language processing unit 7, a pre-learning processing unit 8, an evaluation target text evaluation unit 9, an evaluation result output unit 10, and an evaluation target text input unit 11.
The learning procedure manual data storage unit 2 stores data on the learning procedure manual to be learned in advance. The learning procedure manual data storage unit 2 is, for example, a memory, and acquires and stores the learning procedure manual from the outside. The learning procedure manual data storage unit 2 outputs the stored learning procedure manual to the learning procedure manual reading unit 3 as necessary. The learning procedure manual is, for example, an operation manual, and has a table of contents including, for example, the titles and summaries of chapters, sections, and paragraphs.
The learning procedure manual reading unit 3 acquires learning procedure manual data from the learning procedure manual data storage unit 2. The learning procedure manual reading unit 3 transmits the acquired learning procedure manual data in response to a request from the procedure manual analysis unit 4. The learning procedure manual reading unit 3 may also periodically acquire learning procedure manual data from the learning procedure manual data storage unit 2 and transmit it to the procedure manual analysis unit 4.
Note that the learning procedure manual data storage unit 2 and the learning procedure manual reading unit 3 do not necessarily need to be included in the natural language processing device 1, and may be configured to be connected to the natural language processing device 1 from the outside.
FIG. 2 is a diagram showing an example of document structure data generated by the document structure data generating unit 5 of the natural language processing apparatus 1 according to an embodiment.
The procedure manual analysis unit 4 includes a document structure data generation unit 5 and a learning data generation unit 6 .
The document structure data generation unit 5 generates document structure data showing a tree structure corresponding to the table of contents data included in the learning procedure manual data acquired from the learning procedure manual reading unit 3. For example, in the document structure data shown in FIG. 2, the root node is "table of contents," and the nodes "Chapter 1," "Chapter 2," and "Chapter 3" branch off from the root node by edges (branches). From each of the nodes "Chapter 1," "Chapter 2," and "Chapter 3," nodes for the sections included in each chapter branch off by edges (branches). The table of contents data includes, for example, the text described in the chapters, sections, and paragraphs.
For example, when generating document structure data, the document structure data generation unit 5 divides the learning procedure manual according to the number of chapters. As an example, if there are three chapters, the document structure data generation unit 5 divides the learning procedure manual into three and generates the document structure data. The method of dividing the document structure data is not limited to the above; it may be divided, for example, by the number of sections or the number of paragraphs.
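As a purely illustrative sketch of this step (not the patent's implementation), the following Python code builds such tree-structured document structure data from a flat list of table-of-contents entries. The `Node` class, the `(level, title, summary, body)` tuple format, and the level convention (1 = chapter, 2 = section, 3 = paragraph) are assumptions introduced here for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One node of the document structure data (a chapter, section, or paragraph)."""
    title: str
    summary: str = ""          # summary text linked to this heading, if any
    body: str = ""             # text of the learning procedure manual this node points to
    level: int = 0             # 0 = root ("table of contents"), 1 = chapter, 2 = section, 3 = paragraph
    children: list["Node"] = field(default_factory=list)

def build_document_structure(toc_entries):
    """Build tree-structured document structure data from table-of-contents entries.

    `toc_entries` is assumed to be a list of (level, title, summary, body) tuples
    in reading order, e.g. (1, "Chapter 1", "...", "...").
    """
    root = Node(title="table of contents")
    stack = [root]                       # stack[-1] is the most recently seen node at each depth
    for level, title, summary, body in toc_entries:
        node = Node(title=title, summary=summary, body=body, level=level)
        while stack[-1].level >= level:  # pop until the parent of this node is on top
            stack.pop()
        stack[-1].children.append(node)  # attach the node as a branch (edge) of its parent
        stack.append(node)
    return root
```

For the three-chapter manual of FIG. 2, this yields a root node "table of contents" with three chapter nodes, each of which branches into the nodes of the sections it contains.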
The document structure data generation unit 5 obtains the titles of the chapters, sections, and paragraphs, and the summaries described in the chapters, sections, and paragraphs, from the learning procedure manual data. The document structure data generation unit 5 links the titles of the chapters, sections, and paragraphs of the document structure data with the corresponding summaries. Note that this embodiment assumes that context conditions are described in the summaries corresponding to the chapters, sections, and paragraphs of the learning procedure manual. For example, the summary corresponding to chapter n describes that if the OS is Windows (registered trademark), the processing described in chapter n, section 1 must be performed, and if the OS is Mac, the processing described in chapter n, section 2 must be performed. The document structure data generation unit 5 supplies the document structure data and the summary data corresponding to the chapters, sections, and paragraphs of the document structure data to the learning data generation unit 6.
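The embodiment only assumes that such a context condition is written in the summary; it does not specify how the condition is recognized. The snippet below is one deliberately simplified, hypothetical way to extract that information, keyed to the English phrasing of the Windows/Mac example above; the regular expression and the returned mapping are illustrative assumptions, not part of the disclosure.

```python
import re

# Hypothetical pattern for summaries phrased like the example above:
# "... if the OS is Windows ... section 1 ... if the OS is Mac ... section 2 ...".
CONDITION_PATTERN = re.compile(
    r"if the OS is (?P<condition>\w+).*?section (?P<section>\d+)",
    re.IGNORECASE | re.DOTALL,
)

def extract_context_conditions(summary: str) -> dict:
    """Map each condition named in a chapter summary to the section that applies under it."""
    return {m.group("condition"): int(m.group("section"))
            for m in CONDITION_PATTERN.finditer(summary)}

# extract_context_conditions("If the OS is Windows, perform chapter n, section 1; "
#                            "if the OS is Mac, perform chapter n, section 2.")
# -> {"Windows": 1, "Mac": 2}
```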
The learning data generation unit 6 combines the text of the learning procedure manual to generate learning text. The learning data generation unit 6 acquires the document structure data generated by the document structure data generation unit 5 and the summaries corresponding to the chapters, sections, and paragraphs of the document structure data. The learning data generation unit 6 sequentially references the nodes connected by edges (branches) from the root node side of the acquired document structure data.
When the learning data generating unit 6 acquires a node of a specific chapter, section, or paragraph of the document structure data (S), it combines the text of the learning procedure manual indicated by the node with the learning text (T).
For example, the learning data generation unit 6 acquires a first node included in the nth chapter of the document structure data, and combines the text of the learning procedure manual pointed to by the first node with the learning text. Next, the learning data generation unit 6 acquires a second node connected to an edge branched off from the first node, and combines the text of the learning procedure manual pointed to by the second node with the learning text.
Continuing this process, the learning data generation unit 6 acquires the nth node connected to the edge branching off from the (n-1)th node, and merges the text of the learning procedure manual pointed to by the nth node into the learning text. The learning data generation unit 6 then sequentially acquires the 1st node to the mth node of the (n+1)th chapter of the document structure data, and sequentially merges the text of the learning procedure manual pointed to by the 1st node to the mth node of the (n+1)th chapter into the learning text.
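In code, this reference order corresponds to a pre-order (depth-first) walk of the document structure data. The sketch below reuses the hypothetical `Node` class from the earlier snippet and shows one possible realization of the combining step; it is illustrative only.

```python
def build_learning_text(root):
    """Combine the manual text pointed to by each node into one learning text,
    referring to nodes edge by edge from the root node side (pre-order)."""
    parts = []
    stack = list(reversed(root.children))      # nodes branching directly off the root
    while stack:
        node = stack.pop()
        if node.body:                          # the node points at manual text
            parts.append(node.body)            # combine it into the learning text
        stack.extend(reversed(node.children))  # then follow the edges below this node
    return "\n".join(parts)
```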
The learning data generation unit 6 also acquires a node from the document structure data according to the context conditions described in the summaries corresponding to the chapters, sections, and paragraphs of the document structure data, and combines the text of the learning procedure manual indicated by that node with the learning text.
The learning data generation unit 6 acquires the context conditions described in the summary corresponding to the nth chapter of the document structure data (n). The context conditions are assumed to describe, for example, a first condition (e.g., the OS is iOS (registered trademark)) and a second condition (e.g., the OS is Android (registered trademark)).
When the first condition described in the context conditions is satisfied, the learning data generation unit 6 acquires a first condition node included in the nth chapter of the document structure data (n) and combines the text of the learning procedure manual indicated by the first condition node (e.g., an operation manual for iOS (registered trademark)) with the learning text (n). When the second condition described in the context conditions is satisfied, the learning data generation unit 6 acquires a second condition node included in the nth chapter of the document structure data (n) and combines the text of the learning procedure manual indicated by the second condition node (e.g., an operation manual for Android (registered trademark)) with the learning text (n).
This prevents texts of the learning procedure manual that describe processing for satisfying different context conditions from being combined in series.
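A hedged sketch of that branch selection follows, reusing the hypothetical `Node` and `extract_context_conditions` helpers above. The key point of the paragraph is that one learning text is built per context condition instead of joining every branch in series; the convention that the section number indexes the chapter's children is an assumption made only for this example.

```python
def build_learning_text_from(node):
    """Pre-order combination starting at an arbitrary node (including its own text)."""
    parts = [node.body] if node.body else []
    for child in node.children:
        child_text = build_learning_text_from(child)
        if child_text:
            parts.append(child_text)
    return "\n".join(parts)

def build_conditional_learning_texts(chapter):
    """Return one learning text per context condition found in the chapter's summary,
    so branches valid under different conditions are never concatenated in series."""
    conditions = extract_context_conditions(chapter.summary)
    if not conditions:
        return {"default": build_learning_text_from(chapter)}
    texts = {}
    for condition, section_no in conditions.items():
        branch = chapter.children[section_no - 1]   # assumed: nth listed section = nth child
        pieces = [t for t in (chapter.body, build_learning_text_from(branch)) if t]
        texts[condition] = "\n".join(pieces)
    return texts
```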
The learning data generation unit 6 references the nodes connected by edges in order from the root node side of the document structure data, and if it cannot acquire a node for a specific chapter, section, or paragraph, it determines that all of the text of the learning procedure manual has been combined into the learning text (T) and outputs the learning text (T) to the deep learning type natural language processing unit 7 (pre-learning processing unit 8).
The deep learning type natural language processing unit 7 performs deep learning in advance using the learning text and constructs a trained model. The deep learning type natural language processing unit 7 uses the constructed trained model to evaluate the context of the evaluation target text and outputs the evaluation result.
The deep learning type natural language processing unit 7 includes a pre-learning processing unit 8, an evaluation target text evaluation unit 9, and an evaluation result output unit 10.
The pre-learning processing unit 8 acquires the learning text output by the learning data generation unit 6 and, by deep learning using the learning text, constructs in advance a trained model (trained evaluation model) that converts an input evaluation target text into text organized in the same structure as the learning text and evaluates the context of the text in order along the document structure of the converted text. Here, the pre-learning processing unit may have a trained model that evaluates the context of input text data based on a known algorithm for evaluating the context of natural language, and may update that trained model by adding a trained model that converts the structure of the text data input to it. The pre-learning processing unit 8 may update the trained model every time the learning data generation unit 6 outputs a learning text, or may periodically update the trained model using multiple learning texts. Note that the deep learning method used in the pre-learning processing unit 8 is not limited, and various deep learning methods such as neural networks can be adopted. The pre-learning processing unit 8 supplies the trained model to the evaluation target text evaluation unit 9.
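The disclosure does not fix a particular deep learning method or library. Purely as one plausible instantiation, the sketch below continues pre-training a BERT-style masked language model on the combined learning texts using the Hugging Face transformers and datasets libraries; the base model, hyperparameters, and masked-language-model objective are assumptions made for this example, not part of the patent.

```python
from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

def pretrain_evaluation_model(learning_texts, base_model="bert-base-multilingual-cased"):
    """Continue pre-training a masked language model on the learning texts."""
    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForMaskedLM.from_pretrained(base_model)

    dataset = Dataset.from_dict({"text": learning_texts})
    dataset = dataset.map(
        lambda example: tokenizer(example["text"], truncation=True, max_length=512),
        remove_columns=["text"],
    )

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="trained-evaluation-model",
                               num_train_epochs=1,
                               per_device_train_batch_size=8),
        train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                                      mlm_probability=0.15),
    )
    trainer.train()
    return tokenizer, model
```

Whether this function is called every time a new learning text arrives or periodically over a batch of learning texts only changes the update schedule described in the paragraph above.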
The evaluation target text input unit 11 inputs evaluation target text acquired from outside to the evaluation target text evaluation unit 9. The evaluation target text is, for example, text such as an operation manual.
The evaluation target text evaluation unit 9 evaluates the evaluation target text input from the evaluation target text input unit 11 based on the trained model supplied from the pre-learning processing unit 8.
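As one hedged example of what this evaluation could compute with the model sketched above, the snippet below scores an evaluation target text by its masked-token loss under the trained model (a lower loss meaning the text is more consistent with the learned manuals). The scoring criterion is an illustrative assumption; the patent leaves the concrete form of the evaluation result open.

```python
import torch
from transformers import DataCollatorForLanguageModeling

def evaluate_target_text(tokenizer, model, target_text):
    """Return a masked-token loss for the evaluation target text (lower is better).
    The score is stochastic because tokens are masked at random."""
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
    encoding = tokenizer(target_text, truncation=True, max_length=512)
    batch = collator([encoding])      # randomly masks tokens and builds the labels
    with torch.no_grad():
        outputs = model(**batch)
    return outputs.loss.item()
```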
The evaluation result output unit 10 may include, for example, an image output device such as a monitor, and may be configured to output the evaluation results to the image output device. The evaluation result output unit 10 acquires the evaluation results output from the evaluation target text evaluation unit 9 and outputs them to a monitor or the like. Note that the evaluation result output unit 10 may be configured to output the evaluation results not only as images but also as audio or the like.
Below, an example of a procedure in which the learning data generation unit 6 of the natural language processing device 1 according to this embodiment combines the text of the learning procedure manual into the learning text using the document structure data generated from the table of contents data of the learning procedure manual will be described. Note that the contents of the processing in the following description of the operation are merely an example, and various processes that can achieve the same effect can be used as appropriate.
FIG. 3 is a flowchart illustrating an example of a learning data generation process of the procedure manual analysis unit of the natural language processing apparatus according to an embodiment.
FIG. 4 is a diagram for explaining an example of a natural language processing method of the natural language processing apparatus according to an embodiment.
The document structure data generation unit 5 of the procedure manual analysis unit 4 acquires the text of the learning procedure manual from the learning procedure manual reading unit 3 (step 21). The document structure data generation unit 5 acquires the table of contents data contained in the text of the learning procedure manual (step 22). The document structure data generation unit 5 generates document structure data (S) showing a tree structure corresponding to the table of contents data (step 23).
The learning data generation unit 6 of the procedure manual analysis unit 4 acquires the document structure data (S) and the summaries corresponding to the chapters, sections, and paragraphs of the document structure data (S), and generates learning text (T) for learning the text of the learning procedure manual (step 24).
The learning data generation unit 6 sequentially references the nodes connected by edges in order from the root node side of the document structure data (S) and determines whether a node for a specific chapter, section, or paragraph of the document structure data (S) has been acquired (step 25).
When the learning data generation unit 6 determines that it has acquired a node for a specific chapter, section, or paragraph of the document structure data (S) (step 25, node present), it acquires the text of the learning procedure manual pointed to by that node (step 26).
The learning data generation unit 6 combines the text of the learning procedure manual acquired in step 26 into the learning text (T) (step 27).
When the learning data generation unit 6 has acquired all the nodes for the specific chapters, sections, and paragraphs of the document structure data (S) and determines that there are no more nodes (step 25, no node), it outputs the combined learning text to the deep learning type natural language processing unit 7 (step 28) and ends the process.
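Putting the earlier sketches together, the loop of steps 21 to 28 can be expressed roughly as follows; every function name comes from the hypothetical snippets above, not from the patent.

```python
def generate_learning_texts(toc_entries):
    """Steps 21-28: from the learning procedure manual's table of contents to learning texts."""
    root = build_document_structure(toc_entries)   # steps 21-23: document structure data (S)
    learning_texts = []                            # step 24: learning text, one per branch
    for chapter in root.children:                  # step 25: refer to nodes from the root side
        # steps 26-27: combine the manual text each node points to, per context condition
        learning_texts.extend(build_conditional_learning_texts(chapter).values())
    return learning_texts                          # step 28: hand the result over for pre-training

# Usage sketch:
# texts = generate_learning_texts(toc_entries)
# tokenizer, model = pretrain_evaluation_model(texts)
# score = evaluate_target_text(tokenizer, model, "text of the manual to be evaluated ...")
```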
The deep learning type natural language processing unit 7 includes a pre-learning processing module (pre-learning processing unit 8) and an inference module (evaluation target text evaluation unit 9). The pre-learning processing module performs deep learning using the learning text and generates a trained evaluation model for evaluating text. The inference module uses the trained evaluation model generated by the pre-learning processing module to evaluate the evaluation target text.
According to the natural language processing device 1 of this embodiment, document structure data is generated from the table of contents data of the learning procedure manual, pre-learning is performed using learning text generated using the document structure data and the context conditions described in the summaries linked to the chapters, sections, and paragraphs included in the table of contents data, a trained evaluation model is generated, and the text of the learning procedure manual is evaluated. By learning sentences whose contexts are in a parallel relationship as parallel rather than as continuing (serial) text, the context can be grasped accurately.
The program according to this embodiment may be transferred in a state where it is stored in an electronic device, or in a state where it is not stored in an electronic device. In the latter case, the program may be transferred via a network, or in a state where it is stored in a storage medium. The storage medium is a non-transitory tangible medium. The storage medium is a computer-readable medium. The storage medium may be in any form, such as a CD-ROM or a memory card, as long as it is capable of storing a program and is computer-readable.
The present invention is not limited to the above-described embodiments, and can be modified in various ways during implementation without departing from the gist of the invention. The embodiments may also be implemented in appropriate combination, in which case the combined effects can be obtained. Furthermore, the above-described embodiments include various inventions, and various inventions can be extracted by combinations selected from the multiple constituent elements disclosed. For example, if the problem can be solved and an effect can be obtained even if some constituent elements are deleted from all the constituent elements shown in the embodiments, the configuration from which these constituent elements are deleted can be extracted as an invention.
Reference Signs List
1: Natural language processing device
2: Learning procedure manual data storage unit
3: Learning procedure manual reading unit
4: Procedure manual analysis unit
5: Document structure data generation unit
6: Learning data generation unit
7: Deep learning type natural language processing unit
8: Pre-learning processing unit
9: Evaluation target text evaluation unit
10: Evaluation result output unit
11: Evaluation target text input unit

Claims (5)

  1.  A natural language processing device comprising:
     a document structure data generation unit that acquires a learning procedure manual including table of contents data and generates document structure data showing a tree structure corresponding to the table of contents data;
     a learning data generation unit that references nodes connected by edges in order from a root node side of the document structure data and, when the node is acquired, combines the text of the learning procedure manual pointed to by the node with a learning text;
     a pre-learning processing unit that constructs a trained evaluation model by deep learning using the learning text;
     an evaluation target text input unit that inputs evaluation target text; and
     an evaluation target text evaluation unit that evaluates the input evaluation target text using the trained evaluation model.
  2.  The natural language processing device according to claim 1, wherein
     the learning data generation unit references the nodes connected by edges in order from the root node of the document structure data and, if the node is not acquired, outputs the learning text.
  3.  The natural language processing device according to claim 1, wherein
     the learning data generation unit acquires the node from the document structure data in accordance with a context condition described in a summary included in the document structure data.
  4.  A natural language processing method comprising:
     acquiring a learning procedure manual including table of contents data and generating document structure data showing a tree structure corresponding to the table of contents data;
     referencing nodes connected by edges in order from a root node side of the document structure data and, when the node is acquired, combining the text of the learning procedure manual pointed to by the node with a learning text;
     constructing a trained evaluation model by deep learning using the learning text;
     inputting a text to be evaluated; and
     evaluating the input text to be evaluated using the trained evaluation model.
  5.  A computer program causing a computer to execute the method according to claim 4.
PCT/JP2022/046889 2022-12-20 Natural language processing device, natural language processing method, and computer program WO2024134768A1 (en)

Publications (1)

Publication Number: WO2024134768A1 (en)
Publication Date: 2024-06-27

