JP2006330935A

JP2006330935A - Program, method, and system for learning data preparation

Info

Publication number: JP2006330935A
Application number: JP2005151421A
Authority: JP
Inventors: Tomoya Iwakura; 友哉岩倉
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2005-05-24
Filing date: 2005-05-24
Publication date: 2006-12-07

Abstract

PROBLEM TO BE SOLVED: To perform stacking learning by preparing stacking learning data without reducing the learning data available to the first-stage NE extractors and without increasing the load of preparing learning data. SOLUTION: A learning control section 140 divides the learning data into n parts, creates approximate NE extractors by using (n-1) parts of the learning data, and prepares the stacking learning data by using the remaining one part as test data and taking the NE extraction results of the approximate NE extractors on it. A stacking NE extractor 130 then performs stacking learning by using the stacking learning data prepared by the learning control section 140. COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to a learning data creation program, a learning data creation method, and a learning data creation device that create stacking learning data. Stacking learning data is used in stacking learning, which creates a classification program that performs classification by using, as clues, the classification results of a plurality of classification programs (classifiers) based on different machine learning algorithms. In particular, the invention relates to a learning data creation program, method, and device that can create stacking learning data without increasing the load of learning data creation.

When a classifier is created using a learner that performs machine learning, classifiers with different rules and models can be created from the same learning data by changing the machine learning algorithm used, such as SVMs, Boosting, or ADTrees (see, for example, Non-Patent Documents 1 to 3).

Because these classifiers each have different strengths, a technique called stacking has been developed that achieves better classification by using the classification results of the multiple classifiers as clues (see, for example, Non-Patent Document 4).

Non-Patent Document 1: Vladimir Vapnik, "Statistical Learning Theory", Wiley-Interscience Publication, 1998
Non-Patent Document 2: R.E. Schapire and Y. Singer, "BoosTexter: A boosting-based system for text categorization", Machine Learning, 39(2/3):135-168, May/June 2000
Non-Patent Document 3: Yoav Freund and Llew Mason, "The alternating decision tree learning algorithm", Proceedings of the Sixteenth International Conference on Machine Learning, pp. 124-133, 1999
Non-Patent Document 4: David H. Wolpert, "Stacked generalization", Proceedings of Neural Networks, pp. 241-259, 1992
Non-Patent Document 5: Takehito Utsuro, Manabu Sassano and Kiyotaka Uchimoto, "Combining Outputs of Multiple Japanese Named Entity Chunkers by Stacking", Proceedings of EMNLP 2002, pp. 281-288, 2002

Stacking is a technique that uses stacking learning data to create a classifier that performs classification by using the classification results of multiple classifiers as clues. Hereinafter, a classifier whose output is used as a clue is called a first-stage classifier, and a classifier that performs classification using the classification results of the first-stage classifiers as clues is called a second-stage classifier. To realize stacking, however, second-stage learning data (stacking learning data) must be prepared separately from the learning data used to create the first-stage classifiers, so the load of creating learning data is high.

If, to reduce this load, the learning data used to create the first-stage classifiers is reused as stacking learning data, stacking learning becomes meaningless, because some learners create classifiers that classify their own learning data almost perfectly.
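The problem described above can be illustrated with a toy example. The sketch below assumes a hypothetical memorising learner (`lookup_learner`, a name introduced here and not taken from the source) and hand-made data. It only shows that first-stage outputs on the learner's own training data reproduce the gold tags and therefore reveal nothing about the classifier's mistakes, whereas outputs on held-out data do.

```python
def lookup_learner(examples):
    """Hypothetical learner: memorises word -> tag; unseen words become 'Other'."""
    table = dict(examples)
    return lambda word: table.get(word, "Other")


def accuracy(classifier, examples):
    """Fraction of examples whose predicted tag equals the gold tag."""
    correct = sum(1 for word, tag in examples if classifier(word) == tag)
    return correct / len(examples)


# Illustrative data only (not from the source).
train = [("Tokyo", "LOCATION"), ("is", "Other"), ("Fujitsu", "ORGANIZATION")]
held_out = [("Osaka", "LOCATION"), ("is", "Other")]

classifier = lookup_learner(train)
train_accuracy = accuracy(classifier, train)        # outputs equal the gold tags
held_out_accuracy = accuracy(classifier, held_out)  # errors appear on unseen data
```

With such a learner, a second-stage classifier trained on reused data would simply learn to copy the first-stage output.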

If, instead, a given set of learning data is split into one portion for creating the first-stage classifiers and another for stacking learning, the learning data available for the first-stage classifiers shrinks and learning accuracy deteriorates. It is also unclear in what ratio the learning data should be divided between first-stage classifier creation and stacking learning.

The present invention has been made to solve the problems of the prior art described above, and its object is to provide a learning data creation program, a learning data creation method, and a learning data creation device that can create stacking learning data without increasing the load of learning data creation.

To solve the problems described above and achieve the object, the learning data creation program according to the invention of claim 1 creates stacking learning data used in stacking learning, which creates a classification program (second-stage classifier) that performs the final classification by using, as clues, the classification results of a plurality of classification programs (first-stage classifiers) based on different machine learning algorithms. The program causes a computer to execute: a learning data division procedure of dividing the learning data that the plurality of classification programs use for machine learning into a plurality of partial learning data; and a stacking learning data generation procedure of generating stacking learning data by repeating a procedure in which one of the partial learning data obtained by the learning data division procedure is used as test data, the plurality of classification programs are trained on the partial learning data other than that one, and the results of testing the trained classification programs on the test data are acquired as stacking learning data.

According to the invention of claim 1, the learning data that the plurality of classification programs use for machine learning is divided into a plurality of partial learning data; one of the partial learning data is used as test data; the plurality of classification programs are trained on the partial learning data other than that one; and the results of testing the trained classification programs are generated as stacking learning data. Because the classification programs are trained on all partial learning data except one, dividing the learning data into finer units makes it possible to create stacking learning data using classifiers that approximate the classification programs created from all of the learning data.

The learning data creation program according to the invention of claim 2 is characterized in that, in the invention of claim 1, the stacking learning data generation procedure executes the plurality of classification programs on different computers.

According to the invention of claim 2, because the plurality of classification programs are executed on different computers, the classification programs can be trained quickly and the stacking learning data can be acquired quickly.
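As a rough illustration of running the classification programs independently, the sketch below uses worker threads from Python's standard `concurrent.futures` module as a stand-in for separate computers. This is an assumption made for the example: the source does not say how the work is distributed, and the learners `majority_learner` and `lookup_learner` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def majority_learner(examples):
    """Hypothetical learner: always predicts the most frequent training tag."""
    tags = [tag for _, tag in examples]
    top = max(sorted(set(tags)), key=tags.count)
    return lambda word: top


def lookup_learner(examples):
    """Hypothetical learner: memorises word -> tag; unseen words become 'Other'."""
    table = dict(examples)
    return lambda word: table.get(word, "Other")


def train_and_test(learner, train, test):
    """Train one classification program and return its outputs on the test data."""
    classifier = learner(train)
    return [classifier(word) for word, _ in test]


def run_learners_in_parallel(learners, train, test):
    # One worker per learner stands in for "one computer per classification program".
    with ThreadPoolExecutor(max_workers=len(learners)) as pool:
        futures = [pool.submit(train_and_test, l, train, test) for l in learners]
        return [f.result() for f in futures]  # results keep the order of `learners`


train = [("Tokyo", "LOCATION"), ("is", "Other"), ("big", "Other")]
test = [("Tokyo", "LOCATION"), ("Fujitsu", "ORGANIZATION")]
outputs = run_learners_in_parallel([majority_learner, lookup_learner], train, test)
```

Because each learner only needs the training part and the test part, the jobs share no state and could equally be dispatched to separate machines.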

The learning data creation program according to the invention of claim 3 is characterized in that, in the invention of claim 1 or 2, the classification program is a named entity extraction program that extracts named entities from text.

According to the invention of claim 3, stacking learning data can be created using named entity extraction programs that approximate the named entity extraction programs created from all of the learning data.

The learning data creation program according to the invention of claim 4 is characterized in that, in the invention of claim 3, the named entity extraction program is a classification program that performs a word segmentation procedure of cutting words out of text and a named entity tagging procedure of determining, for each word cut out by the word segmentation procedure, whether it is a named entity and attaching a named entity tag accordingly.

The learning data creation method according to the invention of claim 5 creates stacking learning data used in stacking learning, which creates a classification program that performs classification by using, as clues, the classification results of a plurality of classification programs based on different machine learning algorithms. The method includes: a learning data division step of dividing the learning data that the plurality of classification programs use for machine learning into a plurality of partial learning data; and a stacking learning data generation step of generating stacking learning data by repeating a procedure in which one of the partial learning data obtained in the learning data division step is used as test data, the plurality of classification programs are trained on the partial learning data other than that one, and the results of testing the trained classification programs are acquired as stacking learning data.

According to the invention of claim 5, the learning data that the plurality of classification programs use for machine learning is divided into a plurality of partial learning data; one of the partial learning data is used as test data; the plurality of classification programs are trained on the partial learning data other than that one; and the results of testing the trained classification programs are generated as stacking learning data. Stacking learning data can therefore be created using classification programs that approximate the named entity extraction programs created from all of the learning data.

The learning data creation device according to the invention of claim 6 creates stacking learning data used in stacking learning, which creates a classification program that performs classification by using, as clues, the classification results of a plurality of classification devices based on different machine learning algorithms. The device includes: learning data division means for dividing the learning data that the plurality of classification devices use for machine learning into a plurality of partial learning data; and stacking learning data generation means for generating stacking learning data by repeating a procedure in which one of the partial learning data obtained by the learning data division means is used as test data, the plurality of classification devices are trained on the partial learning data other than that one, and the results of testing the trained classification devices are acquired as stacking learning data.

According to the invention of claim 6, the learning data that the plurality of classification devices use for machine learning is divided into a plurality of partial learning data; one of the partial learning data is used as test data; the plurality of classification devices are trained on the partial learning data other than that one; and the results of testing the trained classification devices are generated as stacking learning data. Stacking learning data can therefore be created using classification devices that approximate the named entity extraction programs created from all of the learning data.

According to the inventions of claims 1, 5, and 6, stacking learning data is created using approximate classification programs (classification devices), so stacking learning data can be created without affecting the learning data of the classification programs (classification devices).

According to the invention of claim 2, the classification programs can be trained quickly, so the creation of stacking learning data can be sped up.

According to the invention of claim 3, stacking learning data is created using approximate named entity extraction programs, so stacking learning data can be created without affecting the learning data of the named entity extraction programs.

Preferred embodiments of the learning data creation program, learning data creation method, and learning data creation device according to the present invention are described in detail below with reference to the accompanying drawings. FIGS. 1 and 2 show the algorithm and the flowchart of the stacking learning technique according to the present invention. By preparing learning data suited to the target task, the technique can be applied to various tasks, and by changing the machine learning algorithms used, a classifier can be created that performs classification using, as clues, the results of classifiers created by various machine learning algorithms.

First, the configuration of the device according to the present invention is described. FIG. 3 is a functional block diagram of the present invention. As shown in the figure, the classification device 1 has classification devices 11_1 to 11_m based on m different machine learning algorithms, a stacking learning data storage unit 12, a stacking classification device 13, a learning control unit 14, and a classification control unit 15.

Each of the classification devices 11_1 to 11_m is a processor that classifies its input. The classification devices 11_1 to 11_m also have a learning function that creates a classifier from learning data. When it is necessary to indicate specifically that they are learning as learners, they are written as classification devices (learners) 11_1 to 11_m.

When stacking learning data is created, the classification devices 11_1 to 11_m learn from the learning data with one part removed and operate as approximate classification devices.

The stacking classification device 13 is a processor with two functions. As a function of the classification device 1, it creates, from the stacking learning data, a classifier that determines the final classification result using the classification results of the classification devices 11_1 to 11_m. It also performs classification using the classifier so created.

The learning control unit 14 is a control unit that performs stacking learning using the m classification devices (learners) 11_1 to 11_m, the stacking learning data storage unit 12, and the stacking classification device 13.

The learning control unit 14 creates stacking learning data using the algorithm shown in FIG. 1 and performs stacking learning. The creation of stacking learning data by the learning control unit 14 using the algorithm of FIG. 1 is described next.

In the approximate classifier creation method of the algorithm in FIG. 1, the learning data is divided into fine units and one part is removed, which makes it possible to create a classifier with a high degree of approximation. The learning data not used when an approximate classifier is created is, for that classifier, data other than the learning data used to create the first-stage classifier, and the result of classifying it with the approximate classifier is expected to be almost the same as the result of classifying it with a classifier created from the entire learning data. Therefore, by changing in turn which part of the learning data is removed to create the approximate classifiers, stacking learning data whose clues are classification results nearly equivalent to those of classifiers created from all of the learning data can be created for as much learning data as has been prepared.

This embodiment mainly describes the case where the present invention is applied to a named entity extraction device that extracts named entities from Japanese text. FIGS. 7 and 10 show the algorithm and the flowchart obtained when the stacking learning algorithm and flowchart of FIGS. 1 and 2 are applied to named entity extraction. Here, named entity extraction means extracting items such as person names, place names, dates, and times from text. Hereinafter, a named entity is written as NE (Named Entity) and named entity extraction as NE extraction.

First, the configuration of the NE extraction device (named entity extraction device) according to this embodiment is described. FIG. 4 is a functional block diagram showing the configuration of the NE extraction device according to this embodiment. As shown in the figure, the NE extraction device 100 has NE extraction devices 110_1 to 110_m based on m different machine learning algorithms, a stacking learning data storage unit 120, a stacking NE extractor 130, a learning control unit 140, and an extraction control unit 150.

Each of the NE extraction devices 110_1 to 110_m has a function of performing NE extraction from text using a NET unit 111. Details of the NET unit 111 are described later.

The NE extraction devices 110_1 to 110_m also have a learning function: as learners, they learn from learning data to create the NE extractor (NET unit 111) that performs NE extraction. When it is necessary to indicate specifically that they are learning as learners, they are written as NE extraction devices (learners) 110_1 to 110_m.

When stacking learning data is created, the NE extraction devices 110_1 to 110_m learn from the learning data with one part removed, create approximate NE extractors, and perform extraction with them. The creation of stacking learning data and the details of the approximate NE extractors are described later.

The NE extraction devices 110_1 to 110_m extract from text the eight types of NE defined by IREX (see, for example, IREX Committee, editor. 1999. Proceedings of the IREX Workshop). FIG. 5 shows examples of the NEs defined by IREX. As shown in the figure, the NE extractors 110_1 to 110_m extract "ARTIFACT", "DATE", "LOCATION", "MONEY", "ORGANIZATION", "PERCENT", "PERSON", and "TIME" from text as named entities.

The stacking learning data storage unit 120 is a storage unit that stores stacking learning data. FIG. 6 shows an example of stacking learning data, together with the learning data for creating the NE extractors for comparison.

As shown in the figure, the learning data for creating the NE extractors associates features with tags. A feature is a clue used for NE extraction, and a tag is either one of the eight NE types or "Other", which indicates that the item is not an NE. The stacking learning data is obtained by adding the output of each NE extractor to the features and tags of the learning data used to create the NE extractors 110_1 to 110_m. In FIG. 6, "NE extractor 1" corresponds to the NE extractor 110_1 and "NE extractor 2" corresponds to the NE extractor 110_2.
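The relationship between the two kinds of learning data can be sketched as a small data transformation. The row layout, the feature name `word`, the column names `ne_extractor_1` and `ne_extractor_2`, and the sample values below are assumptions made for illustration; the actual contents of FIG. 6 are not reproduced here.

```python
def to_stacking_rows(base_rows, extractor_outputs):
    """Append each extractor's output to the features of every (features, tag) row.

    base_rows: list of (features dict, gold tag) used to create the NE extractors.
    extractor_outputs: one list of output tags per NE extractor, aligned with base_rows.
    """
    rows = []
    for i, (features, tag) in enumerate(base_rows):
        extended = dict(features)  # copy so the first-stage data is left unchanged
        for j, outputs in enumerate(extractor_outputs, start=1):
            extended[f"ne_extractor_{j}"] = outputs[i]
        rows.append((extended, tag))
    return rows


# Illustrative rows only (not the contents of FIG. 6).
base_rows = [
    ({"word": "Fujitsu"}, "ORGANIZATION"),
    ({"word": "in"}, "Other"),
    ({"word": "Tokyo"}, "LOCATION"),
]
extractor_outputs = [
    ["ORGANIZATION", "Other", "LOCATION"],  # hypothetical output of NE extractor 1
    ["PERSON", "Other", "LOCATION"],        # hypothetical output of NE extractor 2
]
stacking_rows = to_stacking_rows(base_rows, extractor_outputs)
```

The gold tag is unchanged; only the feature side grows by one column per first-stage extractor.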

The stacking NE extractor 130 is a processor that performs the final NE extraction of the NE extraction device 100 using the NE extraction results of the NE extractors 110_1 to 110_m. It has a learner function and becomes able to perform NE extraction by learning from the stacking learning data. A post-processing unit 112 is applied to the output of the NE extraction device 100. The post-processing unit 112 is described later.

The learning control unit 140 is a control unit that performs stacking learning using the m NE extractors (learners) 110_1 to 110_m, the stacking learning data storage unit 120, and the stacking NE extractor 130.

The learning control unit 140 creates stacking learning data using the algorithm shown in FIG. 7 and performs stacking learning. The creation of stacking learning data by the learning control unit 140 using the algorithm of FIG. 7 is described next.

Because the stacking learning data is used to learn from the NE extraction results of the first-stage NE extractors 110_1 to 110_m, it should preferably be data other than the learning data used to create the first-stage NE extractors 110_1 to 110_m. One option for stacking learning is therefore to create new stacking learning data in addition to the learning data used to create the first-stage NE extractors 110_1 to 110_m. With this approach, however, the cost of creating new learning data is a problem.

In this embodiment, therefore, instead of creating new learning data, approximate NE extractors are created that approximate the NE extractors obtained when all of the learning data is used. That is, by using the learning data with one part removed, this embodiment creates approximations of the NE extractors that would be learned from all of the learning data, and uses them to create the learning data for stacking learning.

In the approximate NE extractor creation method of the algorithm in FIG. 7, the learning data is divided into fine units and one part is removed, which makes it possible to create an NE extractor with a high degree of approximation. The learning data not used when an approximate NE extractor is created is, for that extractor, data other than the learning data used to create the first-stage NE extractor, and the result of NE extraction on it by the approximate NE extractor is expected to be almost the same as the result of NE extraction by an NE extractor created from the entire learning data. Therefore, by changing in turn which part of the learning data is removed to create the approximate NE extractors, stacking learning data whose clues are NE extraction results nearly equivalent to those of NE extractors created from all of the learning data can be created for as much learning data as has been prepared.

FIG. 7 shows the stacking learning algorithm according to this embodiment. As shown in the figure, in this embodiment, (1) the learning data LD is divided into n parts {LD1, LD2, ..., LDn}, and (2) m learners {ML1, ML2, ..., MLm} are prepared. The learners {ML1, ML2, ..., MLm} correspond to the m NE extractors (learners) 110_1 to 110_m shown in FIG. 4.

Then, (3) each prepared learner creates an approximate NE extractor by learning from (n-1) parts of the learning data; the remaining one part is used as test data on which each approximate NE extractor performs NE extraction; the results are added to the test data as features; and the stacking learning data ST is created. Specifically, the i-th stacking learning data STi is first initialized (STi = LDi). Then, using the learner MLj (1 ≤ j ≤ m), an approximate NE extractor CLji is created from the learning data (LD - LDi), that is, LD with LDi (1 ≤ i ≤ n) removed, and the result of NE extraction from LDi by CLji is added as a feature to the i-th stacking learning data STi. Here, the approximate NE extractor CLji is the NE extractor created by the learner MLj from the learning data (LD - LDi), in contrast to the NE extractor CLj created by the learner MLj from the learning data LD. Approximate NE extractors CLji are created for all learners MLj (1 ≤ j ≤ m), the result of NE extraction from LDi by each CLji is added as a feature to STi, and STi is added to the stacking learning data ST. STi (1 ≤ i ≤ n) is created in this way for every LDi (1 ≤ i ≤ n) and added to ST.
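Steps (1) to (3) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of FIG. 7: the two learners (`lookup_learner`, `majority_learner`), the contiguous split, the feature name `word`, and the column names `cl_1`, `cl_2` are all hypothetical, and n is assumed to be at least 2 so that LD - LDi is never empty.

```python
def split_into_parts(ld, n):
    """Step (1): divide LD into n contiguous parts whose sizes differ by at most one."""
    size, extra = divmod(len(ld), n)
    parts, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        parts.append(ld[start:end])
        start = end
    return parts


def lookup_learner(examples):
    """Hypothetical learner ML1: memorises word -> tag; unseen words become 'Other'."""
    table = {features["word"]: tag for features, tag in examples}
    return lambda features: table.get(features["word"], "Other")


def majority_learner(examples):
    """Hypothetical learner ML2: always predicts the most frequent training tag."""
    tags = [tag for _, tag in examples]
    top = max(sorted(set(tags)), key=tags.count)
    return lambda features: top


def build_stacking_data(ld, learners, n):
    """Step (3): build ST from the outputs of approximate extractors CLji on LDi."""
    parts = split_into_parts(ld, n)
    st = []
    for i, ld_i in enumerate(parts):
        st_i = [(dict(features), tag) for features, tag in ld_i]  # STi = LDi
        rest = [ex for k, part in enumerate(parts) if k != i for ex in part]  # LD - LDi
        for j, learner in enumerate(learners, start=1):
            cl_ji = learner(rest)  # approximate extractor CLji
            for (features, _), (st_features, _) in zip(ld_i, st_i):
                st_features[f"cl_{j}"] = cl_ji(features)  # add the output as a feature
        st.extend(st_i)  # add STi to ST
    return st


# Step (2): the m learners; the data below is illustrative only.
ld = [
    ({"word": "Tokyo"}, "LOCATION"),
    ({"word": "is"}, "Other"),
    ({"word": "Tokyo"}, "LOCATION"),
    ({"word": "big"}, "Other"),
]
st = build_stacking_data(ld, [lookup_learner, majority_learner], n=2)
```

Every example in LD appears exactly once in ST, and each added feature comes from an extractor that never saw that example during learning.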

Then, (4) the learners {ML1, ML2, ..., MLm} are used to create the NE extractors {CL1, CL2, ..., CLm} from the learning data LD. Here, the NE extractors {CL1, CL2, ..., CLm} correspond to the m NE extractors 110_1 to 110_m shown in FIG. 4. Finally, the stacking NE extractor 130 is trained on the stacking learning data ST.

In this way, this embodiment repeats the procedure of creating approximate NE extractors and testing them at a learning-to-test ratio of (n−1):1, and uses the test results as stacking learning data. That is, with a part LDi of the learning data as test data, the results of testing the approximate NE extractors CLji (1 ≤ j ≤ m) created from (LD − LDi) are used as stacking learning data.

Therefore, stacking learning data can be generated using only the learning data LD intended for creating the NE extractors {CL1, CL2, ..., CLm}, and without reducing the learning data LD available when those NE extractors are created. Furthermore, by making n sufficiently large, the approximate NE extractors CLji (1 ≤ j ≤ m) can be brought sufficiently close to the NE extractors {CL1, CL2, ..., CLm} created from all of the learning data LD.
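The procedure of steps (1) to (3) can be sketched as follows. This is a minimal illustration, not the patented implementation: the round-robin split, the dictionary representation of an example, and the two toy learners `memorizing_learner` and `majority_learner` are assumptions introduced here in place of SVMs, Boosting, and ADTrees.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

Example = Dict[str, str]                            # e.g. {"word": ..., "label": ...}
Extractor = Callable[[Example], str]                # CLj or CLji: example -> NE tag
Learner = Callable[[Sequence[Example]], Extractor]  # MLj: learning data -> extractor


def build_stacking_data(ld: Sequence[Example], learners: Sequence[Learner],
                        n: int) -> List[Example]:
    """Create ST by training on (n-1) parts and tagging the held-out part (n >= 2)."""
    folds = [list(ld[i::n]) for i in range(n)]      # LD1..LDn (round-robin split, an assumption)
    st: List[Example] = []
    for i, ld_i in enumerate(folds):
        # LD - LDi: every part except the held-out one
        rest = [ex for k, fold in enumerate(folds) if k != i for ex in fold]
        st_i = [dict(ex) for ex in ld_i]            # STi = LDi (copied, LD is not mutated)
        for j, ml_j in enumerate(learners, start=1):
            cl_ji = ml_j(rest)                      # approximate extractor CLji
            for ex, row in zip(ld_i, st_i):
                row[f"CL{j}"] = cl_ji(ex)           # add CLji's output as a feature
        st.extend(st_i)                             # ST = ST + STi
    return st


def memorizing_learner(data: Sequence[Example]) -> Extractor:
    """Hypothetical toy learner: most frequent label seen for each word, else "O"."""
    by_word: Dict[str, Counter] = {}
    for ex in data:
        by_word.setdefault(ex["word"], Counter())[ex["label"]] += 1
    return lambda ex: (by_word[ex["word"]].most_common(1)[0][0]
                       if ex["word"] in by_word else "O")


def majority_learner(data: Sequence[Example]) -> Extractor:
    """Hypothetical toy learner: always the most frequent label in the data."""
    top = Counter(ex["label"] for ex in data).most_common(1)[0][0]
    return lambda ex: top


LD = [
    {"word": "宮崎", "label": "PERSON"},
    {"word": "と", "label": "O"},
    {"word": "宮崎", "label": "LOCATION"},
    {"word": "に", "label": "O"},
    {"word": "行く", "label": "O"},
    {"word": "彼", "label": "O"},
]
ST = build_stacking_data(LD, [memorizing_learner, majority_learner], n=3)
```

Because each row of `ST` is tagged by extractors that never saw it during training, the rows can record disagreements with the correct label, which is what the stacking NE extractor needs to learn from. Step (4), training CL1 to CLm on the full LD and training the stacking extractor on `ST`, is left out of the sketch.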

The extraction control unit 150 is a control unit that controls the NE extraction device 100 so that it performs NE extraction using the m NE extractors 110_1 to 110_m and the stacking NE extractor 130. FIG. 8 is an explanatory diagram of stacking with a plurality of NE extractors. The figure shows a case in which four NE extractors 110_1 to 110_4 are used.

As shown in the figure, each NE extractor receives the text "宮崎／は／宮崎／の／出身" ("Miyazaki is from Miyazaki") and performs NE extraction. For example, the NE extractor 110_1 outputs "宮崎 (Other) / は (Other) / 宮崎 (LOCATION) / の (Other) / 出身 (Other)" as its NE extraction result, and the NE extractor 110_2 outputs "宮崎 (PERSON) / は (Other) / 宮崎 (Other) / の (Other) / 出身 (Other)".

The stacking NE extractor 130 then receives the NE extraction results of the individual NE extractors and outputs the NE extraction result of the NE extraction device 100 as a whole. In FIG. 8, the stacking NE extractor 130 outputs "宮崎 (PERSON) / は (Other) / 宮崎 (LOCATION) / の (Other) / 出身 (Other)" as the NE extraction result of the NE extraction device 100.
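To make the data flow of FIG. 8 concrete, the sketch below assembles the per-token inputs that a stacking extractor would see from two first-stage outputs. The function names and the feature keys `CL1`, `CL2` are hypothetical, and `toy_stacker` is only a stand-in rule (take the first tag that is not Other); the actual stacking NE extractor 130 is a learned classifier, not this rule.

```python
from typing import Dict, List, Sequence


def stack_features(tokens: Sequence[str],
                   base_outputs: Sequence[Sequence[str]]) -> List[Dict[str, str]]:
    """Attach each first-stage extractor's tag to the token as a feature."""
    rows = []
    for k, token in enumerate(tokens):
        row = {"word": token}
        for j, output in enumerate(base_outputs, start=1):
            row[f"CL{j}"] = output[k]
        rows.append(row)
    return rows


def toy_stacker(row: Dict[str, str]) -> str:
    """Stand-in for the learned stacking extractor: first non-Other tag wins."""
    tags = [value for key, value in row.items() if key.startswith("CL")]
    return next((tag for tag in tags if tag != "Other"), "Other")


tokens = ["宮崎", "は", "宮崎", "の", "出身"]
out_110_1 = ["Other", "Other", "LOCATION", "Other", "Other"]
out_110_2 = ["PERSON", "Other", "Other", "Other", "Other"]

rows = stack_features(tokens, [out_110_1, out_110_2])
final_tags = [toy_stacker(row) for row in rows]
```

With the two outputs quoted above, this stand-in happens to reproduce the combined result shown for FIG. 8; a trained stacking extractor would instead learn from the stacking learning data which first-stage extractor to trust in which context.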

Next, the NET unit 111 and the post-processing unit 112 of the NE extractors 110_1 to 110_m are described in detail. The NE extractors 110_1 to 110_m perform NE extraction using morphological analysis results as clues. However, the word boundaries produced by morphological analysis do not necessarily coincide with NE boundaries. Therefore, in each of the NE extractors 110_1 to 110_m, the NET unit 111 performs NE tagging, which determines the NE of each word, and the post-processing unit 112 performs post-processing for cases in which word boundaries and NE boundaries do not coincide.

The NET unit 111 performs morphological analysis with a morphological analyzer and determines which NE class each word output by the analyzer belongs to. Here, the ChaSen morphological analyzer (see, for example, http://chasen.naist.jp/hiki/ChaSen/) is used. The surface form, the part of speech, and the character type are used as features.

Here, the surface form is the written form of the word returned in the ChaSen analysis result. The part of speech is also taken from the ChaSen result. However, the ChaSen settings are changed so that a word judged to be an unknown word is output as "proper noun, general" (固有名詞−一般).

The character type is built from six kinds of characters (hiragana, katakana, kanji, uppercase letters, lowercase letters, and symbols/other), their combinations, and numbers. When the same kind of character occurs two or more times in a row, "+" is appended to distinguish it. For example, "訪朝" consists of two consecutive kanji and is therefore expressed as "kanji+", and "食べる" is expressed as "kanji-hiragana+". Numbers, whether written in kanji numerals or Arabic numerals, are normalized and classified as "12 or less", "24 or less", "100 or less", "2000 or less", or "other", and the resulting class is used as the character type.
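The character-type feature can be sketched as below. This is an illustrative reading of the description, with several assumptions: the Unicode ranges used to recognize hiragana, katakana, and kanji, the English label names, the `num<=...` bucket labels, and the restriction of number handling to decimal digits (kanji numerals are not parsed here).

```python
from itertools import groupby


def _kind(ch: str) -> str:
    """Classify one character into one of the six kinds (ranges are assumptions)."""
    if "\u3041" <= ch <= "\u309f":
        return "hiragana"
    if "\u30a0" <= ch <= "\u30ff":
        return "katakana"
    if "\u4e00" <= ch <= "\u9fff":
        return "kanji"
    if ch.isascii() and ch.isupper():
        return "upper"
    if ch.isascii() and ch.islower():
        return "lower"
    return "other"


def _number_bucket(value: int) -> str:
    """Map a normalized number to one of the five classes in the text."""
    for limit in (12, 24, 100, 2000):
        if value <= limit:
            return f"num<={limit}"
    return "num-other"


def char_type(word: str) -> str:
    """Return the character-type feature of a word."""
    if word and word.isdecimal():          # ASCII or full-width decimal digits
        return _number_bucket(int(word))
    parts = []
    for kind, run in groupby(word, key=_kind):
        length = len(list(run))
        parts.append(kind + "+" if length >= 2 else kind)
    return "-".join(parts)
```

For the two examples in the text, `char_type("訪朝")` gives `kanji+` and `char_type("食べる")` gives `kanji-hiragana+`.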

When NE extraction is performed word by word, a single word may form an NE on its own, or several words may together form an NE. For this reason, NE tagging assigns to each word an NE tag that expresses both the NE class of the morpheme and its position within the NE.

For example, the sentence "<ORGANIZATION>岩倉使節団</ORGANIZATION>は訪<LOCATION>米</LOCATION>した" ("The Iwakura Mission visited the United States") is expressed as:
岩倉 "BEGIN-ORGANIZATION"
使節 "INSIDE-ORGANIZATION"
団 "END-ORGANIZATION"
は "O"
訪 "O"
米 "SINGLE-LOCATION"
し "O"
た "O"
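The position-based tagging scheme in this example can be written as a short conversion routine. The input format, a list of (words, NE class) chunks with `None` for text outside any NE, is an assumption made for the sketch.

```python
from typing import List, Optional, Sequence, Tuple

Chunk = Tuple[Sequence[str], Optional[str]]   # (words, NE class or None)


def position_tags(chunks: Sequence[Chunk]) -> List[Tuple[str, str]]:
    """Assign SINGLE/BEGIN/INSIDE/END tags to the words of each NE chunk."""
    tagged: List[Tuple[str, str]] = []
    for words, ne_class in chunks:
        if ne_class is None:                       # outside any NE
            tagged.extend((word, "O") for word in words)
        elif len(words) == 1:                      # one word forms the whole NE
            tagged.append((words[0], f"SINGLE-{ne_class}"))
        else:
            last = len(words) - 1
            for k, word in enumerate(words):
                position = "BEGIN" if k == 0 else "END" if k == last else "INSIDE"
                tagged.append((word, f"{position}-{ne_class}"))
    return tagged


sentence = [
    (["岩倉", "使節", "団"], "ORGANIZATION"),
    (["は", "訪"], None),
    (["米"], "LOCATION"),
    (["し", "た"], None),
]
tags = position_tags(sentence)
```

Here "訪" and "米" are treated as separate units, as in the example; with ordinary morphological analysis "訪米" is a single word, which is the case the post-processing step exists to handle.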

There are 33 NE tags to be discriminated in total: the 32 combinations of the eight NE classes shown in FIG. 5 with the four positions within an NE, namely single (S-), beginning (B-), middle (I-), and end (E-), plus one tag for non-NE (O).

The range of morphemes used as clues for NE tagging is five morphemes in total: the target morpheme and the two morphemes before and after it. The individual NE tags are assembled into NE classes with the Viterbi algorithm. The Viterbi algorithm takes probability values as input, but the NE extractors created from the learners used here return confidence scores. Each confidence score X is therefore normalized with the sigmoid function S(X) = 1/(1 + exp(−βX)) and used in place of a probability value. The value of β is set to 5.
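The two mechanical parts of this paragraph, the five-morpheme window and the sigmoid normalization with β = 5, can be sketched as follows. The function names and the padding symbol `<PAD>` for positions beyond the sentence boundary are assumptions; the Viterbi decoding itself is not shown.

```python
import math
from typing import List, Sequence

BETA = 5.0  # value of beta given in the text


def to_pseudo_probability(x: float, beta: float = BETA) -> float:
    """Sigmoid S(X) = 1 / (1 + exp(-beta * X)), evaluated without overflow."""
    z = beta * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)                     # z < 0, so exp(z) cannot overflow
    return ez / (1.0 + ez)


def context_window(morphemes: Sequence[str], i: int,
                   width: int = 2, pad: str = "<PAD>") -> List[str]:
    """Target morpheme plus `width` morphemes on each side (5 in total by default)."""
    return [morphemes[k] if 0 <= k < len(morphemes) else pad
            for k in range(i - width, i + width + 1)]
```

A confidence score of 0 maps to 0.5, and with β = 5 scores beyond roughly ±1 are already close to 1 or 0, so the normalized values behave like sharply separated pseudo-probabilities.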

The post-processing performed by the post-processing unit 112 cuts an NE out of a word when only part of the word is an NE. For example, when NE extraction is performed on a sentence that morphological analysis has segmented as "岩倉／使節／団／が／訪米／し／た", "岩倉使節団" (the Iwakura Mission) can be extracted as an ORGANIZATION, but "米" (the United States), which is a LOCATION, cannot be extracted from the word "訪米" (visiting the United States). Therefore, after NE tagging in units of words, NE tagging in units of characters is performed as post-processing.

FIG. 9 is a diagram illustrating an example of conversion from word-level NE tagging to character-level NE tagging. In the figure, the NE tagging result shown at the top is converted into the NE tagging shown at the bottom.

There are 17 NE tags to be discriminated in post-processing: the 16 combinations of the eight NE classes shown in FIG. 5 with two positions within an NE, beginning (B-) and middle (I-), plus one tag for non-NE (O). The features used in post-processing are the character, the part of speech and position of the word to which the character belongs, the character type, and the NE tagging result. Each feature is used with one of four labels attached that indicate where the character appears within its word: single (S-), beginning (B-), middle (I-), or end (E-). Analysis proceeds from the end of the sentence toward the beginning, and the NE tags already determined for the two preceding positions are used dynamically as features.

Next, the procedure of the stacking learning process performed by the NE extraction device 100 according to the present embodiment is described. FIG. 10 is a flowchart showing this procedure.

As shown in the figure, in the NE extraction device 100, the learning control unit 140 divides the learning data LD into n parts (step S101) and initializes the subscript i of the resulting learning data LDi to 1 (step S102).

The learning control unit 140 then determines whether i ≤ n (step S103). If i ≤ n, it initializes the i-th stacking learning data, that is, it sets STi = LDi (step S104). It also initializes the subscript j of the learner MLj to 1 (step S105).

The learning control unit 140 then determines whether j ≤ m (step S106). If j ≤ m, it uses MLj to create the approximate NE extractor CLji from the learning data (LD − LDi), that is, LD with LDi removed, and adds the result of NE extraction from LDi by CLji to the i-th stacking learning data STi, as the feature that stands for the result of NE extraction from LDi by the NE extractor CLj created from the learner MLj (step S107). It then adds 1 to j (step S108) and returns to step S106.

If j ≤ m does not hold, the creation of STi is complete, so STi is added to the stacking learning data ST, that is, ST = ST + STi (step S109). ST is stored in the stacking learning data storage unit 120. The learning control unit 140 then adds 1 to i (step S110) and returns to step S103.

If i ≤ n does not hold, the creation of ST is complete, so the learning control unit 140 uses the learners {ML1, ML2, ..., MLm} to create the NE extractors {CL1, CL2, ..., CLm} from LD (step S111). The stacking NE extractor 130 then performs stacking learning using the ST stored in the stacking learning data storage unit 120 (step S112).

In this way, the learning control unit 140 creates the approximate NE extractor CLji and adds the result of NE extraction from LDi by CLji to the i-th stacking learning data STi, as the feature that stands for the NE extraction result of the NE extractor CLj on LDi. Repeating this over the ranges 1 ≤ j ≤ m and 1 ≤ i ≤ n creates the stacking learning data ST.

FIG. 11 is a diagram illustrating an example of how the NE extraction device 100 according to the present embodiment creates stacking learning data. The figure shows an example in which the learning data LD is divided into three parts. First, the approximate NE extractor CLj1 is created from the learning data LD2 and LD3 using the learner MLj, and the result of testing CLj1 with LD1 as test data is combined with LD1 to create the stacking learning data ST1.

Similarly, the approximate NE extractor CLj2 is created from the learning data LD1 and LD3 using the learner MLj, and the result of testing CLj2 with LD2 as test data is combined with LD2 to create the stacking learning data ST2. Likewise, the approximate NE extractor CLj3 is created from the learning data LD1 and LD2 using the learner MLj, and the result of testing CLj3 with LD3 as test data is combined with LD3 to create the stacking learning data ST3.

That is, consider learning data consisting of the following three sentences:
「<PERSON>宮崎</PERSON>と<LOCATION>宮崎</LOCATION>に行く。」 ("I go to Miyazaki with Miyazaki.")
「<PERSON>宮崎</PERSON>と食事に行く。」 ("I go for a meal with Miyazaki.")
「彼と<LOCATION>宮崎</LOCATION>に行く。」 ("I go to Miyazaki with him.")
An approximate NE extractor is created by learning two of the three sentences, and its NE extraction result for the remaining sentence is used as stacking learning data. Repeating this for each sentence in turn creates stacking learning data for all of the learning data using approximate NE extractors.

Next, the evaluation results of the NE extraction device 100 according to the present embodiment are described. SVMs, Boosting, and ADTrees were used as learners in the evaluation. For SVMs, the parameter C was set to 1, a polynomial kernel of degree 2 was used, and multi-class classification was learned one-vs-rest. For Boosting, BoosTexter, which extends the binary classifier Boosting to multi-class classification, was used after being modified so that the multi-class problem is learned one-vs-rest.

For ADTrees, a modified version of the ADTrees algorithm, which builds a classification tree whose nodes are decision stumps within the boosting framework, was used. To build the classification tree, ADTrees takes each newly selected feature as a parent and adds two new search spaces: the examples in which the feature appears and the examples in which it does not. In NE extraction, however, the numbers of features and examples are large, so the search space expands rapidly as learning proceeds. Noting that the examples in which a given feature does not appear far outnumber those in which it does, the added search space was limited to the examples in which the feature appears. Along with this change, the criterion for selecting the feature that becomes a node was changed to BoosTexter's "real-valued predictions and abstaining". Multi-class classification was learned one-vs-rest, the number of boosting iterations was fixed at 1,000 for every class, and the maximum tree depth was set to 2.

SVMs, Boosting, and ADTrees were used for NE tagging, and SVMs were used for post-processing and for stacking learning. Post-processing was applied to the output of the stacking NE extractor, which corresponds to 130 in FIG. 4, and the same post-processor, created with SVMs, was used in all experiments.

The evaluation compared the learning data creation method of the NE extraction device 100 according to the present embodiment with the following three methods (a) to (c). Methods (b) and (c) are the ways of obtaining stacking learning data used in Non-Patent Document 5.

Method (a): Stacking learning data is created by treating the correct answers (tags) to be learned as the output of every NE extractor. With this method, the outputs of all NE extractors are identical to the correct answers, so it is impossible to learn what kind of NE extraction each NE extractor actually performs. FIG. 12 is a diagram illustrating an example of creating stacking learning data by method (a). As shown in the figure, this method performs stacking learning on the assumption that every NE extractor outputs the correct tags. For example, the stacking learning data for the learning data "宮崎 (PERSON) / と (Other) / 宮崎 (LOCATION) / に (Other) / 行く (Other) / 。 (Other)" is created on the assumption that the output of every NE extractor is "宮崎 (PERSON) / と (Other) / 宮崎 (LOCATION) / に (Other) / 行く (Other) / 。 (Other)".

Method (b): The results of testing the NE extractors on the same learning data that was used to train them are used for stacking. That is, NE extractors are created from the learning data, and the NE extraction results of those extractors on the same learning data are used as stacking learning data. With this method, the NE extractors perform NE extraction on data they have already learned, so in most cases they extract NEs correctly, and their output differs from the tags only where learning failed. The result may therefore be almost the same as with method (a). FIG. 13 is a diagram illustrating an example of creating stacking learning data by method (b). As shown in the figure, with this method the NE extractors output the correct tags except where learning failed. For example, in the stacking learning data for the learning data "宮崎 (PERSON) / と (Other) / 宮崎 (LOCATION) / に (Other) / 行く (Other) / 。 (Other)", the outputs of the two NE extractors are correct, whereas in the stacking learning data for "彼 (Other) / と (Other) / 宮崎 (LOCATION) / に (Other) / 行く (Other) / 。 (Other)", the outputs of the two NE extractors are incorrect.

Method (c): Learning data for the first-stage learners and learning data for stacking learning are prepared separately. With this method, each NE extractor performs NE extraction on new data that it has not learned, and the results become the stacking learning data, so it is possible to find out for which kinds of examples each NE extractor extracts NEs correctly. However, the load of creating learning data becomes a problem. It is also difficult to judge the appropriate ratio between learning data for creating the NE extractors and learning data for stacking. FIG. 14 is a diagram illustrating an example of creating stacking learning data by method (c). As shown in the figure, with this method the NE extractors are tested (perform NE extraction) on data different from the data they learned, and the test results become the stacking learning data. In the figure, "宮崎 (PERSON) / と (Other) / 食事 (Other) / に (Other) / 行く (Other) / 。 (Other)" and "彼 (Other) / と (Other) / 宮崎 (LOCATION) / に (Other) / 行く (Other) / 。 (Other)" are the learning data for the NE extractors, and "宮崎 (PERSON) / と (Other) / 宮崎 (LOCATION) / に (Other) / 行く (Other) / 。 (Other)" is used to create the stacking learning data.

The evaluation compared the extraction results on the IREX general task (GENERAL) (available from http://www.csl.sony.co.jp/person/sekine/IREX/Package/IREXfinalB.tar.gz) obtained by NE extractors created from the CRL named entity data. The CRL named entity data consists of 1,174 articles (about 11,000 sentences) from the 1995 edition of the Mainichi Shimbun, annotated with the named entities defined by IREX.

The evaluation results are compared using Recall, Precision, and F-measure (F value), calculated as follows:
Recall = NUM / (number of NEs that should be extracted)
Precision = NUM / (number of NEs extracted by the NE extraction device)
F-measure = 2 × Recall × Precision / (Recall + Precision)
Here, NUM is the number of NEs that the NE extraction device extracted correctly.
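The three measures can be computed directly from the counts defined above. The function and parameter names below are illustrative, and returning 0.0 when a denominator is zero is an assumption made so that the sketch is total; the text does not address that case.

```python
from typing import Tuple


def ne_scores(num_correct: int, num_gold: int, num_extracted: int) -> Tuple[float, float, float]:
    """Return (recall, precision, f_measure) as fractions in [0, 1].

    num_correct   -- NUM, the NEs extracted correctly
    num_gold      -- the NEs that should be extracted
    num_extracted -- the NEs the device actually extracted
    """
    recall = num_correct / num_gold if num_gold else 0.0
    precision = num_correct / num_extracted if num_extracted else 0.0
    total = recall + precision
    f_measure = 2 * recall * precision / total if total else 0.0
    return recall, precision, f_measure


recall, precision, f_measure = ne_scores(num_correct=8, num_gold=10, num_extracted=16)
```

The F values quoted in the evaluation, such as 85.34, are on a percentage scale, so the fractions returned here would be multiplied by 100 to compare with them.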

When SVMs, Boosting, and ADTrees were each evaluated alone on IREX GENERAL, the F values were 85.34 for SVMs, 82.27 for Boosting, and 83.34 for ADTrees, so SVMs gave the best result. The SVMs result was therefore used as the baseline for the evaluation.

For stacking, four combinations of first-stage NE extractors were used: SVMs-Boosting, SVMs-ADTrees, Boosting-ADTrees, and SVMs-Boosting-ADTrees. The number of divisions n was set to 5, 10, and 20.

FIG. 15 is a diagram showing the evaluation results. The figure shows the result of SVMs, the best single learner (broken line), the best results among the comparison methods (a) and (c) (broken lines), and the results of the four combinations according to the present embodiment (solid lines). The vertical axis is the F value, and the horizontal axis is the number of divisions of the learning data.

With method (a), the best of the four combinations was SVMs-Boosting-ADTrees, with an overall F value of 84.9. The best result with method (b) was the same as the result of SVMs, the best single learner.

With method (c), the CRL named entity data was divided into N parts by article; N−1 parts were used for creating the NE extractors and the remaining one part for stacking. N was set to 5, 10, and 20. Because dividing the data into N parts yields N results, this method produced (5 + 10 + 20) × 4 = 140 results in total, of which only 2 were better than SVMs, the best single learner. FIG. 15 shows the results for SVMs-Boosting-ADTrees, the best-performing combination among them, with 5, 10, and 20 divisions.

As shown in FIG. 15, among the comparison methods (a) to (c), only method (c) produced results above the baseline, and even then only 2 of its 140 results were at or above the baseline. In contrast, with the NE extraction device 100 according to the present embodiment, 6 of the 4 (combinations) × 3 (numbers of divisions) = 12 results are at or above the baseline, a better evaluation result than methods (a) to (c).

Except for the SVMs-Boosting combination, accuracy rises gradually as the number of divisions increases. For the SVMs-ADTrees and SVMs-Boosting-ADTrees combinations, accuracy rises with the number of divisions and, from 10 divisions onward, exceeds the result of SVMs, the most accurate single learner.

Thus, the stacking learning data creation method according to the present embodiment gives better results on average than methods (a) to (c), and learning accuracy can be raised by increasing the number of divisions of the learning data.

As described above, in this embodiment the learning control unit 140 divides the learning data into n parts, creates approximate NE extractors using (n−1) parts of the learning data, and creates stacking learning data from the results of NE extraction by the approximate NE extractors on the remaining one part as test data. Stacking learning data is thereby created without reducing the learning data for the NE extractors 110_1 to 110_m, and a highly accurate NE extraction device 100 can be realized.

Although this embodiment has described the NE extraction device 100, a learning data creation device that creates stacking learning data can be obtained by removing the stacking NE extractor 130 and the extraction control unit 150 from the NE extraction device 100.

Although this embodiment has described an NE extraction device, an NE extraction program with the same functions can be obtained by implementing the configuration of the NE extraction device in software. A computer that executes this NE extraction program is therefore described below. Just as a learning data creation device that creates stacking learning data can be obtained from the NE extraction device 100, a learning data creation program that creates stacking learning data can be obtained from the NE extraction program.

FIG. 16 is a functional block diagram showing the configuration of a computer that executes the NE extraction program according to the present embodiment. As shown in the figure, the computer 200 includes a RAM 210, a CPU 220, an HDD 230, a LAN interface 240, an input/output interface 250, and a DVD drive 260.

The RAM 210 is a memory that stores programs and intermediate results of program execution, and the CPU 220 is a central processing unit that reads programs from the RAM 210 and executes them.

The HDD 230 is a disk device that stores programs and data, and the LAN interface 240 is an interface for connecting the computer 200 to other computers via a LAN.

The input/output interface 250 is an interface for connecting input devices, such as a mouse and a keyboard, and a display device, and the DVD drive 260 is a device that reads and writes DVDs.

The NE extraction program 211 executed on the computer 200 is stored on a DVD, read from the DVD by the DVD drive 260, and installed on the computer 200.

Alternatively, the NE extraction program 211 is stored in, for example, a database of another computer system connected via the LAN interface 240, read from that database, and installed on the computer 200.

The installed NE extraction program 211 is stored on the HDD 230, read into the RAM 210, and executed by the CPU 220 as the NE extraction process 221.

Although this embodiment has described the case in which the NE extraction program is executed on a single computer, the present invention is not limited to this. It applies equally to the case in which the programs corresponding to the first-stage NE extractors are each executed on a separate computer.

Although this embodiment has described an NE extraction device, the present invention is not limited to this and applies equally to any classification device that uses stacking.

(Supplementary note 1) A learning data creation program for creating stacking learning data used in stacking learning, the stacking learning creating a classification program that performs classification by using, as clues, the classification results of a plurality of classification programs based on different machine learning algorithms, the learning data creation program causing a computer to execute:
a learning data division procedure of dividing the learning data that the plurality of classification programs use for machine learning into a plurality of partial learning data; and
a stacking learning data generation procedure of generating the stacking learning data by repeating a procedure in which one of the plurality of partial learning data obtained by the learning data division procedure is used as test data, the plurality of classification programs, trained by their machine learning algorithms on the partial learning data other than that one, are tested on the test data, and the test results are acquired as stacking learning data.
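The division and generation procedures of supplementary note 1 can be illustrated with a minimal Python sketch. It is not the implementation described in this publication: the two toy learners (`train_majority`, `train_lookup`), the round-robin split, and the sample data are all assumptions made for illustration. Each part is held out once as test data, the first-stage classifiers are trained on the remaining parts, and their outputs on the held-out part become stacking learning data.

```python
from collections import Counter


def split_into_parts(data, n):
    """Divide the learning data into n partial learning data (round-robin, an assumption)."""
    return [data[i::n] for i in range(n)]


def train_majority(train):
    """Hypothetical learner A: always predicts the most frequent training label."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda word: majority


def train_lookup(train):
    """Hypothetical learner B: remembers each word's label, otherwise predicts 'O'."""
    table = {word: label for word, label in train}
    return lambda word: table.get(word, "O")


def make_stacking_data(data, learners, n):
    """For each part: train on the other n-1 parts, then test on the held-out part."""
    parts = split_into_parts(data, n)
    stacking_data = []
    for i, test_part in enumerate(parts):
        train = [ex for j, part in enumerate(parts) if j != i for ex in part]
        classifiers = [learn(train) for learn in learners]
        for word, gold in test_part:
            outputs = tuple(clf(word) for clf in classifiers)
            stacking_data.append((word, outputs, gold))
    return stacking_data


data = [("Fujitsu", "ORG"), ("is", "O"), ("in", "O"),
        ("Tokyo", "LOC"), ("and", "O"), ("Fujitsu", "ORG")]
stacking_data = make_stacking_data(data, [train_majority, train_lookup], n=3)
```

Because every example is held out exactly once, the stacking learning data has as many examples as the original learning data, and none of the first-stage outputs comes from a classifier that saw that example during training.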

(Supplementary note 2) The learning data creation program according to supplementary note 1, wherein, in the stacking learning data generation procedure, the plurality of classification programs are each executed on a different computer.
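The idea of running each first-stage classification program separately can be sketched as follows. This is only an illustration under stated assumptions: worker threads from Python's standard `concurrent.futures` module stand in for the separate computers, and the two learners and the helper names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def train_and_test(learner, train, test_words):
    """Train one first-stage classifier and return its outputs on the test data."""
    classifier = learner(train)
    return [classifier(word) for word in test_words]


def learner_always_o(train):
    """Hypothetical learner: labels every word 'O'."""
    return lambda word: "O"


def learner_capitalized(train):
    """Hypothetical learner: labels capitalized words 'NE', all others 'O'."""
    return lambda word: "NE" if word[:1].isupper() else "O"


def run_first_stage(learners, train, test_words):
    """Run every first-stage classifier in its own worker and keep the learner order."""
    with ThreadPoolExecutor(max_workers=len(learners)) as pool:
        futures = [pool.submit(train_and_test, learner, train, test_words)
                   for learner in learners]
        per_classifier = [future.result() for future in futures]
    # Transpose so that each test word gets one tuple of classifier outputs.
    return list(zip(*per_classifier))


outputs = run_first_stage([learner_always_o, learner_capitalized],
                          train=[("is", "O")], test_words=["Tokyo", "is"])
```

The first-stage classifiers do not depend on one another, so their training and testing can proceed independently and only the collected outputs need to be gathered in one place.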

(Supplementary note 3) The learning data creation program according to supplementary note 1 or 2, wherein the classification program is a named entity extraction program that extracts named entities from text.

(Supplementary note 4) The learning data creation program according to supplementary note 3, wherein the named entity extraction program comprises a word segmentation procedure of segmenting words from text and a plurality of classification programs that extract named entities by determining, for each word segmented by the word segmentation procedure, whether the word is a named entity, the learning data creation program being for creating a classification program that extracts named entities by determining, for each word, whether the word is a named entity, using the output results of the plurality of classification programs as clues.

(Supplementary note 5) A stacking classification program that performs classification by using, as clues, the classification results of a plurality of classification programs based on different machine learning algorithms, the stacking classification program causing a computer to execute:
a learning data division procedure of dividing the learning data that the plurality of classification programs use for machine learning into a plurality of partial learning data;
a stacking learning data generation procedure of generating stacking learning data by repeating a procedure in which one of the plurality of partial learning data obtained by the learning data division procedure is used as test data, the plurality of classification programs, trained by their machine learning algorithms on the partial learning data other than that one, are tested on the test data, and the test results are acquired as stacking learning data; and
a stacking learning procedure of performing stacking learning using the stacking learning data generated by the stacking learning data generation procedure.
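The stacking learning procedure of supplementary note 5 can be illustrated with a deliberately simple second-stage learner. The sketch below is an assumption for illustration, not the learner used in the embodiment: it takes only the tuple of first-stage outputs as its clue and learns the most frequent correct label for each tuple. The function name `train_stacking_classifier` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def train_stacking_classifier(stacking_data):
    """Learn, for each combination of first-stage outputs, the most frequent gold label."""
    counts = defaultdict(Counter)
    for outputs, gold in stacking_data:
        counts[outputs][gold] += 1
    table = {outputs: c.most_common(1)[0][0] for outputs, c in counts.items()}

    def classify(outputs, default="O"):
        # Unseen output combinations fall back to the default label.
        return table.get(outputs, default)

    return classify


# Each example pairs the first-stage outputs with the correct label.
stacking_data = [
    (("O", "ORG"), "ORG"),
    (("O", "ORG"), "ORG"),
    (("O", "ORG"), "O"),
    (("O", "O"), "O"),
    (("ORG", "O"), "O"),
]
stacked = train_stacking_classifier(stacking_data)
```

Here `stacked(("O", "ORG"))` returns "ORG" because the second stage has learned that, when the two first-stage classifiers disagree in this way, the second one is usually right. A practical stacking classifier would typically combine these outputs with other features of the input.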

(Supplementary note 6) A learning data creation method for creating stacking learning data used in stacking learning, in which classification is performed by using, as clues, the classification results of a plurality of classification programs based on different machine learning algorithms, the learning data creation method comprising:
a learning data division step of dividing the learning data that the plurality of classification programs use for machine learning into a plurality of partial learning data; and
a stacking learning data generation step of generating the stacking learning data by repeating a procedure in which one of the plurality of partial learning data obtained by the learning data division step is used as test data, the plurality of classification programs, trained by their machine learning algorithms on the partial learning data other than that one, are tested on the test data, and the test results are acquired as stacking learning data.

(Supplementary note 7) The learning data creation method according to supplementary note 6, wherein, in the stacking learning data generation step, the plurality of classification programs are each executed on a different computer.

(Supplementary note 8) The learning data creation method according to supplementary note 6 or 7, wherein the classification program is a named entity extraction program that extracts named entities from text.

(Supplementary note 9) The learning data creation method according to supplementary note 8, wherein the named entity extraction program comprises a word segmentation step of segmenting words from text and a plurality of classification programs that extract named entities by determining, for each word segmented by the word segmentation step, whether the word is a named entity, the learning data creation method being for creating a classification program that extracts named entities by determining, for each word, whether the word is a named entity, using the output results of the plurality of classification programs as clues.

(Supplementary note 10) A stacking classification method that performs classification by using, as clues, the classification results of a plurality of classification programs based on different machine learning algorithms, the stacking classification method comprising:
a learning data division step of dividing the learning data that the plurality of classification programs use for machine learning into a plurality of partial learning data;
a stacking learning data generation step of generating stacking learning data by repeating a procedure in which one of the plurality of partial learning data obtained by the learning data division step is used as test data, the plurality of classification programs, trained by their machine learning algorithms on the partial learning data other than that one, are tested on the test data, and the test results are acquired as stacking learning data; and
a stacking learning step of performing stacking learning using the stacking learning data generated by the stacking learning data generation step.

(Supplementary note 11) A learning data creation device for creating stacking learning data used in stacking learning, in which classification is performed by using, as clues, the classification results of a plurality of classification devices based on different machine learning algorithms, the learning data creation device comprising:
learning data division means for dividing the learning data that the plurality of classification devices use for machine learning into a plurality of partial learning data; and
stacking learning data generation means for generating the stacking learning data by repeating a procedure in which one of the plurality of partial learning data obtained by the learning data division means is used as test data, the plurality of classification devices, trained by their machine learning algorithms on the partial learning data other than that one, are tested on the test data, and the test results are acquired as stacking learning data.

(Supplementary note 12) The learning data creation device according to supplementary note 11, wherein the stacking learning data generation means executes classification programs, serving as the plurality of classification devices, each on a different computer.

(Supplementary note 13) The learning data creation device according to supplementary note 11 or 12, wherein the classification device is a named entity extraction device that extracts named entities from text.

(Supplementary note 14) The learning data creation device according to supplementary note 13, wherein the named entity extraction device comprises word segmentation means for segmenting words from text and a plurality of classification means for extracting named entities by determining, for each word segmented by the word segmentation means, whether the word is a named entity, the learning data creation device being for creating a classification device that extracts named entities by determining, for each word, whether the word is a named entity, using the output results of the plurality of classification means as clues.

(Supplementary note 15) A stacking classification device that performs classification by using, as clues, the classification results of a plurality of classification devices based on different machine learning algorithms, the stacking classification device comprising:
learning data division means for dividing the learning data that the plurality of classification devices use for machine learning into a plurality of partial learning data;
stacking learning data generation means for generating stacking learning data by repeating a procedure in which one of the plurality of partial learning data obtained by the learning data division means is used as test data, the plurality of classification devices, trained by their machine learning algorithms on the partial learning data other than that one, are tested on the test data, and the test results are acquired as stacking learning data; and
stacking learning means for performing stacking learning using the stacking learning data generated by the stacking learning data generation means.

As described above, the learning data creation program, learning data creation method, and learning data creation device according to the present invention are useful for classification devices that use stacking, and are particularly suited to cases in which the load of creating learning data is high.

FIG. 1 is a diagram showing the stacking learning algorithm according to the present invention.
FIG. 2 is a flowchart of the present invention.
FIG. 3 is a functional block diagram showing the configuration of the classification device in the present invention.
FIG. 4 is a functional block diagram showing the configuration of the NE extraction device according to this embodiment.
FIG. 5 is a diagram showing examples of NEs defined by IREX.
FIG. 6 is a diagram showing an example of stacking learning data.
FIG. 7 is a diagram showing the stacking learning algorithm according to this embodiment.
FIG. 8 is an explanatory diagram of stacking using a plurality of NE extractors.
FIG. 9 is a diagram showing an example of conversion from word-level NE tagging to character-level NE tagging.
FIG. 10 is a flowchart showing the processing procedure of the stacking learning process performed by the NE extraction device according to this embodiment.
FIG. 11 is a diagram showing an example of stacking learning data created by the NE extraction device according to this embodiment.
FIG. 12 is a diagram showing an example of stacking learning data created by method (a).
FIG. 13 is a diagram showing an example of stacking learning data created by method (b).
FIG. 14 is a diagram showing an example of stacking learning data created by method (c).
FIG. 15 is a diagram showing the evaluation results.
FIG. 16 is a functional block diagram showing the configuration of a computer that executes the NE extraction program according to this embodiment.

Explanation of symbols

1 Classification device
11_1 to 11_m Classification devices
12 Stacking learning data storage unit
13 Stacking classification device
14 Learning control unit
15 Extraction control unit
100 NE extraction device
110_1 to 110_m NE extraction devices
111 NET unit
120 Stacking learning data storage unit
130 Stacking NE extractor
140 Learning control unit
150 Extraction control unit
160 Post-processing unit
200 Computer
210 RAM
211 NE extraction program
220 CPU
221 NE extraction process
230 HDD
240 LAN interface
250 Input/output interface
260 DVD drive

Claims

A learning data creation program for creating stacking learning data used in stacking learning, the stacking learning creating a classification program that performs classification by using, as clues, the classification results of a plurality of classification programs based on different machine learning algorithms, the learning data creation program causing a computer to execute:
a learning data division procedure of dividing the learning data that the plurality of classification programs use for machine learning into a plurality of partial learning data; and
a stacking learning data generation procedure of generating the stacking learning data by repeating a procedure in which one of the plurality of partial learning data obtained by the learning data division procedure is used as test data, the plurality of classification programs, trained by their machine learning algorithms on the partial learning data other than that one, are tested on the test data, and the test results are acquired as stacking learning data.

The learning data creation program according to claim 1, wherein, in the stacking learning data generation procedure, the plurality of classification programs are each executed on a different computer.

The learning data creation program according to claim 1, wherein the classification program is a named entity extraction program that extracts named entities from text.

The learning data creation program according to claim 3, wherein the named entity extraction program comprises a word segmentation procedure of segmenting words from text and a plurality of classification programs that extract named entities by determining, for each word segmented by the word segmentation procedure, whether the word is a named entity, the learning data creation program being for creating a classification program that extracts named entities by determining, for each word, whether the word is a named entity, using the output results of the plurality of classification programs as clues.

A learning data creation method for creating stacking learning data used in stacking learning, the stacking learning creating a classification program that performs classification by using, as clues, the classification results of a plurality of classification programs based on different machine learning algorithms, the learning data creation method comprising:
a learning data division step of dividing the learning data that the plurality of classification programs use for machine learning into a plurality of partial learning data; and
a stacking learning data generation step of generating the stacking learning data by repeating a procedure in which one of the plurality of partial learning data obtained by the learning data division step is used as test data, the plurality of classification programs, trained by their machine learning algorithms on the partial learning data other than that one, are tested on the test data, and the test results are acquired as stacking learning data.

A learning data creation device for creating stacking learning data used in stacking learning, the stacking learning creating a classification program that performs classification by using, as clues, the classification results of a plurality of classification devices based on different machine learning algorithms, the learning data creation device comprising:
learning data division means for dividing the learning data that the plurality of classification devices use for machine learning into a plurality of partial learning data; and
stacking learning data generation means for generating the stacking learning data by repeating a procedure in which one of the plurality of partial learning data obtained by the learning data division means is used as test data, the plurality of classification devices, trained by their machine learning algorithms on the partial learning data other than that one, are tested on the test data, and the test results are acquired as stacking learning data.