JP2018041300A

JP2018041300A - Machine learning model generation device and program

Info

Publication number: JP2018041300A
Application number: JP2016175389A
Authority: JP
Inventors: Motoki Taniguchi (谷口 元樹); Tomoko Okuma (大熊 智子); Seiji Suzuki (鈴木 星児)
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2016-09-08
Filing date: 2016-09-08
Publication date: 2018-03-15
Anticipated expiration: 2036-09-08
Also published as: JP6770709B2

Abstract

PROBLEM TO BE SOLVED: To perform pre-training with the classes used for pre-training determined automatically, so that generating a model for additional learning does not require a lengthy search for pre-training conditions.
SOLUTION: A machine learning model generation device comprises: first similarity calculation means for calculating a similarity between classes whose relationships are predefined, based on the mutual relationships of the classes; second similarity calculation means for calculating a similarity between the classes based on feature amounts of the classes; determination means for determining whether the classes are similar based on the calculation results of the first calculation means and the second calculation means; class merging means for merging the classes determined to be similar by the determination means into one class; and machine learning means for performing supervised machine learning processing.
SELECTED DRAWING: Figure 1

Description

The present invention relates to a machine learning model generation device and a program.

Named entity extraction, which is also used as a component technology in applications based on natural language processing, is a technique for extracting named entities such as proper nouns from text. Systems are known that use machine learning methods for classification problems, such as Support Vector Machines (SVMs) and Conditional Random Fields (CRFs).

Patent Document 1 below discloses a model parameter estimation method that can estimate model parameters adapted to additional data while keeping the influence on the parameters of an existing model small.

Non-Patent Document 1 below discloses the Hidden-Unit CRF (HUCRF), a variant of CRFs in which a layer of hidden units is added to CRFs.

Non-Patent Document 2 below discloses a pre-training method in which an HUCRF is first trained on data given provisional labels by clustering an unlabeled data set with canonical correlation analysis, and the resulting parameters are then used as initial values for additional learning on data that is actually labeled.

Patent Document 1: Japanese Patent Laying-Open No. 2015-38709

Non-Patent Document 1: L. van der Maaten, M. Welling, L. K. Saul, "Hidden-Unit Conditional Random Fields", Int. Conf. on Artificial Intelligence & Statistics, pp. 479-488, 2011
Non-Patent Document 2: Y. B. Kim, K. Stratos, R. Sarikaya, "Pre-training of Hidden-Unit CRFs", Association for Computational Linguistics, pp. 192-198, 2015

In classification problems, supervised learning is generally performed: the learner is given training data (known data) to which labels (also called "tags") have already been assigned, and the model parameters obtained by learning are used to classify unknown data (that is, to assign one of the labels to it). Achieving high accuracy with such machine learning methods generally requires a large amount of training data (corpora and dictionaries).

For example, many in-house corporate documents contain highly confidential information, or cannot be understood without knowledge and rules specific to the company, so performing named entity extraction on in-house documents requires a training corpus to be prepared independently. Because attaching a large amount of annotation (related annotation information) to such documents is extremely costly, it is not realistic to prepare enough training data to achieve high accuracy for named entity extraction on in-house documents.

Even for a task for which a large amount of training data cannot be prepared, learning accuracy (for a classification problem, the accuracy with which the resulting model classifies unknown data) can sometimes be improved by additional learning on the target task, using as initial conditions part of the model parameters (also simply called the "model") pre-trained on another task for which a large amount of training data exists. However, the pre-training conditions that lead to high accuracy are not known in advance, so searching for them takes time.

An object of the present invention is to provide a machine learning model generation device and a program capable of generating a model that yields higher learning accuracy than when classes are not integrated during pre-training.

[Machine learning model generation device]
The present invention according to claim 1 is a machine learning model generation device comprising:
first similarity calculation means for calculating a similarity between classes whose relationships are defined in advance, based on the mutual relationships of the classes;
second similarity calculation means for calculating a similarity between the classes based on feature amounts of the classes;
determination means for determining whether the classes are similar based on the calculation results of the first calculation means and the second calculation means;
class integration means for integrating the classes determined to be similar by the determination means into one class; and
machine learning means for performing supervised machine learning processing.

The present invention according to claim 2 is the machine learning model generation device according to claim 1, wherein the relationship is a hierarchical structure or a graph structure.

The present invention according to claim 3 is the model generation device according to claim 1 or 2, wherein the machine learning means comprises model parameter estimation means for estimating model parameters, and means for estimating the posterior probability of data for each class from the learned model parameters.

[Program]
The present invention according to claim 4 is a program for causing a computer to execute:
a step of calculating, as a first similarity, a similarity between classes whose relationships are defined in advance, based on the mutual relationships of the classes;
a step of calculating, as a second similarity, a similarity between the classes based on feature amounts of the classes;
a step of determining whether the classes are similar based on the first similarity and the second similarity;
a step of integrating the classes determined to be similar into one class; and
a step of performing supervised machine learning processing.

According to the invention of claim 1, it is possible to generate a model that yields higher learning accuracy than when classes are not integrated during pre-training.

According to the invention of claim 2, in addition to the effect of claim 1, the invention is applicable when the class relationships form a hierarchical structure or a graph structure.

According to the invention of claim 3, in addition to the effect of claim 1 or 2, a machine learning model generation device is obtained in which whether classes are similar is determined based on the second similarity calculated from posterior probabilities.

According to the invention of claim 4, it is possible to generate a model that yields higher learning accuracy than when classes are not integrated during pre-training.

FIG. 1 is a block diagram showing the functional configuration of the model generation device 10 according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the hardware configuration of the model generation device 10 according to the embodiment of the present invention.
FIG. 3 is a flowchart for explaining the operation of the model generation device 10 according to the embodiment of the present invention.
FIG. 4 shows an example of the hierarchical structure of classes according to Example 1.
FIG. 5 is an example of the estimated probabilities of each instance for the classes according to Example 1.
FIG. 6 is a matrix showing the correlation coefficients between learning classes according to Example 1.
FIG. 7 shows the relationship between the similarity on the hierarchical structure between learning classes according to Example 1 and the hierarchical structure of the classes.
FIG. 8 shows the hierarchical structure after the class integration process in Example 1.
FIG. 9 shows an example of the graph structure of classes according to Example 2.
FIG. 10 is an example of the estimated probabilities of each instance for the classes according to Example 2.
FIG. 11 is a matrix showing the correlation coefficients between learning classes according to Example 2.
FIG. 12 shows the relationship between the similarity on the graph structure between learning classes according to Example 2 and the graph structure of the classes.
FIG. 13 shows the graph structure after the class integration process in Example 2.

FIG. 1 is a block diagram showing the functional configuration of the model generation device 10 according to an embodiment of the present invention. The model generation device 10 of this embodiment includes a machine learning unit 11, a first similarity calculation unit 12, a second similarity calculation unit 13, a determination unit 14, a class integration unit 15, and an output unit 16.

The machine learning unit 11 performs machine learning processing on the learning data 17, in which labeled training data is recorded. It estimates the model parameters and, for each instance (a single item of training data), the posterior probability of the instance belonging to each class.

The first similarity calculation unit 12 takes as the feature amount of each class a vector generated from the posterior probabilities (hereinafter also called "estimated probabilities") of the instances estimated by the machine learning unit 11, and calculates the similarity between classes (the "feature similarity") from these feature amounts.

The second similarity calculation unit 13 calculates the similarity between classes (the "relationship similarity") based on the class relationship information 18 (for example, "class hierarchy" information). The initial value of the class relationship information is constructed manually from the labels of the learning data 17.

The determination unit 14 determines that a class pair is "similar" when its feature similarity and its relationship similarity each exceed their respective predetermined thresholds.

The class integration unit 15 integrates each class pair determined to be "similar" by the determination unit 14 into one class, and updates the labels of the learning data 17 and the class relationship information 18.

When no class pair has been determined to be "similar" by the determination unit 14, the output unit 16 outputs the model parameters estimated by the machine learning unit 11.

FIG. 2 is a diagram showing the hardware configuration of the model generation device 10. The model generation device 10 has a CPU 21, a memory 22, a storage device 23 such as a hard disk drive (HDD), and a display device 24, which are connected to one another via a control bus 25.

The CPU 21 executes predetermined processing based on a control program stored in the memory 22 or the storage device 23 (or provided from a storage medium (not shown) such as a CD-ROM), and thereby controls the operation of the model generation device 10.

In implementing the present invention, the model generation device 10 may further include various input interface devices such as a keyboard and a touch panel.

Next, the operation of the model generation device according to this embodiment is described with reference to the flowchart of FIG. 3, taking as an example the case where the feature amount of each class is calculated from the estimated probabilities of the instances. The examples below cover the case where the predefined class relationships have the hierarchical structure shown in FIG. 4 (Example 1) and the case where they have the graph structure shown in FIG. 9 (Example 2).

In step S1, machine learning processing is executed on the lowest-level classes of the hierarchical structure ("corporation name", "political organization name", "prefecture name", "municipality name", "person name"), the estimated probabilities are calculated for each instance, and the process proceeds to step S2. The following description assumes that the estimated probabilities of the instances take the values shown in FIG. 5. (The estimated probabilities for instance 1 are corporation name: 0.1, political organization name: 0.1, prefecture name: 0.4, municipality name: 0.3, person name: 0.1.)

The feature amount of each class can be calculated, for example, as a vector whose value in each dimension is the estimated probability of the corresponding instance for that class.

In step S2, the feature amount of each learning class is generated from the estimated probabilities of the instances, and the correlation coefficients between learning classes are then calculated from these feature amounts. FIG. 6 shows them as a correlation matrix of the feature amounts.
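Steps S1 and S2 can be illustrated with a minimal Python sketch. It assumes the Pearson correlation coefficient as the correlation measure (the description says only "correlation coefficient"). The identifiers `CLASSES`, `POSTERIORS`, `class_features`, `pearson`, and `correlation_matrix` are hypothetical. Only the first row of `POSTERIORS` comes from the description; the other rows are invented, since FIG. 5 is not reproduced here.

```python
import math

# Class order follows the description; the English identifiers are assumed names.
CLASSES = ["corporation", "political_org", "prefecture", "municipality", "person"]

# One row per instance, one estimated probability per class.
# Row 1 uses the values quoted for instance 1; the other rows are invented.
POSTERIORS = [
    [0.10, 0.10, 0.40, 0.30, 0.10],
    [0.60, 0.20, 0.10, 0.05, 0.05],
    [0.05, 0.05, 0.45, 0.40, 0.05],
    [0.10, 0.10, 0.05, 0.05, 0.70],
]


def class_features(rows):
    """Return one vector per class: its estimated probability for every instance."""
    return [list(column) for column in zip(*rows)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # constant vector: correlation undefined, treated as 0 here
    return cov / (norm_x * norm_y)


def correlation_matrix(rows):
    """Correlation coefficient between the feature vectors of every class pair."""
    features = class_features(rows)
    return [[pearson(a, b) for b in features] for a in features]


matrix = correlation_matrix(POSTERIORS)
print(round(matrix[2][3], 2))  # prefecture vs. municipality
```

With these invented rows the prefecture and municipality vectors rise and fall together, so their coefficient is well above the 0.5 threshold used later in the description.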

In step S3, which can be performed in parallel with steps S1 and S2, the similarity on the hierarchical structure between learning classes is calculated from the class hierarchy. For example, classes under the same upper-level class (parent), that is, sibling classes, are given a similarity of "1", and all other class pairs are given a similarity of "0".

Taking the hierarchical structure of FIG. 4 as an example, the pair "corporation name" and "political organization name" and the pair "prefecture name" and "municipality name" each share the same parent ("organization name" and "place name", respectively), so their similarity on the hierarchical structure is "1"; for all other pairs it is "0". FIG. 7 shows the similarities between learning classes as a matrix.
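The sibling rule of step S3 can be sketched as follows. The child-to-parent map `PARENT` is an assumed encoding of FIG. 4: the "root" node and the placement of "person" directly under it are hypothetical, as are the function names.

```python
from itertools import combinations

# Assumed child -> parent encoding of the hierarchy; "root" is a hypothetical top node.
PARENT = {
    "corporation": "organization",
    "political_org": "organization",
    "prefecture": "place",
    "municipality": "place",
    "person": "root",
    "organization": "root",
    "place": "root",
}


def leaf_classes(parent):
    """Lowest-level classes: nodes that are not the parent of any other node."""
    parents = set(parent.values())
    return [node for node in parent if node not in parents]


def hierarchy_similarity(parent):
    """Similarity 1 for sibling leaf classes (same parent), 0 for every other pair."""
    return {
        (a, b): 1 if parent[a] == parent[b] else 0
        for a, b in combinations(leaf_classes(parent), 2)
    }


similarity = hierarchy_similarity(PARENT)
print([pair for pair, value in similarity.items() if value == 1])
```

Only the two sibling pairs named in the description receive a similarity of 1 under this encoding.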

After steps S2 and S3 are complete, the process proceeds to step S4. In step S4, learning class pairs for which the correlation coefficient of the feature amounts calculated in step S2 and the similarity on the hierarchical structure calculated in step S3 are each at or above a predetermined threshold are extracted as similar class pairs. For example, with a threshold of 0.5 for the correlation coefficient and a threshold of 0.5 for the similarity on the hierarchical structure, the pair "municipality name" and "prefecture name" is extracted as a similar class pair.
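The extraction in step S4 amounts to a filter over class pairs, sketched below. The function name `similar_pairs` and the similarity values are invented for illustration. The comparison uses "at or above", following the wording of step S4.

```python
def similar_pairs(feature_sim, relation_sim,
                  feature_threshold=0.5, relation_threshold=0.5):
    """Pairs whose feature and relationship similarities both reach their thresholds."""
    return [
        pair
        for pair, relation in relation_sim.items()
        if relation >= relation_threshold
        and feature_sim.get(pair, 0.0) >= feature_threshold
    ]


# Invented values for illustration; FIG. 6 and FIG. 7 are not reproduced here.
feature_sim = {
    ("corporation", "political_org"): 0.2,
    ("prefecture", "municipality"): 0.8,
    ("municipality", "person"): 0.7,
}
relation_sim = {
    ("corporation", "political_org"): 1,
    ("prefecture", "municipality"): 1,
    ("municipality", "person"): 0,
}

print(similar_pairs(feature_sim, relation_sim))
```

A pair with a high correlation but no structural relationship, such as the invented municipality and person pair, is rejected, as is a sibling pair with a low correlation.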

In step S5, which follows step S4, it is determined whether any similar class pair exists. If one exists, the process proceeds to step S6; if none exists, it proceeds to step S7.

In step S6, the similar class pairs extracted in step S4 are each integrated into one class. Specifically, the class relationship information recorded as a hierarchical structure (the hierarchy information) and the labels of the learning data are updated. If "municipality name" and "prefecture name" form a similar class pair, the two are integrated, and the name of their upper-level class in the hierarchy (for example, "place name") can be used as the class name (label) after integration (FIG. 8). After the class integration process, the process returns to steps S1 and S3.
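One possible way to carry out the update of step S6 for a hierarchy is sketched below. It assumes the same hypothetical child-to-parent encoding with a "root" node, and it handles only the simple case where the merged pair are the sole children of their parent. The function name `merge_siblings` and the sample labels are illustrative.

```python
def merge_siblings(parent, labels, pair):
    """Merge two sibling leaf classes into their parent class and relabel the data."""
    a, b = pair
    merged = parent[a]
    children = {child for child, p in parent.items() if p == merged}
    if children != {a, b}:
        raise ValueError("sketch only handles a parent whose sole children are the pair")
    # Dropping both children turns the former parent into a lowest-level class.
    new_parent = {child: p for child, p in parent.items() if child not in (a, b)}
    new_labels = [merged if label in (a, b) else label for label in labels]
    return new_parent, new_labels


parent = {
    "corporation": "organization", "political_org": "organization",
    "prefecture": "place", "municipality": "place",
    "person": "root", "organization": "root", "place": "root",
}
labels = ["prefecture", "person", "municipality", "corporation"]

parent, labels = merge_siblings(parent, labels, ("prefecture", "municipality"))
print(labels)
```

After the merge, "place" no longer has children in the map, so it is one of the lowest-level classes on the next pass.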

In steps S1 and S3, now executed again, the machine learning processing and the calculation of the similarity on the hierarchical structure between learning classes are each repeated using the updated hierarchy information and learning data labels. As a result of the update, "place name" is added and "municipality name" and "prefecture name" are removed, so the lowest-level classes of the hierarchy are now "corporation name", "political organization name", "place name", and "person name".

The loop is repeated in this way until no similar pair remains, and finally, in step S7, the model is output and the process ends.
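The overall flow of FIG. 3 can be sketched as a loop that takes the individual steps as functions. This is an outline under stated assumptions, not the device's implementation: all names are hypothetical, the `toy_*` functions are stand-ins for a real learner and similarity calculation, and one pair is merged per pass for simplicity.

```python
def generate_model(labels, relations, train, feature_similarity,
                   relation_similarity, merge,
                   feature_threshold=0.5, relation_threshold=0.5):
    """Repeat steps S1 to S6 until no similar class pair remains, then return (S7)."""
    while True:
        model, posteriors = train(labels)                # S1
        feature_sim = feature_similarity(posteriors)     # S2
        relation_sim = relation_similarity(relations)    # S3
        similar = [                                      # S4
            pair for pair, rel in relation_sim.items()
            if rel >= relation_threshold
            and feature_sim.get(pair, 0.0) >= feature_threshold
        ]
        if not similar:                                  # S5
            return model, labels, relations              # S7
        # S6, simplified: merge one pair per pass, then retrain.
        relations, labels = merge(relations, labels, similar[0])


# Toy stand-ins (all hypothetical) so the loop can run without a real learner.
def toy_train(labels):
    return {"classes": sorted(set(labels))}, None  # this stub yields no posteriors


def toy_feature_similarity(_posteriors):
    return {("A", "B"): 0.2, ("C", "D"): 0.8}


def toy_relation_similarity(relations):
    return relations  # relations already map pair -> similarity


def toy_merge(relations, labels, pair):
    merged = "+".join(pair)
    remaining = {p: s for p, s in relations.items() if p != pair}
    return remaining, [merged if label in pair else label for label in labels]


model, labels, relations = generate_model(
    ["A", "B", "C", "D", "C"], {("A", "B"): 1, ("C", "D"): 1},
    toy_train, toy_feature_similarity, toy_relation_similarity, toy_merge,
)
print(model["classes"])
```

The loop terminates only if each merge removes the merged pair from the relationship information, which the toy `toy_merge` does.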

Example 2 assumes the task of estimating the author from the text of a novel. The relationships between authors, defined by teacher-student relationships, friendships, and the like, take a graph structure such as that shown in FIG. 9 (a line connecting two classes (novelist names) indicates that a teacher-student relationship or friendship exists between them).

In step S1, machine learning processing is executed on all classes that make up the graph structure, the estimated probabilities are calculated for each instance, and the process proceeds to step S2. The following description assumes that the estimated probabilities of the instances take the values shown in FIG. 10. (The estimated probabilities for instance 1 are novelist A: 0.1, novelist B: 0.1, novelist C: 0.4, novelist D: 0.3, novelist E: 0.1.)

In step S2, the feature amount of each learning class is generated from the estimated probabilities of the instances, and the correlation coefficients between learning classes are then calculated from these feature amounts. FIG. 11 shows them as a correlation matrix of the feature amounts.

In step S3, which can be performed in parallel with steps S1 and S2, the similarity on the graph structure between learning classes is calculated from the class graph. For example, classes directly connected by a line are given a similarity of "1", and all other class pairs are given a similarity of "0".

Taking the graph structure of FIG. 9 as an example, "novelist A" has a similarity on the graph structure of "1" with each of "novelist B", "novelist C", and "novelist D", and a similarity of "0" with "novelist E". FIG. 12 shows the similarities between learning classes as a matrix.
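For the graph case, step S3 reduces to an adjacency test, sketched below. The edge list is an assumption: the description states only that A is linked to B, C, and D but not to E. The C-D edge is implied by the later extraction of that pair, and the D-E edge is invented. The identifiers `NOVELISTS`, `EDGES`, and `graph_similarity` are hypothetical.

```python
from itertools import combinations

NOVELISTS = ["A", "B", "C", "D", "E"]

# Assumed edge list: A-B, A-C, A-D follow the description, C-D is implied by
# the later merge, and D-E is invented for illustration.
EDGES = {
    frozenset(edge)
    for edge in [("A", "B"), ("A", "C"), ("A", "D"), ("C", "D"), ("D", "E")]
}


def graph_similarity(nodes, edges):
    """Similarity 1 for classes joined directly by an edge, 0 for every other pair."""
    return {
        (a, b): 1 if frozenset((a, b)) in edges else 0
        for a, b in combinations(nodes, 2)
    }


similarity = graph_similarity(NOVELISTS, EDGES)
print([pair for pair, value in similarity.items() if value == 1])
```

Storing edges as `frozenset` objects keeps the relation undirected, so the pair (A, B) matches regardless of the order in which the edge was written.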

After steps S2 and S3 are complete, the process proceeds to step S4. In step S4, learning class pairs for which the correlation coefficient of the feature amounts calculated in step S2 and the similarity on the graph structure calculated in step S3 are each at or above a predetermined threshold are extracted as similar class pairs. For example, with a threshold of 0.5 for the correlation coefficient and a threshold of 0.5 for the similarity on the graph structure, the pair "novelist C" and "novelist D" is extracted as a similar class pair.

In step S6, the similar class pairs extracted in step S4 are each integrated into one class. Specifically, the class relationship information recorded as a graph structure (the graph information) and the labels of the learning data are updated. As an example, FIG. 13 shows the graph structure after integration when "novelist C" and "novelist D" form a similar class pair. A name formed by joining the two names before integration can be used, for example, as the class name (label) after integration. After the class integration process, the process returns to steps S1 and S3.
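Merging two classes in a graph can be read as contracting the edge between them, as in the following sketch. The function name `merge_nodes`, the "+" separator for the joined name, and the edge list (the same assumed edges as before, with D-E invented) are all illustrative choices.

```python
def merge_nodes(edges, labels, pair):
    """Contract the edge between two classes and relabel the training data."""
    a, b = sorted(pair)
    merged = f"{a}+{b}"  # joined name; the separator is an arbitrary choice
    new_edges = set()
    for edge in edges:
        renamed = frozenset(merged if node in (a, b) else node for node in edge)
        if len(renamed) == 2:  # drop the self-loop left by the contracted edge
            new_edges.add(renamed)
    new_labels = [merged if label in (a, b) else label for label in labels]
    return new_edges, new_labels


edges = {
    frozenset(edge)
    for edge in [("A", "B"), ("A", "C"), ("A", "D"), ("C", "D"), ("D", "E")]
}
labels = ["C", "A", "D", "E"]

edges, labels = merge_nodes(edges, labels, ("C", "D"))
print(sorted(sorted(edge) for edge in edges))
print(labels)
```

The merged class inherits the neighbours of both original classes, and the two former edges from A collapse into a single edge because the edges are held in a set.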

In steps S1 and S3, now executed again, the machine learning processing and the calculation of the similarity on the graph structure between learning classes are each repeated using the updated graph information and learning data labels.

As described above, the present invention can be applied to a machine learning model generation device and a program.

Description of symbols
10 Model generation device
11 Machine learning unit
12 First similarity calculation unit
13 Second similarity calculation unit
14 Determination unit
15 Class integration unit
16 Output unit
17 Learning data
18 Class relationship information
21 CPU
22 Memory
23 Storage device
24 Display device
25 Control bus

Claims

1. A machine learning model generation device comprising:
first similarity calculation means for calculating a similarity between classes whose relationships are defined in advance, based on the mutual relationships of the classes;
second similarity calculation means for calculating a similarity between the classes based on feature amounts of the classes;
determination means for determining whether the classes are similar based on the calculation results of the first calculation means and the second calculation means;
class integration means for integrating the classes determined to be similar by the determination means into one class; and
machine learning means for performing supervised machine learning processing.

2. The machine learning model generation device according to claim 1, wherein the relationship is a hierarchical structure or a graph structure.

3. The model generation device according to claim 1 or 2, wherein the machine learning means comprises model parameter estimation means for estimating model parameters, and means for estimating the posterior probability of data for each class from the learned model parameters.

4. A program for causing a computer to execute:
a step of calculating, as a first similarity, a similarity between classes whose relationships are defined in advance, based on the mutual relationships of the classes;
a step of calculating, as a second similarity, a similarity between the classes based on feature amounts of the classes;
a step of determining whether the classes are similar based on the first similarity and the second similarity;
a step of integrating the classes determined to be similar into one class; and
a step of performing supervised machine learning processing.