JP2024039138A

JP2024039138A - Model selection device, model selection method, and program

Info

Publication number: JP2024039138A
Application number: JP2022143469A
Authority: JP
Inventors: 健太山岸; 敏生岡; 達也石井
Original assignee: Individual
Current assignee: Individual
Priority date: 2022-09-09
Filing date: 2022-09-09
Publication date: 2024-03-22

Abstract

[Problem] To provide a model selection device, a model selection method, and a program capable of improving the accuracy of a trained model that has undergone transfer learning, by selecting, from among a plurality of trained models of different domains, the trained model best suited for adaptation to transfer learning. [Solution] A model selection device comprising: a domain learning unit that generates, for each of different domains having classes in common with one another, a trained model that has learned to identify an identification target, taking input data representing the identification target as input; and an optimal model selection unit that selects, from among the trained models generated for the different domains, the trained model best suited for adaptation to transfer learning, based on the identification results obtained when each trained model identified the identification target. [Selected drawing] FIG. 1

Description

The present invention relates to a model selection device, a model selection method, and a program.

In recent years, machine learning has been used to create classifiers that identify input data. When separate classifiers are created for a large amount of input data and for a small amount of input data, ordinary machine learning trains one classifier on the large data set and another on the small data set. In this case, the classifier trained on the small amount of input data tends to have low classification accuracy. For this reason, transfer learning, a technique that adapts a classifier (trained model) trained on a large amount of input data to a small amount of input data, has come into increasingly common use.

In learning from such input data, a trained CNN (Convolutional Neural Network) can be obtained by, for example, feeding the input data into a model called a CRNN (Convolutional Recurrent Neural Network) and having it learn the identification target.

Normally, when transfer learning is performed, a trained model is created from a large amount of input data and that trained model is then reused. Transfer learning can therefore train on a small amount of labeled input data, which is expected to reduce training time and improve accuracy.

However, when several trained models are available for reuse in transfer learning, it is difficult to know in advance which trained model is best suited to transfer learning (for example, which gives the highest accuracy).

Furthermore, techniques have been established that allow the same trained model to be reused even when the type of input data, the range of input values, the type of output data, the range of output values, the probability distribution, and so on differ (that is, when the domain differs). However, no method has existed for determining the domain of identification targets of different kinds (for example, handwritten characters written in different eras) and selecting an appropriate trained model.

Patent Document 1 below discloses a technique for selecting a specific group of classifiers, based on the variance relationship of vectors, in order to evaluate a group of classifiers (weak classifiers).
Patent Document 2 below discloses a technique for translating OCR text, read from a source language by OCR (Optical Character Recognition), into a target language. This technique determines the complexity of the translation and performs machine translation of the OCR text from the source language to the target language based on that complexity, thereby obtaining translated OCR text.

Patent Document 1: JP 2019-191711 A
Patent Document 2: US Patent No. 9514377

Incidentally, the domain in which a classifier achieves high classification accuracy may differ depending on the input data. In this case, when applying transfer learning, it is desirable to determine in advance which domain yields high classification accuracy and to use the classifier with the highest classification accuracy for transfer learning. With the techniques of Patent Document 1 and Patent Document 2, however, it was difficult to take the classification accuracy of each domain into account in advance.

In view of the above problem, an object of the present invention is to provide a model selection device, a model selection method, and a program capable of improving the accuracy of a trained model that has undergone transfer learning, by selecting, from among trained models of different domains, the trained model best suited for adaptation to transfer learning.

To solve the above problem, a model selection device according to one aspect of the present invention includes: a domain learning unit that generates, for each of different domains having classes in common with one another, a trained model that has learned to identify an identification target, taking input data representing the identification target as input; and an optimal model selection unit that selects, from among the trained models generated for the different domains, the trained model best suited for adaptation to transfer learning, based on the identification results obtained when each trained model identified the identification target.

A model selection method according to one aspect of the present invention includes: a domain learning step in which a domain learning unit generates, for each of different domains having classes in common with one another, a trained model that has learned to identify an identification target, taking input data representing the identification target as input; and an optimal model selection step in which an optimal model selection unit selects, from among the trained models generated for the different domains, the trained model best suited for adaptation to transfer learning, based on the identification results obtained when each trained model identified the identification target.

A program according to one aspect of the present invention causes a computer to function as: domain learning means that generates, for each of different domains having classes in common with one another, a trained model that has learned to identify an identification target, taking input data representing the identification target as input; and optimal model selection means that selects, from among the trained models generated for the different domains, the trained model best suited for adaptation to transfer learning, based on the identification results obtained when each trained model identified the identification target.

According to the present invention, the accuracy of a trained model that has undergone transfer learning can be improved by selecting, from among trained models of different domains, the trained model best suited for adaptation to transfer learning.

FIG. 1 is a block diagram showing an example of the functional configuration of a model selection device according to the first embodiment.
FIG. 2 is a flowchart showing an example of the flow of the process for generating trained models that are transfer learning candidates according to the first embodiment.
FIG. 3 is a flowchart showing an example of the flow of the trained model selection process according to the first embodiment.
FIG. 4 is a block diagram showing an example of the functional configuration of a model selection device according to the second embodiment.
FIG. 5 is a flowchart showing an example of the flow of the trained model selection process according to the second embodiment.

Embodiments of the present invention are described in detail below with reference to the drawings.

<1. First embodiment>
The first embodiment is described with reference to FIGS. 1 to 3.
The first embodiment describes a model selection device for classifiers that identify an identification target contained in input data. From among trained models that have each learned to identify the identification target for a different target domain, the device selects the trained model best suited for adaptation to transfer learning, based on the domain score of each trained model. The domains are assumed to have classes in common with one another.

The following describes an example in which the identification target is handwritten characters and the input data is image data representing handwritten characters.
The domain in this case is, for example, the type of classical book or ancient document that contains the text information. Each domain is classified by this type, and data of different types is assumed not to be mixed within a domain.
The class is, for example, the character type. Character type refers to a classification such as kanji, hiragana, or katakana. As an example, domains that contain characters written in kanji are classified into the same class and can be said to have a common class. The same applies to hiragana and katakana. The class may also be the script style of the characters. Script styles are classified into, for example, regular script (kaisho), semi-cursive script (gyosho), and cursive script (sosho). As an example, domains that contain characters written in regular script are classified into the same class and can be said to have a common class. The same applies to semi-cursive script and cursive script.

When the identification target is handwritten characters, a trained model is generated for each domain by training a neural network in a handwriting OCR engine. OCR stands for Optical Character Recognition and refers to technology that recognizes text in an image and converts it into text data.
After generation, a domain score is calculated for each trained model based on the output obtained when image data of a handwritten region containing text data is given as input. After calculation, the trained model best suited for adaptation to transfer learning is selected from among the trained models based on the calculated domain scores.

The handwritten characters that each of the trained models generated for the domains is meant to identify are assumed to have been written in a different era for each trained model. Each trained model is also assumed to have been trained on image data showing handwritten characters written with a writing instrument.
When the image data is given as line images, the model selection device treats as input data images in line units (for example, one line at a time) divided at the domain boundaries. When the input data is not given as line images, the model selection device may instead treat as input data images in character units, divided character by character.

<1-1. Functional configuration of the model selection device>
The functional configuration of the model selection device according to the first embodiment is described with reference to FIG. 1. FIG. 1 is a block diagram showing an example of the functional configuration of the model selection device according to the first embodiment. As shown in FIG. 1, the model selection device 10 includes a storage unit 110, a model analysis unit 120, a model evaluation unit 130, and an optimal model selection unit 140.

(1) Storage unit 110
The storage unit 110 has a function of storing various kinds of information. The storage unit 110 is composed of a storage medium such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), RAM (Random Access Memory), ROM (Read Only Memory), or any combination of these storage media.
The storage unit 110 may include databases (DBs) for storing various kinds of information. For example, as shown in FIG. 1, the storage unit 110 includes an input image DB 111, a model management DB 112, and a calculation result DB 113.

(1-1) Input image DB 111
The input image DB 111 is a database that stores the input data used as input when generating trained models and the input data used as input when obtaining output from the generated trained models.

(1-2) Model management DB 112
The model management DB 112 stores the trained models generated by the domain learning unit 121, described later, which are the candidates for adaptation to transfer learning.

(1-3) Calculation result DB 113
The calculation result DB 113 stores the index used to select the trained model best suited for adaptation to transfer learning. This index is, for example, the domain score calculated by the model evaluation unit 130, described later.

(2) Model analysis unit 120
The model analysis unit 120 has a function of analyzing image data containing text information retrieved from the input image DB 111 and producing the trained models that are candidates for adaptation to transfer learning.
As shown in FIG. 1, the model analysis unit 120 includes a domain learning unit 121.

(2-1) Domain learning unit 121
The domain learning unit 121 has a function of generating a trained model for each domain. For example, the domain learning unit 121 acquires image data containing text information from the input image DB 111 and, taking each piece of image data as input, learns to discriminate (identify) the character type, thereby generating and outputting a trained model for each domain. The generated trained models are stored in the model management DB 112.

The domain learning unit 121 may hold correspondence map data indicating the correspondence between different classes. In this case, for the selected classes, the domain learning unit 121 merges classes across trained models based on the correspondence map data. As an example, when the correspondence map data indicates the correspondence between kanji and hiragana, the identification result for a kanji read as "a" and the identification result for the hiragana "a" can be treated as the same. This allows the domain learning unit 121 to treat domains that have different classes as domains that have a common class.
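The class merging described above can be sketched as a simple label lookup. This is a minimal illustration only: the map contents and the names `CORRESPONDENCE_MAP`, `unify_label`, and `unify_predictions` are hypothetical and do not come from the publication, which does not specify the data format of the correspondence map.

```python
# Hypothetical correspondence map data: kanji read as "a" map to hiragana "あ".
# The entries are illustrative assumptions, not taken from the publication.
CORRESPONDENCE_MAP = {
    "亜": "あ",
    "阿": "あ",
}


def unify_label(label, correspondence_map=CORRESPONDENCE_MAP):
    """Return the shared class for a label, or the label itself if unmapped."""
    return correspondence_map.get(label, label)


def unify_predictions(predictions, correspondence_map=CORRESPONDENCE_MAP):
    """Map every identification result onto the shared class set."""
    return [unify_label(p, correspondence_map) for p in predictions]
```

With such a mapping, results from a kanji-class model and a hiragana-class model can be compared on one class set.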

(3) Model evaluation unit 130
The model evaluation unit 130 has a function of evaluating trained models. For example, the model evaluation unit 130 calculates a variance for each trained model of a different domain and evaluates each trained model based on the result.
As shown in FIG. 1, the model evaluation unit 130 includes a probability distribution calculation unit 131, an information matrix calculation unit 132, and a domain score calculation unit 133.

(3-1) Probability distribution calculation unit 131
The probability distribution calculation unit 131 has a function of calculating probability distributions. For example, taking a trained model obtained by the domain learning unit 121 as input, the probability distribution calculation unit 131 calculates the probability distribution of the image data (input data) that is the target of transfer learning. The probability distribution calculation unit 131 calculates a probability distribution for every per-domain trained model.

(3-2) Information matrix calculation unit 132
The information matrix calculation unit 132 has a function of calculating an information matrix. For example, the information matrix calculation unit 132 takes the probability distribution calculated by the probability distribution calculation unit 131 as input, treats it as a vector, and generates the information matrix by computing gradients.

(3-3) Domain score calculation unit 133
The domain score calculation unit 133 has a function of calculating domain scores. As an index for selecting the optimal trained model, the domain score calculation unit 133 calculates, for each trained model generated by the domain learning unit 121, the variance of the identification results for handwritten characters as the domain score. For example, the domain score calculation unit 133 takes the information matrix calculated by the information matrix calculation unit 132 as input and calculates a domain score for each domain. The calculated domain scores are stored in the calculation result DB 113.

(4) Optimal model selection unit 140
The optimal model selection unit 140 has a function of selecting the trained model best suited for adaptation to transfer learning. For example, from among the trained models generated by the domain learning unit 121 and stored in the model management DB 112, the optimal model selection unit 140 selects the trained model best suited for adaptation to transfer learning, based on the identification results obtained when each trained model identified handwritten characters. In the first embodiment, the optimal model selection unit 140 selects the optimal trained model from the model management DB 112 based on the domain scores calculated by the domain score calculation unit 133 and stored in the calculation result DB 113.

<1-2. Processing flow>
The functional configuration of the model selection device 10 according to the first embodiment has been described above. Next, the flow of processing according to the first embodiment is described with reference to FIGS. 2 and 3. As an example, the following describes an OCR system that takes image data containing text information as input, outputs trained models after training, and then uses the optimal trained model for new input data. The image data containing text information includes, for identifying each line image, an ID, a width, a height, an x coordinate specifying where the width of the line image starts, a y coordinate specifying where the height of the line image starts, character string attributes, and so on. A single piece of image data is assumed to contain character strings spanning multiple lines.

(1) Process for generating trained models that are transfer learning candidates
First, an example of the flow of the process for generating trained models that are transfer learning candidates is described with reference to FIG. 2. FIG. 2 is a flowchart showing an example of the flow of this generation process according to the first embodiment.

As shown in FIG. 2, the domain learning unit 121 first acquires image data containing text data from the input image DB 111 (step S101).
Next, the domain learning unit 121 checks whether the acquired image data has been binarized (step S102). If it has not been binarized (step S102/NO), the process proceeds to step S103. If it has been binarized (step S102/YES), the process proceeds to step S104.

When the process proceeds to step S103, the domain learning unit 121 binarizes the image data (step S103). For example, the domain learning unit 121 applies a discriminant analysis method to find the threshold that maximizes the degree of separation and binarizes the image automatically. After binarization, the process proceeds to step S104.
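Choosing the threshold that maximizes the degree of separation corresponds to what is commonly known as Otsu's method. The following is a minimal sketch under the assumption that the image is given as a flat list of 8-bit grayscale values; the function names `otsu_threshold` and `binarize` are hypothetical, and the publication does not state how the threshold search is implemented.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes between-class separation.

    Pixels with value <= t form the low class, pixels > t the high class.
    Assumes `pixels` is a flat list of integers in the range 0..255.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_sep = 0, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # intensity sum of the low class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, separation is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor of 1 / total**2.
        sep = w0 * w1 * (mean0 - mean1) ** 2
        if sep > best_sep:
            best_t, best_sep = t, sep
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 if it is above the threshold, else 0."""
    return [1 if p > threshold else 0 for p in pixels]
```

In practice an image library would usually provide this operation; the pure-Python loop is shown only to make the separation criterion explicit.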

When the process proceeds to step S104, the domain learning unit 121 cuts the image data into rectangular images, image by image (step S104). For example, the domain learning unit 121 determines the circumscribed rectangle of one line from the width, the height, the x coordinate specifying where the width of the line image starts, and the y coordinate specifying where the height of the line image starts, and generates a rectangular image based on that circumscribed rectangle. The domain learning unit 121 then outputs the character string corresponding to each rectangular image.
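The cropping in step S104 can be sketched as follows, assuming the image is a 2-D list of pixel rows and each line is described by a record with `x`, `y`, `width`, `height`, and `text` keys. The record layout and the names `crop_line` and `crop_lines` are hypothetical; the publication lists the fields but not their storage format.

```python
def crop_line(image, x, y, width, height):
    """Cut the circumscribed rectangle of one line out of a 2-D image.

    `image` is a list of rows; (x, y) is the top-left corner of the rectangle.
    """
    return [row[x:x + width] for row in image[y:y + height]]


def crop_lines(image, line_records):
    """Return (rectangular image, character string) pairs, one per line record."""
    return [
        (crop_line(image, r["x"], r["y"], r["width"], r["height"]), r["text"])
        for r in line_records
    ]
```

Each returned pair couples a rectangular image with its character string, matching the output described for this step.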

Next, the domain learning unit 121 checks whether label information needs to be created for training (step S105). If no label information has been prepared, it determines that label information must be created (step S105/YES), and the process proceeds to step S106. If label information has been prepared, it determines that label information need not be created (step S105/NO), and the process proceeds to step S107.

When the process proceeds to step S106, the domain learning unit 121 creates label information (step S106). In the first embodiment, character-type semantic segmentation (SS) is created for training. In character-type SS, ground-truth labels are assigned per character type and used as the classes. After the label information is created, the process proceeds to step S107.

When the process proceeds to step S107, the domain learning unit 121 estimates, for each domain, the character type from the rectangular images based on the label information (step S107).
Next, the domain learning unit 121 trains, for each domain, a neural network for handwritten-region estimation using the label information (step S108). The network structure may take the form of, for example, an FCN (Fully Convolutional Network).
The domain learning unit 121 then outputs the trained model generated for each domain and stores it in the model management DB 112 (step S109).

(2) Flow of the trained model selection process
Next, an example of the flow of the trained model selection process is described with reference to FIG. 3. FIG. 3 is a flowchart showing an example of the flow of the trained model selection process according to the first embodiment.

As shown in FIG. 3, the model evaluation unit 130 first acquires rectangular image data (step S201). For example, the model evaluation unit 130 acquires binarized rectangular image data containing text data in the same way as in steps S101 to S104 of FIG. 2.

Next, the model evaluation unit 130 checks, for each domain stored in the model management DB 112, whether there are multiple trained models (step S202). If a domain has multiple trained models (step S202/YES), the process proceeds to step S203. If a domain does not have multiple trained models (step S202/NO), the process proceeds to step S204.

When the process proceeds to step S203, the model evaluation unit 130 selects, from among the multiple trained models, the one best suited within that domain (step S203). After the selection, the process proceeds to step S204. If step S202 detects a domain for which no trained model has been stored, that domain is excluded from the candidates when the optimal trained model is selected.

When the process proceeds to step S204, the probability distribution calculation unit 131 calculates a probability distribution from the output of each trained model stored in the model management DB 112 (step S204). For example, the probability distribution calculation unit 131 takes the trained model as input and calculates the dummy label distribution of the target data set.
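One possible reading of step S204 is sketched below. It assumes that the trained model emits one score (logit) per class for each target sample, that the scores are turned into probabilities with a softmax, and that the dummy label distribution is the average of those probabilities over the target data set. None of these details is stated in the publication, and the names `softmax` and `dummy_label_distribution` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dummy_label_distribution(logits_per_sample):
    """Average the per-sample class probabilities over the target data set.

    Assumption: the distribution is a plain mean over samples.
    """
    probs = [softmax(logits) for logits in logits_per_sample]
    n = len(probs)
    num_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n for c in range(num_classes)]
```

Because the target data set carries no ground-truth labels at this point, the model's own outputs stand in for them under this reading.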

Next, the information matrix calculation unit 132 calculates the information matrix (step S205). For example, the information matrix calculation unit 132 calculates the gradient, with respect to each parameter, of the probability distribution that the probability distribution calculation unit 131 calculated as the dummy label distribution, and takes the product of that gradient with its transpose. In this way the information matrix is calculated from the input data and the probability distribution.
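The product of a gradient with its transpose is an outer product, which yields a square, symmetric matrix with one row and column per parameter. The sketch below assumes the per-sample gradient vectors have already been obtained (for example from an automatic differentiation framework) and averages the outer products over samples. The averaging and the name `information_matrix` are assumptions for illustration; the publication only states that the gradient is multiplied by its transpose.

```python
def information_matrix(gradients):
    """Average the outer products g g^T over per-sample gradient vectors.

    `gradients` is a list of equal-length lists, one gradient per sample.
    Averaging over samples is an assumption made for this sketch.
    """
    n = len(gradients)
    d = len(gradients[0])
    matrix = [[0.0] * d for _ in range(d)]
    for g in gradients:
        for i in range(d):
            for j in range(d):
                matrix[i][j] += g[i] * g[j] / n
    return matrix
```

The full matrix grows quadratically with the number of parameters, which is one reason a later step reduces it to a shorter representation.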

Next, the information matrix calculation unit 132 checks whether the classes of the domains overlap (step S206). If there is no overlap (step S206/NO), the process advances to step S207. If there is overlap (step S206/YES), the process advances to step S208.

When the process advances to step S207, the information matrix calculation unit 132 filters the information matrix (step S207). In this filtering, correlations between different filters in the information matrix are treated as unimportant, and all filter parameters are averaged. Approximating the full information matrix in this way simplifies the calculation. After the filtering, the process advances to step S208.

When the process advances to step S208, the domain score calculation unit 133 outputs, for each domain, a fixed-length vector derived from the information matrix (step S208). In the first embodiment, this fixed-length vector corresponds to the domain score. Because the size of the information matrix differs from one trained model to another, the domain score calculation unit 133 extracts only the diagonal components, averages the diagonal values that belong to the same filter, and outputs a fixed-length vector.
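The reduction in step S208 can be sketched as below. The mapping from each parameter to its filter (`filter_of_param`) is a hypothetical input introduced for the example; the specification only states that diagonal values belonging to the same filter are averaged.

```python
from typing import List

Matrix = List[List[float]]


def domain_vector(info: Matrix, filter_of_param: List[int], num_filters: int) -> List[float]:
    """Take the diagonal of the information matrix and average it per filter."""
    diagonal = [info[i][i] for i in range(len(info))]
    sums = [0.0] * num_filters
    counts = [0] * num_filters
    for value, filter_index in zip(diagonal, filter_of_param):
        sums[filter_index] += value
        counts[filter_index] += 1
    # Filters with no parameters get 0.0 so the output length stays fixed.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# A 4x4 toy matrix: parameters 0 and 1 belong to filter 0, parameters 2 and 3 to filter 1.
info = [
    [0.5, 0.1, 0.0, 0.0],
    [0.1, 1.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.2],
    [0.0, 0.0, 0.2, 3.0],
]
vector = domain_vector(info, filter_of_param=[0, 0, 1, 1], num_filters=2)
```

The output length depends only on the number of filters, so vectors from models with different parameter counts become comparable as long as the filter layout is shared.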

Next, the domain score calculation unit 133 checks the symmetry of each trained model (step S209). If there is no symmetry, that is, the asymmetric case (step S209/NO), the process advances to step S210. If there is symmetry (step S209/YES), the process advances to step S211.

When the process advances to step S210, the domain score calculation unit 133 calculates a domain score for each domain using an asymmetric similarity measure (step S210). After the calculation, the process advances to step S212.
When the process advances to step S211, the domain score calculation unit 133 vectorizes the trained model using cosine similarity and calculates the domain score (step S211). After the calculation, the process advances to step S212.
Note that the calculated domain scores are stored in the calculation result DB 113.
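For the symmetric case (step S211), cosine similarity between two fixed-length vectors can be computed as in the sketch below. The specification does not define the asymmetric measure used in step S210, so it is not shown. Turning the similarity into a distance (one minus the similarity) is an assumption made here so that a smaller value means a closer domain.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; symmetric in its arguments."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Assumed score form: 0 for identical directions, larger when further apart."""
    return 1.0 - cosine_similarity(u, v)


target = [1.0, 2.0]
score_near = cosine_distance(target, [2.0, 4.0])   # same direction
score_far = cosine_distance(target, [2.0, -1.0])   # orthogonal direction
```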

When the process advances to step S212, the optimal model selection unit 140 selects the optimal trained model based on the calculated domain scores (step S212). For example, the optimal model selection unit 140 selects, as the optimal trained model, the trained model with the smallest score among the domain scores calculated for the respective domains.
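The selection in step S212 reduces to taking the minimum over the stored scores. The sketch below assumes the scores are held as a mapping from a hypothetical model identifier to a number; ties are resolved by insertion order, which is a choice made for the example only.

```python
from typing import Dict


def select_optimal(scores: Dict[str, float]) -> str:
    """Return the model id with the smallest domain score."""
    if not scores:
        raise ValueError("no candidate models to select from")
    # min() returns the first key encountered when several scores are equal.
    return min(scores, key=lambda model_id: scores[model_id])


# Hypothetical scores read back from the calculation result DB.
domain_scores = {"era_a_v2": 0.42, "era_b_v1": 0.17, "era_d_v1": 0.63}
best_model = select_optimal(domain_scores)
```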

As described above, the model selection device 10 according to the first embodiment includes: a domain learning unit 121 that generates, for each of different domains having classes in common, a trained model trained to identify an identification target from input data indicating that target; and an optimal model selection unit 140 that selects, from among the trained models generated for the different domains, the trained model best suited to transfer learning, based on the identification results obtained when each trained model identifies the identification target.

With this configuration, the model selection device 10 according to the first embodiment can select a trained model to be used for transfer learning after taking into account, in advance, how the classification accuracy of a classifier differs from domain to domain.
Therefore, by selecting the trained model best suited to transfer learning from among trained models of different domains, the model selection device 10 according to the first embodiment makes it possible to improve the accuracy of the transfer-learned model.

The model selection device 10 according to the first embodiment further includes a domain score calculation unit 133 that calculates, as an index for selecting the optimal trained model, the variance of the identification results for each of the trained models generated for the different domains, and the optimal model selection unit 140 selects the optimal trained model based on the calculated domain score.
With this configuration, the model selection device 10 according to the first embodiment uses the true values of the input data and the probability distribution that the trained model produces for the input data, and can thereby select the optimal model for inference on the input data with higher accuracy. It can also shorten the time required to select the optimal model.

<2. Second embodiment>
The first embodiment has been described above. Next, a second embodiment will be described with reference to FIGS. 4 and 5.
The first embodiment described an example in which the optimal trained model is selected from among the trained models that are candidates for transfer learning, based on the domain score of each trained model, but the invention is not limited to that example. The second embodiment describes an example in which the optimal trained model is selected based on ensemble learning applied to the trained models that are candidates for transfer learning.
In the following, descriptions that overlap with the first embodiment are omitted as appropriate.

<2-1. Functional configuration of the model selection device>
The functional configuration of a model selection device 10a according to the second embodiment will be described with reference to FIG. 4. FIG. 4 is a block diagram showing an example of the functional configuration of the model selection device 10a according to the second embodiment. As shown in FIG. 4, the model selection device 10a includes a storage unit 110a, a model analysis unit 120a, and an optimal model selection unit 140a.

(1) Storage unit 110a
The storage unit 110a has a function of storing various kinds of information on a storage medium similar to that of the storage unit 110 according to the first embodiment. As shown in FIG. 4, the storage unit 110a includes an input image DB 111, a model management DB 112a, and a calculation result DB 113.

(1-1) Input image DB 111
The function of the input image DB 111 according to the second embodiment is the same as that of the input image DB 111 according to the first embodiment, so its description is omitted.

(1-2) Model management DB 112a
The model management DB 112a stores trained models that are generated by the transfer learning unit 122 described later and to which ensemble learning is applied by the ensemble learning unit 123 described later.

(1-3) Calculation result DB 113
The function of the calculation result DB 113 according to the second embodiment is the same as that of the calculation result DB 113 according to the first embodiment, so its description is omitted.

(2) Model analysis unit 120a
As shown in FIG. 4, the model analysis unit 120a includes a domain learning unit 121 similar to that of the model analysis unit 120 according to the first embodiment, and further includes a transfer learning unit 122 and an ensemble learning unit 123.

(2-1) Domain learning unit 121
The function of the domain learning unit 121 according to the second embodiment is the same as that of the domain learning unit 121 according to the first embodiment, so its description is omitted.

(2-2) Transfer learning unit 122
The transfer learning unit 122 has a function of performing transfer learning on trained models. For example, the transfer learning unit 122 performs transfer learning on each of the trained models generated by the domain learning unit 121, using the image data of the transfer destination domain.

(2-3) Ensemble learning unit 123
The ensemble learning unit 123 has a function of performing ensemble learning based on the trained models generated for the different domains. For example, the ensemble learning unit 123 takes as input the trained models that have undergone transfer learning by the transfer learning unit 122 and performs ensemble learning on them.

When training the trained models to be ensembled, the ensemble learning unit 123 uses an arbitrary activation function to calculate weights as a distributed representation in the trained models. For example, the ensemble learning unit 123 uses the softmax function as the activation function and calculates the ensemble weights based on the softmax values. Note that the activation function used by the ensemble learning unit 123 is not limited to the softmax function as long as it transforms its output so that the output value becomes 1.0; other activation functions, such as the sigmoid function or the ReLU (Rectified Linear Unit) function, may be used.
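As a rough illustration of softmax-based ensemble weights, the sketch below converts one raw score per model into weights that are positive and sum to 1.0. What the raw scores represent is not specified in the text; the values used here are placeholders.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Placeholder raw scores for three transfer-learned models.
ensemble_weights = softmax([2.0, 1.0, 0.1])
```

Subtracting the maximum does not change the result but avoids overflow when the raw scores are large.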

(3) Optimal model selection unit 140a
Like the optimal model selection unit 140 according to the first embodiment, the optimal model selection unit 140a selects, from among the trained models generated by the domain learning unit 121 and stored in the model management DB 112, the trained model best suited to transfer learning, based on the identification results obtained when each trained model identifies handwritten characters. In the second embodiment, the optimal model selection unit 140a selects the optimal trained model from the model management DB 112 based on the result of the ensemble learning performed by the ensemble learning unit 123.

<2-2. Processing flow>
The functional configuration of the model selection device 10a according to the second embodiment has been described above. Next, the flow of processing according to the second embodiment will be described with reference to FIG. 5. As in the first embodiment, the following describes an example of an OCR system that takes image data containing text information as input: trained models are trained and output, and the optimal trained model is then used for new input data. Note that the image data containing text information includes, for identifying each line image, an ID, a width, a height, an x coordinate specifying where the width of the line image starts, a y coordinate specifying where the height of the line image starts, attributes of the character string, and so on, and a single piece of image data is assumed to contain character strings spanning multiple lines.
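The per-line metadata listed above could be held in a record such as the following. The class name, field names, types, and the attribute keys are assumptions chosen for illustration; the specification only lists the kinds of information carried.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class LineImage:
    """Metadata for one line image cut out of a page image (hypothetical layout)."""
    id: int        # identifier of the line image
    width: int     # width of the line image
    height: int    # height of the line image
    x: int         # x coordinate where the width of the line image starts
    y: int         # y coordinate where the height of the line image starts
    attributes: Dict[str, str] = field(default_factory=dict)  # character string attributes

    def bounding_box(self) -> Tuple[int, int, int, int]:
        """Return (left, top, right, bottom) in page coordinates."""
        return (self.x, self.y, self.x + self.width, self.y + self.height)


line = LineImage(id=1, width=200, height=32, x=10, y=48, attributes={"script": "handwritten"})
box = line.bounding_box()
```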

(1) Process of generating trained models that are transfer learning candidates
The process of generating trained models that are transfer learning candidates according to the second embodiment is the same as the process described with reference to FIG. 2 in the first embodiment, so its description is omitted.

(2) Flow of the trained model selection process
Next, an example of the flow of the trained model selection process will be described with reference to FIG. 5. FIG. 5 is a flowchart showing an example of the flow of the trained model selection process according to the second embodiment.

The processing from step S301 to step S303 shown in FIG. 5 is the same as the processing from step S201 to step S203 described with reference to FIG. 3 in the first embodiment, so its description is omitted.

When the process advances to step S304, the transfer learning unit 122 performs transfer learning, using image data (input data), on all the trained models generated by the domain learning unit 121 (step S304). The transfer-learned models (transfer models) are stored in the model management DB 112a.

Next, the ensemble learning unit 123 checks whether an algorithm is to be used for weighting (step S305). If an algorithm is used (step S305/YES), the process advances to step S306. If no algorithm is used (step S305/NO), the process advances to step S307.

When the process advances to step S306, the ensemble learning unit 123 calculates label estimation probabilities (step S306). For example, the ensemble learning unit 123 calculates the label estimation probabilities using an arbitrary activation function. Specifically, the ensemble learning unit 123 averages the label estimation probabilities calculated for each trained model and normalizes the result with an arbitrary activation function, thereby calculating the label estimation probabilities as the overall output. After the calculation, the process advances to step S311.
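The averaging in step S306 can be sketched as follows. The specification leaves the normalizing function open; this sketch substitutes a plain division by the sum, which is an assumption and not the activation function named in the text. Each inner list holds one model's probabilities over the same ordered set of labels.

```python
from typing import List, Sequence


def ensemble_probabilities(per_model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average label probabilities across models, then rescale to sum to 1.0."""
    num_models = len(per_model_probs)
    num_labels = len(per_model_probs[0])
    mean = [
        sum(probs[k] for probs in per_model_probs) / num_models
        for k in range(num_labels)
    ]
    total = sum(mean)
    # Sum-normalization stands in for the unspecified activation function.
    return [m / total for m in mean]


# Three models, three labels (toy values).
probs = ensemble_probabilities([
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.6, 0.3, 0.1],
])
predicted_label_index = probs.index(max(probs))
```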

When the process advances to step S307, the ensemble learning unit 123 acquires labels on a per-trained-model basis (step S307).
Next, according to the number of acquired labels, the ensemble learning unit 123 takes a majority vote over the label estimation results by counting the inferred results (step S308). The ensemble learning unit 123 determines the label with the most votes as the inferred label. At this time, the ensemble learning unit 123 assigns IDs in ascending order to the inferred-result counts and stores the counts in the calculation result DB 113.

Next, the ensemble learning unit 123 checks whether the vote counts in the majority vote are tied (step S309). If they are tied (step S309/YES), the process advances to step S310. If they are not tied (step S309/NO), the process advances to step S311.
When the process advances to step S310, the ensemble learning unit 123 selects the label of the inferred result with the smaller assigned ID (step S310). After the selection, the process advances to step S311.
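Steps S307 to S310 amount to a majority vote with a deterministic tie-break. In the sketch below, IDs are assigned in ascending order by the order in which each label first appears; the specification does not say what determines the ID order, so that rule is an assumption.

```python
from collections import Counter
from typing import Dict, List, Sequence


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most frequent label; on a tie, the label with the smaller ID wins."""
    counts = Counter(labels)
    # Assign IDs 0, 1, 2, ... by first appearance (assumed ordering rule).
    ids: Dict[str, int] = {}
    for label in labels:
        ids.setdefault(label, len(ids))
    top = max(counts.values())
    tied: List[str] = [label for label, count in counts.items() if count == top]
    # S309/NO: a single winner. S309/YES -> S310: pick the smallest ID among the tied labels.
    return min(tied, key=lambda label: ids[label])


clear_winner = majority_vote(["a", "b", "a"])
tie_broken = majority_vote(["b", "a", "a", "b"])
```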

When the process advances to step S311, the ensemble learning unit 123 performs ensemble learning and outputs the result (step S311). When no algorithm is used for weighting, the ensemble learning unit 123 uses a majority vote over labels as the aggregation function and outputs the ensemble accuracy. When an algorithm is used for weighting, the ensemble learning unit 123 uses the mean of the probabilities as the aggregation function and outputs the ensemble accuracy.
The optimal model selection unit 140a then selects the trained model in that ensemble as the optimal trained model (step S312).

As described above, the model selection device 10a according to the second embodiment includes: a domain learning unit 121 that generates, for each of different domains having classes in common, a trained model trained to identify an identification target from input data indicating that target; and an optimal model selection unit 140a that selects, from among the trained models generated for the different domains, the trained model best suited to transfer learning, based on the identification results obtained when each trained model identifies the identification target.

With this configuration, the model selection device 10a according to the second embodiment can select a trained model to be used for transfer learning after taking into account, in advance, how the classification accuracy of a classifier differs from domain to domain.
Therefore, by selecting the trained model best suited to transfer learning from among trained models of different domains, the model selection device 10a according to the second embodiment makes it possible to improve the accuracy of the transfer-learned model.

The model selection device 10a according to the second embodiment further includes an ensemble learning unit 123 that performs ensemble learning based on the trained models generated for the different domains, and the optimal model selection unit 140a selects the optimal trained model based on the result of the ensemble learning.
With this configuration, the model selection device 10a according to the second embodiment evaluates the error between the true value of the item sought from the input data and the target variable obtained by feeding that input data to a trained model and running inference, and can thereby select the optimal model for inference on the input data with higher accuracy. It can also shorten the time required to select the optimal model.

The embodiments of the present invention have been described above. Some or all of the functions of the model selection devices 10 and 10a in the embodiments described above may be realized by a computer. In that case, a program for realizing these functions may be recorded on a computer-readable recording medium, and the functions may be realized by having a computer system read and execute the program recorded on that medium. The "computer system" referred to here includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The "computer-readable recording medium" may also include media that hold a program dynamically for a short time, such as a communication line used when a program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and media that hold a program for a certain period, such as volatile memory inside the computer system that serves as the server or client in that case.
The program may realize only part of the functions described above, may realize those functions in combination with a program already recorded in the computer system, or may be realized using a programmable logic device such as an FPGA (Field Programmable Gate Array).

The embodiments of this invention have been described above in detail with reference to the drawings, but the specific configuration is not limited to those described, and various design changes and the like can be made without departing from the gist of this invention.

DESCRIPTION OF SYMBOLS 10, 10a... Model selection device, 110, 110a... Storage unit, 111... Input image DB, 112, 112a... Model management DB, 113... Calculation result DB, 120, 120a... Model analysis unit, 121... Domain learning unit, 122... Transfer learning unit, 123... Ensemble learning unit, 130... Model evaluation unit, 131... Probability distribution calculation unit, 132... Information matrix calculation unit, 133... Domain score calculation unit, 140, 140a... Optimal model selection unit

Claims

A model selection device comprising:
a domain learning unit that generates, for each of different domains having classes in common, a trained model trained to identify an identification target from input data indicating the identification target; and
an optimal model selection unit that selects, from among the trained models generated for the different domains, a trained model optimal for adaptation to transfer learning, based on identification results obtained when each trained model identifies the identification target.

The model selection device according to claim 1, further comprising
a domain score calculation unit that calculates, as an index for selecting the optimal trained model, the variance of the identification results for each trained model generated for the different domains,
wherein the optimal model selection unit selects the optimal trained model based on the calculated variance.

The model selection device according to claim 1, further comprising
an ensemble learning unit that performs ensemble learning based on the trained models generated for the different domains,
wherein the optimal model selection unit selects the optimal trained model based on the result of the ensemble learning.

The model selection device according to claim 3, further comprising
a transfer learning unit that performs transfer learning for a transfer destination domain on each of the trained models generated for the different domains,
wherein the ensemble learning unit performs the ensemble learning on the trained models that have undergone the transfer learning.

The model selection device according to claim 3,
wherein, when training the trained models to be ensembled in the ensemble learning, the ensemble learning unit calculates weights as a distributed representation in the trained models using an arbitrary activation function.

The model selection device according to claim 5,
wherein the ensemble learning unit calculates the weights of the ensemble based on softmax values.

The model selection device according to claim 1,
wherein the input data is image data showing characters to be identified, and
the trained models generated for the respective domains differ in the era in which the characters to be identified were written and are trained using image data showing characters written with a writing instrument.

The model selection device according to claim 1,
wherein the domain learning unit has correspondence map data indicating correspondence relationships between different classes, and performs, on selected classes, a process of integrating classes between trained models based on the correspondence map data.

The model selection device according to claim 1,
wherein, when the input data is given as line images, line-by-line images divided at each domain break are treated as the input data, and when the input data is not given as line images, character-by-character images are treated as the input data.

A model selection method comprising:
a domain learning step in which a domain learning unit generates, for each of different domains having classes in common, a trained model trained to identify an identification target from input data indicating the identification target; and
an optimal model selection step in which an optimal model selection unit selects, from among the trained models generated for the different domains, a trained model optimal for adaptation to transfer learning, based on identification results obtained when each trained model identifies the identification target.

A program that causes a computer to function as:
domain learning means for generating, for each of different domains having classes in common, a trained model trained to identify an identification target from input data indicating the identification target; and
optimal model selection means for selecting, from among the trained models generated for the different domains, a trained model optimal for adaptation to transfer learning, based on identification results obtained when each trained model identifies the identification target.