JP7361999B1

JP7361999B1 - Machine learning device, machine learning system, machine learning method, and machine learning program

Info

Publication number: JP7361999B1
Application number: JP2023532817A
Authority: JP
Inventors: 翔貴宮川; 雄一佐々木
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2022-02-04
Filing date: 2022-02-04
Publication date: 2023-10-16
Anticipated expiration: 2042-02-04
Also published as: JPWO2023148914A1; WO2023148914A1; TW202333093A

Abstract

A machine learning device (10) includes: a learning unit (11, 15a) that performs normal learning, in which a learning model for inferring a region of interest (B) of input data is generated based on learning data (L), and transfer learning, in which the learning model is updated; a region-of-interest update unit (18) that, during the normal learning, performs generation processing of generating a region of interest of the input data using the learning model and storing the generated region of interest in a pre-update region-of-interest storage unit (13) as a pre-update region of interest, and that, during or before the transfer learning, performs update processing of updating the region of interest of the input data from the pre-update region of interest (Bi) to a post-update region of interest (Ai) using input human knowledge and storing it in a post-update region-of-interest storage unit (14); a region-of-interest update evaluation unit (15) that calculates a similarity distance (di) indicating the change in the region of interest caused by the update processing; and a data dynamic selection unit (16) that, during or before the transfer learning, dynamically selects the input data to be output to the region-of-interest update unit (18) based on the similarity distance.

Description

The present disclosure relates to a machine learning device, a machine learning system, a machine learning method, and a machine learning program.

In deep learning, introducing an attention map, which indicates the region of interest that a learning model should attend to, improves prediction accuracy in inference using the learning model. The region of interest is acquired through training of the learning model, but performing transfer learning in which the region of interest is corrected manually (that is, human-in-the-loop learning) can improve the performance of the learning model, for example its prediction accuracy and interpretability (see, for example, Patent Documents 1 and 2).

Japanese Unexamined Patent Application Publication No. 2021-022368 (JP2021-022368A)
International Publication No. 2020/085336

However, in transfer learning, it is not clear which data, out of a large amount of data, should have its region of interest updated manually in order to improve the performance of the learning model efficiently. Therefore, in transfer learning in which the regions of interest of a large amount of data such as video are updated manually, it is desirable to improve the performance of the learning model with little effort (that is, to make transfer learning more efficient).

An object of the present disclosure is to provide a machine learning device, a machine learning system, a machine learning method, and a machine learning program that make it possible to efficiently improve the performance of a learning model generated in transfer learning.

The machine learning device of the present disclosure includes: a learning unit that performs normal learning, in which a learning model for inferring a region of interest of input data is generated based on learning data collected in advance, and transfer learning, in which the learning model is updated; a region-of-interest update unit that, during the normal learning, performs generation processing of generating the region of interest of the input data using the learning model and storing the generated region of interest in a pre-update region-of-interest storage unit as a pre-update region of interest, and that, during or before the transfer learning, performs update processing of updating the region of interest of the input data from the pre-update region of interest to a post-update region of interest using input human knowledge and storing it in a post-update region-of-interest storage unit; a region-of-interest update evaluation unit that calculates a similarity distance indicating the change in the region of interest caused by the update processing; and a data dynamic selection unit that, during or before the transfer learning, dynamically selects the input data to be output to the region-of-interest update unit based on the similarity distance.

The machine learning method of the present disclosure is a method implemented by a machine learning device that performs normal learning, in which a learning model for inferring a region of interest of input data is generated based on learning data collected in advance, and transfer learning, in which the learning model is updated. The method includes: a step of performing, during the normal learning, generation processing of generating the region of interest of the input data using the learning model and storing the generated region of interest in a pre-update region-of-interest storage unit as a pre-update region of interest, and performing, during or before the transfer learning, update processing of updating the region of interest of the input data from the pre-update region of interest to a post-update region of interest using input human knowledge and storing it in a post-update region-of-interest storage unit; a step of calculating a similarity distance indicating the change in the region of interest caused by the update processing; and a step of dynamically selecting, during or before the transfer learning, the input data to be subjected to the transfer learning based on the similarity distance.

With the machine learning device, machine learning system, machine learning method, and machine learning program of the present disclosure, the performance of a learning model generated in transfer learning can be improved efficiently.

FIG. 1 is a functional block diagram schematically showing the configuration of the machine learning device according to Embodiment 1 (including the configuration of the task learning unit, which is a supervised learning unit).
FIG. 2 is a functional block diagram schematically showing the configuration of the machine learning device according to Embodiment 1 (including the configuration of the dynamic selection/update unit).
FIG. 3 is a diagram showing the operation of the region-of-interest update evaluation unit of the machine learning device.
FIG. 4 is a diagram showing the operation of the region-of-interest update evaluation unit of the machine learning device.
FIG. 5 is a diagram showing the operation of the data dynamic selection unit of the machine learning device.
FIGS. 6(A) and 6(B) are diagrams showing the operation of the exploration/exploitation adjustment unit of the machine learning device.
FIGS. 7(A) and 7(B) are diagrams showing the operation of the exploration/exploitation adjustment unit of the machine learning device.
FIG. 8 is a flowchart showing the operation of the machine learning device according to Embodiment 1.
FIG. 9 is a diagram showing an example of the hardware configuration of the machine learning device according to Embodiment 1.
FIG. 10 is a functional block diagram schematically showing the configuration of the machine learning device according to Embodiment 2.
FIG. 11 is a flowchart showing the operation of the machine learning device according to Embodiment 2.
FIG. 12 is a functional block diagram schematically showing the configuration of the machine learning device according to Embodiment 3.
FIG. 13 is a flowchart showing the operation of the machine learning device according to Embodiment 3.
FIG. 14 is a functional block diagram schematically showing the configuration of the machine learning device and the machine learning system according to Embodiment 4.
FIG. 15 is a flowchart showing the operation of the machine learning device and the machine learning system according to Embodiment 4.

A machine learning device, a machine learning system, a machine learning method, and a machine learning program according to embodiments are described below with reference to the drawings. The following embodiments are merely examples; the embodiments can be combined as appropriate, and each embodiment can be modified as appropriate.

Embodiment 1.
FIG. 1 is a functional block diagram schematically showing the configuration of a machine learning device 10 according to Embodiment 1 (including the configuration of a task learning unit 11, which is a supervised learning unit). The machine learning device 10 is, for example, a learning device capable of performing transfer learning that incorporates human knowledge. The machine learning device 10 is, for example, a computer. The machine learning device 10 is a device capable of implementing the machine learning method according to Embodiment 1. The machine learning device 10 includes the task learning unit 11 and a dynamic selection/update unit 19. The machine learning device 10 is connected to a learning data storage unit 12, a pre-update region-of-interest storage unit 13, and a post-update region-of-interest storage unit 14. The machine learning device 10, the learning data storage unit 12, the pre-update region-of-interest storage unit 13, and the post-update region-of-interest storage unit 14 constitute a machine learning system. The learning data storage unit 12, the pre-update region-of-interest storage unit 13, and the post-update region-of-interest storage unit 14 may be provided in a storage device inside the machine learning device 10, and may be different storage areas of a common storage device.

The task learning unit 11 includes a learning model generation unit 11a and a learning model storage unit 11b. The learning model generation unit 11a acquires learning data L including input data and correct answer data, uses the learning data L to generate a learning model M for inferring a region of interest from actual input data (for example, thermal image data input in the inference process), and stores the learning model M in the learning model storage unit 11b. When performing transfer learning, the learning model generation unit 11a generates the learning model M using the learning data L and the post-update regions of interest A, and stores it in the learning model storage unit 11b. The machine learning device 10 may also include an inference unit that acquires input data and outputs an inference result obtained from the input data using the learning model M. In this case, the machine learning device 10 is a machine learning and inference device. The inference unit may instead be provided in a computer separate from the machine learning device 10.

The learning data storage unit 12 stores learning data L collected in advance according to the purpose of the machine learning, and provides the learning data L to the task learning unit 11. The learning data storage unit 12 may be provided in a storage device external to the machine learning device 10.

The dynamic selection/update unit 19 acquires the learning model M from the task learning unit 11, acquires the pre-update regions of interest B from the pre-update region-of-interest storage unit 13, and supplies the post-update regions of interest to the task learning unit 11 via the post-update region-of-interest storage unit 14.

When transfer learning is performed, the dynamic selection/update unit 19 supplies to the post-update region-of-interest storage unit 14 the post-update region of interest Ai, which is obtained when a human (that is, a user 50 acting as an intervener) corrects the region of interest generated by the learning model M. The post-update regions of interest A accumulated in the post-update region-of-interest storage unit 14 are supplied to the task learning unit 11. Here, "Ai" denotes the post-update region of interest of the i-th data item, that is, the selected post-update region of interest, and "A" denotes all post-update regions of interest accumulated in the post-update region-of-interest storage unit 14. Note that i is information for identifying a data item and is, for example, a positive integer.

FIG. 2 is a functional block diagram schematically showing the configuration of the machine learning device 10 (including the configuration of the dynamic selection/update unit 19). The dynamic selection/update unit 19 includes a region-of-interest update evaluation unit 15, a data dynamic selection unit 16, an exploration/exploitation adjustment unit 17, and a region-of-interest update unit 18.

The machine learning device 10 includes a learning unit (a similarity learning unit 15a and the task learning unit 11) that performs normal learning, in which the learning model M for inferring a region of interest of input data is generated based on the learning data L collected in advance, and transfer learning, in which the learning model M is updated. The dynamic selection/update unit 19 includes the region-of-interest update unit 18. During the normal learning, the region-of-interest update unit 18 performs generation processing of generating a region of interest of the input data using the learning model generated by the learning unit and storing the generated region of interest in the pre-update region-of-interest storage unit 13 as a pre-update region of interest. During or before the transfer learning, it performs update processing of updating the region of interest of the input data from the pre-update region of interest Bi to the post-update region of interest Ai using input human knowledge and storing it in the post-update region-of-interest storage unit 14. The dynamic selection/update unit 19 also includes the region-of-interest update evaluation unit 15, which calculates a similarity distance d^i indicating the change in the region of interest caused by the update processing, and the data dynamic selection unit 16, which, during or before the transfer learning, dynamically selects the input data to be output to the region-of-interest update unit 18 based on the similarity distance d^i.

The learning data storage unit 12 stores the learning data L collected for a specific task. For example, in the case of an image classification task, the learning data storage unit 12 stores sets (that is, a learning data set) of image data (for example, images of people or medical images) and the class number corresponding to each image. The learning data L is not limited to image data and may be natural language data, table data, or the like.

The task learning unit 11 acquires the learning data L as input and trains the learning model M based on the learning data L. During transfer learning, the task learning unit 11 acquires the learning data L and the post-update regions of interest A as input, and outputs the pre-update regions of interest B during or after training. In other words, the task learning unit 11 can output the pre-update regions of interest B either while the learning model M is being trained or after it has been trained. To output the pre-update regions of interest B during training, the task learning unit 11 may use a known method called Attention Branch Network (ABN). To output the pre-update regions of interest B after training, the task learning unit 11 may use a known method that generates regions of interest post hoc, such as Local Interpretable Model-agnostic Explanations (LIME) or Class Activation Mapping (CAM).

The pre-update region-of-interest storage unit 13 stores the region of interest corresponding to each data item (pre-update region of interest B). The post-update region-of-interest storage unit 14 stores the post-update region of interest A corresponding to each data item. The selected pre-update region of interest is denoted Bi, and the selected post-update region of interest is denoted Ai. The data structure of the pre-update regions of interest B and the post-update regions of interest A is not restricted, but a structure that a human (that is, the user 50) can interpret and edit is desirable. One example of such a data structure is a heat map. The pre-update regions of interest B and the post-update regions of interest A may be stored in the same storage device.

The region-of-interest update unit 18 includes, or is connected to, a user interface (UI) with which the user 50, acting as an intervener, performs the update work. The UI is, for example, an input device such as a keyboard, a mouse, a touch panel, or a paint tool for drawing. For example, on an image shown on a display, the region-of-interest update unit 18 repeatedly performs operations that delete part of a region of interest (that is, shrink it), add a region of interest (that is, expand it), or both delete and add (including, for example, moving a region of interest). In doing so, each of a plurality of regions of interest can be weighted. A configuration may also be provided for introducing a threshold for decisions on the weighted data and for performing interpolation using the weighted data. Processing that weights each of a plurality of regions of interest is described, for example, in Patent Document 2. After the update of the region of interest is complete, the updated region of interest is sent to the post-update region-of-interest storage unit 14 and the region-of-interest update evaluation unit 15 as the post-update region of interest Ai.
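The delete, add, and move operations described above can be sketched on a heat map as follows. This is a minimal illustration only: the representation of a region of interest as a 2D list of weights in [0, 1], the rectangular edit areas, and the function names `erase_region`, `add_region`, and `move_region` are all assumptions made for this sketch and do not come from the disclosure.

```python
def erase_region(heatmap, top, left, bottom, right):
    """Return a copy with rows top..bottom-1 and cols left..right-1 set to 0 (shrink)."""
    out = [row[:] for row in heatmap]
    for r in range(top, bottom):
        for c in range(left, right):
            out[r][c] = 0.0
    return out


def add_region(heatmap, top, left, bottom, right, weight=1.0):
    """Return a copy where the rectangle has at least the given weight (expand)."""
    out = [row[:] for row in heatmap]
    for r in range(top, bottom):
        for c in range(left, right):
            out[r][c] = max(out[r][c], weight)
    return out


def move_region(heatmap, src, dst, weight=1.0):
    """Delete the src rectangle and add the dst rectangle (delete plus add)."""
    return add_region(erase_region(heatmap, *src), *dst, weight=weight)


# Hypothetical pre-update region of interest Bi with two highlighted blocks.
before = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
# The user deletes the upper-left block, then adds a weaker block at the upper right.
after = erase_region(before, 0, 0, 2, 2)
after_weighted = add_region(after, 0, 2, 2, 4, weight=0.5)
```

Each function returns a new heat map rather than editing in place, so the pre-update region of interest stays available for the later comparison with the post-update one.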

The region-of-interest update evaluation unit 15 calculates the similarity distance d^i between the pre-update region of interest Bi and the post-update region of interest Ai, and evaluates the validity of the pre-update region of interest Bi. The simplest way to calculate the similarity distance d^i for the i-th data item is to compute the squared error between the raw data of the pre-update region of interest and that of the post-update region of interest. However, for unstructured data such as images and natural language, this cannot capture semantic differences, so the similarity distance d^i is calculated using intermediate features acquired by the learning model M instead of the raw data. Any distance function, such as cosine similarity or Mahalanobis distance, can be used to calculate the similarity distance d^i. A specific example of the operation of the region-of-interest update evaluation unit 15 is described later with reference to FIGS. 3 and 4.
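The two options mentioned above can be sketched as follows. This is an illustration under stated assumptions: heat maps are 2D lists, intermediate features are flat lists of floats, and cosine similarity is converted to a distance as 1 minus the similarity so that a larger value means a larger change. That conversion is a choice made for this sketch; the text only names cosine similarity as one possible function.

```python
import math


def squared_error(before, after):
    """Sum of squared differences between two equally sized heat maps (raw data)."""
    return sum(
        (b - a) ** 2
        for row_b, row_a in zip(before, after)
        for b, a in zip(row_b, row_a)
    )


def cosine_distance(h_before, h_after):
    """1 - cosine similarity of two feature vectors; 0 means identical direction."""
    dot = sum(x * y for x, y in zip(h_before, h_after))
    norm = math.sqrt(sum(x * x for x in h_before)) * math.sqrt(
        sum(y * y for y in h_after)
    )
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


# Raw-data comparison: one pixel of the region of interest was removed.
d_raw = squared_error([[1.0, 0.0]], [[0.0, 0.0]])

# Feature comparison with hypothetical intermediate features.
d_same = cosine_distance([1.0, 0.0], [1.0, 0.0])        # unchanged region
d_orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])  # completely different
```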

The data dynamic selection unit 16 uses the correspondence, obtained by the region-of-interest update evaluation unit 15, between the intermediate features of the pre-update regions of interest and the similarity distances d^i to dynamically select the data to be updated next based on the update history of the user 50. First, the similarity learning unit 15a trains a new learning model that takes this correspondence as its input and output. Here, the similarity learning unit 15a trains a Bayesian model, such as a Gaussian process regression model, so that the prediction of the similarity distance d^i yields not only a mean Av but also a deviation D. Finally, the data dynamic selection unit 16 selects the data point that maximizes an acquisition function based on the mean Av and the deviation D of the similarity distance d^i. This processing can be combined with a static data selection method, and the data selected at one time may be a single data item or multiple data items. A specific example of the operation of the data dynamic selection unit 16 is described later with reference to FIG. 5.
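To make the role of the mean Av and the deviation D concrete, the following is a minimal Gaussian process regression sketch in plain Python. It assumes a zero-mean prior, an RBF kernel, and a small noise term; the kernel choice, the parameter values, and the names `rbf`, `solve`, and `gp_predict` are assumptions of this sketch, not details given in the disclosure.

```python
import math


def rbf(a, b, length=1.0):
    """RBF kernel between two feature vectors."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq / (2.0 * length ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def gp_predict(train_h, train_d, query_h, noise=1e-6, length=1.0):
    """Posterior mean and standard deviation of d at query_h."""
    k_train = [
        [rbf(a, b, length) + (noise if i == j else 0.0)
         for j, b in enumerate(train_h)]
        for i, a in enumerate(train_h)
    ]
    k_star = [rbf(a, query_h, length) for a in train_h]
    alpha = solve(k_train, train_d)
    mean = sum(k * w for k, w in zip(k_star, alpha))
    v = solve(k_train, k_star)
    var = rbf(query_h, query_h, length) - sum(k * w for k, w in zip(k_star, v))
    return mean, math.sqrt(max(var, 0.0))


# Hypothetical update history: features of already updated data and their d^i.
train_h = [[0.0], [1.0]]
train_d = [0.2, 0.9]
mean_near, dev_near = gp_predict(train_h, train_d, [0.0])  # close to known data
mean_far, dev_far = gp_predict(train_h, train_d, [5.0])    # far from known data
```

Near an already updated data point the predicted deviation is close to zero, while far from all updated data it approaches the prior value of 1, which is what lets an acquisition function distinguish well-understood data from unexplored data.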

The exploration/exploitation adjustment unit 17 adjusts a hyperparameter H of the acquisition function so that data is selected with a balance between "low validity of the pre-update region of interest" and "low similarity to previously updated data". A specific example of the operation of the exploration/exploitation adjustment unit 17 is described later with reference to FIGS. 6 and 7. The exploration/exploitation adjustment unit 17 supplies the data dynamic selection unit 16 with the hyperparameter H, which indicates whether exploration or exploitation is emphasized for the learning data L acquired in advance. During transfer learning, the data dynamic selection unit 16 dynamically selects the input data to be output to the region-of-interest update unit 18 based on the similarity distance d^i and the hyperparameter H. During transfer learning, the exploration/exploitation adjustment unit 17 preferably first sets the hyperparameter to a value that emphasizes exploration and then gradually changes it to a value that emphasizes exploitation.
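One way to realize the gradual shift from exploration to exploitation is a simple schedule over update rounds. The sketch below is an assumption-laden illustration: the linear form, the end values 3.0 and 0.1, the convention that a larger H means more exploration, and the name `hyperparameter_schedule` are all chosen for this example only.

```python
def hyperparameter_schedule(step, total_steps, h_explore=3.0, h_exploit=0.1):
    """Linearly move H from an exploration-oriented value to an exploitation-oriented one.

    step is the zero-based index of the current update round.
    """
    if total_steps <= 1:
        return h_exploit
    ratio = min(max(step / (total_steps - 1), 0.0), 1.0)
    return h_explore + (h_exploit - h_explore) * ratio


# H for each of five hypothetical update rounds.
schedule = [hyperparameter_schedule(t, 5) for t in range(5)]
```

A user-controlled slider could replace this schedule entirely; the function only illustrates the recommended direction of change.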

FIG. 3 is a diagram showing the operation of the region-of-interest update evaluation unit 15. In transfer learning, the difference between the post-update region of interest Ai (regions 62a and 62b, or region 63, in FIG. 3), which is obtained when the region-of-interest update unit 18 updates the pre-update region of interest Bi (regions 61a and 61b in FIG. 3), and the region of interest newly generated by the learning model M (regions 64a and 64b in FIG. 3) serves as a penalty. Therefore, to promote improvement in the performance of the learning model M through transfer learning (that is, efficient improvement in prediction accuracy and interpretability), it is desirable to give priority to updating data whose pre-update region of interest Bi has low validity (that is, data for which the update work causes a large change). In Embodiment 1, data whose pre-update region of interest Bi has low validity is selected dynamically, so that the performance of the learning model M can be improved efficiently with update work on fewer data items.

In FIG. 3, when the post-update region of interest Ai (correct answer data) is Example 1, comparing regions 62a and 62b with regions 64a and 64b, which are the regions of interest generated by the learning model M, shows that they lie at similar positions and have similar sizes. In this case, the similarity distance d^i between the two is small, the penalty (the difference between the two) is small, and the validity of the pre-update region of interest Bi is high, so there is little need to update the pre-update region of interest Bi manually.

In FIG. 3, when the post-update region of interest Ai (correct answer data) is Example 2, comparing region 63 with regions 64a and 64b, which are the regions of interest generated by the learning model M, shows that region 63 and region 64b are similar to each other, but the post-update region of interest Ai of Example 2 contains no region corresponding to region 64a. In this case, the similarity distance d^i between the two is large, the penalty (the difference between the two) is large, and the validity of the pre-update region of interest Bi is low, so there is a strong need to update the pre-update region of interest Bi manually.

As FIG. 3 shows, in Embodiment 1, data whose pre-update region of interest Bi has low validity is dynamically selected during transfer learning as the target of manual updating, so that the performance of the learning model M can be improved efficiently.

FIG. 4 is a diagram showing the operation of the region-of-interest update evaluation unit 15 of the machine learning device 10. FIG. 4 shows that the pre-update region of interest Bi (= x^i_before), which is local data containing "a horse and a person", is updated by the region-of-interest update unit 18 to the post-update region of interest Ai (= x^i_after), which is local data containing "a horse"; that the learning model M yields the intermediate feature h^i_before of the pre-update region of interest Bi and the intermediate feature h^i_after of the post-update region of interest Ai; and that the similarity distance d^i is obtained from these intermediate features. In other words, the region-of-interest update evaluation unit 15 quantifies the change in the region of interest caused by the update work (that is, the update process) of the region-of-interest update unit 18 as the similarity distance d^i between the intermediate features extracted by the learning model M. The similarity distance d^i is a value representing the validity of the pre-update region of interest Bi (= x^i_before). An intermediate feature is a feature of the region-of-interest data (local data) extracted from the image data using a mask. The similarity distance d^i can be expressed by the following equation (1) as a distance function f of the intermediate feature h^i_before of the pre-update region of interest Bi and the intermediate feature h^i_after of the post-update region of interest Ai given from the region-of-interest update unit 18.

$$d^{i} = f\left(h^{i}_{\mathrm{before}},\, h^{i}_{\mathrm{after}}\right) \tag{1}$$

Here, the distance function f is, for example, cosine similarity.
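The distance computation described for FIG. 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the text names cosine similarity only as an example of f, the conversion to a distance as 1 minus the cosine similarity is an assumption made here, and the function names are hypothetical. The intermediate features are taken to be plain lists of numbers already extracted by the learning model M.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def similarity_distance(h_before, h_after):
    """Assumed form of d^i: 1 - cos(h_before, h_after).

    The value is 0 when the two intermediate features point in the same
    direction and grows as the update changes the region of interest more.
    """
    return 1.0 - cosine_similarity(h_before, h_after)
```

Under this assumed form, a large d^i indicates that the manual update changed the region of interest substantially, which the description treats as low validity of the pre-update region.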

FIG. 5 is a diagram showing the operation of the data dynamic selection unit 16. For the unupdated data, the data dynamic selection unit 16 estimates the validity of the pre-update region of interest (that is, the expected value and variance of the similarity distance d^i) and selects the data point to be updated next by Bayesian optimization. For example, the data dynamic selection unit 16 computes the acquisition function for all unupdated data and selects the data with the largest value.

FIGS. 6(A) and 6(B) are diagrams showing the operation of the search utilization adjustment unit 17. FIG. 6(A) shows an example of Bayesian optimization. FIG. 6(B) shows an example of the upper confidence bound (UCB), which is an acquisition function. By adjusting the UCB hyperparameter, the search utilization adjustment unit 17 selects data that balances "low validity of the pre-update region of interest" against "low similarity to previously updated data".

In Bayesian optimization, the search utilization adjustment unit 17 uses the unupdated data that maximizes the acquisition function as the data to be presented to the user next. Here, changing the hyperparameter β_t of the acquisition function α_t(x) controls the balance between exploration (favoring data with a large deviation) and exploitation (favoring data with a large mean). The user 50 adjusts the hyperparameter β_t through a UI element such as a slider and proceeds with the update work while balancing exploration and exploitation. For example, by gradually decreasing the value of β_t, data can be selected so that exploration is weighted more heavily (that is, exploitation less heavily) early in the work and exploitation is weighted more heavily (that is, exploration less heavily) toward the end of the work. The operation of the search utilization adjustment unit 17 can be expressed by the following equation (2).

In equation (2), x denotes a Gaussian random variable, α_t(x) denotes the Gaussian process UCB (GP-UCB) used as the acquisition function, and x_next denotes the point at which α_t(x) is maximized. Further, μ_t(x) denotes the predicted mean of x, β_t denotes the hyperparameter, and σ_t(x) denotes the predicted deviation of x. Bringing β_t close to 0 gives a setting that emphasizes exploitation, and increasing β_t gives a setting that emphasizes exploration. The subscript t denotes the iteration number and is omitted in the figures.
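The selection rule and the gradual decrease of β_t can be sketched as below. Equation (2) is not reproduced in this text, so the form α_t(x) = μ_t(x) + β_t σ_t(x) used here is an assumption (GP-UCB is often written in the literature with the square root of β_t instead). The predicted means and deviations are hand-written placeholders rather than outputs of a Gaussian process, and all function names, item names, and default values are hypothetical. The sketch follows the convention in the description of FIG. 7, where β_t near 0 stresses exploitation and a large β_t stresses exploration.

```python
def ucb(mean, deviation, beta):
    """Assumed acquisition value: alpha_t(x) = mu_t(x) + beta_t * sigma_t(x)."""
    return mean + beta * deviation


def select_next(candidates, beta):
    """Return the id of the unupdated item with the largest acquisition value.

    candidates maps an item id to (predicted mean, predicted deviation)
    of the similarity distance d^i for that item.
    """
    if not candidates:
        raise ValueError("no unupdated data left to select")
    return max(candidates, key=lambda k: ucb(candidates[k][0], candidates[k][1], beta))


def beta_schedule(step, total_steps, beta_start=2.0, beta_end=0.1):
    """Linearly decay beta from beta_start (first step) to beta_end (last step)."""
    if total_steps < 1:
        raise ValueError("total_steps must be at least 1")
    if total_steps == 1:
        return beta_start
    step = min(max(step, 0), total_steps - 1)
    fraction = step / (total_steps - 1)
    return beta_start + (beta_end - beta_start) * fraction


# Placeholder predictions for two unupdated video frames.
candidates = {
    "frame_a": (0.80, 0.05),  # high predicted distance, low uncertainty
    "frame_b": (0.30, 0.60),  # low predicted distance, high uncertainty
}
```

With these placeholder values, β_t = 2 selects `frame_b` (the uncertain item) and β_t = 0 selects `frame_a` (the item with the largest predicted distance), so a decaying schedule moves the selection from exploration toward exploitation.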

FIGS. 7(A) and 7(B) are diagrams showing the operation of the search utilization adjustment unit 17. Hyperparameters cannot be tuned intuitively, so this tuning is not easy for a person to perform. If exploitation is emphasized to an extreme (in FIG. 7(A), if the hyperparameter β_t is brought close to 0), only unupdated data that is highly similar to already updated data (for a video, for example, the frames just before and after an updated frame) is selected, and the work of updating the region of interest becomes redundant. Conversely, if exploration is emphasized to an extreme (in FIG. 7(A), if the hyperparameter β_t is made excessively large), data with high validity (that is, data that the user 50 has little need to update) becomes more likely to be selected, and the performance of the generated learning model cannot be improved efficiently. The search utilization adjustment unit 17 therefore introduces a new hyperparameter β_t in order to exclude such unsuitable unupdated data from selection. Specifically, as shown in FIG. 7(B), the proportion of unupdated data to be excluded from selection is determined for each of the predicted mean μ_t(x) and the predicted deviation σ_t(x) of the similarity distance d^i. In FIG. 7(B), the upper limit of the hyperparameter β_t is determined by excluding the data in the lower a% (a is a set value) of the predicted mean μ_t(x). Likewise, the lower limit of the hyperparameter β_t is determined by excluding the data in the upper b% (b is a set value) of the predicted deviation σ_t(x). This is equivalent to restricting the range of values that the hyperparameter β_t can take, which in turn determines the range of data that the acquisition function α_t(x) can select.
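The exclusion step described for FIG. 7(B) can be illustrated by filtering the candidate set directly. This is a hedged sketch: the text expresses the exclusion as upper and lower limits on β_t, whereas the code below simply removes the excluded items before the acquisition function is evaluated. The function name, the data layout, and the truncation of fractional counts toward zero are all assumptions.

```python
def filter_candidates(candidates, a_percent, b_percent):
    """Drop the lowest a% by predicted mean and the highest b% by predicted deviation.

    candidates maps an item id to (predicted mean, predicted deviation)
    of the similarity distance d^i. Fractional counts are truncated.
    """
    n = len(candidates)
    n_low_mean = int(n * a_percent / 100)  # items dropped for low predicted mean
    n_high_dev = int(n * b_percent / 100)  # items dropped for high predicted deviation
    by_mean = sorted(candidates, key=lambda k: candidates[k][0])
    by_dev = sorted(candidates, key=lambda k: candidates[k][1], reverse=True)
    excluded = set(by_mean[:n_low_mean]) | set(by_dev[:n_high_dev])
    return {k: v for k, v in candidates.items() if k not in excluded}
```

Items with a low predicted mean are likely to have valid regions already, and items with a very high predicted deviation are the ones an extreme exploration setting would keep picking, so removing both groups narrows what the acquisition function can select.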

FIG. 8 is a flowchart showing the operation of the machine learning device 10. First, the learning unit performs supervised learning using the learning data L to generate the learning model M (step S101). The dynamic selection/update unit 19 generates regions of interest for the learning data L using the learning model M (step S102). Because this is initially the first update (YES in step S103), the dynamic selection/update unit 19 selects data statically (step S108), updates the region of interest (step S106), and evaluates the update result (step S107).

Next, if updating is to continue (YES in step S109), the dynamic selection/update unit 19 adjusts the hyperparameter (step S104), dynamically selects data (step S105), updates the region of interest of the dynamically selected data (step S106), and evaluates the update result (step S107).

Once updating of the dynamically selected data is complete, the dynamic selection/update unit 19 does not continue updating (NO in step S109) and determines whether to perform transfer learning (step S110). When transfer learning is performed (YES in step S110), the task learning unit 11 performs supervised learning using the learning data L and the post-update regions of interest A to update the learning model M (step S101). The dynamic selection/update unit 19 then generates regions of interest for all data using the updated learning model M (step S102). Next, the dynamic selection/update unit 19 repeats hyperparameter adjustment (step S104), dynamic data selection (step S105), updating of the region of interest of the selected data (step S106), and evaluation of the update result (step S107) until updating ends (NO in step S109). Furthermore, the dynamic selection/update unit 19 repeats the processing of steps S101 to S107 and S109 until transfer learning ends (NO in step S110).
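The control flow of FIG. 8 described above can be summarized as a pair of nested loops. The sketch below only mirrors the order of the steps: every callable passed in is a hypothetical stand-in for the corresponding unit, hyperparameter adjustment (step S104) is assumed to happen inside `select_dynamic`, and nothing here reflects how the actual units are implemented.

```python
def run_workflow(train, generate_regions, select_static, select_dynamic,
                 update_region, evaluate_update, continue_update, do_transfer):
    """Run steps S101 to S110 in the order described for FIG. 8."""
    model = train(None)                          # S101: normal learning
    first_update = True
    while True:
        regions = generate_regions(model)        # S102: generate regions of interest
        while True:
            if first_update:                     # S103: first update?
                item = select_static(regions)    # S108: static selection
                first_update = False
            else:
                item = select_dynamic(regions)   # S104 and S105: adjust, then select
            updated = update_region(item)        # S106: manual update of the region
            evaluate_update(item, updated)       # S107: similarity distance
            if not continue_update():            # S109: continue updating?
                break
        if not do_transfer():                    # S110: perform transfer learning?
            return model
        model = train(model)                     # S101 again: transfer learning
```

The inner loop corresponds to the repeated update work and the outer loop to repeated transfer learning; only the very first pass through the inner loop uses static selection.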

FIG. 9 is a diagram showing an example of the hardware configuration of the machine learning device 10. The machine learning device 10 includes a processor 101, a memory 102 that is a volatile storage device, a nonvolatile storage device 103 such as a hard disk drive (HDD) or a solid state drive (SSD), and an interface 104. The memory 102 is, for example, a semiconductor memory such as a RAM (random access memory). The machine learning device 10 may also include a communication device that communicates with external devices.

Each function of the machine learning device 10 is implemented by a processing circuit. The processing circuit may be dedicated hardware or may be the processor 101 that executes a program (for example, the machine learning program according to the embodiment) stored in the memory 102. The processor 101 may be any of a processing device, an arithmetic device, a microprocessor, a microcomputer, and a DSP (digital signal processor).

When the processing circuit is dedicated hardware, the processing circuit is, for example, an ASIC (application-specific integrated circuit) or an FPGA (field-programmable gate array).

When the processing circuit is the processor 101, the machine learning method is carried out by software, firmware, or a combination of software and firmware. The software and firmware are written as programs and stored in the memory 102. The processor 101 can carry out the machine learning method according to the first embodiment by reading and executing the program stored in the memory 102.

Part of the machine learning device 10 may be implemented by dedicated hardware and another part by software or firmware. In this way, the processing circuit can implement each of the functions described above by hardware, software, firmware, or any combination of these.

The interface 104 is used to communicate with other devices. An external storage device 105, a display 106, an input device 107 serving as a user operation unit, and the like are connected to the interface 104.

As described above, with the machine learning device 10 according to the first embodiment, the difference between the pre-update region of interest Bi and the post-update region of interest Ai newly generated by the learning model M acts as a penalty in transfer learning. Therefore, to promote accuracy improvement through transfer learning, it is desirable to concentrate updates on data whose pre-update region of interest B has low validity. The machine learning device 10 according to the first embodiment dynamically selects, from the pre-update regions of interest B, the data with low validity (that is, data that changes greatly through the update work and has a high need for updating), so the performance of the learning model can be improved efficiently with update work on fewer data items.

Embodiment 2.
FIG. 10 is a functional block diagram schematically showing the configuration of the machine learning device 20 according to the second embodiment. In FIG. 10, components identical or corresponding to those shown in FIG. 2 are given the same reference signs as in FIG. 2. The machine learning device 20 according to the second embodiment differs from the machine learning device 10 according to the first embodiment shown in FIG. 2 in that it has, as the learning unit, a task learning unit 21, which is a supervised learning unit, and an unsupervised learning unit 22. The task learning unit 21 and the unsupervised learning unit 22 train the learning model M that the region-of-interest update evaluation unit 15 uses to extract features of the pre-update region of interest. Any learning model M having an encoder, such as principal component analysis (PCA) or an auto-encoder (AE), may be used here. The unsupervised learning unit 22 may also perform self-supervised learning.

Because the learning model M created by the task learning unit 21 selectively extracts from the data the features that are strongly related to the task, it may ignore some of the features contained in the post-update region of interest A. If this learning model M is used in the region-of-interest update evaluation unit 15, a distance that does not follow the manifold (that is, an "out-of-manifold" distance) is obtained. To avoid this problem, the learning model M created by the unsupervised learning unit 22 is used as the feature extractor of the region-of-interest update evaluation unit 15.
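As a hedged illustration of an unsupervised encoder of the kind mentioned for this embodiment, the sketch below fits only the first principal component of a small data set by power iteration and uses it to encode a sample as a single number. It is a toy stand-in, not the feature extractor of the patent: a practical encoder would keep several components or use an auto-encoder, the function names are hypothetical, and power iteration can fail to converge to the leading component if the starting vector happens to be orthogonal to it.

```python
import math


def fit_first_component(rows, iterations=200):
    """Return (mean, unit direction) of the first principal component of rows."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in rows]
    # Covariance matrix (dim x dim) of the centered data.
    cov = [[sum(c[j] * c[k] for c in centered) / n for k in range(dim)]
           for j in range(dim)]
    # Arbitrary non-uniform starting vector for the power iteration.
    v = [1.0 / (j + 1) for j in range(dim)]
    for _ in range(iterations):
        w = [sum(cov[j][k] * v[k] for k in range(dim)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all samples identical: no direction to find
            break
        v = [x / norm for x in w]
    return mean, v


def encode(row, mean, direction):
    """Project one sample onto the fitted component (a one-dimensional feature)."""
    return sum((row[j] - mean[j]) * direction[j] for j in range(len(row)))
```

Because no task labels enter the fitting step, the resulting feature is not biased toward task-relevant content, which is the property the description relies on for the distance computation.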

FIG. 11 is a flowchart showing the operation of the machine learning device 20. In FIG. 11, steps with the same content as those shown in FIG. 8 are given the same reference signs as in FIG. 8. The operation of the machine learning device 20 according to the second embodiment differs from the operation of the machine learning device 10 according to the first embodiment in that unsupervised learning is performed (step S201). In other respects, the operation in FIG. 11 is the same as that in FIG. 8.

As described above, with the machine learning device 20 according to the second embodiment, unsupervised learning yields a general-purpose feature representation that is not specialized to the task. Because this feature representation also contains features relating to the post-update region of interest, using it in the region-of-interest update evaluation unit allows an appropriate similarity distance to be calculated.

In other respects, the second embodiment is the same as the first embodiment.

Embodiment 3.
FIG. 12 is a functional block diagram schematically showing the configuration of the machine learning device 30 according to the third embodiment. In FIG. 12, components identical or corresponding to those shown in FIG. 2 are given the same reference signs as in FIG. 2. The machine learning device 30 according to the third embodiment differs from the machine learning device 10 according to the first embodiment shown in FIG. 2 in that it has a data number reduction unit 31 that reduces the input data supplied to the data dynamic selection unit 16 during transfer learning.

The data number reduction unit 31 reduces the number of data items using unsupervised learning such as clustering. The data number reduction unit 31 either selects a fixed proportion of data from each class or selects data cluster by cluster. The hyperparameters used here, such as the number of clusters or the data reduction ratio, may be set in advance or may be changed according to a specific algorithm each time the data is reduced.
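Selecting a fixed proportion of data from each group can be sketched as below. The sketch assumes that group labels (for example, cluster assignments from an earlier clustering step) are already available as a list; the clustering itself is not shown. The function name, the random choice within a group, the rule of keeping at least one item per group, and the fixed seed are assumptions made for the illustration.

```python
import random


def reduce_by_group(labels, keep_ratio, seed=0):
    """Keep roughly keep_ratio of the item indices from each group.

    labels[i] is the group label of item i. At least one item is kept
    from every group so that no group disappears entirely.
    """
    rng = random.Random(seed)
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    kept = []
    for members in groups.values():
        count = max(1, round(len(members) * keep_ratio))
        kept.extend(rng.sample(members, count))
    return sorted(kept)
```

Sampling within each group rather than over the whole data set keeps small groups represented, which matters when rare kinds of data are the ones whose regions of interest most need updating.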

In the data dynamic selection unit 16, building a Bayesian model from all the data (every frame, in the case of a video) and computing the acquisition function for all the data are computationally very expensive. In big-O notation, the computational cost of Gaussian process regression is O(N^3) for N data items. The machine learning device 30 according to the third embodiment therefore introduces the data number reduction unit 31, which reduces the number of data items using an algorithm such as random sampling, clustering, or Kernel Interpolation for Scalable Structured Gaussian Processes (KISS-GP). Specifically, like the region-of-interest update evaluation unit 15, the data number reduction unit 31 uses the learning model M generated by the task learning unit 11 to extract features of the pre-update regions of interest B and performs unsupervised learning with those intermediate features as input.
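The effect of the cubic cost can be made concrete with a short calculation. The helper below is a hypothetical illustration that ignores constant factors and lower-order terms, so it gives only the order of magnitude of the saving, not a measured speed-up.

```python
def gp_cost_ratio(n_full, n_reduced):
    """Ratio of the O(N^3) cost before and after reducing N, constants ignored."""
    if n_full <= 0 or n_reduced <= 0:
        raise ValueError("data counts must be positive")
    return (n_full / n_reduced) ** 3
```

For example, reducing 10,000 frames to 1,000 gives a ratio of 1,000, so a tenfold reduction in data corresponds to roughly a thousandfold reduction in the dominant cost term.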

FIG. 13 is a flowchart showing the operation of the machine learning device 30. In FIG. 13, steps with the same content as those shown in FIG. 8 are given the same reference signs as in FIG. 8. The operation of the machine learning device 30 according to the third embodiment differs from the operation of the machine learning device 10 according to the first embodiment in that processing to reduce the number of data items is performed (step S301). In other respects, the operation in FIG. 13 is the same as that in FIG. 8.

As described above, using the machine learning device 30 according to the third embodiment reduces the computational cost in the data dynamic selection unit 16, which in turn reduces the cost of machine power and the waiting time of the person intervening.

In other respects, the third embodiment is the same as the first embodiment.

Embodiment 4.
FIG. 14 is a functional block diagram schematically showing the configuration of the machine learning device 40 and the configuration of the machine learning system according to the fourth embodiment. In FIG. 14, components identical or corresponding to those shown in FIG. 2 are given the same reference signs as in FIG. 2. The machine learning system according to the fourth embodiment differs from the machine learning system according to the first embodiment shown in FIG. 2 in that it has an effect measurement unit 41 that measures the degree of improvement in the inference process executed using the learning model M generated by the machine learning device 40, and an update method determining unit 42 that determines, based on this degree of improvement, hyperparameters related to updating the region of interest.

The effect measurement unit 41 acquires, for example, the prediction accuracy, the interpretability, the time required for each process, and the resource cost in the inference process executed using the learning model M generated by the machine learning device 40, and measures the degree of improvement of the resulting learning model M with respect to these quantities. What is measured depends on the task and the work environment (crowdsourcing, for example). The degree of improvement of the learning model M need not cover all of prediction accuracy, interpretability, time required for each process, and resource cost; it may be based on one of these or on a combination of two or more.

The update method determining unit 42 determines a hyperparameter H related to updating the region of interest based on the degree of improvement of the learning model M measured by the effect measurement unit 41, that is, based on the measured effect. The hyperparameter H may be determined by rules or may be optimized. The determined hyperparameter H is automatically provided to the region-of-interest update unit 18 and the search utilization adjustment unit 17 of the machine learning device 40. The determined hyperparameter H may also be presented to the user 50 on a display to prompt the user 50 to change the hyperparameter H.
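A rule-based determination of one such hyperparameter might look like the sketch below. Everything in it is hypothetical: the text does not specify any rule, so the choice of hyperparameter (the number of data items to update by hand in the next round), the doubling and halving rule, the threshold, and the bounds are assumptions made purely to show the shape of a rule-based update.

```python
def next_batch_size(current, improvement, target=0.01, minimum=1, maximum=100):
    """Hypothetical rule for one hyperparameter H: items to update next round.

    improvement is the measured gain (for example, in prediction accuracy)
    since the previous round; target is the gain considered sufficient.
    """
    if improvement < target:
        proposed = current * 2   # little gain: request more manual updates
    else:
        proposed = current // 2  # sufficient gain: reduce the manual workload
    return max(minimum, min(maximum, proposed))
```

A rule of this kind trades manual effort against measured benefit; an optimization-based alternative would instead search over H using the measured improvement as the objective.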

The work of updating the region of interest involves multiple hyperparameters, such as the control parameter for exploration and exploitation, the UI design for updating the region of interest (for example, the granularity of the region of interest), the number of data items updated manually, and the number of people intervening. These hyperparameters should be changed to appropriate values each time retraining is repeated. The machine learning system according to the fourth embodiment measures the degree of improvement of the learning model M in the inference process executed using the learning model M and adjusts the hyperparameter H based on the measured values.

FIG. 15 is a flowchart showing the operation of the machine learning system according to the fourth embodiment. In FIG. 15, steps with the same content as those shown in FIG. 8 are given the same reference signs as in FIG. 8. The operation of the machine learning system according to the fourth embodiment differs from the operation in FIG. 8 in that the effect measurement unit 41 measures the degree of improvement (that is, the effect) of the learning model M (step S401) and the update method determining unit 42 determines the method of updating the region of interest (that is, the related hyperparameter H) based on the measured effect (step S402). In other respects, the operation in FIG. 15 is the same as that in FIG. 8.

As described above, the load of manual update work and its effect are in a trade-off relationship, but using the machine learning device 40 and the machine learning system according to the fourth embodiment enables update work that strikes a good balance between the two.

In other respects, the fourth embodiment is the same as the first embodiment.

10, 20, 30, 40: machine learning device; 11: task learning unit (supervised learning unit); 11a: learning model generation unit; 11b: learning model storage unit; 12: learning data storage unit; 13: pre-update region-of-interest storage unit; 14: post-update region-of-interest storage unit; 15: region-of-interest update evaluation unit; 15a: similarity learning unit (supervised learning unit); 16: data dynamic selection unit; 17: search utilization adjustment unit; 18: region-of-interest update unit; 19, 29, 39, 49: dynamic selection/update unit; 21: task learning unit (supervised learning unit); 22: unsupervised learning unit; 31: data number reduction unit; 41: effect measurement unit; 42: update method determining unit; 50: user; A, Ai, x^i_after: post-update region of interest; B, Bi, x^i_before: pre-update region of interest; H, β_t: hyperparameter; d^i: similarity distance; L: learning data; M: learning model.

Claims

A machine learning device comprising:
a learning unit that performs normal learning that generates a learning model for inferring a region of interest of input data based on pre-collected learning data, and transfer learning that updates the learning model;
a region-of-interest update unit that, during the normal learning, performs a generation process of generating the region of interest of the input data using the learning model and storing the generated region of interest in a pre-update region-of-interest storage unit as a pre-update region of interest, and that, during or before the transfer learning, performs an update process of updating the region of interest of the input data from the pre-update region of interest to a post-update region of interest using input human knowledge and storing the post-update region of interest in a post-update region-of-interest storage unit;
a region-of-interest update evaluation unit that calculates a similarity distance indicating a change in the region of interest due to the update process; and
a data dynamic selection unit that, during or before the transfer learning, dynamically selects the input data to be output to the region-of-interest update unit based on the similarity distance.

The machine learning device according to claim 1, wherein the learning unit is a supervised learning unit.

The machine learning device according to claim 1, wherein the learning unit comprises:
an unsupervised learning unit that performs the normal learning; and
a supervised learning unit that performs the transfer learning.

The machine learning device according to claim 1, further comprising a data number reduction unit that reduces the input data supplied to the data dynamic selection unit during or before the transfer learning.

The machine learning device according to any one of claims 1 to 4, further comprising a search utilization adjustment unit that provides the data dynamic selection unit with a hyperparameter indicating whether to emphasize search or utilization in the learning data acquired in advance,
wherein the data dynamic selection unit, during or before the transfer learning, dynamically selects the input data to be output to the region-of-interest update unit based on the similarity distance and the hyperparameter.

The machine learning device according to claim 5, wherein the search utilization adjustment unit, during or before the transfer learning, sets the hyperparameter to a value that emphasizes the search and gradually changes the hyperparameter from the value that emphasizes the search to a value that emphasizes the utilization.

A machine learning system comprising:
the machine learning device according to claim 5 or 6;
an effect measurement unit that measures the degree of improvement in an inference process executed using the learning model generated by the machine learning device; and
an update method determining unit that determines the hyperparameter related to updating the region of interest based on the degree of improvement.

A machine learning method performed by a machine learning device that performs normal learning that generates a learning model for inferring a region of interest of input data based on pre-collected learning data, and transfer learning that updates the learning model, the method comprising:
a step of, during the normal learning, performing a generation process of generating the region of interest of the input data using the learning model and storing the generated region of interest in a pre-update region-of-interest storage unit as a pre-update region of interest, and, during or before the transfer learning, performing an update process of updating the region of interest of the input data from the pre-update region of interest to a post-update region of interest using input human knowledge and storing the post-update region of interest in a post-update region-of-interest storage unit;
a step of calculating a similarity distance indicating a change in the region of interest due to the update process; and
a step of, during or before the transfer learning, dynamically selecting the input data to be subjected to the transfer learning based on the similarity distance.

A machine learning program that causes a computer, which performs normal learning that generates a learning model for inferring a region of interest of input data based on pre-collected learning data and transfer learning that updates the learning model, to execute:
a step of, during the normal learning, performing a generation process of generating the region of interest of the input data using the learning model and storing the generated region of interest in a pre-update region-of-interest storage unit as a pre-update region of interest, and, during or before the transfer learning, performing an update process of updating the region of interest of the input data from the pre-update region of interest to a post-update region of interest using input human knowledge and storing the post-update region of interest in a post-update region-of-interest storage unit;
a step of calculating a similarity distance indicating a change in the region of interest due to the update process; and
a step of, during or before the transfer learning, dynamically selecting the input data to be subjected to the transfer learning based on the similarity distance.