JP7167356B2

JP7167356B2 - LEARNING APPARATUS, LEARNING APPARATUS OPERATING METHOD, LEARNING APPARATUS OPERATING PROGRAM

Info

Publication number: JP7167356B2
Application number: JP2021543951A
Authority: JP
Inventors: 隆史涌井
Original assignee: Fujifilm Corp
Current assignee: Fujifilm Corp
Priority date: 2019-09-03
Filing date: 2020-05-08
Publication date: 2022-11-08
Anticipated expiration: 2040-05-08
Also published as: WO2021044671A1; JPWO2021044671A1

Description

The technology of the present disclosure relates to a learning device, a method of operating the learning device, and an operating program for the learning device.

Semantic segmentation, which discriminates a plurality of classes in an input image on a pixel-by-pixel basis, is known. A class is a type of object appearing in the input image. Machine learning models (hereinafter simply referred to as models) that perform semantic segmentation include the U-shaped convolutional neural network (U-Net; U-Shaped Neural Network).

Japanese Patent Application Laid-Open No. 2019-016298 describes training a model with all of a plurality of relatively fine-grained classes, such as grass, flowers, skin, and hair, set from the outset. In that publication, whether a class is difficult to identify is determined for each class on the basis of information about the input image, such as the amount of blur and the image magnification. Classes that are difficult to identify are then integrated into broader classes, for example grass and flowers into "plants" and skin and hair into "face", and the model is retrained.

In training a model that performs semantic segmentation, there is a problem that high discrimination accuracy cannot be obtained when all of the classes are trained at once from the outset, as in Japanese Patent Application Laid-Open No. 2019-016298.

An object of the technology of the present disclosure is to provide a learning device, a method of operating the learning device, and an operating program for the learning device that can obtain a machine learning model with higher class discrimination accuracy than when all of a plurality of classes are trained at once.

In order to achieve the above object, a learning device of the present disclosure includes: an acquisition unit that acquires learning data, which is a set of a learning input image and an annotation image in which three or more classes to be subjected to semantic segmentation are specified for the learning input image; an image generation unit that generates a temporary annotation image in which at least two of the three or more classes specified in the annotation image are integrated and which has fewer classes than the annotation image; a temporary learning unit that trains, using temporary learning data that is a set of the learning input image and the temporary annotation image, a temporary machine learning model having a configuration corresponding to the number of classes of the temporary annotation image, thereby turning the temporary machine learning model into a trained temporary machine learning model; a machine learning model generation unit that generates, using at least part of the trained temporary machine learning model, a machine learning model having a configuration corresponding to the number of classes of the annotation image; and a main learning unit that trains the machine learning model using the learning data, thereby turning the machine learning model into a trained machine learning model.
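As an illustration of the image generation unit, the following is a minimal sketch of how a temporary annotation image with fewer classes could be derived from an annotation image. It assumes the annotation image is stored as a 2-D integer label map; the function name `merge_classes`, the lookup-table approach, and the sample data are hypothetical and are not taken from the disclosure.

```python
import numpy as np

def merge_classes(annotation, groups):
    """Return a label map in which the classes of each group are merged.

    annotation: 2-D integer array of class labels (1..M).
    groups: list of lists; every original class appears in exactly one group.
    The k-th group becomes class k+1 in the output.
    """
    size = max(max(group) for group in groups) + 1
    lut = np.zeros(size, dtype=annotation.dtype)  # lookup table: old label -> new label
    for new_label, group in enumerate(groups, start=1):
        for old_label in group:
            lut[old_label] = new_label
    return lut[annotation]

# Hypothetical annotation image with M = 4 classes.
annotation = np.array([[1, 1, 4, 4],
                       [2, 2, 4, 3],
                       [2, 4, 4, 3]])
# Merge classes 1 to 3 into one class and keep class 4 as is (N = 2).
temporary = merge_classes(annotation, [[1, 2, 3], [4]])
```

Which classes to merge is a separate decision; the grouping used here is only an example.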

It is preferable that the learning device includes a temporary machine learning model generation unit that generates the temporary machine learning model.

Where M is the number of classes of the annotation image and N is the number of classes of the temporary annotation image, it is preferable that the image generation unit and the temporary learning unit repeat the process of generating the temporary annotation image and the process of turning the temporary machine learning model into the trained temporary machine learning model a plurality of times, gradually increasing N until N reaches M-1.
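The repetition described above can be sketched as a simple schedule of class counts. This is an assumption-laden illustration: the starting value N = 2 and the step of one class per round are choices made here for the example, and `learning_schedule` is a hypothetical helper.

```python
def learning_schedule(m, n_start=2):
    """List the learning stages for an annotation image with m classes.

    Temporary learning runs with N = n_start, ..., m - 1 classes,
    followed by main learning with all m classes.
    """
    if m < 3:
        raise ValueError("the annotation image needs three or more classes")
    if not 2 <= n_start < m:
        raise ValueError("n_start must satisfy 2 <= n_start < m")
    stages = [("temporary", n) for n in range(n_start, m)]
    stages.append(("main", m))
    return stages

# With M = 4: two rounds of temporary learning, then main learning.
stages = learning_schedule(4)
```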

It is preferable that the temporary machine learning model generation unit generates the temporary machine learning model to be used in the current round using at least part of the temporary machine learning model used in the previous round.
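One way to reuse part of the previous model is to copy every parameter except the output layer, which has to be rebuilt because the number of classes changes. The sketch below assumes a model represented as a dictionary of NumPy arrays; the parameter names (`encoder`, `decoder`, `output`) and the choice of which part to reuse are hypothetical.

```python
import numpy as np

def next_model(previous, n_classes, rng):
    """Build a model for n_classes from the previous model's parameters.

    All parameters except "output" are copied; the output layer is
    re-initialized with one row of weights per class.
    """
    model = {name: w.copy() for name, w in previous.items() if name != "output"}
    in_channels = previous["output"].shape[1]
    model["output"] = rng.normal(0.0, 0.01, size=(n_classes, in_channels))
    return model

rng = np.random.default_rng(0)
previous = {
    "encoder": rng.normal(size=(8, 3)),
    "decoder": rng.normal(size=(8, 8)),
    "output": rng.normal(size=(2, 8)),   # previous round: 2 classes
}
current = next_model(previous, 3, rng)   # current round: 3 classes
```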

It is preferable that the temporary learning unit changes the loss function used to evaluate the class discrimination accuracy of the temporary machine learning model. In this case, it is preferable that the temporary learning unit makes the loss-function weight for a class shared with the previous round smaller than the loss-function weight for a class that newly appears in the current round.
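The weighting idea can be illustrated with a class-weighted cross-entropy. This passage does not specify the form of the loss function, so cross-entropy is an assumption made for this sketch, as are the function name and the sample weights. Classes are numbered from 0 here for array indexing.

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    """Per-pixel cross-entropy averaged with one weight per class.

    probs: (H, W, C) class probabilities; labels: (H, W) integers in 0..C-1;
    class_weights: (C,) weight applied to pixels whose true class is c.
    """
    rows, cols = np.indices(labels.shape)
    picked = probs[rows, cols, labels]  # probability assigned to the true class
    pixel_loss = -np.log(np.clip(picked, 1e-12, 1.0))
    weights = np.asarray(class_weights, dtype=float)[labels]
    return float((weights * pixel_loss).sum() / weights.sum())

labels = np.array([[0, 1], [2, 0]])
probs = np.empty((2, 2, 3))
probs[labels == 0] = [0.1, 0.45, 0.45]   # class 0 (shared with the previous round) is predicted poorly
probs[labels == 1] = [0.05, 0.9, 0.05]
probs[labels == 2] = [0.05, 0.05, 0.9]

equal = weighted_cross_entropy(probs, labels, [1.0, 1.0, 1.0])
reduced = weighted_cross_entropy(probs, labels, [0.5, 1.0, 1.0])  # smaller weight for the shared class
```

With the smaller weight, errors on the shared class contribute less to the loss, so training concentrates on the newly appearing classes.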

It is preferable that the temporary learning unit and the main learning unit do not update a part specified in advance in the process of turning the temporary machine learning model into the trained temporary machine learning model and in the process of turning the machine learning model into the trained machine learning model. In this case, it is preferable that the learning device includes a first reception unit that receives a user's designation of the part that is not to be updated.
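Leaving a designated part unchanged can be sketched as a gradient-descent step that skips the named parameters. The dictionary representation, the parameter names, and plain gradient descent are assumptions made for illustration only.

```python
import numpy as np

def update_parameters(params, grads, frozen, learning_rate=0.1):
    """One gradient-descent step that leaves the parameters named in `frozen` unchanged."""
    return {
        name: value if name in frozen else value - learning_rate * grads[name]
        for name, value in params.items()
    }

params = {"encoder": np.ones(3), "decoder": np.ones(3), "output": np.ones(2)}
grads = {name: np.full_like(value, 2.0) for name, value in params.items()}
# Hypothetical designation: the encoder is the non-updated part.
updated = update_parameters(params, grads, frozen={"encoder"})
```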

It is preferable that the image generation unit generates the temporary annotation image in accordance with an image generation condition specified in advance.

It is preferable that the learning device includes a second reception unit that receives a user's designation of the image generation condition.

It is preferable that the image generation condition is such that the areas of the classes of the temporary annotation image are not biased. It is also preferable that the image generation condition is such that the complexities of the classes of the temporary annotation image are not biased.
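The area condition can be sketched by counting pixels per class and comparing candidate groupings. The imbalance measure used here (largest merged area divided by smallest merged area) is an assumption chosen for the example, and the helper names and sample data are hypothetical.

```python
import numpy as np

def class_areas(label_image):
    """Map each class label to its area in pixels."""
    labels, counts = np.unique(label_image, return_counts=True)
    return dict(zip(labels.tolist(), counts.tolist()))

def area_imbalance(annotation, groups):
    """Ratio of the largest to the smallest class area after merging each group."""
    areas = class_areas(annotation)
    merged = [sum(areas.get(c, 0) for c in group) for group in groups]
    smallest = min(merged)
    return float("inf") if smallest == 0 else max(merged) / smallest

annotation = np.array([[1, 1, 4, 4],
                       [2, 2, 4, 3],
                       [2, 4, 4, 3]])
# Candidate groupings that reduce four classes to three.
candidates = [
    [[1, 2], [3], [4]],
    [[1, 3], [2], [4]],
    [[2, 3], [1], [4]],
]
best = min(candidates, key=lambda groups: area_imbalance(annotation, groups))
```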

It is preferable that the learning device includes a third reception unit that receives a user's instruction selecting either the image generation condition under which the areas of the classes of the temporary annotation image are not biased or the image generation condition under which the complexities of the classes of the temporary annotation image are not biased.

It is preferable that the image generation condition is such that the class having the largest area is not integrated and is left as a single class. It is also preferable that the image generation condition is such that the class having the lowest complexity is not integrated and is left as a single class.

It is preferable that the learning device includes a fourth reception unit that receives a user's instruction selecting either the image generation condition under which the class having the largest area is left as a single class or the image generation condition under which the class having the lowest complexity is left as a single class.

It is preferable that the learning device includes a display control unit that performs control to display the temporary annotation image.

It is preferable that the image generation unit generates a plurality of types of temporary annotation images that differ in which classes are integrated, the display control unit performs control to display the plurality of types of temporary annotation images, the learning device includes a fifth reception unit that receives a user's instruction selecting one of the plurality of types of temporary annotation images, and the temporary learning unit uses, as the temporary learning data, the temporary annotation image for which the fifth reception unit has received the selection instruction.

The learning input image is preferably a cell image obtained by imaging a plurality of cells in culture.

A method of operating a learning device of the present disclosure includes: an acquisition step of acquiring learning data, which is a set of a learning input image and an annotation image in which three or more classes to be subjected to semantic segmentation are specified for the learning input image; an image generation step of generating a temporary annotation image in which at least two of the three or more classes specified in the annotation image are integrated and which has fewer classes than the annotation image; a temporary learning step of training, using temporary learning data that is a set of the learning input image and the temporary annotation image, a temporary machine learning model having a configuration corresponding to the number of classes of the temporary annotation image, thereby turning the temporary machine learning model into a trained temporary machine learning model; a machine learning model generation step of generating, using at least part of the trained temporary machine learning model, a machine learning model having a configuration corresponding to the number of classes of the annotation image; and a main learning step of training the machine learning model using the learning data, thereby turning the machine learning model into a trained machine learning model.

An operating program for a learning device of the present disclosure causes a computer to function as: an acquisition unit that acquires learning data, which is a set of a learning input image and an annotation image in which three or more classes to be subjected to semantic segmentation are specified for the learning input image; an image generation unit that generates a temporary annotation image in which at least two of the three or more classes specified in the annotation image are integrated and which has fewer classes than the annotation image; a temporary learning unit that trains, using temporary learning data that is a set of the learning input image and the temporary annotation image, a temporary machine learning model having a configuration corresponding to the number of classes of the temporary annotation image, thereby turning the temporary machine learning model into a trained temporary machine learning model; a machine learning model generation unit that generates, using at least part of the trained temporary machine learning model, a machine learning model having a configuration corresponding to the number of classes of the annotation image; and a main learning unit that trains the machine learning model using the learning data, thereby turning the machine learning model into a trained machine learning model.

According to the technology of the present disclosure, it is possible to provide a learning device, a method of operating the learning device, and an operating program for the learning device that can obtain a machine learning model with higher class discrimination accuracy than when all of a plurality of classes are trained at once.

FIG. 1 is a diagram showing a machine learning system.
FIG. 2 is a diagram showing an outline of processing in the machine learning system.
FIG. 3 is a diagram showing a learning input image and an annotation image.
FIG. 4 is a diagram showing a model.
FIG. 5 is a block diagram showing a computer that constitutes the learning device.
FIG. 6 is a block diagram showing the processing units of the CPU of the learning device.
FIG. 7 is a diagram showing an image group.
FIG. 8 is a diagram showing a model group.
FIG. 9 is a table showing the number of classes in temporary learning and main learning.
FIG. 10 is a diagram showing an image generation condition.
FIG. 11 is a diagram showing an image generation condition designation screen.
FIG. 12 is a diagram showing the processing of the image generation unit in the first temporary learning.
FIG. 13 is a diagram showing the processing of the temporary model generation unit in the first temporary learning.
FIG. 14 is a diagram showing the processing of the temporary learning unit in the first temporary learning.
FIG. 15 is a diagram showing the processing of the image generation unit in the second temporary learning.
FIG. 16 is a diagram showing the processing of the temporary model generation unit in the second temporary learning.
FIG. 17 is a diagram showing the processing of the temporary learning unit in the second temporary learning.
FIG. 18 is a diagram showing the processing of the model generation unit.
FIG. 19 is a diagram showing the processing of the main learning unit.
FIG. 20 is a diagram showing an annotation image and the temporary annotation image used in the first temporary learning.
FIG. 21 is a diagram showing details of the processing of the temporary model generation unit in the first temporary learning.
FIG. 22 is a diagram showing an annotation image and the temporary annotation image used in the second temporary learning.
FIG. 23 is a diagram showing details of the processing of the temporary model generation unit in the second temporary learning.
FIG. 24 is a diagram showing details of the temporary learning unit.
FIG. 25 is a diagram showing the first loss function used in the first temporary learning.
FIG. 26 is a diagram showing the first loss function used in the second temporary learning.
FIG. 27 is a diagram showing details of the processing of the model generation unit.
FIG. 28 is a diagram showing details of the main learning unit.
FIG. 29 is a diagram showing the second loss function used in the main learning.
FIG. 30 is a diagram showing the transition of classes across the first temporary learning, the second temporary learning, and the main learning, using temporary annotation images and an annotation image; FIG. 30A shows the temporary annotation image used in the first temporary learning, FIG. 30B shows the temporary annotation image used in the second temporary learning, and FIG. 30C shows the annotation image used in the main learning.
FIG. 31 is a flowchart showing the processing procedure of the learning device.
FIG. 32 is a diagram showing a non-updated part designation screen.
FIG. 33 is a diagram showing how the first update unit of the temporary learning unit does not update the non-updated part.
FIG. 34 is a diagram showing how the second update unit of the main learning unit does not update the non-updated part.
FIG. 35 is a diagram showing an image generation condition under which the areas of the classes of the temporary annotation image are not biased.
FIG. 36 is a diagram showing an image generation condition under which the complexities of the classes of the temporary annotation image are not biased.
FIG. 37 is a diagram showing an image generation condition designation screen.
FIG. 38 is a diagram showing an image generation condition under which the class having the largest area is not integrated and is left as a single class.
FIG. 39 is a diagram showing an image generation condition under which the class having the lowest complexity is not integrated and is left as a single class.
FIG. 40 is a diagram showing an image generation condition designation screen.
FIG. 41 is a diagram showing a temporary annotation image display screen.
FIG. 42 is a diagram showing how the image generation unit generates a plurality of types of temporary annotation images that differ in which classes are integrated; FIG. 42A shows generation of a temporary annotation image in which the class 1 differentiated cells and the class 2 undifferentiated cells are integrated, FIG. 42B shows generation of a temporary annotation image in which the class 1 differentiated cells and the class 3 dead cells are integrated, and FIG. 42C shows generation of a temporary annotation image in which the class 2 undifferentiated cells and the class 3 dead cells are integrated.
FIG. 43 is a diagram showing a temporary annotation image selection screen.

[First Embodiment]
In FIG. 1, a machine learning system 2 is a system that uses a model for performing semantic segmentation, which discriminates a plurality of classes in an input image on a pixel-by-pixel basis. The machine learning system 2 includes a learning device 10 and an operation device 11. The learning device 10 and the operation device 11 are, for example, desktop personal computers. The learning device 10 and the operation device 11 are connected via a network 12 so that they can communicate with each other. The network 12 is, for example, a LAN (Local Area Network) or a WAN (Wide Area Network) such as the Internet or a public communication network.

In FIG. 2, the learning device 10 performs temporary learning and main learning. In the temporary learning, a temporary model 16 is trained using temporary learning data 15, turning the temporary model 16 into a trained temporary model 16T. The temporary learning data 15 is a set of a learning input image 17 and a temporary annotation image 18. The temporary annotation image 18 is generated from an annotation image 19. In the main learning, a model 21 is trained using learning data 20, turning the model 21 into a trained model 21T. The model 21 is generated from the trained temporary model 16T. The learning data 20 is a set of the learning input image 17 and the annotation image 19. The temporary model 16 and the model 21 each include a convolutional neural network such as U-Net (see FIG. 4).

The learning device 10 transmits the trained model 21T to the operation device 11. The operation device 11 receives the trained model 21T from the learning device 10. The operation device 11 gives the trained model 21T an input image 22 in which the classes and contours of the captured objects have not yet been discriminated. The trained model 21T performs semantic segmentation on the input image 22, discriminates the classes and contours of the objects appearing in the input image 22, and outputs an output image 23 as the discrimination result. Note that even after the trained model 21T has been incorporated into the operation device 11, the trained model 21T may be given the learning data 20 and trained further.

As shown in FIG. 3, the learning input image 17 in this example is a cell image obtained by imaging a plurality of cells in culture with a phase-contrast microscope or the like. Differentiated cells DC, undifferentiated cells UDC, dead cells DDC, and a medium PL appear in the learning input image 17. In the annotation image 19 in this case, the differentiated cells DC, the undifferentiated cells UDC, the dead cells DDC, and the medium PL are designated as class 1, class 2, class 3, and class 4, respectively. Classes 1 to 4 are designated, for example, manually by the user. The class 4 medium PL is designated automatically once the other classes 1 to 3 have been designated. Like the learning input image 17, the input image 22 given to the trained model 21T is a cell image obtained by imaging a plurality of cells in culture with a phase-contrast microscope or the like.

As shown in FIG. 4, the temporary model 16 and the model 21 are hierarchical models composed of a convolutional neural network such as U-Net, which has a plurality of layers for analyzing the input image and extracts, at each layer, features in a different spatial-frequency band of the input image. In this example, there are five layers: a first layer, a second layer, a third layer, a fourth layer, and a fifth layer. The following description takes as an example the case where the learning input image 17 is given to the model 21 as the input image and the model 21 outputs a main-learning output image 25 (see also FIG. 28).

The model 21 includes an encoder network (hereinafter abbreviated as EN) 26, a decoder network (hereinafter abbreviated as DN) 27, and an output layer 28. At each layer, the EN 26 performs convolution processing, in which a convolution operation using filters extracts an image feature map CMP. The DN 27 enlarges, step by step, the image size of the smallest image feature map CMP output from the EN 26. The DN 27 then combines the image feature maps CMP enlarged step by step with the image feature maps CMP output at the respective layers of the EN 26 to generate final output data 29 having the same image size as the learning input image 17.

Input data having a plurality of pixel values arranged two-dimensionally is input to each layer of the EN 26. At each layer, the EN 26 performs convolution processing on the input data to extract an image feature map CMP. The learning input image 17 is input to the first layer of the EN 26 as input data. The first layer performs convolution processing on the learning input image 17 and outputs, for example, an image feature map CMP having the same image size as the learning input image 17. In the second and lower layers, the image feature map CMP output by the layer above is input as input data. In the second and lower layers, convolution processing is performed on the image feature map CMP, and, for example, an image feature map CMP having the same image size as the input image feature map CMP is output. The convolution processing is shown as "conv" (short for convolution) in FIG. 4.

In the convolution processing, a filter of, for example, 3×3 is applied to the input data, and the pixel value e of a pixel of interest in the input data is convolved with the pixel values a, b, c, d, f, g, h, and i of the eight pixels adjacent to the pixel of interest, yielding output data in which pixel values are arranged two-dimensionally, like the input data. Where the filter coefficients are r, s, t, u, v, w, x, y, and z, the pixel value k of the output pixel, which is the result of the convolution operation on the pixel of interest, is obtained by calculating, for example, (Equation 1) below.
k = az + by + cx + dw + ev + fu + gt + hs + ir ... (Equation 1)

In the convolution processing, the convolution operation described above is performed on each pixel of the input data, and a pixel value k is output. In this way, output data having pixel values k arranged two-dimensionally is output. One set of output data is output per filter. When a plurality of filters of different types are used, output data is output for each filter. The image feature map CMP is composed of such output data.
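Equation 1 can be sketched in NumPy as follows. The sketch assumes that the pixel values a to i and the filter coefficients r to z are both laid out row by row in a 3×3 grid, and that the input is zero-padded so the output keeps the input size; neither detail is stated in this passage. The function name `convolve3x3` is hypothetical.

```python
import numpy as np

def convolve3x3(image, kernel):
    """Apply Equation 1 at every pixel of a 2-D image.

    kernel rows hold (r, s, t), (u, v, w), (x, y, z). Equation 1 pairs
    a with z, b with y, ..., i with r, so the kernel is flipped before
    the element-wise products are summed.
    """
    flipped = kernel[::-1, ::-1]
    padded = np.pad(image, 1)  # zero padding keeps the output the same size
    height, width = image.shape
    out = np.zeros((height, width), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += flipped[dy, dx] * padded[dy:dy + height, dx:dx + width]
    return out

image = np.arange(1, 10, dtype=float).reshape(3, 3)   # a..i = 1..9
kernel = np.arange(1, 10, dtype=float).reshape(3, 3)  # r..z = 1..9
feature = convolve3x3(image, kernel)
# Centre pixel: 1*9 + 2*8 + 3*7 + 4*6 + 5*5 + 6*4 + 7*3 + 8*2 + 9*1 = 165
```

Applying several kernels and stacking the results gives one output per filter, which corresponds to the channels of an image feature map.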

The output data is data in which pixel values k are arranged two-dimensionally, and it has a width and a height. When a plurality of filters of different types are applied and a plurality of sets of output data are output, the image feature map CMP is the collection of those sets of output data. In the image feature map CMP, the number of filters is called the number of channels.

The numbers 64, 128, 256, 512, and 1024 shown above the image feature maps CMP indicate the number of channels of each image feature map CMP. The parenthesized values 1/1, 1/2, 1/4, 1/8, and 1/16 attached to the first to fifth layers indicate the image size handled in each layer relative to the image size of the learning input image 17, which is the topmost input image.

In the first layer of the EN 26 of this example, convolution processing is performed twice on the learning input image 17. First, convolution processing applying 64 filters is performed on the learning input image 17, and a 64-channel image feature map CMP is output. Convolution processing applying a further 64 filters is then performed on this image feature map CMP, so that the first layer finally outputs a 64-channel image feature map CMP.

In the EN 26, the image size, that is, the width and height, of the image feature map CMP output by the first layer is the same as that of the learning input image 17. The image size handled by the first layer is therefore 1/1, meaning the same size as the learning input image 17.

In the first layer of the EN 26, pooling processing, shown as "pool" (short for pooling) in FIG. 4, is performed on the image feature map CMP extracted by the two rounds of convolution processing. Pooling processing compresses the image feature map CMP by computing local statistics of it. The local statistic is, for example, the maximum or the average of the pixel values within a 2×2 block of pixels. Pooling that computes the maximum is called max pooling, and pooling that computes the average is called average pooling. In other words, pooling processing selects local representative values from the pixel values of the image feature map CMP, thereby lowering its resolution and reducing its image size. For example, when pooling that selects a representative value from a 2×2 block of pixels is performed with the block shifted two pixels at a time so that the blocks do not overlap, the image feature map CMP is reduced to half its original image size. In the model 21, the first layer performs pooling processing that reduces the image size of the image feature map CMP to, for example, 1/2. Consequently, an image feature map CMP reduced to 1/2 of the image size of the learning input image 17 is input to the second layer of the EN 26 as input data.
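A minimal sketch of 2×2 pooling follows. It assumes non-overlapping blocks, which is what halves the width and height, and it drops a trailing row or column when the size is odd; the function name `pool2x2` and the `mode` parameter are hypothetical.

```python
import numpy as np

def pool2x2(feature_map, mode="max"):
    """Reduce a 2-D feature map to half size using non-overlapping 2x2 blocks."""
    height, width = feature_map.shape
    trimmed = feature_map[:height - height % 2, :width - width % 2]
    blocks = trimmed.reshape(height // 2, 2, width // 2, 2)
    if mode == "max":
        return blocks.max(axis=(1, 3))   # max pooling
    return blocks.mean(axis=(1, 3))      # average pooling

feature_map = np.arange(16, dtype=float).reshape(4, 4)
pooled_max = pool2x2(feature_map, "max")
pooled_mean = pool2x2(feature_map, "mean")
```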

第２階層においては、１２８個のフィルタを適用する畳み込み処理が２回行われて、１２８チャンネルの画像特徴マップＣＭＰが出力される。そして、１２８チャンネルの画像特徴マップＣＭＰに対して、画像サイズを半分にするプーリング処理が行われる。これにより、第２階層から第３階層には、学習用入力画像１７の画像サイズを基準として、１／４の画像サイズに縮小された１２８チャンネルの画像特徴マップＣＭＰが、入力データとして入力される。 In the second layer, convolution processing applying 128 filters is performed twice to output a 128-channel image feature map CMP. Then, a pooling process that halves the image size is performed on the 128-channel image feature map CMP. As a result, a 128-channel image feature map CMP reduced to 1/4 of the image size of the learning input image 17 is passed from the second layer to the third layer as input data.

第３階層においては、２５６個のフィルタを適用する２回の畳み込み処理が行われて、２５６チャンネルの画像特徴マップＣＭＰが出力され、２５６チャンネルの画像特徴マップＣＭＰに対して、画像サイズをさらに半分にするプーリング処理が行われる。これにより、第３階層から第４階層には、学習用入力画像１７を基準として、１／８の画像サイズに縮小された２５６チャンネルの画像特徴マップＣＭＰが、入力データとして入力される。 In the third layer, two convolution processes applying 256 filters are performed to output a 256-channel image feature map CMP, and a pooling process that halves the image size again is performed on the 256-channel image feature map CMP. As a result, a 256-channel image feature map CMP reduced to 1/8 of the image size of the learning input image 17 is passed from the third layer to the fourth layer as input data.

同様に、第４階層においては、５１２個のフィルタを適用する２回の畳み込み処理が行われて、５１２チャンネルの画像特徴マップＣＭＰが出力され、５１２チャンネルの画像特徴マップＣＭＰに対して、画像サイズをさらに半分にするプーリング処理が行われる。これにより、第４階層から第５階層には、学習用入力画像１７を基準として、１／１６の画像サイズに縮小された５１２チャンネルの画像特徴マップＣＭＰが、入力データとして入力される。 Similarly, in the fourth layer, two convolution processes applying 512 filters are performed to output a 512-channel image feature map CMP, and a pooling process that halves the image size again is performed on the 512-channel image feature map CMP. As a result, a 512-channel image feature map CMP reduced to 1/16 of the image size of the learning input image 17 is passed from the fourth layer to the fifth layer as input data.

最下位の階層の第５階層においては、１０２４個のフィルタを適用する２回の畳み込み処理が行われる。ただし、第５階層においては、畳み込み処理で抽出された画像特徴マップＣＭＰに対してはプーリング処理が行われない。 In the fifth layer, which is the lowest layer, two convolution processes applying 1024 filters are performed. However, in the fifth layer, the pooling process is not performed on the image feature map CMP extracted by the convolution process.

ＥＮ２６においては、各階層に入力される入力データ（学習用入力画像１７または画像特徴マップＣＭＰ）は、最上位の第１階層から最下位の第５階層に向かって、画像サイズが段階的に縮小されて解像度が下げられる。本例においては、第１階層に入力される学習用入力画像１７の画像サイズを基準に、第１階層は１／１（等倍）、第２階層は１／２、第３階層は１／４、第４階層は１／８、第５階層は１／１６のそれぞれの画像サイズの入力データが入力される。 In EN 26, the input data (the learning input image 17 or an image feature map CMP) input to each layer is reduced in image size step by step, and thus lowered in resolution, from the first layer, which is the highest layer, toward the fifth layer, which is the lowest layer. In this example, relative to the image size of the learning input image 17 input to the first layer, the first layer receives input data at 1/1 (same size), the second layer at 1/2, the third layer at 1/4, the fourth layer at 1/8, and the fifth layer at 1/16.
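The encoder geometry described so far (input scale halving and filter count doubling per layer, with no pooling in the lowest layer) can be tabulated with a short sketch. The function name `encoder_schedule` and its parameters are hypothetical; the figure of 64 filters for the first layer is taken from the 64, 128, 256, ... progression stated in the description.

```python
def encoder_schedule(num_layers=5, base_filters=64):
    """List, per encoder layer, the input scale, the filter count and whether pooling follows."""
    schedule = []
    for layer in range(1, num_layers + 1):
        factor = 2 ** (layer - 1)
        schedule.append({
            "layer": layer,
            "input_scale": f"1/{factor}",      # relative to the learning input image
            "filters": base_filters * factor,  # channels of the extracted feature map
            "pooling": layer < num_layers,     # the lowest layer is not pooled
        })
    return schedule


for entry in encoder_schedule():
    print(entry)
```

Running the loop prints one line per layer, ending with the fifth layer at scale 1/16 with 1024 filters and no pooling.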

ＥＮ２６の各階層は、階層毎に、解像度が異なる入力データに対してフィルタを適用して畳み込み処理を行う。第１階層では、各階層の入力データのうちで最も解像度が高い学習用入力画像１７に対して畳み込み処理が行われる。このため、第１階層で抽出される画像特徴マップＣＭＰは、学習用入力画像１７において最も空間周波数が高い周波数帯域をもつ、最も微細な構造の特徴を表す。第２階層および第３階層では、学習用入力画像１７よりも解像度が下げられた入力データに対して畳み込み処理が行われる。このため、第２階層および第３階層で抽出される画像特徴マップＣＭＰは、第１階層と比べて、空間周波数が低い周波数帯域をもつ、中域構造の特徴を表す。第４階層および第５階層では、さらに入力データの解像度が下がるため、第４階層および第５階層で抽出される画像特徴マップＣＭＰは、さらに空間周波数が低い周波数帯域をもつ、大域構造の特徴を表す。 Each layer of EN 26 performs convolution processing by applying filters to input data whose resolution differs from layer to layer. In the first layer, convolution processing is performed on the learning input image 17, which has the highest resolution among the input data of all layers. Therefore, the image feature map CMP extracted in the first layer represents the features of the finest structures, which occupy the frequency band with the highest spatial frequency in the learning input image 17. In the second and third layers, convolution processing is performed on input data whose resolution is lower than that of the learning input image 17. Therefore, the image feature maps CMP extracted in the second and third layers represent features of mid-range structures, which occupy frequency bands with lower spatial frequencies than in the first layer. In the fourth and fifth layers, the resolution of the input data is lower still, so the image feature maps CMP extracted in the fourth and fifth layers represent features of global structures, which occupy frequency bands with even lower spatial frequencies.

ＥＮ２６においては、最上位の第１階層から最下位の第５階層の階層毎に、学習用入力画像１７に含まれる周波数帯域が異なる画像の特徴を出力する。第１階層の１／１から第５階層の１／１６までの各画像サイズは、各階層が解析可能な周波数帯域を示す。すなわち、１／１は最も空間周波数が高い周波数帯域を示し、反対に１／１６は最も空間周波数が低い周波数帯域を示す。なお、ＥＮ２６において、階層が下るにつれて、フィルタの数を６４、１２８、２５６、・・・と増加させる理由は、画像サイズが小さくなる分、フィルタの数を増やして、学習用入力画像１７に含まれる様々な特徴を抽出するためである。 The EN 26 outputs, for each layer from the first layer, which is the highest layer, to the fifth layer, which is the lowest layer, image features of a different frequency band contained in the learning input image 17. Each image size from 1/1 of the first layer to 1/16 of the fifth layer indicates the frequency band that the layer can analyze. That is, 1/1 indicates the frequency band with the highest spatial frequency, and conversely 1/16 indicates the frequency band with the lowest spatial frequency. In EN 26, the number of filters is increased to 64, 128, 256, and so on as the layers descend because, as the image size shrinks, more filters are used to extract the various features contained in the learning input image 17.

ＥＮ２６の第１階層から第４階層は、それぞれが抽出した画像特徴マップＣＭＰを、ＤＮ２７に対して送信する。この画像特徴マップＣＭＰをＥＮ２６からＤＮ２７に送信する処理は、スキップレイヤ処理と呼ばれ、図４において「ｓｋｉｐ」で示す。ＤＮ２７の各階層において、ハッチングで示す画像特徴マップＣＭＰが、ＥＮ２６から送信された画像特徴マップＣＭＰである。 The first to fourth hierarchies of EN 26 transmit the image feature map CMP extracted by each to DN 27 . The process of transmitting this image feature map CMP from EN 26 to DN 27 is called skip layer process, and is indicated by "skip" in FIG. In each layer of DN27, the hatched image feature map CMP is the image feature map CMP transmitted from EN26.

ＤＮ２７は、アップサンプリング処理とマージ処理とを繰り返す。アップサンプリング処理は、図４において「ｕｐｓｍｐ（ｕｐｓａｍｐｌｉｎｇの略）」として示す。アップサンプリング処理は、ＥＮ２６から出力された最小の画像サイズの画像特徴マップＣＭＰの画像サイズを段階的に拡大する処理である。マージ処理は、アップサンプリング処理で段階的に拡大された画像特徴マップＣＭＰと、ＥＮ２６において階層毎に出力され、かつ、画像サイズが同じ画像特徴マップＣＭＰとを結合する処理である。ＤＮ２７は、これらアップサンプリング処理とマージ処理とにより、最終出力データ２９を出力する。 The DN 27 repeats upsampling and merging. The upsampling process is shown as "upsmp" (abbreviation for upsampling) in FIG. The upsampling process is a process of stepwise increasing the image size of the minimum image size image feature map CMP output from the EN 26 . The merging process is a process of combining the image feature map CMP that has been stepwise enlarged by the upsampling process and the image feature map CMP that is output for each layer in the EN 26 and has the same image size. The DN 27 outputs final output data 29 through these upsampling processing and merging processing.
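The two operations that the DN 27 repeats can be sketched in a few lines. This is a simplified illustration, not the disclosed implementation: `upsample2x` uses plain nearest-neighbour enlargement as a stand-in for the enlargement step (no learned filters are involved here), and `merge` is assumed to be concatenation along the channel axis, which is consistent with the channel counts being added together in the description. Both function names are hypothetical.

```python
def upsample2x(channel):
    """Enlarge one 2-D channel to twice its width and height by repeating each pixel."""
    enlarged = []
    for row in channel:
        wide = [value for value in row for _ in range(2)]
        enlarged.append(wide)
        enlarged.append(list(wide))
    return enlarged


def merge(decoder_channels, skip_channels):
    """Concatenate two lists of channels; all channels must share one image size."""
    height, width = len(decoder_channels[0]), len(decoder_channels[0][0])
    for channel in decoder_channels + skip_channels:
        if len(channel) != height or any(len(row) != width for row in channel):
            raise ValueError("feature maps must have the same image size to be merged")
    return decoder_channels + skip_channels


low_res = [[[1, 2], [3, 4]]]                   # 1 channel, 2x2, from the lower layer
skip = [[[0] * 4 for _ in range(4)]]           # 1 channel, 4x4, sent by the skip process
enlarged = [upsample2x(c) for c in low_res]    # now 1 channel, 4x4
merged = merge(enlarged, skip)                 # 2 channels, 4x4
```

The size check reflects the requirement that only feature maps of the same image size are combined.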

ＤＮ２７は、ＥＮ２６の各階層と対応する第１階層から第５階層を有する。ＤＮ２７の各階層で行われるアップサンプリング処理では、ＥＮ２６の対応する各階層の画像サイズと同じサイズになるように画像特徴マップＣＭＰが拡大される。 The DN 27 has 1st to 5th hierarchies corresponding to the respective hierarchies of the EN 26 . In the upsampling process performed in each layer of DN27, the image feature map CMP is enlarged so as to have the same size as the image size of each corresponding layer of EN26.

また、本例のアップサンプリング処理は、画像サイズを拡大することに加えて、フィルタを適用する畳み込み処理を伴う。こうした畳み込み処理を伴うアップサンプリング処理は、アップコンボリューション処理と呼ばれる。ＤＮ２７の各階層においては、アップコンボリューション処理が終了した後に、マージ処理とさらなる畳み込み処理とが行なわれる。 Also, the upsampling process of this example involves a convolution process that applies a filter in addition to enlarging the image size. Upsampling processing accompanied by such convolution processing is called upconvolution processing. In each layer of DN 27, merge processing and further convolution processing are performed after the up-convolution processing is completed.

ＤＮ２７の第４階層は、まず、ＥＮ２６の最下位の第５階層から、１／１６という最小の画像サイズの画像特徴マップＣＭＰを受け取る。この画像特徴マップＣＭＰのチャンネル数は１０２４である。ＤＮ２７の第４階層は、１／１６の画像サイズの画像特徴マップＣＭＰを、２倍の１／８の画像サイズに拡大し、かつ、５１２個のフィルタを適用する畳み込み処理を行って、チャンネル数を半分の５１２個に減らす。 The fourth layer of DN 27 first receives, from the lowest fifth layer of EN 26, the image feature map CMP with the smallest image size of 1/16. This image feature map CMP has 1024 channels. The fourth layer of DN 27 enlarges the image feature map CMP of 1/16 image size by a factor of two to 1/8 image size, and performs convolution processing applying 512 filters to halve the number of channels to 512.

ＤＮ２７の第４階層においては、ＥＮ２６の第５階層から受け取った画像特徴マップＣＭＰと、ＥＮ２６の第４階層からスキップレイヤ処理で送信された画像特徴マップＣＭＰとを結合するマージ処理が行われる。第４階層において結合される画像特徴マップＣＭＰは、それぞれ１／８の画像サイズで、かつ、５１２チャンネルである。そのため、第４階層においては、マージ処理によって、１／８の画像サイズで、かつ、１０２４チャンネル（５１２＋５１２）の画像特徴マップＣＭＰが生成される。 In the fourth layer of DN27, merge processing is performed to combine the image feature map CMP received from the fifth layer of EN26 and the image feature map CMP transmitted from the fourth layer of EN26 by skip layer processing. The image feature maps CMP combined in the fourth layer are each 1/8 image size and 512 channels. Therefore, in the fourth layer, an image feature map CMP of 1/8 image size and 1024 channels (512+512) is generated by merge processing.

さらに、第４階層においては、１０２４チャンネルの画像特徴マップＣＭＰに対して５１２個のフィルタを適用する畳み込み処理が２回行われて、１／８の画像サイズで、かつ、５１２チャンネルの画像特徴マップＣＭＰが生成される。第４階層においては、この１／８の画像サイズの画像特徴マップＣＭＰに対して、画像サイズを２倍の１／４に拡大し、かつ、チャンネル数を半分の２５６チャンネルにするアップコンボリューション処理が行われる。この結果、第４階層から第３階層に対して、１／４の画像サイズで、かつ、２５６チャンネルの画像特徴マップＣＭＰが出力される。 Furthermore, in the fourth layer, convolution processing applying 512 filters to the 1024-channel image feature map CMP is performed twice to generate an image feature map CMP of 1/8 image size and 512 channels. In the fourth layer, up-convolution processing is then performed on this image feature map CMP of 1/8 image size to double the image size to 1/4 and halve the number of channels to 256. As a result, an image feature map CMP of 1/4 image size and 256 channels is output from the fourth layer to the third layer.

ＤＮ２７の第３階層においては、第４階層から受け取った画像特徴マップＣＭＰと、ＥＮ２６の第３階層からスキップレイヤ処理で送信された画像特徴マップＣＭＰとを結合するマージ処理が行われる。第３階層において結合される画像特徴マップＣＭＰは、それぞれ１／４の画像サイズで、かつ、２５６チャンネルである。そのため、第３階層においては、マージ処理によって、１／４の画像サイズで、かつ、５１２チャンネル（２５６＋２５６）の画像特徴マップＣＭＰが生成される。 In the third layer of DN27, merge processing is performed to combine the image feature map CMP received from the fourth layer and the image feature map CMP transmitted from the third layer of EN26 by skip layer processing. The image feature maps CMP combined in the third layer are each 1/4 image size and 256 channels. Therefore, in the third hierarchy, an image feature map CMP with a 1/4 image size and 512 channels (256+256) is generated by merge processing.

さらに、第３階層においては、５１２チャンネルの画像特徴マップＣＭＰに対して２５６個のフィルタを適用する畳み込み処理が２回行われて、１／４の画像サイズで、かつ、２５６チャンネルの画像特徴マップＣＭＰが生成される。第３階層においては、この１／４の画像サイズの画像特徴マップＣＭＰに対して、画像サイズを２倍の１／２に拡大し、かつ、チャンネル数を半分の１２８チャンネルにするアップコンボリューション処理が行われる。この結果、第３階層から第２階層に対して、１／２の画像サイズで、かつ、１２８チャンネルの画像特徴マップＣＭＰが出力される。 Furthermore, in the third layer, convolution processing applying 256 filters to the 512-channel image feature map CMP is performed twice to generate an image feature map CMP of 1/4 image size and 256 channels. In the third layer, up-convolution processing is then performed on this image feature map CMP of 1/4 image size to double the image size to 1/2 and halve the number of channels to 128. As a result, an image feature map CMP of 1/2 image size and 128 channels is output from the third layer to the second layer.

ＤＮ２７の第２階層においては、第３階層から受け取った画像特徴マップＣＭＰと、ＥＮ２６の第２階層からスキップレイヤ処理で送信された画像特徴マップＣＭＰとを結合するマージ処理が行われる。第２階層において結合される画像特徴マップＣＭＰは、それぞれ１／２の画像サイズで、かつ、１２８チャンネルである。そのため、第２階層においては、マージ処理によって、１／２の画像サイズで、かつ、２５６チャンネル（１２８＋１２８）の画像特徴マップＣＭＰが生成される。 In the second layer of DN27, merge processing is performed to combine the image feature map CMP received from the third layer and the image feature map CMP transmitted from the second layer of EN26 by skip layer processing. The image feature maps CMP combined in the second layer are each half the image size and 128 channels. Therefore, in the second layer, an image feature map CMP with half the image size and 256 channels (128+128) is generated by the merging process.

さらに、第２階層においては、２５６チャンネルの画像特徴マップＣＭＰに対して１２８個のフィルタを適用する畳み込み処理が２回行われて、１／２の画像サイズで、かつ、１２８チャンネルの画像特徴マップＣＭＰが生成される。第２階層においては、この１／２の画像サイズの画像特徴マップＣＭＰに対して、画像サイズを２倍の１／１に拡大し、かつ、チャンネル数を半分の６４チャンネルにするアップコンボリューション処理が行われる。この結果、最終的に、第２階層から第１階層に対して、１／１の画像サイズで、かつ、６４チャンネルの画像特徴マップＣＭＰが出力される。 Furthermore, in the second layer, convolution processing applying 128 filters to the 256-channel image feature map CMP is performed twice to generate an image feature map CMP of 1/2 image size and 128 channels. In the second layer, up-convolution processing is then performed on this image feature map CMP of 1/2 image size to double the image size to 1/1 and halve the number of channels to 64. As a result, an image feature map CMP of 1/1 image size and 64 channels is finally output from the second layer to the first layer.

ＤＮ２７の第１階層においては、第２階層から受け取った画像特徴マップＣＭＰと、ＥＮ２６の第１階層からスキップレイヤ処理で送信された画像特徴マップＣＭＰとを結合するマージ処理が行われる。第１階層において結合される画像特徴マップＣＭＰは、それぞれ１／１の画像サイズで、かつ、６４チャンネルである。そのため、第１階層においては、マージ処理によって、１／１の画像サイズで、かつ、１２８チャンネル（６４＋６４）の画像特徴マップＣＭＰが生成される。 In the first layer of DN27, merge processing is performed to combine the image feature map CMP received from the second layer and the image feature map CMP transmitted from the first layer of EN26 by skip layer processing. The image feature maps CMP combined in the first hierarchy are each 1/1 image size and 64 channels. Therefore, in the first layer, an image feature map CMP of 1/1 image size and 128 channels (64+64) is generated by the merging process.

さらに、第１階層においては、１２８チャンネルの画像特徴マップＣＭＰに対して６４個のフィルタを適用する畳み込み処理が行われた後、１個のフィルタを適用する畳み込み処理が行われる。これにより、学習用入力画像１７と同じ１／１の画像サイズの最終出力データ２９が生成される。 Furthermore, in the first layer, convolution processing applying 64 filters is performed on the 128-channel image feature map CMP, followed by convolution processing applying one filter. As a result, the final output data 29 is generated with the same 1/1 image size as the learning input image 17.
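The channel and size bookkeeping of the DN 27 follows one repeated rule: up-convolution doubles the image size and halves the channel count, and the merge then adds an equal number of channels from the matching EN 26 layer. The sketch below simply replays that arithmetic; the function name `decoder_trace` and the tuple layout are hypothetical.

```python
def decoder_trace(bottom_channels=1024, bottom_scale=16, num_layers=5):
    """Replay the decoder arithmetic from the lowest encoder layer up to layer 1.

    Each tuple is (decoder layer, image scale, channels after merge,
    channels after the subsequent convolutions).
    """
    channels, scale = bottom_channels, bottom_scale
    trace = []
    for layer in range(num_layers - 1, 0, -1):
        channels //= 2               # up-convolution halves the channel count
        scale //= 2                  # and doubles the image size (1/16 -> 1/8 ...)
        skip_channels = channels     # encoder map of the same layer via the skip process
        merged = channels + skip_channels
        trace.append((layer, f"1/{scale}", merged, channels))
    return trace


for step in decoder_trace():
    print(step)
```

The printed tuples reproduce the figures given in the description: 1024 merged channels at 1/8, 512 at 1/4, 256 at 1/2, and 128 at 1/1.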

ＤＮ２７においては、ＥＮ２６から出力された最小の画像サイズの画像特徴マップＣＭＰの画像サイズを段階的に拡大する。そして、画像特徴マップＣＭＰを拡大しながら、ＥＮ２６において階層毎に抽出された画像特徴マップＣＭＰを結合して最終出力データ２９を生成する。最小の画像サイズの画像特徴マップＣＭＰは、学習用入力画像１７の最も空間周波数が低い大域構造の特徴を表すものである。ＤＮ２７では、この最小の画像サイズの画像特徴マップＣＭＰを拡大することで、大域構造の特徴を拡大しつつ、ＥＮ２６からの画像特徴マップＣＭＰを結合することで、中域構造から微細構造までの特徴を取り込む。 The DN 27 enlarges, step by step, the image size of the smallest image feature map CMP output from the EN 26. While enlarging the image feature map CMP, it combines the image feature maps CMP extracted for each layer in the EN 26 to generate the final output data 29. The image feature map CMP with the smallest image size represents the features of the global structure of the learning input image 17, which has the lowest spatial frequency. By enlarging this smallest image feature map CMP, the DN 27 enlarges the features of the global structure, and by combining the image feature maps CMP from the EN 26, it takes in features ranging from mid-range structures to fine structures.

出力層２８は、最終出力データ２９から、学習用入力画像１７内のクラス毎の領域がセグメンテーションされた本学習用出力画像２５を生成する。 The output layer 28 generates, from the final output data 29 , a main learning output image 25 obtained by segmenting the regions for each class in the learning input image 17 .

図５において、学習装置１０を構成するコンピュータは、ストレージデバイス３０、メモリ３１、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）３２、通信部３３、ディスプレイ３４、および入力デバイス３５を備えている。これらはバスライン３６を介して相互接続されている。 In FIG. 5, the computer that constitutes the learning device 10 includes a storage device 30 , a memory 31 , a CPU (Central Processing Unit) 32 , a communication section 33 , a display 34 and an input device 35 . These are interconnected via bus lines 36 .

ストレージデバイス３０は、学習装置１０を構成するコンピュータに内蔵、またはケーブル、ネットワークを通じて接続されたハードディスクドライブである。もしくはストレージデバイス３０は、ハードディスクドライブを複数台連装したディスクアレイである。ストレージデバイス３０には、オペレーティングシステム等の制御プログラム、各種アプリケーションプログラム、およびこれらのプログラムに付随する各種データ等が記憶されている。なお、ハードディスクドライブに代えてソリッドステートドライブを用いてもよい。 The storage device 30 is a hard disk drive built into the computer constituting the learning device 10 or connected via a cable or network. Alternatively, the storage device 30 is a disk array in which a plurality of hard disk drives are connected. The storage device 30 stores a control program such as an operating system, various application programs, various data associated with these programs, and the like. A solid state drive may be used instead of the hard disk drive.

メモリ３１は、ＣＰＵ３２が処理を実行するためのワークメモリである。ＣＰＵ３２は、ストレージデバイス３０に記憶されたプログラムをメモリ３１へロードして、プログラムにしたがった処理を実行することにより、コンピュータの各部を統括的に制御する。 The memory 31 is a work memory for the CPU 32 to execute processing. The CPU 32 loads a program stored in the storage device 30 into the memory 31 and executes processing according to the program, thereby comprehensively controlling each section of the computer.

通信部３３は、ネットワーク１２を介した各種情報の伝送制御を行うネットワークインターフェースである。ディスプレイ３４は各種画面を表示する。学習装置１０を構成するコンピュータは、各種画面を通じて、入力デバイス３５からの操作指示の入力を受け付ける。入力デバイス３５は、キーボード、マウス、タッチパネル等である。 The communication unit 33 is a network interface that controls transmission of various information via the network 12 . The display 34 displays various screens. The computer that constitutes the learning device 10 receives input of operation instructions from the input device 35 through various screens. The input device 35 is a keyboard, mouse, touch panel, or the like.

図６において、ストレージデバイス３０には、作動プログラム４０が記憶されている。作動プログラム４０は、コンピュータを学習装置１０として機能させるためのアプリケーションプログラムである。すなわち、作動プログラム４０は、本開示の技術に係る「学習装置の作動プログラム」の一例である。ストレージデバイス３０には、画像群４１、モデル群４２、および画像生成条件４３も記憶されている。 In FIG. 6, the storage device 30 stores an operating program 40 . The operating program 40 is an application program that causes the computer to function as the learning device 10 . That is, the operating program 40 is an example of the "learning device operating program" according to the technology of the present disclosure. The storage device 30 also stores an image group 41 , a model group 42 , and image generation conditions 43 .

作動プログラム４０が起動されると、学習装置１０を構成するコンピュータのＣＰＵ３２は、メモリ３１等と協働して、リードライト（以下、ＲＷ（ＲｅａｄＷｒｉｔｅ）と略す）制御部４５、画像生成部４６、仮モデル生成部４７、仮学習部４８、モデル生成部４９、本学習部５０、表示制御部５１、受付部５２、および送信制御部５３として機能する。 When the operation program 40 is started, the CPU 32 of the computer constituting the learning device 10 cooperates with the memory 31 and the like to operate a read/write (hereinafter abbreviated as RW (Read Write)) control unit 45 and an image generation unit 46. , a temporary model generation unit 47 , a temporary learning unit 48 , a model generation unit 49 , a main learning unit 50 , a display control unit 51 , a reception unit 52 , and a transmission control unit 53 .

ＲＷ制御部４５は、ストレージデバイス３０への各種データの記憶、およびストレージデバイス３０内の各種データの読み出しを制御する。ＲＷ制御部４５は、本開示の技術に係る「取得部」の一例である。 The RW control unit 45 controls storage of various data in the storage device 30 and reading of various data in the storage device 30 . The RW control unit 45 is an example of an “acquisition unit” according to the technology of the present disclosure.

画像生成部４６は、画像生成条件４３にしたがって、アノテーション画像１９から仮アノテーション画像１８を生成する。仮モデル生成部４７は、仮モデル１６を生成する。仮学習部４８は、仮学習データ１５を用いて仮モデル１６を学習させ、仮モデル１６を学習済み仮モデル１６Ｔとする前述の仮学習を行う。 The image generator 46 generates the temporary annotation image 18 from the annotation image 19 according to the image generation conditions 43 . The temporary model generation unit 47 generates the temporary model 16 . The temporary learning unit 48 learns the temporary model 16 using the temporary learning data 15, and performs the above-described temporary learning with the temporary model 16 as the learned temporary model 16T.

モデル生成部４９は、学習済み仮モデル１６Ｔからモデル２１を生成する。本学習部５０は、学習データ２０を用いてモデル２１を学習させ、モデル２１を学習済みモデル２１Ｔとする前述の本学習を行う。 The model generator 49 generates the model 21 from the trained temporary model 16T. The main learning unit 50 causes the model 21 to learn using the learning data 20, and performs the above-described main learning with the model 21 as the trained model 21T.

表示制御部５１は、ディスプレイ３４に各種画面を表示する制御を行う。各種画面には、画像生成条件４３を指定するための画像生成条件指定画面６５（図１１参照）等が含まれる。受付部５２は、各種画面を通じたユーザによる各種操作指示を受け付ける。各種操作指示には、画像生成条件指定画面６５を通じた画像生成条件４３のユーザによる指定が含まれる。送信制御部５３は、学習済みモデル２１Ｔを運用装置１１に送信する制御を行う。 The display control unit 51 controls the display of various screens on the display 34 . The various screens include an image generation condition designation screen 65 (see FIG. 11) for designating the image generation conditions 43, and the like. The accepting unit 52 accepts various operation instructions from the user through various screens. Various operation instructions include designation of the image generation condition 43 by the user through the image generation condition designation screen 65 . The transmission control unit 53 controls transmission of the trained model 21T to the operation device 11 .

図７に示すように、画像群４１は、前述の学習用入力画像１７、仮アノテーション画像１８、およびアノテーション画像１９を含む。また、図８に示すように、モデル群４２は、前述の仮モデル１６、学習済み仮モデル１６Ｔ、モデル２１、および学習済みモデル２１Ｔと、これらのモデルの基礎となる基礎モデル６０とを含む。 As shown in FIG. 7, the image group 41 includes the learning input image 17, the temporary annotation image 18, and the annotation image 19 described above. Further, as shown in FIG. 8, the model group 42 includes the above-described provisional model 16, trained provisional model 16T, model 21, and trained model 21T, and a basic model 60 that is the basis of these models.

図９の表６３に示すように、本例においては、仮学習を第１回仮学習と第２回仮学習とに分けて行う。第１回仮学習は、クラス数Ｎ＝２の仮アノテーション画像１８＿１（図２０参照）を用いて行う。第２回仮学習は、第１回仮学習から１つ増やしたクラス数Ｎ＝３（＝Ｍ－１）の仮アノテーション画像１８＿２（図２２参照）を用いて行う。なお、言うまでもないが、本学習は、クラス１～４が指定された、クラス数Ｍ＝４のアノテーション画像１９を用いて行う。 As shown in Table 63 of FIG. 9, in this example, provisional learning is divided into first provisional learning and second provisional learning. The first provisional learning is performed using the temporary annotation image 18_1 (see FIG. 20) with the number of classes N=2. The second provisional learning is performed using the temporary annotation image 18_2 (see FIG. 22) with the number of classes N=3 (=M−1), which is one more than in the first provisional learning. Needless to say, the main learning is performed using the annotation image 19 with the number of classes M=4, in which classes 1 to 4 are designated.

図１０に示すように、画像生成条件４３には、アノテーション画像１９において指定されたクラス１～４のうちのいずれのクラスを統合するかが、仮学習の回毎に登録されている。図１０では、第１回仮学習においてはクラス１～３（分化細胞ＤＣ、未分化細胞ＵＤＣ、死細胞ＤＤＣ）を統合する旨が、第２回仮学習においてはクラス１、２（分化細胞ＤＣ、未分化細胞ＵＤＣ）を統合する旨が、それぞれ登録された例を示している。 As shown in FIG. 10, the image generation conditions 43 register, for each round of provisional learning, which of the classes 1 to 4 designated in the annotation image 19 are to be integrated. FIG. 10 shows an example in which integrating classes 1 to 3 (differentiated cells DC, undifferentiated cells UDC, and dead cells DDC) is registered for the first provisional learning, and integrating classes 1 and 2 (differentiated cells DC and undifferentiated cells UDC) is registered for the second provisional learning.

図１１において、画像生成条件指定画面６５は、学習を開始する際に、表示制御部５１によりディスプレイ３４に表示される。画像生成条件指定画面６５は、第１指定領域６６および第２指定領域６７を有する。第１指定領域６６には、各クラス１～４のうち、第１回仮学習において統合するクラスを指定するためのチェックボックス６８が配されている。チェックボックス６８は、３つのクラスまでしかチェックを入れられないようになっている。第２指定領域６７には、各クラス１～４のうち、第２回仮学習において統合するクラスを指定するためのチェックボックス６９が配されている。チェックボックス６９は、チェックボックス６８においてチェックを入れられた３つのクラスのうち、２つのクラスまでしかチェックを入れられないようになっている。 In FIG. 11, an image generation condition designation screen 65 is displayed on the display 34 by the display control unit 51 when learning is started. The image generation condition designation screen 65 has a first designation area 66 and a second designation area 67. In the first designation area 66, check boxes 68 are arranged for designating, among the classes 1 to 4, the classes to be integrated in the first provisional learning. The check boxes 68 allow at most three classes to be checked. In the second designation area 67, check boxes 69 are arranged for designating, among the classes 1 to 4, the classes to be integrated in the second provisional learning. The check boxes 69 allow at most two classes to be checked, chosen from the three classes checked in the check boxes 68.
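The selection rules of the designation screen can be expressed as a small check. This is only a sketch of the stated constraints for the four-class example (at most three classes in the first round, at most two in the second, the second chosen from the first); the function name `check_image_generation_condition` and the message strings are hypothetical.

```python
def check_image_generation_condition(first_round, second_round, num_classes=4):
    """Return a list of rule violations; an empty list means the selection is accepted."""
    first, second = set(first_round), set(second_round)
    valid = set(range(1, num_classes + 1))
    problems = []
    if not first <= valid or not second <= valid:
        problems.append("unknown class selected")
    if len(first) > 3:
        problems.append("at most three classes may be checked for the first round")
    if len(second) > 2:
        problems.append("at most two classes may be checked for the second round")
    if not second <= first:
        problems.append("second-round classes must be chosen from the first-round classes")
    return problems


accepted = check_image_generation_condition({1, 2, 3}, {1, 2})  # the case shown in FIG. 11
rejected = check_image_generation_condition({1, 2, 3}, {1, 4})  # class 4 was not checked first
```

An actual screen would more likely disable the offending check boxes than report messages afterwards; the list form is used here only to keep the rules visible.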

ユーザは、各指定領域６６、６７のチェックボックス６８、６９を適宜選択し、指定ボタン７０を選択する。受付部５２は、チェックボックス６８、６９の選択状態に応じた画像生成条件４３を受け付ける。すなわち、受付部５２は、本開示の技術に係る「第２受付部」の一例である。図１１では、第１回仮学習においてクラス１～３が、第２回仮学習においてクラス１、２が、それぞれ選択された場合を例示している。なお、キャンセルボタン７１が選択された場合、画像生成条件指定画面６５の表示が消される。 The user selects the check boxes 68 and 69 in the designation areas 66 and 67 as appropriate and then selects the designation button 70. The reception unit 52 receives the image generation conditions 43 according to the selection state of the check boxes 68 and 69. That is, the reception unit 52 is an example of the "second reception unit" according to the technology of the present disclosure. FIG. 11 illustrates a case where classes 1 to 3 are selected for the first provisional learning and classes 1 and 2 are selected for the second provisional learning. When the cancel button 71 is selected, the image generation condition designation screen 65 is dismissed.

受付部５２は、画像生成条件４３をＲＷ制御部４５に出力する。ＲＷ制御部４５は、画像生成条件４３をストレージデバイス３０に記憶する。 The reception unit 52 outputs the image generation conditions 43 to the RW control unit 45 . The RW control unit 45 stores the image generation conditions 43 in the storage device 30 .

以下に示す図１２～図１９は、図１２～図１４が第１回仮学習、図１５～図１７が第２回仮学習、図１８および図１９が本学習に関する内容である。 12 to 19 shown below, FIGS. 12 to 14 are for the first provisional learning, FIGS. 15 to 17 are for the second provisional learning, and FIGS. 18 and 19 are for the main learning.

図１２に示すように、画像生成部４６は、ＲＷ制御部４５がストレージデバイス３０から読み出したアノテーション画像１９および画像生成条件４３を、ＲＷ制御部４５から受け取る。画像生成部４６は、画像生成条件４３にしたがって、アノテーション画像１９から第１回仮学習に用いる仮アノテーション画像１８＿１を生成する。画像生成部４６は、仮アノテーション画像１８＿１をＲＷ制御部４５に出力する。ＲＷ制御部４５は、仮アノテーション画像１８＿１をストレージデバイス３０に記憶する。 As shown in FIG. 12 , the image generation unit 46 receives from the RW control unit 45 the annotation image 19 and the image generation conditions 43 read from the storage device 30 by the RW control unit 45 . The image generator 46 generates the temporary annotation image 18_1 used for the first temporary learning from the annotation image 19 according to the image generation condition 43 . The image generator 46 outputs the temporary annotation image 18_1 to the RW controller 45 . The RW control unit 45 stores the temporary annotation image 18_1 in the storage device 30 .

図１３に示すように、仮モデル生成部４７は、ＲＷ制御部４５がストレージデバイス３０から読み出した基礎モデル６０を、ＲＷ制御部４５から受け取る。仮モデル生成部４７は、基礎モデル６０から第１回仮学習に用いる仮モデル１６＿１を生成する。仮モデル生成部４７は、仮モデル１６＿１をＲＷ制御部４５に出力する。ＲＷ制御部４５は、仮モデル１６＿１をストレージデバイス３０に記憶する。 As shown in FIG. 13 , the temporary model generation unit 47 receives from the RW control unit 45 the basic model 60 read from the storage device 30 by the RW control unit 45 . The temporary model generation unit 47 generates the temporary model 16_1 used for the first temporary learning from the basic model 60 . The temporary model generator 47 outputs the temporary model 16_1 to the RW controller 45 . The RW control unit 45 stores the temporary model 16_1 in the storage device 30 .

図１４に示すように、仮学習部４８は、ＲＷ制御部４５がストレージデバイス３０から読み出した仮学習データ１５＿１および仮モデル１６＿１を、ＲＷ制御部４５から受け取る。仮学習データ１５＿１は、学習用入力画像１７と仮アノテーション画像１８＿１との組である。仮学習部４８は、仮学習データ１５＿１を用いて仮モデル１６＿１を学習させ、仮モデル１６＿１を学習済み仮モデル１６Ｔ＿１とする。仮学習部４８は、学習済み仮モデル１６Ｔ＿１をＲＷ制御部４５に出力する。ＲＷ制御部４５は、学習済み仮モデル１６Ｔ＿１をストレージデバイス３０に記憶する。 As shown in FIG. 14 , the temporary learning unit 48 receives the temporary learning data 15_1 and the temporary model 16_1 read from the storage device 30 by the RW control unit 45 from the RW control unit 45 . The provisional learning data 15_1 is a set of the learning input image 17 and the provisional annotation image 18_1. The provisional learning unit 48 learns the provisional model 16_1 using the provisional learning data 15_1, and sets the provisional model 16_1 as a trained provisional model 16T_1. The temporary learning unit 48 outputs the trained temporary model 16T_1 to the RW control unit 45 . The RW control unit 45 stores the trained temporary model 16T_1 in the storage device 30 .

図１５に示すように、画像生成部４６は、図１２の場合と同じく、ＲＷ制御部４５がストレージデバイス３０から読み出したアノテーション画像１９および画像生成条件４３を、ＲＷ制御部４５から受け取る。画像生成部４６は、画像生成条件４３にしたがって、アノテーション画像１９から第２回仮学習に用いる仮アノテーション画像１８＿２を生成する。画像生成部４６は、仮アノテーション画像１８＿２をＲＷ制御部４５に出力する。ＲＷ制御部４５は、仮アノテーション画像１８＿２をストレージデバイス３０に記憶する。 As shown in FIG. 15, the image generator 46 receives from the RW controller 45 the annotation image 19 and the image generation conditions 43 read from the storage device 30 by the RW controller 45, as in the case of FIG. 12. The image generator 46 generates the temporary annotation image 18_2 used for the second temporary learning from the annotation image 19 according to the image generation conditions 43. The image generator 46 outputs the temporary annotation image 18_2 to the RW controller 45. The RW control unit 45 stores the temporary annotation image 18_2 in the storage device 30.

図１６に示すように、仮モデル生成部４７は、ＲＷ制御部４５がストレージデバイス３０から読み出した学習済み仮モデル１６Ｔ＿１を、ＲＷ制御部４５から受け取る。仮モデル生成部４７は、学習済み仮モデル１６Ｔ＿１から第２回仮学習に用いる仮モデル１６＿２を生成する。仮モデル生成部４７は、仮モデル１６＿２をＲＷ制御部４５に出力する。ＲＷ制御部４５は、仮モデル１６＿２をストレージデバイス３０に記憶する。 As shown in FIG. 16 , the temporary model generation unit 47 receives from the RW control unit 45 the trained temporary model 16T_1 read from the storage device 30 by the RW control unit 45 . The temporary model generation unit 47 generates a temporary model 16_2 to be used for the second temporary learning from the trained temporary model 16T_1. The temporary model generator 47 outputs the temporary model 16_2 to the RW controller 45 . The RW control unit 45 stores the temporary model 16_2 in the storage device 30 .

図１７に示すように、仮学習部４８は、ＲＷ制御部４５がストレージデバイス３０から読み出した仮学習データ１５＿２および仮モデル１６＿２を、ＲＷ制御部４５から受け取る。仮学習データ１５＿２は、学習用入力画像１７と仮アノテーション画像１８＿２との組である。仮学習部４８は、仮学習データ１５＿２を用いて仮モデル１６＿２を学習させ、仮モデル１６＿２を学習済み仮モデル１６Ｔ＿２とする。仮学習部４８は、学習済み仮モデル１６Ｔ＿２をＲＷ制御部４５に出力する。ＲＷ制御部４５は、学習済み仮モデル１６Ｔ＿２をストレージデバイス３０に記憶する。 As shown in FIG. 17, the temporary learning unit 48 receives from the RW control unit 45 the temporary learning data 15_2 and the temporary model 16_2 read from the storage device 30 by the RW control unit 45. The provisional learning data 15_2 is a set of the learning input image 17 and the provisional annotation image 18_2. The temporary learning unit 48 trains the temporary model 16_2 using the temporary learning data 15_2, and sets the temporary model 16_2 as a trained temporary model 16T_2. The temporary learning unit 48 outputs the trained temporary model 16T_2 to the RW control unit 45. The RW control unit 45 stores the trained temporary model 16T_2 in the storage device 30.

このように、画像生成部４６と仮学習部４８は、仮アノテーション画像１８を生成する処理と仮モデル１６を学習済み仮モデル１６Ｔとする処理を、仮アノテーション画像１８のクラス数Ｎを徐々に増やしつつ、かつ仮アノテーション画像１８のクラス数ＮがＭ－１となるまで複数回繰り返す。 In this way, the image generation unit 46 and the temporary learning unit 48 repeat the process of generating the temporary annotation image 18 and the process of turning the temporary model 16 into the trained temporary model 16T a plurality of times, gradually increasing the number of classes N of the temporary annotation image 18 until N reaches M−1.
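The overall sequence (temporary learning with a growing number of classes, then main learning with all M classes, each stage starting from the model trained in the previous one) can be outlined as a loop. This is a schematic sketch: the function name `run_staged_learning`, the `train` callback and the step of one class per round are assumptions for illustration; the description only requires that N increases gradually up to M−1.

```python
def run_staged_learning(base_model, num_classes_m, train, first_n=2):
    """Run temporary learning for N = first_n .. M-1, then main learning with M classes.

    `train(model, n)` is a caller-supplied placeholder that returns the model
    trained with n classes; its result seeds the next stage.
    """
    model = base_model
    history = []
    for n in range(first_n, num_classes_m + 1):
        stage = "main" if n == num_classes_m else "temporary"
        model = train(model, n)  # the trained model becomes the next starting point
        history.append((stage, n))
    return model, history


# Dummy trainer that only records which class counts the model has seen.
final_model, stages = run_staged_learning(
    base_model=["base"], num_classes_m=4, train=lambda model, n: model + [n]
)
print(stages)
```

With M=4 this yields two temporary rounds (N=2 and N=3) followed by the main learning, matching the example in FIG. 9.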

図１８に示すように、モデル生成部４９は、ＲＷ制御部４５がストレージデバイス３０から読み出した学習済み仮モデル１６Ｔ＿２を、ＲＷ制御部４５から受け取る。モデル生成部４９は、学習済み仮モデル１６Ｔ＿２からモデル２１を生成する。モデル生成部４９は、モデル２１をＲＷ制御部４５に出力する。ＲＷ制御部４５は、モデル２１をストレージデバイス３０に記憶する。 As shown in FIG. 18 , the model generation unit 49 receives from the RW control unit 45 the trained temporary model 16T_2 read from the storage device 30 by the RW control unit 45 . The model generator 49 generates the model 21 from the trained temporary model 16T_2. The model generator 49 outputs the model 21 to the RW controller 45 . The RW control unit 45 stores the model 21 in the storage device 30 .

図１９に示すように、本学習部５０は、ＲＷ制御部４５がストレージデバイス３０から読み出した学習データ２０およびモデル２１を、ＲＷ制御部４５から受け取る。本学習部５０は、学習データ２０を用いてモデル２１を学習させ、モデル２１を学習済みモデル２１Ｔとする。本学習部５０は、学習済みモデル２１ＴをＲＷ制御部４５に出力する。ＲＷ制御部４５は、学習済みモデル２１Ｔをストレージデバイス３０に記憶する。 As shown in FIG. 19, the main learning unit 50 receives from the RW control unit 45 the learning data 20 and the model 21 read from the storage device 30 by the RW control unit 45. The main learning unit 50 trains the model 21 using the learning data 20, and sets the model 21 as a trained model 21T. The main learning unit 50 outputs the trained model 21T to the RW control unit 45. The RW control unit 45 stores the trained model 21T in the storage device 30.

仮学習部４８および本学習部５０は、例えばミニバッチデータを用いたミニバッチ学習を仮モデル１６およびモデル２１に行わせる。ミニバッチデータは、学習用入力画像１７と仮アノテーション画像１８、または学習用入力画像１７とアノテーション画像１９とを分割した複数の分割画像（例えば元の画像の１／１００のサイズの枠で分割した１万枚の分割画像）のうちの一部（例えば１００枚）で構成される。仮学習部４８および本学習部５０は、こうしたミニバッチデータを複数組（例えば１００組）作成し、各組を順次仮モデル１６およびモデル２１に与えて学習させる。 The provisional learning unit 48 and the main learning unit 50 cause the provisional model 16 and the model 21 to perform, for example, mini-batch learning using mini-batch data. The mini-batch data consists of a part (for example, 100 images) of a plurality of divided images obtained by dividing the learning input image 17 and the temporary annotation image 18, or the learning input image 17 and the annotation image 19 (for example, 10,000 divided images obtained by dividing with a frame 1/100 the size of the original image). The provisional learning unit 48 and the main learning unit 50 create a plurality of sets (for example, 100 sets) of such mini-batch data and give each set in turn to the provisional model 16 and the model 21 for learning.
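The mini-batch figures quoted above (10,000 divided images, 100 images per set, 100 sets) can be illustrated by grouping tile indices. The sketch assumes that the sets are formed by shuffling the tiles and cutting the list into consecutive chunks, so that no tile appears twice; the description does not say how tiles are assigned to sets, and the function name `make_mini_batches` and the fixed seed are hypothetical.

```python
import random


def make_mini_batches(num_tiles=10_000, batch_size=100, num_batches=100, seed=0):
    """Group tile indices into mini-batches by shuffling and chunking (an assumed scheme)."""
    if batch_size * num_batches > num_tiles:
        raise ValueError("not enough divided images for the requested mini-batches")
    rng = random.Random(seed)
    tile_ids = list(range(num_tiles))
    rng.shuffle(tile_ids)
    return [
        tile_ids[i * batch_size:(i + 1) * batch_size]
        for i in range(num_batches)
    ]


mini_batches = make_mini_batches()
# Each entry lists which divided images (input tile plus its annotation tile)
# would be given to the model in one training step.
print(len(mini_batches), len(mini_batches[0]))
```

Each index stands for a pair of tiles cut from the same position of the learning input image and the corresponding annotation image.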

In FIG. 20, the temporary annotation image 18_1 of this example is an image in which the differentiated cells DC, undifferentiated cells UDC, and dead cells DDC of classes 1 to 3 are integrated into integrated class 1. That is, the temporary annotation image 18_1 is an image in which three classes 1 to 3 out of the four classes 1 to 4 designated in the annotation image 19 are integrated, and which has fewer classes than the annotation image 19.

In FIG. 21, the basic model 60 is composed of the EN 26, the DN 27, and the output layer 28, like the temporary model 16 and the model 21.

出力層２８は、存否確率マップ生成レイヤ８０およびアクティベーションレイヤ８１を有する。存否確率マップ生成レイヤ８０は、ＤＮ２７が出力した最終出力データ２９から、存否確率マップＰＭＰを生成する。存否確率マップＰＭＰには、入力画像内のクラスの存否確率を示す数値が、画素毎に登録されている。存否確率マップＰＭＰの画素は、出力画像の画素と一対一で対応する。存否確率マップ生成レイヤ８０は、２つのクラス分の存否確率マップＰＭＰ１、ＰＭＰ２を生成する。 The output layer 28 has a presence/absence probability map generation layer 80 and an activation layer 81 . A presence/absence probability map generation layer 80 generates a presence/absence probability map PMP from the final output data 29 output by the DN 27 . A numerical value indicating the presence/absence probability of a class in the input image is registered for each pixel in the presence/absence probability map PMP. The pixels of the presence/absence probability map PMP have a one-to-one correspondence with the pixels of the output image. The presence/absence probability map generation layer 80 generates presence/absence probability maps PMP1 and PMP2 for two classes.

The activation layer 81 outputs recognition data AVD based on the presence/absence probability maps PMP1 and PMP2. For a specific pixel of the output image, the activation layer 81 takes the pixel values of the corresponding pixels in the presence/absence probability maps PMP1 and PMP2 and recognizes the class whose pixel value is, for example, the maximum (highest probability) as the class to which the specific pixel belongs. The recognition data AVD output in this way identifies the class to which each pixel of the output image belongs.
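
The per-pixel selection performed by the activation layer can be illustrated with a short sketch. It assumes that the presence/absence probability maps are nested lists of equal size and that classes are numbered from 1 in map order; the function name `recognize_classes` and the tie-breaking rule (the lowest class index wins) are assumptions made here, not details from the disclosure.

```python
def recognize_classes(probability_maps):
    """Return, for every pixel, the 1-based index of the map with the highest value."""
    height = len(probability_maps[0])
    width = len(probability_maps[0][0])
    return [
        [
            # max() returns the first maximal index, so ties go to the lowest class.
            1 + max(range(len(probability_maps)),
                    key=lambda k: probability_maps[k][y][x])
            for x in range(width)
        ]
        for y in range(height)
    ]


pmp1 = [[0.9, 0.2], [0.5, 0.4]]
pmp2 = [[0.1, 0.8], [0.4, 0.6]]
avd = recognize_classes([pmp1, pmp2])  # [[1, 2], [1, 2]]
```

The same function works unchanged for three or four maps, which is why only the map-generating part of the output layer needs to know the number of classes.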

Thus, the output layer 28 of the basic model 60 is for two classes. In the first provisional learning, as shown in FIG. 9, the number of classes of the temporary annotation image 18_1 is N = 2. That is, the basic model 60 already has a configuration corresponding to the number of classes of the temporary annotation image 18_1. The temporary model generation unit 47 therefore outputs the basic model 60 itself as the temporary model 16_1.

Note that the output layer 28 of the basic model 60 is not limited to two classes. The basic model 60 may have an output layer 28 for three classes. When the number of classes of the temporary annotation image 18_1 is N = 2 and the output layer 28 of the basic model 60 is for three classes, the temporary model generation unit 47 replaces the output layer 28 of the basic model 60 with an output layer 28 for two classes.

In FIG. 22, the temporary annotation image 18_2 of this example is an image in which the differentiated cells DC and undifferentiated cells UDC of classes 1 and 2 are integrated into integrated class 2. That is, the temporary annotation image 18_2 is an image in which two classes 1 and 2 out of the four classes 1 to 4 designated in the annotation image 19 are integrated, and which has fewer classes than the annotation image 19.

In FIG. 23, when generating the temporary model 16_2 used for the second provisional learning from the trained temporary model 16T_1 of the first provisional learning, the temporary model generation unit 47 carries over the EN 26 and the DN 27 of the trained temporary model 16T_1 to the temporary model 16_2. That is, the temporary model generation unit 47 generates the temporary model 16_2 used this time by using at least part of the trained temporary model 16T_1 used the previous time.

On the other hand, the temporary model generation unit 47 does not carry over the output layer 28 of the trained temporary model 16T_1 but replaces it with the output layer 28_2. The output layer 28_2 has a presence/absence probability map generation layer 86 and an activation layer 87. The presence/absence probability map generation layer 86 generates presence/absence probability maps PMP1, PMP2, and PMP3 for three classes. The activation layer 87 outputs recognition data AVD from the presence/absence probability maps PMP1 to PMP3 in the same manner as the activation layer 81. That is, the output layer 28_2 is for three classes.

第２回仮学習においては、図９で示したように、仮アノテーション画像１８＿２のクラス数Ｎ＝３である。しかし、学習済み仮モデル１６Ｔ＿１の出力層２８は２つのクラス用である。そこで、仮モデル生成部４７は、２つのクラス用の出力層２８を、３つのクラス用の出力層２８＿２に置き換える。こうすることで、仮モデル生成部４７は、仮アノテーション画像１８＿２のクラス数に応じた構成を有する仮モデル１６＿２を生成する。 In the second temporary learning, as shown in FIG. 9, the number of classes of the temporary annotation image 18_2 is N=3. However, the output layer 28 of the trained temporary model 16T_1 is for two classes. Therefore, the temporary model generation unit 47 replaces the output layer 28 for two classes with the output layer 28_2 for three classes. By doing so, the temporary model generation unit 47 generates a temporary model 16_2 having a configuration corresponding to the number of classes of the temporary annotation image 18_2.
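
The carry-over of the EN 26 and the DN 27 together with the replacement of the output layer can be sketched as below. The class `SegmentationModel`, the function `replace_output_layer`, and the random initialization of the new output layer are hypothetical; the text does not state how the replacement output layer is initialized, so that part is purely an assumption.

```python
import copy
import random
from dataclasses import dataclass


@dataclass
class SegmentationModel:
    encoder: list       # stands in for the filter coefficients of the EN 26
    decoder: list       # stands in for the filter coefficients of the DN 27
    output_layer: list  # one weight vector per class (output layer 28)


def replace_output_layer(trained, num_classes, seed=0):
    """Carry over encoder and decoder; attach a new output layer for num_classes."""
    rng = random.Random(seed)
    in_features = len(trained.output_layer[0])
    # Assumption: the new output layer starts from small random weights.
    new_layer = [[rng.uniform(-0.1, 0.1) for _ in range(in_features)]
                 for _ in range(num_classes)]
    return SegmentationModel(
        encoder=copy.deepcopy(trained.encoder),
        decoder=copy.deepcopy(trained.decoder),
        output_layer=new_layer,
    )


trained_16t_1 = SegmentationModel(encoder=[0.1, 0.2], decoder=[0.3],
                                  output_layer=[[0.0, 0.0], [1.0, 1.0]])
model_16_2 = replace_output_layer(trained_16t_1, num_classes=3)
```

The deep copies keep the trained temporary model untouched, so it can still be stored or inspected after the next-stage model has been derived from it.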

図２４に示すように、仮学習部４８は、第１処理部９０、第１評価部９１、および第１更新部９２を有する。第１処理部９０は、学習用入力画像１７を仮モデル１６に与えて、仮モデル１６から仮学習用出力画像９３を出力させる。第１処理部９０は、仮学習用出力画像９３を第１評価部９１に出力する。 As shown in FIG. 24 , the provisional learning section 48 has a first processing section 90 , a first evaluation section 91 and a first updating section 92 . The first processing unit 90 gives the learning input image 17 to the temporary model 16 and causes the temporary model 16 to output a temporary learning output image 93 . The first processing unit 90 outputs the temporary learning output image 93 to the first evaluation unit 91 .

第１評価部９１は、仮アノテーション画像１８と仮学習用出力画像９３とを比較し、仮モデル１６のクラスの判別精度を評価する。第１評価部９１は、第１損失関数９４を用いて仮モデル１６のクラスの判別精度を評価する。第１損失関数９４は、仮アノテーション画像１８と仮学習用出力画像９３とのクラスの指定の差異の程度を表す関数である。第１損失関数９４の算出値が０に近いほど、仮モデル１６のクラスの判別精度が高いことを示す。第１評価部９１は、第１損失関数９４による仮モデル１６のクラスの判別精度の評価結果を第１更新部９２に出力する。 The first evaluation unit 91 compares the temporary annotation image 18 and the temporary learning output image 93 to evaluate the class discrimination accuracy of the temporary model 16 . The first evaluation unit 91 uses the first loss function 94 to evaluate the class discrimination accuracy of the temporary model 16 . The first loss function 94 is a function representing the degree of difference in class designation between the temporary annotation image 18 and the temporary learning output image 93 . The closer the calculated value of the first loss function 94 to 0, the higher the class discrimination accuracy of the temporary model 16 . The first evaluation unit 91 outputs an evaluation result of the class discrimination accuracy of the temporary model 16 by the first loss function 94 to the first updating unit 92 .

The first updating unit 92 updates the temporary model 16 according to the evaluation result from the first evaluation unit 91. For example, the first updating unit 92 changes the values of the filter coefficients of the EN 26 and the DN 27 of the temporary model 16 by stochastic gradient descent or the like with a learning coefficient. The learning coefficient indicates the range of change in the values of the filter coefficients. That is, the larger the learning coefficient, the larger the change in the filter coefficient values and the greater the degree of update of the temporary model 16.
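
The role of the learning coefficient can be made concrete with the textbook gradient descent step, in which each coefficient moves against its gradient by an amount scaled by the learning coefficient. The function name `sgd_step` and the example numbers are illustrative assumptions; the text only says "stochastic gradient descent or the like".

```python
def sgd_step(coefficients, gradients, learning_coefficient):
    """One plain gradient descent step: w <- w - learning_coefficient * gradient."""
    return [w - learning_coefficient * g for w, g in zip(coefficients, gradients)]


filters = [1.0, 2.0]
grads = [0.5, -1.0]
small_update = sgd_step(filters, grads, learning_coefficient=0.1)
large_update = sgd_step(filters, grads, learning_coefficient=0.5)
```

For the same gradients, `large_update` lies farther from `filters` than `small_update` does, which matches the statement that a larger learning coefficient produces a larger change in the filter coefficients.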

The provisional learning unit 48 repeats the input of the learning input image 17 to the temporary model 16 and the output of the temporary learning output image 93 to the first evaluation unit 91 by the first processing unit 90, the evaluation of the class discrimination accuracy of the temporary model 16 by the first evaluation unit 91, and the update of the temporary model 16 by the first updating unit 92, until the class discrimination accuracy of the temporary model 16 reaches a preset level. The provisional learning unit 48 then outputs the temporary model 16 whose class discrimination accuracy has reached the preset level as the trained temporary model 16T.
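
The process, evaluate, update cycle described above can be sketched as a generic loop. Everything named here is hypothetical: `train_until_level` uses a loss threshold as a stand-in for the "preset level" of discrimination accuracy, the iteration cap is a safeguard added for the sketch, and the toy problem (fitting a single coefficient) merely exercises the loop.

```python
def train_until_level(model, process, evaluate, update, target_loss,
                      max_iterations=1000):
    """Repeat process -> evaluate -> update until the loss is at or below target_loss."""
    for _ in range(max_iterations):
        outputs = process(model)           # role of the first processing unit 90
        loss = evaluate(outputs)           # role of the first evaluation unit 91
        if loss <= target_loss:
            return model, loss
        model = update(model, outputs)     # role of the first updating unit 92
    raise RuntimeError("preset level not reached within max_iterations")


# Toy usage: fit y = w * x to data generated with w = 2.
DATA = [(1.0, 2.0), (2.0, 4.0)]


def process(w):
    return [w * x for x, _ in DATA]


def evaluate(outputs):
    # Mean squared error between outputs and targets.
    return sum((o - t) ** 2 for o, (_, t) in zip(outputs, DATA)) / len(DATA)


def update(w, outputs, learning_coefficient=0.1):
    gradient = sum(2 * (o - t) * x for o, (x, t) in zip(outputs, DATA)) / len(DATA)
    return w - learning_coefficient * gradient


trained_w, final_loss = train_until_level(0.0, process, evaluate, update,
                                          target_loss=1e-6)
```

The main learning unit 50 follows the same cycle with the model 21, the annotation image 19, and the second loss function 109 in place of their provisional counterparts.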

As shown in FIG. 25, the first loss function 94_1 used for the first provisional learning is the sum of the loss function for integrated class 1 (differentiated cells DC, undifferentiated cells UDC, and dead cells DDC), which integrates classes 1 to 3, multiplied by a weighting coefficient WA, and the loss function for class 4 (medium PL) multiplied by a weighting coefficient WB. In this case, the weighting coefficients WA and WB are set to the same value, for example 0.5.

In contrast, as shown in FIG. 26, the first loss function 94_2 used for the second provisional learning is the sum of the loss function for integrated class 2 (differentiated cells DC and undifferentiated cells UDC), which integrates classes 1 and 2, multiplied by a weighting coefficient WC, the loss function for class 3 (dead cells DDC) multiplied by a weighting coefficient WD, and the loss function for class 4 (medium PL) multiplied by the weighting coefficient WB. In this case, the weighting coefficient WB is set to a smaller value than the weighting coefficients WC and WD. For example, the weighting coefficients WC and WD are set to 0.4, and the weighting coefficient WB is set to 0.2.

このように、仮学習部４８は、仮モデル１６のクラスの判別精度の評価に用いる第１損失関数９４を変更する。また、仮学習部４８は、前回と共通するクラスに対する損失関数の重みを、今回新たに出現したクラスに対する損失関数の重みよりも小さくする。図２６の例では、クラス４が「前回と共通するクラス」に対応し、重み付け係数ＷＢが「前回と共通するクラスに対する損失関数の重み」に対応する。また、統合クラス２およびクラス３が「今回新たに出現したクラス」に対応し、重み付け係数ＷＣ、ＷＤが「今回新たに出現したクラスに対する損失関数の重み」に対応する。 In this manner, the temporary learning unit 48 changes the first loss function 94 used to evaluate the class discrimination accuracy of the temporary model 16 . In addition, the provisional learning unit 48 makes the weight of the loss function for the class common to the previous time smaller than the weight of the loss function for the class newly appearing this time. In the example of FIG. 26, class 4 corresponds to "the class common to the previous time", and the weighting factor WB corresponds to "the weight of the loss function for the class common to the previous time". Further, integrated class 2 and class 3 correspond to "the class newly appearing this time", and the weighting coefficients WC and WD correspond to "the weight of the loss function for the class newly appearing this time".
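
The two first loss functions are weighted sums of per-class losses, which a short sketch makes explicit. The weights are the example values from the text (WA = WB = 0.5 in the first provisional learning; WC = WD = 0.4 and WB = 0.2 in the second); the function name `weighted_loss`, the dictionary keys, and the per-class loss values are made-up placeholders.

```python
# Example weights from the text; the keys are hypothetical labels.
FIRST_STAGE_WEIGHTS = {"integrated_class_1": 0.5, "class_4": 0.5}                    # WA, WB
SECOND_STAGE_WEIGHTS = {"integrated_class_2": 0.4, "class_3": 0.4, "class_4": 0.2}   # WC, WD, WB


def weighted_loss(per_class_loss, weights):
    """Sum of each class's loss multiplied by that class's weighting coefficient."""
    return sum(weights[name] * per_class_loss[name] for name in weights)


# Placeholder per-class losses, chosen only to show the arithmetic.
loss_94_1 = weighted_loss({"integrated_class_1": 0.2, "class_4": 0.4},
                          FIRST_STAGE_WEIGHTS)
loss_94_2 = weighted_loss({"integrated_class_2": 0.5, "class_3": 0.5, "class_4": 1.0},
                          SECOND_STAGE_WEIGHTS)
```

In the second stage, class 4 contributes with weight 0.2 while the newly appearing classes contribute with 0.4 each, so an error on a new class moves the total twice as much as the same error on the class already learned.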

図２７において、モデル生成部４９は、第２回仮学習の学習済み仮モデル１６Ｔ＿２から本学習に用いるモデル２１を生成する場合に、学習済み仮モデル１６Ｔ＿２のＥＮ２６およびＤＮ２７を、モデル２１に持ち越す。つまり、モデル生成部４９は、学習済み仮モデル１６Ｔ＿２の少なくとも一部を用いて、モデル２１を生成する。 In FIG. 27, the model generation unit 49 carries over the EN26 and DN27 of the trained temporary model 16T_2 to the model 21 when generating the model 21 used for the main learning from the trained temporary model 16T_2 of the second temporary learning. That is, the model generator 49 generates the model 21 using at least part of the trained temporary model 16T_2.

On the other hand, the model generation unit 49 does not carry over the output layer 28_2 of the trained temporary model 16T_2 but replaces it with the output layer 28_3. The output layer 28_3 has a presence/absence probability map generation layer 101 and an activation layer 102. The presence/absence probability map generation layer 101 generates presence/absence probability maps PMP1, PMP2, PMP3, and PMP4 for four classes. The activation layer 102 outputs recognition data AVD from the presence/absence probability maps PMP1 to PMP4 in the same manner as the activation layers 81 and 87. That is, the output layer 28_3 is for four classes.

In the main learning, as shown in FIG. 9, the number of classes of the annotation image 19 is M = 4. However, the output layer 28_2 of the trained temporary model 16T_2 is for three classes. The model generation unit 49 therefore replaces the output layer 28_2 for three classes with the output layer 28_3 for four classes. In this way, the model generation unit 49 generates the model 21 having a configuration corresponding to the number of classes of the annotation image 19.

図２８に示すように、本学習部５０は、第２処理部１０５、第２評価部１０６、および第２更新部１０７を有する。第２処理部１０５は、仮学習部４８の第１処理部９０と同様に、学習用入力画像１７をモデル２１に与えて、モデル２１から本学習用出力画像２５を出力させる。第２処理部１０５は、本学習用出力画像２５を第２評価部１０６に出力する。 As shown in FIG. 28 , the main learning unit 50 has a second processing unit 105 , a second evaluation unit 106 and a second updating unit 107 . The second processing unit 105 provides the learning input image 17 to the model 21 and causes the model 21 to output the main learning output image 25, like the first processing unit 90 of the provisional learning unit 48 does. The second processing unit 105 outputs the main learning output image 25 to the second evaluation unit 106 .

第２評価部１０６は、仮学習部４８の第１評価部９１と同様に、アノテーション画像１９と本学習用出力画像２５とを比較し、モデル２１のクラスの判別精度を評価する。第２評価部１０６は、第２損失関数１０９を用いてモデル２１のクラスの判別精度を評価する。第２損失関数１０９は、アノテーション画像１９と本学習用出力画像２５とのクラスの指定の差異の程度を表す関数である。第２損失関数１０９の算出値が０に近いほど、モデル２１のクラスの判別精度が高いことを示す。第２評価部１０６は、第２損失関数１０９によるモデル２１のクラスの判別精度の評価結果を第２更新部１０７に出力する。 The second evaluation unit 106 compares the annotation image 19 with the main learning output image 25 and evaluates the class discrimination accuracy of the model 21 in the same manner as the first evaluation unit 91 of the provisional learning unit 48 . The second evaluation unit 106 evaluates the class discrimination accuracy of the model 21 using the second loss function 109 . The second loss function 109 is a function representing the degree of difference in class designation between the annotation image 19 and the main learning output image 25 . The closer the calculated value of the second loss function 109 to 0, the higher the class discrimination accuracy of the model 21 . The second evaluation unit 106 outputs the evaluation result of the class discrimination accuracy of the model 21 by the second loss function 109 to the second updating unit 107 .

Like the first updating unit 92 of the provisional learning unit 48, the second updating unit 107 updates the model 21 according to the evaluation result from the second evaluation unit 106.

本学習部５０は、仮学習部４８と同様に、これら第２処理部１０５によるモデル２１への学習用入力画像１７の入力と第２評価部１０６への本学習用出力画像２５の出力、第２評価部１０６によるモデル２１のクラスの判別精度の評価、および第２更新部１０７によるモデル２１の更新を、モデル２１のクラスの判別精度が予め設定されたレベルとなるまで、繰り返し続ける。そして、本学習部５０は、クラスの判別精度が予め設定されたレベルとなったモデル２１を、学習済みモデル２１Ｔとして出力する。 Similar to the temporary learning unit 48, the main learning unit 50 inputs the learning input image 17 to the model 21 by the second processing unit 105, outputs the main learning output image 25 to the second evaluation unit 106, The evaluation of the class discrimination accuracy of the model 21 by the second evaluation unit 106 and the update of the model 21 by the second update unit 107 are repeated until the class discrimination accuracy of the model 21 reaches a preset level. Then, the main learning unit 50 outputs the model 21 whose class discrimination accuracy reaches a preset level as a trained model 21T.

図２９に示すように、第２損失関数１０９は、クラス１（分化細胞ＤＣ）に対する損失関数に重み付け係数ＷＥを乗算したものと、クラス２（未分化細胞ＵＤＣ）に対する損失関数に重み付け係数ＷＦを乗算したものと、クラス３（死細胞ＤＤＣ）に対する損失関数に重み付け係数ＷＤを乗算したものと、クラス４（培地ＰＬ）に対する損失関数に重み付け係数ＷＢを乗算したものとの合計である。この場合、重み付け係数ＷＢ、ＷＤには、重み付け係数ＷＥ、ＷＦよりも小さい値が設定される。例えば重み付け係数ＷＥ、ＷＦには０．４が、重み付け係数ＷＢ、ＷＤには０．１がそれぞれ設定される。 As shown in FIG. 29, the second loss function 109 is obtained by multiplying the loss function for class 1 (differentiated cells DC) by a weighting factor WE and the loss function for class 2 (undifferentiated cells UDC) by multiplying the weighting factor WF. , the loss function for class 3 (dead cell DDC) multiplied by a weighting factor WD, and the loss function for class 4 (medium PL) multiplied by a weighting factor WB. In this case, the weighting factors WB and WD are set to values smaller than the weighting factors WE and WF. For example, the weighting coefficients WE and WF are set to 0.4, and the weighting coefficients WB and WD are set to 0.1.

このように、本学習部５０は、モデル２１のクラスの判別精度の評価に用いる第２損失関数１０９を、仮モデル１６のクラスの判別精度の評価に用いる第１損失関数９４から変更する。また、本学習部５０は、仮学習と共通するクラスに対する損失関数の重みを、本学習において新たに出現したクラスに対する損失関数の重みよりも小さくする。図２９の例では、クラス３、４が「仮学習と共通するクラス」に対応し、重み付け係数ＷＢ、ＷＤが「仮学習と共通するクラスに対する損失関数の重み」に対応する。また、クラス１、２が「本学習において新たに出現したクラス」に対応し、重み付け係数ＷＥ、ＷＦが「本学習において新たに出現したクラスに対する損失関数の重み」に対応する。 In this manner, the main learning unit 50 changes the second loss function 109 used for evaluating the class discrimination accuracy of the model 21 from the first loss function 94 used for evaluating the class discrimination accuracy of the temporary model 16 . Further, the main learning unit 50 makes the weight of the loss function for the class common to the provisional learning smaller than the weight of the loss function for the class newly appearing in the main learning. In the example of FIG. 29, classes 3 and 4 correspond to "classes shared with temporary learning", and weighting coefficients WB and WD correspond to "loss function weights for classes shared with temporary learning". Classes 1 and 2 correspond to "classes newly appearing in the main learning", and weighting coefficients WE and WF correspond to "loss function weights for classes newly appearing in the main learning".
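
The weighting rule stated above (classes shared with the preceding learning weigh less than newly appearing classes) can be expressed as a simple check. The function name `satisfies_weight_rule` and the class labels are assumptions introduced for this sketch; the weights are the example values given for the second loss function 109.

```python
def satisfies_weight_rule(previous_classes, weights):
    """True if every class shared with the previous stage weighs less than every new class."""
    common = [w for name, w in weights.items() if name in previous_classes]
    new = [w for name, w in weights.items() if name not in previous_classes]
    if not common or not new:
        return True  # nothing to compare
    return max(common) < min(new)


# Classes of the second provisional learning and example weights WE, WF, WD, WB.
SECOND_PROVISIONAL_CLASSES = {"integrated_class_2", "class_3", "class_4"}
MAIN_WEIGHTS = {"class_1": 0.4, "class_2": 0.4, "class_3": 0.1, "class_4": 0.1}

rule_holds = satisfies_weight_rule(SECOND_PROVISIONAL_CLASSES, MAIN_WEIGHTS)
```

Here classes 3 and 4 are shared with the provisional learning and carry 0.1 each, while classes 1 and 2 are new and carry 0.4 each, so the check passes.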

FIG. 30 shows the transition of classes across the first provisional learning, the second provisional learning, and the main learning, using the temporary annotation images 18_1 and 18_2 and the annotation image 19. FIG. 30A shows the temporary annotation image 18_1 with N = 2 classes (integrated class 1 and class 4) used for the first provisional learning. FIG. 30B shows the temporary annotation image 18_2 with N = 3 classes (integrated class 2, class 3, and class 4) used for the second provisional learning. FIG. 30C shows the annotation image 19 with M = 4 classes (classes 1 to 4) used for the main learning. In this way, the first provisional learning, the second provisional learning, and the main learning proceed while the number of classes is increased one at a time.
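
The class transition from two to three to four classes can be reproduced with a small relabeling sketch. The label strings (`C1` to `C4` for classes 1 to 4, `I1` and `I2` for the integrated classes), the 2 x 2 toy annotation, and the function names are hypothetical; only the merge pattern follows the text.

```python
def integrate_classes(annotation, merge_map):
    """Relabel every pixel whose class appears in merge_map; leave the others as they are."""
    return [[merge_map.get(label, label) for label in row] for row in annotation]


def count_classes(annotation):
    """Number of distinct class labels present in the annotation."""
    return len({label for row in annotation for label in row})


ANNOTATION_19 = [["C1", "C2"], ["C3", "C4"]]           # classes 1 to 4
FIRST_MERGE = {"C1": "I1", "C2": "I1", "C3": "I1"}     # classes 1-3 -> integrated class 1
SECOND_MERGE = {"C1": "I2", "C2": "I2"}                # classes 1-2 -> integrated class 2

annotation_18_1 = integrate_classes(ANNOTATION_19, FIRST_MERGE)
annotation_18_2 = integrate_classes(ANNOTATION_19, SECOND_MERGE)
class_counts = [count_classes(a) for a in (annotation_18_1, annotation_18_2, ANNOTATION_19)]
```

Both temporary annotation images are derived from the same annotation image 19, so no additional manual annotation is needed for the provisional stages.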

Next, the operation of the above configuration will be described with reference to the flowchart of FIG. 31. First, when the operation program 40 is started in the learning device 10, the CPU 32 of the learning device 10 functions as the RW control unit 45, the image generation unit 46, the temporary model generation unit 47, the provisional learning unit 48, the model generation unit 49, the main learning unit 50, the display control unit 51, the reception unit 52, and the transmission control unit 53, as shown in FIG. 6.

まず、表示制御部５１により、図１１で示した画像生成条件指定画面６５がディスプレイ３４に表示される（ステップＳＴ１００）。ユーザにより各指定領域６６、６７のチェックボックス６８、６９が選択され、指定ボタン７０が選択された場合、チェックボックス６８、６９の選択状態に応じた画像生成条件４３が、受付部５２において受け付けられる（ステップＳＴ１１０）。画像生成条件４３は、受付部５２からＲＷ制御部４５に出力され、ＲＷ制御部４５によりストレージデバイス３０に記憶される。 First, the image generation condition designation screen 65 shown in FIG. 11 is displayed on the display 34 by the display control unit 51 (step ST100). When the user selects the check boxes 68 and 69 of the designated areas 66 and 67 and selects the designation button 70, the reception unit 52 receives the image generation condition 43 corresponding to the selection state of the check boxes 68 and 69. (Step ST110). The image generation conditions 43 are output from the reception unit 52 to the RW control unit 45 and stored in the storage device 30 by the RW control unit 45 .

図１２で示したように、ＲＷ制御部４５によりアノテーション画像１９および画像生成条件４３がストレージデバイス３０から読み出される（ステップＳＴ１２０）。アノテーション画像１９および画像生成条件４３は、ＲＷ制御部４５から画像生成部４６に出力される。なお、ステップＳＴ１２０は、本開示の技術に係る「取得ステップ」の一例である。 As shown in FIG. 12, the annotation image 19 and the image generation conditions 43 are read from the storage device 30 by the RW control unit 45 (step ST120). The annotation image 19 and the image generation conditions 43 are output from the RW control section 45 to the image generation section 46 . Note that step ST120 is an example of the “acquisition step” according to the technology of the present disclosure.

図２０で示したように、画像生成部４６では、画像生成条件４３にしたがって、アノテーション画像１９から第１回仮学習に用いる仮アノテーション画像１８＿１が生成される（ステップＳＴ１３０）。仮アノテーション画像１８＿１は、画像生成部４６からＲＷ制御部４５に出力され、ＲＷ制御部４５によってストレージデバイス３０に記憶される。なお、ステップＳＴ１３０は、本開示の技術に係る「画像生成ステップ」の一例である。 As shown in FIG. 20, the image generation unit 46 generates a provisional annotation image 18_1 used for the first provisional learning from the annotation image 19 according to the image generation condition 43 (step ST130). The temporary annotation image 18_1 is output from the image generator 46 to the RW controller 45 and stored in the storage device 30 by the RW controller 45 . Note that step ST130 is an example of the "image generation step" according to the technology of the present disclosure.

次いで、図１３で示したように、ＲＷ制御部４５により基礎モデル６０がストレージデバイス３０から読み出される。基礎モデル６０は、ＲＷ制御部４５から仮モデル生成部４７に出力される。 Next, as shown in FIG. 13, the basic model 60 is read from the storage device 30 by the RW control unit 45 . The basic model 60 is output from the RW control unit 45 to the temporary model generation unit 47 .

図２１で示したように、仮モデル生成部４７では、基礎モデル６０から第１回仮学習に用いる仮モデル１６＿１が生成される（ステップＳＴ１４０）。仮モデル１６＿１は、本例においては、基礎モデル６０をそのまま利用したモデルであり、２つのクラス用の出力層２８を有する。仮モデル１６＿１は、仮モデル生成部４７からＲＷ制御部４５に出力され、ＲＷ制御部４５によってストレージデバイス３０に記憶される。 As shown in FIG. 21, the temporary model generation unit 47 generates the temporary model 16_1 used for the first temporary learning from the basic model 60 (step ST140). In this example, the temporary model 16_1 is a model that uses the basic model 60 as it is, and has an output layer 28 for two classes. The temporary model 16_1 is output from the temporary model generation unit 47 to the RW control unit 45 and stored in the storage device 30 by the RW control unit 45 .

図１４で示したように、ＲＷ制御部４５により、学習用入力画像１７と仮アノテーション画像１８＿１の組である仮学習データ１５＿１、および仮モデル１６＿１がストレージデバイス３０から読み出される（ステップＳＴ１５０）。仮学習データ１５＿１および仮モデル１６＿１は、ＲＷ制御部４５から仮学習部４８に出力される。なお、ステップＳＴ１５０は、ステップＳＴ１２０と同じく、本開示の技術に係る「取得ステップ」の一例である。 As shown in FIG. 14, the RW control unit 45 reads the temporary learning data 15_1 and the temporary model 16_1, which are a set of the learning input image 17 and the temporary annotation image 18_1, from the storage device 30 (step ST150). The provisional learning data 15_1 and the provisional model 16_1 are output from the RW control section 45 to the provisional learning section 48 . Note that step ST150, like step ST120, is an example of the "acquisition step" according to the technology of the present disclosure.

In the provisional learning unit 48, provisional learning of the temporary model 16_1 is performed using the temporary learning data 15_1 (step ST160). More specifically, as shown in FIG. 24, the first processing unit 90 gives the learning input image 17 to the temporary model 16_1, and the temporary model 16_1 outputs the temporary learning output image 93. Next, the first evaluation unit 91 compares the temporary annotation image 18_1 with the temporary learning output image 93 and evaluates the class discrimination accuracy of the temporary model 16_1 using the first loss function 94_1 shown in FIG. 25. The first updating unit 92 then updates the temporary model 16_1. The input of the learning input image 17 to the temporary model 16_1 and the output of the temporary learning output image 93 to the first evaluation unit 91 by the first processing unit 90, the evaluation of the class discrimination accuracy of the temporary model 16_1 by the first evaluation unit 91, and the update of the temporary model 16_1 by the first updating unit 92 are repeated until the class discrimination accuracy of the temporary model 16_1 reaches a preset level. The temporary model 16_1 whose class discrimination accuracy has reached the preset level is output from the provisional learning unit 48 to the RW control unit 45 as the trained temporary model 16T_1 and is stored in the storage device 30 by the RW control unit 45. Note that step ST160 is an example of the "provisional learning step" according to the technology of the present disclosure.

Subsequently, as shown in FIGS. 15 and 22, the image generation unit 46 generates the temporary annotation image 18_2 used for the second provisional learning from the annotation image 19 (NO in step ST170, then step ST130). Next, as shown in FIGS. 16 and 23, the temporary model generation unit 47 generates the temporary model 16_2 used for the second provisional learning from the trained temporary model 16T_1 of the first provisional learning (step ST140). The temporary model 16_2 carries over the EN 26 and the DN 27 of the trained temporary model 16T_1 and has the output layer 28_2 for three classes. Then, as shown in FIGS. 17 and 24, the provisional learning unit 48 performs provisional learning of the temporary model 16_2 using the temporary learning data 15_2, which is a set of the learning input image 17 and the temporary annotation image 18_2 (step ST160). The first loss function 94_2 shown in FIG. 26 is used to evaluate the class discrimination accuracy of the temporary model 16_2. The temporary model 16_2 whose class discrimination accuracy has reached the preset level through this second provisional learning is output from the provisional learning unit 48 to the RW control unit 45 as the trained temporary model 16T_2 and is stored in the storage device 30 by the RW control unit 45.

After the second provisional learning ends (YES in step ST170), the RW control unit 45 reads the trained temporary model 16T_2 of the second provisional learning from the storage device 30, as shown in FIG. 18. The trained temporary model 16T_2 is output from the RW control unit 45 to the model generation unit 49.

図２７で示したように、モデル生成部４９では、学習済み仮モデル１６Ｔ＿２から、本学習に用いるモデル２１が生成される（ステップＳＴ１８０）。モデル２１は、学習済み仮モデル１６Ｔ＿２のＥＮ２６およびＤＮ２７が持ち越されたモデルであり、４つのクラス用の出力層２８＿３を有する。モデル２１は、モデル生成部４９からＲＷ制御部４５に出力され、ＲＷ制御部４５によってストレージデバイス３０に記憶される。なお、ステップＳＴ１８０は、本開示の技術に係る「機械学習モデル生成ステップ」の一例である。 As shown in FIG. 27, the model generator 49 generates the model 21 used for the main learning from the trained provisional model 16T_2 (step ST180). The model 21 is a model in which EN26 and DN27 of the trained temporary model 16T_2 are carried over, and has an output layer 28_3 for four classes. The model 21 is output from the model generator 49 to the RW controller 45 and stored in the storage device 30 by the RW controller 45 . Note that step ST180 is an example of the "machine learning model generation step" according to the technology of the present disclosure.

図１９で示したように、ＲＷ制御部４５により、学習用入力画像１７とアノテーション画像１９の組である学習データ２０、およびモデル２１がストレージデバイス３０から読み出される（ステップＳＴ１９０）。学習データ２０およびモデル２１は、ＲＷ制御部４５から本学習部５０に出力される。なお、ステップＳＴ１９０は、ステップＳＴ１２０、ＳＴ１５０と同じく、本開示の技術に係る「取得ステップ」の一例である。 As shown in FIG. 19, the RW control unit 45 reads the learning data 20, which is a set of the learning input image 17 and the annotation image 19, and the model 21 from the storage device 30 (step ST190). The learning data 20 and the model 21 are output from the RW control section 45 to the main learning section 50 . Note that step ST190, like steps ST120 and ST150, is an example of the "acquisition step" according to the technology of the present disclosure.

In the main learning unit 50, main learning of the model 21 is performed using the learning data 20 (step ST200). More specifically, as shown in FIG. 28, the second processing unit 105 gives the learning input image 17 to the model 21, and the model 21 outputs the main learning output image 25. Next, the second evaluation unit 106 compares the annotation image 19 with the main learning output image 25 and evaluates the class discrimination accuracy of the model 21 using the second loss function 109 shown in FIG. 29. The second updating unit 107 then updates the model 21. The input of the learning input image 17 to the model 21 and the output of the main learning output image 25 to the second evaluation unit 106 by the second processing unit 105, the evaluation of the class discrimination accuracy of the model 21 by the second evaluation unit 106, and the update of the model 21 by the second updating unit 107 are repeated until the class discrimination accuracy of the model 21 reaches a preset level. The model 21 whose class discrimination accuracy has reached the preset level is output from the main learning unit 50 to the RW control unit 45 as the trained model 21T and is stored in the storage device 30 by the RW control unit 45. Note that step ST200 is an example of the "main learning step" according to the technology of the present disclosure.

学習済みモデル２１Ｔは、ＲＷ制御部４５によりストレージデバイス３０から読み出されて、ＲＷ制御部４５から送信制御部５３に出力される。学習済みモデル２１Ｔは、送信制御部５３により運用装置１１に送信される（ステップＳＴ２１０）。 The learned model 21T is read from the storage device 30 by the RW control unit 45 and is output from the RW control unit 45 to the transmission control unit 53 . The trained model 21T is transmitted to the operation device 11 by the transmission control unit 53 (step ST210).

運用装置１１では、図２で示したように、入力画像２２が学習済みモデル２１に与えられ、入力画像２２に映る物体のクラスとその輪郭を判別した出力画像２３が、学習済みモデル２１から出力される。出力画像２３は、運用装置１１のディスプレイに表示される等して、ユーザの閲覧に供される。また、出力画像２３は、例えば分化細胞ＤＣの個数の計数といった細胞培養の評価に供される。 In the operation device 11, as shown in FIG. 2, the input image 22 is given to the trained model 21, and the trained model 21 outputs an output image 23 in which the class of each object appearing in the input image 22 and its contour are discriminated. The output image 23 is presented to the user, for example by being displayed on the display of the operation device 11. The output image 23 is also used for evaluating the cell culture, for example by counting the number of differentiated cells DC.

以上説明したように、学習装置１０は、取得部としてのＲＷ制御部４５と、画像生成部４６と、仮学習部４８と、モデル生成部４９と、本学習部５０とを備える。ＲＷ制御部４５は、学習用入力画像１７と、学習用入力画像１７に対して、セマンティックセグメンテーションの対象となる３つ以上のクラスが指定されたアノテーション画像１９との組である学習データ２０を、ストレージデバイス３０から読み出して取得する。画像生成部４６は、アノテーション画像１９において指定された３つ以上のクラスのうちの少なくとも２つのクラスが統合され、アノテーション画像１９よりもクラス数が少ない仮アノテーション画像１８を生成する。仮学習部４８は、学習用入力画像１７と仮アノテーション画像１８との組である仮学習データ１５を用いて、仮アノテーション画像１８のクラス数に応じた構成を有する仮モデル１６を学習させ、仮モデル１６を学習済み仮モデル１６Ｔとする。 As described above, the learning device 10 includes the RW control unit 45 as an acquisition unit, the image generation unit 46, the provisional learning unit 48, the model generation unit 49, and the main learning unit 50. The RW control unit 45 reads from the storage device 30, and thereby acquires, the learning data 20, which is a set of the learning input image 17 and the annotation image 19 in which three or more classes to be subjected to semantic segmentation are designated for the learning input image 17. The image generation unit 46 generates the temporary annotation image 18, in which at least two of the three or more classes designated in the annotation image 19 are integrated and which therefore has fewer classes than the annotation image 19. The provisional learning unit 48 trains the temporary model 16, which has a configuration corresponding to the number of classes of the temporary annotation image 18, using the temporary learning data 15, which is a set of the learning input image 17 and the temporary annotation image 18, and thereby turns the temporary model 16 into the trained temporary model 16T.
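The class integration performed by the image generation unit 46 amounts to relabeling pixels. The following sketch assumes the annotation image is a small grid of integer class labels; the function name `make_provisional_annotation`, the `merge_map` argument, and the choice of label 2 for the integrated class are hypothetical and used only for illustration.

```python
from typing import Dict, List

Image = List[List[int]]  # per-pixel class labels


def make_provisional_annotation(annotation: Image, merge_map: Dict[int, int]) -> Image:
    """Relabel each pixel; classes absent from merge_map keep their original label."""
    return [[merge_map.get(label, label) for label in row] for row in annotation]


# Annotation image with four classes (1 to 4).
annotation = [
    [1, 1, 2],
    [4, 3, 2],
    [4, 4, 3],
]

# Integrate classes 2, 3 and 4 into one class, labelled 2 here by assumption.
merge_classes_2_to_4 = {2: 2, 3: 2, 4: 2}
provisional = make_provisional_annotation(annotation, merge_classes_2_to_4)
n_classes = len({label for row in provisional for label in row})
```

The resulting image has two classes instead of four, so a temporary model with a two-class output is sufficient for this round.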

モデル生成部４９は、学習済み仮モデル１６Ｔの少なくとも一部を用いて、アノテーション画像１９のクラス数に応じた構成を有するモデル２１を生成する。本学習部５０は、学習データ２０を用いてモデル２１を学習させ、モデル２１を学習済みモデル２１Ｔとする。このように、学習装置１０では、アノテーション画像１９において指定されたクラスより少ない数のクラスで、モデル２１を学習させる。したがって、複数のクラスの全てを一度に学習させる場合と比べて、クラスの判別精度が高い学習済みモデル２１Ｔを得ることが可能となる。 The model generation unit 49 uses at least part of the trained temporary model 16T to generate the model 21 having a configuration corresponding to the number of classes of the annotation image 19 . The main learning unit 50 causes the model 21 to learn using the learning data 20, and sets the model 21 as a trained model 21T. In this way, the learning device 10 trains the model 21 with a smaller number of classes than the classes specified in the annotation image 19 . Therefore, it is possible to obtain a trained model 21T with high class discrimination accuracy compared to the case where all of a plurality of classes are learned at once.

学習装置１０はさらに、仮モデル１６を生成する仮モデル生成部４７を備える。したがって、ユーザの手を煩わすことなく仮モデル１６を生成することができる。 The learning device 10 further includes a temporary model generator 47 that generates the temporary model 16 . Therefore, the temporary model 16 can be generated without bothering the user.

アノテーション画像１９のクラス数をＭ、仮アノテーション画像１８のクラス数をＮとした場合、画像生成部４６と仮学習部４８は、仮アノテーション画像１８を生成する処理と仮モデル１６を学習済み仮モデル１６Ｔとする処理を、仮アノテーション画像１８のクラス数Ｎを徐々に増やしつつ、かつ仮アノテーション画像１８のクラス数ＮがＭ－１となるまで複数回繰り返す。したがって、仮学習を複数段階に分けて細かく行うことができ、よりクラスの判別精度が高いモデル２１を得ることが可能となる。 When the number of classes of the annotation image 19 is M and the number of classes of the temporary annotation image 18 is N, the image generation unit 46 and the provisional learning unit 48 repeat the processing of generating the temporary annotation image 18 and the processing of turning the temporary model 16 into the trained temporary model 16T multiple times, gradually increasing N until N reaches M-1. Therefore, provisional learning can be divided into a plurality of fine-grained stages, and a model 21 with higher class discrimination accuracy can be obtained.

仮モデル生成部４７は、前回用いた仮モデル１６の少なくとも一部を用いて、今回用いる仮モデル１６を生成する。したがって、前回の仮学習の成果を、今回の仮学習、ひいては本学習に取り込むことができる。 The temporary model generating unit 47 generates the temporary model 16 to be used this time by using at least part of the temporary model 16 used last time. Therefore, the results of the previous provisional learning can be incorporated into the current provisional learning and, by extension, the main learning.
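The staged flow, in which N grows toward M-1 and each round's temporary model is built from the previous one, can be sketched as below. The dictionary model with a `"shared"` part and a class-dependent `"head"`, the step of one class per round, and the placeholder training update are all assumptions for illustration; the disclosure only states that at least part of the previous model is reused.

```python
import copy
from typing import Dict, List

Model = Dict[str, List[float]]


def next_model(previous: Model, n_classes: int) -> Model:
    """Reuse the shared part of the previous model; rebuild only the class-dependent head."""
    return {
        "shared": copy.deepcopy(previous["shared"]),  # carried over from the last round
        "head": [0.0] * n_classes,                    # sized for the new class count
    }


def staged_class_counts(m: int) -> List[int]:
    """Class counts N for the provisional rounds: 2, 3, ..., M-1 (step of 1 is assumed)."""
    return list(range(2, m))


M = 4
model: Model = {"shared": [0.1, 0.2, 0.3], "head": [0.0, 0.0]}  # stand-in starting model
history = []
for n in staged_class_counts(M):
    model = next_model(model, n)
    # Provisional learning with an N-class temporary annotation image would run here;
    # the line below is only a placeholder for a training update.
    model["shared"] = [w + 0.01 for w in model["shared"]]
    history.append((n, len(model["head"])))

# Model for main learning, sized for all M classes and seeded with the learned shared part.
final_model = next_model(model, M)
```

With M = 4 this runs two provisional rounds (N = 2 and N = 3) before the four-class model is generated.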

仮学習部４８は、仮モデル１６のクラスの判別精度の評価に用いる第１損失関数９４を変更する。具体的には、仮学習部４８は、前回と共通するクラスに対する損失関数の重みを、今回新たに出現したクラスに対する損失関数の重みよりも小さくする。前回と共通するクラスは、前回既に仮学習済みである。対して今回新たに出現したクラスは、今回はじめて仮学習する。このため、前回と共通するクラスに対する損失関数の重みを、今回新たに出現したクラスに対する損失関数の重みよりも小さくすれば、前回と共通するクラスに比べて、今回新たに出現したクラスを重点的に仮学習することができる。 The provisional learning unit 48 changes the first loss function 94 used to evaluate the class discrimination accuracy of the temporary model 16. Specifically, the provisional learning unit 48 makes the loss function weight for a class shared with the previous round smaller than the loss function weight for a class that newly appears in the current round. A class shared with the previous round has already been provisionally learned in that round, whereas a newly appearing class is provisionally learned for the first time in the current round. Therefore, if the loss function weight for the shared classes is made smaller than that for the newly appearing classes, provisional learning can focus on the newly appearing classes rather than on the shared ones.
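The weighting rule can be illustrated with a per-class weighted sum of losses. The weight values 0.5 and 1.0, the class names, and the per-class loss values below are invented for illustration; the disclosure only requires that shared classes receive a smaller weight than newly appearing ones.

```python
from typing import Dict, Set


def class_weights(
    current: Set[str], previous: Set[str], w_common: float = 0.5, w_new: float = 1.0
) -> Dict[str, float]:
    """Give classes already seen in the previous round a smaller weight than new ones."""
    if not w_common < w_new:
        raise ValueError("w_common must be smaller than w_new")
    return {c: (w_common if c in previous else w_new) for c in current}


def weighted_loss(per_class_loss: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-class losses."""
    return sum(weights[c] * per_class_loss[c] for c in per_class_loss)


# Round 1 classes: differentiated cells DC and one integrated class.
previous = {"DC", "merged_2_to_4"}
# Round 2 classes: DC is shared; PL and the smaller integrated class are new.
current = {"DC", "merged_2_3", "PL"}

weights = class_weights(current, previous)
total = weighted_loss({"DC": 0.2, "merged_2_3": 0.4, "PL": 0.3}, weights)
```

Here the shared class DC contributes only half of its loss, so errors on the two new classes dominate the total.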

画像生成部４６は、予め指定された画像生成条件４３にしたがって仮アノテーション画像１８を生成する。したがって、仮アノテーション画像１８を容易に生成することができる。 The image generator 46 generates the temporary annotation image 18 according to the image generation conditions 43 specified in advance. Therefore, the temporary annotation image 18 can be easily generated.

また、学習装置１０は、画像生成条件４３のユーザによる指定を受け付ける第２受付部としての受付部５２を備える。したがって、ユーザの考えを反映させた仮アノテーション画像１８を生成することができ、ユーザの指定通りに仮学習を行うことができる。 The learning device 10 also includes a reception unit 52 as a second reception unit that receives designation of the image generation condition 43 by the user. Therefore, it is possible to generate the provisional annotation image 18 reflecting the user's thoughts, and provisional learning can be performed as specified by the user.

なお、第１損失関数９４の種類を変更してもよい。例えばダイス係数を用いた第１損失関数９４と、二乗誤差を用いた第１損失関数９４とを選択的に用いる。同様に、第１損失関数９４と第２損失関数１０９の種類を変更してもよい。 Note that the type of the first loss function 94 may be changed. For example, the first loss function 94 using the Dice coefficient and the first loss function 94 using the squared error are selectively used. Similarly, the types of first loss function 94 and second loss function 109 may be changed.
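Selecting between a Dice-coefficient loss and a squared-error loss can be sketched as follows. The formulas are the common textbook forms (soft Dice loss and mean squared error) applied to flattened per-pixel scores for one class; the selector `evaluate`, the `LOSSES` table, and the smoothing term `eps` are assumptions and are not described in the disclosure.

```python
from typing import Callable, Dict, Sequence


def dice_loss(pred: Sequence[float], target: Sequence[float], eps: float = 1e-7) -> float:
    """1 - Dice coefficient; eps avoids division by zero when both inputs are empty."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def squared_error_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error over all pixels."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


LOSSES: Dict[str, Callable[[Sequence[float], Sequence[float]], float]] = {
    "dice": dice_loss,
    "squared_error": squared_error_loss,
}


def evaluate(kind: str, pred: Sequence[float], target: Sequence[float]) -> float:
    """Evaluate with the selected loss type."""
    return LOSSES[kind](pred, target)


target = [1.0, 0.0, 1.0, 0.0]
perfect = evaluate("dice", target, target)
half_dice = evaluate("dice", [1.0, 1.0, 0.0, 0.0], target)
half_mse = evaluate("squared_error", [1.0, 1.0, 0.0, 0.0], target)
```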

ここで、細胞培養の分野は、ｉＰＳ（ＩｎｄｕｃｅｄＰｌｕｒｉｐｏｔｅｎｔＳｔｅｍ）細胞等の出現により、最近脚光を浴びている。このため、細胞画像内の細胞のクラスの判別をより正確に行う技術が要望されている。本開示の技術では、培養中の複数の細胞を撮影した細胞画像を入力画像２２としている。したがって、本開示の技術は、最近の要望に応えることができる技術であるといえる。 Here, the field of cell culture has recently been in the limelight due to the emergence of iPS (Induced Pluripotent Stem) cells and the like. Therefore, there is a demand for a technique for more accurately discriminating the class of cells in a cell image. In the technology of the present disclosure, the input image 22 is a cell image obtained by photographing a plurality of cells in culture. Therefore, it can be said that the technique of the present disclosure is a technique that can meet recent demands.

［第２実施形態］ [Second Embodiment]
図３２～図３４に示す第２実施形態では、仮学習部４８による仮学習の処理および本学習部５０による本学習の処理において、予め指定された部分を更新しない。 In the second embodiment shown in FIGS. 32 to 34, portions designated in advance are not updated in the provisional learning process by the provisional learning unit 48 or in the main learning process by the main learning unit 50.

図３２は、表示制御部５１によりディスプレイ３４に表示される非更新部分指定画面１２０を示す。非更新部分指定画面１２０は、仮モデル１６およびモデル２１の更新しない部分（以下、非更新部分という）のユーザによる指定を受け付けるための画面である。非更新部分指定画面１２０は、第１指定領域１２１および第２指定領域１２２を有する。第１指定領域１２１には、ＥＮ２６の各階層または全階層を非更新部分として指定するためのチェックボックス１２３が配されている。第２指定領域１２２には、ＤＮ２７の各階層または全階層を非更新部分として指定するためのチェックボックス１２４が配されている。 FIG. 32 shows a non-updated portion designation screen 120 displayed on the display 34 by the display control unit 51. The non-updated portion designation screen 120 is a screen for receiving the user's designation of portions of the temporary model 16 and the model 21 that are not to be updated (hereinafter referred to as non-updated portions). The non-updated portion designation screen 120 has a first designation area 121 and a second designation area 122. Check boxes 123 for designating each layer or all layers of the EN 26 as non-updated portions are arranged in the first designation area 121. Check boxes 124 for designating each layer or all layers of the DN 27 as non-updated portions are arranged in the second designation area 122.

ユーザは、各指定領域１２１、１２２のチェックボックス１２３、１２４を適宜選択し、指定ボタン１２５を選択する。受付部５２は、チェックボックス１２３、１２４の選択状態に応じた非更新部分指定条件１２７を受け付ける。すなわち、受付部５２は、本開示の技術に係る「第１受付部」の一例である。図３２では、ＥＮ２６の全階層が非更新部分として選択された場合を例示している。なお、キャンセルボタン１２６が選択された場合、非更新部分指定画面１２０の表示が消される。 The user selects the check boxes 123 and 124 in the designation areas 121 and 122 as appropriate and then selects the designation button 125. The reception unit 52 receives a non-updated portion designation condition 127 corresponding to the selection state of the check boxes 123 and 124. That is, the reception unit 52 is an example of the "first reception unit" according to the technology of the present disclosure. FIG. 32 illustrates a case where all layers of the EN 26 are selected as the non-updated portion. Note that when the cancel button 126 is selected, the non-updated portion designation screen 120 is closed.

受付部５２は、非更新部分指定条件１２７をＲＷ制御部４５に出力する。ＲＷ制御部４５は、非更新部分指定条件１２７をストレージデバイス３０に記憶する。 The receiving unit 52 outputs the non-updated portion designation condition 127 to the RW control unit 45 . The RW control unit 45 stores the non-updated portion designation condition 127 in the storage device 30 .

図３３に示すように、非更新部分指定条件１２７は、ＲＷ制御部４５によりストレージデバイス３０から読み出されて、ＲＷ制御部４５から第１更新部９２に出力される。第１更新部９２は、非更新部分指定条件１２７にしたがって、仮モデル１６（ここでは第１回仮学習に用いる仮モデル１６＿１）の非更新部分を更新しない。 As shown in FIG. 33 , the non-updated portion designation condition 127 is read from the storage device 30 by the RW control unit 45 and output from the RW control unit 45 to the first updating unit 92 . The first update unit 92 does not update the non-updated portion of the temporary model 16 (here, the temporary model 16_1 used for the first temporary learning) according to the non-updated portion specifying condition 127 .

また、図３４に示すように、非更新部分指定条件１２７は、第２更新部１０７にも出力される。第２更新部１０７は、非更新部分指定条件１２７にしたがって、モデル２１の非更新部分を更新しない。 In addition, as shown in FIG. 34, the non-updated portion designation condition 127 is also output to the second update unit 107. The second update unit 107 does not update the non-updated portion of the model 21, in accordance with the non-updated portion designation condition 127.

図３３および図３４では、図３２の場合と同じく、ＥＮ２６の全階層が非更新部分として指定された場合を例示している。この場合、第１更新部９２および第２更新部１０７は、仮モデル１６およびモデル２１のＤＮ２７は更新するが、ＥＮ２６は更新しない。 FIGS. 33 and 34 illustrate, as in FIG. 32, the case where all layers of the EN 26 are designated as the non-updated portion. In this case, the first update unit 92 and the second update unit 107 update the DN 27 of the temporary model 16 and the model 21, but do not update the EN 26.
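Skipping the non-updated portion during an update step can be sketched as follows. Parameters are grouped under the hypothetical keys `"EN"` and `"DN"`, and a plain gradient-descent step with learning rate `lr` is assumed; the function name `apply_update` and the numeric values are illustrative only.

```python
from typing import Dict, List, Set

Params = Dict[str, List[float]]


def apply_update(params: Params, grads: Params, frozen: Set[str], lr: float = 0.1) -> Params:
    """Apply one gradient step, leaving every group named in `frozen` unchanged."""
    return {
        name: list(values)
        if name in frozen
        else [v - lr * g for v, g in zip(values, grads[name])]
        for name, values in params.items()
    }


params = {"EN": [1.0, 2.0], "DN": [3.0, 4.0]}
grads = {"EN": [0.5, 0.5], "DN": [1.0, -1.0]}

# All layers of the EN are designated as the non-updated portion.
updated = apply_update(params, grads, frozen={"EN"})
```

The EN values are returned unchanged, while the DN values move by the gradient step.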

このように、第２実施形態では、受付部５２は、非更新部分のユーザによる指定を受け付ける。仮学習部４８と本学習部５０は、仮モデル１６を学習済み仮モデル１６Ｔとする処理とモデル２１を学習済みモデル２１Ｔとする処理において、非更新部分を更新しない。こうすれば、他の学習装置で学習されたモデルの一部を、大元の基礎モデル６０に転用する、いわゆる転移学習を行った場合に、基礎モデル６０に転用した、他の学習装置で学習された一部を非更新部分として指定して更新させなくすることができる。したがって、転移学習の成果を効果的に取り込むことができる。 Thus, in the second embodiment, the reception unit 52 receives the user's designation of the non-updated portion. The provisional learning unit 48 and the main learning unit 50 do not update the non-updated portion in the processing of turning the temporary model 16 into the trained temporary model 16T or in the processing of turning the model 21 into the trained model 21T. In this way, when so-called transfer learning is performed, in which part of a model trained by another learning device is reused in the original basic model 60, the part that was trained by the other learning device and reused in the basic model 60 can be designated as a non-updated portion so that it is not updated. Therefore, the results of transfer learning can be incorporated effectively.

［第３実施形態］ [Third Embodiment]
図３５に示す第３実施形態では、画像生成条件４３を、仮アノテーション画像１８の各クラスの面積が偏らないような内容とする。 In the third embodiment shown in FIG. 35, the image generation condition 43 is set so that the areas of the classes in the temporary annotation image 18 are not biased.

図３５において、第３実施形態では、面積情報１３０がストレージデバイス３０に記憶される。面積情報１３０には、アノテーション画像１９における面積比率がクラス毎に登録されている。面積比率は、認定データＡＶＤに基づいて、各クラスに属する画素の個数を計数し、計数した画素の個数をアノテーション画像１９の全画素の個数で除算した値である。図３５では、クラス１の分化細胞ＤＣの面積比率が４６％、クラス２の未分化細胞ＵＤＣの面積比率が１１％、クラス３の死細胞ＤＤＣの面積比率が６％、クラス４の培地ＰＬの面積比率が３７％の場合を例示している。 In FIG. 35, area information 130 is stored in the storage device 30 in the third embodiment. In the area information 130, the area ratio in the annotation image 19 is registered for each class. The area ratio is obtained by counting the number of pixels belonging to each class based on the recognition data AVD and dividing that count by the total number of pixels of the annotation image 19. FIG. 35 illustrates a case where the area ratio of class 1 differentiated cells DC is 46%, the area ratio of class 2 undifferentiated cells UDC is 11%, the area ratio of class 3 dead cells DDC is 6%, and the area ratio of class 4 medium PL is 37%.

この場合の画像生成条件４３は、第１回仮学習においてはクラス２～４（未分化細胞ＵＤＣ、死細胞ＤＤＣ、培地ＰＬ）を統合する旨が、第２回仮学習においてはクラス２、３（未分化細胞ＵＤＣ、死細胞ＤＤＣ）を統合する旨が、それぞれ登録されたものとなる。第１回仮学習においては、クラス１の分化細胞ＤＣの面積比率が４６％、クラス２～４の統合クラスの面積比率が１１＋６＋３７＝５４％であり、他の３つのクラスを統合した場合と比べて面積比率が偏っていない。また、第２回仮学習においては、クラス１の分化細胞ＤＣの面積比率が４６％、クラス４の培地ＰＬの面積比率が３７％、クラス２、３の統合クラスの面積比率が１１＋６＝１７％であり、クラス３、４、またはクラス２、４を統合した場合と比べて面積比率が偏っていない。 The image generation condition 43 in this case registers that classes 2 to 4 (undifferentiated cells UDC, dead cells DDC, medium PL) are integrated in the first provisional learning and that classes 2 and 3 (undifferentiated cells UDC, dead cells DDC) are integrated in the second provisional learning. In the first provisional learning, the area ratio of class 1 differentiated cells DC is 46% and the area ratio of the integrated class of classes 2 to 4 is 11 + 6 + 37 = 54%, so the area ratios are less biased than when any other three classes are integrated. In the second provisional learning, the area ratio of class 1 differentiated cells DC is 46%, the area ratio of class 4 medium PL is 37%, and the area ratio of the integrated class of classes 2 and 3 is 11 + 6 = 17%, so the area ratios are less biased than when classes 3 and 4, or classes 2 and 4, are integrated.
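The comparison of candidate merges by area can be checked numerically. In the sketch below, the bias of a candidate is measured as the difference between the largest and smallest area ratio after merging; this max-min spread, and the function names `merged_ratios` and `most_balanced`, are assumptions made for illustration, since the disclosure does not define a specific measure of bias.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def merged_ratios(area: Dict[int, int], group: Tuple[int, ...]) -> List[int]:
    """Area ratios after merging `group` into one class; other classes stay separate."""
    rest = [ratio for cls, ratio in area.items() if cls not in group]
    return rest + [sum(area[cls] for cls in group)]


def most_balanced(area: Dict[int, int], candidates: Sequence[Tuple[int, ...]]) -> Tuple[int, ...]:
    """Pick the candidate merge with the smallest max-min spread of area ratios."""
    def spread(group: Tuple[int, ...]) -> int:
        ratios = merged_ratios(area, group)
        return max(ratios) - min(ratios)

    return min(candidates, key=spread)


# Area ratios in percent, taken from the FIG. 35 example.
area = {1: 46, 2: 11, 3: 6, 4: 37}

round1 = most_balanced(area, list(combinations(area, 3)))  # merge three classes
round2 = most_balanced(area, list(combinations(area, 2)))  # merge two classes
```

Under this measure the first round merges classes 2 to 4 (46% versus 54%) and the second round merges classes 2 and 3 (46%, 37%, 17%), which agrees with the registered image generation condition 43.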

このように、第３実施形態では、画像生成条件４３は、仮アノテーション画像１８の各クラスの面積が偏らないような内容である。したがって、仮アノテーション画像１８の各クラスの仮学習の負荷を平均化することができる。 Thus, in the third embodiment, the image generation condition 43 is such that the area of each class of the temporary annotation image 18 is not biased. Therefore, the temporary learning load of each class of the temporary annotation image 18 can be averaged.

［第４実施形態］ [Fourth Embodiment]
図３６に示す第４実施形態では、画像生成条件４３を、仮アノテーション画像１８の各クラスの複雑度が偏らないような内容とする。 In the fourth embodiment shown in FIG. 36, the image generation condition 43 is set so that the complexities of the classes in the temporary annotation image 18 are not biased.

図３６において、第４実施形態では、複雑度情報１３５がストレージデバイス３０に記憶される。複雑度情報１３５には、アノテーション画像１９における複雑度がクラス毎に登録されている。複雑度は、各クラスの面積、および／または、各クラスの境界線のジグザグの隣り合う山同士のピッチ等に応じて設定されたレベルの値である。複雑度のレベルの値が大きいほど、当該クラスが複雑であることを示している。図３６では、クラス１の分化細胞ＤＣの複雑度がレベル５、クラス２の未分化細胞ＵＤＣの複雑度がレベル６、クラス３の死細胞ＤＤＣの複雑度がレベル７、クラス４の培地ＰＬの複雑度がレベル４の場合を例示している。 In FIG. 36, complexity information 135 is stored in the storage device 30 in the fourth embodiment. In the complexity information 135, the complexity in the annotation image 19 is registered for each class. The complexity is a level value set according to, for example, the area of each class and/or the pitch between adjacent peaks of the zigzag boundary line of each class. A larger complexity level indicates a more complex class. FIG. 36 illustrates a case where the complexity of class 1 differentiated cells DC is level 5, the complexity of class 2 undifferentiated cells UDC is level 6, the complexity of class 3 dead cells DDC is level 7, and the complexity of class 4 medium PL is level 4.

この場合の画像生成条件４３は、第１回仮学習においてはクラス２～４（未分化細胞ＵＤＣ、死細胞ＤＤＣ、培地ＰＬ）を統合する旨が、第２回仮学習においてはクラス３、４（死細胞ＤＤＣ、培地ＰＬ）を統合する旨が、それぞれ登録されたものとなる。第１回仮学習においては、クラス１の分化細胞ＤＣの複雑度がレベル５、クラス２～４の統合クラスの複雑度がレベル５．７（≒（６＋７＋４）／３）であり、他の３つのクラスを統合した場合と比べて複雑度が偏っていない。また、第２回仮学習においては、クラス１の分化細胞ＤＣの複雑度がレベル５、クラス２の未分化細胞ＵＤＣの複雑度がレベル６、クラス３、４の統合クラスの複雑度がレベル５．５（＝（７＋４）／２）であり、クラス２、３、またはクラス２、４を統合した場合と比べて複雑度が偏っていない。 The image generation condition 43 in this case registers that classes 2 to 4 (undifferentiated cells UDC, dead cells DDC, medium PL) are integrated in the first provisional learning and that classes 3 and 4 (dead cells DDC, medium PL) are integrated in the second provisional learning. In the first provisional learning, the complexity of class 1 differentiated cells DC is level 5 and the complexity of the integrated class of classes 2 to 4 is level 5.7 (≈ (6 + 7 + 4) / 3), so the complexities are less biased than when other three classes are integrated. In the second provisional learning, the complexity of class 1 differentiated cells DC is level 5, the complexity of class 2 undifferentiated cells UDC is level 6, and the complexity of the integrated class of classes 3 and 4 is level 5.5 (= (7 + 4) / 2), so the complexities are less biased than when classes 2 and 3, or classes 2 and 4, are integrated.
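The complexity figures above can be reproduced by averaging the levels of the integrated classes, as the text does. The sketch below also compares the two-class merges of the second round using a max-min spread, which is an assumed measure of bias; the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def levels_after_merge(levels: Dict[int, int], group: Tuple[int, ...]) -> List[float]:
    """Complexity levels after merging `group`; the merged level is the mean of its members."""
    rest = [float(level) for cls, level in levels.items() if cls not in group]
    return rest + [mean(levels[cls] for cls in group)]


def spread(values: Sequence[float]) -> float:
    """Difference between the largest and smallest level (assumed bias measure)."""
    return max(values) - min(values)


# Complexity levels taken from the FIG. 36 example.
complexity = {1: 5, 2: 6, 3: 7, 4: 4}

round1_levels = levels_after_merge(complexity, (2, 3, 4))  # class 1 vs. merged classes 2 to 4
round2_best = min(
    combinations(complexity, 2),
    key=lambda group: spread(levels_after_merge(complexity, group)),
)
```

The merged level for classes 2 to 4 is 17/3, about 5.7, and among the two-class merges the pair (3, 4) gives the smallest spread (levels 5, 6 and 5.5).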

このように、第４実施形態では、画像生成条件４３は、仮アノテーション画像１８の各クラスの複雑度が偏らないような内容である。したがって、上記第３実施形態の場合と同じく、仮アノテーション画像１８の各クラスの仮学習の負荷を平均化することができる。 Thus, in the fourth embodiment, the image generation condition 43 is such that the complexity of each class of the temporary annotation image 18 is balanced. Therefore, as in the case of the third embodiment, the temporary learning load of each class of the temporary annotation image 18 can be averaged.

なお、図３７に示すように、仮アノテーション画像１８の各クラスの面積が偏らないような内容の上記第３実施形態の画像生成条件４３とするか、仮アノテーション画像１８の各クラスの複雑度が偏らないような内容の上記第４実施形態の画像生成条件４３とするかのユーザによる選択指示を受け付けてもよい。 As shown in FIG. 37, a user's selection instruction may be received as to whether to use the image generation condition 43 of the third embodiment, under which the areas of the classes of the temporary annotation image 18 are not biased, or the image generation condition 43 of the fourth embodiment, under which the complexities of the classes of the temporary annotation image 18 are not biased.

図３７において、画像生成条件指定画面１４０には、画像生成条件４３を、仮アノテーション画像１８の各クラスの面積が偏らないような内容とするか、仮アノテーション画像１８の各クラスの複雑度が偏らないような内容とするかを、択一的に選択するためのラジオボタン１４１が設けられている。ユーザは、ラジオボタン１４１を選択して指定ボタン１４２を選択する。受付部５２は、画像生成条件４３を、仮アノテーション画像１８の各クラスの面積が偏らないような内容とするという選択指示、または、画像生成条件４３を、仮アノテーション画像１８の各クラスの複雑度が偏らないような内容とするという選択指示を受け付ける。すなわち、受付部５２は、本開示の技術に係る「第３受付部」の一例である。なお、キャンセルボタン１４３が選択された場合、画像生成条件指定画面１４０の表示が消される。 In FIG. 37, the image generation condition designation screen 140 is provided with radio buttons 141 for selecting one of two options: setting the image generation condition 43 so that the areas of the classes of the temporary annotation image 18 are not biased, or so that the complexities of the classes of the temporary annotation image 18 are not biased. The user selects a radio button 141 and then selects the designation button 142. The reception unit 52 receives either a selection instruction to set the image generation condition 43 so that the areas of the classes are not biased, or a selection instruction to set it so that the complexities of the classes are not biased. That is, the reception unit 52 is an example of the "third reception unit" according to the technology of the present disclosure. Note that when the cancel button 143 is selected, the image generation condition designation screen 140 is closed.

このように、受付部５２において、仮アノテーション画像１８の各クラスの面積が偏らないような内容の画像生成条件４３とするか、仮アノテーション画像１８の各クラスの複雑度が偏らないような内容の画像生成条件４３とするかのユーザによる選択指示を受け付ければ、ユーザに適した内容の画像生成条件４３に基づいて、画像生成部４６において仮アノテーション画像１８を生成することができる。なお、仮アノテーション画像１８の各クラスの面積が偏らないような内容の画像生成条件４３とするか、仮アノテーション画像１８の各クラスの複雑度が偏らないような内容の画像生成条件４３とするかを、択一的でなく両方選択可能としてもよい。 In this way, if the reception unit 52 receives the user's selection instruction as to whether the image generation condition 43 should keep the areas of the classes of the temporary annotation image 18 unbiased or keep their complexities unbiased, the image generation unit 46 can generate the temporary annotation image 18 based on an image generation condition 43 that suits the user. Note that, instead of requiring a choice between the two, both the area-based and the complexity-based image generation conditions 43 may be selectable at the same time.

［第５実施形態］ [Fifth Embodiment]
図３８に示す第５実施形態では、画像生成条件４３を、最大の面積をもつクラスは統合せずに１つのクラスのままとする、という内容とする。 In the fifth embodiment shown in FIG. 38, the image generation condition 43 specifies that the class having the largest area is left as one class without being integrated.

図３８において、第５実施形態では、第３実施形態と同じく、面積情報１３０がストレージデバイス３０に記憶される。図３８では、クラス１の分化細胞ＤＣの面積比率が２３％、クラス２の未分化細胞ＵＤＣの面積比率が７％、クラス３の死細胞ＤＤＣの面積比率が２％、クラス４の培地ＰＬの面積比率が６８％の場合を例示している。すなわち、最大の面積をもつクラスは、クラス４の培地ＰＬである。 In FIG. 38, in the fifth embodiment, the area information 130 is stored in the storage device 30 as in the third embodiment. FIG. 38 illustrates a case where the area ratio of class 1 differentiated cells DC is 23%, the area ratio of class 2 undifferentiated cells UDC is 7%, the area ratio of class 3 dead cells DDC is 2%, and the area ratio of class 4 medium PL is 68%. That is, the class with the largest area is class 4, the medium PL.

この場合の画像生成条件４３は、第１回仮学習においてはクラス１～３（分化細胞ＤＣ、未分化細胞ＵＤＣ、死細胞ＤＤＣ）を統合する旨が、第２回仮学習においてはクラス２、３（未分化細胞ＵＤＣ、死細胞ＤＤＣ）を統合する旨が、それぞれ登録されたものとなる。最大の面積をもつクラスであるクラス４の培地ＰＬは、各回の仮学習において統合されずに１つのクラスのままとされる。また、クラス１～３のうちで最大の面積をもつクラスであるクラス１の分化細胞ＤＣは、第２回仮学習において統合されずに１つのクラスとされる。 The image generation condition 43 in this case registers that classes 1 to 3 (differentiated cells DC, undifferentiated cells UDC, dead cells DDC) are integrated in the first provisional learning and that classes 2 and 3 (undifferentiated cells UDC, dead cells DDC) are integrated in the second provisional learning. Class 4, the medium PL, which has the largest area, is left as one class without being integrated in every round of provisional learning. Class 1, the differentiated cells DC, which has the largest area among classes 1 to 3, is treated as a separate class without being integrated in the second provisional learning.

このように、第５実施形態では、画像生成条件４３は、最大の面積をもつクラスは統合せずに１つのクラスのままとする、という内容である。こうして最大の面積をもつクラスを、早い段階で独立して仮学習することで、大局的な構造から微細な構造へ、という仮学習の流れを自然に作ることができる。したがって、よりクラスの判別精度が高い学習済みモデル２１Ｔを得ることが可能となる。 As described above, in the fifth embodiment, the image generation condition 43 is such that the class having the largest area is left as one class without being integrated. In this way, by temporarily learning the class with the largest area independently at an early stage, it is possible to naturally create a flow of temporary learning from the global structure to the detailed structure. Therefore, it is possible to obtain a trained model 21T with higher class discrimination accuracy.

［第６実施形態］ [Sixth Embodiment]
図３９に示す第６実施形態では、画像生成条件４３を、最小の複雑度をもつクラスは統合せずに１つのクラスのままとする、という内容とする。 In the sixth embodiment shown in FIG. 39, the image generation condition 43 specifies that the class with the lowest complexity is left as one class without being integrated.

図３９において、第６実施形態では、第４実施形態と同じく、複雑度情報１３５がストレージデバイス３０に記憶される。図３９では、クラス１の分化細胞ＤＣの複雑度がレベル８、クラス２の未分化細胞ＵＤＣの複雑度がレベル７、クラス３の死細胞ＤＤＣの複雑度がレベル６、クラス４の培地ＰＬの複雑度がレベル３の場合を例示している。すなわち、最小の複雑度をもつクラスは、クラス４の培地ＰＬである。 In FIG. 39, in the sixth embodiment, the complexity information 135 is stored in the storage device 30 as in the fourth embodiment. FIG. 39 illustrates a case where the complexity of class 1 differentiated cells DC is level 8, the complexity of class 2 undifferentiated cells UDC is level 7, the complexity of class 3 dead cells DDC is level 6, and the complexity of class 4 medium PL is level 3. That is, the class with the lowest complexity is class 4, the medium PL.

この場合の画像生成条件４３は、第１回仮学習においてはクラス１～３（分化細胞ＤＣ、未分化細胞ＵＤＣ、死細胞ＤＤＣ）を統合する旨が、第２回仮学習においてはクラス１、２（分化細胞ＤＣ、未分化細胞ＵＤＣ）を統合する旨が、それぞれ登録されたものとなる。最小の複雑度をもつクラスであるクラス４の培地ＰＬは、各回の仮学習において統合されずに１つのクラスのままとされる。また、クラス１～３のうちで最小の複雑度をもつクラスであるクラス３の死細胞ＤＤＣは、第２回仮学習において統合されずに１つのクラスとされる。 The image generation condition 43 in this case registers that classes 1 to 3 (differentiated cells DC, undifferentiated cells UDC, dead cells DDC) are integrated in the first provisional learning and that classes 1 and 2 (differentiated cells DC, undifferentiated cells UDC) are integrated in the second provisional learning. Class 4, the medium PL, which has the lowest complexity, is left as one class without being integrated in every round of provisional learning. Class 3, the dead cells DDC, which has the lowest complexity among classes 1 to 3, is treated as a separate class without being integrated in the second provisional learning.
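The staged groupings of the fifth and sixth embodiments are both consistent with a simple ranking rule: order the classes by a score, keep the top N-1 classes separate, and integrate the rest. The rule as written here, the function name `staged_groups`, and the `largest_first` flag are a generalization assumed for illustration; the disclosure gives only the two examples.

```python
from typing import Dict, List


def staged_groups(score: Dict[int, float], n_classes: int, largest_first: bool) -> List[List[int]]:
    """Keep the top (n_classes - 1) ranked classes separate and integrate the remainder."""
    order = sorted(score, key=lambda cls: score[cls], reverse=largest_first)
    keep, rest = order[: n_classes - 1], order[n_classes - 1:]
    return [[cls] for cls in keep] + [sorted(rest)]


# Fifth embodiment example (FIG. 38): area ratios in percent, largest area first.
area = {1: 23, 2: 7, 3: 2, 4: 68}
area_round1 = staged_groups(area, 2, largest_first=True)
area_round2 = staged_groups(area, 3, largest_first=True)

# Sixth embodiment example (FIG. 39): complexity levels, lowest complexity first.
complexity = {1: 8, 2: 7, 3: 6, 4: 3}
cx_round1 = staged_groups(complexity, 2, largest_first=False)
cx_round2 = staged_groups(complexity, 3, largest_first=False)
```

For the area example this yields {4} and {1, 2, 3}, then {4}, {1} and {2, 3}; for the complexity example it yields {4} and {1, 2, 3}, then {4}, {3} and {1, 2}, matching the registered conditions.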

このように、第６実施形態では、画像生成条件４３は、最小の複雑度をもつクラスは統合せずに１つのクラスのままとする、という内容である。こうして最小の複雑度をもつクラスを、早い段階で独立して仮学習することで、上記第５実施形態と同じく、大局的な構造から微細な構造へ、という仮学習の流れを自然に作ることができ、よりクラスの判別精度が高い学習済みモデル２１Ｔを得ることが可能となる。 Thus, in the sixth embodiment, the image generation condition 43 is such that the class with the lowest complexity is left as one class without being merged. In this way, the class with the minimum complexity is provisionally learned independently at an early stage, thereby naturally creating a provisional learning flow from the global structure to the fine structure, as in the fifth embodiment. , and it is possible to obtain a trained model 21T with higher class discrimination accuracy.

なお、図４０に示すように、最大の面積をもつクラスは統合せずに１つのクラスのままとする、という内容の上記第５実施形態の画像生成条件４３とするか、最小の複雑度をもつクラスは統合せずに１つのクラスのままとする、という内容の上記第６実施形態の画像生成条件４３とするかのユーザによる選択指示を受け付けてもよい。 As shown in FIG. 40, a user's selection instruction may be received as to whether to use the image generation condition 43 of the fifth embodiment, under which the class having the largest area is left as one class without being integrated, or the image generation condition 43 of the sixth embodiment, under which the class having the lowest complexity is left as one class without being integrated.

図４０において、画像生成条件指定画面１５０には、画像生成条件４３を、最大の面積をもつクラスは統合せずに１つのクラスのままとする、という内容とするか、最小の複雑度をもつクラスは統合せずに１つのクラスのままとする、という内容とするかを、択一的に選択するためのラジオボタン１５１が設けられている。ユーザは、ラジオボタン１５１を選択して指定ボタン１５２を選択する。受付部５２は、画像生成条件４３を、最大の面積をもつクラスは統合せずに１つのクラスのままとする、という内容とするという選択指示、または、画像生成条件４３を、最小の複雑度をもつクラスは統合せずに１つのクラスのままとする、という内容とするという選択指示を受け付ける。すなわち、受付部５２は、本開示の技術に係る「第４受付部」の一例である。なお、キャンセルボタン１５３が選択された場合、画像生成条件指定画面１５０の表示が消される。 In FIG. 40, the image generation condition designation screen 150 is provided with radio buttons 151 for selecting one of two options: setting the image generation condition 43 so that the class having the largest area is left as one class without being integrated, or so that the class having the lowest complexity is left as one class without being integrated. The user selects a radio button 151 and then selects the designation button 152. The reception unit 52 receives either a selection instruction to leave the class having the largest area as one class without integration, or a selection instruction to leave the class having the lowest complexity as one class without integration. That is, the reception unit 52 is an example of the "fourth reception unit" according to the technology of the present disclosure. Note that when the cancel button 153 is selected, the image generation condition designation screen 150 is closed.

このように、受付部５２において、最大の面積をもつクラスは統合せずに１つのクラスのままとする、という内容の画像生成条件４３とするか、最小の複雑度をもつクラスは統合せずに１つのクラスのままとする、という内容の画像生成条件４３とするかのユーザによる選択指示を受け付ければ、図３７の場合と同じく、ユーザに適した内容の画像生成条件４３に基づいて、画像生成部４６において仮アノテーション画像１８を生成することができる。なお、最大の面積をもつクラスは統合せずに１つのクラスのままとする、という内容の画像生成条件４３とするか、最小の複雑度をもつクラスは統合せずに１つのクラスのままとする、という内容の画像生成条件４３とするかを、択一的でなく両方選択可能としてもよい。 In this way, if the reception unit 52 receives the user's selection instruction as to whether the image generation condition 43 should leave the class having the largest area as one class without integration or leave the class having the lowest complexity as one class without integration, the image generation unit 46 can generate the temporary annotation image 18 based on an image generation condition 43 that suits the user, as in the case of FIG. 37. Note that, instead of requiring a choice between the two, both the largest-area condition and the lowest-complexity condition may be selectable at the same time.

［第７実施形態］ [Seventh Embodiment]
図４１に示す第７実施形態では、仮アノテーション画像１８を表示する。 In the seventh embodiment shown in FIG. 41, the temporary annotation image 18 is displayed.

図４１において、仮アノテーション画像表示画面１６０は、表示制御部５１によりディスプレイ３４に表示される。仮アノテーション画像表示画面１６０は、例えば図１１で示した画像生成条件指定画面６５により画像生成条件４３が指定された場合に、画像生成条件指定画面６５に代えて表示される。仮アノテーション画像表示画面１６０には、第１回仮学習に用いる仮アノテーション画像１８＿１、および第２回仮学習に用いる仮アノテーション画像１８＿２が表示される。ユーザは、表示された仮アノテーション画像１８＿１、１８＿２でよければ、確認ボタン１６１を選択する。確認ボタン１６１が選択された場合、仮アノテーション画像表示画面１６０の表示が消される。一方、ユーザは、表示された仮アノテーション画像１８＿１、１８＿２で満足がいかない場合、再指定ボタン１６２を選択する。再指定ボタン１６２が選択された場合、表示制御部５１により、図１１で示した画像生成条件指定画面６５がディスプレイ３４に再び表示され、画像生成条件４３の再指定が可能となる。 In FIG. 41, a temporary annotation image display screen 160 is displayed on the display 34 by the display control unit 51. The temporary annotation image display screen 160 is displayed in place of the image generation condition designation screen 65 when, for example, the image generation condition 43 has been designated on the image generation condition designation screen 65 shown in FIG. 11. The temporary annotation image display screen 160 displays the temporary annotation image 18_1 used for the first provisional learning and the temporary annotation image 18_2 used for the second provisional learning. If the displayed temporary annotation images 18_1 and 18_2 are acceptable, the user selects the confirmation button 161. When the confirmation button 161 is selected, the temporary annotation image display screen 160 is closed. If the user is not satisfied with the displayed temporary annotation images 18_1 and 18_2, the user selects the re-designation button 162. When the re-designation button 162 is selected, the display control unit 51 displays the image generation condition designation screen 65 shown in FIG. 11 on the display 34 again, so that the image generation condition 43 can be designated again.

このように、第７実施形態では、仮アノテーション画像１８を表示する。したがって、仮アノテーション画像１８をユーザに確認させることができ、場合によっては画像生成条件４３を再指定させることができる。 Thus, in the seventh embodiment, the temporary annotation image 18 is displayed. Therefore, the user can confirm the temporary annotation image 18 and, in some cases, redesignate the image generation condition 43 .

なお、図４１では、仮アノテーション画像１８＿１、１８＿２を仮アノテーション画像表示画面１６０に並列表示させたが、これに限定されない。仮アノテーション画像１８＿１、１８＿２を１枚ずつ表示させてもよい。ただし、図４１のように各回の仮学習に用いる仮アノテーション画像１８を並列表示させたほうが、仮アノテーション画像１８の変遷を確認することができるため好適である。 Although the temporary annotation images 18_1 and 18_2 are displayed side by side on the temporary annotation image display screen 160 in FIG. 41, the present invention is not limited to this. The temporary annotation images 18_1 and 18_2 may be displayed one by one. However, it is preferable to display the temporary annotation images 18 used for each temporary learning in parallel as shown in FIG. 41 because the transition of the temporary annotation images 18 can be confirmed.

[Eighth Embodiment]
In the eighth embodiment shown in FIGS. 42 and 43, a plurality of types of temporary annotation images 18 that differ in which classes are integrated are generated and displayed, a user's instruction to select one of the plurality of types of temporary annotation images 18 is received, and the temporary annotation image 18 for which the selection instruction was received is used as the temporary learning data 15.

As shown in FIG. 42, the image generation unit 46 generates a plurality of types of temporary annotation images 18 that differ in which classes are integrated. FIG. 42 illustrates a case where a plurality of types of the temporary annotation image 18_2 used for the second temporary learning are generated.

More specifically, as shown in FIG. 42A, the image generation unit 46 generates a temporary annotation image 18_2A, in which the class 1 differentiated cells DC and the class 2 undifferentiated cells UDC are integrated into an integrated class, in accordance with an image generation condition 43A specifying that the class 1 differentiated cells DC and the class 2 undifferentiated cells UDC are to be integrated. As shown in FIG. 42B, the image generation unit 46 generates a temporary annotation image 18_2B, in which the class 1 differentiated cells DC and the class 3 dead cells DDC are integrated into an integrated class, in accordance with an image generation condition 43B specifying that the class 1 differentiated cells DC and the class 3 dead cells DDC are to be integrated. Further, as shown in FIG. 42C, the image generation unit 46 generates a temporary annotation image 18_2C, in which the class 2 undifferentiated cells UDC and the class 3 dead cells DDC are integrated into an integrated class, in accordance with an image generation condition 43C specifying that the class 2 undifferentiated cells UDC and the class 3 dead cells DDC are to be integrated.
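As a non-authoritative illustration of this candidate generation, the following Python sketch enumerates every pairwise integration of classes 1 to 3 and builds one temporary annotation image per pair. The label values (1 = differentiated cells DC, 2 = undifferentiated cells UDC, 3 = dead cells DDC, 4 = culture medium PL), the function names, the nested-list image representation, and the choice of reusing the smaller label for the integrated class are all assumptions made for this sketch and are not taken from the disclosure.

```python
from itertools import combinations


def merge_classes(annotation, classes_to_merge):
    """Return a copy of the annotation in which every label listed in
    classes_to_merge is replaced by the smallest of those labels."""
    merged_label = min(classes_to_merge)
    return [
        [merged_label if label in classes_to_merge else label for label in row]
        for row in annotation
    ]


def candidate_temporary_annotations(annotation, mergeable):
    """Build one candidate per unordered pair of mergeable classes."""
    return {
        pair: merge_classes(annotation, pair)
        for pair in combinations(sorted(mergeable), 2)
    }


# Toy 3x4 annotation image (hypothetical data) containing all four classes.
annotation = [
    [1, 1, 2, 4],
    [3, 2, 2, 4],
    [3, 3, 4, 4],
]

candidates = candidate_temporary_annotations(annotation, mergeable={1, 2, 3})
for pair, image in candidates.items():
    print(pair, image)
```

In this sketch the dictionary keys (1, 2), (1, 3), and (2, 3) play the roles of the image generation conditions 43A, 43B, and 43C, and each candidate contains three classes instead of four.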

The display control unit 51 displays the temporary annotation image selection screen 170 shown in FIG. 43 on the display 34. The temporary annotation image selection screen 170 shows the plurality of types of temporary annotation images 18_2A to 18_2C generated by the image generation unit 46. A radio button 171 is provided below each of the temporary annotation images 18_2A to 18_2C. The radio buttons 171 are used to select one of the temporary annotation images 18_2A to 18_2C.

The user selects a radio button 171 and then selects the designation button 172. The reception unit 52 thereby receives an instruction to select one temporary annotation image 18 from among the plurality of types of temporary annotation images 18_2A to 18_2C. That is, the reception unit 52 is an example of the "fifth reception unit" according to the technology of the present disclosure. The temporary learning unit 48 uses the temporary annotation image 18_2 selected with the radio button 171 as the temporary learning data 15_2 for the second temporary learning. When the cancel button 173 is selected, the temporary annotation image selection screen 170 is dismissed.

FIG. 43 illustrates a case where the temporary annotation image 18_2B is selected with the radio button 171. When the designation button 172 is selected in this state, the temporary learning unit 48 uses the temporary annotation image 18_2B as the temporary learning data 15_2 for the second temporary learning.

Thus, in the eighth embodiment, the image generation unit 46 generates a plurality of types of temporary annotation images 18 that differ in which classes are integrated. The display control unit 51 performs control to display the plurality of types of temporary annotation images 18. The reception unit 52 receives a user's instruction to select one of the plurality of types of temporary annotation images 18. The temporary learning unit 48 uses the temporary annotation image 18 for which the reception unit 52 received the selection instruction as the temporary learning data 15. The user can therefore select the temporary annotation image 18 to be used as the temporary learning data 15 while actually inspecting the plurality of types of temporary annotation images 18.

In the eighth embodiment, selecting a temporary annotation image 18 is equivalent to setting the image generation condition 43. It is therefore unnecessary to designate the image generation condition 43 using the image generation condition designation screen 65 shown in FIG. 11 or the like.

In the first temporary learning, it is not strictly necessary to integrate three classes as in the example above. For example, the class 1 differentiated cells DC and the class 2 undifferentiated cells UDC may be integrated, and the class 3 dead cells DDC and the class 4 culture medium PL may be integrated, giving two classes in total.
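The two groupings mentioned for the first temporary learning can be expressed with a simple lookup table from original labels to temporary labels. The sketch below is only an illustration under assumed conventions: the label values (1 = DC, 2 = UDC, 3 = DDC, 4 = PL), the function name `relabel`, and the numbering of temporary classes from 1 in group order are hypothetical.

```python
def relabel(annotation, groups):
    """Map every original label to the index (starting at 1) of the group
    that contains it. Each original label must belong to exactly one group."""
    lookup = {}
    for new_label, group in enumerate(groups, start=1):
        for label in group:
            if label in lookup:
                raise ValueError(f"label {label} appears in more than one group")
            lookup[label] = new_label
    # A label missing from every group raises KeyError here, which flags
    # an incomplete grouping instead of silently passing it through.
    return [[lookup[label] for label in row] for row in annotation]


# Toy 3x4 annotation image (hypothetical data).
annotation = [
    [1, 1, 2, 4],
    [3, 2, 2, 4],
    [3, 3, 4, 4],
]

# Three cell classes integrated, culture medium kept separate.
cells_vs_medium = relabel(annotation, [{1, 2, 3}, {4}])

# Alternative grouping: DC with UDC, and DDC with PL.
paired = relabel(annotation, [{1, 2}, {3, 4}])
```

Both groupings yield a two-class temporary annotation image; they differ only in which original classes share a temporary label.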

The number of classes N of the annotation image 19 only needs to be three or more and is not limited to the four in the example. Accordingly, the number of rounds of temporary learning is not limited to the two in the example either.
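How the number of temporary learning rounds could scale with N can be sketched as follows. This is an assumption-laden illustration: it supposes that the first round uses two classes and that each later round adds exactly one class until N - 1 is reached, which matches the four-class example (two rounds with 2 and then 3 classes) but is not the only schedule the disclosure permits. The function name is hypothetical.

```python
def temporary_class_schedule(num_classes):
    """Return the class count used in each round of temporary learning,
    assuming a start at 2 classes and an increase of 1 per round."""
    if num_classes < 3:
        raise ValueError("the annotation image needs three or more classes")
    return list(range(2, num_classes))


for n in (3, 4, 6):
    schedule = temporary_class_schedule(n)
    print(f"N={n}: {len(schedule)} temporary round(s) with class counts {schedule}")
```

Under these assumptions the number of rounds is N - 2, so a three-class annotation image needs a single round of temporary learning before the main learning.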

The hardware configuration of the computers constituting the machine learning system 2 can be modified in various ways. For example, the learning device 10 and the operation device 11 may be integrated into a single computer. Alternatively, at least one of the learning device 10 and the operation device 11 may be configured as a plurality of physically separate computers in order to improve processing capability and reliability. For example, the functions of the image generation unit 46, the temporary model generation unit 47, and the temporary learning unit 48 of the learning device 10 and the functions of the model generation unit 49 and the main learning unit 50 may be distributed between two computers. In that case, the two computers together constitute the learning device 10.

In this way, the hardware configuration of the computers of the machine learning system 2 can be changed as appropriate according to the required performance, such as processing capability, safety, and reliability. Moreover, not only the hardware but also application programs such as the operation program 40 can of course be duplicated, or distributed across and stored in a plurality of storage devices, in order to ensure safety and reliability.

In each of the above embodiments, a cell image obtained by photographing a plurality of cells in culture is used as an example of the learning input image 17, and cells, culture medium, and the like are used as examples of the classes, but the technology is not limited to these. For example, an MRI (Magnetic Resonance Imaging) image may be used as the learning input image 17, with organs such as the liver and kidneys as the classes.

The model is not limited to U-Net and may be another convolutional neural network, such as SegNet or ResNet (Residual Network).

In each of the above embodiments, the various processors listed below can be used as the hardware structure of the processing units that execute various kinds of processing, such as the RW control unit 45, the image generation unit 46, the temporary model generation unit 47, the temporary learning unit 48 (first processing unit 90, first evaluation unit 91, first update unit 92), the model generation unit 49, the main learning unit 50 (second processing unit 105, second evaluation unit 106, second update unit 107), the display control unit 51, the reception unit 52, and the transmission control unit 53. In addition to the CPU 32, which, as described above, is a general-purpose processor that executes software (the operation program 40) to function as the various processing units, the various processors include a programmable logic device (PLD), such as an FPGA (Field Programmable Gate Array), which is a processor whose circuit configuration can be changed after manufacture, and a dedicated electric circuit, such as an ASIC (Application Specific Integrated Circuit), which is a processor with a circuit configuration designed exclusively for executing specific processing.

One processing unit may be configured with one of these various processors, or with a combination of two or more processors of the same or different types (for example, a combination of a plurality of FPGAs and/or a combination of a CPU and an FPGA). A plurality of processing units may also be configured with a single processor.

There are two typical ways of configuring a plurality of processing units with one processor. In the first, as typified by computers such as clients and servers, one processor is configured as a combination of one or more CPUs and software, and this processor functions as the plurality of processing units. In the second, as typified by a system on chip (SoC), a processor is used that implements the functions of an entire system including the plurality of processing units on a single IC (Integrated Circuit) chip. In this way, the various processing units are configured, as a hardware structure, using one or more of the various processors described above.

More specifically, an electric circuit (circuitry) that combines circuit elements such as semiconductor elements can be used as the hardware structure of these various processors.

From the above description, the invention set out in Supplementary Item 1 below can be understood.

[Supplementary Item 1]
A learning device comprising:
an acquisition processor that acquires learning data, the learning data being a set of a learning input image and an annotation image in which three or more classes to be subjected to semantic segmentation are specified for the learning input image;
an image generation processor that generates a temporary annotation image in which at least two of the three or more classes specified in the annotation image are integrated and which has fewer classes than the annotation image;
a temporary learning processor that trains a temporary machine learning model, which has a configuration corresponding to the number of classes of the temporary annotation image, using temporary learning data that is a set of the learning input image and the temporary annotation image, thereby turning the temporary machine learning model into a trained temporary machine learning model;
a machine learning model generation processor that generates, using at least part of the trained temporary machine learning model, a machine learning model having a configuration corresponding to the number of classes of the annotation image; and
a main learning processor that trains the machine learning model using the learning data, thereby turning the machine learning model into a trained machine learning model.
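The model generation step recited above, in which a model sized for the full class count is built from at least part of a trained temporary model, might be sketched as follows. Everything in this sketch is an assumption made for illustration: the `SegModel` layout, the decision to reuse the whole encoder, the zero initialization of the new output head, and the numeric values. The disclosure does not state that these particular choices are used.

```python
from __future__ import annotations

import copy
from dataclasses import dataclass


@dataclass
class SegModel:
    """Toy stand-in for a segmentation network (hypothetical layout)."""
    encoder: list[list[float]]  # feature-extraction weights
    head: list[list[float]]     # one row of output weights per class


def generate_model(trained: SegModel, num_classes: int) -> SegModel:
    """Build a model for num_classes that reuses the trained encoder and
    starts with a fresh output head sized for the larger class count."""
    if num_classes <= len(trained.head):
        raise ValueError("the new model must have more classes than the trained one")
    feature_dim = len(trained.head[0])
    new_head = [[0.0] * feature_dim for _ in range(num_classes)]
    # Deep copy so that later training of the new model cannot alter
    # the stored weights of the trained temporary model.
    return SegModel(encoder=copy.deepcopy(trained.encoder), head=new_head)


# Pretend this three-class temporary model has already been trained.
trained_temporary = SegModel(
    encoder=[[0.5, -0.2], [0.1, 0.3]],
    head=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
)

# Four-class model for the main learning, seeded with the trained encoder.
model = generate_model(trained_temporary, num_classes=4)
```

The main learning would then train `model` on the original learning data, starting from the inherited encoder weights instead of from a random initialization.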

The technology of the present disclosure may also combine the various embodiments and modifications described above as appropriate. It is of course not limited to the above embodiments, and various configurations may be adopted without departing from its gist. Furthermore, the technology of the present disclosure extends not only to the program but also to a storage medium that stores the program non-transitorily.

The description and illustrations above are a detailed explanation of the parts related to the technology of the present disclosure and are merely one example of that technology. For example, the above explanation of configurations, functions, operations, and effects is an explanation of one example of the configurations, functions, operations, and effects of the parts related to the technology of the present disclosure. It therefore goes without saying that, within a range that does not depart from the gist of the technology of the present disclosure, unnecessary parts may be deleted from, and new elements may be added to or substituted in, the description and illustrations above. In addition, to avoid complication and to make the parts related to the technology of the present disclosure easier to understand, the description and illustrations above omit explanation of common technical knowledge and the like that needs no particular explanation to enable the technology of the present disclosure to be practiced.

In this specification, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" may mean A alone, B alone, or a combination of A and B. The same reading applies in this specification when three or more items are joined by "and/or".

All documents, patent applications, and technical standards described in this specification are incorporated herein by reference to the same extent as if each individual document, patent application, or technical standard were specifically and individually indicated to be incorporated by reference.

Claims

1. A learning device comprising:
an acquisition unit that acquires learning data, the learning data being a set of a learning input image and an annotation image in which three or more classes to be subjected to semantic segmentation are specified for the learning input image;
an image generation unit that generates a temporary annotation image in which at least two of the three or more classes specified in the annotation image are integrated and which has fewer classes than the annotation image;
a temporary learning unit that trains a temporary machine learning model, which has a configuration corresponding to the number of classes of the temporary annotation image, using temporary learning data that is a set of the learning input image and the temporary annotation image, thereby turning the temporary machine learning model into a trained temporary machine learning model;
a machine learning model generation unit that generates, using at least part of the trained temporary machine learning model, a machine learning model having a configuration corresponding to the number of classes of the annotation image; and
a main learning unit that trains the machine learning model using the learning data, thereby turning the machine learning model into a trained machine learning model.

2. The learning device according to claim 1, further comprising a temporary machine learning model generation unit that generates the temporary machine learning model.

3. The learning device according to claim 1 or claim 2, wherein, when the number of classes of the annotation image is M and the number of classes of the temporary annotation image is N, the image generation unit and the temporary learning unit repeat the process of generating the temporary annotation image and the process of turning the temporary machine learning model into the trained temporary machine learning model a plurality of times, gradually increasing the number N of classes of the temporary annotation image, until N reaches M-1.

4. The learning device according to claim 3 as dependent on claim 2, wherein the temporary machine learning model generation unit generates the temporary machine learning model to be used in the current round using at least part of the trained temporary machine learning model from the previous round.

5. The learning device according to claim 3, wherein the temporary learning unit changes a loss function used to evaluate the class discrimination accuracy of the temporary machine learning model.

6. The learning device according to claim 5, wherein the temporary learning unit makes the weight of the loss function for a class in common with the previous round smaller than the weight of the loss function for a class newly appearing in the current round.

7. The learning device according to any one of claims 1 to 6, wherein the temporary learning unit and the main learning unit do not update a portion of the temporary machine learning model and of the machine learning model, respectively, in the process of turning the temporary machine learning model into the trained temporary machine learning model and in the process of turning the machine learning model into the trained machine learning model.

8. The learning device according to claim 7, further comprising a first receiving unit that receives a user's designation of the portion not to be updated.

9. The learning device according to any one of claims 1 to 8, wherein the image generation unit generates the temporary annotation image according to image generation conditions specified in advance.

10. The learning device according to claim 9, further comprising a second reception unit that receives designation of the image generation condition by a user.

11. The learning device according to claim 9, wherein the image generation condition is such that the area of each class of the temporary annotation image is not biased.

12. The learning device according to any one of claims 9 to 11, wherein the image generation condition is such that complexity of each class of the temporary annotation image is not biased.

13. The learning device according to claim 12 as dependent on claim 11, further comprising a third reception unit that receives a user's instruction to select whether the image generation condition is to be the condition under which the area of each class of the temporary annotation image is not biased or the condition under which the complexity of each class of the temporary annotation image is not biased.

14. The learning device according to any one of claims 9 to 13, wherein the image generation condition is that the class having the largest area is left as one class without being integrated.

15. The learning device according to any one of claims 9 to 14, wherein the image generation condition is such that the class with the lowest complexity is left as one class without being merged.

16. The learning device according to claim 15 as dependent on claim 14, further comprising a fourth reception unit that receives a user's instruction to select whether the image generation condition is to be the condition under which the class with the largest area is left as one class without being integrated or the condition under which the class with the lowest complexity is left as one class without being integrated.

17. The learning device according to any one of claims 1 to 16, further comprising a display control unit that controls display of the temporary annotation image.

18. The learning device according to claim 17, wherein
the image generation unit generates a plurality of types of the temporary annotation images that differ in which classes are integrated,
the display control unit performs control to display the plurality of types of the temporary annotation images,
the learning device further comprises a fifth reception unit that receives a user's instruction to select one of the plurality of types of the temporary annotation images, and
the temporary learning unit uses the temporary annotation image for which the fifth reception unit received the selection instruction as the temporary learning data.

19. The learning device according to any one of claims 1 to 18, wherein the learning input image is a cell image obtained by photographing a plurality of cells in culture.

20. A method of operating a learning device, the method comprising:
an acquisition step of acquiring learning data, the learning data being a set of a learning input image and an annotation image in which three or more classes to be subjected to semantic segmentation are specified for the learning input image;
an image generation step of generating a temporary annotation image in which at least two of the three or more classes specified in the annotation image are integrated and which has fewer classes than the annotation image;
a temporary learning step of training a temporary machine learning model, which has a configuration corresponding to the number of classes of the temporary annotation image, using temporary learning data that is a set of the learning input image and the temporary annotation image, thereby turning the temporary machine learning model into a trained temporary machine learning model;
a machine learning model generation step of generating, using at least part of the trained temporary machine learning model, a machine learning model having a configuration corresponding to the number of classes of the annotation image; and
a main learning step of training the machine learning model using the learning data, thereby turning the machine learning model into a trained machine learning model.

21. An operating program for a learning device, the program causing a computer to function as:
an acquisition unit that acquires learning data, the learning data being a set of a learning input image and an annotation image in which three or more classes to be subjected to semantic segmentation are specified for the learning input image;
an image generation unit that generates a temporary annotation image in which at least two of the three or more classes specified in the annotation image are integrated and which has fewer classes than the annotation image;
a temporary learning unit that trains a temporary machine learning model, which has a configuration corresponding to the number of classes of the temporary annotation image, using temporary learning data that is a set of the learning input image and the temporary annotation image, thereby turning the temporary machine learning model into a trained temporary machine learning model;
a machine learning model generation unit that generates, using at least part of the trained temporary machine learning model, a machine learning model having a configuration corresponding to the number of classes of the annotation image; and
a main learning unit that trains the machine learning model using the learning data, thereby turning the machine learning model into a trained machine learning model.