JP7383939B2

JP7383939B2 - Learning device, learning method, cell discrimination device, cell discrimination method, cell discrimination learning program, and cell discrimination program

Info

Publication number: JP7383939B2
Application number: JP2019160467A
Authority: JP
Inventors: 章仁小松崎
Original assignee: Tosoh Corp
Current assignee: Tosoh Corp
Priority date: 2019-09-03
Filing date: 2019-09-03
Publication date: 2023-11-21
Anticipated expiration: 2039-09-03
Also published as: JP2021039544A

Description

The present invention relates to a learning device and a learning method for constructing trained models, a cell discrimination device and a cell discrimination method that use the trained models, and a cell discrimination learning program and a cell discrimination program.

Conventionally, cell types have been discriminated by visually inspecting bright-field images, phase-contrast images, differential interference contrast images, immunofluorescence staining images, and the like taken under various microscopes. In recent years, in addition to visual discrimination, discrimination models have been constructed by machine learning such as deep learning and used for discrimination, as disclosed, for example, in Non-Patent Document 1.

Non-Patent Document 1: Ronald Wihal Oei et al., "Convolutional neural network for cell classification using microscope images of intracellular actin networks," PLOS ONE, 2019.

The more training data is available, the more advantageous it is for constructing a highly accurate discrimination model. On the other hand, if the number of data items differs between classes in the training data, the resulting discrimination model becomes biased, and discrimination accuracy may decrease. For this reason, training is usually performed on training data in which every class has been reduced to the size of the smallest class. In machine learning, it is therefore desirable to acquire a large number of data items for every class.

However, depending on the sample, the abundance ratio of cells may be heavily skewed, making it difficult to obtain a large amount of data. For example, when classifying cells in blood, there are cell types for which data is difficult to obtain, such as circulating tumor cells (hereinafter, CTCs). In addition to CTCs, blood contains cells such as red blood cells and white blood cells, as well as components other than cells. When these are to be classified by machine learning, the number of data items per class is matched to that of CTCs, the class with the fewest data items. As a result, machine learning is performed with a small amount of data, and the accuracy of the discrimination model decreases.

The present invention has been made in view of the above problem, and its object is to provide a cell discrimination device and a cell discrimination method capable of accurately discriminating cell types, as well as a learning device and a learning method for constructing the trained models they use.

To solve the above problem, a learning device according to the present invention is a learning device that constructs trained models used for cell discrimination, and includes: a first learning unit that constructs, by machine learning using first training data including data indicating cells and data indicating non-cells, a first trained model for discriminating between cells and non-cells; and a second learning unit that constructs, by machine learning using second training data that includes multiple types of data indicating mutually different types of cells and does not include data indicating non-cells, a second trained model, different from the first trained model, for discriminating the cell type of data that the first trained model has discriminated as a cell.

To solve the above problem, a cell discrimination device according to the present invention is a cell discrimination device that discriminates the type of cell in input data, and includes: a first discrimination unit that uses the first trained model constructed by the above learning device to discriminate whether the input data indicates a cell or a non-cell; a second discrimination unit that, when the discrimination result of the first discrimination unit indicates a cell, uses the second trained model constructed by the learning device to discriminate which type of cell the input data indicates; and a discrimination result output unit that outputs the discrimination result of the first discrimination unit when that result indicates a non-cell, and outputs the discrimination result of the second discrimination unit when the result of the first discrimination unit indicates a cell.

Further, to solve the above problem, a learning method according to the present invention is a learning method for constructing trained models used for cell discrimination, and includes: a first learning step of constructing, by machine learning using first training data including data indicating cells and data indicating non-cells, a first trained model for discriminating between cells and non-cells; and a second learning step of constructing, by machine learning using second training data that includes multiple types of data indicating mutually different types of cells and does not include data indicating non-cells, a second trained model, different from the first trained model, for discriminating the cell type of data that the first trained model has discriminated as a cell.

Further, to solve the above problem, a cell discrimination method according to the present invention is a cell discrimination method that discriminates the type of cell in input data, and includes: a data input step of inputting data; a first discrimination step of using the first trained model constructed by the above learning device to discriminate whether the input data indicates a cell or a non-cell; a second discrimination step of, when the discrimination result of the first discrimination step indicates a cell, using the second trained model constructed by the learning device to discriminate which type of cell the input data indicates; and a discrimination result output step of outputting the discrimination result of the first discrimination step when that result indicates a non-cell, and outputting the discrimination result of the second discrimination step when the result of the first discrimination step indicates a cell.

According to the present invention, a discrimination model capable of accurately discriminating the type of cells in an image can be generated.

FIG. 1 is a functional block diagram showing a schematic configuration of a cell discrimination device according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the first table.
FIG. 3 is a diagram showing an example of the second table.
FIG. 4 is a flowchart showing the flow of cell discrimination processing according to an embodiment of the present invention.
FIG. 5 is a diagram showing an example of an input image.

[Cell discrimination device]
An embodiment of the present invention is described in detail below. FIG. 1 is a functional block diagram showing an example of a schematic configuration of a cell discrimination device 1 according to this embodiment. The cell discrimination device 1 includes an input unit 10, a main control unit 20, and a display unit 30. The main control unit 20 includes a discrimination unit 40, a learning model 50, a learning unit 60, and a discrimination result output unit 70. The discrimination unit 40 includes a first discrimination unit 41 and a second discrimination unit 42. The learning unit 60 functions as a learning device that constructs the trained models used for cell discrimination, and includes a first learning unit 61, a second learning unit 62, and a training data generation unit 63. The learning model 50 includes a first trained model 51 and a second trained model 52.

The cell discrimination device 1 is a device that discriminates the type of cell in input data. The following description uses image data (hereinafter sometimes simply called an image) as an example of the data.

(Input unit)
The input unit 10 accepts input of an image to be discriminated. It does so either by reading a data file stored on a storage medium or by receiving the image from another device over a wired or wireless network. The input unit 10 sends the accepted image data to the main control unit 20.

(Image data)
The image data is an image obtained by imaging cells in an analysis sample. Cell imaging is not limited to imaging individual cells one at a time; a large number of cells may be arranged on an array, the whole array or part of it imaged, and the resulting image divided afterwards. In particular, when a sample is processed exhaustively and the image is later divided to obtain individual cell images, the proportion of images showing compartments or wells that contain no cells is high. Images are therefore broadly classified into images of cells and images of things other than cells.

An image of a cell is an image of any one of the multiple types of cells that may be present in the analysis sample. As a non-limiting example, when the analysis sample is blood that has undergone a predetermined treatment, the cells that may be present include blood components such as white blood cells, cytokeratin-positive circulating tumor cells, and cytokeratin-negative circulating tumor cells. In that case, an image of a cell may be an image of a white blood cell, a cytokeratin-positive circulating tumor cell, or a cytokeratin-negative circulating tumor cell.

Images other than cell images are in turn divided into images in which a non-cell substance such as debris is captured, and images in which neither a cell nor a non-cell substance is captured. Here, an image in which neither is captured means an image of a micropore, well, or similar cell-trapping structure that has trapped neither a cell nor debris and contains only the culture medium or the dispersion medium in which the cells were dispersed.

The image is not limited to one obtained by a single imaging technique; images obtained by multiple imaging techniques may be used in combination. For example, two or more observation images selected from bright-field, dark-field, phase-contrast, and fluorescence observation images can be combined. As fluorescence observation images, fluorescence images obtained with various antibodies and nuclear staining images obtained with DAPI can be combined. When multiple images are used in combination, they can be merged into a single image. For example, they may be overlaid to form one image, or they may be placed side by side without overlapping and joined into one image.
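The two ways of merging images described above, overlaying and side-by-side joining, can be sketched as follows. This is a minimal illustration only: images are represented as plain 2D lists of pixel intensities, and the function names `stack_channels` and `tile_side_by_side` are hypothetical and do not appear in the patent.

```python
# Hypothetical helpers; an image is a 2D list of pixel intensities.

def stack_channels(images):
    """Overlay same-sized single-channel images into one multi-channel image.

    Each pixel of the result is a tuple holding one value per source image.
    """
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all images must have the same size")
    return [[tuple(img[y][x] for img in images) for x in range(width)]
            for y in range(height)]


def tile_side_by_side(images):
    """Join same-height images next to each other without overlapping them."""
    height = len(images[0])
    if any(len(img) != height for img in images):
        raise ValueError("all images must have the same height")
    # Concatenate row y of every image into one longer row.
    return [sum((img[y] for img in images), []) for y in range(height)]


bright_field = [[10, 20], [30, 40]]   # made-up pixel values
dapi = [[1, 2], [3, 4]]               # made-up pixel values

overlaid = stack_channels([bright_field, dapi])
tiled = tile_side_by_side([bright_field, dapi])
```

Overlaying keeps the spatial correspondence between channels at each pixel, whereas tiling produces a wider single-channel image; which form suits a given network is a design choice the patent leaves open.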

Each image may also be preprocessed, for example by contrast adjustment or by adjusting one channel based on another specific channel. For example, the luminance values of the other channels may be normalized based on the luminance values of a specific channel (for example, a DAPI fluorescence observation image, described later).
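One possible reading of this channel-based normalization is sketched below. The patent does not give a formula, so dividing every channel by the peak luminance of the reference channel is purely an assumption made for illustration, as are the function name `normalize_by_reference` and the channel names.

```python
def normalize_by_reference(channels, reference="DAPI"):
    """Scale every channel by the peak luminance of the reference channel.

    `channels` maps a channel name to a 2D list of luminance values.
    ASSUMPTION: "normalization" is taken to mean division by the reference
    channel's maximum value; the source does not specify the formula.
    """
    ref_peak = max(max(row) for row in channels[reference])
    if ref_peak == 0:
        raise ValueError("reference channel is completely dark")
    return {
        name: [[value / ref_peak for value in row] for row in image]
        for name, image in channels.items()
    }


# Made-up luminance values for two channels of the same field of view.
channels = {
    "DAPI": [[0, 50], [100, 200]],
    "CK": [[20, 40], [60, 400]],
}
normalized = normalize_by_reference(channels)
```

After this step the reference channel peaks at 1.0 and the other channels are expressed relative to it, which makes images taken under different exposure conditions more comparable.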

As noted above, this embodiment is described using image data as the input data, but the input data is not limited to image data. For example, numerical data obtained by quantifying or parameterizing an image, or specific indices that can be extracted from that image, may be used.

(Learning model)
The first trained model 51 used by the first discrimination unit 41 is a trained model that takes the image input to the cell discrimination device 1 as its input and outputs the probability that the image shows a cell and the probability that it shows a non-cell. As described later, the first trained model 51 is constructed in the first learning unit 61 by machine learning using the first training data.

The output of the first trained model 51 may treat images of non-cell components and images in which neither a cell nor a non-cell component is captured alike, as indicating non-cells, or it may distinguish between the two. In the former case, the model can, for example, output probability values for "cell" and "non-cell"; in the latter case, it can output probability values for "cell", "non-cell", and "empty".
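The sketch below shows how either output form could be reduced to a cell / not-a-cell decision, assuming the class with the highest probability is adopted (as the first discrimination unit is described as doing). The function name `first_stage_decision` and the probability values are hypothetical.

```python
def first_stage_decision(probabilities):
    """Pick the most probable class and report whether it counts as a cell.

    `probabilities` maps class names to probability values, for either the
    two-class ("cell", "non-cell") or the three-class (plus "empty") form.
    Any class other than "cell" is treated as not being a cell.
    """
    if not probabilities:
        raise ValueError("no probabilities given")
    best_class = max(probabilities, key=probabilities.get)
    return best_class, best_class == "cell"


# Made-up model outputs for illustration.
two_class = first_stage_decision({"cell": 0.8, "non-cell": 0.2})
three_class = first_stage_decision({"cell": 0.2, "non-cell": 0.1, "empty": 0.7})
```

With the three-class form, the downstream logic stays the same: only a "cell" result would be passed on for cell-type discrimination.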

The second trained model 52 used by the second discrimination unit 42, on the other hand, is a trained model different from the first trained model 51. It takes the image input to the cell discrimination device 1 as its input and outputs, for each of the multiple cell types, the probability that the image shows that type of cell. As described later, the second trained model 52 is constructed in the second learning unit 62 by machine learning using the second training data.

In this embodiment, the first trained model 51 and the second trained model 52 are models constructed by training a convolutional neural network (CNN) having multiple convolutional layers on the first training data and the second training data, respectively. For the CNN, functions of known software such as Chainer (Preferred Networks) can be used. However, the neural network used for training is not limited to a CNN, and other known neural networks may be used.

(Training data generation unit)
The training data generation unit 63 generates the first training data and the second training data from image data input for training data generation. Specifically, each item of image data input for this purpose is associated in advance with information indicating which of the multiple cell types it shows, or that it shows a non-cell. The training data generation unit 63 refers to this associated information to generate the first and second training data. In this embodiment, the association is established by creating in advance a first table that records the correspondence between image data and this information, inputting the first table together with the image data, and referring to it. However, the association is not limited to the use of a table; for example, the information may be included as metadata of the image data. An example of the first table is shown in FIG. 2. In the table of FIG. 2, each item of image data is associated with one of "CK+", "CK-", "WBC", "DST", and "EMP", which are described later.

The information indicating which of the multiple cell types an image shows, or that it shows a non-cell, is determined in advance by a human. The cell images included in the image data input for training data generation include images of the types of cells that may be present in the sample containing the cells to be discriminated. The non-cell images include images in which a non-cell substance such as debris is captured and images in which neither a cell nor a non-cell substance is captured.

First, generation of the first training data by the training data generation unit 63 is described.

When the information associated with an input image indicates a non-cell, that is, when it indicates a type of non-cell, the training data generation unit 63 attaches the label "non-cell". When the associated information indicates which of the multiple cell types the image shows, that is, when it indicates a cell type, the unit attaches the label "cell". Attaching a label can be implemented by (i) adding the label information as metadata of the image data, (ii) updating a data file that records the label attached to each image, or (iii) saving the image data in a directory corresponding to the label. Whether the information associated with an input image corresponds to the "non-cell" or the "cell" label is determined by referring to a second table that records the correspondence between the information indicating cell types and non-cell types and the labels to be attached. The second table may be created in advance and input together with the image data. Alternatively, rather than being input in advance, the second table may be created separately after the image data is input, with the user specifying which label corresponds to each item of information such as "CK+", "CK-", "WBC", "DST", and "EMP". An example of the second table is shown in FIG. 3. The table of FIG. 3 shows the correspondence between the information denoted "CK+", "CK-", "WBC", "DST", and "EMP" and the labels to be attached.

In this way, the training data generation unit 63 generates first training data, which includes data indicating cells and data indicating non-cells, from image data associated with information indicating one of the multiple cell types or a non-cell. Thus, even if the input image data itself carries no information on whether it shows a cell or a non-cell, the first training data needed to construct the first trained model for discriminating cells from non-cells can be created.
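The two-table lookup described above can be sketched as follows. The file names are made up, and the mapping in `second_table` is an assumption: "CK+", "CK-", and "WBC" are taken to be cell categories and "DST" and "EMP" to be non-cell categories, which is consistent with the surrounding description but is defined only later in the source.

```python
# First table (cf. FIG. 2): image file -> category assigned by a human.
# File names are hypothetical.
first_table = {
    "img_001.png": "CK+",
    "img_002.png": "WBC",
    "img_003.png": "DST",
    "img_004.png": "EMP",
    "img_005.png": "CK-",
}

# Second table (cf. FIG. 3): category -> label for the first model.
# ASSUMPTION: DST and EMP are the non-cell categories.
second_table = {
    "CK+": "cell",
    "CK-": "cell",
    "WBC": "cell",
    "DST": "non-cell",
    "EMP": "non-cell",
}


def make_first_training_data(first_table, second_table):
    """Return (image, label) pairs labelled "cell" or "non-cell"."""
    return [(image, second_table[category])
            for image, category in first_table.items()]


first_training_data = make_first_training_data(first_table, second_table)
```

Keeping the category-to-label mapping in its own table means the same annotated images can be relabelled (for example, to add an "empty" class) without touching the per-image annotations.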

In this embodiment, image data associated with information indicating which of the multiple cell types it shows is selected and given the label "cell". However, the input image data may also be associated in advance with information indicating that it shows a cell, in addition to the information indicating the cell type. In that case, image data whose associated information indicates a cell can simply be regarded as already carrying the "cell" label.

Next, generation of the second training data by the training data generation unit 63 is described.

The training data generation unit 63 extracts, from the input image data, only the data associated with information indicating one of the multiple cell types. It then attaches to each extracted image a label indicating the cell type given by the associated information, thereby generating second training data that includes multiple types of data indicating mutually different cell types. Because only data associated with cell-type information is extracted, the second training data contains no data indicating non-cells. Consequently, even when the same input data used to generate the first training data, which includes image data indicating non-cells, is used, second training data free of non-cell image data can be generated.
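A minimal sketch of this extraction step is shown below. The set `CELL_CATEGORIES` and the file names are assumptions for illustration; the source names the categories but defines their meaning only later.

```python
# ASSUMPTION: these three categories denote cell types; all others are non-cells.
CELL_CATEGORIES = {"CK+", "CK-", "WBC"}

# Same kind of per-image annotation used for the first training data
# (hypothetical file names).
annotations = {
    "img_001.png": "CK+",
    "img_002.png": "WBC",
    "img_003.png": "DST",
    "img_004.png": "EMP",
    "img_005.png": "CK-",
}


def make_second_training_data(annotations, cell_categories=CELL_CATEGORIES):
    """Keep only cell images and label each with its cell type."""
    return [(image, category)
            for image, category in annotations.items()
            if category in cell_categories]


second_training_data = make_second_training_data(annotations)
```

Because the filter runs on the same annotated input as the first training data, no separate annotation pass is needed for the second model.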

In this embodiment, when generating the second training data, the training data generation unit 63 extracts data for each cell type so that the count matches that of the cell type with the fewest data items in the input data. When the number of data items varies widely between cell types, matching the counts to the smallest cell type makes it possible to construct a trained model with high discrimination accuracy. For example, the unit may be configured to perform this extraction when the count for the smallest cell type is less than 90% of the count for the largest cell type. However, equalizing the number of data items per cell type in the second training data is not essential.
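The balancing rule with the 90% example threshold can be sketched as follows. Random sampling without replacement is an assumption; the source says only that data is extracted to match the smallest count. The function name `balance_by_smallest_class` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def balance_by_smallest_class(samples, threshold=0.9, seed=0):
    """Downsample every cell type to the size of the smallest one.

    `samples` is a list of (image, cell_type) pairs. Balancing is applied
    only when the smallest class has fewer than `threshold` times the items
    of the largest class; otherwise the samples are returned unchanged.
    ASSUMPTION: items are chosen by random sampling without replacement.
    """
    if not samples:
        return []
    by_type = defaultdict(list)
    for image, cell_type in samples:
        by_type[cell_type].append(image)
    counts = [len(images) for images in by_type.values()]
    smallest, largest = min(counts), max(counts)
    if smallest >= threshold * largest:
        return list(samples)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    balanced = []
    for cell_type, images in by_type.items():
        for image in rng.sample(images, smallest):
            balanced.append((image, cell_type))
    return balanced


# Made-up data set: 10 WBC images, 5 CK- images, 3 CK+ images.
samples = ([(f"wbc_{i}.png", "WBC") for i in range(10)]
           + [(f"ckn_{i}.png", "CK-") for i in range(5)]
           + [(f"ckp_{i}.png", "CK+") for i in range(3)])
balanced = balance_by_smallest_class(samples)
```

Here each cell type ends up with three images, the count of the rarest type. Because non-cell images are excluded from the second training data, the abundant empty-well images do not enter this balancing at all.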

When generating the first and second training data, the training data generation unit 63 may perform data augmentation by known techniques such as rotating and flipping the images.
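Rotation and flipping can be sketched on plain 2D lists as below. Restricting rotations to multiples of 90 degrees and combining them with a horizontal flip is an illustrative choice, not something the source specifies; the function names are hypothetical.

```python
def rotate90(image):
    """Rotate a 2D list image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror a 2D list image left to right."""
    return [row[::-1] for row in image]


def augment(image):
    """Return the eight rotation/flip variants of an image.

    The original image is included as the first variant.
    """
    variants = []
    current = image
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


image = [[1, 2], [3, 4]]  # made-up pixel values
variants = augment(image)
```

The label of each variant is the same as that of the source image, so augmentation multiplies the number of training items per class without new annotation work.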

(First learning unit and second learning unit)
The first learning unit 61 constructs the first trained model 51 by a known machine learning method, using the first training data generated by the training data generation unit 63. The second learning unit 62 likewise constructs the second trained model 52 by a known machine learning method, using the second training data generated by the training data generation unit 63. In this embodiment, as described above, the first learning unit 61 and the second learning unit 62 train a CNN on the correspondence between the input images and their labels, that is, the information on whether an image shows a cell and which cell type it shows.

This embodiment describes a cell discrimination device 1 that includes the learning unit 60, which comprises the first learning unit 61, the second learning unit 62, and the training data generation unit 63. However, the cell discrimination device need not include the learning unit 60. That is, the first trained model 51 and the second trained model 52 may be constructed by a separate learning device that exists independently of the cell discrimination device 1 and includes the first learning unit 61, the second learning unit 62, and the training data generation unit 63. In that case, the cell discrimination device 1 makes the trained models available for use either by reading them from a storage medium or by receiving them from another device over a wired or wireless network.

(First discrimination unit)
The first discrimination unit 41 inputs the image to the first trained model 51 and, from the model's output, discriminates whether the image input to the cell discrimination device 1 shows a cell or a non-cell. Specifically, the class with the highest probability is adopted as the discrimination result. The first discrimination unit 41 sends the discrimination result to the discrimination result output unit 70. The first discrimination unit 41 may instead send the result to the discrimination result output unit 70 only when the result indicates a non-cell.

(Second discrimination unit)
Only when the first discrimination unit 41 has determined that the image shows a cell does the second discrimination unit 42 input the image to the second trained model 52 and, from the model's output, discriminate the type of cell in the image input to the cell discrimination device 1. Specifically, the cell type with the highest probability is adopted as the discrimination result. The second discrimination unit 42 sends the discrimination result to the discrimination result output unit 70.

(Discrimination result output unit)
The discrimination result output unit 70 outputs the discrimination result transmitted from the first discrimination unit 41 or the second discrimination unit 42 to the display unit 30. When the discrimination result sent from the first discrimination unit 41 indicates a non-cell, the discrimination result output unit 70 outputs that result. On the other hand, when the discrimination result sent from the first discrimination unit 41 indicates a cell, or when no discrimination result is sent from the first discrimination unit 41 and a discrimination result is sent from the second discrimination unit 42, the discrimination result output unit 70 outputs the discrimination result sent from the second discrimination unit 42. In this way, the discrimination result output unit 70 notifies the user of the final discrimination result via the display unit 30.

(Display unit)
The display unit 30 is a device that displays the final discrimination result output from the discrimination result output unit 70. In one aspect, the display unit 30 is a display device that displays the final discrimination result as image data or character data. The display unit need not be built into the cell discrimination device 1; it may instead be provided as an external device connectable to the cell discrimination device 1.

(Operation of the cell discrimination device)
Next, an example of the flow of discrimination processing using the cell discrimination device 1 will be described with reference to FIG. 4. FIG. 4 is a flowchart illustrating an example of the operation flow of the cell discrimination device 1.

First, the discrimination unit 40 acquires, via the input unit 10, an image input by the user's input operation (step S11; data input step). Here, the image to be discriminated is one acquired by the same processing operations as the images included in the first teacher data and the second teacher data used to construct the first trained model 51 and the second trained model 52. Next, the first discrimination unit 41 inputs the image to the first trained model 51 and obtains its output. From that output, it determines whether the image shows a cell or a non-cell (step S12; first discrimination step). If the image is determined to show a cell (yes in step S13), the second discrimination unit 42 inputs the same image that was input to the first trained model 51 to the second trained model 52 and obtains its output. From that output, it determines which type of cell the image shows (second discrimination step) and transmits the discrimination result to the discrimination result output unit 70 (step S14). On the other hand, if the first discrimination unit 41 determines that the image shows a non-cell (no in step S13), it transmits that discrimination result to the discrimination result output unit 70. When a discrimination result is sent from the second discrimination unit 42, the discrimination result output unit 70 outputs it to the display unit 30; when a result indicating that the image is a non-cell is sent from the first discrimination unit 41, the discrimination result output unit 70 outputs that result to the display unit 30 (step S15; discrimination result output step).
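As a non-limiting illustration, the flow of steps S12 to S15 can be sketched in Python as follows. The callable model interface, the function names, the stand-in models, and the use of the label strings from the worked example ("Cell", "DST", "EMP", "CK+", "CK-", "WBC") are assumptions made for this sketch; the embodiment does not prescribe any particular programming interface.

```python
from typing import Callable, Dict, Sequence

# Assumption: a trained model is any callable mapping an image to
# a dict of {label: probability}. The real models are not specified here.
Model = Callable[[Sequence[float]], Dict[str, float]]


def most_probable(probabilities: Dict[str, float]) -> str:
    """Adopt the label with the highest probability as the result."""
    return max(probabilities, key=lambda label: probabilities[label])


def discriminate(image: Sequence[float], first_model: Model,
                 second_model: Model, cell_label: str = "Cell") -> str:
    """Return the final discrimination result for one image."""
    first_result = most_probable(first_model(image))   # step S12
    if first_result != cell_label:                      # "no" in step S13
        return first_result                             # non-cell result is final
    # "yes" in step S13: the same image goes to the second model (step S14).
    return most_probable(second_model(image))


# Hypothetical stand-in models with fixed outputs, used only to
# exercise the control flow; they are not trained networks.
def stub_first_model(image: Sequence[float]) -> Dict[str, float]:
    if sum(image) > 0:
        return {"Cell": 0.90, "DST": 0.06, "EMP": 0.04}
    return {"Cell": 0.10, "DST": 0.20, "EMP": 0.70}


def stub_second_model(image: Sequence[float]) -> Dict[str, float]:
    return {"CK+": 0.70, "CK-": 0.20, "WBC": 0.10}


print(discriminate([1.0, 2.0], stub_first_model, stub_second_model))
print(discriminate([0.0, 0.0], stub_first_model, stub_second_model))
```

In this sketch the second model is never called for an image that the first stage classifies as a non-cell, which mirrors the condition under which the second discrimination unit 42 operates.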

As described above, discrimination processing using the cell discrimination device 1 discriminates the cells in an image in two discrimination steps. Compared with discrimination using a trained model constructed to determine in a single pass whether the image shows a cell or a non-cell and, if a cell, which type of cell, this makes it possible to determine more accurately which type of cell appears in the image.

[Software implementation example]
The control blocks of the cell discrimination device 1 (the main control unit 20, in particular the discrimination unit 40, the learning unit 60, and the discrimination result output unit 70) may be realized by logic circuits (hardware) formed on an integrated circuit (IC chip) or the like, or may be realized by software.

In the latter case, the cell discrimination device 1 includes a computer that executes the instructions of a program, which is software that implements each function. This computer includes, for example, at least one processor (control device) and at least one computer-readable recording medium storing the program. The object of the present invention is achieved when the processor reads the program from the recording medium and executes it. A CPU (Central Processing Unit) or a GPU (Graphics Processing Unit), for example, can be used as the processor. As the recording medium, a "non-transitory tangible medium" can be used, for example a ROM (Read Only Memory), a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit. The computer may further include a RAM (Random Access Memory) or the like into which the program is loaded. The program may also be supplied to the computer via any transmission medium capable of transmitting it (a communication network, broadcast waves, or the like). One aspect of the present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission.

[Summary]
A learning device according to aspect 1 of the present invention is a learning device that constructs trained models used for cell discrimination, the learning device including: a first learning unit that constructs, by machine learning using first teacher data including data indicating cells and data indicating non-cells, a first trained model for determining whether data indicates a cell or a non-cell; and a second learning unit that constructs, by machine learning using second teacher data that includes multiple types of data indicating mutually different types of cells and does not include data indicating non-cells, a second trained model, different from the first trained model, for determining the cell type of data determined to be a cell by the first trained model.

In a learning device according to aspect 2 of the present invention, the learning device of aspect 1 further includes a teacher data generation unit that generates the first teacher data and the second teacher data by referring to information associated with input data.

In a learning device according to aspect 3 of the present invention, in aspect 2, the teacher data generation unit generates the first teacher data by referring to a table showing the correspondence between information indicating the type of cell or the type of non-cell and information indicating whether the data is a cell or a non-cell.

In a learning device according to aspect 4 of the present invention, in any one of aspects 1 to 3, the input data is image data.

A cell discrimination device according to aspect 5 of the present invention is a cell discrimination device that discriminates the type of cell in input data, the device including: a first discrimination unit that determines, using the first trained model constructed by the learning device according to any one of aspects 1 to 4, whether the input data indicates a cell or a non-cell; a second discrimination unit that, when the discrimination result of the first discrimination unit indicates a cell, determines, using the second trained model constructed by the learning device, which type of cell the input data indicates; and a discrimination result output unit that outputs the discrimination result of the first discrimination unit when that result indicates a non-cell, and outputs the discrimination result of the second discrimination unit when the result of the first discrimination unit indicates a cell.

A cell discrimination device according to aspect 6 of the present invention includes the learning device described above.

A learning method according to aspect 7 of the present invention is a learning method for constructing trained models used for cell discrimination, the method including: a first learning step of constructing, by machine learning using first teacher data including data indicating cells and data indicating non-cells, a first trained model for determining whether data indicates a cell or a non-cell; and a second learning step of constructing, by machine learning using second teacher data that includes multiple types of data indicating mutually different types of cells and does not include data indicating non-cells, a second trained model, different from the first trained model, for determining the cell type of data determined to be a cell by the first trained model.

In a learning method according to aspect 8 of the present invention, in aspect 7, the data indicating cells or non-cells is image data obtained by imaging blood components, and the cell types include at least circulating tumor cells.

A cell discrimination method according to aspect 9 of the present invention is a cell discrimination method for discriminating the type of cell in input data, the method including: a data input step of inputting data; a first discrimination step of determining, using the first trained model constructed by the learning device according to any one of aspects 1 to 4, whether the input data indicates a cell or a non-cell; a second discrimination step of determining, when the discrimination result of the first discrimination step indicates a cell, which type of cell the input data indicates, using the second trained model constructed by the learning device; and a discrimination result output step of outputting the discrimination result of the first discrimination step when that result indicates a non-cell, and outputting the discrimination result of the second discrimination step when the result of the first discrimination step indicates a cell.

The learning device and the cell discrimination device according to each aspect of the present invention may be realized by a computer. In that case, a learning program or cell discrimination program that realizes the learning device or the cell discrimination device on a computer by causing the computer to operate as the units (software elements) of that device, and a computer-readable recording medium on which such a program is recorded, also fall within the scope of the present invention.

The present invention is not limited to the embodiments described above, and various modifications are possible within the scope of the claims. That is, embodiments obtained by combining technical means modified as appropriate within the scope of the claims are also included in the technical scope of the present invention.

One aspect of the present invention was carried out as follows. For the learning phase of machine learning and for evaluating the performance of the discrimination models, images obtained as follows were used: blood samples provided by Kyorin University were pretreated, the extracted cells were captured in the individual micropores of a micropore array so that the cells were arranged on the array, and images were acquired with a microscope and a digital camera. The pretreatment mainly removes red blood cells and white blood cells and performs fluorescent immunostaining. Although the pretreatment removes most red and white blood cells, a certain number remain. The micropore array is the one described in the non-patent document Tosoh Research & Technology Review, Vol. 58, pp. 3-12, 2014; it is a structure that captures cells in partitioned micropores and thereby arranges the cells on the array. Sample pretreatment and the arrangement of cells in the micropore array were carried out according to the method described in that non-patent document.

An IX83 microscope manufactured by Olympus Corporation was used with a 10x objective lens. The digital camera was an ORCA-FLASH4.0 manufactured by Hamamatsu Photonics K.K. With this microscope and digital camera, a bright-field image, a fluorescence image with a cytokeratin antibody, a fluorescence image with a CD45 antibody, and a nuclear staining image with DAPI were acquired. The acquired images were divided by micropore; the image of each micropore was 62 × 62 pixels. For each micropore, the acquired images were joined side by side to form the image data used for training and discrimination. FIG. 5 shows examples of the image data for white blood cells (WBC), cytokeratin-positive circulating tumor cells (CK+), cytokeratin-negative circulating tumor cells (CK-), non-cellular components (DST), and empty micropores (EMP). As shown in FIG. 5, each item of image data consists of the bright-field image (the "Bright field" column in the figure), the DAPI nuclear staining image (the "Fluorescence 1 DAPI" column), the cytokeratin antibody fluorescence image (the "Fluorescence 2 CK" column), and the CD45 antibody fluorescence image (the "Fluorescence 3 CD45" column), arranged in this order without gaps and combined into a single image. The ground-truth labels were assigned by human judgment.
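The side-by-side composition of the four per-micropore images can be illustrated with the following minimal Python sketch. The list-of-rows image representation, the function name, and the dummy pixel values are assumptions made for illustration; only the 62 × 62 tile size and the left-to-right order are taken from the description above.

```python
from typing import List

Image = List[List[int]]  # an image as a list of pixel rows

TILE_SIZE = 62  # each micropore image is 62 x 62 pixels (from the description)
# Left-to-right order stated for FIG. 5; the key names are hypothetical.
CHANNEL_ORDER = ["bright_field", "dapi", "ck", "cd45"]


def join_side_by_side(tiles: List[Image]) -> Image:
    """Join equally tall tiles left to right, without gaps."""
    height = len(tiles[0])
    if any(len(tile) != height for tile in tiles):
        raise ValueError("all tiles must have the same height")
    return [[pixel for tile in tiles for pixel in tile[row]]
            for row in range(height)]


# Dummy tiles: tile i is filled with the value i so positions can be checked.
tiles = [[[i] * TILE_SIZE for _ in range(TILE_SIZE)]
         for i in range(len(CHANNEL_ORDER))]
combined = join_side_by_side(tiles)
print(len(combined), len(combined[0]))
```

With four 62 × 62 tiles, the combined image is 62 pixels high and 248 pixels wide.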

The number of data items was 500 for WBC, 151 for CK+, 500 for CK-, 500 for DST, and 481 for EMP. From these data, 80% were randomly selected as training data and the remaining 20% were used as validation data.
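A random 80/20 split of this kind can be sketched as follows. The description does not state whether the split was made per class or over the pooled data, nor how fractional counts were rounded; the sketch assumes a pooled split with rounding to the nearest integer, and the function name and seed are hypothetical.

```python
import random
from typing import List, Tuple

# Per-class counts as reported in the description.
COUNTS = {"WBC": 500, "CK+": 151, "CK-": 500, "DST": 500, "EMP": 481}

Item = Tuple[str, int]  # (label, index within the class)


def split_train_validation(items: List[Item], train_fraction: float = 0.8,
                           seed: int = 0) -> Tuple[List[Item], List[Item]]:
    """Shuffle a copy of the items and cut it into two parts."""
    rng = random.Random(seed)          # fixed seed only for reproducibility
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)  # rounding rule is an assumption
    return shuffled[:cut], shuffled[cut:]


dataset = [(label, i) for label, n in COUNTS.items() for i in range(n)]
train, validation = split_train_validation(dataset)
print(len(dataset), len(train), len(validation))
```

Under these assumptions the 2,132 items divide into 1,706 training items and 426 validation items.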

A program implementing the learning unit was executed on a computer to construct the two-stage discrimination models (the first trained model and the second trained model). The program was implemented using Chainer from Preferred Networks, Inc. The processing described below uses functions and classes included in Chainer. In constructing the second trained model, the number of data items in each class was matched to that of cytokeratin-positive circulating tumor cells (CK+), the class with the fewest data items.
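Matching every class to the size of the smallest class can be sketched as random undersampling, as below. The description does not say how the items to keep were chosen, so random selection, the function name, and the seed are assumptions; the counts are the totals reported above and are used here only for illustration.

```python
import random
from typing import Dict, List


def undersample_to_smallest(samples_by_label: Dict[str, List[int]],
                            seed: int = 0) -> Dict[str, List[int]]:
    """Randomly keep as many items per class as the smallest class has."""
    rng = random.Random(seed)
    target = min(len(items) for items in samples_by_label.values())
    return {label: rng.sample(items, target)
            for label, items in samples_by_label.items()}


# Hypothetical item identifiers for the three second-stage classes.
second_stage_pool = {
    "CK+": list(range(151)),
    "CK-": list(range(500)),
    "WBC": list(range(500)),
}
balanced = undersample_to_smallest(second_stage_pool)
print({label: len(items) for label, items in balanced.items()})
```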

As the machine learning method, a neural network consisting of convolutional and fully connected layers was used, with Adaptive Moment Estimation (Adam) as the optimization algorithm. Convolutional layers were placed in the first, second, and third layers of the network, with 64 channels, a stride of 3, and zero padding specified as their hyperparameters. ReLU was used as the activation function of the convolutional layers. A fully connected layer was placed in the following layer, with the softmax function as its activation function. During training, data augmentation was performed by rotating or flipping the images. Training ran for 150 epochs with a mini-batch size of 64. Data labeled "CK+", "CK-", or "WBC" were designated to be trained under the label "Cell" in the first stage.
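To give a sense of how three stride-3 convolutions shrink the input, the following sketch applies the standard output-size formula floor((n + 2p - k) / s) + 1 to a 62 × 248 input (four 62 × 62 tiles side by side). The description gives only the channel count, the stride, and the use of zero padding; the kernel size of 3 and padding width of 1 used here are assumptions, so the resulting numbers are illustrative rather than the dimensions of the network actually used.

```python
from typing import Tuple


def conv_output_size(size: int, kernel: int, stride: int, pad: int) -> int:
    """Output length of a convolution along one axis."""
    return (size + 2 * pad - kernel) // stride + 1


def feature_map_shape(height: int, width: int, layers: int = 3,
                      kernel: int = 3, stride: int = 3,
                      pad: int = 1) -> Tuple[int, int]:
    """Spatial size after `layers` identical convolutional layers."""
    for _ in range(layers):
        height = conv_output_size(height, kernel, stride, pad)
        width = conv_output_size(width, kernel, stride, pad)
    return height, width


CHANNELS = 64  # from the description
h, w = feature_map_shape(62, 248)  # kernel=3 and pad=1 are assumed values
print(h, w, CHANNELS * h * w)
```

Under these assumed values the feature map after the third convolutional layer is 3 × 10, so the fully connected layer would receive 64 × 3 × 10 = 1,920 inputs.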

In this way, a two-stage discrimination model was constructed: a first trained model that discriminates among the three classes "Cell", "DST", and "EMP", and a second trained model that discriminates among the three classes "CK+", "CK-", and "WBC".
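The derivation of the two teacher data sets from one set of human-assigned labels can be sketched with a lookup table, as follows. The table contents follow the class names used in this example; the function name, the tuple layout, and the image identifiers are hypothetical and serve only to illustrate the relabeling.

```python
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (image identifier, label)

# Correspondence between the human-assigned label and the first-stage label.
FIRST_STAGE_LABEL: Dict[str, str] = {
    "CK+": "Cell",
    "CK-": "Cell",
    "WBC": "Cell",
    "DST": "DST",
    "EMP": "EMP",
}


def build_teacher_data(samples: List[Sample]) -> Tuple[List[Sample], List[Sample]]:
    """Return (first teacher data, second teacher data)."""
    first: List[Sample] = []
    second: List[Sample] = []
    for image_id, label in samples:
        stage_one_label = FIRST_STAGE_LABEL[label]
        first.append((image_id, stage_one_label))
        if stage_one_label == "Cell":
            # Only cells enter the second set, keeping their original type.
            second.append((image_id, label))
    return first, second


labeled = [("img0", "CK+"), ("img1", "DST"), ("img2", "WBC"), ("img3", "EMP")]
first_data, second_data = build_teacher_data(labeled)
print(first_data)
print(second_data)
```

The second teacher data set produced this way contains no non-cell items, which is the property required of the second teacher data.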

When cell discrimination was verified using the two-stage discrimination model, the accuracy (rate of correct answers) on the validation data was 96.7%.

For comparison, using the data labeled "CK+", "CK-", "WBC", "DST", or "EMP" in the same data set, a one-stage discrimination model was constructed that discriminates among the five classes "CK+", "CK-", "WBC", "DST", and "EMP" in a single pass.

When cell discrimination was verified using this one-stage discrimination model, the accuracy (rate of correct answers) on the validation data was 93.7%.
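The two reported accuracies can be compared in terms of error rate with the short calculation below. It uses only the 96.7% and 93.7% figures given above; the variable and function names are hypothetical, and the calculation assumes both figures refer to the same validation set.

```python
def error_rate(accuracy_percent: float) -> float:
    """Percentage of validation items that were misclassified."""
    return 100.0 - accuracy_percent


TWO_STAGE_ACCURACY = 96.7  # reported for the two-stage model
ONE_STAGE_ACCURACY = 93.7  # reported for the one-stage model

absolute_gain = TWO_STAGE_ACCURACY - ONE_STAGE_ACCURACY
relative_error_reduction = (
    1.0 - error_rate(TWO_STAGE_ACCURACY) / error_rate(ONE_STAGE_ACCURACY)
)

print(f"accuracy gain: {absolute_gain:.1f} points")
print(f"relative error reduction: {relative_error_reduction:.1%}")
```

On these figures the error rate falls from 6.3% to 3.3%, a gain of 3.0 points in accuracy and a reduction of roughly 48% in misclassifications.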

The present invention can be used in techniques for classifying cells.

1 Cell discrimination device
10 Input unit
20 Main control unit
30 Display unit
40 Discrimination unit
41 First discrimination unit
42 Second discrimination unit
50 Learning model
51 First trained model
52 Second trained model
60 Learning unit (learning device)
61 First learning unit
62 Second learning unit
63 Teacher data generation unit
70 Discrimination result output unit

Claims

1. A learning device that constructs trained models used for cell discrimination, the learning device comprising:
a first learning unit that constructs, by machine learning using first teacher data including data indicating cells and data indicating non-cells, a first trained model for determining whether data indicates a cell or a non-cell; and
a second learning unit that constructs, by machine learning using second teacher data that includes multiple types of data indicating mutually different types of cells and does not include data indicating non-cells, a second trained model, different from the first trained model, for determining the cell type of data determined to be a cell by the first trained model.

2. The learning device according to claim 1, further comprising a teacher data generation unit that generates the first teacher data and the second teacher data by referring to information associated with input data.

3. The learning device according to claim 2, wherein the teacher data generation unit generates the first teacher data by referring to a table showing the correspondence between information indicating the type of cell or the type of non-cell and information indicating whether the data is a cell or a non-cell.

4. The learning device according to claim 1, wherein the data included in the first teacher data and the second teacher data is image data.

5. A cell discrimination device that discriminates the type of cell in input data, the device comprising:
a first discrimination unit that determines, using the first trained model constructed by the learning device according to any one of claims 1 to 4, whether the input data indicates a cell or a non-cell;
a second discrimination unit that, when the discrimination result of the first discrimination unit indicates a cell, determines, using the second trained model constructed by the learning device, which type of cell the input data indicates; and
a discrimination result output unit that outputs the discrimination result of the first discrimination unit when that result indicates a non-cell, and outputs the discrimination result of the second discrimination unit when the result of the first discrimination unit indicates a cell.

6. The cell discrimination device according to claim 5, comprising the learning device.

7. A learning method, executed by at least one processor, for constructing trained models used for cell discrimination, the method comprising:
a first learning step of constructing, by machine learning using first teacher data including data indicating cells and data indicating non-cells, a first trained model for determining whether data indicates a cell or a non-cell; and
a second learning step of constructing, by machine learning using second teacher data that includes multiple types of data indicating mutually different types of cells and does not include data indicating non-cells, a second trained model, different from the first trained model, for determining the cell type of data determined to be a cell by the first trained model.

8. The learning method according to claim 7, wherein the data indicating cells or non-cells is image data obtained by imaging blood components, and the cell types include at least circulating tumor cells.

9. A cell discrimination method, executed by at least one processor, for discriminating the type of cell in input data, the method comprising:
a data input step of inputting data;
a first discrimination step of determining, using the first trained model constructed by the learning device according to any one of claims 1 to 4, whether the input data indicates a cell or a non-cell;
a second discrimination step of determining, when the discrimination result of the first discrimination step indicates a cell, which type of cell the input data indicates, using the second trained model constructed by the learning device; and
a discrimination result output step of outputting the discrimination result of the first discrimination step when that result indicates a non-cell, and outputting the discrimination result of the second discrimination step when the result of the first discrimination step indicates a cell.

10. A learning program for causing a computer to function as the learning device according to any one of claims 1 to 4.

11. A cell discrimination program for causing a computer to function as the cell discrimination device according to claim 5 or 6.