WO2022050078A1 - Learning data creation device, method, and program; machine learning device and method; learning model; and image processing device - Google Patents

Info

Publication number
WO2022050078A1
Authority
WO
WIPO (PCT)
Prior art keywords
correct
learning
region
masks
learning data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2021/030534
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
拓也 蔦岡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujifilm Corp
Original Assignee
Fujifilm Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujifilm Corp
Priority to JP2022546227A (JP7457138B2)
Publication of WO2022050078A1
Priority to US18/179,329 (US20230206609A1)
Anticipated expiration
Ceased (current legal status)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/7747 Organisation of the process, e.g. bagging or boosting
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B1/00 Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; Illuminating arrangements therefor
    • A61B1/00002 Operational features of endoscopes
    • A61B1/00004 Operational features of endoscopes characterised by electronic signal processing
    • A61B1/00009 Operational features of endoscopes characterised by electronic signal processing of image signals during a use of endoscope
    • A61B1/000094 Operational features of endoscopes characterised by electronic signal processing of image signals during a use of endoscope extracting biological structures
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B1/00 Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; Illuminating arrangements therefor
    • A61B1/00002 Operational features of endoscopes
    • A61B1/00004 Operational features of endoscopes characterised by electronic signal processing
    • A61B1/00009 Operational features of endoscopes characterised by electronic signal processing of image signals during a use of endoscope
    • A61B1/000096 Operational features of endoscopes characterised by electronic signal processing of image signals during a use of endoscope using artificial intelligence
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10068 Endoscopic image
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30024 Cell structures in vitro; Tissue sections in vitro
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30096 Tumor; Lesion
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03 Recognition of patterns in medical or anatomical images

Definitions

  • The present invention relates to a learning data creation device, method, and program, a machine learning device and method, a learning model, and an image processing device, and in particular to a technique for creating learning data with which a region extractor can be machine-trained well.
  • Patent Document 1 describes a technique for aggregating a plurality of annotation data sets created by a plurality of annotators for the same image and acquiring the aggregated annotation data sets.
  • Annotation datasets are aggregated by weighted averaging multiple annotation datasets using the reliability of multiple annotators.
  • One embodiment according to the technique of the present disclosure provides a learning data creation device, method, and program capable of creating learning data suitable for training a region extractor with the expected performance when a plurality of correct region masks are given to one image, together with a machine learning device and method that train a region extractor using the learning data, a trained learning model, and an image processing device.
  • The invention according to the first aspect is a learning data creation device including a first processor that creates learning data for machine learning. The first processor performs a learning sample acquisition process of acquiring one image and a plurality of first correct region masks for that image as a set of learning samples, a correct region mask integration process of generating one second correct region mask from the plurality of first correct region masks, and a process of outputting the pair of the one image and the second correct region mask as learning data.
  • When a plurality of first correct region masks are given to one image, they are acquired together with the image as a set of learning samples, and the plurality of first correct region masks are integrated into one second correct region mask. The pair of the one image and the integrated second correct region mask is then output as learning data. Integrating the plurality of first correct region masks given to one image yields a more reliable correct region mask.
  • It is preferable that the learning sample acquisition process acquires, as the plurality of first correct region masks for one image, the correct region masks given to that image by a plurality of evaluators.
  • It is preferable that the learning sample acquisition process inputs one image into each of a plurality of first region extractors, each machine-trained in advance using the correct region masks of a respective one of a plurality of evaluators, and acquires the plurality of region extraction results output by the first region extractors as the plurality of first correct region masks.
  • Each first region extractor may be machine-trained using the correct region masks given by one evaluator, or using the correct region masks given by a group of evaluators sharing some criterion (for example, the institution to which the evaluators belong).
  • It is preferable that the first processor performs a sample weight calculation process of calculating a sample weight that reduces the weight of the learning sample during machine learning as the degree of disagreement among the plurality of first correct region masks increases, and outputs the pair of the one image and the second correct region mask, together with the calculated sample weight, as learning data.
  • It is preferable that the sample weight is a value in the range of 0 to 1 and that the sample weight calculation process calculates, as the sample weight, the value obtained by subtracting from 1 the ratio of pixels that do not match among the plurality of first correct region masks. As a result, the larger the proportion of mismatched pixels, the smaller the sample weight.
  • It is preferable that the learning sample acquisition process further acquires diagnostic information of biological tissue, and that the correct region mask integration process generates the second correct region mask using those first correct region masks, among the plurality, that match the diagnostic information.
  • The diagnostic information of the biological tissue includes the diagnosis result for the biological tissue and the coordinate position on the image at which the tissue was collected.
  • A first correct region mask that matches the diagnostic information is a correct region mask whose diagnosis result matches and whose correct region includes the coordinate position of the collected tissue. This makes it possible to exclude first correct region masks that do not match the diagnosis result.
  • It is preferable that the correct region mask integration process uses any of the following as the second correct region mask: a correct region mask whose correct region is the common portion of the plurality of first correct region masks; a correct region mask whose correct region is their union; a correct region mask whose correct region consists of the pixels determined to be correct by a per-pixel majority vote over the plurality of first correct region masks; a correct region mask obtained by averaging the plurality of first correct region masks; or the first correct region mask, selected from the plurality, that has the largest or smallest correct region.
  • the learning data creating device is provided with a recording device for recording a learning data set composed of a plurality of learning data.
  • the learning data set consisting of a plurality of learning data recorded and accumulated in the recording device can be used for machine learning of the area extractor that extracts a specific area from the input image.
  • It is preferable that the one image is a medical image and that the plurality of first correct region masks are correct region masks indicating the regions of interest given to the medical image by a plurality of evaluators.
  • The machine learning device includes a second processor and a second region extractor, and the second processor machine-trains the second region extractor using the learning data created by the above learning data creation device.
  • the second region extractor is a learning model composed of a convolutional neural network.
  • the invention according to the twelfth aspect is a second region extractor in which machine learning is performed by the above machine learning device, and is a trained learning model configured by a convolutional neural network.
  • the invention according to the thirteenth aspect is an image processing device equipped with a learned learning model.
  • The invention according to the fourteenth aspect is a learning data creation method in which a first processor creates learning data for machine learning by performing the following steps: a step of acquiring one image and a plurality of first correct region masks for that image as a set of learning samples, a step of generating one second correct region mask from the plurality of first correct region masks, and a step of outputting the pair of the one image and the second correct region mask as learning data.
  • It is preferable that the method includes a step of calculating a sample weight that reduces the weight of the learning sample during machine learning as the degree of disagreement among the plurality of first correct region masks increases, and that the outputting step outputs the pair of the one image and the second correct region mask, together with the calculated sample weight, as learning data.
  • It is preferable that the step of acquiring the learning sample further acquires diagnostic information of biological tissue, and that the step of generating the second correct region mask generates it using those first correct region masks, among the plurality, that match the diagnostic information.
  • the second processor makes the second region extractor machine-learn using the learning data created by the above-mentioned learning data creation method.
  • The invention according to the seventeenth aspect is a machine learning method in which a second processor machine-trains a second region extractor using the learning data created by the above learning data creation method. It is preferable that, in the initial stage of learning, the sample weight contained in the learning data is set to a fixed value for training the second region extractor, and that the sample weight is then either moved from the fixed value toward its original value as machine learning progresses, or switched from the fixed value to the original value when machine learning reaches a reference level.
  • Starting with the sample weight at a fixed value brings the parameters of the second region extractor close to their optimum values. Then, as machine learning progresses, the sample weight is brought from the fixed value toward its original value, or is switched from the fixed value to the original value when machine learning reaches the reference level, so that the parameters are learned to be still closer to the optimum values and the region extractor attains the expected performance.
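  • The two weight schedules described above (gradual transition, or a switch at a reference level) can be sketched as follows. This is only an illustration under stated assumptions: the function name `scheduled_weight`, the normalised `progress` argument, the linear interpolation, and the default fixed value of 1.0 are not given in the source, which specifies neither the fixed value nor the shape of the transition.

```python
def scheduled_weight(original: float, progress: float,
                     fixed: float = 1.0, mode: str = "gradual") -> float:
    """Return the sample weight to use at the current point in training.

    original: the sample weight stored with the learning data.
    progress: assumed measure of training progress in [0, 1], where 1 means
              the reference level has been reached (hypothetical parameter).
    fixed:    weight used at the start of training (1.0 is an assumption).
    """
    p = min(max(progress, 0.0), 1.0)  # clamp to [0, 1]
    if mode == "gradual":
        # Move linearly from the fixed value toward the original value.
        return fixed + (original - fixed) * p
    if mode == "switch":
        # Keep the fixed value until the reference level, then switch.
        return original if p >= 1.0 else fixed
    raise ValueError(f"unknown mode: {mode}")
```

  • With either mode, every learning sample starts with the same weight, and samples whose first correct region masks disagree are down-weighted only once training has progressed.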
  • The invention according to the nineteenth aspect is a learning data creation program that causes a computer to realize a function of acquiring one image and a plurality of first correct region masks for that image as a set of learning samples, a function of generating one second correct region mask from the plurality of first correct region masks, and a function of outputting the pair of the one image and the second correct region mask as learning data.
  • FIG. 1 is a block diagram showing a first embodiment of the learning data creating device according to the present invention.
  • FIG. 2 is a diagram showing an embodiment of a learning sample.
  • FIG. 3 is a block diagram showing a second embodiment of the learning data creating device according to the present invention.
  • FIG. 4 is a block diagram showing a third embodiment of the learning data creating device according to the present invention.
  • FIG. 5 is a diagram showing another embodiment of the learning sample acquisition unit.
  • FIG. 6 is a diagram showing a fourth embodiment of the learning data creating device.
  • FIG. 7 is a schematic diagram of the machine learning device according to the present invention.
  • FIG. 8 is a block diagram showing an embodiment of the machine learning device shown in FIG. 7.
  • FIG. 9 is a schematic view showing another embodiment of the machine learning device according to the present invention.
  • FIG. 10 is a flowchart showing a first embodiment of the learning data creation method according to the present invention.
  • FIG. 11 is a flowchart showing a second embodiment of the learning data creation method according to the present invention.
  • FIG. 12 is a flowchart showing a third embodiment of the learning data creation method according to the present invention.
  • FIG. 13 is a flowchart showing a first embodiment of the machine learning method according to the present invention.
  • FIG. 14 is a flowchart showing a second embodiment of the machine learning method according to the present invention.
  • FIG. 1 is a block diagram showing a first embodiment of the learning data creating device according to the present invention.
  • The learning data creation device 1-1 shown in FIG. 1 includes a first processor 10-1 comprising a CPU (Central Processing Unit), a memory, and the like. The first processor 10-1 functions as a learning sample acquisition unit 20, a correct region mask integration unit 30, and an output unit 34.
  • the learning sample acquisition unit 20 acquires a learning sample from the database 2 that stores the first learning data set.
  • FIG. 2 is a diagram showing an embodiment of a learning sample.
  • One learning sample consists of one image, shown in FIG. 2(A), and a plurality of correct region masks (first correct region masks), shown in FIG. 2(B).
  • The image shown in FIG. 2(A) is a medical image captured with an endoscope. The plurality of first correct region masks shown in FIG. 2(B) are correct region masks indicating the regions of interest that a plurality of evaluators (in this example, four doctors) each assigned to the medical image after reading the same image.
  • Each doctor can create the first correct area mask by using the user interface and performing an operation of surrounding the area considered to be the lesion area on the medical image with a closed curve.
  • Each first correct region mask can be, for example, a binarized image in which the region enclosed by the closed curve is "1" and all other regions are "0".
  • One learning sample thus consists of a set of one image and a plurality of first correct region masks.
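  • As an illustration of how a closed curve drawn by a doctor could become such a binarized mask, the sketch below rasterises the curve with the even-odd (ray casting) rule. It assumes the curve is available as a list of polygon vertices in (x, y) pixel coordinates and that a pixel belongs to the region when its centre lies inside the curve; the source states neither, and the function name `closed_curve_to_mask` is hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates (assumed convention)


def closed_curve_to_mask(curve: Sequence[Point],
                         height: int, width: int) -> List[List[int]]:
    """Rasterise a closed curve (given as polygon vertices) into a 0/1 mask."""
    mask = [[0] * width for _ in range(height)]
    n = len(curve)
    for row in range(height):
        for col in range(width):
            x, y = col + 0.5, row + 0.5  # test the pixel centre
            inside = False
            for i in range(n):
                x1, y1 = curve[i]
                x2, y2 = curve[(i + 1) % n]  # wrap around to close the curve
                # Does this edge cross the horizontal line through the centre?
                if (y1 > y) != (y2 > y):
                    x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                    if x < x_cross:
                        inside = not inside  # even-odd rule
            mask[row][col] = 1 if inside else 0
    return mask
```

  • Nested lists are used only to keep the sketch free of dependencies; an array library would be the usual choice for real images.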
  • The learning sample acquisition unit 20 performs a learning sample acquisition process of acquiring, from the database 2, one image and a plurality of first correct region masks for that image as a set constituting a learning sample 22. The one image of the acquired learning sample 22 is supplied to the output unit 34, and the plurality of first correct region masks are supplied to the correct region mask integration unit 30.
  • The correct region mask integration unit 30 performs a correct region mask integration process of integrating the plurality of input first correct region masks, and generates one correct region mask (second correct region mask) from the plurality of first correct region masks.
  • The second correct region mask is generated with the region consisting of the pixels determined to be correct by majority vote as the correct region. For example, when there are five first correct region masks, the region where three or more of them overlap is taken as the correct region.
  • When the number of first correct region masks is even, the region where half or more of them overlap can, for example, be taken as the correct region.
  • Alternatively, the plurality of first correct region masks are integrated by averaging to generate the second correct region mask.
  • Alternatively, the first correct region mask that has the largest or smallest correct region among the plurality is selected as the second correct region mask.
  • The second correct region mask 32 generated by the correct region mask integration unit 30 as described above is supplied to the output unit 34.
  • the output unit 34 outputs a pair of one image constituting the learning sample 22 and one second correct area mask as learning data 4 for machine learning to a device in the subsequent stage.
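  • The integration options described for the correct region mask integration unit 30 can be sketched as below. This is an illustrative sketch, not the patented implementation: masks are assumed to be same-sized nested lists of 0/1 values, the function names are hypothetical, and the averaged mask is returned as per-pixel fractions because the source does not say whether the average is thresholded afterwards.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = correct region, 0 = background


def _vote_counts(masks: List[Mask]) -> List[List[int]]:
    """Count, per pixel, how many masks mark that pixel as correct."""
    return [[sum(px) for px in zip(*rows)] for rows in zip(*masks)]


def integrate_masks(masks: List[Mask], method: str = "majority") -> Mask:
    """Merge several first correct region masks into one second mask."""
    n = len(masks)
    if method == "intersection":   # common portion of all masks
        threshold = n
    elif method == "union":        # marked by at least one mask
        threshold = 1
    elif method == "majority":     # 3 of 5, or half or more when n is even
        threshold = (n + 1) // 2
    else:
        raise ValueError(f"unknown method: {method}")
    return [[1 if c >= threshold else 0 for c in row]
            for row in _vote_counts(masks)]


def average_mask(masks: List[Mask]) -> List[List[float]]:
    """Per-pixel mean of the masks (a soft mask with values in [0, 1])."""
    n = len(masks)
    return [[c / n for c in row] for row in _vote_counts(masks)]


def select_by_area(masks: List[Mask], largest: bool = True) -> Mask:
    """Pick the mask whose correct region has the largest or smallest area."""
    def area(mask: Mask) -> int:
        return sum(map(sum, mask))
    return max(masks, key=area) if largest else min(masks, key=area)
```

  • The majority threshold `(n + 1) // 2` gives 3 for five masks and 2 for four masks, matching the two examples in the text.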
  • FIG. 3 is a block diagram showing a second embodiment of the learning data creating device according to the present invention.
  • Parts common to the first embodiment shown in FIG. 1 are given the same reference numerals, and their detailed description is omitted.
  • The learning data creation device 1-2 shown in FIG. 3 includes a first processor 10-2, which functions as a learning sample acquisition unit 20, a correct region mask integration unit 30, a sample weight calculation unit 40, and an output unit 35.
  • The sample weight calculation unit 40 receives the plurality of first correct region masks and calculates a sample weight according to their degree of agreement or disagreement.
  • The sample weight is attached to a learning sample (learning data) used when the region extractor described later is machine-trained, and determines how much that learning sample contributes to learning.
  • The sample weight calculation unit 40 calculates a sample weight that reduces the weight of the learning sample during machine learning as the degree of disagreement among the plurality of first correct region masks increases. Conversely, the smaller the degree of disagreement (the greater the agreement), the larger the calculated sample weight.
  • The sample weight can be, for example, a value in the range of 0 to 1, and the sample weight calculation unit 40 can calculate it as the value obtained by subtracting from 1 the ratio of pixels that do not match among the plurality of first correct region masks.
  • The sample weight 42 calculated by the sample weight calculation unit 40 is supplied to the output unit 35.
  • The one image constituting the learning sample 22 and the second correct region mask 32 are also supplied to the output unit 35, which outputs the pair of the one image and the second correct region mask 32, together with the sample weight 42, to a subsequent device as learning data 4 for machine learning.
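  • The sample weight calculation described above (1 minus the ratio of mismatched pixels) can be sketched as follows. Two points are assumptions rather than statements from the source: a pixel counts as mismatched when the masks do not all agree on it, and the ratio is taken over all pixels of the image. The function name `sample_weight` is hypothetical.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = correct region, 0 = background


def sample_weight(masks: List[Mask]) -> float:
    """Return 1 minus the fraction of pixels on which the masks disagree."""
    n = len(masks)
    total = 0
    mismatched = 0
    for rows in zip(*masks):        # the same row taken from every mask
        for px in zip(*rows):       # the same pixel taken from every mask
            total += 1
            if 0 < sum(px) < n:     # some masks say 1 and others say 0
                mismatched += 1
    if total == 0:
        raise ValueError("masks must contain at least one pixel")
    return 1.0 - mismatched / total
```

  • Identical masks give a weight of 1, and the weight falls toward 0 as the evaluators' regions diverge.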
  • FIG. 4 is a block diagram showing a third embodiment of the learning data creating device according to the present invention.
  • Parts common to the first embodiment shown in FIG. 1 and the second embodiment shown in FIG. 3 are given the same reference numerals, and their detailed description is omitted.
  • The learning data creation device 1-3 shown in FIG. 4 includes a first processor 10-3, which functions as a learning sample acquisition unit 21, a correct region mask integration unit 31, a sample weight calculation unit 41, and an output unit 36.
  • a plurality of learning samples are stored in the database 3, and one learning sample contains diagnostic information (biopsy information) of biological tissue in addition to one image and a plurality of first correct region masks.
  • the biopsy information has, for example, the diagnosis result of the biological tissue collected by forceps or the like, and the coordinate position on the image of the collected biological tissue.
  • the learning sample acquisition unit 21 acquires one learning sample 23 from the database 3.
  • The one image constituting the acquired learning sample 23 is supplied to the output unit 36, and the plurality of first correct region masks and the biopsy information are supplied to both the correct region mask integration unit 31 and the sample weight calculation unit 41.
  • the correct answer area mask integration unit 31 integrates a plurality of input first correct answer area masks and generates a second correct answer area mask from a plurality of first correct answer area masks. In this case, biopsy information is used.
  • the correct area mask integration unit 31 generates a second correct area mask using the first correct area mask that matches the biopsy information among the plurality of first correct area masks.
  • When each of the plurality of first correct region masks is accompanied by diagnostic information from its evaluator, the correct region mask integration unit 31 selects, from the plurality, only the first correct region masks whose diagnostic information is the same as the diagnosis result of the biological tissue included in the biopsy information. In addition, it selects only the first correct region masks whose correct region includes the coordinate position of the biological tissue included in the biopsy information.
  • The correct region mask integration unit 31 uses the first correct region mask selected based on the biopsy information as the second correct region mask.
  • The second correct region mask 33 generated by the correct region mask integration unit 31 is supplied to the output unit 36.
  • From the selected first correct region masks, one second correct region mask is generated in the same manner as in the first embodiment of FIG. 1. In this example, only the first correct region masks that have the same diagnosis result and whose correct region includes the coordinate position of the collected tissue are selected from the plurality of first correct region masks.
  • the present invention is not limited to this, and the first correct region mask that matches the diagnosis results may be selected, or the first correct region mask that includes the coordinate positions of the collected tissues may be selected.
  • The sample weight calculation unit 41 calculates the sample weight according to the degree of agreement or disagreement among the first correct region masks selected, based on the biopsy information, from the plurality of first correct region masks.
  • The sample weight 43 calculated by the sample weight calculation unit 41 is supplied to the output unit 36.
  • The one image constituting the learning sample 23, the second correct region mask 33, and the sample weight 43 are supplied to the output unit 36, which outputs the pair of the one image and the second correct region mask 33, together with the sample weight 43, to a subsequent device as learning data 4 for machine learning.
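  • The biopsy-based selection of the third embodiment can be sketched as a filter over the evaluators' masks. The data layout is assumed for illustration: the class names `AnnotatedMask` and `BiopsyInfo`, the string-valued diagnosis, and the (row, column) form of the coordinate position are all hypothetical, since the source does not define how this information is stored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AnnotatedMask:
    mask: List[List[int]]  # first correct region mask (1 = correct region)
    diagnosis: str         # diagnostic information given by the evaluator


@dataclass
class BiopsyInfo:
    diagnosis: str             # diagnosis result of the collected tissue
    position: Tuple[int, int]  # (row, col) on the image where it was taken


def select_consistent_masks(candidates: List[AnnotatedMask],
                            biopsy: BiopsyInfo) -> List[AnnotatedMask]:
    """Keep masks whose diagnosis matches and which cover the biopsy site."""
    row, col = biopsy.position
    return [
        c for c in candidates
        if c.diagnosis == biopsy.diagnosis and c.mask[row][col] == 1
    ]
```

  • The masks that survive the filter are the ones passed on to integration and to the sample weight calculation. Relaxing the filter to only one of the two conditions, as the text permits, would mean dropping the corresponding half of the `if` clause.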
  • FIG. 5 is a diagram showing another embodiment of the learning sample acquisition unit.
  • the learning sample acquisition unit 24 shown in FIG. 5 includes a plurality of region extractors 26A, 26B, and 26C (first region extractor 16).
  • The plurality of region extractors 26A, 26B, and 26C are region extractors machine-trained in advance, each using the learning data set (a data set of images and correct region masks) of a respective one of the plurality of evaluators.
  • Each of the region extractors 26A, 26B, and 26C may be trained using correct region masks and the like created by a single evaluator, or using correct region masks and the like created by a group of evaluators sharing some criterion (for example, the institution to which the evaluators belong).
  • the learning sample acquisition unit 24 acquires one image from the image database 5, and uses the same image as an input image of a plurality of region extractors 26A, 26B, and 26C.
  • the plurality of area extractors 26A, 26B, and 26C each output the area extraction result as the first correct area mask for the input image.
  • Because the region extractors 26A, 26B, and 26C were each trained using a different evaluator's learning data set, they output different region extraction results (first correct region masks) even when the same image is input.
  • The learning sample acquisition unit 24 outputs, as a learning sample 25, the one image acquired from the image database 5 and the plurality of first correct region masks output from the region extractors 26A, 26B, and 26C with this image as the input image.
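  • The behaviour of the learning sample acquisition unit 24 can be sketched as applying several extractors to one image and collecting their outputs. In this sketch each trained region extractor is represented by a plain callable, and `threshold_extractor` is a hypothetical stand-in (a brightness threshold) used only so the example runs; it is not how the extractors in the source work.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]
Mask = List[List[int]]
RegionExtractor = Callable[[Image], Mask]


def build_learning_sample(image: Image,
                          extractors: List[RegionExtractor]
                          ) -> Tuple[Image, List[Mask]]:
    """Feed the same image to every extractor and collect the outputs."""
    return image, [extract(image) for extract in extractors]


def threshold_extractor(threshold: float) -> RegionExtractor:
    """Stand-in for a trained extractor: marks pixels above a threshold."""
    def extract(image: Image) -> Mask:
        return [[1 if px > threshold else 0 for px in row] for row in image]
    return extract
```

  • Three stand-ins with different thresholds return three different masks for the same image, which mirrors how extractors trained on different evaluators' data disagree.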
  • FIG. 6 is a diagram showing a fourth embodiment of the learning data creation device.
  • the learning data creating device 1-4 shown in FIG. 6 includes the first processor 10-1 shown in FIG. 1 and the recording device 6.
  • When the first processor 10-1 acquires one learning sample 22 from the database 2 as described with reference to FIG. 1, it outputs one item of learning data 4 consisting of the pair of the one image constituting the learning sample 22 and the one second correct region mask into which the plurality of first correct region masks have been integrated.
  • the recording device 6 can be configured by, for example, a database capable of recording and managing a large amount of data, and sequentially records the learning data output from the first processor 10-1.
  • the plurality of learning data recorded and stored in the recording device 6 are used as a second learning data set for machine learning for learning a region extractor (second region extractor) described later.
  • the recording device 6 shown in FIG. 6 records the learning data output from the first processor 10-1 of the learning data creating device 1-1, but is not limited to this, and is shown in FIGS. 3 and 4.
  • the learning data output from the first processors 10-2 and 10-3 of the learning data creating devices 1-2 and 1-3 may be recorded.
  • FIG. 7 is a schematic diagram of the machine learning device according to the present invention.
  • the machine learning device 50 shown in FIG. 7 includes a second processor 51 and a second region extractor 52.
  • the second processor 51 has a function of machine learning the second region extractor 52 by using the learning data (second learning data set) stored in the recording device 6 (see FIG. 6).
  • FIG. 8 is a block diagram showing an embodiment of the machine learning device shown in FIG. 7.
  • The second region extractor 52 of the machine learning device 50 shown in FIG. 8 can be configured by, for example, a convolutional neural network (CNN), which is one type of learning model.
  • The second processor 51 includes a loss value calculation unit 54 and a parameter control unit 56, and machine-learns the second region extractor 52 using the second learning data set stored in the recording device 6.
  • The second region extractor 52 is, for example, a part that, when an arbitrary medical image is given as an input image, infers a region of interest such as a lesion region appearing in the input image. It has a multi-layer structure and holds a plurality of weight parameters. The weight parameters include the coefficients of the filters, called kernels, used for the convolution operations in the convolution layers.
  • The second region extractor 52 changes from an untrained second region extractor 52 to a trained second region extractor 52 when its weight parameters are updated from their initial values to optimum values.
  • The second region extractor 52 includes an input layer 52A, an intermediate layer 52B having a plurality of sets each composed of a convolution layer and a pooling layer, and an output layer 52C, and each layer has a structure in which a plurality of "nodes" are connected by "edges".
  • An image to be learned (learning image) is input to the input layer 52A as an input image.
  • The learning image is the image in an item of learning data (learning data consisting of a pair of an image and a second correct region mask) stored in the recording device 6.
  • The intermediate layer 52B has a plurality of sets, each consisting of a convolution layer and a pooling layer, and is the part that extracts features from the image input from the input layer 52A.
  • The convolution layer filters nearby nodes in the previous layer (performs a convolution operation using a filter) to obtain a "feature map".
  • The pooling layer reduces the feature map output from the convolution layer to produce a new feature map.
  • The "convolution layer" plays the role of feature extraction, such as edge extraction from the image, and the "pooling layer" plays the role of imparting robustness so that the extracted features are not affected by translation or the like.
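  • The roles of the two layer types can be illustrated with a small, dependency-free Python sketch. The function names, the 1x2 edge kernel, and the toy images are assumptions made for illustration and are not taken from the second region extractor 52; the filter is applied without flipping (cross-correlation), as is common in CNN implementations.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [sum(image[i + di][j + dj] * kernel[di][dj]
             for di in range(kh) for dj in range(kw))
         for j in range(out_w)]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Reduce the feature map by keeping the maximum of each size x size block."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [max(feature_map[r + dr][c + dc]
             for dr in range(size) for dc in range(size))
         for c in cols]
        for r in rows
    ]


edge_kernel = [[-1, 1]]              # responds to a dark-to-bright vertical edge
image = [[0, 0, 1, 1, 1]] * 4        # edge between columns 1 and 2
shifted = [[0, 1, 1, 1, 1]] * 4      # same edge moved one pixel to the left

features = conv2d_valid(image, edge_kernel)
pooled = max_pool2d(features)
pooled_shifted = max_pool2d(conv2d_valid(shifted, edge_kernel))
```

  • Here the convolution marks the column where the edge lies, and the pooled maps of the original and shifted images are identical. This holds only because the one-pixel shift stays inside a single pooling window; larger shifts would change the pooled output.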
  • The intermediate layer 52B is not limited to sets of one convolution layer and one pooling layer; it may also include consecutive convolution layers, activation processing by an activation function, and normalization layers.
  • The output layer 52C is the part that outputs a feature map showing the features extracted by the intermediate layer 52B. In the trained second region extractor 52, the output layer 52C outputs, for example, an inference result in which the region of interest in the input image is classified (segmented) in units of pixels or in units of groups of several pixels.
  • Arbitrary initial values are set for the filter coefficients and offset values applied to each convolution layer of the second region extractor 52 before learning.
  • Of the loss value calculation unit 54 and the parameter control unit 56, which together function as a learning control unit, the loss value calculation unit 54 compares the feature map output from the output layer 52C of the second region extractor 52 with the second correct region mask (the mask image read from the recording device 6 that is paired with the input image (learning image)), and calculates the error between the two (the loss value, which is the value of the loss function).
  • Possible methods for calculating the loss value include, for example, softmax cross entropy and sigmoid.
  • The parameter control unit 56 adjusts the weight parameters of the second region extractor 52 by error backpropagation based on the loss value calculated by the loss value calculation unit 54.
  • In error backpropagation, the error is propagated backward in order from the final layer, stochastic gradient descent is performed in each layer, and the parameter updates are repeated until the error converges.
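  • The interplay of loss calculation and parameter update can be sketched in a few lines of Python. This is a deliberately reduced illustration under stated assumptions: a single-weight, single-offset per-pixel classifier stands in for the CNN, the loss is a sigmoid (binary) cross entropy, and plain full-batch gradient descent replaces stochastic gradient descent. All names, the learning rate, and the step count are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(probs, targets, eps=1e-7):
    """Mean sigmoid cross entropy between predicted probabilities and a 0/1 mask."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def train_pixel_classifier(pixels, targets, lr=0.5, steps=200):
    """Fit p = sigmoid(w * x + b) to the mask by repeated gradient updates."""
    w, b = 0.0, 0.0  # initial values before learning
    losses = []
    n = len(pixels)
    for _ in range(steps):
        probs = [sigmoid(w * x + b) for x in pixels]
        losses.append(bce_loss(probs, targets))        # loss value calculation
        grad_w = sum((p - t) * x for p, t, x in zip(probs, targets, pixels)) / n
        grad_b = sum(p - t for p, t in zip(probs, targets)) / n
        w -= lr * grad_w                               # parameter update
        b -= lr * grad_b
    return w, b, losses


pixels = [0.1, 0.2, 0.8, 0.9]   # flattened toy image
targets = [0, 0, 1, 1]          # flattened correct region mask
w, b, losses = train_pixel_classifier(pixels, targets)
```

  • With this toy data the loss starts at log 2 (all predictions at 0.5) and decreases as the weight and offset move away from their initial values, which is the same loop of "compute loss, update parameters, repeat" described above, just without the layer-by-layer backpropagation a real CNN requires.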
  • The machine learning device 50 repeats machine learning using the learning data recorded in the recording device 6, whereby the second region extractor 52 becomes a trained second region extractor 52.
  • When an unknown input image (for example, a captured image) is input, the trained second region extractor 52 outputs an inference result such as a mask image indicating the region of interest in the captured image.
  • FIG. 9 is a schematic diagram showing another embodiment of the machine learning device according to the present invention.
  • The machine learning device 50-1 shown in FIG. 9 includes a third processor 53 and a second region extractor 52.
  • The third processor 53 of the machine learning device 50-1 shown in FIG. 9 has, for example, the functions of both the first processor 10-1 shown in FIG. 1 and the second processor 51 shown in FIG. 7.
  • Functioning as the first processor 10-1, the third processor 53 acquires one learning sample from the database 2 and creates learning data for machine learning consisting of a pair of the one image constituting the learning sample and one second correct region mask obtained by integrating the plurality of first correct region masks.
  • Functioning as the second processor 51, the third processor 53 causes the second region extractor 52 to perform machine learning using the created learning data.
  • The third processor 53 may train the second region extractor 52 each time one item of learning data is created. Alternatively, each time a plurality of items of learning data (learning data for one batch) have been created, it may train the second region extractor 52 using that batch of learning data.
  • FIG. 10 is a flowchart showing a first embodiment of the learning data creation method according to the present invention.
  • The processing of each step of the learning data creation method shown in FIG. 10 is performed by the first processor 10-1 of the learning data creation device 1-1.
  • The learning sample acquisition unit 20 acquires one learning sample 22 from the database 2 (step S10).
  • The correct region mask integration unit 30 integrates the plurality of first correct region masks constituting the learning sample and generates one correct region mask (second correct region mask) from them (step S12).
  • One method of generating the second correct region mask is to extract the region common to the plurality of first correct region masks and use the extracted region as the correct region.
  • The output unit 34 outputs the pair of the image constituting the learning sample acquired in step S10 and the second correct region mask generated in step S12 to the subsequent output destination as learning data for machine learning (step S14).
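  • Steps S10 to S14 can be sketched as follows for the case where the second correct region mask is the part common to all first correct region masks. Masks are represented as nested lists of 0/1 values; the function names and the dictionary keys are illustrative assumptions rather than part of the described device.

```python
from typing import Dict, List

Mask = List[List[int]]


def integrate_by_intersection(masks: List[Mask]) -> Mask:
    """Keep a pixel only if every first correct region mask marks it (step S12)."""
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if all(m[r][c] == 1 for m in masks) else 0 for c in range(cols)]
        for r in range(rows)
    ]


def create_learning_data(sample: Dict) -> Dict:
    """Pair the image with the integrated second correct region mask (step S14)."""
    second_mask = integrate_by_intersection(sample["first_masks"])
    return {"image": sample["image"], "second_mask": second_mask}


# Step S10: one learning sample = one image plus masks from three evaluators.
sample = {
    "image": [[0.2, 0.9, 0.4], [0.3, 0.8, 0.1]],
    "first_masks": [
        [[1, 1, 0], [1, 1, 0]],
        [[0, 1, 1], [1, 1, 0]],
        [[1, 1, 1], [0, 1, 0]],
    ],
}
learning_data = create_learning_data(sample)
```

  • Only the middle column survives in this example because it is the only region that all three evaluators marked, so the resulting mask is conservative by construction.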
  • FIG. 11 is a flowchart showing a second embodiment of the learning data creation method according to the present invention.
  • Each step of the learning data creation method shown in FIG. 11 is performed by the first processor 10-2 of the learning data creation device 1-2.
  • In FIG. 11, the same step numbers are assigned to the parts common to the learning data creation method of the first embodiment shown in FIG. 10, and detailed description thereof is omitted.
  • The learning data creation method of the second embodiment shown in FIG. 11 differs from that of the first embodiment shown in FIG. 10 mainly in that the processing of step S16, performed by the sample weight calculation unit 40, is added.
  • In step S16, the sample weight is calculated from the plurality of first correct region masks according to their degree of match or mismatch.
  • The sample weight is, for example, a value in the range of 0 to 1, and becomes smaller as the degree of mismatch among the plurality of first correct region masks becomes larger.
  • The output unit 35 adds the sample weight calculated in step S16 to the pair of the image constituting the learning sample acquired in step S10 and the second correct region mask generated in step S12, and outputs the result to the subsequent device as learning data for machine learning (step S18).
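  • The text specifies only that the sample weight lies in the range of 0 to 1 and shrinks as the masks disagree more. One possible measure with that behavior, assumed here purely for illustration, is the ratio of the area marked by all evaluators to the area marked by at least one evaluator. The function name and the handling of empty masks are likewise assumptions.

```python
from typing import List

Mask = List[List[int]]


def sample_weight(masks: List[Mask]) -> float:
    """Agreement score in [0, 1]: common area divided by combined area.

    Assumed measure for illustration; returns 1.0 when no evaluator marks
    any pixel, since the masks then agree completely.
    """
    rows, cols = len(masks[0]), len(masks[0][0])
    common = combined = 0
    for r in range(rows):
        for c in range(cols):
            votes = [m[r][c] == 1 for m in masks]
            common += int(all(votes))
            combined += int(any(votes))
    return common / combined if combined else 1.0


identical = [[[1, 1, 0, 0]], [[1, 1, 0, 0]]]
partly_different = [[[1, 1, 0, 0]], [[0, 1, 1, 0]]]
disjoint = [[[1, 0, 0, 0]], [[0, 0, 0, 1]]]

weights = [sample_weight(m) for m in (identical, partly_different, disjoint)]
```

  • With this measure, identical masks give a weight of 1, disjoint masks give 0, and partial overlap falls in between, so learning data built from contested annotations can later be given a smaller contribution.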
  • FIG. 12 is a flowchart showing a third embodiment of the learning data creation method according to the present invention.
  • The processing of each step of the learning data creation method shown in FIG. 12 is performed by the first processor 10-3 of the learning data creation device 1-3.
  • In step S11, a learning sample is acquired from the database 3; in addition to one image and a plurality of first correct region masks, this learning sample includes diagnostic information on living tissue (biopsy information).
  • When each of the plurality of first correct region masks is accompanied by diagnostic information from its evaluator, the correct region mask integration unit 31 selects, from among the plurality of first correct region masks, only those whose diagnostic information is the same as the diagnosis result of the living tissue included in the biopsy information. Further, it selects only those first correct region masks whose correct region contains the coordinate position of the living tissue included in the biopsy information. As a result, only the first correct region masks that match the diagnosis result and contain the coordinate position of the collected tissue are selected. The correct region mask integration unit 31 uses the first correct region mask selected based on the biopsy information as the second correct region mask (step S13).
  • The sample weight calculation unit 41 calculates the sample weight according to the degree of match or mismatch of the first correct region masks selected, from among the plurality of first correct region masks, based on the biopsy information (step S17).
  • The output unit 36 adds the sample weight calculated in step S17 to the pair of the image constituting the learning sample acquired in step S11 and the second correct region mask generated in step S13, and outputs the result to the subsequent device as learning data for machine learning (step S18).
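  • The biopsy-based selection of step S13 can be sketched as a two-condition filter. The record layout (a "diagnosis" string and a "mask" per evaluator, a "diagnosis" and a (row, column) "position" for the biopsy) is a hypothetical representation. How several selected masks are combined is not stated in this passage, so taking their common part is an assumption made only to complete the example.

```python
from typing import Dict, List, Optional

Mask = List[List[int]]


def select_masks_by_biopsy(annotations: List[Dict], biopsy: Dict) -> List[Mask]:
    """Keep masks whose diagnosis matches the biopsy and that cover its position."""
    row, col = biopsy["position"]
    return [
        a["mask"] for a in annotations
        if a["diagnosis"] == biopsy["diagnosis"] and a["mask"][row][col] == 1
    ]


def second_mask_from_biopsy(annotations: List[Dict], biopsy: Dict) -> Optional[Mask]:
    """Build the second correct region mask from the selected masks.

    Assumption: when more than one mask is selected, their common part is used.
    Returns None when no mask satisfies both conditions.
    """
    selected = select_masks_by_biopsy(annotations, biopsy)
    if not selected:
        return None
    rows, cols = len(selected[0]), len(selected[0][0])
    return [
        [1 if all(m[r][c] == 1 for m in selected) else 0 for c in range(cols)]
        for r in range(rows)
    ]


annotations = [
    {"diagnosis": "malignant", "mask": [[0, 1, 1], [0, 1, 0]]},
    {"diagnosis": "benign",    "mask": [[1, 1, 1], [1, 1, 1]]},  # wrong diagnosis
    {"diagnosis": "malignant", "mask": [[1, 0, 0], [0, 0, 0]]},  # misses the site
    {"diagnosis": "malignant", "mask": [[0, 1, 0], [0, 1, 1]]},
]
biopsy = {"diagnosis": "malignant", "position": (0, 1)}
second_mask = second_mask_from_biopsy(annotations, biopsy)
```

  • In this example the second and third annotations are discarded, one for a mismatched diagnosis and one for not containing the sampled position, and the remaining two determine the result.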
  • FIG. 13 is a flowchart showing a first embodiment of the machine learning method according to the present invention.
  • the processing of each step of the machine learning method of the first embodiment shown in FIG. 13 can be performed by, for example, the machine learning device 50 shown in FIG. 7.
  • The machine learning device 50 (second processor 51) inputs learning data from the recording device 6; for example, one batch of learning data is input (step S100).
  • The second processor 51 trains the second region extractor 52 based on the input learning data (step S110). That is, the second processor 51 updates the various parameters of the second region extractor 52 so as to reduce the difference between the output obtained when the learning image of the learning data is input to the second region extractor 52 and the second correct region mask, which is the correct answer data.
  • When sample weight information is added to the learning data, it is preferable to change the learning data's contribution to machine learning according to the sample weight.
  • After the second region extractor 52 has been trained with one batch of learning data, it is determined whether or not to end the machine learning (step S120). When it is determined that the machine learning is not to be ended ("No"), the process returns to step S100, the learning data for the next batch is input, and the processes of steps S100 to S120 are repeated.
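  • The loop of steps S100 to S120, together with one way of letting the sample weight change a sample's contribution, can be outlined as below. The sketch assumes that the contribution is changed by multiplying each per-sample loss by its sample weight before averaging; the callback names, the batch layout, and the fixed-step stopping rule are placeholders, and no actual model parameters are updated here.

```python
from typing import Callable, List, Sequence, Tuple


def weighted_batch_loss(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Average of per-sample losses, each scaled by its sample weight (assumption)."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


def run_training(batches: List,
                 train_on_batch: Callable[[list], float],
                 should_stop: Callable[[int, float], bool]) -> Tuple[int, float]:
    """Repeat: input one batch (S100), train on it (S110), check for the end (S120)."""
    step = 0
    while True:
        batch = batches[step % len(batches)]   # S100: input the next batch
        loss = train_on_batch(batch)           # S110: placeholder for the real update
        step += 1
        if should_stop(step, loss):            # S120: end of machine learning?
            return step, loss


# Toy batches of (per-sample loss, sample weight) pairs standing in for real data.
batches = [
    [(1.0, 1.0), (2.0, 0.5)],
    [(0.4, 1.0), (0.8, 0.25)],
]
steps, last_loss = run_training(
    batches,
    train_on_batch=lambda b: weighted_batch_loss([l for l, _ in b], [w for _, w in b]),
    should_stop=lambda step, loss: step >= 3,
)
```

  • In a real system `train_on_batch` would run the second region extractor, compute the weighted loss against the second correct region masks, and update the weight parameters; the stopping rule could equally be based on the loss value instead of a step count.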
  • FIG. 14 is a flowchart showing a second embodiment of the machine learning method according to the present invention.
  • Each step of the machine learning method of the second embodiment shown in FIG. 14 can be performed by the machine learning device 50 shown in FIG. 7, similarly to the machine learning method of the first embodiment shown in FIG. 13.
  • In FIG. 14, the same step numbers are assigned to the parts common to the machine learning method of the first embodiment shown in FIG. 13, and detailed description thereof is omitted.
  • The machine learning device 50 (second processor 51) inputs learning data from the recording device 6 (step S102).
  • In this case, learning data having a sample weight in addition to the pair of one image and a second correct region mask is input.
  • The second processor 51 determines whether or not the machine learning of the second region extractor 52 using the learning data has reached a reference level (step S104).
  • For example, the learning level reached when the second region extractor 52 has been machine-learned using about 70% of all the learning data can be set as the reference level.
  • The value of 70% is an example, and the reference level is not limited to it.
  • The reference level may instead be a value set appropriately for, for example, the region extraction accuracy of the second region extractor 52 (the difference between the output of the second region extractor 52 and the second correct region mask).
  • When it is determined in step S104 that the learning level has not reached the reference level ("No"), the second processor 51 sets the sample weight of the learning data to a fixed value and machine-learns the second region extractor 52 (step S112). For example, when the sample weight is a value in the range of 0 to 1, the second region extractor 52 is machine-learned with the sample weight set to a fixed value of "1" regardless of the learning data.
  • Before the reference level is reached, machine-learning the second region extractor 52 with the sample weight included in the learning data replaced by a fixed value accelerates the progress of machine learning.
  • On the other hand, when it is determined in step S104 that the learning level has reached the reference level ("Yes"), the second processor 51 switches the sample weight from the fixed value to its original value and machine-learns the second region extractor 52 (step S114). That is, by changing each item of learning data's contribution to machine learning according to its sample weight, for example by lowering the contribution of learning data whose second correct region mask has low reliability, the region extraction accuracy of the second region extractor 52 is further improved.
  • In this embodiment, the sample weight is set to a fixed value until the learning level of the second region extractor 52 reaches the reference level, and once the reference level is reached, the sample weight is switched from the fixed value to its original value for machine learning. However, the invention is not limited to this; the sample weight may be changed continuously or stepwise from the fixed value toward its original value as machine learning progresses from the initial stage, and the second region extractor may be machine-learned accordingly.
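  • Both weighting strategies, the hard switch at the reference level and the gradual transition, can be written as one small function. The sketch assumes that training progress is expressed as a fraction between 0 and 1 (for example, the share of the learning data already used, matching the 70% example) and that the gradual variant interpolates linearly; the function name, parameter names, and the linear form are illustrative choices.

```python
def effective_sample_weight(original: float,
                            progress: float,
                            reference: float = 0.7,
                            fixed: float = 1.0,
                            mode: str = "switch") -> float:
    """Return the sample weight to use at the given training progress.

    mode="switch": use `fixed` until `progress` reaches `reference`, then the
                   original weight (steps S112 / S114).
    mode="linear": move continuously from `fixed` to the original weight as
                   progress goes from 0 to 1 (assumed interpolation).
    """
    if mode == "switch":
        return fixed if progress < reference else original
    if mode == "linear":
        t = min(max(progress, 0.0), 1.0)
        return fixed + (original - fixed) * t
    raise ValueError(f"unknown mode: {mode}")


early = effective_sample_weight(0.2, progress=0.5)                 # before reference level
late = effective_sample_weight(0.2, progress=0.7)                  # reference level reached
halfway = effective_sample_weight(0.2, progress=0.5, mode="linear")
```

  • With the switch mode, a sample whose original weight is 0.2 still counts fully at 50% progress and only drops to 0.2 once the reference level is reached; with the linear mode the same sample is already down-weighted halfway through training.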
  • The present invention also includes the second region extractor 52 machine-learned by the machine learning device 50, that is, a trained learning model configured by a convolutional neural network, and an image processing device equipped with the trained learning model.
  • In the learning data creation device and the machine learning device, the hardware structure of a processing unit that executes various processes, such as a CPU, is one of the various processors described below.
  • The various processors include a CPU (Central Processing Unit), which is a general-purpose processor that executes software (programs) and functions as various processing units; a programmable logic device (PLD) such as an FPGA (Field Programmable Gate Array), which is a processor whose circuit configuration can be changed after manufacturing; and a dedicated circuit such as an ASIC (Application Specific Integrated Circuit).
  • Each of the first, second, and third processors, or one processing unit, may be composed of one of these various processors, or of two or more processors of the same type or different types (for example, a plurality of FPGAs, or a combination of a CPU and an FPGA). A plurality of processing units may also be configured by one processor. As a first example of configuring a plurality of processing units with one processor, one processor is configured by a combination of one or more CPUs and software, as represented by a computer such as a client or a server, and this processor functions as the plurality of processing units. A second example is a System on Chip (SoC).
  • In this way, the various processing units are configured, as a hardware structure, by using one or more of the above-mentioned various processors.
  • More specifically, the hardware structure of these various processors is circuitry that combines circuit elements such as semiconductor elements.
  • Further, the present invention includes a learning data creation program that, when installed in a computer, realizes the various functions of the learning data creation device according to the present invention, and a recording medium on which this learning data creation program is recorded.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Surgery (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Computing Systems (AREA)
  • Radiology & Medical Imaging (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Signal Processing (AREA)
  • Biophysics (AREA)
  • Optics & Photonics (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Molecular Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
PCT/JP2021/030534 2020-09-07 2021-08-20 Training data creation apparatus, method, and program, machine learning apparatus and method, learning model, and image processing apparatus Ceased WO2022050078A1 (ja)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2022546227A JP7457138B2 (ja) 2020-09-07 2021-08-20 Learning data creation device, method, program, and machine learning method
US18/179,329 US20230206609A1 (en) 2020-09-07 2023-03-06 Training data creation apparatus, method, and program, machine learning apparatus and method, learning model, and image processing apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020149585 2020-09-07
JP2020-149585 2020-09-07

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/179,329 Continuation US20230206609A1 (en) 2020-09-07 2023-03-06 Training data creation apparatus, method, and program, machine learning apparatus and method, learning model, and image processing apparatus

Publications (1)

Publication Number Publication Date
WO2022050078A1 (ja) 2022-03-10

Family

ID=80490784

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/030534 Ceased WO2022050078A1 (ja) 2020-09-07 2021-08-20 Training data creation apparatus, method, and program, machine learning apparatus and method, learning model, and image processing apparatus

Country Status (3)

Country Link
US (1) US20230206609A1
JP (1) JP7457138B2
WO (1) WO2022050078A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPWO2023170975A1 * 2022-03-11 2023-09-14

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102924100B1 (ko) * 2022-11-22 2026-02-06 Korea Advanced Institute of Science and Technology (KAIST) Apparatus and method for generating 3D object texture map

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
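JP2011192178A (ja) * 2010-03-16 2011-09-29 Denso It Laboratory Inc Image recognition device and image recognition method
WO2020031243A1 (ja) * 2018-08-06 2020-02-13 Shimadzu Corporation Teacher label image correction method, trained model creation method, and image analysis device
WO2020194662A1 (ja) * 2019-03-28 2020-10-01 Olympus Corporation Information processing system, endoscope system, trained model, information storage medium, information processing method, and method of manufacturing trained model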

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPWO2023170975A1 * 2022-03-11 2023-09-14
JP7768344B2 (ja) 2022-03-11 2025-11-12 Omron Corporation Learning method, leaf state identification device, and program

Also Published As

Publication number Publication date
US20230206609A1 (en) 2023-06-29
JPWO2022050078A1 2022-03-10
JP7457138B2 (ja) 2024-03-27

Similar Documents

Publication Publication Date Title
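CN107492099B (zh) Medical image analysis method, medical image analysis system, and storage medium
CN111656357B (zh) Modeling method, device, and system for ophthalmic disease classification model
Kumar et al. Breast cancer classification of image using convolutional neural network
US12124960B2 (en) Learning apparatus and learning method
CN113688862B (zh) Brain image classification method based on semi-supervised federated learning, and terminal device
Hadavi et al. Lung cancer diagnosis using CT-scan images based on cellular learning automata
CN113096137B (zh) Domain-adaptive segmentation method and system for OCT retinal images
CN110335241B (zh) Method for automatically scoring bowel preparation after colonoscopy
CN116935009B (zh) Surgical navigation system that makes predictions based on historical data analysis
CN114943721A (zh) Method for establishing neck ultrasound image segmentation based on improved U-Net network
WO2022050078A1 (ja) Training data creation apparatus, method, and program, machine learning apparatus and method, learning model, and image processing apparatus
CN114926366B (zh) Method and device for motion artifact correction using artificial neural network
CN112236832A (zh) Diagnosis support system, diagnosis support method, and diagnosis support program
Pandey et al. An analysis of pneumonia prediction approach using deep learning
Thakur et al. Deep reinforcement learning in healthcare and bio-medical applications
WO2022044425A1 (ja) Learning device, learning method, program, trained model, and endoscope system
CN120600323B (zh) Neural-network-based meniscus injury prediction method and system
CN115053296A (zh) Improved surgical report generation method using machine learning, and device therefor
CN116052158A (zh) Spine image processing method, apparatus, computer device, and storage medium
JP7786700B2 (ja) Information processing device, information processing method, and computer program
Rodrigues et al. DermaDL: advanced convolutional neural networks for automated melanoma detection
CN110176007A (zh) Lens segmentation method, device, and storage medium
CN120565099A (zh) Causal-inference-based medical data augmentation method, apparatus, storage medium, and device
CN113345558A (zh) Auxiliary system and method for improving the efficiency of orthopedic diagnostic decisions
CN108985302A (zh) Dermoscopy image processing method, apparatus, and device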

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21864136; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2022546227; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21864136; Country of ref document: EP; Kind code of ref document: A1)