WO2022079847A1 - Information processing system, information processing method, and program - Google Patents
Information processing system, information processing method, and program
- Publication number
- WO2022079847A1 (PCT/JP2020/038843)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- absence
- images
- image
- histopathological
- gene mutation
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/69—Microscopic objects, e.g. biological cells or cellular parts
- G06V20/695—Preprocessing, e.g. image segmentation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- G16H30/40—ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10056—Microscopic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20021—Dividing image into blocks, subimages or windows
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30028—Colon; Small intestine
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
- G06V2201/031—Recognition of patterns in medical or anatomical images of internal organs
Definitions
- The present invention relates to an information processing system, an information processing method, and a program.
- For cancers such as lung cancer, colon cancer, stomach cancer, breast cancer, GIST (gastrointestinal stromal tumor), and skin cancer (for example, malignant melanoma), oncogene tests are performed when the doctor deems them necessary. One or several genes are examined, and drugs are selected and administered based on the test results (see Non-Patent Document 1).
- The results of an oncogene test take 1 to 2 months to arrive, so administration of a drug suited to the patient's genetic abnormality may be delayed, and the condition may progress in the meantime until treatment comes too late.
- One aspect of the present invention has been made in view of the above problem, and its purpose is to provide an information processing system, an information processing method, and a program capable of suppressing a delay in administering a drug suited to a genetic abnormality of a target disease.
- Another aspect of the present invention has been made in view of the above problem, and its purpose is to provide an information processing system, an information processing method, and a program capable of suppressing a delay in administering a drug suited to a genetic abnormality in a patient with colorectal cancer.
- Another aspect of the present invention has been made in view of the above problem, and its purpose is to provide an information processing system capable of suppressing a delay in administering a drug suited to a genetic abnormality in a cancer patient.
- The information processing system includes: an acquisition unit that acquires a histopathological image of tissue from a patient with a target disease; a division unit that divides the patient's histopathological image into a plurality of region images; a feature prediction unit that inputs each region image to each of a plurality of feature prediction models, one constructed for each type of histopathological feature, and acquires prediction information on the presence or absence of each histopathological feature; a selection unit that selects the region images whose acquired combination of presence or absence of histopathological features matches a selection-time combination set in advance; a gene mutation prediction unit that takes the selected region images as input and acquires prediction information on the presence or absence of a gene mutation for each; and a prediction result output unit that uses the prediction information acquired for each region image to output a prediction result on the presence or absence of at least one gene mutation for the patient.
- When a histopathological image of a patient with a target disease is input, a prediction result on the presence or absence of a gene mutation can be obtained immediately.
- The clinician can refer to this prediction result and prescribe the patient a drug that corresponds to it, so a delay in administering a drug suited to the genetic abnormality of the target disease can be suppressed.
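The flow of units described above (divide, predict features, select, predict mutation, output) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the models are stand-in callables, the tile representation is a plain nested list, and averaging per-region scores against a threshold is an assumed aggregation rule that the text does not specify.

```python
from typing import Callable, Dict, List, Optional

Tile = List[List[int]]                   # placeholder region image: rows of pixel values
FeatureModel = Callable[[Tile], bool]    # True means the feature is predicted present
MutationModel = Callable[[Tile], float]  # predicted probability that the mutation is present


def divide(image: Tile, tile_size: int) -> List[Tile]:
    """Split an image into non-overlapping tile_size x tile_size region images."""
    tiles = []
    for top in range(0, len(image) - tile_size + 1, tile_size):
        for left in range(0, len(image[0]) - tile_size + 1, tile_size):
            tiles.append([row[left:left + tile_size]
                          for row in image[top:top + tile_size]])
    return tiles


def predict_features(tile: Tile, models: Dict[str, FeatureModel]) -> Dict[str, bool]:
    """Run one feature prediction model per histopathological feature type."""
    return {name: model(tile) for name, model in models.items()}


def select(tiles: List[Tile], models: Dict[str, FeatureModel],
           preset: Dict[str, bool]) -> List[Tile]:
    """Keep tiles whose predicted presence/absence matches the preset combination."""
    chosen = []
    for tile in tiles:
        predicted = predict_features(tile, models)
        if all(predicted[name] == wanted for name, wanted in preset.items()):
            chosen.append(tile)
    return chosen


def predict_patient(image: Tile, tile_size: int, models: Dict[str, FeatureModel],
                    preset: Dict[str, bool], mutation_model: MutationModel,
                    threshold: float = 0.5) -> Optional[bool]:
    """Return the patient-level call, or None when no region matches the preset."""
    tiles = select(divide(image, tile_size), models, preset)
    if not tiles:
        return None
    scores = [mutation_model(tile) for tile in tiles]
    # Assumed aggregation: mean of per-region scores compared with a threshold.
    return sum(scores) / len(scores) >= threshold
```

Returning `None` when no region matches keeps "no usable region" distinct from "mutation predicted absent", which seems a safer default for a clinical aid.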
- The information processing system is the information processing system according to the first aspect, in which the acquisition unit further acquires the site of the primary lesion of the target disease, and the selection unit selects the plurality of region images using the acquired combination of presence or absence of histopathological features together with the acquired site of the primary lesion of the target disease.
- This raises the probability that the region images input to the gene mutation prediction model are limited to region images related to the target disease, which can improve the accuracy of predicting the presence or absence of a gene mutation.
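Selection that also takes the primary lesion site into account can be sketched as a rule with two parts. The `SelectionRule` class, the feature keys, and the site strings below are hypothetical names chosen for illustration; the text only states that the site is used together with the feature combination.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SelectionRule:
    features: Dict[str, bool]           # required presence (True) or absence (False)
    primary_site: Optional[str] = None  # None means the site is not checked


def matches(rule: SelectionRule, predicted: Dict[str, bool], site: str) -> bool:
    """A region matches when the site (if required) and every listed feature agree."""
    if rule.primary_site is not None and rule.primary_site != site:
        return False
    return all(predicted.get(name) == wanted
               for name, wanted in rule.features.items())


def select_regions(predictions: List[Dict[str, bool]], site: str,
                   rule: SelectionRule) -> List[int]:
    """Return the indices of region images that satisfy the rule."""
    return [i for i, p in enumerate(predictions) if matches(rule, p, site)]
```

Because the site is a property of the patient rather than of a region, a site mismatch rejects every region at once.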
- The information processing system is the information processing system according to the first or second aspect, in which the feature prediction model is a model machine-learned from training data whose input is a region image obtained by dividing a histopathological image and whose output is the histopathological features assigned to that region image, and the gene mutation prediction model is a model machine-learned from training data whose input is a region image selected using a combination of presence or absence of histopathological features and whose output is information on the presence or absence of a specific gene mutation.
- Because the feature prediction model and the gene mutation prediction model are machine-learned models, the accuracy of predicting the presence or absence of a gene mutation can be improved.
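The two kinds of training data described above can be assembled roughly as follows. This sketch assumes each annotated region carries a patient identifier, raw pixel bytes, and feature labels; `AnnotatedRegion` and both helper functions are hypothetical, and the step of actually fitting a model is left out.

```python
from typing import Dict, List, NamedTuple, Tuple


class AnnotatedRegion(NamedTuple):
    patient_id: str
    pixels: bytes
    features: Dict[str, bool]  # histopathological features assigned to the region


def feature_training_pairs(regions: List[AnnotatedRegion],
                           feature_name: str) -> List[Tuple[bytes, bool]]:
    """Input: region image. Output: presence/absence of one histopathological feature."""
    return [(r.pixels, r.features[feature_name]) for r in regions]


def mutation_training_pairs(regions: List[AnnotatedRegion],
                            preset: Dict[str, bool],
                            mutation_status: Dict[str, bool]) -> List[Tuple[bytes, bool]]:
    """Input: regions selected by the preset feature combination.
    Output: the patient's known status for one specific gene mutation."""
    pairs = []
    for r in regions:
        if all(r.features.get(name) == wanted for name, wanted in preset.items()):
            pairs.append((r.pixels, mutation_status[r.patient_id]))
    return pairs
```

Note that every selected region from one patient inherits that patient's mutation label, so the labels are per patient even though the inputs are per region.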
- The information processing system is an information processing system for predicting a gene in which a mutation has occurred in the colorectal cancer tissue of a colorectal cancer patient. It includes: a feature prediction unit that inputs region images obtained from the patient's colorectal cancer histopathological image and acquires prediction information on the presence or absence of histopathological features; a selection unit that selects the region images whose acquired combination of presence or absence of histopathological features matches a selection-time combination set in advance for at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI; a gene mutation prediction unit that takes the selected region images as input and acquires prediction information on the presence or absence of a gene mutation for each; and a prediction result output unit that uses the prediction information acquired for each region image to output, for the patient, a prediction result on the presence or absence of at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI.
- A prediction result on the presence or absence of a gene mutation can be obtained immediately.
- The clinician can refer to this prediction result and prescribe the patient a drug that corresponds to it, so a delay in administering a drug suited to the genetic abnormality of colorectal cancer can be suppressed.
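Producing one result per target (BRAF, BRAF V600E, ERBB2, RAS, TP53, MSI) amounts to repeating selection and prediction with a preset and a model for each target. The sketch below makes several simplifying assumptions: each region is represented only by its predicted feature dictionary, which also stands in for the image passed to the stub model; the presets are supplied by the caller; and a strict majority vote over the selected regions is an assumed aggregation rule.

```python
from typing import Callable, Dict, List, Optional

Features = Dict[str, bool]
GENES = ("BRAF", "BRAF V600E", "ERBB2", "RAS", "TP53", "MSI")


def predict_genes(regions: List[Features],
                  presets: Dict[str, Features],
                  models: Dict[str, Callable[[Features], bool]]
                  ) -> Dict[str, Optional[bool]]:
    """Return a per-target call; None means no region matched that target's preset."""
    report: Dict[str, Optional[bool]] = {}
    for gene in GENES:
        if gene not in presets or gene not in models:
            continue  # target not configured, so it is not reported
        preset = presets[gene]
        chosen = [r for r in regions
                  if all(r.get(name) == wanted for name, wanted in preset.items())]
        if not chosen:
            report[gene] = None
            continue
        votes = [models[gene](r) for r in chosen]
        # Assumed aggregation: strict majority of selected regions.
        report[gene] = sum(votes) * 2 > len(votes)
    return report
```

Only configured targets appear in the report, which matches the "at least one of" wording.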
- The information processing system is the information processing system according to the fourth aspect, in which the acquisition unit further acquires the site of the primary lesion of colorectal cancer, and the selection unit selects the plurality of region images using the acquired site of the primary lesion of colorectal cancer together with at least one determined histopathological feature.
- This raises the probability that the region images input to the gene mutation prediction model are limited to region images related to the target disease, which can improve the accuracy of predicting the presence or absence of a gene mutation.
- The information processing system is the information processing system according to the fourth or fifth aspect, in which the feature prediction model is a model machine-learned from training data whose input is a region image obtained by dividing a histopathological image and whose output is the histopathological features assigned to that region image, and the gene mutation prediction model is a model machine-learned from training data whose input is a region image selected using a combination of presence or absence of histopathological features and whose output is information on the presence or absence of a specific gene mutation.
- Because the feature prediction model and the gene mutation prediction model are machine-learned models, the accuracy of predicting the presence or absence of a gene mutation can be improved.
- The information processing method includes an acquisition procedure for acquiring a histopathological image of a patient with a target disease, a division procedure for dividing the patient's histopathological image into a plurality of region images, and a feature prediction procedure for inputting each region image to each of a plurality of feature prediction models constructed for each type of histopathological feature and acquiring prediction information on the presence or absence of each histopathological feature.
- When a histopathological image of a patient with a target disease is input, a prediction result on the presence or absence of a gene mutation can be obtained immediately.
- The clinician can refer to this prediction result and prescribe the patient a drug that corresponds to it, so a delay in administering a drug suited to the genetic abnormality of the target disease can be suppressed.
- The information processing method is an information processing method for predicting a gene in which a mutation has occurred in the colorectal cancer tissue of a colorectal cancer patient. It includes: a division procedure for dividing the patient's colorectal cancer histopathological image into a plurality of region images; a feature prediction procedure using the plurality of feature prediction models constructed for each type of histopathological feature; a gene mutation prediction procedure for inputting the selected region images to each of the gene mutation prediction models constructed for each combination of presence or absence of histopathological features and type of gene mutation, and acquiring prediction information on the presence or absence of each gene mutation; and a prediction result output procedure for outputting, for the patient, a prediction result on the presence or absence of at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI.
- A prediction result on the presence or absence of a gene mutation can be obtained immediately.
- The clinician can refer to this prediction result and prescribe the patient a drug that corresponds to it, so a delay in administering a drug suited to the genetic abnormality of colorectal cancer can be suppressed.
- The program according to the ninth aspect of the present invention causes a computer to execute: an acquisition procedure for acquiring a histopathological image of a patient with a target disease; a division procedure for dividing the patient's histopathological image into a plurality of region images; a feature prediction procedure for inputting each region image to each of a plurality of feature prediction models constructed for each type of histopathological feature and acquiring prediction information on the presence or absence of each histopathological feature; a gene mutation prediction procedure for inputting the region images chosen by selection to each of the gene mutation prediction models constructed for each type of gene mutation and acquiring prediction information on the presence or absence of each gene mutation; and a prediction result output procedure for using the prediction information acquired for each region image to output at least one prediction result on the presence or absence of a gene mutation for the patient.
- When a histopathological image of a patient with a target disease is input, a prediction result on the presence or absence of a gene mutation can be obtained immediately.
- The clinician can refer to this prediction result and prescribe the patient a drug that corresponds to it, so a delay in administering a drug suited to the genetic abnormality of the target disease can be suppressed.
- The program according to the tenth aspect of the present invention is a program for predicting a gene in which a mutation has occurred in the colorectal cancer tissue of a colorectal cancer patient. It causes a computer to execute: an acquisition procedure for acquiring the patient's colorectal cancer histopathological image; a division procedure for dividing the patient's colorectal cancer histopathological image into a plurality of region images; a feature prediction procedure for inputting each region image to each of the plurality of feature prediction models constructed for each type of histopathological feature and acquiring prediction information on the presence or absence of each histopathological feature; a selection procedure for selecting a plurality of region images whose acquired combination of presence or absence of histopathological features matches a selection-time combination set in advance for at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI; a gene mutation prediction procedure for inputting the selected region images and acquiring prediction information on the presence or absence of a gene mutation for each; and a prediction result output procedure for outputting, for the patient, a prediction result on the presence or absence of at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI.
- A prediction result on the presence or absence of a gene mutation can be obtained immediately.
- The clinician can refer to this prediction result and prescribe the patient a drug that corresponds to it, so a delay in administering a drug suited to the genetic abnormality of colorectal cancer can be suppressed.
- The information processing system is an information processing system for estimating the presence or absence of a BRAF gene mutation in a tumor. It includes a means for dividing a histopathological image of the tumor into one or a plurality of images, and a means for selecting, from the divided images, an image in which the tumor cell ratio exceeds 50%, and/or an image containing a papillary structure, and/or an image containing a non-sawtooth papillary structure.
- A prediction result on the presence or absence of the BRAF gene mutation can be obtained immediately.
- By referring to this prediction result, the clinician can prescribe the patient a drug that corresponds to the predicted presence or absence of the BRAF gene mutation, so a delay in administering a drug suited to the BRAF gene abnormality of the cancer can be suppressed.
- The information processing system is an information processing system that estimates the presence or absence of a BRAF V600E gene mutation in a tumor. It includes: a means for dividing a histopathological image of the tumor into one or a plurality of images; a means for selecting, from the divided images, an image containing a chordal pattern, and/or an image containing small solid follicles, and/or an image in which small solid follicles are composed of normal cells, and/or an image in which large solid follicles are composed of normal cells, and/or an image containing oblong nuclei, and/or an image with mucus, and/or an image with a non-sawtooth papillary structure and no mucus, and/or an image containing a sieving structure and no mucus, and/or an image containing a sieving structure with mucus present; and a means for estimating the presence or absence of a BRAF V600E gene mutation using the selected image.
- By referring to this prediction result, the clinician can prescribe the patient a drug that corresponds to the predicted presence or absence of the BRAF V600E gene mutation, so a delay in administering a drug suited to the BRAF V600E gene abnormality of the cancer can be suppressed.
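Because the image types for BRAF V600E are joined by "and/or", an image qualifies when it satisfies at least one criterion. A minimal sketch of that disjunctive selection follows; the feature keys are hypothetical names, only a subset of the listed criteria is encoded, and the per-image feature dictionaries are assumed to come from upstream feature prediction.

```python
from typing import Callable, Dict, List

Features = Dict[str, bool]
Criterion = Callable[[Features], bool]

# A subset of the BRAF V600E image criteria, paraphrased; keys are hypothetical.
BRAF_V600E_CRITERIA: List[Criterion] = [
    lambda f: f.get("chordal_pattern", False),
    lambda f: f.get("small_solid_follicle", False),
    lambda f: f.get("oblong_nuclei", False),
    lambda f: f.get("mucus", False),
    lambda f: f.get("non_sawtooth_papillary", False) and not f.get("mucus", False),
    lambda f: f.get("sieving_structure", False) and not f.get("mucus", False),
    lambda f: f.get("sieving_structure", False) and f.get("mucus", False),
]


def select_any(images: List[Features], criteria: List[Criterion]) -> List[int]:
    """Return indices of images that satisfy at least one criterion."""
    return [i for i, f in enumerate(images) if any(c(f) for c in criteria)]
```

Keeping each criterion as a separate predicate makes it easy to swap in the list for another target gene without touching the selection logic.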
- The information processing system is an information processing system that estimates the presence or absence of an ERBB2 gene mutation in a tumor. It includes a means for dividing a histopathological image of the tumor into one or a plurality of images, and a means for estimating the presence or absence of an ERBB2 gene mutation.
- By referring to this prediction result, the clinician can prescribe the patient a drug that corresponds to the predicted presence or absence of the ERBB2 gene mutation, so a delay in administering a drug suited to the ERBB2 gene abnormality of the cancer can be suppressed.
- The information processing system is an information processing system that estimates the presence or absence of a TP53 gene mutation in a tumor. It includes: a means for dividing a histopathological image of the tumor into one or a plurality of images; a means for selecting, from the divided images, an image containing ring cells and leaking mucus, and/or an image containing goblet cells, and/or an image containing a cord-like structure and no mucus, and/or an image with leaking mucus; and a means for estimating the presence or absence of a TP53 gene mutation using the selected image.
- By referring to this prediction result, the clinician can prescribe the patient a drug that corresponds to the predicted presence or absence of the TP53 gene mutation, so a delay in administering a drug suited to the TP53 gene abnormality of the cancer can be suppressed.
- The information processing system is an information processing system for estimating the presence or absence of an MSI gene abnormality in a tumor. It includes a means for dividing a histopathological image of the tumor into one or a plurality of images, and a means for selecting, from the divided images, an image containing a chordal pattern, and/or an image containing a cord-like structure and no mucus, and/or an image in which the tumor cell ratio exceeds 50%, and/or an image with solid follicles.
- By referring to this prediction result, the clinician can prescribe the patient a drug that corresponds to the predicted presence or absence of the MSI gene abnormality, so a delay in administering a drug suited to the MSI gene abnormality of the cancer can be suppressed.
- The information processing system is an information processing system for estimating the presence or absence of a RAS gene mutation in a tumor. It includes: a means for dividing a histopathological image of the tumor into one or a plurality of images; a means for selecting, from the divided images, an image containing ring cells and leaking mucus, and/or an image containing a tubular structure and no mucus, and/or an image containing a cord-like structure and no mucus, and/or an image containing small solid follicles, and/or an image containing large solid follicles, and/or an image in which large solid follicles are composed of normal cells, and/or an image that does not include papillary structures, and/or an image containing a sieving structure with mucus present, and/or an image containing a sieving structure with mucus leakage, and/or an image with swelling, and/or an image in which the tumor cell ratio exceeds 50%, and/or an image containing small solid follicles, and/or an image in which small solid follicles are composed of normal cells, and/or an image containing large solid follicles, and/or an image in which large solid follicles are composed of normal cells, and/or an image containing oblong nuclei, and/or an image in which mucus is present, and/or an image with mucus leakage; and a means for estimating the presence or absence of a RAS gene mutation.
- the prediction result of the presence or absence of the RAS gene mutation can be immediately obtained.
- By referring to this prediction result, the clinician can prescribe to the patient a drug corresponding to the predicted presence or absence of the RAS gene mutation, so that delay in administering a drug suited to the RAS gene abnormality of the cancer can be suppressed.
- a prediction result of the presence or absence of a gene mutation can be immediately obtained.
- By referring to this prediction result, the clinician can prescribe to the patient a drug corresponding to the predicted presence or absence of a gene mutation, so that delay in administering a drug suited to the genetic abnormality of the target disease can be suppressed.
- From a histopathological image of the colorectal cancer of a patient with colorectal cancer, a prediction result of the presence or absence of a gene mutation can be immediately obtained.
- By referring to this prediction result, the clinician can prescribe to the patient a drug corresponding to the predicted presence or absence of a gene mutation, so that delay in administering a drug suited to the gene abnormality of the colorectal cancer can be suppressed.
- A set of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the target gene mutation is BRAF+.
- A set of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the target gene mutation is BRAF V600E+.
- A set of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the target gene mutation is ERBB2+.
- A set of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the target gene mutation is TP53+.
- A set of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the state of the genetic abnormality is MSI.
- A set of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the target gene mutation is RAS+. It is a continuation of FIG.
- FIG. 23 This is an example of a table stored in the storage 23. It is a continuation of FIG. It is a schematic diagram which shows the estimation process of the presence or absence of each gene mutation. It is a schematic diagram for demonstrating the method of assembling the prediction result for a region image into the prediction result for each patient. It is a schematic diagram for demonstrating an example of a screen transition in a terminal. It is a flowchart which shows an example of the process flow of the screen transition of FIG.
- As the target disease for predicting the presence or absence of a gene mutation, the present embodiment can be applied to any disease having a gene mutation in a tissue (or any disease for which a drug can be administered according to a gene abnormality), including cancer.
- In the following, the target disease according to the present embodiment is described as colorectal cancer.
- the gene abnormality will be described as being included in the gene mutation.
- FIG. 1 is a schematic configuration diagram of an information processing system according to this embodiment.
- the information processing system S includes terminals 1-1 to 1-N (N is a natural number) and a computer system 2 connected via a communication network NW.
- the terminals 1-1 to 1-N are used by different users, and are, for example, mobile phones such as multifunctional mobile phones (so-called smartphones), tablets, notebook computers, desktop computers, and the like.
- The terminals 1-1 to 1-N may display the information transmitted from the computer system 2 via, for example, a WEB browser, or may display the information transmitted from the computer system 2 on the screen of an application installed on the terminals 1-1 to 1-N.
- In the following description, the terminals 1-1 to 1-N are assumed to display the information transmitted from the computer system 2 via, for example, a WEB browser.
- the computer system 2 can communicate with the terminals 1-1 to 1-N, and these communications may be wired or wireless. That is, the computer system 2 is connected to the terminals 1-1 to 1-N so as to be able to exchange information.
- the computer system 2 is used, for example, by an administrator who manages the information processing system S according to the present embodiment.
- the computer system 2 may be a single computer or a plurality of computers.
- FIG. 2 is a schematic configuration diagram of a computer system according to the present embodiment.
- the computer system 2 includes an input interface 21, a communication module 22, a storage 23, a memory 24, and a processor 25.
- the input interface 21 receives an input from the administrator of the computer system 2 and outputs an input signal corresponding to the received input to the processor 25.
- the communication module 22 is connected to the communication network NW and communicates with the terminals 1-1 to 1-N. This communication may be wired or wireless, but will be described as being wired.
- the storage 23 stores a program for the processor 25 to read and execute, a feature prediction model after machine learning, a variation prediction model after machine learning, and various data.
- the memory 24 temporarily holds data and programs.
- the memory 24 is a volatile memory, for example, a RAM (Random Access Memory).
- the processor 25 functions as an acquisition unit 251 and an output unit 250 by loading a program from the storage 23 into the memory 24 and executing a series of instructions included in the program.
- the output unit 250 has, for example, a division unit 252, a feature prediction unit 253, a selection unit 254, a gene mutation prediction unit 255, and a prediction result output unit 256. Each process will be described later.
- FIG. 3 is a schematic diagram showing the learning process of the gene mutation prediction model.
- The division unit 252 divides a pathological tissue image (for example, a slide image) of the pathological tissue (here, as an example, colorectal cancer tissue) of a patient with a target disease (here, as an example, colorectal cancer) into one or more region images (for example, tile images).
- gene mutations in the diseased tissue of the patient here, as an example, colorectal cancer tissue
- Division into tile images is also referred to as tile division.
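- As a minimal sketch of tile division, the tile coordinates could be enumerated as follows. It assumes square, non-overlapping tiles and discards partial tiles at the right and bottom edges; neither detail is stated in the text. The function name `tile_boxes` and the tile size are hypothetical.

```python
def tile_boxes(width, height, tile_size):
    """Return (left, top, right, bottom) boxes of non-overlapping square tiles.

    Assumption (not from the text): tiles that would extend past the
    image border are dropped rather than padded.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    boxes = []
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            boxes.append((left, top, left + tile_size, top + tile_size))
    return boxes


if __name__ == "__main__":
    # A 1000 x 600 pixel image cut into 256 x 256 tiles.
    for box in tile_boxes(1000, 600, 256):
        print(box)
```

- Each box can then be passed to whatever image library is in use to crop the corresponding region image.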
- Step S2 Subsequently, using, for example, an annotation system, the pathologist inputs histopathological features (for example, an oblong nucleus) for each of the plurality of region images.
- the terminal device (not shown) used by the pathologist accepts histopathological features.
- the acquisition unit 251 acquires the histopathological features given to each of the plurality of region images.
- Step S3 the processor 25 learns the relationship between the region image and the histopathological feature for each histopathological feature, and outputs a feature prediction model.
- Specifically, the processor 25 performs machine learning (for example, deep learning) using learning data in which a region image obtained by dividing a histopathological image is the input and the histopathological feature assigned to that region image is the output, and outputs a feature prediction model.
- feature prediction models are output for the number L of histopathological features (L is a natural number), and these feature prediction models FM-1, ..., FM-L are stored in the storage 23 by the processor 25.
- That is, the feature prediction model is a model machine-learned using learning data in which a region image obtained by dividing the histopathological image is the input and the histopathological feature given to that region image is the output.
- Step S4 The processor 25 then uses the feature prediction models FM-1, ..., FM-L to predict the presence or absence of each histopathological feature for the divided images (here, tile images as an example) that have no annotation for the histopathological features.
- Step S5 The sorting unit 254 selects region images (here, tile images as an example) from the plurality of region images using a combination of the predicted presence or absence of histopathological features. Specifically, the sorting unit 254 uses the histopathological features predicted in step S4 to extract the region images (here, tile images as an example) corresponding to a specific combination of the presence or absence of histopathological features.
- Step S6 The processor 25 learns the relationship between the group of tile images corresponding to a specific combination of the presence or absence of histopathological features and the gene mutation, and outputs a gene mutation prediction model. Specifically, for example, the processor 25 performs machine learning (for example, deep learning) using learning data in which a region image corresponding to a specific combination of the presence or absence of histopathological features (a region image extracted in step S5) is the input and information on the presence or absence of a specific gene mutation is the output, and outputs a gene mutation prediction model.
- gene mutation prediction models are output for the number M of gene mutations (M is a natural number), and these gene mutation prediction models GM-1, ..., GM-M are stored in the storage 23 by the processor 25.
- That is, the gene mutation prediction model is a model machine-learned using learning data in which the region images selected using a specific combination of the presence or absence of histopathological features are the input and information on the specific gene mutation is the output.
- FIG. 4A is a schematic diagram for explaining the process of annotating histopathological features on histopathological images.
- the histopathological image is divided into a plurality of region images.
- Region images outside the cell tissue (hereinafter also referred to as white images) and region images showing something other than the cell tissue (for example, magic ink) are excluded.
- Histopathological features are annotated manually (for example, by a pathologist) on the region images that are not excluded.
- FIG. 4B is a schematic diagram showing an example of the process of annotating histopathological features on histopathological images.
- The histopathological image is divided into a plurality of region images. From these region images, the white images and the region images in which magic ink occupies more than a reference amount (also referred to as magic images) are excluded. For each of the region images that are not excluded, the presence or absence of each histopathological feature is given.
- If a region image has a histopathological feature, this is indicated by a circle, and if it does not, it is indicated by ⁇. In this way, the presence or absence of each of the L histopathological features is given to each region image.
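- The exclusion of white images and magic images described above could be sketched as follows. The text only says that magic ink must occupy "more than the reference"; the pixel tests (`is_white`, `is_ink`), the RGB cut-offs, and both reference fractions are hypothetical values chosen for illustration.

```python
def is_white(rgb):
    # Hypothetical test: a pixel is background if all channels are bright.
    return min(rgb) >= 220


def is_ink(rgb):
    # Hypothetical test: a pixel is marker ink if all channels are dark.
    return max(rgb) <= 60


def fraction(pixels, predicate):
    """Fraction of pixels for which predicate(pixel) is true."""
    return sum(1 for p in pixels if predicate(p)) / len(pixels)


def keep_tile(pixels, white_ref=0.9, ink_ref=0.5):
    """Return False for white images and magic images, True otherwise.

    pixels is a flat list of (r, g, b) tuples for one region image.
    white_ref and ink_ref are assumed reference fractions.
    """
    if not pixels:
        raise ValueError("empty region image")
    if fraction(pixels, is_white) > white_ref:
        return False  # white image: almost no tissue
    if fraction(pixels, is_ink) > ink_ref:
        return False  # magic image: mostly marker ink
    return True


if __name__ == "__main__":
    tissue = [(200, 120, 160)] * 100
    print(keep_tile(tissue))
```

- Only the region images for which `keep_tile` returns True would then be passed on to annotation or to the feature prediction models.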
- FIG. 5 is a schematic diagram illustrating learning and testing in constructing a feature prediction model for a certain histopathological feature.
- The training data is a set of pairs of a region image and the presence or absence of a certain histopathological feature. The region image is given as the input of the machine learning model, the presence or absence of the feature is given as its output, and machine learning is performed.
- As a result, the relationship between the region image and the presence or absence of the histopathological feature is learned, and a feature prediction model for predicting that histopathological feature is generated.
- 5-fold cross validation is performed on 80% of the total learning data. That is, in each round, learning is performed with 64% of the total learning data and verification with 16% of the total. Specifically, verification proceeds as follows. First, 80% of the total learning data is divided into five subsets, referred to here as s1, s2, s3, s4, and s5. The model is first trained using s1, s2, s3, and s4 as the training subset and then verified using s5 as the verification subset.
- The evaluation index (for example, accuracy or F1 value) obtained at this time is denoted e1.
- Next, learning is performed on s2, s3, s4, and s5, and evaluation is performed on s1.
- In this way, learning and evaluation are repeated while rotating the subsets.
- As a result, five evaluation indexes are obtained.
- The model with the best evaluation index among these five is adopted as the feature prediction model.
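- The 5-fold procedure described above could be sketched as follows. This is an illustration under stated assumptions, not the embodiment's implementation: `train_fn` and `eval_fn` are hypothetical callables standing in for model training and for computing the evaluation index, the data is shuffled with a fixed seed, and the folds are visited in a different order from the s5-first order in the text (the order does not affect which model is best).

```python
import random


def five_fold_select(samples, train_fn, eval_fn, seed=0):
    """Hold out 20% of the data, run 5-fold cross validation on the other
    80%, and return (best_model, scores, held_out).

    train_fn(train_samples) -> model      (hypothetical)
    eval_fn(model, val_samples) -> float  (hypothetical, higher is better)
    """
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_dev = len(items) * 4 // 5           # 80% used for cross validation
    dev, held_out = items[:n_dev], items[n_dev:]
    folds = [dev[i::5] for i in range(5)]  # s1..s5, each 16% of the total

    best_model, best_score, scores = None, float("-inf"), []
    for k in range(5):
        val = folds[k]
        train = [x for j, fold in enumerate(folds) if j != k for x in fold]
        model = train_fn(train)            # learns on 64% of the total
        score = eval_fn(model, val)        # evaluation index e_k
        scores.append(score)
        if score > best_score:
            best_model, best_score = model, score
    return best_model, scores, held_out


if __name__ == "__main__":
    # Dummy callables that only report subset sizes.
    model, scores, held_out = five_fold_select(
        range(100), train_fn=len, eval_fn=lambda m, val: len(val))
    print(model, scores, len(held_out))
```

- With 100 samples, each round trains on 64 samples and evaluates on 16, matching the 64% and 16% figures in the text, and the remaining 20 samples stay untouched for testing.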
- FIG. 6A is a schematic diagram illustrating the addition of predicted values of histopathological features to the region image.
- FIG. 6A shows a feature prediction model FM-1 for histopathological feature 1, a feature prediction model FM-2 for histopathological feature 2, a feature prediction model FM-3 for histopathological feature 3, ..., and a feature prediction model FM-L for histopathological feature L.
- The predicted value of a histopathological feature is the degree to which a region image has that histopathological feature, determined for each region image based on the prediction results of the feature prediction models FM-1, ..., FM-L.
- The predicted value of the histopathological feature may be the prediction result itself (for example, a numerical value from 0 to 1), or a value obtained by comparing the prediction result with a threshold value (for example, a value of 0 or 1).
- In the following, the predicted value of the histopathological feature is described as a numerical value from 0 to 1.
- FIG. 6B is a schematic diagram illustrating an example of imparting a predicted value of histopathological features to a region image.
- In FIG. 6B, feature prediction values 1 to L for the histopathological features are given to each of the region images other than the white images and the magic images.
- The processor 25 may determine that a region image has the histopathological feature when the feature predicted value exceeds the threshold value, and that it does not have the histopathological feature when the feature predicted value is equal to or less than the threshold value.
- The threshold value may be set to a different value for each histopathological feature or to the same value; here, as an example, a different value is assumed to be set for each histopathological feature.
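- The threshold comparison with one threshold per histopathological feature could be sketched as follows. The feature names and threshold values are made up for illustration; only the rule (present when the predicted value exceeds the threshold, absent when it is equal to or less than it) comes from the text.

```python
def binarize_features(predicted, thresholds):
    """Turn feature predicted values (0 to 1) into presence/absence flags.

    A feature is judged present only when its predicted value strictly
    exceeds the threshold set for that feature.
    """
    return {name: value > thresholds[name] for name, value in predicted.items()}


if __name__ == "__main__":
    # Hypothetical feature names and per-feature thresholds.
    thresholds = {"oblong nucleus": 0.5, "mucus present": 0.7}
    predicted = {"oblong nucleus": 0.62, "mucus present": 0.7}
    print(binarize_features(predicted, thresholds))
```

- Note that a predicted value exactly equal to the threshold is treated as "feature absent", as stated above.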
- FIG. 7 is a schematic diagram for explaining learning and testing in constructing a gene mutation prediction model for a certain gene mutation.
- The training data is a set of pairs of a region image after selection and the presence or absence of a certain gene mutation. The region image after selection is given as the input of the machine learning model, the presence or absence of the gene mutation is given as its output, and machine learning is performed. As a result, the relationship between the region image and the presence or absence of the gene mutation is learned, and a gene mutation prediction model for predicting that gene mutation is generated.
- 5-fold cross validation is performed on 80% of the total learning data. That is, in each round, learning is performed with 64% of the total learning data and verification with 16% of the total. Specifically, verification proceeds as follows. First, 80% of the total learning data is divided into five subsets, referred to here as s1, s2, s3, s4, and s5. The model is first trained using s1, s2, s3, and s4 as the training subset and then verified using s5 as the verification subset.
- The evaluation index (for example, accuracy or F1 value) obtained at this time is denoted e1.
- Next, learning is performed on s2, s3, s4, and s5, and evaluation is performed on s1.
- In this way, learning and evaluation are repeated while rotating the subsets.
- As a result, five evaluation indexes are obtained.
- The model with the best evaluation index among these five is adopted as the gene mutation prediction model.
- FIG. 8 is an example of a test of a gene mutation prediction model for colorectal cancer.
- A feature prediction model and a gene mutation prediction model are trained on 80% of the 2nd stage data.
- The trained feature prediction model and the trained gene mutation prediction model are then applied to the 1st stage data, the 2.5th stage data, and the TCGA data.
- the TCGA data is limited to colon adenocarcinoma (COAD) and rectal adenocarcinoma (READ) among the colorectal cancers.
- From all the experimental data, for each type of gene mutation (or gene abnormality) of interest, the combinations of the first feature (the site of the primary lesion) and the second feature that meet all of the conditions (1) to (3) below were selected.
- the AUC is an area (integral) under the ROC curve with the first axis as the false positive rate and the second axis as the true positive rate. The range of this area can take a value between 0 and 1.
- The AUC is 0.8 or more with either case assembly method 1 or case assembly method 2 (that is, highly accurate prediction is possible even in a cohort not used for learning).
- the number of required tiles is the smallest among those with the same combination of image types and features.
- (1) Case assembly method 1: the case (target patient) is predicted to have a gene mutation when the average of the predicted values of the gene mutation prediction model over the region images included in the selected region image group (for example, the region image group of FIG. 19) is equal to or higher than a threshold th, and is otherwise predicted to have no gene mutation.
- (2) Case assembly method 2: the target patient is predicted to have a gene mutation when the ratio of the number of tiles whose predicted value from the gene mutation prediction model is equal to or greater than a first threshold th1 to the total number of region images in the selected region image group (for example, the region image group of FIG. 19) is equal to or greater than a second threshold th2, and is otherwise predicted to have no gene mutation.
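- The two case assembly methods could be sketched as follows. The thresholds th, th1, and th2 are named in the text, but their values are not given, so the numbers in the usage example are placeholders; the function names are hypothetical.

```python
def case_assembly_1(tile_scores, th):
    """Method 1: mutation predicted when the mean tile score is >= th."""
    if not tile_scores:
        raise ValueError("no selected region images")
    return sum(tile_scores) / len(tile_scores) >= th


def case_assembly_2(tile_scores, th1, th2):
    """Method 2: mutation predicted when the fraction of tiles whose score
    is >= th1 is itself >= th2."""
    if not tile_scores:
        raise ValueError("no selected region images")
    positive_tiles = sum(1 for s in tile_scores if s >= th1)
    return positive_tiles / len(tile_scores) >= th2


if __name__ == "__main__":
    # Predicted values of the gene mutation prediction model for the
    # selected region images of one patient (made-up numbers).
    scores = [0.9, 0.8, 0.1, 0.2]
    print(case_assembly_1(scores, th=0.4))
    print(case_assembly_2(scores, th1=0.5, th2=0.5))
```

- Method 1 is sensitive to how strongly individual tiles respond, while method 2 only counts how many tiles respond at all; the two can therefore disagree for the same patient.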
- FIG. 9 is a set of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the target gene mutation is BRAF +.
- BRAF + means all BRAF mutations including BRAF V600E.
- FIG. 10 is a set of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the target gene mutation is BRAF V600E +.
- FIG. 11 is a set of combinations of selected primary lesion sites and the presence or absence of histopathological features at the time of selection when the target gene mutation is ERBB2 +.
- FIG. 12 is a set of combinations of selected primary lesion sites and the presence or absence of histopathological features at the time of selection when the target gene mutation is TP53 +.
- FIG. 13 is a set of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the state of the genetic abnormality is MSI.
- FIG. 14 is a set of combinations of selected primary lesion sites and the presence or absence of histopathological features at the time of selection when the target gene mutation is RAS +.
- FIG. 15 is a continuation of FIG.
- RAS + means KRAS + or NRAS +.
- The AUC of case assembly method 1 and the AUC of case assembly method 2 are shown for the 1st stage data, for the 2.5th stage data, and for the TCGA data.
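- As an illustration of how such an AUC could be computed from per-patient results, the sketch below uses the standard equivalence between the area under the ROC curve and the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auc`, the labels, and the scores are hypothetical; the text does not say how the AUC values were computed.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for patients with the gene mutation, 0 for patients without.
    scores: per-patient scores, e.g. the mean tile score of method 1.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(auc(labels, scores))
```

- A value of 1 means every mutated case is ranked above every non-mutated case, and 0.5 corresponds to random ranking.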
- In FIGS. 9 to 17, "XX structure (or cell) and mucus YY" means having the XX structure (or cell) and, at the same time, the mucus characteristic YY.
- "Sieving structure and no mucus" in FIGS. 9, 10, 13, 14, 15, 16, and 17 means "a sieving structure and, at the same time, no mucus present".
- "Sieving structure and mucus present" means "a sieving structure and, at the same time, mucus present".
- "Non-sawtooth papilla structure and no mucus" means "a non-sawtooth papilla structure and, at the same time, no mucus".
- "Cord-like structure and mucus present" means "a cord-like structure and, at the same time, mucus present".
- "Cord-like structure and mucus leakage" means "a cord-like structure and, at the same time, mucus leakage".
- "Signet ring cell and mucus leakage" in FIGS. 12 and 16 means "signet ring cells and, at the same time, mucus leakage".
- "Cord-like structure and no mucus" in FIGS. 12, 13 and 16 means "a cord-like structure and, at the same time, no mucus".
- "Sieving structure and mucus leakage" in FIGS. 13, 14, 15, 16 and 17 means "a sieving structure and, at the same time, mucus leakage".
- "Signet ring cells and mucus leakage" in FIGS. 14 and 17 means "signet ring cells and, at the same time, mucus leakage".
- "Tubular structure and no mucus" in FIGS. 14 and 17 means "a tubular structure and, at the same time, no mucus".
- "Non-sawtooth papilla structure and mucus present" in FIGS. 14 and 17 means "a non-sawtooth papilla structure and, at the same time, mucus present".
- "Tubular structure and mucus present" in FIGS. 15 and 17 means "a tubular structure and, at the same time, mucus present".
- "Tubular structure and mucus leakage" in FIGS. 15 and 17 means "a tubular structure and, at the same time, mucus leakage".
- FIG. 16 is an example of a table stored in the storage 23.
- FIG. 17 is a continuation of FIG.
- In the BRAF table T1 of FIG. 16, records of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the target gene mutation is BRAF+ are accumulated.
- In the BRAF V600E table T2, records of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the target gene mutation is BRAF V600E are accumulated.
- In the ERBB2 table T3 of FIG. 16, records of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the target gene mutation is ERBB2 are accumulated.
- In the TP53 table T4 of FIG. 16, records of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the target gene mutation is TP53 are accumulated.
- In the MSI table T5 of FIG. 16, records of combinations of the site of the selected primary lesion and the presence or absence of histopathological features at the time of selection when the state of the genetic abnormality is MSI are accumulated.
- FIG. 18 is a schematic diagram showing an estimation process for the presence or absence of each gene mutation.
- Step S11 the acquisition unit 251 acquires a histopathological image of a patient whose gene analysis has not been performed and whose gene mutation is unknown.
- Step S12 the division unit 252 divides the histopathological image of the patient into a plurality of region images.
- The feature prediction unit 253 predicts the presence or absence of the target histopathological feature for each region image using the feature prediction model constructed for each type of histopathological feature. More specifically, for example, the feature prediction unit 253 inputs a region image to each of the plurality of feature prediction models constructed for each type of histopathological feature and acquires prediction information on the presence or absence of each histopathological feature.
- The feature prediction unit 253 selects from the plurality of region images using the acquired combination of the presence or absence of histopathological features. More specifically, for example, the feature prediction unit 253 extracts, from the plurality of region images, the region images whose acquired combination of the presence or absence of histopathological features matches a specific combination of the presence or absence of histopathological features set for each genetic abnormality.
- A specific combination of the presence or absence of histopathological features set for a BRAF gene abnormality is, for example, one of the "combinations of the presence or absence of histopathological features at the time of selection" stored in the BRAF table T1 of FIG. 16.
- A specific combination of the presence or absence of histopathological features set for a BRAF V600E gene abnormality is, for example, one of the "combinations of the presence or absence of histopathological features at the time of selection" stored in the BRAF V600E table T2.
- A specific combination of the presence or absence of histopathological features set for an ERBB2 gene abnormality is, for example, one of the "combinations of the presence or absence of histopathological features at the time of selection" stored in the ERBB2 table T3 of FIG. 16.
- A specific combination of the presence or absence of histopathological features set for a TP53 gene abnormality is, for example, one of the "combinations of the presence or absence of histopathological features at the time of selection" stored in the TP53 table T4 of FIG. 16.
- A specific combination of the presence or absence of histopathological features set for an MSI gene abnormality is, for example, one of the "combinations of the presence or absence of histopathological features at the time of selection" stored in the MSI table T5 of FIG. 16.
- A specific combination of the presence or absence of histopathological features set for a RAS gene abnormality is, for example, one of the "combinations of the presence or absence of histopathological features at the time of selection" stored in the RAS table T6.
- The gene mutation prediction unit 255 predicts the presence or absence of the target gene mutation in each region image using the gene mutation prediction model constructed for each combination of the presence or absence of histopathological features and for each type of gene mutation. More specifically, for example, the gene mutation prediction unit 255 inputs the selected region images to the gene mutation prediction model constructed for each combination of the presence or absence of histopathological features and each type of gene mutation, and acquires prediction information on the presence or absence of each gene mutation.
- Step S16 the prediction result output unit 256 predicts the presence or absence of each gene mutation in the patient by using the prediction result for each region image.
- the prediction result output unit 256 outputs a list of prediction results regarding the presence or absence of each gene mutation, for example.
- the prediction result of the presence or absence of a gene mutation does not have to be the prediction result of the presence or absence of all gene mutations, and may be at least one or more.
- the prediction result output unit 256 outputs the prediction result of the presence or absence of at least one gene mutation for the patient by using the prediction information of the presence or absence of the gene mutation for each acquired region image.
- In the above, the region images that match the "combination of the presence or absence of histopathological features at the time of selection" are selected, the selected region images are input to the gene mutation prediction model to acquire a gene mutation prediction result for each region image, and the prediction results for the region images are used to predict the presence or absence of the target gene mutation in the target patient, but the present invention is not limited to this.
- This series of steps may be carried out for each "combination of the presence or absence of histopathological features at the time of sorting". In this case, it may be collectively output which set of "combination of presence / absence of histopathological features at the time of selection" was predicted to have a gene mutation of the target. Further, by carrying out the above treatment across a plurality of gene mutations, the above output may be carried out across the plurality of gene mutations.
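- The estimation flow for one patient and one gene mutation (feature prediction, selection, tile-level mutation prediction, and assembly into a patient-level result) could be sketched as follows. Every name here is hypothetical: the models are plain callables returning a value from 0 to 1, assembly uses case assembly method 1, and returning None when no region image is selected is an assumption of this sketch.

```python
def predict_patient(tiles, feature_models, thresholds, required,
                    mutation_model, th):
    """Return True/False for the patient, or None if no tile is selected.

    feature_models: {feature name: callable(tile) -> value in [0, 1]}
    thresholds:     {feature name: threshold for that feature}
    required:       {feature name: True/False} combination used for selection
    mutation_model: callable(tile) -> value in [0, 1]
    th:             threshold of case assembly method 1
    """
    selected = []
    for tile in tiles:
        # Predict presence/absence of every histopathological feature.
        present = {name: model(tile) > thresholds[name]
                   for name, model in feature_models.items()}
        # Keep the tile only if it matches the required combination.
        if all(present[name] == want for name, want in required.items()):
            selected.append(tile)
    if not selected:
        return None  # assumption: no patient-level prediction is made
    scores = [mutation_model(tile) for tile in selected]
    return sum(scores) / len(scores) >= th


if __name__ == "__main__":
    # Toy stand-ins: each "tile" is just a number and the models echo it.
    tiles = [0.1, 0.6, 0.9]
    result = predict_patient(
        tiles,
        feature_models={"tumor cell ratio more than 50%": lambda t: t},
        thresholds={"tumor cell ratio more than 50%": 0.5},
        required={"tumor cell ratio more than 50%": True},
        mutation_model=lambda t: t,
        th=0.7,
    )
    print(result)
```

- Running this function once per combination and per gene mutation would give the kind of collective output described above.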
- the acquisition unit 251 may further acquire the site of the primary lesion of the target disease.
- The sorting unit 254 may select from the plurality of region images by using both the acquired combination of the presence or absence of histopathological features and the acquired site of the primary lesion of the target disease.
- For example, the sorting unit 254 may refer to the BRAF table T1 (see FIG. 16) stored in the storage 23 and, for one record, select the region images that match the pair of "site of the primary lesion" and "combination of the presence or absence of histopathological features at the time of selection".
- For example, when the site of the primary lesion is the "left side of the large intestine" and the combination of the presence or absence of histopathological features at the time of selection is "tumor cell ratio more than 50%", the region images corresponding to "tumor cell ratio more than 50%" may be selected. By selecting region images using the site of the primary lesion of the target disease in this way, the region images input to the gene mutation prediction model are more likely to be limited to region images related to the target disease, which can improve the prediction accuracy of the presence or absence of the gene mutation.
- In the above, the region images that match the pair of "site of the primary lesion" and "combination of the presence or absence of histopathological features at the time of selection" are selected, the selected region images are input to the gene mutation prediction model to acquire a gene mutation prediction result for each region image, and the prediction results for the region images are used to predict the presence or absence of the target gene mutation in the target patient. This series of steps may be carried out for each pair of "site of the primary lesion" and "combination of the presence or absence of histopathological features at the time of selection".
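- Selection against one table record could be sketched as follows. The record layout (a dict with a `site` string and a `features` mapping) and the tile layout are assumptions made for this example, not the stored table format, and skipping a record whose site differs from the patient's primary lesion site is one possible reading of the text.

```python
def matches(tile_features, required):
    """True if the tile shows exactly the required presence/absence pattern."""
    return all(tile_features.get(name) == want
               for name, want in required.items())


def select_tiles(tiles, patient_site, record):
    """Return the tiles matching one record of a table such as T1.

    tiles:  list of {"id": ..., "features": {feature name: bool}}
    record: {"site": str, "features": {feature name: bool}}  (hypothetical)
    """
    if record["site"] != patient_site:
        return []  # record does not apply to this patient's primary site
    return [t for t in tiles if matches(t["features"], record["features"])]


if __name__ == "__main__":
    record = {"site": "left side of the large intestine",
              "features": {"tumor cell ratio more than 50%": True}}
    tiles = [
        {"id": 1, "features": {"tumor cell ratio more than 50%": True}},
        {"id": 2, "features": {"tumor cell ratio more than 50%": False}},
    ]
    picked = select_tiles(tiles, "left side of the large intestine", record)
    print([t["id"] for t in picked])
```

- Looping `select_tiles` over every record of a table gives one selected region image group per pair of primary lesion site and feature combination.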
- FIG. 19 is a schematic diagram for explaining a method of assembling a prediction result for a region image into a prediction result for each patient.
- FIG. 19 shows a set of the image region after selection derived from the target patient and the predicted value for each region image by the gene mutation prediction model.
- The prediction result output unit 256 predicts the presence or absence of a gene mutation at the case level by case assembly method 1 or case assembly method 2.
- the prediction result output unit 256 does not perform gene mutation prediction itself at the case level (as a result, it is treated as if there is no gene mutation).
- FIG. 20 is a schematic diagram for explaining an example of screen transition in a terminal.
- A text box TB1 for entering the path of the patient's histopathological image file, and a reference button B1 for browsing to that file, are provided on the screen G1 of the terminal 1.
- a radio button B2 for selecting the position of the primary lesion of colorectal cancer is provided.
- When the transmission button B3 is pressed with the histopathological image of the patient selected and the site of the primary lesion of colorectal cancer selected, the screen transitions to the screen G2.
- a list of prediction results of the presence or absence of gene mutation is displayed.
- Step S110 First, the terminal 1 receives a histopathological image of a patient and a site of a primary lesion of colorectal cancer.
- Step S120 Next, the terminal 1 transmits the histopathological image of the patient and the site of the primary lesion of colorectal cancer to the computer system 2.
- Step S130 Next, the computer system 2 receives the histopathological image of the patient and the site of the primary lesion of colorectal cancer and, using them, outputs information for displaying the prediction result of the presence or absence of each gene mutation. Since the details of this process have been described with reference to FIG. 18, the description is omitted here.
- Step S140 Next, the computer system 2 transmits information for displaying the prediction result of the presence or absence of each gene mutation to the terminal 1.
- Step S150 the terminal 1 receives information for displaying the prediction result of the presence or absence of each gene mutation, and displays the prediction result for the presence or absence of each gene mutation using this information. This completes the processing of this flowchart.
- The information processing system S includes: an acquisition unit 251 that acquires a histopathological image of a patient with a target disease; a division unit 252 that divides the histopathological image of the patient into a plurality of region images; a feature prediction unit 253 that inputs a region image to each of a plurality of feature prediction models constructed for each type of histopathological feature and acquires prediction information on the presence or absence of each histopathological feature; a sorting unit 254 that selects a plurality of region images for which the acquired combination of the presence or absence of histopathological features matches a preset combination of the presence or absence of histopathological features for selection; a gene mutation prediction unit 255 that inputs the selected region images to each of the gene mutation prediction models constructed for each combination of the presence or absence of histopathological features and each type of gene mutation, and acquires prediction information on the presence or absence of each gene mutation; and a prediction result output unit 256 that outputs, for the patient, a prediction result of the presence or absence of at least one gene mutation by using the acquired prediction information for each region image.
- When a histopathological image of a patient with the target disease is input, a prediction result of the presence or absence of a gene mutation can be obtained immediately.
- The clinician can refer to this prediction result and prescribe to the patient a drug corresponding to the predicted presence or absence of the gene mutation, so that delays in administering a drug suited to the genetic abnormality of the target disease can be suppressed.
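The chain of units 251 to 256 can be pictured as a small pipeline. The sketch below is only an illustration: every argument is a hypothetical stand-in for a trained model or a unit, and the mean-based decision, the 0.5 threshold, and the `None` return for an empty selection are assumptions rather than details taken from the application.

```python
from statistics import mean


def predict_patient(image, split, feature_models, matches_combination,
                    mutation_model, threshold=0.5):
    """Sketch of the flow: divide, predict features, select, predict mutation.

    Hypothetical stand-ins: `split` plays the division unit 252,
    `feature_models` (name -> callable) the feature prediction unit 253,
    `matches_combination` the sorting unit 254, and `mutation_model` the
    gene mutation prediction unit 255.
    """
    tiles = split(image)
    selected = []
    for tile in tiles:
        features = {name: model(tile) for name, model in feature_models.items()}
        if matches_combination(features):
            selected.append(tile)
    if not selected:
        return None  # assumption: nothing matched, so no decision is made here
    scores = [mutation_model(tile) for tile in selected]
    return mean(scores) >= threshold  # assumption: mean-based patient decision
```

With toy stand-ins (for example a list of numbers as the "image" and identity functions as models) the function can be exercised without any trained model, which makes the data flow between the units easy to check in isolation.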
- the information processing system is, for example, an information processing system for predicting a gene in which a mutation has occurred in a colorectal cancer tissue of a colorectal cancer patient.
- The information processing system S includes an acquisition unit 251 that acquires a colorectal cancer histopathological image of the patient, and a division unit 252 that divides that image into a plurality of region images. It also includes a feature prediction unit 253 that inputs a region image to each of a plurality of feature prediction models constructed for each type of histopathological feature and acquires prediction information on the presence or absence of each histopathological feature.
- The information processing system S further includes: a sorting unit 254 that selects a plurality of region images for which the acquired combination of the presence or absence of histopathological features matches the combination of the presence or absence of histopathological features for selection that is preset for at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI; a gene mutation prediction unit 255 that inputs the selected region images to each of the gene mutation prediction models constructed for each combination of the presence or absence of histopathological features and each type of gene mutation, and acquires prediction information on the presence or absence of each gene mutation; and a prediction result output unit 256 that outputs a prediction result of the presence or absence of at least one gene mutation.
- With this configuration, a prediction result of the presence or absence of a gene mutation can be obtained immediately.
- The clinician can refer to this prediction result and prescribe to the patient a drug corresponding to the predicted presence or absence of the gene mutation, so that delays in administering a drug suited to the gene abnormality of the colorectal cancer can be suppressed.
- The information processing system is an information processing system for estimating the presence or absence of a BRAF gene mutation in a tumor, comprising: means for dividing a histopathological image of the tumor into one or a plurality of images; means for selecting, from the divided images, an image in which the tumor cell ratio exceeds 50%, and/or an image containing a papillary structure, and/or an image containing a non-serrated papillary structure, and/or an image containing a non-serrated papillary structure in which mucus is present, and/or an image containing a cribriform structure, and/or an image containing a cribriform structure in which no mucus is present, and/or an image containing a cribriform structure in which mucus is present; and means for estimating the presence or absence of a BRAF gene mutation using the selected images.
- By referring to this prediction result, the clinician can prescribe to the patient a drug corresponding to the predicted presence or absence of the BRAF gene mutation, so that delays in administering a drug suited to the BRAF gene abnormality of the cancer can be suppressed.
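- The information processing system is an information processing system for estimating the presence or absence of a BRAF V600E gene mutation in a tumor, comprising: means for dividing a histopathological image of the tumor into one or a plurality of images; means for selecting, from the divided images, an image containing a railroad track pattern, and/or an image containing small solid nests, and/or an image in which small solid nests are composed of conventional-type cells, and/or an image in which large solid nests are composed of conventional-type cells, and/or an image containing elongated oval nuclei, and/or an image in which mucus is present, and/or an image containing a non-serrated papillary structure in which no mucus is present, and/or an image containing a cribriform structure in which no mucus is present, and/or an image containing a cribriform structure in which mucus is present; and means for estimating the presence or absence of a BRAF V600E gene mutation using the selected images.
- By referring to this prediction result, the clinician can prescribe to the patient a drug corresponding to the predicted presence or absence of the BRAF V600E gene mutation, so that delays in administering a drug suited to the BRAF V600E gene abnormality of the cancer can be suppressed.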
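- The information processing system is an information processing system for estimating the presence or absence of an ERBB2 gene mutation in a tumor, comprising: means for dividing a histopathological image of the tumor into one or a plurality of images; means for selecting, from the divided images, an image containing a cord-like structure in which mucus is present, and/or an image containing a cord-like structure with mucus leakage; and means for estimating the presence or absence of an ERBB2 gene mutation with the selected images as the analysis target.
- By referring to this prediction result, the clinician can prescribe to the patient a drug corresponding to the predicted presence or absence of the ERBB2 gene mutation, so that delays in administering a drug suited to the ERBB2 gene abnormality of the cancer can be suppressed.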
- The information processing system is an information processing system for estimating the presence or absence of a TP53 gene mutation in a tumor, comprising: means for dividing a histopathological image of the tumor into one or a plurality of images; means for selecting, from the divided images, an image containing signet-ring cells with mucus leakage, and/or an image containing goblet cells, and/or an image containing a cord-like structure in which no mucus is present, and/or an image with mucus leakage; and means for estimating the presence or absence of a TP53 gene mutation using the selected images.
- By referring to this prediction result, the clinician can prescribe to the patient a drug corresponding to the predicted presence or absence of the TP53 gene mutation, so that delays in administering a drug suited to the TP53 gene abnormality of the cancer can be suppressed.
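- The information processing system is an information processing system for estimating the presence or absence of an MSI gene abnormality in a tumor, comprising: means for dividing a histopathological image of the tumor into one or a plurality of images; means for selecting, from the divided images, an image containing a railroad track pattern, and/or an image containing a cord-like structure in which no mucus is present, and/or an image in which the tumor cell ratio exceeds 50%, and/or an image containing solid nests, and/or an image in which small solid nests are composed of conventional-type cells, and/or an image containing elongated oval nuclei, and/or an image containing a papillary structure, and/or an image containing goblet cells, and/or an image containing a non-serrated papillary structure, and/or an image containing round-to-oval nuclei, and/or an image containing a cribriform structure in which no mucus is present, and/or an image containing a cribriform structure in which mucus is present, and/or an image containing a cribriform structure with mucus leakage; and means for estimating the presence or absence of an MSI gene abnormality using the selected images.
- By referring to this prediction result, the clinician can prescribe to the patient a drug corresponding to the predicted presence or absence of the MSI gene abnormality, so that delays in administering a drug suited to the MSI gene abnormality of the cancer can be suppressed.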
- The information processing system is an information processing system for estimating the presence or absence of a RAS gene mutation in a tumor, comprising: means for dividing a histopathological image of the tumor into one or a plurality of images; means for selecting, from the divided images, images showing one or more of the following; and means for estimating the presence or absence of a RAS gene mutation using the selected images. The images to be selected are: images containing signet-ring cells with mucus leakage; a tubular structure with no mucus present; a cord-like structure with no mucus present; small solid nests; large solid nests; large solid nests composed of conventional-type cells; a papillary structure; no mucus present; mucus present; goblet cells; a non-serrated papillary structure; a non-serrated papillary structure with no mucus present; a non-serrated papillary structure with mucus present; a cribriform structure; a cribriform structure with no mucus present; a cribriform structure with mucus present; a cribriform structure with mucus leakage; tumor budding; a tumor cell ratio exceeding 50%; small solid nests composed of conventional-type cells; elongated oval nuclei; mucus leakage; a tubular structure with mucus present; a tubular structure with mucus leakage; a railroad track pattern; high-grade cellular atypia; a cord-like structure with mucus present; a cord-like structure with mucus leakage; solid nests; and round-to-oval nuclei.
- By referring to this prediction result, the clinician can prescribe to the patient a drug corresponding to the predicted presence or absence of the RAS gene mutation, so that delays in administering a drug suited to the RAS gene abnormality of the cancer can be suppressed.
- At least a part of the computer system 2 described in the above-described embodiment may be configured by hardware or software.
- a program that realizes at least a part of the functions of the computer system 2 may be stored in a recording medium such as a flexible disk or a CD-ROM, read by a computer, and executed.
- the recording medium is not limited to a removable one such as a magnetic disk or an optical disk, and may be a fixed recording medium such as a hard disk device or a memory.
- a program that realizes at least a part of the functions of the computer system 2 may be distributed via a communication line (including wireless communication) such as the Internet. Further, the program may be encrypted, modulated, compressed, and distributed via a wired line or a wireless line such as the Internet, or stored in a recording medium.
- the computer system 2 may be operated by one or more information devices.
- When a plurality of information devices are used, one of them may be a computer, and a function may be realized as at least one means of the computer system 2 by that computer executing a predetermined program.
- All the processes (steps) may be realized by automatic control by a computer. Alternatively, each process may be performed by a computer while the progression between processes is controlled manually, or at least a part of the whole process may be performed manually.
- the present invention is not limited to the above embodiment as it is, and at the implementation stage, the components can be modified and embodied within a range that does not deviate from the gist thereof.
- Various inventions can be formed by appropriately combining the plurality of components disclosed in the above-described embodiment. For example, some components may be removed from all the components shown in the embodiments. Further, components from different embodiments may be combined as appropriate.
- 1 Terminal
- 2 Computer system
- 21 Input interface
- 22 Communication module
- 23 Storage
- 24 Memory
- 25 Processor
- 250 Output unit
- 251 Acquisition unit
- 252 Division unit
- 253 Feature prediction unit
- 254 Sorting unit
- 255 Gene mutation prediction unit
- 256 Prediction result output unit
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Medical Informatics (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Public Health (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Databases & Information Systems (AREA)
- Epidemiology (AREA)
- Multimedia (AREA)
- Evolutionary Biology (AREA)
- Artificial Intelligence (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Biomedical Technology (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Molecular Biology (AREA)
- Bioethics (AREA)
- Quality & Reliability (AREA)
- Primary Health Care (AREA)
- Computing Systems (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Genetics & Genomics (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Pathology (AREA)
- Apparatus For Radiation Diagnosis (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
Description
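According to another aspect of the present invention, when a histopathological image of the colorectal cancer of a colorectal cancer patient is input, a prediction result of the presence or absence of a gene mutation can be obtained immediately. The clinician can then refer to this prediction result and prescribe to the patient a drug corresponding to the predicted presence or absence of the gene mutation, so that delays in administering a drug suited to the genetic abnormality of the colorectal cancer can be suppressed.
The input interface 21 receives input from the administrator of the computer system 2 and outputs an input signal corresponding to the received input to the processor 25.
The communication module 22 is connected to the communication network NW and communicates with the terminals 1-1 to 1-N. This communication may be wired or wireless; it is described here as wired.
The memory 24 temporarily holds data and programs. The memory 24 is a volatile memory, for example a RAM (Random Access Memory).
The processor 25 loads a program from the storage 23 into the memory 24 and executes the series of instructions contained in the program, thereby functioning as the acquisition unit 251 and the output unit 250. The output unit 250 includes, for example, the division unit 252, the feature prediction unit 253, the sorting unit 254, the gene mutation prediction unit 255, and the prediction result output unit 256. The processing of each is described later.
(Step S1) The division unit 252 divides a histopathological image (for example, a slide image) of the pathological tissue (here, as an example, colorectal cancer tissue) of a patient with the target disease (here, as an example, colorectal cancer) into one or more region images (for example, tile-shaped tile images). Here, the gene mutation of the patient's diseased tissue (here, as an example, colorectal cancer tissue) is known. Dividing an image into tile-shaped tile images is also referred to as tile division.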
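Tile division can be sketched as follows. This is a minimal illustration assuming the image is a plain 2D list of pixel values; the function name `split_into_tiles`, the square tile shape, and the policy of dropping partial tiles at the edges are assumptions, since the text does not specify them.

```python
def split_into_tiles(image, tile_size):
    """Split a 2D image (list of rows) into non-overlapping square tiles.

    Edge regions smaller than tile_size are dropped; that policy is an
    assumption, as the source does not say how partial tiles are handled.
    Each result is ((top, left), tile_rows).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    tiles = []
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            tile = [row[left:left + tile_size]
                    for row in image[top:top + tile_size]]
            tiles.append(((top, left), tile))
    return tiles


image = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = split_into_tiles(image, 2)
print(len(tiles))   # 4
print(tiles[0])     # ((0, 0), [[0, 1], [4, 5]])
```

Keeping the `(top, left)` origin with each tile lets later steps map a per-tile prediction back to its position on the slide.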
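For example, after 80% of the training data is divided into five subsets, the model is first trained using the subsets s1, s2, s3, and s4 as the training subsets. The model is then validated using s5 as the validation subset. The evaluation metric obtained at this point (for example, accuracy or F1 score) is denoted e1.
Next, training is performed on s2, s3, s4, and s5, and evaluation is performed on s1. In the same way, training and evaluation are repeated while rotating the subsets. Repeating training and evaluation for all combinations yields five evaluation metrics. The model with the best performance among these five evaluation metrics is then adopted as the feature prediction model.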
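The five-fold procedure above can be sketched as follows. This is a hedged illustration: `train` and `evaluate` are hypothetical callables supplied by the caller, `samples` stands for the 80% portion of the training data, and the interleaved way of forming the subsets is an assumption (the text does not say how s1 to s5 are drawn). The order in which folds are held out differs from the text but covers the same five combinations.

```python
def five_fold_select(samples, train, evaluate, n_folds=5):
    """Train one model per fold and return the best (score, model) pair.

    `train(train_samples)` returns a model; `evaluate(model, val_samples)`
    returns a metric where higher is better (for example accuracy or F1).
    Both are hypothetical stand-ins.
    """
    folds = [samples[i::n_folds] for i in range(n_folds)]  # s1 ... s5
    results = []
    for held_out in range(n_folds):
        train_samples = [
            s for i, fold in enumerate(folds) if i != held_out for s in fold
        ]
        model = train(train_samples)
        score = evaluate(model, folds[held_out])  # e1 ... e5
        results.append((score, model))
    # Keep the model whose validation metric was best.
    return max(results, key=lambda pair: pair[0])
```

Note that this keeps a single fold's model, as the text describes, rather than averaging the five metrics as is common when cross-validation is used only to estimate performance.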
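The predicted value of a histopathological feature is the degree to which each region image has that feature, determined from the prediction results of the feature prediction models FM-1, …, FM-L. The predicted value may be the prediction result itself (for example, a number from 0 to 1), or a value determined by comparing the prediction result with a threshold (for example, 0 or 1). Here, as an example, the predicted value of a histopathological feature is described as a number from 0 to 1.
For example, after 80% of the training data is divided into five subsets, the model is first trained using the subsets s1, s2, s3, and s4 as the training subsets. The model is then validated using s5 as the validation subset. The evaluation metric obtained at this point (for example, accuracy or F1 score) is denoted e1.
Next, training is performed on s2, s3, s4, and s5, and evaluation is performed on s1. In the same way, training and evaluation are repeated while rotating the subsets. Repeating training and evaluation for all combinations yields five evaluation metrics. The model with the best performance among these five evaluation metrics is then adopted as the gene mutation prediction model.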
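FIG. 8 shows an example of testing the gene mutation prediction models for colorectal cancer. By testing as shown in FIG. 8, the "combination of the presence or absence of histopathological features at the time of selection" is obtained. Specifically, the feature prediction models and the gene mutation prediction models are trained on 80% of the Period 2 data. The trained feature prediction models and the trained gene mutation prediction models are then applied to the Period 1 data, the Period 2.5 data, and the TCGA data. Here, the TCGA data are limited, among colorectal cancers, to colon adenocarcinoma (COAD) and rectum adenocarcinoma (READ).
As one example, for each type of target gene mutation (or gene abnormality), combinations of a first feature (site of the primary lesion) and a second feature (histopathological feature) that satisfy all of the following conditions (1) to (3) were selected from all the experimental data.
(1) In the Period 2 test set, the AUC (Area Under the Curve) is 0.8 or more with either case assembly method 1 or case assembly method 2 (that is, accurate prediction is possible in the cohort used for training). Here, the AUC is the area (integral) under the ROC curve, whose first axis is the false positive rate and whose second axis is the true positive rate. This area takes a value between 0 and 1.
(2) In any test set other than Period 2, the AUC is 0.8 or more with either case assembly method 1 or case assembly method 2 (that is, accurate prediction is possible even in a cohort not used for training).
(3) Among combinations with the same image type and feature combination, the required number of tiles is the smallest.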
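Conditions (1) to (3) can be sketched as follows. This is an illustration only: the AUC is computed with the pairwise (rank-based) formulation, which is equal to the area under the ROC curve, and the candidate record layout with the keys `group`, `auc_period2`, `auc_external`, and `required_tiles` is hypothetical. Each AUC field is assumed to already hold the better of the two case assembly methods.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equivalent to integrating the true positive rate over the false
    positive rate; a tie between a positive and a negative counts as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def pick_combinations(candidates, min_auc=0.8):
    """Apply conditions (1) to (3) to candidate records (hypothetical layout)."""
    best = {}
    for c in candidates:
        # Conditions (1) and (2): AUC threshold on both kinds of test set.
        if c["auc_period2"] < min_auc or c["auc_external"] < min_auc:
            continue
        # Condition (3): fewest required tiles within the same group.
        current = best.get(c["group"])
        if current is None or c["required_tiles"] < current["required_tiles"]:
            best[c["group"]] = c
    return list(best.values())
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for patient-level evaluation; a sorted-rank version would be preferable for very large test sets.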
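(1) Case assembly method 1
A method in which, when the mean of the gene mutation prediction model's predicted values for the region images in the selected region image group (for example, the region image group in FIG. 19) is equal to or greater than a threshold th, the case (the target patient) is predicted to have the gene mutation, and otherwise is predicted not to have it.
(2) Case assembly method 2
A method in which, when the ratio of the number of tiles whose predicted value from the gene mutation prediction model is equal to or greater than a first threshold th1 to the total number of region images in the selected region image group (for example, the region image group in FIG. 19) is equal to or greater than a second threshold th2, the target patient is predicted to have the gene mutation, and otherwise is predicted not to have it.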
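The two case assembly methods can be written down directly. The sketch below follows the definitions above; the function names and the example scores and thresholds are illustrative assumptions, and both functions expect a non-empty list of tile scores.

```python
from statistics import mean


def assemble_by_mean(tile_scores, th):
    """Method 1: mutation predicted if the mean tile score is at least th."""
    return mean(tile_scores) >= th


def assemble_by_fraction(tile_scores, th1, th2):
    """Method 2: mutation predicted if the share of tiles scoring at least
    th1 is itself at least th2."""
    positive = sum(1 for s in tile_scores if s >= th1)
    return positive / len(tile_scores) >= th2


scores = [1.0, 0.75, 0.25, 0.0]
print(assemble_by_mean(scores, th=0.5))                # True: mean is 0.5
print(assemble_by_fraction(scores, th1=0.7, th2=0.6))  # False: 2 of 4 tiles
```

Method 1 is sensitive to a few very confident tiles, whereas method 2 depends only on how many tiles cross th1, so the two can disagree on the same case, as the example shows.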
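(Step S11) First, the acquisition unit 251 acquires a histopathological image of a patient for whom genetic analysis has not been performed and whose gene mutation status is unknown.
Here, the specific combination of the presence or absence of histopathological features set for, for example, a BRAF gene abnormality is, for example, one of the "combinations of the presence or absence of histopathological features at the time of selection" stored in the BRAF table T1 in FIG. 16. Likewise, the specific combination set for a BRAF V600E gene abnormality is, for example, one of those stored in the BRAF V600E table T2 in FIG. 16; the combination set for an ERBB2 gene abnormality is, for example, one of those stored in the ERBB2 table T3 in FIG. 16; the combination set for a TP53 gene abnormality may be, for example, one or more of those stored in the TP53 table T4 in FIG. 16; the combination set for an MSI gene abnormality is, for example, one of those stored in the MSI table T5 in FIG. 16; and the combination set for a RAS gene abnormality is, for example, one of those stored in the RAS table T6 in FIG. 16.
For example, in the case of colorectal cancer, in order to predict the presence or absence of a BRAF gene mutation, the sorting unit 254 may refer to the BRAF table T1 (see FIG. 16) stored in the storage 23 and, for one record, select region images that match the "site of the primary lesion" and the "combination of the presence or absence of histopathological features at the time of selection". For the first record (the record in the first row), for example, the sorting unit 254 may select region images for which the site of the primary lesion is "left side of the large intestine" and the combination of the presence or absence of histopathological features at the time of selection is "tumor cell ratio exceeds 50%". Because the site of the primary lesion of the target disease is also used to select region images, the probability that the region images input to the gene mutation prediction model are limited to region images related to the target disease increases, so the accuracy of predicting the presence or absence of a gene mutation can be improved.
The prediction result output unit 256 predicts, for example, the presence or absence of a gene mutation at the case level by case assembly method 1 or case assembly method 2 above when the number of tiles in the selected region image group (for example, the region image group in FIG. 19) is K or more. When the number of tiles is less than K, on the other hand, the prediction result output unit 256 does not perform case-level gene mutation prediction at all (as a result, the case is treated as having no gene mutation).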
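The tile-count gate can be sketched as a thin wrapper around whichever assembly method is used. The function name `predict_case` and the `assemble` callable are hypothetical; returning `False` for fewer than K tiles mirrors the statement that such cases are treated as having no gene mutation.

```python
def predict_case(tile_scores, k, assemble):
    """Return a case-level prediction only when at least k tiles were selected.

    `assemble` is any callable mapping a list of tile scores to True/False
    (hypothetical; for instance one of the two case assembly methods).
    With fewer than k tiles no prediction is attempted and the case is
    treated as having no mutation.
    """
    if len(tile_scores) < k:
        return False
    return assemble(tile_scores)
```

A caller that needs to distinguish "predicted negative" from "too few tiles to predict" would have to return a separate status instead; the text itself folds both into "no gene mutation".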
(Step S110) First, the terminal 1 accepts the patient's histopathological image and the site of the primary lesion of colorectal cancer.
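2 Computer system
21 Input interface
22 Communication module
23 Storage
24 Memory
25 Processor
250 Output unit
251 Acquisition unit
252 Division unit
253 Feature prediction unit
254 Sorting unit
255 Gene mutation prediction unit
256 Prediction result output unit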
Claims (16)
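- An information processing system comprising:
an acquisition unit that acquires a histopathological image of a patient with a target disease;
a division unit that divides the histopathological image of the patient into a plurality of region images;
a feature prediction unit that inputs a region image to each of a plurality of feature prediction models constructed for each type of histopathological feature, and acquires prediction information on the presence or absence of each histopathological feature;
a sorting unit that selects a plurality of region images for which the acquired combination of the presence or absence of histopathological features matches a preset combination of the presence or absence of histopathological features for selection;
a gene mutation prediction unit that inputs the region images chosen by the selection to each of gene mutation prediction models constructed for each combination of the presence or absence of histopathological features and each type of gene mutation, and acquires prediction information on the presence or absence of each gene mutation; and
a prediction result output unit that outputs, for the patient, a prediction result of the presence or absence of at least one gene mutation by using the acquired prediction information on the presence or absence of the gene mutation for each region image.
- The information processing system according to claim 1, wherein
the acquisition unit further acquires the site of the primary lesion of the target disease, and
the sorting unit selects the plurality of region images by using the acquired site of the primary lesion of the target disease together with the acquired combination of the presence or absence of histopathological features.
- The information processing system according to claim 1 or 2, wherein
the feature prediction model is a model machine-learned using training data in which a region image obtained by dividing a histopathological image is the input and the histopathological feature assigned to that region image is the output, and
the gene mutation prediction model is a model machine-learned using training data in which a region image selected using a combination of the presence or absence of histopathological features is the input and information on the presence or absence of a specific gene mutation is the output.
- An information processing system for predicting a gene in which a mutation has occurred in the colorectal cancer tissue of a colorectal cancer patient, the system comprising:
an acquisition unit that acquires a colorectal cancer histopathological image of the patient;
a division unit that divides the colorectal cancer histopathological image of the patient into a plurality of region images;
a feature prediction unit that inputs a region image to each of a plurality of feature prediction models constructed for each type of histopathological feature, and acquires prediction information on the presence or absence of each histopathological feature;
a sorting unit that selects a plurality of region images for which the acquired combination of the presence or absence of histopathological features matches a combination of the presence or absence of histopathological features for selection that is preset for at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI;
a gene mutation prediction unit that inputs the selected region images to each of gene mutation prediction models constructed for each combination of the presence or absence of histopathological features and each type of gene mutation, and acquires prediction information on the presence or absence of each gene mutation; and
a prediction result output unit that outputs, for the patient, a prediction result of the presence or absence of a gene mutation for at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI by using the acquired prediction information on the presence or absence of the gene mutation for each region image.
- The information processing system according to claim 4, wherein
the acquisition unit further acquires the site of the primary lesion of the colorectal cancer, and
the sorting unit selects the plurality of region images by using the acquired site of the primary lesion of the colorectal cancer together with the determined at least one histopathological feature.
- The information processing system according to claim 4 or 5, wherein
the feature prediction model is a model machine-learned using training data in which a region image obtained by dividing a histopathological image is the input and the histopathological feature assigned to that region image is the output, and
the gene mutation prediction model is a model machine-learned using training data in which a region image selected using a combination of the presence or absence of histopathological features is the input and information on the presence or absence of a specific gene mutation is the output.
- An information processing method comprising:
an acquisition step of acquiring a histopathological image of a patient with a target disease;
a division step of dividing the histopathological image of the patient into a plurality of region images;
a feature prediction step of inputting a region image to each of a plurality of feature prediction models constructed for each type of histopathological feature, and acquiring prediction information on the presence or absence of each histopathological feature;
a selection step of selecting a plurality of region images for which the acquired combination of the presence or absence of histopathological features matches a preset combination of the presence or absence of histopathological features for selection;
a gene mutation prediction step of inputting the region images chosen by the selection to each of gene mutation prediction models constructed for each combination of the presence or absence of histopathological features and each type of gene mutation, and acquiring prediction information on the presence or absence of each gene mutation; and
a prediction result output step of outputting, for the patient, a prediction result of the presence or absence of at least one gene mutation by using the acquired prediction information on the presence or absence of the gene mutation for each region image.
- An information processing method for predicting a gene in which a mutation has occurred in the colorectal cancer tissue of a colorectal cancer patient, the method comprising:
an acquisition step of acquiring a colorectal cancer histopathological image of the patient;
a division step of dividing the colorectal cancer histopathological image of the patient into a plurality of region images;
a feature prediction step of inputting a region image to each of a plurality of feature prediction models constructed for each type of histopathological feature, and acquiring prediction information on the presence or absence of each histopathological feature;
a selection step of selecting a plurality of region images for which the acquired combination of the presence or absence of histopathological features matches a combination of the presence or absence of histopathological features for selection that is preset for at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI;
a gene mutation prediction step of inputting the selected region images to each of gene mutation prediction models constructed for each combination of the presence or absence of histopathological features and each type of gene mutation, and acquiring prediction information on the presence or absence of each gene mutation; and
a prediction result output step of outputting, for the patient, a prediction result of the presence or absence of a gene mutation for at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI by using the acquired prediction information on the presence or absence of the gene mutation for each region image.
- A program for causing a computer to execute:
an acquisition step of acquiring a histopathological image of a patient with a target disease;
a division step of dividing the histopathological image of the patient into a plurality of region images;
a feature prediction step of inputting a region image to each of a plurality of feature prediction models constructed for each type of histopathological feature, and acquiring prediction information on the presence or absence of each histopathological feature;
a selection step of selecting a plurality of region images for which the acquired combination of the presence or absence of histopathological features matches a preset combination of the presence or absence of histopathological features for selection;
a gene mutation prediction step of inputting the region images chosen by the selection to each of gene mutation prediction models constructed for each combination of the presence or absence of histopathological features and each type of gene mutation, and acquiring prediction information on the presence or absence of each gene mutation; and
a prediction result output step of outputting, for the patient, a prediction result of the presence or absence of at least one gene mutation by using the acquired prediction information on the presence or absence of the gene mutation for each region image.
- A program for predicting a gene in which a mutation has occurred in the colorectal cancer tissue of a colorectal cancer patient, the program causing a computer to execute:
an acquisition step of acquiring a colorectal cancer histopathological image of the patient;
a division step of dividing the colorectal cancer histopathological image of the patient into a plurality of region images;
a feature prediction step of inputting a region image to each of a plurality of feature prediction models constructed for each type of histopathological feature, and acquiring prediction information on the presence or absence of each histopathological feature;
a selection step of selecting a plurality of region images for which the acquired combination of the presence or absence of histopathological features matches a combination of the presence or absence of histopathological features for selection that is preset for at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI;
a gene mutation prediction step of inputting the selected region images to each of gene mutation prediction models constructed for each combination of the presence or absence of histopathological features and each type of gene mutation, and acquiring prediction information on the presence or absence of each gene mutation; and
a prediction result output step of outputting, for the patient, a prediction result of the presence or absence of a gene mutation for at least one of BRAF, BRAF V600E, ERBB2, RAS, TP53, or MSI by using the acquired prediction information on the presence or absence of the gene mutation for each region image.
- An information processing system for estimating the presence or absence of a BRAF gene mutation in a tumor, comprising:
means for dividing a histopathological image of the tumor into one or a plurality of images;
means for selecting, from the divided images,
an image in which the tumor cell ratio exceeds 50%, and/or
an image containing a papillary structure, and/or
an image containing a non-serrated papillary structure, and/or
an image containing a non-serrated papillary structure in which mucus is present, and/or
an image containing a cribriform structure, and/or
an image containing a cribriform structure in which no mucus is present, and/or
an image containing a cribriform structure in which mucus is present; and
means for estimating the presence or absence of a BRAF gene mutation using the selected images.
- An information processing system for estimating the presence or absence of a BRAF V600E gene mutation in a tumor, comprising:
means for dividing a histopathological image of the tumor into one or a plurality of images;
means for selecting, from the divided images,
an image containing a railroad track pattern, and/or
an image containing small solid nests, and/or
an image in which small solid nests are composed of conventional-type cells, and/or
an image in which large solid nests are composed of conventional-type cells, and/or
an image containing elongated oval nuclei, and/or
an image in which mucus is present, and/or
an image containing a non-serrated papillary structure in which no mucus is present, and/or
an image containing a cribriform structure in which no mucus is present, and/or
an image containing a cribriform structure in which mucus is present; and
means for estimating the presence or absence of a BRAF V600E gene mutation using the selected images.
- An information processing system for estimating the presence or absence of an ERBB2 gene mutation in a tumor, comprising:
means for dividing a histopathological image of the tumor into one or a plurality of images;
means for selecting, from the divided images,
an image containing a cord-like structure in which mucus is present, and/or
an image containing a cord-like structure with mucus leakage; and
means for estimating the presence or absence of an ERBB2 gene mutation with the selected images as the analysis target.
- An information processing system for estimating the presence or absence of a TP53 gene mutation in a tumor, comprising:
means for dividing a histopathological image of the tumor into one or a plurality of images;
means for selecting, from the divided images,
an image containing signet-ring cells with mucus leakage, and/or
an image containing goblet cells, and/or
an image containing a cord-like structure in which no mucus is present, and/or
an image with mucus leakage; and
means for estimating the presence or absence of a TP53 gene mutation using the selected images.
- An information processing system for estimating the presence or absence of an MSI gene abnormality in a tumor, comprising:
means for dividing a histopathological image of the tumor into one or a plurality of images;
means for selecting, from the divided images,
an image containing a railroad track pattern, and/or
an image containing a cord-like structure in which no mucus is present, and/or
an image in which the tumor cell ratio exceeds 50%, and/or
an image containing solid nests, and/or
an image in which small solid nests are composed of conventional-type cells, and/or
an image containing elongated oval nuclei, and/or
an image containing a papillary structure, and/or
an image containing goblet cells, and/or
an image containing a non-serrated papillary structure, and/or
an image containing round-to-oval nuclei, and/or
an image containing a cribriform structure in which no mucus is present, and/or
an image containing a cribriform structure in which mucus is present, and/or
an image containing a cribriform structure with mucus leakage; and
means for estimating the presence or absence of an MSI gene abnormality using the selected images.
- An information processing system for estimating the presence or absence of a RAS gene mutation in a tumor, comprising:
means for dividing a histopathological image of the tumor into one or a plurality of images;
means for selecting, from the divided images,
an image containing signet-ring cells with mucus leakage, and/or
an image containing a tubular structure in which no mucus is present, and/or
an image containing a cord-like structure in which no mucus is present, and/or
an image containing small solid nests, and/or
an image containing large solid nests, and/or
an image in which large solid nests are composed of conventional-type cells, and/or
an image containing a papillary structure, and/or
an image in which no mucus is present, and/or
an image in which mucus is present, and/or
an image containing goblet cells, and/or
an image containing a non-serrated papillary structure, and/or
an image containing a non-serrated papillary structure in which no mucus is present, and/or
an image containing a non-serrated papillary structure in which mucus is present, and/or
an image containing a cribriform structure, and/or
an image containing a cribriform structure in which no mucus is present, and/or
an image containing a cribriform structure in which mucus is present, and/or
an image containing a cribriform structure with mucus leakage, and/or
an image in which tumor budding is present, and/or
an image in which the tumor cell ratio exceeds 50%, and/or
an image containing small solid nests, and/or
an image in which small solid nests are composed of conventional-type cells, and/or
an image containing large solid nests, and/or
an image containing elongated oval nuclei, and/or
an image in which mucus is present, and/or
an image with mucus leakage, and/or
an image containing a non-serrated papillary structure in which no mucus is present, and/or
an image containing a cribriform structure, and/or
an image containing a cribriform structure in which no mucus is present, and/or
an image containing a cribriform structure in which mucus is present, and/or
an image containing a tubular structure in which mucus is present, and/or
an image containing a tubular structure with mucus leakage, and/or
an image containing a railroad track pattern, and/or
an image containing high-grade cellular atypia, and/or
an image containing a cord-like structure in which mucus is present, and/or
an image containing a cord-like structure with mucus leakage, and/or
an image containing solid nests, and/or
an image containing a non-serrated papillary structure, and/or
an image containing a non-serrated papillary structure in which mucus is present, and/or
an image containing round-to-oval nuclei; and
means for estimating the presence or absence of a RAS gene mutation using the selected images.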
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202080106293.5A CN116710954A (zh) | 2020-10-14 | 2020-10-14 | 信息处理系统、信息处理方法以及程序 |
JP2022556767A JPWO2022079847A1 (ja) | 2020-10-14 | 2020-10-14 | |
EP20957676.8A EP4231228A1 (en) | 2020-10-14 | 2020-10-14 | Information processing system, information processing method, and program |
US18/248,706 US20230274428A1 (en) | 2020-10-14 | 2020-10-14 | Information processing system, information processing method, and program |
PCT/JP2020/038843 WO2022079847A1 (ja) | 2020-10-14 | 2020-10-14 | 情報処理システム、情報処理方法及びプログラム |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2020/038843 WO2022079847A1 (ja) | 2020-10-14 | 2020-10-14 | 情報処理システム、情報処理方法及びプログラム |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022079847A1 (ja) | 2022-04-21 |
Family
ID=81208980
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2020/038843 WO2022079847A1 (ja) | 2020-10-14 | 2020-10-14 | 情報処理システム、情報処理方法及びプログラム |
Country Status (5)
Country | Link |
---|---|
US (1) | US20230274428A1 (ja) |
EP (1) | EP4231228A1 (ja) |
JP (1) | JPWO2022079847A1 (ja) |
CN (1) | CN116710954A (ja) |
WO (1) | WO2022079847A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2023176256A (ja) * | 2022-05-31 | 2023-12-13 | 楽天グループ株式会社 | 画像からデータを予測する方法、コンピュータシステム、及びコンピュータ可読媒体 |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11989628B2 (en) * | 2021-03-05 | 2024-05-21 | International Business Machines Corporation | Machine teaching complex concepts assisted by computer vision and knowledge reasoning |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017010397A1 (ja) * | 2015-07-15 | 2017-01-19 | 国立大学法人大阪大学 | 画像解析装置、画像解析方法、画像解析システム、画像解析プログラム、および記録媒体 |
US20180232883A1 (en) * | 2017-02-13 | 2018-08-16 | Amit Sethi | Systems & Methods for Computational Pathology using Points-of-interest |
WO2019159821A1 (ja) * | 2018-02-15 | 2019-08-22 | 国立大学法人新潟大学 | 高頻度変異型癌の判別システム、プログラム及び方法 |
-
2020
- 2020-10-14 WO PCT/JP2020/038843 patent/WO2022079847A1/ja active Application Filing
- 2020-10-14 CN CN202080106293.5A patent/CN116710954A/zh active Pending
- 2020-10-14 JP JP2022556767A patent/JPWO2022079847A1/ja active Pending
- 2020-10-14 US US18/248,706 patent/US20230274428A1/en active Pending
- 2020-10-14 EP EP20957676.8A patent/EP4231228A1/en active Pending
Non-Patent Citations (2)
Title |
---|
COUDRAY NICOLAS; OCAMPO PAOLO SANTIAGO; SAKELLAROPOULOS THEODORE; NARULA NAVNEET; SNUDERL MATIJA; FENYö DAVID; MOREIRA ANDRE : "Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning", NATURE MEDICINE, NATURE PUBLISHING GROUP US, NEW YORK, vol. 24, no. 10, 17 September 2018 (2018-09-17), New York, pages 1559 - 1567, XP036608997, ISSN: 1078-8956, DOI: 10.1038/s41591-018-0177-5 * |
FU YU, JUNG ALEXANDER W., TORNE RAMON VIÑAS, GONZALEZ SANTIAGO, VÖHRINGER HARALD, SHMATKO ARTEM, YATES LUCY R., JIMENEZ-LINAN MERC: "Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis", NATURE CANCER, vol. 1, no. 8, 1 August 2020 (2020-08-01), pages 800 - 810, XP055932862, DOI: 10.1038/s43018-020-0085-8 * |
Also Published As
Publication number | Publication date |
---|---|
US20230274428A1 (en) | 2023-08-31 |
JPWO2022079847A1 (ja) | 2022-04-21 |
CN116710954A (zh) | 2023-09-05 |
EP4231228A1 (en) | 2023-08-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20957676; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022556767; Country of ref document: JP; Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 202080106293.5; Country of ref document: CN |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2020957676; Country of ref document: EP; Effective date: 20230515 |