WO2021062904A1 - Pathological-image-based TMB classification method and system, and TMB analysis device - Google Patents

Pathological-image-based TMB classification method and system, and TMB analysis device

Info

Publication number
WO2021062904A1
WO2021062904A1 · PCT/CN2019/113582 · CN2019113582W
Authority
WO
WIPO (PCT)
Prior art keywords
tmb
classification
target
image
pathological
Prior art date
Application number
PCT/CN2019/113582
Other languages
English (en)
French (fr)
Inventor
任菲
刘志勇
刘玉东
Original Assignee
中国科学院计算技术研究所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院计算技术研究所 (Institute of Computing Technology, Chinese Academy of Sciences)
Priority to US17/596,127 (granted as US11468565B2)
Publication of WO2021062904A1

Classifications

    • G06T7/0012 Biomedical image inspection
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, involving training the classification device
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06T7/136 Segmentation; edge detection involving thresholding
    • G06V10/267 Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V10/454 Integrating filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V10/764 Image or video recognition or understanding using classification, e.g. of video objects
    • G06V10/774 Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/82 Image or video recognition or understanding using neural networks
    • G06V20/695 Microscopic objects, e.g. biological cells: preprocessing, e.g. image segmentation
    • G06V20/698 Microscopic objects, e.g. biological cells: matching; classification
    • A61B2576/02 Medical imaging apparatus involving image processing or analysis specially adapted for a particular organ or body part
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G06T2207/20081 Training; learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30024 Cell structures in vitro; tissue sections in vitro
    • G06T2207/30056 Liver; hepatic
    • G06T2207/30096 Tumor; lesion
    • G06V2201/03 Recognition of patterns in medical or anatomical images

Definitions

  • the present invention relates to the technical field of image processing, in particular to a TMB classification method, system and TMB analysis device based on pathological images.
  • TMB Tumor Mutation Burden
  • TMB is a measure of the total number of somatic mutations in a tumor; it generally refers to the number of non-synonymous mutations per megabase of exonic coding region.
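As an illustration (not part of the patent text), the definition above reduces to a simple ratio; the mutation count and exome size below are made-up example numbers.

```python
def tmb(n_nonsynonymous: int, coding_region_mb: float) -> float:
    """TMB as non-synonymous mutations per megabase of exonic
    coding region (the definition given above)."""
    return n_nonsynonymous / coding_region_mb

# e.g. 140 non-synonymous mutations over a hypothetical 38 Mb exome
value = tmb(140, 38.0)
```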
  • TMB is an important biomarker in the era of tumor immunotherapy. Its power to predict immunotherapy response is not limited to "hot" (immunogenic) tumors such as NSCLC and melanoma; as a pan-cancer biomarker, it has predictive power in a variety of tumors, including liver cancer.
  • TMB detection is an important means of evaluating tumor immunogenicity, and the gold-standard detection method is whole-exome sequencing.
  • The defects of existing panel sequencing arise mainly from technology-path dependence. Because the gold-standard method for TMB is whole-exome sequencing, panel methods approximate TMB by shrinking the detection region, in a manner similar to a sampling survey. Since somatic mutations are unevenly distributed across tumor genes, this introduces large errors and reduces accuracy. Panel detection also relies on second-generation (next-generation) sequencing, the existing exon-sequencing technology, and so inherits that platform's drawbacks of high cost, long turnaround time, and dependence on tissue samples. Developing a TMB classification method that is accurate, inexpensive, fast, and requires no sample other than a pathological image is therefore of great value to tumor research.
  • To this end, the present invention proposes a TMB classification method, which includes: performing TMB classification marking and preprocessing on known pathological images to construct a training set; training a convolutional neural network on the training set to construct a classification model; preprocessing the target pathological image of a target case to obtain multiple target tiles; classifying each target tile with the classification model to obtain tile-level TMB classification results for the target case; and combining the TMB classification results of all tiles by classification voting to obtain the image-level TMB classification result of the target case.
  • The step of preprocessing the target pathological image specifically includes: marking the target tumor cell area of the target pathological image; cutting target partial images out of the target pathological image according to the target tumor cell area; performing sliding-window segmentation on the target partial images; and inverting the color of the intermediate tiles obtained by segmentation to obtain multiple target tiles.
  • The step of constructing the training set specifically includes: dividing the known pathological images into multiple TMB classes according to at least one classification threshold; marking all known tumor cell areas of the known pathological images; cutting known partial images out of the known pathological images according to the known tumor cell areas; performing sliding-window segmentation on the known partial images and inverting the color of the intermediate tiles obtained by segmentation to obtain multiple training tiles; and randomly dividing all the training tiles to construct a training subset and a test subset of the training set.
  • The convolutional neural network includes four pairs of convolutional and max-pooling layers, a first fully connected layer, and a second fully connected layer. All the convolutional layers and the first fully connected layer use the ReLU activation function, and the second fully connected layer uses the Sigmoid activation function. By varying the fine-grained convolution kernel of each convolutional layer, multiple preselected receptive fields are obtained and multiple corresponding preselected classification models are constructed; the accuracy and AUC value of each preselected model are measured, the preselected model with the maximum accuracy and maximum AUC value is taken as the classification model, and the preselected receptive field corresponding to that model is taken as the optimal receptive field.
  • The pathological-image-based TMB classification system of the present invention includes: a training set building module, for TMB classification marking and preprocessing of known pathological images to build a training set; a classification model building module, for training a convolutional neural network on the training set to build a classification model; a target image preprocessing module, for preprocessing the target pathological image of a target case to obtain multiple target tiles; a tile classification module, for classifying the target tiles with the classification model to obtain tile-level TMB classification results for the target case; and an image classification module, for combining the TMB classification results of all tiles by classification voting to obtain the image-level TMB classification result of the target case.
  • The target image preprocessing module specifically: marks the target tumor cell area of the target pathological image; cuts target partial images out of the target pathological image according to the target tumor cell area; performs sliding-window segmentation on the target partial images; and inverts the color of the intermediate tiles obtained by segmentation to obtain multiple target tiles.
  • The training set construction module includes: a TMB marking module, for dividing the known pathological images into multiple TMB classes according to at least one classification threshold; a partial-region cut-out module, for marking all known tumor cell areas of the known pathological images and cutting known partial images out of the known pathological images according to the known tumor cell areas; a training tile segmentation module, for performing sliding-window segmentation on the known partial images and inverting the color of the intermediate tiles obtained by segmentation to obtain multiple training tiles; and a training set division module, for randomly dividing all the training tiles to construct a training subset and a test subset of the training set.
  • The convolutional neural network includes four pairs of convolutional and max-pooling layers, a first fully connected layer, and a second fully connected layer, wherein all the convolutional layers and the first fully connected layer use the ReLU activation function and the second fully connected layer uses the Sigmoid activation function. By varying the fine-grained convolution kernel of each convolutional layer, multiple preselected receptive fields are obtained and multiple corresponding preselected classification models are constructed, whose accuracy and AUC values are then measured.
  • the present invention also provides a readable storage medium that stores executable instructions, and the executable instructions are used to execute a TMB classification method based on pathological images.
  • the present invention also relates to a TMB analysis device based on pathological images, including a processor and a readable storage medium.
  • The processor retrieves the executable instructions in the readable storage medium to analyze the target pathological image and obtain the TMB classification result of the target pathological image.
  • Fig. 1 is a flow chart of the pathological image classification method of the present invention.
  • Fig. 2 is a schematic diagram of a classification model construction process according to an embodiment of the present invention.
  • Fig. 3 is a scatter diagram of a known pathological image TMB according to an embodiment of the present invention.
  • Fig. 4 is a schematic diagram of block division according to an embodiment of the present invention.
  • Fig. 5 is a schematic diagram of a neural convolutional network structure according to an embodiment of the present invention.
  • Fig. 6 is a schematic diagram of the structure of the TMB analysis device based on pathological images of the present invention.
  • Figs. 7A, 7B, and 7C are schematic diagrams of specific embodiments of the pathological-image-based TMB analysis device of the present invention.
  • Fig. 8 is a schematic diagram comparing the classification accuracy and AUC value of the classification model with those of panel sequencing, according to an embodiment of the present invention.
  • Fig. 9A is a schematic diagram of survival analysis based on the MSKCC IMPACT468 panel.
  • Fig. 9B is a schematic diagram of survival analysis based on the FM1 panel.
  • Fig. 9C is a schematic diagram of survival analysis based on CNN model prediction of the present invention.
  • Fig. 10 is a schematic diagram of the receptive field of a classification model according to an embodiment of the present invention.
  • Based on a deep understanding of tumor biology, genomic research technology, medical image processing, and other cutting-edge fields, the inventor innovatively proposes a new technical path for detecting tumor mutation burden from pathological images, completely breaking the technology-path dependence of current TMB detection methods.
  • The present invention assumes that pathological-image characteristics such as the spatial structure of tumor cells and immune cells and the morphology of tumor cells and their microenvironment-related cells are consistent with the inherent genomic characteristics of the tumor cells.
  • TMB, the most critical surrogate marker for the neoantigens that mediate the interaction between tumor cells and immune cells, reflects the immunogenicity of tumor cells, that is, the degree of "danger" that tumor cells present to the immune system.
  • Deep learning is end-to-end learning that can automatically perform feature extraction.
  • EGFR epidermal growth factor receptor
  • MSI microsatellite instability
  • CNN convolutional neural networks
  • Various model training strategies were attempted. After trying popular models such as AlexNet, VGG, and ResNet, the inventor found severe overfitting. On analysis, these models were designed to extract features of natural images rather than pathological images.
  • The feature scale (receptive field) of these models is very large, and each feature in the final feature map contains extensive, even global, information.
  • Predicting high or low TMB from pathological images differs greatly from natural image classification, because pathological image classification depends on minute details far more than natural image classification (such as distinguishing cats from dogs) does.
  • The inventor therefore narrowed the receptive field and simplified the model, using a collection of local features as the evidence for classification, to suit the pathological image classification problem and alleviate overfitting.
  • The purpose of the present invention is to overcome the defects of low accuracy, high cost, long turnaround time, and dependence on tissue samples in existing panel-based TMB detection, and to propose an analysis method that performs TMB classification directly on pathological images.
  • the invented analysis method has an accuracy of 99.7% for TMB classification of pathological images.
  • Fig. 1 is a flow chart of the pathological image classification method of the present invention. As shown in Figure 1, the pathological image classification method of the present invention includes:
  • Step S100: train the CNN on known pathological images to build a classification model, specifically including:
  • Step S110 selecting a known pathological image
  • The classification model of the present invention is an analysis tool for pathological images of a particular tumor type, so the training data must also be pathological images of known cases of that tumor type.
  • For example, for lung cancer cases, known lung cancer pathological images are used as the training data of the classification model; for gastric cancer cases, known gastric cancer pathological images are used.
  • Fig. 2 is a schematic diagram of the classification model construction process according to an embodiment of the present invention. As shown in Fig. 2, the embodiment constructs a classification model for pathological images of liver cancer cases; the inventor therefore selected data from the liver cancer project of The Cancer Genome Atlas (TCGA) to construct the training set.
  • TCGA Cancer Genome Atlas
  • NCI National Cancer Institute
  • NHGRI National Human Genome Research Institute
  • TCGA uses genome analysis technology based on large-scale sequencing to understand the molecular mechanism of cancer through extensive cooperation.
  • the purpose of TCGA is to improve the scientific understanding of the molecular basis of cancer, improve the ability to diagnose, treat and prevent cancer, and finally complete the database of all cancer genome changes.
  • The UCSC Xena browser was used to retrieve somatic mutations (single-nucleotide mutations and small indels) from the GDC-TCGA liver cancer (LIHC) cohort, using the mutation calls produced by the MuSE analysis method for 362 samples. Only mutations carrying a PASS filter tag and located in exonic regions as non-synonymous mutations, or located in splice regions, were used to construct the training set.
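As an illustration (not part of the patent text), the filtering rule just described can be sketched as a predicate over mutation records; the field names `filter`, `region`, and `effect` are hypothetical and do not follow the actual GDC schema.

```python
def keep_mutation(m: dict) -> bool:
    """Apply the filtering rule described above: keep somatic mutations
    whose filter tag is PASS and which are exonic non-synonymous
    mutations or lie in a splice region."""
    return m["filter"] == "PASS" and (
        (m["region"] == "exon" and m["effect"] == "non_synonymous")
        or m["region"] == "splice"
    )

# toy records: only the first satisfies the rule
muts = [
    {"filter": "PASS", "region": "exon", "effect": "non_synonymous"},
    {"filter": "PASS", "region": "exon", "effect": "synonymous"},
    {"filter": "fail", "region": "splice", "effect": "splice_variant"},
]
kept = [m for m in muts if keep_mutation(m)]
```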
  • Step S120 classify and mark the known pathological images
  • The known pathological images can be divided into two, three, or, depending on research needs, more than three classes according to TMB; the present invention is not limited in this respect.
  • Below, dividing the known pathological images into two classes (high TMB and low TMB) according to TMB is taken as the example; the method for three or more classes is the same, differing only in the number of TMB thresholds, and is not repeated here.
  • Segmented regression, or "broken-stick" analysis, is used to find an inflection point that serves as the threshold dividing the known pathological images into two TMB classes. Specifically: the TMB scores of the 362 cases are sorted from largest to smallest and drawn as a scatter plot; piecewise regression fits the scatter with two straight lines; the inflection point of the fitted curve is found; the TMB value at this inflection point is used as the classification threshold distinguishing high TMB from low TMB; and the 362 cases are classified by TMB using this threshold.
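As an illustration (not part of the patent text), the broken-stick threshold search just described can be sketched as a brute-force search over breakpoints that minimizes the two-line least-squares error; the TMB values below are synthetic, not the TCGA-LIHC data.

```python
import numpy as np

def broken_stick_threshold(tmb_values):
    """Sort TMB descending, fit two straight lines to the resulting
    curve at every candidate breakpoint, and return the TMB value at
    the breakpoint with the smallest total squared error (used here
    as the high/low classification threshold)."""
    y = np.sort(np.asarray(tmb_values, dtype=float))[::-1]
    x = np.arange(len(y), dtype=float)
    best_err, best_idx = np.inf, None
    for k in range(2, len(y) - 2):          # candidate breakpoints
        err = 0.0
        for xs, ys in ((x[:k], y[:k]), (x[k:], y[k:])):
            coef = np.polyfit(xs, ys, 1)    # least-squares line
            err += np.sum((np.polyval(coef, xs) - ys) ** 2)
        if err < best_err:
            best_err, best_idx = err, k
    return y[best_idx]

# synthetic example: a few high-TMB outliers on a low-TMB background
rng = np.random.default_rng(0)
tmb = np.concatenate([rng.uniform(10, 40, 8), rng.uniform(0.5, 3.0, 100)])
thr = broken_stick_threshold(tmb)
```

Cases with TMB above `thr` would then be marked high-TMB, the rest low-TMB.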
  • Fig. 3 is a scatter diagram of a known pathological image TMB according to an embodiment of the present invention.
  • The TMB values of all known pathological images are subjected to segmented regression, and the TMB at the inflection point is approximately 3.66.
  • A horizontal dashed line through the inflection point separates 32 points lying above the line.
  • Table 1 lists the TMB of the 32 cases marked as high-TMB cases, indicating that the TMB of these 32 cases is at a high level; their corresponding pathological images are treated as positive images.
  • The TMB of the remaining 330 cases is at a low level (identified as low-TMB cases), and their corresponding pathological images are treated as negative images.
  • Step S130 preprocessing of the known pathological image
  • Step S131 marking the tumor cell area
  • The images included in the data set have a maximum scanning resolution of at least 20× (objective magnification), and a 20× field of view is the standard at which doctors judge whether a tumor is benign or malignant under the microscope. The present invention therefore cuts out of each 20× image several partial images of 1712×961 pixels that contain only the cancerous area (tumor cell area).
  • Twelve slice images from 12 cases were excluded due to poor image quality.
  • In total, 470 partial images were cut out for high-TMB cases (positive images) and 5,162 partial images for low-TMB cases (negative images).
  • Step S132 segment all the partial images to obtain training tiles
  • The present invention divides each partial image into multiple tiles, flexibly adjusting the step size to reduce the resolution and balance the classes.
  • the segmentation of the blocks can adopt multiple methods, such as threshold segmentation, region segmentation, etc.
  • the sliding window method is used to perform the training block segmentation.
  • Fig. 4 is a schematic diagram of tile division according to an embodiment of the present invention. As shown in Fig. 4, the sliding window used in the embodiment is 256×256 pixels. Each partial image of a negative example is cut into 28 tiles, arranged as 4 rows by 7 columns.
  • Each partial image of a positive example is cut into 300 tiles, arranged as 12 rows by 25 columns.
  • All partial images of the positive images thus yield 141,000 tiles of 256×256 pixels, and the partial images of the negative images yield 144,536 tiles of the same resolution. The tile counts of the two classes are approximately equal, so the class imbalance problem is solved while the data is augmented.
  • Step S133 invert the color of the training image block
  • In this step, the present invention inverts the color of all the tiles.
  • In step S134, the training set and the test set are randomly divided according to a ratio of 4:1.
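As an illustration (not part of the patent text), steps S132-S134 can be sketched under the sizes given above (1712×961-pixel partial images, a 256×256 sliding window, a 12-row by 25-column grid for positive images). The uniform gray image and the step-size formula are assumptions for the sketch; the patent only says the step size is adjusted flexibly.

```python
import numpy as np

def tile_and_invert(image, win=256, rows=12, cols=25):
    """Cut a partial image into rows*cols tiles with a win x win
    sliding window (step size chosen so the grid spans the image),
    then invert each tile's color (255 - pixel)."""
    h, w = image.shape[:2]
    sy = (h - win) // (rows - 1) if rows > 1 else 0
    sx = (w - win) // (cols - 1) if cols > 1 else 0
    tiles = []
    for r in range(rows):
        for c in range(cols):
            y, x = r * sy, c * sx
            tiles.append(255 - image[y:y + win, x:x + win])
    return tiles

# a synthetic 961 x 1712 grayscale "partial image"
img = np.full((961, 1712), 200, dtype=np.uint8)
tiles = tile_and_invert(img)                       # 12 * 25 = 300 tiles
# random division would shuffle first; the 4:1 split itself is:
train = tiles[: len(tiles) * 4 // 5]
test = tiles[len(tiles) * 4 // 5:]
```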
  • Step S140 train the convolutional neural network with the training set and the test set;
  • CNN Convolutional neural network
  • its derivative models have a wide range of applications in the field of image classification.
  • CNN is a feedforward neural network whose history can be traced back to 1962, when biologists Hubel and Wiesel found that cells in the cat visual cortex are sensitive to particular visual inputs and proposed the concept of the receptive field. In 1980, Kunihiko Fukushima proposed the neocognitron based on Hubel and Wiesel's local receptive field theory, the earliest implementation of a CNN-style network model. The receptive field is a basic concept of convolutional neural networks:
  • each neuron in a convolutional layer connects, through the convolution kernel, only to the neurons of the preceding layer that lie within its receptive field.
  • Convolutional neural networks absorb the idea of local receptive fields; their advantages are weight sharing and local connectivity. While maintaining training quality, a CNN can effectively control the parameter count and the amount of computation.
  • The present invention reduces the range of the receptive field and simplifies the model, using the set of local features as the evidence for classification, to suit the pathological image classification problem while alleviating overfitting.
  • Fig. 5 is a schematic diagram of the convolutional neural network structure according to an embodiment of the present invention. As shown in Fig. 5, after testing different hyperparameters the present invention finally uses four pairs of convolutional layers 2-1, 2-2, 2-3, 2-4 and max-pooling layers 3-1, 3-2, 3-3, 3-4, followed in sequence by a fully connected layer 4-1 containing 256 neurons and a fully connected layer 4-2 containing a single neuron. Convolutional layers 2-1 through 2-4 and fully connected layer 4-1 all use the ReLU activation function, and fully connected layer 4-2 uses Sigmoid. After a target tile 1 is processed and analyzed, the output of fully connected layer 4-2 serves as the classification criterion.
  • This embodiment is a preferred embodiment in which the CNN is trained with known pathological images of confirmed liver cancer patients obtained from the TCGA-LIHC project.
  • Other CNN structures may be used to obtain a better classification effect, such as 3 pairs of convolutional and max-pooling layers with a fully connected layer, or 5 pairs of convolutional and max-pooling layers with a fully connected layer; the present invention does not limit this.
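As an illustration (not part of the patent text), the shape arithmetic of the architecture above can be traced without any deep-learning framework. The 3×3 kernel, 'valid' padding, stride 1, and 32 filters are assumptions for the sketch (the patent tunes the kernel size per layer and does not state these values); the two fully connected layers are run with random weights only to show the ReLU/Sigmoid data flow.

```python
import numpy as np

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + np.exp(-z))

def spatial_size(n: int, conv_k: int = 3, pool: int = 2, pairs: int = 4) -> int:
    """Spatial size of an n x n tile after `pairs` of (k x k 'valid'
    convolution, stride 1) followed by 2 x 2 max pooling."""
    for _ in range(pairs):
        n = (n - conv_k + 1) // pool
    return n

side = spatial_size(256)            # 256 -> 127 -> 62 -> 30 -> 14
n_filters = 32                      # assumed filter count
flat = side * side * n_filters      # input width of the 256-unit FC layer

rng = np.random.default_rng(0)
x = rng.standard_normal(flat)                                      # flattened features
h = np.maximum(rng.standard_normal((256, flat)) @ x * 0.01, 0.0)   # FC 4-1, ReLU
p = sigmoid(float(rng.standard_normal(256) @ h * 0.01))            # FC 4-2, Sigmoid
```

`p` plays the role of the single-neuron output used as the classification criterion.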
  • Step S150 Determine the receptive field size of the classification model
  • The present invention mainly controls the receptive field by changing the size of the convolution kernel.
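As an illustration (not part of the patent text), the effect of kernel size on the receptive field can be computed with the standard recurrence for stacked layers; the 3×3 kernels below are an assumed example, since the patent varies the kernel size per layer to generate the preselected receptive fields.

```python
def receptive_field(layers) -> int:
    """Receptive field of one output unit for a stack of layers given
    as (kernel_size, stride) pairs, using the standard recurrence
    rf += (k - 1) * jump; jump *= stride."""
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

# four (conv k=3, stride 1) + (pool k=2, stride 2) pairs
stack = [(3, 1), (2, 2)] * 4
rf = receptive_field(stack)
```

Enlarging any kernel in `stack` enlarges `rf`, which is how varying the kernel yields the multiple preselected receptive fields described above.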
  • Step S200 preprocessing the target pathological image of the target case to obtain the target image block
  • the preprocessing of the target pathological image is similar to the preprocessing of the known pathological image when constructing the training set, including:
  • Step S210 marking the tumor cell area of the target pathological image
  • Step S220 Cut out a partial image with a size of 1712 ⁇ 961 pixels from the target pathological image according to the tumor cell area;
  • Step S230: segment the cut-out partial images. For example, segmentation is performed in the same way as when constructing the training set: each target partial image is segmented with a 256×256 sliding window in a 12-row by 25-column pattern, producing 300 tiles. Other cutting methods can be used; the present invention is not limited to this.
  • Step S240 performing color inversion processing on the segmented image blocks
  • Step S300: analyzing the target tiles with the classification model to obtain the tile-level TMB classification results of the target case;
  • Each tile is classified by the classification model to obtain the tile-level TMB classification result of the target case for that tile, determining whether its TMB is at a high or low level;
  • Step S400: obtaining the image-level TMB classification result of the target case from all the tile-level TMB classification results;
  • After all tiles have been classified, the tile-level TMB classification results of the target case for all tiles are obtained, and the image-level TMB classification result of the target case for the target pathological image is derived from them;
  • The image-level TMB classification result is obtained by voting: the tile-level TMB classification results vote on the TMB level of the target case with respect to the target pathological image, and the tile-level TMB class with the most votes becomes the image-level TMB classification result of the target case.
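The tile-to-image voting step can be sketched as a simple majority count. The helper name and the "high"/"low" vote labels are illustrative assumptions:

```python
from collections import Counter

def image_tmb_by_vote(tile_labels):
    # tile_labels: per-tile TMB calls (e.g. "high"/"low") from the classifier.
    # The class with the most votes becomes the image-level TMB result.
    return Counter(tile_labels).most_common(1)[0][0]

votes = ["low"] * 210 + ["high"] * 90  # a hypothetical 300-tile case
print(image_tmb_by_vote(votes))        # -> low
```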
  • It should be understood that the cutting size of the local images and the size of the sliding window are not limited to fixed pixel values and are only used to clearly explain the method proposed by the present invention; other sizes may also be used to cut local images or to select the sliding window, and the present invention is not limited thereto.
  • Fig. 6 is a schematic diagram of the structure of the TMB analysis device based on pathological images of the present invention.
  • An embodiment of the present invention also provides a readable storage medium and a data processing device.
  • The readable storage medium of the present invention stores executable instructions which, when executed by the processor of the data processing device, implement the above pathological-image-based TMB classification method.
  • All or part of the steps of the above method can be completed by a program instructing relevant hardware such as a processor, FPGA or ASIC; the program can be stored in a readable storage medium, such as a read-only memory, magnetic disk or optical disk.
  • Each module in the above embodiments can be implemented in the form of hardware, for example an integrated circuit realizing its corresponding function, or in the form of a software function module, for example a processor executing programs/instructions stored in a memory to realize its corresponding function.
  • The embodiments of the present invention are not limited to any specific combination of hardware and software.
  • FIG. 7A, 7B, and 7C are schematic diagrams of specific embodiments of the TMB analysis device based on pathological images of the present invention.
  • The data processing device of the present invention can take many specific forms for pathological-image-based TMB classification. For example, as shown in FIG. 7A, a computer is used as the TMB analysis device. The input unit of the TMB analysis computer includes at least one input device or interface such as a digital camera, digital video camera, scanner, card reader, optical drive or USB interface, which can convert known and target pathological images into data files for input into the TMB analysis computer; alternatively, the data files of the known and target pathological images can be input into the TMB analysis computer directly. The storage unit of the TMB analysis computer stores computer-executable instructions implementing the pathological-image-based TMB classification method of the present invention; the processor of the TMB analysis computer calls and executes these instructions to process the input known pathological image data and/or target pathological image data, so as to generate the classification model or obtain the TMB class corresponding to the target pathological image. When the TMB classification result is obtained, the output unit of the TMB analysis computer, such as a printer or a display, outputs the TMB classification result based on the target pathological image.
  • The TMB analysis device of the present invention can also be based on an already generated classification model; that is, the TMB analysis device no longer constructs a classification model from known pathological images. Instead, in addition to the computer-executable instructions implementing the pathological-image-based TMB classification method of the present invention, its storage unit stores an already constructed classification model. The processor of the TMB analysis device calls and executes these instructions together with the classification model to process and analyze the input target pathological image data and obtain the corresponding TMB class; when the TMB classification result is obtained, it is output through the output unit.
  • In this way, the TMB analysis device can lower the processing performance requirements on the processor, greatly simplify the complexity of the TMB analysis device, increase its portability, or extend its scope of application.
  • As shown in FIG. 7B, a tablet computer may be used as the TMB analysis device, so that target pathological images can be processed more conveniently; a mobile terminal such as a smartphone can also be used, and the present invention is not limited thereto.
  • As shown in FIG. 7C, a network server may be used as the TMB analysis device, so that a network platform can be built: the user only needs to input target pathological image data at a network terminal, and obtains the TMB classification result of the target pathological image through the network server on the local area network or wide area network via network equipment such as a switch/gateway.
  • The pathological image classification method of the present invention classifies both the tiles segmented from a pathological image and the pathological image itself.
  • For tile-level TMB classification, by plotting the accuracy, loss and AUC curves and using the results of the 10th training epoch, the accuracy on the test set is 0.9486 and the AUC value is 0.9488.
  • For image-level TMB classification based on the classification model, 350 known cases were classified and predicted, each pathological image being divided into 816 tiles on average. After predicting the TMB level of every tile of a case, majority voting was used to compute the overall TMB level of the pathological image of that case. In the experiment, only one of the 350 patients received an incorrect (false negative) prediction, giving a classification accuracy of 0.9971 for patient-level TMB prediction.
  • The present invention also performed classification prediction with local images of normal tissue: the collected normal tissue local images were cut into 768 tiles for prediction (all labeled low TMB); 3 tiles were misjudged as high TMB, giving an accuracy of 0.9961.
  • Since most clinically referenced TMB scores are currently obtained through gene panels, the TMB obtained in this way (panel TMB) is an approximation of the TMB obtained by WES (WES TMB).
  • At present, the FDA has approved two gene panels, MSKCC IMPACT468 and FM1; the genes of these two panels were extracted from the TCGA-LIHC project and their TMB scores calculated.
  • FIG. 8 is a schematic comparison of the classification accuracy and AUC values between the classification model of the embodiment of the present invention and panel sequencing.
  • Using the classification status determined by the WES TMB inflection point value mentioned in step S120, the WES TMB classification accuracies predicted by the trained CNN model and by panel TMB are compared: the classification accuracy and AUC value of the FM1 panel for TMB are 0.807 and 0.875, respectively.
  • Likewise, the classification accuracy and AUC value of the MSKCC IMPACT468 panel for TMB are 0.778 and 0.875, respectively, far below the corresponding scores predicted by the classification model of the present invention.
  • FIG. 9A is a schematic diagram of survival analysis based on MSKCC IMPACT468 panel
  • FIG. 9B is a schematic diagram of survival analysis based on FM1 panel
  • FIG. 9C is a schematic diagram of survival analysis based on CNN model prediction of the present invention.
  • The classification model of the present invention can extract the features of liver cancer pathological images well and thus classify tumor tissue into high and low TMB levels; this model predicts patient survival better than panel-based TMB estimation methods.
  • FIG. 10 is a schematic diagram of the receptive field of a classification model according to an embodiment of the present invention. As shown in FIG. 10, an example region is displayed in which a 48×48-pixel receptive field is projected onto the input image. In a 20× pathological image, a receptive field region of this size contains about 2 cells. A receptive field of this size helps the model fully recognize the heterogeneity of liver cancer cells while avoiding interference from interstitial tissue that may appear in pathological images.
  • TMB is an indicator reflecting the degree of gene mutation in tumor cells, and it can reflect the pathogenesis of tumors at the molecular level.
  • The pathological morphological features of tumor cells and their microenvironment-related cells are universally and intrinsically related to the genomic characteristics of tumor cells, so TMB can be predicted from pathological image features. Observing HE pathological sections of liver cancer under the microscope, one can see that, compared with normal cells, cancer cells vary in size and shape, have abnormal internal structure and an increased nucleus-to-cytoplasm ratio.
  • The morphological features of the two are compared in Table 4.
  • Receptive fields of different sizes acquire information at different scales, and a small receptive field better acquires local information.
  • The morphological features of cancer cells are local information, so a smaller receptive field can be used to obtain a better prediction effect.
  • The invention relates to a TMB classification method and system based on pathological images, and a TMB analysis device, including: performing TMB classification labeling and preprocessing on known pathological images to construct a training set; training a convolutional neural network with the training set to construct a classification model; preprocessing the target pathological image of a target case to obtain multiple target tiles; classifying the target tiles with the classification model to obtain the tile-level TMB classification results of the target case; and obtaining the image-level TMB classification result of the target case from all tile-level TMB classification results by classification voting.
  • The invention also relates to a TMB analysis device based on pathological images.
  • The TMB classification method of the present invention does not rely on any sample other than pathological images, has the advantages of accuracy, low cost and speed, and is of great value to tumor research.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Quality & Reliability (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Data Mining & Analysis (AREA)
  • Signal Processing (AREA)
  • Surgery (AREA)
  • Veterinary Medicine (AREA)
  • Public Health (AREA)
  • Animal Behavior & Ethology (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Physiology (AREA)
  • Psychiatry (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Biophysics (AREA)
  • Pathology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The present invention relates to a TMB classification method and system based on pathological images, and a TMB analysis device, comprising: performing TMB classification labeling and preprocessing on known pathological images to construct a training set; training a convolutional neural network with the training set to construct a classification model; preprocessing the target pathological image of a target case to obtain multiple target tiles; classifying the target tiles with the classification model to obtain the tile-level TMB classification results of the target case; and obtaining the image-level TMB classification result of the target case from all tile-level TMB classification results by classification voting. The present invention also relates to a TMB analysis device based on pathological images. The TMB classification method of the present invention does not rely on any sample other than pathological images, has the advantages of accuracy, low cost and speed, and is of great value to tumor research.

Description

TMB Classification Method and System Based on Pathological Images, and TMB Analysis Device — Technical Field
The present invention relates to the technical field of image processing, and in particular to a TMB classification method and system based on pathological images, and a TMB analysis device.
Background Art
Tumor mutation burden (TMB) is a measure of the total somatic mutations of a tumor, generally referring to the number of non-synonymous mutations per megabase falling within exon coding regions. TMB is an important biomarker in the era of tumor immunotherapy. Its ability to predict immunotherapy efficacy is not limited to "hot" (immunogenic) tumors such as NSCLC and melanoma; rather, it is a pan-cancer biomarker with predictive power in many tumors, including liver cancer. TMB detection is an important means of evaluating tumor immunogenicity, and its gold-standard detection method is whole-exome sequencing.
Because of its extremely high price and long experimental cycle, whole-exome sequencing cannot be widely applied. Panel sequencing, which detects a few hundred genes (usually covering 1–3 Mb of gene exon coding regions, whereas all human coding gene regions amount to roughly 40 Mb), is currently a more commonly used approach.
However, since panel sequencing does not cover all target regions, the TMB it yields is only an approximation. Examining two FDA-approved sequencing panels (IMPACT and FM1) on the liver cancer dataset of The Cancer Genome Atlas (TCGA) shows that, using the genes detected by the two panels directly, the accuracy of predicting exome TMB is 77.8% and 80.7%, respectively; even after various optimizations, the accuracy only reaches about 90%. For example, the Chinese invention "Detection site combination, detection method, detection kit and system for tumor mutation burden" (application number 201910338312.6) proposes a site combination, processing pipeline and calculation method for panel-sequencing-based TMB detection, providing a TMB detection method that optimizes the specific sites in the target detection region: high-frequency mutation sites related to tumorigenesis and progression in the Chinese population are excluded and synonymous mutations are included, so that the consistency between panel-based and whole-exome-based TMB results is improved to a certain extent.
Furthermore, even panel sequencing still faces problems such as excessive detection cost, long detection cycle and dependence on tissue samples. Typically, obtaining TMB by panel sequencing costs several thousand to ten or twenty thousand yuan, which constitutes a major obstacle to the widespread application of TMB detection. Moreover, the detection cycle for obtaining TMB is generally 2–3 weeks. Finally, obtaining TMB requires tumor tissue samples of sufficient quantity and quality, which in actual practice are often unavailable.
The defects of existing panel sequencing mainly result from technical path dependence. Since the gold-standard method for TMB is whole-exome sequencing, an approximation of TMB is predicted by reducing the detection region in a manner similar to sampling surveys; however, characteristics such as the uneven distribution of somatic mutations in tumor genomes introduce substantial error and reduce accuracy. Meanwhile, because this detection method inherits next-generation sequencing, the established detection technology of exome sequencing, it also carries the drawbacks of the NGS platform itself: high cost, long cycle and tissue-sample dependence. Therefore, the development of a TMB classification method that is accurate, low-cost, fast and independent of any sample other than pathological images is of great value to tumor research.
Disclosure of the Invention
The present invention proposes a TMB classification method, comprising: performing TMB classification labeling and preprocessing on known pathological images to construct a training set; training a convolutional neural network with the training set to construct a classification model; preprocessing the target pathological image of a target case to obtain multiple target tiles; classifying the target tiles with the classification model to obtain the tile-level TMB classification results of the target case; and obtaining the image-level TMB classification result of the target case from all tile-level TMB classification results by classification voting.
In the TMB classification method of the present invention, the step of preprocessing the target pathological image specifically comprises: marking the target tumor cell region of the target pathological image; cutting a target local image out of the target pathological image according to the target tumor cell region; and performing sliding-window segmentation on the target local image and inverting the colors of the resulting intermediate target tiles to obtain multiple target tiles.
In the TMB classification method of the present invention, the step of constructing the training set specifically comprises: classifying the known pathological images into multiple types by TMB through at least one classification threshold; marking the known tumor cell regions of all known pathological images; cutting known local images out of the known pathological images according to the known tumor cell regions; performing sliding-window segmentation on the known local images and inverting the colors of the resulting intermediate tiles to obtain multiple training tiles; and randomly partitioning all training tiles to construct the training subset and test subset of the training set.
In the TMB classification method of the present invention, the convolutional neural network comprises, in sequence, four pairs of convolutional and max-pooling layers, a first fully connected layer, and a second fully connected layer; all convolutional layers and the first fully connected layer use the ReLU activation function, and the second fully connected layer uses the Sigmoid activation function. By changing the granularity of the convolution kernels of each convolutional layer of the convolutional neural network, multiple preselected receptive fields are obtained and multiple corresponding preselected classification models are constructed; the accuracy and AUC value of each preselected classification model are obtained, the preselected classification model with the maximum accuracy and maximum AUC value is taken as the classification model, and the preselected receptive field corresponding to that classification model is taken as the optimal receptive field.
The present invention further provides a TMB classification system based on pathological images, comprising: a training set construction module for performing TMB classification labeling and preprocessing on known pathological images to construct a training set; a classification model construction module for training a convolutional neural network with the training set to construct a classification model; a target image preprocessing module for preprocessing the target pathological image of a target case to obtain multiple target tiles; a tile classification module for classifying the target tiles with the classification model to obtain the tile-level TMB classification results of the target case; and an image classification module for obtaining the image-level TMB classification result of the target case from all tile-level TMB classification results by classification voting.
In the TMB classification system of the present invention, the target image preprocessing module specifically: marks the target tumor cell region of the target pathological image; cuts a target local image out of the target pathological image according to the target tumor cell region; and performs sliding-window segmentation on the target local image and inverts the colors of the resulting intermediate target tiles to obtain multiple target tiles.
In the TMB classification system of the present invention, the training set construction module comprises: a TMB labeling module for classifying the known pathological images into multiple types by TMB through at least one classification threshold; a local region cutting module for marking the known tumor cell regions of all known pathological images and cutting known local images out of the known pathological images according to the known tumor cell regions; a training tile segmentation module for performing sliding-window segmentation on the known local images and inverting the colors of the resulting intermediate tiles to obtain multiple training tiles; and a training set partitioning module for randomly partitioning all training tiles to construct the training subset and test subset of the training set.
In the TMB classification system of the present invention, the convolutional neural network comprises, in sequence, four pairs of convolutional and max-pooling layers, a first fully connected layer, and a second fully connected layer, wherein all convolutional layers and the first fully connected layer use the ReLU activation function, and the second fully connected layer uses the Sigmoid activation function; by changing the granularity of the convolution kernels of each convolutional layer of the convolutional neural network, multiple preselected receptive fields are obtained and multiple corresponding preselected classification models are constructed; the accuracy and AUC value of each preselected classification model are obtained, the preselected classification model with the maximum accuracy and maximum AUC value is taken as the classification model, and the preselected receptive field corresponding to that classification model is taken as the optimal receptive field.
The present invention further provides a readable storage medium storing executable instructions for executing the pathological-image-based TMB classification method.
The present invention further relates to a TMB analysis device based on pathological images, comprising a processor and a readable storage medium; the processor retrieves the executable instructions in the readable storage medium to analyze a target pathological image and obtain the target classification result of the target pathological image.
Brief Description of the Drawings
FIG. 1 is a flowchart of the pathological image classification method of the present invention.
FIG. 2 is a schematic diagram of the classification model construction process according to an embodiment of the present invention.
FIG. 3 is a TMB scatter plot of known pathological images according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of tile segmentation according to an embodiment of the present invention.
FIG. 5 is a schematic diagram of the convolutional neural network structure according to an embodiment of the present invention.
FIG. 6 is a schematic structural diagram of the pathological-image-based TMB analysis device of the present invention.
FIGS. 7A, 7B and 7C are schematic diagrams of specific embodiments of the pathological-image-based TMB analysis device of the present invention.
FIG. 8 is a schematic comparison of classification accuracy and AUC values between the classification model of an embodiment of the present invention and panel sequencing.
FIG. 9A is a schematic diagram of survival analysis based on the MSKCC IMPACT468 panel.
FIG. 9B is a schematic diagram of survival analysis based on the FM1 panel.
FIG. 9C is a schematic diagram of survival analysis based on the CNN model prediction of the present invention.
FIG. 10 is a schematic diagram of the receptive field of the classification model according to an embodiment of the present invention.
The reference numerals are:
1: target tile                        2-1, 2-2, 2-3, 2-4: convolutional layers
3-1, 3-2, 3-3, 3-4: max-pooling layers    4-1, 4-2: fully connected layers
Best Mode for Carrying Out the Invention
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. It should be understood that the specific implementations described here are only intended to explain the present invention and are not intended to limit it.
Based on a deep understanding of the biological nature of tumors and of frontier technologies such as genomic research and medical image processing, the inventors innovatively propose a completely new technical path of detecting tumor mutation burden from pathological images, entirely breaking the technical path dependence of current TMB detection methods. Based on a deep understanding of the biological nature of tumors, the present invention assumes that pathological image features, such as the spatial localization of tumor cells, immune cells and other cell types and the morphology of tumor cells and their microenvironment-related cells, must be universally and intrinsically related to the genomic characteristics of tumor cells. TMB, as a surrogate marker of neoantigens, the most critical "handle" of the interaction between tumor cells and immune cells, is an important indicator of tumor cell immunogenicity, i.e., of the "degree of danger" the immune system poses to tumor cells, and should therefore also be reflected in pathological images. Deep learning is end-to-end learning capable of automated feature extraction. Recent studies have found that deep learning can not only predict driver gene mutations such as EGFR (epidermal growth factor receptor) from pathological images with high accuracy, but can also predict tumor-immunity-related microsatellite instability (MSI) status well. In the field of image classification, the convolutional neural network (CNN) and its derivative models are applied very widely. During the actual development of the present invention, multiple modeling and training strategies were attempted. After trying popular models such as AlexNet, VGG and ResNet, the inventors found that overfitting was very severe. On analysis, these models were proposed to extract features of natural images rather than pathological images. Comparatively, they pay more attention to the relationship between the subject of an image and its environment, so their feature scales (receptive fields) are very large, and each feature in the final feature map contains a wide range of information, even global features. However, the problem of predicting high or low TMB from pathological images differs greatly from natural image classification, because pathological image classification pays more attention to tiny details than natural image classification (such as cat/dog classification) does. By narrowing the receptive field and simplifying the model, the inventors use a collection of local features as the basis for classification, so as to adapt to pathological image classification while alleviating overfitting.
The objective of the present invention is to overcome the defects of panel-based TMB detection, namely low accuracy, high cost, long cycle and tissue-sample dependence, by proposing an analysis method that performs TMB classification on pathological images; with the analysis method of the present invention, the accuracy of TMB classification of pathological images reaches 99.7%.
(1) The pathological image classification method of the present invention
FIG. 1 is a flowchart of the pathological image classification method of the present invention. As shown in FIG. 1, the pathological image classification method of the present invention comprises:
Step S100: training a CNN with known pathological images to construct a classification model, specifically comprising:
Step S110: selecting known pathological images;
The classification model of the present invention is an analysis tool for pathological images of a certain tumor type, and the training data adopted are likewise pathological images of known cases of that tumor type. For example, for target pathological images of lung cancer cases, known lung cancer pathological image data are used as training data for the classification model; for gastric cancer cases, known gastric cancer pathological image data are used, and so on. FIG. 2 is a schematic diagram of the classification model construction process according to an embodiment of the present invention. As shown in FIG. 2, in the embodiment of the present invention the classification model is built for pathological images of liver cancer cases; the inventors therefore selected data from the liver cancer project of The Cancer Genome Atlas (TCGA) to construct the training set.
The inventors studied data from the TCGA liver cancer project. TCGA was jointly launched in 2006 by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) and currently studies 36 cancer types. TCGA uses genomic analysis technology based on large-scale sequencing and extensive collaboration to understand the molecular mechanisms of cancer. The aim of TCGA is to improve scientific understanding of the molecular basis of cancer, improve the ability to diagnose, treat and prevent cancer, and ultimately complete a database of genomic changes in all cancers. In this embodiment, somatic mutations (single-nucleotide variants and small insertions/deletions) were retrieved from the GDC TCGA liver cancer (LIHC) hub using the UCSC Xena browser, and the mutation results of 362 samples obtained with the MuSE analysis method were adopted. Only mutations with a "pass" filter label (located in exon regions and either non-synonymous or located in splice regions) were used to construct the training set.
Step S120: performing classification labeling on the known pathological images;
Before classifying TMB, at least one threshold needs to be selected to distinguish high from low TMB, so as to classify TMB into two or more types; liver cancer currently has no clinically established TMB threshold. Normally, known pathological images can be divided by TMB into two or three types, or into more than three types according to research needs; the present invention is not limited in this respect. In the following embodiments, unless otherwise specified, the known pathological images are divided by TMB into two types (high TMB and low TMB) as an example; the method of classifying TMB into three or more types is the same as that of two-type classification, differing only in the number of TMB thresholds, and is therefore not repeated.
In the embodiment of the present invention, segmented regression ("broken-stick analysis") is used to find the threshold by locating an inflection point, dividing the known pathological images into two types by TMB. Specifically: the TMB scores of the 362 cases are sorted in descending order and plotted as a scatter plot; segmented regression is applied to fit two straight lines to the scatter; finally, the inflection point of the curve is found, the TMB corresponding to this inflection point is taken as the classification threshold distinguishing high TMB from low TMB, and the 362 cases are TMB-classified with this threshold. FIG. 3 is a TMB scatter plot of the known pathological images of an embodiment of the present invention. As shown in FIG. 3, segmented regression on the TMB of all known pathological images gives an inflection point at a TMB of about 3.66. The horizontal dashed line passes through the inflection point, and 32 points lie above it. Table 1 lists the TMB of the 32 cases labeled as high-TMB cases, indicating that the TMB of these 32 cases is at a high level (labeled high-TMB cases), with their pathological images serving as positive images, while the TMB of the remaining 330 cases is at a low level (labeled low-TMB cases), with their pathological images serving as negative images.
No. Case ID TMB
1 TCGA-UB-A7MB-01A 35.472
2 TCGA-4R-AA8I-01A 22.667
3 TCGA-CC-A7IH-01A 16.750
4 TCGA-DD-AAC8-01A 13.583
5 TCGA-DD-A1EE-01A 10.778
6 TCGA-WQ-A9G7-01A 8.389
7 TCGA-DD-AACI-01A 7.944
8 TCGA-DD-A3A9-01A 7.694
9 TCGA-CC-A7IK-01A 7.333
10 TCGA-DD-AACL-01A 7.222
11 TCGA-ED-A7PZ-01A 6.250
12 TCGA-MI-A75G-01A 5.222
13 TCGA-DD-AADF-01A 5.167
14 TCGA-DD-AACQ-01A 5.111
15 TCGA-DD-AAE7-01A 5.028
16 TCGA-G3-A3CK-01A 4.917
17 TCGA-CC-A8HT-01A 4.583
18 TCGA-LG-A6GG-01A 4.583
19 TCGA-DD-AADO-01A 4.583
20 TCGA-CC-A5UD-01A 4.528
21 TCGA-DD-AACT-01A 4.500
22 TCGA-ED-A459-01A 4.306
23 TCGA-RC-A6M6-01A 4.278
24 TCGA-DD-AAEA-01A 4.139
25 TCGA-RC-A6M4-01A 4.139
26 TCGA-DD-AACZ-01A 4.083
27 TCGA-G3-A7M5-01A 4.000
28 TCGA-BC-A10Z-01A 3.861
29 TCGA-CC-A7IE-01A 3.806
30 TCGA-MI-A75I-01A 3.750
31 TCGA-DD-AADM-01A 3.750
32 TCGA-G3-AAV0-01A 3.667
Table 1
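The inflection-point ("broken-stick") threshold search used in step S120 can be sketched as a two-segment least-squares scan over the sorted TMB scores. The function name and the synthetic scores below are illustrative assumptions, not the patent's data:

```python
import numpy as np

def knee_threshold(tmb_scores):
    # Sort scores in descending order and fit two straight lines to the
    # rank/score scatter; the breakpoint with the lowest total squared error
    # is the inflection point separating high- from low-TMB cases.
    y = np.sort(np.asarray(tmb_scores, dtype=float))[::-1]
    x = np.arange(len(y), dtype=float)
    best_sse, best_k = np.inf, None
    for k in range(2, len(y) - 2):
        sse = 0.0
        for xs, ys in ((x[:k], y[:k]), (x[k:], y[k:])):
            coef = np.polyfit(xs, ys, 1)
            sse += float(np.sum((np.polyval(coef, xs) - ys) ** 2))
        if sse < best_sse:
            best_sse, best_k = sse, k
    return y[best_k]  # TMB value at the fitted inflection point

# Synthetic scores with a sharp knee, for illustration only.
scores = list(np.linspace(40, 4, 30)) + list(np.linspace(3.9, 1, 300))
print(knee_threshold(scores))  # prints a value near 3.9
```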
Step S130: preprocessing of the known pathological images;
380 whole-slide images of 362 liver cancer patients were downloaded from the TCGA-LIHC project with the GDC tool; these cases correspond one-to-one to the cases of the TMB data mentioned above.
Step S131: marking tumor cell regions;
The maximum scanning resolution of the images in the dataset is at least 20× (objective magnification), and a 20× field of view is the convention by which physicians judge tumor benignity or malignancy under the microscope. The present invention therefore cuts several local images of 1712×961 pixels out of each 20× image, the local images containing only cancerous regions (tumor cell regions). During cutting, 12 slide images of 12 cases were excluded for poor image quality. In the end, 470 local images were cut from high-TMB cases (positive images) and 5162 local images from low-TMB cases (negative images).
Step S132: segmenting all local images to obtain training tiles;
If local images were used directly to train a deep learning model, they would face the problems of excessive resolution and class imbalance; the present invention segments the local images into multiple tiles with flexibly adjusted strides to reduce resolution and balance the classes. Tiles can be segmented in various ways, such as threshold segmentation and region segmentation; in the embodiment of the present invention, the sliding-window method is used for training tile segmentation. FIG. 4 is a schematic diagram of tile segmentation according to an embodiment of the present invention. As shown in FIG. 4, the embodiment of the present invention uses a sliding window of 256×256 pixels. For local images of negative images, each picture is cut into 28 tiles in 4 rows and 7 columns; although the tiles overlap one another slightly, this is the scheme with the smallest overlapping area that leaves no pixel out. For local images of positive images, each picture is cut into 300 small tiles in 12 rows and 25 columns. In this way, the local images of all positive images yield 141,000 tiles of 256×256 pixels, and the local images of the negative images yield 144,536 tiles of the same resolution (256×256 pixels). The tile counts of the two classes are approximately equal, so class imbalance is resolved while the data are augmented.
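The sliding-window segmentation above (12 × 25 = 300 tiles for positive local images, 4 × 7 = 28 for negative ones) can be sketched with evenly spaced window origins; the helper name is an assumption:

```python
import numpy as np

def tile_image(img, win=256, rows=12, cols=25):
    # Split a local image (H x W x 3) into rows x cols slightly overlapping
    # win x win tiles by sliding a window with evenly spaced top-left corners.
    # rows=12, cols=25 mirrors the 300-tile scheme for positive local images;
    # rows=4, cols=7 gives the 28-tile minimal-overlap scheme for negatives.
    h, w = img.shape[:2]
    ys = np.linspace(0, h - win, rows).astype(int)
    xs = np.linspace(0, w - win, cols).astype(int)
    return [img[y:y + win, x:x + win] for y in ys for x in xs]

img = np.zeros((961, 1712, 3), dtype=np.uint8)  # a 1712x961 local image
tiles = tile_image(img)
print(len(tiles), tiles[0].shape)  # 300 tiles of shape (256, 256, 3)
```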
Step S133: performing color inversion on the training tiles;
Since some tiles are located at the edges of tumor tissue, they may contain blank areas. White has high values in the RGB system, while the colors of cells (especially nuclei) correspond to lower values. When these tiles are used as input to a deep learning model, training and analysis of the classification model are easier if pixels without analytical significance correspond to feature values close to 0 and analytically significant pixels correspond to higher feature values; the present invention therefore inverts the colors of all tiles.
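A minimal sketch of this inversion step, in which white background pixels map to values near 0 and strongly stained pixels (nuclei) to high values:

```python
import numpy as np

def invert_tile(tile):
    # Invert an 8-bit RGB tile: blank/white regions (values near 255) map to
    # values near 0, so uninformative pixels contribute features close to 0.
    return 255 - tile

white = np.full((256, 256, 3), 250, dtype=np.uint8)  # nearly blank tile
print(invert_tile(white).max())  # blank areas become values near 0 -> 5
```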
Step S134: randomly partitioning the training set and test set at a ratio of 4:1.
Step S140: training the convolutional neural network with the training set and test set;
The convolutional neural network (CNN) and its derivative models are widely applied in image classification. A CNN is a feed-forward neural network whose history dates back to 1962, when the biologists Hubel and Wiesel found that cells in the cat visual cortex are sensitive to parts of the visual input and thus proposed the concept of the receptive field; in 1980, Kunihiko Fukushima proposed the neocognitron based on Hubel and Wiesel's local receptive field theory, the earliest implementation of the CNN model. The receptive field is a basic concept of convolutional neural networks. Unlike fully connected networks, in which every feature depends on the entire input, each neuron in a convolutional layer connects, through the convolution kernel, only to the neurons in the receptive field of the layer above; this region is the neuron's receptive field. Convolutional neural networks absorb the idea of local receptive fields, with the advantages of weight sharing and local connection; while guaranteeing the training effect, a CNN can effectively control the parameter size and the amount of computation.
After trying popular models such as AlexNet, VGG and ResNet, the inventors found that overfitting was very severe. On analysis, these models were proposed to extract features of natural images rather than pathological images. Comparatively, they pay more attention to the relationship between the subject of an image and its environment, so their receptive fields are very large, and each feature in the final feature map contains a wide range of information, even global features. For example, a pixel of the feature map output by AlexNet's pool5 has a receptive field of 195×195 pixels on the input image, the maximum receptive field of VGG16 is 212×212 pixels, and that of ResNet50 can even reach 483×483 pixels.
However, the problem of predicting whether the TMB of a case is high or low from pathological images differs greatly from natural image classification, because pathological image classification pays more attention to tiny details than natural image classification (such as cat/dog classification) does. The present invention therefore narrows the receptive field, simplifies the model, and uses a collection of local features as the basis for classification, so as to adapt to pathological image classification while alleviating overfitting.
A CNN can take many structural forms, but not all of them achieve a good TMB classification effect. FIG. 5 is a schematic diagram of the convolutional neural network structure of an embodiment of the present invention. As shown in FIG. 5, after testing different hyperparameters, the present invention finally adopts 4 pairs of convolutional layers 2-1, 2-2, 2-3, 2-4 and max-pooling layers 3-1, 3-2, 3-3, 3-4, followed in sequence by a fully connected layer 4-1 containing 256 neurons and a fully connected layer 4-2 containing only 1 neuron; the convolutional layers 2-1, 2-2, 2-3, 2-4 and the fully connected layer 4-1 all use the ReLU activation function, and the fully connected layer 4-2 uses Sigmoid as its activation function. In this way, after the target tile 1 is processed and analyzed, the output of the fully connected layer 4-2 serves as the classification criterion.
This embodiment is a preferred embodiment in which the CNN is trained with known pathological images of confirmed liver cancer patients obtained from the TCGA-LIHC project. When training for other tumor types or with other training data, CNNs of other structural forms may be adopted to obtain a better classification effect, such as 3 pairs of convolutional and max-pooling layers plus a fully connected layer, or 5 pairs of convolutional and max-pooling layers plus a fully connected layer; the present invention is not limited in this respect.
Step S150: determining the receptive field size of the classification model
In training the CNN model, the receptive field size can be controlled by methods such as adjusting the model depth and adjusting the convolution kernel size. However, changing the model depth significantly changes the number of model parameters, which greatly affects the training effect (possibly producing overfitting or underfitting); the present invention therefore mainly controls the receptive field by changing the convolution kernel size.
Table 2 (rendered as an image in the original publication)
In the embodiment of the present invention, a series of experiments was first conducted: by varying the receptive field over a wide range (from 10×10 to 212×212 pixels), the suitable receptive field range was narrowed to between 46×46 and 60×60 pixels. Then, with the number of convolutional layers fixed, different convolution kernel sizes were used within this interval to control the receptive field at a finer granularity. By changing several of the convolution kernels of the first 3 convolutional layers of the model from 3×3 to 5×5, 8 models were designed, as shown in Table 2.
After these 8 models were trained with the same dataset, the accuracy and AUC (Area Under the receiver operating characteristic (ROC) Curve) values of each model are as shown in Table 3. According to the experimental results, the best-performing model is RF48, and the optimal receptive field is 48×48 pixels.
Table 3 (rendered as an image in the original publication)
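The relation between kernel sizes and receptive field can be reproduced with the standard recurrence (rf += (k − 1) · jump, jump *= stride). The layer lists below assume stride-1 convolutions and 2×2 stride-2 pooling; with all 3×3 kernels the four pairs give a 46×46 field, and enlarging the first convolution to 5×5 gives the 48×48 field of the RF48 model, consistent with the 46×46 to 60×60 range explored here:

```python
def receptive_field(layers):
    # layers: list of (kernel_size, stride) pairs from input to output.
    # rf grows by (k - 1) * jump at each layer; jump accumulates the strides.
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

base = [(3, 1), (2, 2)] * 4                     # all 3x3 convs + 2x2 pools
rf48 = [(5, 1), (2, 2)] + [(3, 1), (2, 2)] * 3  # first conv enlarged to 5x5
print(receptive_field(base), receptive_field(rf48))  # -> 46 48
```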
Step S200: preprocessing the target pathological image of the target case to obtain target tiles;
The preprocessing of the target pathological image is similar to the preprocessing of the known pathological images when constructing the training set, and includes:
Step S210: marking the tumor cell region of the target pathological image;
Step S220: cutting a local image of 1712×961 pixels out of the target pathological image according to the tumor cell region;
Step S230: segmenting the cut-out local image, for example in the same way as when constructing the training set: using a 256×256 sliding window, each target case image is segmented into 300 tiles in 12 rows × 25 columns; other cutting methods may also be used, and the present invention is not limited thereto;
Step S240: performing color inversion on the segmented tiles;
Step S300: analyzing the target tiles with the classification model to obtain the tile-level TMB classification results of the target case;
Each tile is classified by the classification model to obtain the tile-level TMB classification result of the target case for that tile, determining whether its TMB is at a high or low level;
Step S400: obtaining the image-level TMB classification result of the target case from all the tile-level TMB classification results;
After all tiles have been classified by the classification model, the tile-level TMB classification results of the target case for all tiles are obtained, and the image-level TMB classification result of the target case for the target pathological image is derived from them. In the embodiment of the present invention, the image-level TMB classification result is obtained by voting: the tile-level TMB classification results vote on the TMB level of the target case with respect to the target pathological image, and the tile-level TMB class with the most votes is the image-level TMB classification result of the target case.
It should be understood that in the above description, the cutting size of the local images and the size of the sliding window are not limited to fixed pixel values and are only used to clearly explain the method proposed by the present invention; other sizes may also be used to cut local images or to select the sliding window, and the present invention is not limited thereto.
(2) The pathological-image-based TMB analysis device of the present invention
FIG. 6 is a schematic structural diagram of the pathological-image-based TMB analysis device of the present invention. As shown in FIG. 6, an embodiment of the present invention further provides a readable storage medium and a data processing device. The readable storage medium of the present invention stores executable instructions which, when executed by the processor of the data processing device, implement the above pathological-image-based TMB classification method. Those of ordinary skill in the art will understand that all or part of the steps of the above method can be completed by a program instructing relevant hardware (such as a processor, FPGA or ASIC), and the program can be stored in a readable storage medium such as a read-only memory, magnetic disk or optical disk. All or part of the steps of the above embodiments can also be implemented with one or more integrated circuits. Correspondingly, each module in the above embodiments can be implemented in hardware form, for example an integrated circuit realizing its corresponding function, or in the form of a software function module, for example a processor executing programs/instructions stored in a memory to realize its corresponding function. The embodiments of the present invention are not limited to any specific combination of hardware and software.
FIGS. 7A, 7B and 7C are schematic diagrams of specific embodiments of the pathological-image-based TMB analysis device of the present invention. The data processing device of the present invention can take many specific forms for pathological-image-based TMB classification. For example, as shown in FIG. 7A, in an embodiment of the present invention a computer is used as the TMB analysis device. The input unit of the TMB analysis computer includes at least one input device or interface such as a digital camera, digital video camera, scanner, card reader, optical drive or USB interface, which can convert known and target pathological images into data files for input into the TMB analysis computer; alternatively, the data files of the known and target pathological images can be input into the TMB analysis computer directly. The storage unit of the TMB analysis computer stores computer-executable instructions implementing the pathological-image-based TMB classification method of the present invention; the processor of the TMB analysis computer calls and executes these instructions to process the input known pathological image data and/or target pathological image data, so as to generate the classification model or obtain the TMB class corresponding to the target pathological image. When the TMB classification result is obtained, it is output to the user through the output unit of the TMB analysis computer, such as a printer or display.
The TMB analysis device of the present invention can also be based on an already generated classification model; that is, the TMB analysis device no longer constructs the classification model from known pathological images. Instead, in addition to the computer-executable instructions implementing the pathological-image-based TMB classification method of the present invention, its storage unit stores an already constructed classification model. The processor of the TMB analysis device calls and executes these instructions together with the classification model to process and analyze the input target pathological image data and obtain the TMB class corresponding to the target pathological image; when the TMB classification result is obtained, it is output through the output unit. In this way, the TMB analysis device can lower the processing performance requirements on the processor, greatly simplify the complexity of the TMB analysis device, increase its portability, or extend its scope of application. As shown in FIG. 7B, in another embodiment of the present invention a tablet computer is used as the TMB analysis device, so that target pathological images can be processed more conveniently; a mobile terminal such as a smartphone can also be used, and the present invention is not limited thereto. As shown in FIG. 7C, in yet another embodiment of the present invention a network server is used as the TMB analysis device, so that a network platform can be built: the user only needs to input target pathological image data at a network terminal, and obtains the TMB classification result of the target pathological image through the network server on the local area network or wide area network via network equipment such as switches/gateways.
(3) Technical effects of the pathological image classification method of the present invention
It can be seen that the pathological image classification method of the present invention classifies both the tiles segmented from pathological images and the pathological images themselves. In actual use, after the receptive field is determined and the model is trained with the preprocessed dataset, for tile-level TMB classification, by plotting the accuracy, loss and AUC curves and using the results of the 10th training epoch, the accuracy on the test set is 0.9486 and the AUC value is 0.9488. For image-level TMB classification, based on the classification model, 350 known cases were classified and predicted, each pathological image being cut into 816 tiles on average; after predicting the TMB level of every tile of a case, majority voting was used to compute the overall TMB level of the pathological image of that case. In the experiment, only one of the 350 patients was predicted incorrectly (a false negative), giving a classification accuracy of 0.9971 for patient-level TMB prediction.
Since normal tissue regions of the pathological images were excluded during data preprocessing, no tiles segmented from normal tissue regions entered the training set during model training. To test the predictive ability on tiles of normal tissue regions in pathological images, the present invention also performed classification prediction with local images of normal tissue: the collected normal tissue local images were cut into 768 tiles for prediction (all labeled low TMB); 3 tiles were misjudged as high TMB, giving an accuracy of 0.9961.
(4) Comparison of effects between the pathological image classification method of the present invention and the prior art
Since most clinically referenced TMB scores are currently obtained through gene panels, the TMB obtained in this way (panel TMB) is an approximation of the TMB obtained by WES (WES TMB). At present, the FDA has approved two gene panels, MSKCC IMPACT468 and FM1. We extracted the genes of these two panels from the TCGA-LIHC project and calculated the TMB scores of these panels.
FIG. 8 is a schematic comparison of classification accuracy and AUC values between the classification model of an embodiment of the present invention and panel sequencing. As shown in FIG. 8, using the classification status determined by the WES TMB inflection point value mentioned in step S120, the WES TMB classification accuracies predicted by the trained CNN model and by panel TMB are compared: the classification accuracy and AUC value of the FM1 panel for TMB are 0.807 and 0.875, respectively. Likewise, the classification accuracy and AUC value of the MSKCC IMPACT468 panel for TMB are 0.778 and 0.875, respectively, far below the corresponding scores predicted by the classification model of the present invention.
Previous studies have found that high TMB in liver cancer is associated with poor prognosis; the survival prediction abilities of CNN-model-based TMB classification and panel-TMB-based classification are therefore compared. First, the segmented regression method is used to find the inflection points of the classifications corresponding to panel TMB. FIG. 9A is a schematic diagram of survival analysis based on the MSKCC IMPACT468 panel, FIG. 9B of survival analysis based on the FM1 panel, and FIG. 9C of survival analysis based on the CNN model prediction of the present invention. As shown in FIGS. 9A, 9B and 9C, owing to the limitation of the tested regions, the TMB of adjacent patients is overfitted to identical values, especially among patients with low TMB, which directly reflects the lower precision of panel TMB. Survival curve analysis shows a significant difference in survival time between the high-TMB and low-TMB groups predicted by the CNN model (median overall survival = 357 d vs 624 d, p = 0.00095), whereas there is no significant difference between the high- and low-TMB groups with either the FM1 panel or the MSKCC IMPACT468 panel. Clearly, the CNN model of the present invention performs well and is more useful for patient prognosis.
Experiments show that the classification model of the present invention can extract the features of liver cancer pathological images well and thus classify tumor tissue into high and low TMB levels. This model predicts patient survival better than panel-based TMB estimation methods.
(5) Pathological interpretation of the classification model
For pathological images, different receptive fields obtain information at different feature scales, and a small receptive field can better obtain local information from a pathological image. In HE slides under a 20× field of view, the morphological features of cancer cells are local information, so a smaller receptive field can be used to obtain better prediction results. FIG. 10 is a schematic diagram of the receptive field of the classification model of an embodiment of the present invention. As shown in FIG. 10, an example region is displayed in which a 48×48-pixel receptive field is projected onto the input image. In a 20× pathological image, a receptive field region of this size contains about 2 cells. A receptive field of this size helps the model fully recognize the heterogeneity of liver cancer cells while avoiding interference from interstitial tissue that may appear in pathological images.
The accumulation of gene mutations in normal cells leads to the emergence of tumor cells. TMB is an indicator reflecting the degree of gene mutation in tumor cells, and it can reflect the pathogenesis of tumors at the molecular level. The pathological morphological features of tumor cells and their microenvironment-related cells are universally and intrinsically related to the genomic characteristics of tumor cells, so TMB can be predicted from pathological image features. Observing HE pathological sections of liver cancer under the microscope, one can see that, compared with normal cells, cancer cells vary in size and shape, have abnormal internal structure and an increased nucleus-to-cytoplasm ratio; the morphological features of the two are compared in Table 4.
According to receptive field theory in deep learning, receptive fields of different sizes acquire information at different scales, and a small receptive field better acquires local information. In HE sections under a 20× field of view, the morphological features of cancer cells are local information, so a smaller receptive field achieves a better prediction effect.
Table 4 (rendered as an image in the original publication)
Industrial Applicability
The present invention relates to a TMB classification method and system based on pathological images, and a TMB analysis device, comprising: performing TMB classification labeling and preprocessing on known pathological images to construct a training set; training a convolutional neural network with the training set to construct a classification model; preprocessing the target pathological image of a target case to obtain multiple target tiles; classifying the target tiles with the classification model to obtain the tile-level TMB classification results of the target case; and obtaining the image-level TMB classification result of the target case from all tile-level TMB classification results by classification voting. The present invention also relates to a TMB analysis device based on pathological images. The TMB classification method of the present invention does not rely on any sample other than pathological images, has the advantages of accuracy, low cost and speed, and is of great value to tumor research.

Claims (10)

  1. A TMB classification method based on pathological images, characterized by comprising:
    performing TMB classification labeling and preprocessing on known pathological images to construct a training set;
    training a convolutional neural network with the training set to construct a classification model;
    preprocessing the target pathological image of a target case to obtain multiple target tiles;
    classifying the target tiles with the classification model to obtain tile-level TMB classification results of the target case; and
    obtaining the image-level TMB classification result of the target case from all the tile-level TMB classification results by classification voting.
  2. The TMB classification method of claim 1, characterized in that the step of preprocessing the target pathological image specifically comprises:
    marking the target tumor cell region of the target pathological image;
    cutting a target local image out of the target pathological image according to the target tumor cell region; and
    performing sliding-window segmentation on the target local image, and inverting the colors of the resulting intermediate target tiles to obtain multiple target tiles.
  3. The TMB classification method of claim 1, characterized in that the step of constructing the training set specifically comprises:
    classifying the known pathological images into multiple types by TMB through at least one classification threshold;
    marking the known tumor cell regions of all the known pathological images;
    cutting known local images out of the known pathological images according to the known tumor cell regions;
    performing sliding-window segmentation on the known local images, and inverting the colors of the resulting intermediate tiles to obtain multiple training tiles; and
    randomly partitioning all the training tiles to construct the training subset and test subset of the training set.
  4. The TMB classification method of claim 1, characterized in that the convolutional neural network comprises, in sequence, four pairs of convolutional and max-pooling layers, a first fully connected layer, and a second fully connected layer; all the convolutional layers and the first fully connected layer use the ReLU activation function, and the second fully connected layer uses the Sigmoid activation function;
    by changing the granularity of the convolution kernels of each convolutional layer of the convolutional neural network, multiple preselected receptive fields are obtained and multiple corresponding preselected classification models are constructed; the accuracy and AUC value of each preselected classification model are obtained, the preselected classification model with the maximum accuracy and maximum AUC value is taken as the classification model, and the preselected receptive field corresponding to the classification model is taken as the optimal receptive field.
  5. A TMB classification system based on pathological images, characterized by comprising:
    a training set construction module for performing TMB classification labeling and preprocessing on known pathological images to construct a training set;
    a classification model construction module for training a convolutional neural network with the training set to construct a classification model;
    a target image preprocessing module for preprocessing the target pathological image of a target case to obtain multiple target tiles;
    a tile classification module for classifying the target tiles with the classification model to obtain tile-level TMB classification results of the target case; and
    an image classification module for obtaining the image-level TMB classification result of the target case from all the tile-level TMB classification results by classification voting.
  6. The TMB classification system of claim 5, characterized in that the target image preprocessing module specifically:
    marks the target tumor cell region of the target pathological image; cuts a target local image out of the target pathological image according to the target tumor cell region; and performs sliding-window segmentation on the target local image, inverting the colors of the resulting intermediate target tiles to obtain multiple target tiles.
  7. The TMB classification system of claim 5, characterized in that the training set construction module comprises:
    a TMB labeling module for classifying the known pathological images into multiple types by TMB through at least one classification threshold;
    a local region cutting module for marking the known tumor cell regions of all the known pathological images, and cutting known local images out of the known pathological images according to the known tumor cell regions;
    a training tile segmentation module for performing sliding-window segmentation on the known local images, and inverting the colors of the resulting intermediate tiles to obtain multiple training tiles; and
    a training set partitioning module for randomly partitioning all the training tiles to construct the training subset and test subset of the training set.
  8. The TMB classification system of claim 5, characterized in that the convolutional neural network comprises, in sequence, four pairs of convolutional and max-pooling layers, a first fully connected layer, and a second fully connected layer, wherein all the convolutional layers and the first fully connected layer use the ReLU activation function, and the second fully connected layer uses the Sigmoid activation function;
    by changing the granularity of the convolution kernels of each convolutional layer of the convolutional neural network, multiple preselected receptive fields are obtained and multiple corresponding preselected classification models are constructed; the accuracy and AUC value of each preselected classification model are obtained, the preselected classification model with the maximum accuracy and maximum AUC value is taken as the classification model, and the preselected receptive field corresponding to the classification model is taken as the optimal receptive field.
  9. A readable storage medium storing executable instructions for executing the pathological-image-based TMB classification method of any one of claims 1 to 4.
  10. A TMB analysis device based on pathological images, comprising a processor and the readable storage medium of claim 9, the processor retrieving the executable instructions in the readable storage medium to analyze a target pathological image and obtain the target classification result of the target pathological image.
PCT/CN2019/113582 2019-09-30 2019-10-28 TMB classification method and system based on pathological images, and TMB analysis device WO2021062904A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/596,127 US11468565B2 (en) 2019-09-30 2019-10-28 TMB classification method and system and TMB analysis device based on pathological image

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910942092.8A TMB classification method and system based on pathological images, and TMB analysis device
CN201910942092.8 2019-09-30

Publications (1)

Publication Number Publication Date
WO2021062904A1 true WO2021062904A1 (zh) 2021-04-08

Family

ID=69652088

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/113582 WO2021062904A1 (zh) 2019-09-30 2019-10-28 基于病理图像的tmb分类方法、系统及tmb分析装置

Country Status (3)

Country Link
US (1) US11468565B2 (zh)
CN (1) CN110866893B (zh)
WO (1) WO2021062904A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113192077A (zh) * 2021-04-15 2021-07-30 华中科技大学 一种细胞及区域层次的病理图自动分类方法及系统
CN113313680A (zh) * 2021-05-24 2021-08-27 华南理工大学 一种结直肠癌病理图像预后辅助预测方法及系统
CN113673610A (zh) * 2021-08-25 2021-11-19 上海鹏冠生物医药科技有限公司 一种用于组织细胞病理图像诊断系统的图像预处理方法

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110866893B (zh) 2019-09-30 2021-04-06 中国科学院计算技术研究所 基于病理图像的tmb分类方法、系统及tmb分析装置
CN112927179A (zh) * 2019-11-21 2021-06-08 粘晓菁 肝肿瘤智慧分析方法
CN111681229B (zh) * 2020-06-10 2023-04-18 创新奇智(上海)科技有限公司 深度学习模型训练方法、可穿戴衣服瑕疵识别方法及装置
CN111797842B (zh) * 2020-07-06 2023-04-07 中国科学院计算机网络信息中心 图像分析方法及装置、电子设备
CN112101409B (zh) * 2020-08-04 2023-06-20 中国科学院计算技术研究所 基于病理图像的肿瘤突变负荷(tmb)分类方法与系统
CN112071430B (zh) * 2020-09-07 2022-09-13 北京理工大学 一种病理指标的智能预测系统
CN113192633B (zh) * 2021-05-24 2022-05-31 山西大学 基于注意力机制的胃癌细粒度分类方法
CN113947607B (zh) * 2021-09-29 2023-04-28 电子科技大学 一种基于深度学习的癌症病理图像生存预后模型构建方法
CN114152557B (zh) * 2021-11-16 2024-04-30 深圳元视医学科技有限公司 基于图像分析的血细胞计数方法和系统
CN115620894B (zh) * 2022-09-20 2023-05-02 贵州医科大学第二附属医院 基于基因突变的肺癌免疫疗效预测系统、装置及存储介质
CN116580216B (zh) * 2023-07-12 2023-09-22 北京大学 病理图像匹配方法、装置、设备及存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009400A (zh) * 2018-01-11 2018-05-08 至本医疗科技(上海)有限公司 全基因组肿瘤突变负荷预测方法、设备以及存储介质
CN108509991A (zh) * 2018-03-29 2018-09-07 青岛全维医疗科技有限公司 基于卷积神经网络的肝部病理图像分类方法
CN109785903A (zh) * 2018-12-29 2019-05-21 哈尔滨工业大学(深圳) 一种基因表达数据分类器
CN110264462A (zh) * 2019-06-25 2019-09-20 电子科技大学 一种基于深度学习的乳腺超声肿瘤识别方法

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110082203A1 (en) * 2008-02-04 2011-04-07 Kevin Ka-Wang Wang Process to diagnose or treat brain injury
US10572996B2 (en) * 2016-06-28 2020-02-25 Contextvision Ab Method and system for detecting pathological anomalies in a digital pathology image and method for annotating a tissue slide
CN108717554A (zh) * 2018-05-22 2018-10-30 复旦大学附属肿瘤医院 一种甲状腺肿瘤病理组织切片图像分类方法及其装置
US20210200987A1 (en) * 2018-06-06 2021-07-01 Visiongate, Inc. Morphometric detection of dna mismatch repair deficiency
CN109628638B (zh) * 2018-12-21 2019-12-10 中国水产科学研究院黄海水产研究所 基于对虾东方病毒基因组序列的检测方法及其应用
CN109880910B (zh) 2019-04-25 2020-07-17 南京世和基因生物技术股份有限公司 一种肿瘤突变负荷的检测位点组合、检测方法、检测试剂盒及系统
CN110245657B (zh) * 2019-05-17 2021-08-24 清华大学 病理图像相似性检测方法及检测装置
CN110288542A (zh) * 2019-06-18 2019-09-27 福州数据技术研究院有限公司 一种基于随机变换的肝部病理图像样本增强方法
CN110866893B (zh) 2019-09-30 2021-04-06 中国科学院计算技术研究所 基于病理图像的tmb分类方法、系统及tmb分析装置


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113192077A (zh) * 2021-04-15 2021-07-30 Huazhong University of Science and Technology Automatic pathological image classification method and system at the cell and region levels
CN113192077B (zh) * 2021-04-15 2022-08-02 Huazhong University of Science and Technology Automatic pathological image classification method and system at the cell and region levels
CN113313680A (zh) * 2021-05-24 2021-08-27 South China University of Technology Colorectal cancer pathological image prognosis-assisted prediction method and system
CN113313680B (zh) * 2021-05-24 2023-06-23 South China University of Technology Colorectal cancer pathological image prognosis-assisted prediction method and system
CN113673610A (zh) * 2021-08-25 2021-11-19 Shanghai Pengguan Biomedical Technology Co., Ltd. Image preprocessing method for a tissue cell pathological image diagnosis system

Also Published As

Publication number Publication date
CN110866893A (zh) 2020-03-06
CN110866893B (zh) 2021-04-06
US11468565B2 (en) 2022-10-11
US20220207726A1 (en) 2022-06-30

Similar Documents

Publication Publication Date Title
WO2021062904A1 (zh) TMB classification method and system based on pathological images, and TMB analysis device
CN109300111B (zh) Deep learning-based chromosome recognition method
CN110245657B (zh) Pathological image similarity detection method and detection device
Xu et al. Using transfer learning on whole slide images to predict tumor mutational burden in bladder cancer patients
CN113628157 (zh) System and method for characterizing the tumor microenvironment using pathological images
CN110111895 (zh) Method for establishing a nasopharyngeal carcinoma distant metastasis prediction model
WO2015173435A1 (en) Method for predicting a phenotype from a genotype
Ström et al. Pathologist-level grading of prostate biopsies with artificial intelligence
US20220237789A1 (en) Weakly supervised multi-task learning for cell detection and segmentation
US20220277811A1 (en) Detecting False Positive Variant Calls In Next-Generation Sequencing
US20230306598A1 (en) Systems and methods for mesothelioma feature detection and enhanced prognosis or response to treatment
JP7467504 (ja) Method and device for determining chromosomal aneuploidy and for constructing classification models
Chidester et al. Discriminative bag-of-cells for imaging-genomics
CN111814893 (zh) Deep learning-based method and system for predicting EGFR mutations from whole-scan lung images
CN117174179 (zh) Deep learning-based mechanism information analysis system for microsatellite-instability colorectal cancer
CN115295154 (zh) Tumor immunotherapy efficacy prediction method, apparatus, electronic device and storage medium
Liu et al. Pathological prognosis classification of patients with neuroblastoma using computational pathology analysis
Jing et al. A comprehensive survey of intestine histopathological image analysis using machine vision approaches
US20230282362A1 (en) Systems and methods for determining breast cancer prognosis and associated features
CN109191452B (zh) Active learning-based automatic labeling method for peritoneal metastasis in abdominal CT images
CN116228759 (zh) Computer-aided diagnosis system and device for renal cell carcinoma types
CN112101409B (zh) Tumor mutational burden (TMB) classification method and system based on pathological images
CN115690056 (zh) Gastric cancer pathological image classification method and system based on HER2 gene detection
CN114974432 (zh) Biomarker screening method and related applications
Fang et al. Deep Learning Predicts Biomarker Status and Discovers Related Histomorphology Characteristics for Low-Grade Glioma

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19948082

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19948082

Country of ref document: EP

Kind code of ref document: A1