CN118197603A - Method for predicting stomach cancer molecular subtype by using stomach cancer pathological image

Publication number
CN118197603A
Authority
CN
China
Prior art keywords
image
stomach cancer
gastric cancer
molecular subtype
gene expression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410366139.1A
Other languages
Chinese (zh)
Inventor
王兵
聂传齐
卢琨
汪文艳
吴紫恒
赵远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University of Technology AHUT
Original Assignee
Anhui University of Technology AHUT

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a method for predicting gastric cancer molecular subtypes from gastric cancer pathological images, belonging to the technical field of gastric cancer histopathological images. The method comprises the following steps: S1: processing and analyzing gastric cancer gene expression data; S2: comparative analysis of the tumor immune microenvironment; S3: preprocessing pathological images and extracting and analyzing features; S4: molecular subtype recognition. The invention first preprocesses and analyzes gastric cancer gene expression data, then computes tumor immune microenvironment data from the gene expression data and analyzes it, then processes the collected pathological images through screening, annotation, cutting, quality control, normalization, and feature extraction, and finally predicts the gastric cancer molecular subtype with a ResNet model trained with focal loss. Experimental results show that the invention can accurately predict gastric cancer molecular subtypes from gastric cancer images, and it is expected to be applied to medical image recognition in the future.

Description

Method for predicting stomach cancer molecular subtype by using stomach cancer pathological image
Technical Field
The invention relates to the technical field of stomach cancer histopathological images, in particular to a method for predicting stomach cancer molecular subtypes by utilizing stomach cancer pathological images.
Background
Gastric cancer (GC) is one of the most common malignant tumors of the digestive system and one of the leading causes of cancer-related death worldwide. In 2020, China recorded about 480,000 new cases and 370,000 related deaths, accounting for 44% and 48% of the global totals, respectively. Gastric cancer is heterogeneous, and different classification systems use different criteria. Under the Lauren classification, gastric cancer is divided by morphology into three types: diffuse, intestinal, and mixed. In 2014, The Cancer Genome Atlas (TCGA) studied 295 gastric cancer samples using six molecular analysis platforms, including whole-exome sequencing (WES), and proposed four molecular subtypes: Epstein-Barr virus-positive (EBV), microsatellite instability (MSI), chromosomal instability (CIN), and genomically stable (GS). These subtypes were included in the World Health Organization classification in 2019. Research on the pathogenesis and molecular characteristics of gastric cancer is therefore important for early diagnosis, treatment planning, and prognosis evaluation.
Gastric cancer varies widely in clinical manifestation and prognosis, in part because of its heterogeneity at the molecular level. In recent years, a growing number of studies have found that gastric cancer can be divided into molecular subtypes with distinct gene expression characteristics and biological behaviors. These differences may affect both the patient's response to treatment and the prognosis. Some subtypes may respond better to traditional therapies (such as chemotherapy or radiation therapy), while others may be better suited to targeted therapy or immunotherapy. Understanding and identifying the molecular subtypes of gastric cancer is therefore important for formulating personalized treatment regimens, improving treatment efficacy, and predicting patient prognosis. Traditional clinicopathological features are helpful for diagnosis and therapy planning, but on their own they may not adequately capture molecular-level heterogeneity.
In recent years, deep learning techniques have made breakthrough progress in the medical field. Their pattern recognition capability and ability to process complex data make them a powerful tool for studying and predicting the molecular characteristics of disease. In the cancer field, deep learning has been widely used in tumor classification, prognosis evaluation, image analysis, genomics research, and the like. In some medical data analysis tasks deep learning has outperformed humans, and images of lung, prostate, and brain tumors can be used to predict patient survival and tumor mutations. Kather et al. built a deep residual learning model to predict microsatellite instability (MSI) from H&E-stained histological images. For this reason, a method for predicting the molecular subtype of gastric cancer from gastric cancer pathological images is proposed.
Disclosure of Invention
The technical problem to be solved by the invention is how to accurately identify the molecular subtype of gastric cancer from gastric cancer pathological images using deep learning. To this end, a method for predicting gastric cancer molecular subtypes from gastric cancer pathological images is provided.
The invention solves this technical problem through the following technical solution, which comprises the following steps:
S1: gastric cancer gene expression data processing and analysis
Obtaining gastric cancer gene expression data of patients from TCGA, matching each patient with the corresponding molecular subtype label, preprocessing the data as required for differential gene expression analysis, and carrying out differential analysis between the molecular subtype groups;
S2: comparative analysis of the tumor immune microenvironment
According to the gene expression data matched in step S1, calculating the tumor immune microenvironment data of each corresponding patient using the CIBERSORT tool, carrying out statistical analysis of the immune microenvironment, and comparing the differences between the molecular subtype groups;
S3: pathological image preprocessing, feature extraction, and analysis
Collecting the pathological images of the corresponding patients from TCGA, cutting and normalizing the collected images to obtain pathological image blocks for each patient, and extracting and analyzing the image block features;
S4: molecular subtype recognition
Designing and training a deep learning model for identifying gastric cancer molecular subtypes, obtaining the classification results on a validation set, and evaluating the classification performance of the model.
Further, in the step S1, the specific process is as follows:
S11: obtaining mRNA gene expression data of gastric cancer patients from TCGA and matching them with gastric cancer molecular subtype labels;
S12: filtering the matched gene expression data to screen out low-quality data, normalizing the gene expression data, and carrying out statistical analysis of differential gene expression on the normalized data.
Further, in the step S2, the specific process is as follows:
S21: according to the matched gene expression data, calculating the corresponding gastric cancer immune microenvironment data using the CIBERSORT tool;
S22: carrying out differential analysis between molecular subtype groups for each component of the immune microenvironment.
Further, in the step S22, the inter-group difference analysis proceeds as follows: first the fold change is calculated, that is, the ratio of the mean gene expression levels of the two sample groups, and then LFC is calculated by the formula $\mathrm{LFC} = \log_2(\text{fold change})$; an absolute value of LFC greater than 1 indicates a difference between the two groups. A test statistic is then calculated for each expression value to measure the difference between groups, and a significance P value is calculated from the t distribution to measure the significance of the difference; P < 0.05 is regarded as a significant difference between the groups.
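The fold-change and t-statistic steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation used in the invention: the choice of Welch's unequal-variance t statistic, the function names, and the toy expression values are all assumptions. Converting the statistic to a P value requires the t-distribution CDF from a statistics package and is not shown.

```python
import math
from statistics import mean, variance

def log2_fold_change(group_a, group_b):
    """LFC = log2(mean(group_a) / mean(group_b)); both means must be positive."""
    return math.log2(mean(group_a) / mean(group_b))

def welch_t(group_a, group_b):
    """Welch's t statistic and approximate degrees of freedom (assumed test variant)."""
    va = variance(group_a) / len(group_a)
    vb = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1))
    return t, df

# Hypothetical expression levels of one gene in two subtype groups.
ebv = [8.0, 9.0, 7.0, 8.0]
non_ebv = [2.0, 2.5, 1.5, 2.0]

lfc = log2_fold_change(ebv, non_ebv)
t_stat, dof = welch_t(ebv, non_ebv)
passes_lfc_cutoff = abs(lfc) > 1  # the |LFC| > 1 criterion from the text
```

A gene would be reported as differentially expressed only if it passes both the |LFC| > 1 cutoff and the P < 0.05 cutoff.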
Further, in the step S3, the specific procedure is as follows:
S31: a pathologist pre-examines the collected pathological images, excludes blurred images that do not meet the requirements, and manually annotates the tumor regions of the images that do meet the requirements;
S32: cutting the annotated pathological images: each tumor region is cut into slices of 512x512 pixels with no overlap between slices; quality control is applied to the slices, discarding any slice in which the blank area or the background area exceeds 30%; Macenko color normalization is applied to the remaining slices;
S33: extracting the image features of each normalized slice, including color features and texture features, to obtain the image features of the corresponding pathological image; carrying out statistical analysis of the image features across molecular subtypes, and carrying out correlation analysis between the image features and the tumor immune microenvironment that may influence them.
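The slicing and quality-control rule of step S32 can be sketched as below. This is a hedged illustration in plain Python: the function names, the representation of a slice as a list of gray-value rows, and the brightness threshold of 220 used to decide that a pixel is blank are assumptions, since the source does not state how blank or background pixels are detected. Macenko normalization is not shown.

```python
def slice_origins(width, height, size=512):
    """Top-left corners of non-overlapping size x size slices that fit inside the region."""
    return [(x, y)
            for y in range(0, height - size + 1, size)
            for x in range(0, width - size + 1, size)]

def blank_fraction(gray_slice, blank_threshold=220):
    """Fraction of pixels at least as bright as blank_threshold (assumed blank/background)."""
    pixels = [p for row in gray_slice for p in row]
    return sum(p >= blank_threshold for p in pixels) / len(pixels)

def keep_slice(gray_slice, max_blank=0.30, blank_threshold=220):
    """Apply the 30% rule: keep the slice unless more than 30% of it is blank."""
    return blank_fraction(gray_slice, blank_threshold) <= max_blank

# A 1100 x 600 region holds two full 512 x 512 slices side by side.
origins = slice_origins(1100, 600)

# Toy 2 x 2 "slices" standing in for 512 x 512 gray-scale data.
mostly_tissue = [[255, 100], [90, 80]]    # 25% blank
mostly_blank = [[255, 255], [90, 80]]     # 50% blank
```

Slices that extend past the edge of the region are simply dropped here; padding them instead would be an equally valid design choice that the source does not settle.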
Further, in the step S33, the color features are obtained as follows:
S3301: converting the pathological image slice into the HSV color space to obtain a color image slice;
S3302: splitting the color image into separate R, G, and B channels;
S3303: converting the image to a gray-scale image, and calculating the mean and variance over the R, G, B, and gray-scale spaces, respectively.
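The per-channel statistics of the color feature steps can be sketched as follows. The sketch makes several assumptions that the source does not specify: a slice is given as rows of (r, g, b) tuples, the gray-scale conversion uses the common ITU-R BT.601 luma weights, and the variance is the population variance. The HSV conversion step is omitted because the statistics are computed on the R, G, B, and gray-scale values.

```python
from statistics import mean, pvariance

def color_features(rgb_slice):
    """Mean and variance of the R, G, B channels and of a gray-scale version of a slice.

    rgb_slice: list of rows, each a list of (r, g, b) tuples.
    """
    pixels = [px for row in rgb_slice for px in row]
    channels = {
        "R": [p[0] for p in pixels],
        "G": [p[1] for p in pixels],
        "B": [p[2] for p in pixels],
        # Assumed gray-scale formula (BT.601 luma weights).
        "gray": [0.299 * p[0] + 0.587 * p[1] + 0.114 * p[2] for p in pixels],
    }
    features = {}
    for name, values in channels.items():
        features[name + "_mean"] = mean(values)
        features[name + "_var"] = pvariance(values)
    return features

# Toy 1 x 2 slice.
features = color_features([[(10, 20, 30), (30, 20, 10)]])
```

This yields eight color features per slice (mean and variance for each of the four channels).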
Further, in the step S33, the texture features are obtained as follows:
S3311: first, applying a 2D wavelet packet transform to the pathological image slice to obtain four sub-images: an approximation sub-image, a horizontal detail sub-image, a vertical detail sub-image, and a diagonal detail sub-image;
S3312: quantizing the approximation sub-image, the horizontal detail sub-image, the vertical detail sub-image, and the diagonal detail sub-image into 16 gray levels, standardizing the images, and constructing a window multi-scale co-occurrence matrix;
S3313: extracting texture features based on the window multi-scale co-occurrence matrix, the extracted texture features being: entropy, contrast, regulation, correlation, IDM, DLMSE, GLMSE, DLA, GLA, SGSDA, and SGBDA.
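The texture steps above can be illustrated with a simplified stand-in. The sketch below uses a one-level 2D Haar transform (the source does not name the wavelet), min-max quantization to 16 levels, and an ordinary gray-level co-occurrence matrix over horizontally adjacent pixels in place of the window multi-scale co-occurrence matrix, whose exact construction is not given in the source. Only entropy and contrast are computed; all function names are hypothetical.

```python
import math

def haar_dwt2(img):
    """One-level 2D Haar transform of an even-sized gray image (list of rows).

    Returns the (approximation, horizontal, vertical, diagonal) sub-images.
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            row_ll.append((a + b + d + e) / 2)  # low-pass along rows and columns
            row_lh.append((a + b - d - e) / 2)  # low-pass rows, high-pass columns
            row_hl.append((a - b + d - e) / 2)  # high-pass rows, low-pass columns
            row_hh.append((a - b - d + e) / 2)  # high-pass along rows and columns
        ll.append(row_ll); lh.append(row_lh); hl.append(row_hl); hh.append(row_hh)
    return ll, lh, hl, hh

def quantize(sub, levels=16):
    """Min-max scale a sub-image to integer gray levels 0 .. levels - 1."""
    flat = [v for row in sub for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0
    return [[min(levels - 1, int((v - lo) / span * levels)) for v in row] for row in sub]

def cooccurrence(q, levels=16):
    """Normalized co-occurrence matrix of horizontally adjacent gray levels."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in q:
        for left, right in zip(row, row[1:]):
            counts[left][right] += 1
            total += 1
    return [[n / (total or 1) for n in row] for row in counts]

def entropy_and_contrast(p):
    """Entropy (in bits) and contrast of a normalized co-occurrence matrix."""
    entropy = -sum(v * math.log2(v) for row in p for v in row if v > 0)
    contrast = sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))
    return entropy, contrast

image = [[0, 0, 8, 8], [0, 0, 8, 8], [4, 4, 12, 12], [4, 4, 12, 12]]
approx, horizontal, vertical, diagonal = haar_dwt2(image)
entropy, contrast = entropy_and_contrast(cooccurrence(quantize(approx)))
```

The same quantization and feature computation would be repeated for each of the four sub-images; a full implementation would also add the remaining features listed in S3313.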
Further, in the step S33, statistical analysis is performed on the image features of different molecular subtypes, and correlation analysis is performed on the image features and tumor immune microenvironment which may affect the image features, which specifically includes the following steps:
S3321: for all extracted image features, performing difference testing between subtype groups and calculating P values using statistical analysis software;
S3322: carrying out statistical analysis, difference testing, and P value calculation on the immune cell content of the tumor immune microenvironment of each subtype;
S3323: carrying out Spearman correlation analysis between the image features and the immune cell content of the tumor immune microenvironment using statistical analysis software, and calculating the correlation coefficients.
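The Spearman correlation in step S3323 is the Pearson correlation of the ranks of the two variables. The sketch below shows that calculation in plain Python as an illustration only; the source performs it with statistical analysis software, and the feature and immune cell values here are invented toy numbers.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical per-patient values: a texture feature and an immune cell fraction.
texture_entropy = [1.2, 2.5, 3.1, 4.8]
cd8_fraction = [0.05, 0.10, 0.08, 0.20]
rho = spearman(texture_entropy, cd8_fraction)
```

Because it works on ranks, the coefficient captures any monotonic relationship between an image feature and an immune cell fraction, not only a linear one.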
Further, in the step S4, the specific process is as follows:
S41: formulating a classification strategy according to the original definition of the gastric cancer molecular subtypes: first determining whether the tumor is Epstein-Barr virus positive and separating out EBV-type gastric cancer, then determining whether it shows microsatellite instability, and finally distinguishing the genomically stable type from the chromosomal instability type according to the degree of copy number variation;
S42: dividing the data set into a training set and a validation set in a set proportion, following the established classification strategy;
S43: the deep learning model uses ResNet as the base model, with a fully connected layer and a dropout layer added and with the cross-entropy loss function replaced by a focal loss function; the model is trained on the training set;
S44: after training, obtaining the classification results on the validation set and evaluating the classification performance of the model.
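The stepwise strategy of S41 can be read as a cascade of binary decisions. The sketch below assumes three binary classifiers (EBV versus non-EBV, MSI versus non-MSI, CIN versus GS) whose output probabilities are combined in order; the function name, the argument names, and the 0.5 decision threshold are illustrative assumptions, not details stated in the source.

```python
def classify_subtype(p_ebv, p_msi, p_cin, threshold=0.5):
    """Combine three binary predictions into one of the four molecular subtypes.

    p_ebv: predicted probability that the tumor is EBV-positive
    p_msi: predicted probability of microsatellite instability
    p_cin: predicted probability of CIN (as opposed to GS)
    """
    if p_ebv >= threshold:      # step 1: separate out EBV-type tumors
        return "EBV"
    if p_msi >= threshold:      # step 2: then MSI-type tumors
        return "MSI"
    # Step 3: the remaining tumors are split into CIN and GS.
    return "CIN" if p_cin >= threshold else "GS"

examples = [
    classify_subtype(0.9, 0.8, 0.7),
    classify_subtype(0.1, 0.8, 0.7),
    classify_subtype(0.1, 0.2, 0.7),
    classify_subtype(0.1, 0.2, 0.3),
]
```

The order matters: a tumor predicted EBV-positive is labeled EBV regardless of the later predictions, which mirrors the order of the classification strategy.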
Further, in the step S43, the Focal loss function is as follows:
$L_{fl} = -(1 - p_t)^{\gamma} \log(p_t)$
where $p_t$ reflects how close the prediction is to the true class y: a larger $p_t$ means the prediction is closer to class y, that is, the classification result is more accurate; $\gamma > 0$ is an adjustable factor.
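To make the effect of the factor γ concrete, the focal loss formula can be evaluated for a single sample as follows. This is a minimal per-sample sketch in plain Python rather than a training-ready loss; the optional weighting coefficient `alpha` and the clamping constant are assumptions added for illustration (the embodiment described later uses γ = 2 with a coefficient α = 0.6).

```python
import math

def cross_entropy(p_t):
    """Cross-entropy for one sample, written in terms of p_t: -log(p_t)."""
    return -math.log(p_t)

def focal_loss(p_t, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t)."""
    p_t = min(max(p_t, 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss(0.9)   # well-classified sample
hard = focal_loss(0.1)   # poorly classified sample
```

With γ = 2, the well-classified sample's loss is scaled by (1 - 0.9)^2 = 0.01 while the poorly classified sample keeps 81% of its cross-entropy loss, so training concentrates on the hard samples.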
Compared with the prior art, the invention has the following advantages: the method preprocesses and analyzes gastric cancer gene expression data, computes tumor immune microenvironment data from the gene expression data and analyzes it, processes the collected pathological images through screening, annotation, cutting, quality control, normalization, and feature extraction, and predicts the gastric cancer molecular subtype with a ResNet model trained with focal loss. Experimental results show that the method can accurately predict gastric cancer molecular subtypes from gastric cancer images, and it is expected to be applied to medical image recognition in the future.
Drawings
FIG. 1 is a flow chart of an illustrative method for predicting gastric cancer molecular subtypes using gastric cancer pathology images in an embodiment of the present invention;
FIG. 2a is a heat map of differential gene expression between EBV-type and non-EBV-type gastric cancers in an embodiment of the present invention;
FIG. 2b is a heat map of differential gene expression between MSI-type and non-MSI-type gastric cancers in an embodiment of the present invention;
FIG. 2c is a heat map of differential gene expression between CIN-type and non-CIN-type gastric cancers in an embodiment of the present invention;
FIG. 2d is a heat map of differential gene expression between GS-type and non-GS-type gastric cancers in an embodiment of the present invention;
FIG. 3 is a volcano plot of the differential genes of the gastric cancer molecular subtypes in an embodiment of the present invention;
FIG. 4 is a heat map of the differential gene enrichment analysis pathways in an embodiment of the present invention;
FIG. 5 is a violin plot of the differential analysis of the tumor immune microenvironment scores (Stromal Score, Immune Score, and Tumor Purity) across the four molecular subtypes in an embodiment of the present invention, wherein the P value represents the level of inter-group difference and a smaller P value indicates a more significant difference;
FIG. 6 is a box plot of immune cell content in a tumor immune microenvironment in an embodiment of the invention;
FIG. 7 is a box plot of differences between pathological image features and molecular subtypes in an embodiment of the invention;
FIG. 8 is a network diagram showing the correlation of pathological image features with immune cells in a tumor immune microenvironment in an embodiment of the invention;
FIG. 9 is a flow chart of pathology image preprocessing (including model thumbnail) in an embodiment of the present invention;
FIG. 10a is a five-fold cross-validation ROC curve for a model of EBV-type tumor classification in an embodiment of the invention;
FIG. 10b is a five-fold cross-validation ROC curve for a model of MSI-type tumor classification in an embodiment of the invention;
FIG. 10c is a five-fold cross-validation ROC curve for model CIN and GS-type tumor classifications in an embodiment of the invention.
Detailed Description
The following describes in detail the examples of the present invention, which are implemented on the premise of the technical solution of the present invention, and detailed embodiments and specific operation procedures are given, but the scope of protection of the present invention is not limited to the following examples.
As shown in fig. 1, this embodiment provides a technical solution: an interpretable method for predicting a subtype of gastric cancer molecules using a gastric cancer pathology image, comprising:
S1: gastric cancer gene expression data processing and analysis.
Gastric cancer mRNA gene expression data are collected from TCGA and matched with the four molecular subtype labels. Quality control is applied to the matched data, low-quality reads are filtered out, and the data are normalized to reduce the influence of technical differences on the results. The data are analyzed with statistical and computational biology methods; this example uses the DESeq2 differential expression analysis tool. Gene function analysis, including enrichment analysis, pathway analysis, or gene ontology analysis, is then performed on the differentially expressed genes.
S2: tumor immune microenvironment analysis.
According to the gene expression data obtained in step S1, the tumor immune microenvironment data of each corresponding patient are calculated with the CIBERSORT tool. The tumor immune microenvironment consists of many kinds of immune cells, including T cells (CD4+ and CD8+), B cells, plasma cells, natural killer (NK) cells, dendritic cells (DCs), macrophages, and the like. The degree and type of immune cell infiltration can show different characteristics in pathological images. For the different molecular subtypes, differential analysis is carried out on the content of the various immune cells.
S3: pathological image preprocessing, feature extraction, and analysis
The pathological images of the corresponding patients are collected from TCGA; the collected images are cut and normalized to obtain pathological image blocks for each patient, and the image block features are extracted and analyzed.
In this embodiment, the step S3 specifically includes the following steps:
S31: A pathologist pre-examines the collected pathological images, excludes blurred images that do not meet the requirements, and manually annotates the tumor regions of the images that do meet the requirements. The distribution of gastric cancer molecular subtype data in the pathological images in this example is shown in Table 1 below:
TABLE 1 distribution of gastric cancer molecular subtype data
S32: The tumor regions are segmented from the annotated pathological sections and cut into small slices of 512x512 pixels, with no overlap between slices. Quality control is then applied to the slices: some slices contain large blank or background areas, which greatly affect the accuracy of the model, so slices in which the blank or background area exceeds 30% are discarded. Because the collected pathological image samples may come from different hospitals with different staining standards, Macenko color normalization is applied uniformly to the slices to avoid any influence of the staining technique on the prediction.
S33: Features are extracted from the normalized image slices; this is feature extraction at the slice level. For the color features, the image is first transferred to the HSV color space, the color image is split into separate R, G, and B channels, the image is converted to a gray-scale image, and color features such as the mean and variance are then calculated over the R, G, B, and gray-scale spaces, respectively.
The texture features of the tumor tissue slices are then extracted. A 2D wavelet packet transform is first applied to each pathological image slice, yielding four sub-images:
an approximation sub-image (Low-Low), which contains the low-frequency information of the original image and represents its overall trend and smooth structure;
a horizontal detail sub-image (Low-High), which captures the high-frequency detail in the horizontal direction of the original image and represents horizontal texture and edge information;
a vertical detail sub-image (High-Low), which captures the high-frequency detail in the vertical direction of the original image and represents vertical texture and edge information;
a diagonal detail sub-image (High-High), which captures the high-frequency detail in the diagonal direction of the original image and represents diagonal texture and edge information.
Specifically, low-pass filtering the rows and columns of the original image gives the approximation sub-image; low-pass filtering the rows and high-pass filtering the columns gives the horizontal detail sub-image; high-pass filtering the rows and low-pass filtering the columns gives the vertical detail sub-image; and high-pass filtering the rows and columns gives the diagonal detail sub-image. The approximation, horizontal detail, vertical detail, and diagonal detail sub-images are then quantized into 16 gray levels, the images are standardized, and a window multi-scale co-occurrence matrix (WMCM) is constructed. Texture features are extracted based on the WMCM. The extracted texture features are:
Entropy: measures the uncertainty or randomness of the pixel value distribution in the image;
Contrast: quantifies the change in brightness between different parts of the image, reflecting the depth of the textures and edges in the image;
Regulation (uniformity): measures the smoothness or consistency of the pixel values across the image;
Correlation: evaluates the correlation between pixel values and their neighboring pixel values, indicating how well a pixel value can be predicted from the surrounding pixel values;
IDM (inverse difference moment): measures the local uniformity of the image;
DLMSE (local mean square error bias): calculates the variability or dispersion of the pixel values in local columns, indicating the uniformity of the vertical texture pattern;
GLMSE (global mean square error): measures the horizontal dispersion of the pixel values, reflecting the horizontal texture pattern;
DLA (local mean value bias): the deviation of the average pixel values in local columns;
GLA (global average): the average pixel values between rows, used to analyze the horizontal smoothness or uniformity of the pattern;
SGSDA (gray differential integration sum): the overall texture energy or intensity of the image;
SGBDA (gradient-based differential integration sum): integrates the gradient differences in the image, describing texture complexity and edge density and reflecting structure and texture changes from gradient information.
Differential analysis is then carried out between the molecular subtypes and the image features, and Spearman correlation analysis is carried out between the image features and the various immune cells in the tumor immune microenvironment to find the correlation between the two. Components of the tumor immune microenvironment are correlated with the image features, which indicates that tumors of different subtypes affect the tumor immune microenvironment, which in turn affects the image features.
S4: gastric cancer molecular subtype identification. The detailed scheme is as follows:
S41: A classification strategy is formulated according to the original definition of the gastric cancer molecular subtypes: first determine whether the tumor is Epstein-Barr virus positive and separate out EBV-type gastric cancer, then determine whether it shows microsatellite instability (MSI), and finally distinguish the genomically stable (GS) type from the chromosomal instability (CIN) type according to the degree of copy number variation (copy number variation refers to gains or losses of DNA segments in an individual's genomic DNA relative to a reference genome). This strategy also reflects how these subtypes respond to immunotherapy, and it is followed in the recognition of molecular subtypes.
S42: The data set is divided according to the formulated classification strategy using five-fold cross-validation: in each round, four folds are used for training and the remaining fold is held out for validation.
S43: ResNet is used as the base model; a fully connected layer is added, and a dropout layer is added to reduce the risk of overfitting. Given the data distribution, the cross-entropy loss function is replaced with a focal loss function. The cross-entropy loss function is:
$L_{ce} = -\left[ y \log(\hat{p}) + (1 - y) \log(1 - \hat{p}) \right]$
where $\hat{p}$ is the predicted probability and y is the label, which takes the value 0 or 1 in binary classification. Because the classes in this data set are imbalanced, the cross-entropy loss pays too little attention to hard-to-classify samples from the classes with less data. The loss function is therefore replaced by the focal loss function:
$L_{fl} = -(1 - p_t)^{\gamma} \log(p_t)$
where $p_t$ reflects how close the prediction is to the ground truth, class y (that is, $p_t = \hat{p}$ when y = 1 and $p_t = 1 - \hat{p}$ otherwise): a larger $p_t$ means the prediction is closer to class y, that is, the classification is more accurate; $\gamma > 0$ is an adjustable factor. In the implementation, γ = 2 and a coefficient α = 0.6 is placed in front of the loss function. Training loads ImageNet pre-trained weights and retrains on the data using the SGD optimizer, with the learning rate set to 0.001 and the momentum set to 0.9; the learning rate is reduced by a factor of 10 every 7 epochs. The batch size is set to 128 and training runs for 30 epochs.
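The learning-rate schedule described above (initial rate 0.001, reduced by a factor of 10 every 7 epochs, 30 epochs in total) can be written out as a short sketch. The function name and the convention that epochs are counted from 0 are assumptions; the source does not state which framework or scheduler was used.

```python
def learning_rate(epoch, base_lr=0.001, step_size=7, factor=0.1):
    """Step decay: multiply the base rate by `factor` once per `step_size` epochs."""
    return base_lr * factor ** (epoch // step_size)

# Learning rate for each of the 30 training epochs (epochs numbered 0..29).
schedule = [learning_rate(epoch) for epoch in range(30)]
```

Under these assumptions the rate is 1e-3 for epochs 0 to 6, 1e-4 for epochs 7 to 13, 1e-5 for epochs 14 to 20, 1e-6 for epochs 21 to 27, and 1e-7 for the last two epochs.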
In summary, the method for predicting gastric cancer molecular subtypes from gastric cancer pathological images in the above embodiment starts from the determined molecular subtypes and reveals the heterogeneity of the gastric cancer tumor microenvironment caused by gene expression. This heterogeneity of the tumor immune microenvironment is reflected in the gastric cancer pathological images, which explains why the pathological images can be used to predict the molecular subtype. The ResNet model trained with focal loss achieves high accuracy on the task of predicting molecular subtypes from gastric cancer pathological images, shows good robustness, and can accurately identify gastric cancer molecular subtypes.
While embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the invention, and that changes, modifications, substitutions, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the invention.

Claims (10)

1. A method for predicting stomach cancer molecular subtype by using stomach cancer pathological image, which is characterized by comprising the following steps:
S1: gastric cancer gene expression data processing and analysis
Obtaining gastric cancer gene expression data of patients from TCGA, matching each patient with the corresponding molecular subtype label, preprocessing the data as required for differential gene expression analysis, and carrying out differential analysis between the molecular subtype groups;
S2: comparative analysis of the tumor immune microenvironment
According to the gene expression data matched in step S1, calculating the tumor immune microenvironment data of each corresponding patient using the CIBERSORT tool, carrying out statistical analysis of the immune microenvironment, and comparing the differences between the molecular subtype groups;
S3: pathological image preprocessing, feature extraction, and analysis
Collecting the pathological images of the corresponding patients from TCGA, cutting and normalizing the collected images to obtain pathological image blocks for each patient, and extracting and analyzing the image block features;
S4: molecular subtype recognition
Designing and training a deep learning model for identifying gastric cancer molecular subtypes, obtaining the classification results on a validation set, and evaluating the classification performance of the model.
2. The method for predicting stomach cancer molecular subtype by using stomach cancer pathology image according to claim 1, wherein in the step S1, the specific procedure is as follows:
S11: obtaining mRNA gene expression data of gastric cancer patients from TCGA and matching them with gastric cancer molecular subtype labels;
S12: filtering the matched gene expression data to screen out low-quality data, normalizing the gene expression data, and carrying out statistical analysis of differential gene expression on the normalized data.
3. The method for predicting stomach cancer molecular subtype by using stomach cancer pathology image according to claim 1, wherein in the step S2, the specific procedure is as follows:
S21: according to the matched gene expression data, calculating the corresponding gastric cancer immune microenvironment data using the CIBERSORT tool;
S22: carrying out differential analysis between molecular subtype groups for each component of the immune microenvironment.
4. The method for predicting gastric cancer molecular subtype using gastric cancer pathology image according to claim 3, wherein in step S22, when performing the inter-group difference analysis, the fold change, that is, the ratio of the mean gene expression levels of the two sample groups, is calculated first, and LFC is then calculated by the formula $\mathrm{LFC} = \log_2(\text{fold change})$, an absolute value of LFC greater than 1 indicating a difference between the two groups; a test statistic is then calculated for each expression level to measure the difference between groups, and a significance P value is calculated from the t distribution to measure the significance of the difference, P < 0.05 being regarded as a significant difference between the groups.
5. The method for predicting stomach cancer molecular subtype by using stomach cancer pathology image according to claim 1, wherein in the step S3, the specific procedure is as follows:
S31: a pathologist pre-screens the collected pathological images, excludes blurred images that do not meet the requirements, and manually annotates the tumor regions of the images that meet the requirements;
S32: cropping the annotated pathological images by cutting each tumor region into non-overlapping slices of 512x512 pixels, performing quality control on the slices by discarding those in which blank areas or background areas exceed 30%, and applying Macenko color normalization to the remaining slices;
S33: extracting image features, including color features and texture features, from each normalized slice to obtain the image features of the corresponding pathological image, performing statistical analysis on the image features of the different molecular subtypes, and performing association analysis between the image features and the tumor immune microenvironment factors that may influence them.
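The slicing and quality control in step S32 might look like the following sketch. The gray-level cutoff used to call a pixel "blank" (220) and all function names are assumptions for illustration; the claim's separate background check and the Macenko normalization are not reproduced.

```python
def tile_origins(width, height, tile=512):
    """Top-left corners of non-overlapping tiles that fit entirely inside the region."""
    return [
        (x, y)
        for y in range(0, height - tile + 1, tile)
        for x in range(0, width - tile + 1, tile)
    ]


def blank_fraction(gray_tile, blank_threshold=220):
    """Fraction of pixels at or above blank_threshold (assumed 'blank' cutoff)."""
    pixels = [p for row in gray_tile for p in row]
    return sum(p >= blank_threshold for p in pixels) / len(pixels)


def keep_tile(gray_tile, max_blank=0.30):
    """Keep a tile unless more than max_blank of its pixels are blank."""
    return blank_fraction(gray_tile) <= max_blank


# A hypothetical 1300 x 1024 tumor region yields a 2 x 2 grid of full tiles.
origins = tile_origins(1300, 1024)
```

Partial tiles at the right and bottom edges are dropped in this sketch; whether the original method pads or discards them is not stated.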
6. The method for predicting stomach cancer molecular subtype using stomach cancer pathology image according to claim 5, wherein in the step S33, the color features are obtained as follows:
S3301: converting the pathological image slice into the HSV color space to obtain a color image slice;
S3302: splitting the color image into separate R, G, and B channels;
S3303: converting the image to a grayscale image, and calculating the mean and variance over the R, G, B, and grayscale spaces, respectively.
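A minimal sketch of the per-channel statistics in steps S3302 and S3303 follows. The grayscale conversion uses the ITU-R BT.601 luma weights, which is an assumption since the claim does not specify one; the function name and pixel values are hypothetical, and the HSV conversion of step S3301 is not shown.

```python
from statistics import fmean, pvariance


def color_features(pixels):
    """Mean and variance of R, G, B and gray for a list of (r, g, b) tuples."""
    channels = {
        "R": [p[0] for p in pixels],
        "G": [p[1] for p in pixels],
        "B": [p[2] for p in pixels],
        # Assumed gray conversion: ITU-R BT.601 luma weights.
        "gray": [0.299 * p[0] + 0.587 * p[1] + 0.114 * p[2] for p in pixels],
    }
    features = {}
    for name, values in channels.items():
        features[f"{name}_mean"] = fmean(values)
        features[f"{name}_var"] = pvariance(values)
    return features


# Hypothetical pixels of one slice, flattened into a list.
tile_pixels = [(200, 100, 150), (100, 100, 50), (150, 100, 100), (150, 100, 100)]
features = color_features(tile_pixels)
```

This yields eight color features per slice: a mean and a variance for each of the four channels.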
7. The method according to claim 6, wherein in the step S33, the texture features are obtained as follows:
S3311: first, performing a 2D wavelet packet transform on the pathological image slice to obtain four sub-images: an approximation sub-image, a horizontal detail sub-image, a vertical detail sub-image, and a diagonal detail sub-image;
S3312: uniformly quantizing the approximation sub-image, the horizontal detail sub-image, the vertical detail sub-image, and the diagonal detail sub-image into 16 gray levels, normalizing the images, and constructing a window multi-scale co-occurrence matrix;
S3313: extracting texture features based on the window multi-scale co-occurrence matrix, the extracted texture features being: entropy, contrast, regulation, correlation, IDM, DLMSE, GLMSE, DLA, GLA, SGSDA, and SGBDA.
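The quantization and co-occurrence steps can be sketched with a plain gray-level co-occurrence matrix. This is a simplification under stated assumptions: the wavelet packet transform and the "window multi-scale" construction of the claim are not reproduced, only three of the listed features (entropy, contrast, IDM) are computed with their usual textbook definitions, and all names are hypothetical.

```python
import math


def quantize(image, levels=16, max_value=255):
    """Map integer intensities in [0, max_value] to gray levels 0..levels-1."""
    return [
        [min(p * levels // (max_value + 1), levels - 1) for p in row]
        for row in image
    ]


def cooccurrence(q, levels=16, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy) >= 0."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(len(q) - dy):
        for x in range(len(q[0]) - dx):
            counts[q[y][x]][q[y + dy][x + dx]] += 1
            total += 1
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """Entropy, contrast and inverse difference moment of a normalized matrix."""
    entropy = contrast = idm = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            if v > 0:
                entropy -= v * math.log2(v)
            contrast += (i - j) ** 2 * v
            idm += v / (1 + (i - j) ** 2)
    return {"entropy": entropy, "contrast": contrast, "IDM": idm}


# Hypothetical 4 x 4 sub-image with a sharp vertical edge.
sub_image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]
features = texture_features(cooccurrence(quantize(sub_image)))
```

The sharp edge between gray levels 0 and 15 dominates the contrast value, while the two flat regions keep the IDM well above zero.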
8. The method for predicting stomach cancer molecular subtypes by using stomach cancer pathology image according to claim 7, wherein in the step S33, statistical analysis is performed on image features of different molecular subtypes, and correlation analysis is performed on the image features and tumor immune microenvironment which may affect the image features, specifically comprising the following steps:
S3321: performing difference tests and P value calculation between the subtype groups on all extracted image features using statistical analysis software;
S3322: performing statistical analysis, difference tests, and P value calculation on the immune cell content of the tumor immune microenvironment of the different subtypes;
S3323: performing Spearman correlation analysis between the image features and the immune cell content of the tumor immune microenvironment using statistical analysis software, and calculating the correlation coefficients.
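The Spearman analysis in step S3323 amounts to a Pearson correlation computed on ranks. The claim relies on statistical analysis software, so the pure-Python version below is only an illustrative sketch; the variable names and values are hypothetical.

```python
from statistics import fmean


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = fmean(rx), fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical per-patient values: one texture feature and one immune cell fraction.
contrast_feature = [0.8, 1.9, 3.1, 4.2, 5.0]
immune_cell_fraction = [0.05, 0.07, 0.12, 0.20, 0.31]
rho = spearman(contrast_feature, immune_cell_fraction)
```

Because both toy series increase together, the coefficient is close to 1; a strictly reversed series would give a value close to -1.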
9. The method for predicting stomach cancer molecular subtype using stomach cancer pathology image according to claim 8, wherein in the step S4, the specific procedure is as follows:
S41: formulating a classification strategy according to the original definition of gastric cancer molecular subtypes, wherein the strategy first determines whether the gastric cancer is Epstein-Barr virus (EBV) positive and separates out the EBV-type gastric cancer, then determines whether the gastric cancer shows microsatellite instability, and finally distinguishes the genomically stable type from the chromosomal instability type according to the degree of copy number variation;
S42: dividing the data set into a training set and a validation set in a set proportion according to the established classification strategy;
S43: the deep learning model uses ResNet as the base model, with a fully connected layer and a dropout layer added and the cross-entropy loss function replaced by a focal loss function, and is trained on the training set;
S44: after training, obtaining the classification results on the validation set and evaluating the classification performance of the model.
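The order of decisions in step S41 and the split in step S42 can be sketched as follows. The copy number variation score, its 0.5 threshold, the 80/20 split ratio, and the function names are all assumptions for illustration; the claims leave these values unspecified.

```python
import random


def assign_subtype(ebv_positive, msi_high, cnv_score, cnv_threshold=0.5):
    """Cascade from step S41: EBV first, then MSI, then CIN versus GS."""
    if ebv_positive:
        return "EBV"
    if msi_high:
        return "MSI"
    # High copy number variation -> chromosomal instability; low -> genomically stable.
    return "CIN" if cnv_score >= cnv_threshold else "GS"


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle reproducibly and split into training and validation sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical patient identifiers.
train_ids, val_ids = split_dataset(range(10))
```

Because the cascade checks EBV status first, a sample that is both EBV positive and microsatellite unstable is labeled EBV, which matches the ordering stated in the claim.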
10. The method of predicting gastric cancer molecular subtype using gastric cancer pathology images according to claim 9, characterized in that in step S43 the Focal loss function is as follows:
$L_{fl} = -(1 - p_t)^{\gamma} \log(p_t)$
where $p_t$ reflects the proximity to class $y$: a larger $p_t$ indicates a closer proximity to class $y$, i.e., a more accurate classification result, and $\gamma > 0$ is an adjustable factor.
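A per-sample version of this loss can be written in a few lines. The default value of 2.0 for the adjustable factor is an assumption (the claim only requires it to be positive), and the function name is hypothetical.

```python
import math


def focal_loss(p_t, gamma=2.0):
    """Focal loss for one sample; p_t is the predicted probability of the true class."""
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


easy = focal_loss(0.9)  # well-classified sample, strongly down-weighted
hard = focal_loss(0.1)  # poorly classified sample, keeps most of its loss
cross_entropy = focal_loss(0.9, gamma=0.0)  # gamma = 0 reduces to cross-entropy
```

The factor $(1 - p_t)^{\gamma}$ shrinks the contribution of samples the model already classifies well, so training concentrates on hard samples, which is useful when subtype classes are imbalanced.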
CN202410366139.1A 2024-03-28 2024-03-28 Method for predicting stomach cancer molecular subtype by using stomach cancer pathological image Pending CN118197603A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410366139.1A CN118197603A (en) 2024-03-28 2024-03-28 Method for predicting stomach cancer molecular subtype by using stomach cancer pathological image

Publications (1)

Publication Number Publication Date
CN118197603A true CN118197603A (en) 2024-06-14

Family

ID=91397773

Country Status (1)

Country Link
CN (1) CN118197603A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination