CN115132275B - Method for predicting EGFR gene mutation state based on end-to-end three-dimensional convolutional neural network - Google Patents

Publication number
CN115132275B
Authority
CN
China
Legal status
Active
Application number
CN202210583718.2A
Other languages
Chinese (zh)
Other versions
CN115132275A (en)
Inventor
赵世杰
刘卓岩
韩军伟
Current Assignee
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Application filed by Northwestern Polytechnical University
Priority to CN202210583718.2A
Publication of CN115132275A
Application granted
Publication of CN115132275B

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/50 Mutagenesis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30061 Lung

Abstract

The invention relates to a method for predicting EGFR gene mutation status based on an end-to-end three-dimensional convolutional neural network. The method performs end-to-end global feature extraction on the lungs with a densely connected three-dimensional convolutional structure and captures local lung nodule features with a proposed multi-scale dilated asymmetric module. At the local scale, the multi-scale dilated asymmetric module handles lung nodule features of different sizes, directions and angles, searching every detail of the input three-dimensional image from small to large through dense connections. At the global scale, the dense network combines fine-grained features from different stages and scales of the target to extract comprehensive features of the lung, and feature transformation and channel fusion are used to extract deeper lung information. Finally, the prediction of the model is obtained through a fully connected layer and an activation function layer.

Description

Method for predicting EGFR gene mutation state based on end-to-end three-dimensional convolutional neural network
Technical Field
The invention belongs to the field of computer vision and relates to a method for predicting EGFR gene mutation status based on an end-to-end three-dimensional convolutional neural network, in particular to a method that uses a neural network architecture to extract features from CT images and predict gene status.
Background
Currently, the common methods for diagnosing EGFR mutation status are tissue biopsy and liquid biopsy, but both have limitations: because of tumor heterogeneity, a biopsy may suffer from sampling bias; the patient must be fit for an invasive procedure; and a biopsy may increase the potential risk of cancer metastasis. In addition, tissue biopsy may fail because of poor tissue quality and is relatively costly; liquid biopsy tests peripheral blood instead of tumor tissue, but may suffer from low ctDNA concentration or from ctDNA that cannot be enriched.
With the development of deep learning, neural-network-based models have also appeared, but most of them rely on accurate annotation of tumor boundaries by experienced physicians or radiologists, which is time-consuming and laborious and may introduce subjective errors into the annotations. Some recent methods relax the annotation requirements, but a radiologist still has to roughly locate the lung nodules. Most importantly, the extracted features come only from the interior and the annotated edge of the nodule; other important information, including the relative location of the tumor, the size of the tumor, and interactions between different lung regions, is ignored.
In short, existing methods consider only the interior and edge of the nodule, depend on expert annotations, and cannot exploit all the useful information in the lung.
Disclosure of Invention
Technical problem to be solved
To overcome the shortcomings of the prior art, the invention provides a method for predicting EGFR gene mutation status based on an end-to-end three-dimensional convolutional neural network. It combines multi-scale dilated convolution with an asymmetric three-dimensional convolution module to extract complete lung CT features of a lung adenocarcinoma patient, transforms and recombines these features in a high-dimensional space, and predicts the patient's EGFR gene mutation status from complete and effective lung information.
Technical solution
A method for predicting EGFR gene mutation status based on an end-to-end three-dimensional convolutional neural network, characterized by comprising the following steps:
Step 1: preprocess the original three-dimensional lung CT image by clearing the content outside the lung region and scaling the CT image to a uniform size of 112 × 112 × 90 by spline interpolation;
Step 1a: segment the lungs with a U-Net model and set the region outside the lungs to zero;
Step 1b: obtain the preprocessing result by spline interpolation, namely an image of size 112 × 112 × 90 with the lungs centered in the field of view, so that every slice along the X, Y and Z axes cuts through the lung and the detailed features of the lung are preserved;
Step 2: build the neural network structure, using a densely connected multi-layer convolutional neural network to extract the overall features of the target and fuse multi-scale features;
Step 2a: based on the Dense Block, build a four-layer densely connected convolutional neural network in which the output of each layer is concatenated along the channel dimension and serves as the input of the next convolution layer;
Step 2b: during layer-to-layer connection, introduce a bottleneck module for cross-channel feature fusion and dimension reduction; specifically, batch normalization standardizes the feature distribution to mean 0 and variance 1, and a rectified linear function followed by a 1 × 1 convolution reduces the dimension;
Step 2c: embed a multi-scale multi-dilation asymmetric convolution module in the baseline model to capture local fine-grained features, focusing on lung nodules that appear in the lungs of lung cancer patients in different directions, sizes and angles;
Step 3: train the neural network structure end to end on a three-dimensional lung CT image data set with a cross-entropy loss function, using stochastic gradient descent (SGD) with momentum 0.9 as the optimizer;
during training, the batch size is set to 6, the number of training iterations is set to 300, and all learning rates are set to 0.01; EGFR mutant is defined as 1 and EGFR wild type as 0, so the closer the model output is to 1, the more likely the sample is judged EGFR mutant; the weight decay (the L2 regularization coefficient) is set to 0.0004 to prevent overfitting;
Step 4: feed three-dimensional lung CT image data into the trained end-to-end three-dimensional convolutional neural network; the input is the whole-lung CT image and the output is the prediction result, namely the predicted EGFR gene mutation status.
Advantageous effects
The invention provides a method for predicting EGFR gene mutation status based on an end-to-end three-dimensional convolutional neural network. The method performs end-to-end global feature extraction on the lungs with a densely connected three-dimensional convolutional structure and captures local lung nodule features with a proposed multi-scale dilated asymmetric module. At the local scale, the multi-scale dilated asymmetric module handles lung nodule features of different sizes, directions and angles, searching every detail of the input three-dimensional image from small to large through dense connections. At the global scale, the dense network combines fine-grained features from different stages and scales of the target to extract comprehensive features of the lung, and feature transformation and channel fusion are used to extract deeper lung information. Finally, the prediction of the model is obtained through a fully connected layer and an activation function layer.
Specifically, the proposed method makes the following contributions. First, it is the first method in the field of EGFR mutation status prediction that requires no pre-annotation step, which greatly reduces the burden on physicians and avoids introducing human error. The model learns features directly from the complete three-dimensional CT image and predicts the EGFR mutation status of a lung adenocarcinoma patient end to end. Second, the invention is the first to predict EGFR gene mutation status from both intact lungs as three-dimensional input, and the experiments indicate that EGFR mutation status is reflected not only in the lung nodules but in the patient's whole lung. Third, the proposed model is built from densely connected modules composed of three-dimensional asymmetric convolution and three-dimensional multi-dilation dense convolution. The asymmetric convolution lets the network capture lung nodule information in different directions, and the multi-dilation blocks enlarge the receptive field of the model without losing resolution, capturing multi-scale context information from the CT image and further improving predictive performance.
Using a neural network, the invention predicts the EGFR gene mutation status of lung adenocarcinoma patients end to end from three-dimensional CT images, and has the following advantages in the field of EGFR mutation status prediction:
The invention proposes a deep learning model, a three-dimensional densely connected asymmetric convolution and multi-dilation dense network, for noninvasive prediction of the EGFR mutation status of lung adenocarcinoma patients. It also puts forward a new view of EGFR mutation status: information about EGFR status is present in both intact lungs, not just in the lung nodules. Because a deep learning model requires input data of consistent dimensions, the invention also proposes a way to handle CT images of inconsistent dimensions.
Compared with traditional EGFR mutation status prediction models, the proposed method requires neither edge annotation of nodules by a professional radiologist nor any rough localization, so it applies to a wider range of scenarios. It also combines a global feature processor for the target with a local feature extractor for lung nodules, both embedded in the three-dimensional prediction network. By decomposing the large-range search problem into small-range feature extraction, it addresses the difficulty of accurately capturing local features of the target in three-dimensional prediction, and the proposed three-dimensional model can be trained end to end.
Drawings
Figure 1 shows a flow chart of the method of the invention. The method aims at noninvasive, rapid and accurate prediction: it extracts features directly from the complete three-dimensional CT image of the input target and predicts the EGFR gene mutation status with the proposed end-to-end convolutional neural network.
Figure 2 compares the method of the invention with conventional methods. The proposed 3DDADD Net framework is an end-to-end prediction network without branches that predicts the EGFR mutation status of lung adenocarcinoma patients directly from complete lung features. It takes a three-dimensional CT image as input, abandoning the two-dimensional lung nodule slice sequences or two-dimensional CT sequences used by conventional methods, which avoids time-consuming and laborious manual annotation and the loss of contextual information about lung features. On top of the baseline densely connected network, the method adds a local feature extractor that captures nodules in different directions and multi-scale context, and handles detailed features well.
Figure 3 shows the model framework of the invention. The input of the model is the complete CT slices of a lung adenocarcinoma patient, and the input of each convolution layer is the concatenation of the outputs of the preceding layers. The connectivity of the dense network lets each layer reuse all features learned by earlier layers without relearning them. Each layer consists of a multi-dilation asymmetric convolution module, which improves on the expressive power of standard convolution. The multi-dilation convolution places different dilation factors within a single layer to model different resolutions and avoids the aliasing problem of dense connections. The three-dimensional asymmetric convolution widens the feature extraction path of the model and improves its robustness to certain transformations of lung nodules, such as flipping and rotation.
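The dense connectivity described for Figure 3 can be illustrated with simple channel bookkeeping. The sketch below is only an illustration, not the patented implementation: the initial channel count and the growth rate (`growth_rate`) are hypothetical values chosen for the example, and only the four-layer depth comes from the text.

```python
def dense_block_channels(in_channels, growth_rate, num_layers=4):
    """Return the input channel count seen by each layer of a dense block.

    Each layer receives the concatenation (along the channel axis) of the
    block input and the outputs of all earlier layers.
    """
    inputs_per_layer = []
    channels = in_channels
    for _ in range(num_layers):
        inputs_per_layer.append(channels)
        channels += growth_rate  # this layer's output joins the concatenation
    return inputs_per_layer, channels


# Hypothetical sizes: 16 input channels, 8 new channels per layer.
per_layer, out_channels = dense_block_channels(16, 8)
```

The input width grows linearly with depth, which is why a channel-reducing step between layers becomes useful.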
Detailed Description
The invention is further described below with reference to the embodiments and the drawings.
A method for obtaining the EGFR gene mutation status of a lung adenocarcinoma patient from a three-dimensional lung CT image, comprising the following steps:
Step 1: Preprocess the original CT image. To make the model focus on the patient's lungs, the content outside the lung region is cleared and the CT volume is scaled to a uniform size of 112 × 112 × 90 by spline interpolation.
Step 1a: Feed the CT volume into a U-Net model to segment the lungs automatically, and set the region outside the lungs to zero. This makes the model focus on the lungs during computation and reduces interference from outside the lung region.
Step 1b: Obtain the final preprocessing result, an image of size 112 × 112 × 90, by spline interpolation. Every slice along the X, Y and Z axes cuts through the lung, so the detailed features of the lung are preserved as far as possible and the available hardware is used as fully as possible.
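The zeroing in step 1a can be pictured as an element-wise mask. The sketch below is a minimal illustration in plain Python, assuming a segmentation model has already produced a binary lung mask of the same shape as the CT volume. The function name `apply_lung_mask` and the nested-list layout are hypothetical, and neither the U-Net nor the spline resampling is shown.

```python
def apply_lung_mask(volume, mask):
    """Zero every voxel that lies outside the lung mask.

    volume, mask: nested lists indexed [z][y][x]; mask entries are 0 or 1.
    """
    return [
        [
            [value if inside else 0 for value, inside in zip(row_v, row_m)]
            for row_v, row_m in zip(plane_v, plane_m)
        ]
        for plane_v, plane_m in zip(volume, mask)
    ]


# Toy 1 x 2 x 2 volume; the values are made up for illustration.
toy_volume = [[[5, 7], [9, 11]]]
toy_mask = [[[0, 1], [1, 0]]]
masked = apply_lung_mask(toy_volume, toy_mask)
```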
Step 2: Build the neural network structure.
Step 2a: In general, the overall three-dimensional features of a target reflect its global spatial structure. The proposed method builds a four-layer densely connected convolutional neural network based on the Dense Block, concatenating the output of each layer along the channel dimension as the input of the next convolution layer. This strengthens feature propagation, makes better use of the features, and effectively alleviates the vanishing gradient problem.
Step 2b: During layer-to-layer connection, a bottleneck module for cross-channel feature fusion and dimension reduction is introduced, which effectively reduces the computation of the model. Specifically, batch normalization standardizes the feature distribution to mean 0 and variance 1, and a rectified linear function followed by a 1 × 1 convolution then reduces the dimension.
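The bottleneck in step 2b chains three operations: standardize, rectify, and mix channels with a 1 × 1 convolution. The following sketch shows that chain on a toy feature map stored as a list of voxels, each holding a list of channel values. It assumes batch normalization without learned scale and shift, and the weights are hypothetical numbers, so it illustrates the idea rather than the patent's trained layer.

```python
import math


def batch_norm(features, eps=1e-5):
    """Standardize each channel over all voxels (no learned scale or shift)."""
    n = len(features)
    out = [list(voxel) for voxel in features]
    for c in range(len(features[0])):
        column = [voxel[c] for voxel in features]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        for i in range(n):
            out[i][c] = (features[i][c] - mean) / math.sqrt(var + eps)
    return out


def relu(features):
    """Rectified linear function applied element-wise."""
    return [[max(0.0, x) for x in voxel] for voxel in features]


def pointwise_conv(features, weights):
    """1 x 1 convolution: each output channel is a weighted sum of input channels."""
    return [[sum(w * x for w, x in zip(row, voxel)) for row in weights]
            for voxel in features]


features = [[1.0, 4.0, 0.0, 2.0], [3.0, 0.0, 2.0, 2.0]]  # 2 voxels, 4 channels
weights = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]   # hypothetical 4 -> 2 mix
normed = batch_norm(features)
reduced = pointwise_conv(relu(normed), weights)
```

Because the 1 × 1 convolution has fewer output rows than input channels, it shrinks the channel dimension without touching the spatial layout.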
Step 2c: Build a multi-scale multi-dilation asymmetric convolution module and embed it in the baseline model to capture local fine-grained features, focusing on lung nodules that appear in the lungs of lung cancer patients in different directions, sizes and angles. Three-dimensional dilated convolutions with dilation factors 1, 2 and 4 are densely connected and combined with the three-dimensional asymmetric convolution module, which improves the extraction of three-dimensional features at different scales and enables an exhaustive search for, and precise localization of, nodules in the lung. The module appears in every layer of the baseline network, improving the network's handling of small features and the predictive performance of the model.
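How much context the factors 1, 2 and 4 cover can be worked out with the standard receptive-field arithmetic for stride-1 dilated convolutions. The sketch below assumes a kernel extent of 3 along each axis and considers the longest path through the three layers in sequence; the helper names are hypothetical.

```python
def effective_kernel(kernel_size, dilation):
    """Extent along one axis that a dilated kernel spans."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stride-1 dilated convolutions in sequence."""
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


spans = [effective_kernel(3, d) for d in (1, 2, 4)]
field = stacked_receptive_field(3, (1, 2, 4))
```

Three plain 3-wide convolutions would see 7 voxels along an axis, while the dilated stack sees 15 with the same number of weights.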
Step 3: The data set comes from the Affiliated Hospital of Zunyi Medical University and contains 173 lung adenocarcinoma patient samples, 119 EGFR mutant and 54 EGFR wild type. Experiments use five-fold cross-validation in order to obtain as much useful information as possible from limited data, reduce overfitting, and better evaluate the predictive performance and generalization ability of the model. During training, the batch size is set to 6, the number of training iterations is set to 300, and all learning rates are set to 0.01. EGFR mutant is defined as 1 and EGFR wild type as 0, so the closer the model output is to 1, the more likely the sample is judged EGFR mutant. The network is trained with a cross-entropy loss function, using stochastic gradient descent (SGD) with momentum 0.9 as the optimizer. The weight decay (the L2 regularization coefficient) is set to 0.0004 to prevent overfitting.
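A five-fold split of the 119 mutant and 54 wild-type samples could look like the sketch below. The text does not say whether the folds were stratified, so stratification, the fixed seed, and the function name `stratified_folds` are all assumptions made for illustration.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds, spreading each class evenly."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


labels = [1] * 119 + [0] * 54  # 1 = EGFR mutant, 0 = EGFR wild type
folds = stratified_folds(labels)
fold_sizes = [len(fold) for fold in folds]
```

Each fold serves once as the test set while the remaining four are used for training.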
Step 4: Train the model according to the data set and the experimental parameters. Note that the model is trained end to end throughout: the input is a CT image of the whole lung and the output is the prediction result.
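The optimizer settings quoted above (learning rate 0.01, momentum 0.9, weight decay 0.0004) can be made concrete with a single update step. The sketch below uses one common formulation, in which the L2 term is added to the gradient before the momentum update; deep learning frameworks differ in the exact form, so this is an assumption and not necessarily what the authors' code does.

```python
def sgd_step(weights, grads, velocity, lr=0.01, momentum=0.9, weight_decay=0.0004):
    """One SGD-with-momentum update with an L2 penalty folded into the gradient."""
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w   # L2 regularization contributes weight_decay * w
        v = momentum * v + g       # accumulate the momentum buffer
        new_velocity.append(v)
        new_weights.append(w - lr * v)
    return new_weights, new_velocity


# Toy parameter and gradient, made up for illustration.
w1, v1 = sgd_step([1.0], [0.5], [0.0])
w2, v2 = sgd_step(w1, [0.5], v1)
```

With a constant gradient the momentum buffer grows from step to step, so the second update moves the weight further than the first.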
Specific examples:
The invention is further illustrated below with reference to the drawings and embodiments; the invention includes but is not limited to the following embodiments.
The hardware environment for the implementation is an Intel Xeon E5-2600 v3 @ 2.6 GHz 8-core CPU, 64 GB of memory, and a GeForce GTX TITAN X GPU. The software environment is a 64-bit Linux 16.04 operating system. The method is implemented with Python 3.6.7 and TensorFlow 2.1.0.
Step 1: a dataset is constructed.
Step 1a: a predicted image dataset is constructed. After approval by the university of medical science ethics committee of Guizhou, the proposed method retrospectively analyzes data of pathologically diagnosed lung adenocarcinoma patients in the affiliated hospitals of medical university of medical science (Guizhou, china) from 10 months in 2018 to 11 months in 2020.
The inclusion criteria followed by the proposed method when collecting patient data are as follows: (1) The patient is proved to be a primary lung adenocarcinoma tumor by CT and histological examination; (2) the patient has EGFR gene mutation detection results; (3) The CT image data of the tumor pathological specimen (4) is complete within 1 month. The exclusion criteria followed were as follows: (1) Patients received anti-tumor therapy (radiation, chemotherapy or radiation) prior to surgery; (2) a post-operative CT imaging interval greater than 1 month; (3) CT images are difficult to identify tumor boundaries (e.g., tumor is located in the hilum of the lung or lesions are combined with atelectasis); (4) lung tumor diameters less than 1cm or CT image artifacts; 5) CT images are of poor quality, affecting segmentation and feature extraction.
After screening, 173 patients met the criteria, 75 men, 98 women, age 31-79 years, average (58±10 years). Of these patients, there were 57 smokers and 116 non-smokers. Clinical staging distribution: 80 cases in stage I, 9 cases in stage II, 21 cases in stage III, and 63 cases in stage IV. And compared with the prior deep learning method, two radiologists with extensive diagnostic experience of 5 years and 10 years respectively have also collected nodule annotations on CT images of each patient when predicting using the ROI images. Different problems in the labeling process are solved through negotiation of two doctors, 50 cases of data are randomly extracted for secondary labeling, and consistency among observers is evaluated.
Step 1b: in the preprocessing stage of CT images, in order to make efficient use of the limited device memory and make the model more focused on the lungs, the proposed method eliminates the blank positions in the three axial directions, leaving only the part of the volume containing the lungs, which is then adjusted to the size 112 x 90 by spline interpolation.
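Removing the empty margins amounts to cropping the volume to the bounding box of the lung mask and then computing per-axis scale factors for the resize. The sketch below covers those two pieces in plain Python. The helper names are hypothetical, the (depth, height, width) ordering of the target size is an assumption, and the spline resampling itself is left to a library routine (SciPy's `scipy.ndimage.zoom` is one option, though the text does not name the tool used).

```python
def lung_bounding_box(mask):
    """Half-open (start, stop) ranges per axis enclosing all nonzero mask voxels."""
    zs, ys, xs = [], [], []
    for z, plane in enumerate(mask):
        for y, row in enumerate(plane):
            for x, inside in enumerate(row):
                if inside:
                    zs.append(z)
                    ys.append(y)
                    xs.append(x)
    if not zs:
        raise ValueError("mask contains no lung voxels")
    return ((min(zs), max(zs) + 1), (min(ys), max(ys) + 1), (min(xs), max(xs) + 1))


def crop(volume, box):
    """Cut the nested-list volume [z][y][x] down to the bounding box."""
    (z0, z1), (y0, y1), (x0, x1) = box
    return [[row[x0:x1] for row in plane[y0:y1]] for plane in volume[z0:z1]]


def zoom_factors(shape, target):
    """Per-axis scale factors that map `shape` onto `target`."""
    return tuple(t / s for t, s in zip(target, shape))


# Toy 3 x 3 x 3 volume whose voxel value encodes its own coordinates.
volume = [[[100 * z + 10 * y + x for x in range(3)] for y in range(3)] for z in range(3)]
mask = [[[0] * 3 for _ in range(3)] for _ in range(3)]
mask[1][1][1] = 1
mask[2][1][2] = 1
box = lung_bounding_box(mask)
cropped = crop(volume, box)
shape = (len(cropped), len(cropped[0]), len(cropped[0][0]))
factors = zoom_factors(shape, (90, 112, 112))
```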
Step 2: Construct the network structure.
Step 2a: Construct the three-dimensional densely connected baseline network. The network adopts a DenseNet architecture and maps the input three-dimensional CT image to a vector representing the EGFR mutation status prediction. The first convolution layer uses a convolution kernel of size 7 × 7 with dilation coefficient 7 and stride 1, followed by an ELU activation function, and converts the 90 × 112 × 112 input feature map to size 80 × 112 × 112. From the second convolution layer onward, densely connected multi-scale multi-dilation asymmetric convolution modules are used; their outputs are densely connected and serve as the input of the next layer. Finally, the output passes through an average pooling layer followed by a fully connected layer to give the gene prediction result.
Step 2b: Construct the multi-dilation asymmetric convolution module for capturing local features. The feature map first passes through a three-dimensional asymmetric convolution layer whose convolution kernels take four forms; the feature map is processed by all four kernels simultaneously, and the layer output is the sum of the four results. The output is then normalized by a BatchNorm3d layer and an ELU layer to keep its distribution consistent and suppress overfitting. Next comes the multi-dilation convolution layer: three densely connected convolution layers with dilation coefficients 1, 2 and 4, each with 3 × 3 convolution kernels. The output is again normalized by a BatchNorm3d layer and an ELU layer. Finally, a transition layer halves the number of channels of the output feature map, and a max pooling layer reduces the size of the feature map.
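For the dilated layers to be densely connected, their outputs must keep the same spatial size, which fixes the padding for each dilation coefficient. The sketch below applies the usual convolution output-size formula. The padding rule and the pooling window of 2 with stride 2 are assumptions for illustration; the text does not state either.

```python
def conv_output_size(size, kernel, dilation=1, padding=0, stride=1):
    """Output length along one axis for a convolution or pooling window."""
    effective = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective) // stride + 1


def same_padding(kernel, dilation):
    """Padding that preserves the size of a stride-1 convolution with an odd kernel."""
    return dilation * (kernel - 1) // 2


# Kernel extent 3 with dilation coefficients 1, 2 and 4 on an axis of length 112.
paddings = [same_padding(3, d) for d in (1, 2, 4)]
sizes = [conv_output_size(112, 3, d, same_padding(3, d)) for d in (1, 2, 4)]

# Hypothetical 2-wide max pooling with stride 2 after the transition layer.
pooled = conv_output_size(112, 2, stride=2)
```

The padding grows with the dilation coefficient so that all three outputs can be concatenated along the channel axis.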
Step 3: Train the three-dimensional end-to-end EGFR gene mutation status prediction model. The network uses the lung adenocarcinoma CT data set provided by Zunyi Medical University, as defined in step 1b. The network parameters are optimized with an SGD optimizer; the initial learning rate is set to 0.01, and after every 150 epochs of training the learning rate is reduced to 10% of its previous value.
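The step decay described here can be written as a one-line schedule. The sketch below is a minimal illustration with a hypothetical function name; epochs are assumed to be counted from zero.

```python
def learning_rate(epoch, base_lr=0.01, decay=0.1, step=150):
    """Step schedule: multiply the rate by `decay` once every `step` epochs."""
    return base_lr * decay ** (epoch // step)


# Rates at the start, just before the drop, and after it.
schedule = [learning_rate(e) for e in (0, 149, 150, 299)]
```

Over a 300-epoch run this schedule produces exactly one drop, at epoch 150.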
Step 3a: Given a three-dimensional input CT image, the first feature extraction layer encodes it into a feature map of size 80 × 112 × 112.
Step 3b: Feed the feature map from step 3a into the model to obtain a feature vector of size 1 × 2, which is the prediction of the model.
Step 3c: Use cross-entropy as the loss function, taking as inputs the tissue biopsy result of the lung adenocarcinoma patient given by the database and the EGFR gene mutation status predicted by the model. The neural network is trained with the backpropagation algorithm.
Step 3d: Decide whether to stop training. If training continues, return to step 3a; if training stops, go to step 4.
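The cross-entropy loss of step 3c, applied to a two-element output, can be sketched as follows. The example assumes the network emits two raw scores (logits) that a softmax turns into probabilities, and that the index of the true class is 1 for EGFR mutant and 0 for wild type. Which output position corresponds to which class is an assumption here.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[label])


probs = softmax([2.0, -1.0])
uninformed = cross_entropy([0.0, 0.0], 1)      # equal scores for both classes
confident_right = cross_entropy([0.0, 3.0], 1)
confident_wrong = cross_entropy([0.0, 3.0], 0)
```

The loss is small when the true class receives most of the probability and grows quickly when the model is confidently wrong, which is the signal backpropagation uses.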
Step 4: Use the trained three-dimensional prediction network as a detector to predict the EGFR gene mutation status of targets in the test set.
Step 4a: Input a three-dimensional CT image into the EGFR gene mutation status prediction network trained in step 3. The output is a feature vector of size 1 × 2, which is the prediction of the model and represents the probabilities of the mutant type and the wild type respectively.
Step 5: With five rounds of training and testing under the 5-fold cross-validation method described in "review of cross-validation methods in model selection" (Fan Yongdong et al., 2013), the proposed method achieved an AUC of 0.79 and an ACC of 0.77 on the Zunyi Medical University CT data set of 173 lung adenocarcinoma patients.
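For readers unfamiliar with the two reported metrics, the sketch below computes accuracy and AUC from labels and predicted scores. The data are a made-up toy example, not results from the patent, and the 0.5 decision threshold is an assumption. AUC is computed from its rank interpretation: the probability that a randomly chosen positive scores higher than a randomly chosen negative.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute AUC")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy labels (1 = mutant, 0 = wild type) and scores, made up for illustration.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.2]
toy_acc = accuracy(toy_labels, toy_scores)
toy_auc = auc(toy_labels, toy_scores)
```

Under cross-validation, such metrics would typically be computed per test fold and then averaged.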

Claims (1)

1. The method for predicting EGFR gene mutation state based on the end-to-end three-dimensional convolutional neural network is characterized by comprising the following steps:
step 1: preprocessing an original three-dimensional lung CT image, emptying the content outside a lung area in the CT image, and scaling the CT image to a uniform size by a spline interpolation mode, namely 112X 90;
step 1a: segmenting the lung through a U-Net model, and setting the region outside the lung region to be zero;
step 1b: obtaining a preprocessing result, namely an image with the size of 112 multiplied by 90 and positioned in the center of a visual field by a spline interpolation mode, ensuring that each slice in the X-axis, Y-axis and Z-axis directions can be cut into the lung, and reserving the detail characteristics of the lung;
step 2: building the neural network structure: a densely connected multi-layer convolutional neural network is adopted to extract the global features of the target and to fuse multi-scale features;
step 2a: based on the Dense Block, building a four-layer densely connected convolutional neural network in which the outputs of the layers are concatenated along the channel dimension and serve as the input to the next convolutional layer;
step 2b: introducing, in the layer-to-layer connections, a bottleneck module for cross-channel feature fusion and dimensionality reduction; specifically, batch normalization brings the feature distribution to a normal distribution with a mean of 0 and a variance of 1, and a rectified linear unit (ReLU) followed by a 1 × 1 convolution achieves the dimensionality reduction;
step 2c: embedding a multi-scale, multi-dilation-rate asymmetric dilated convolution module into the baseline model to capture local fine-grained features, focusing on lung nodules that appear in the lungs of lung cancer patients in different orientations, sizes and angles;
step 3: training the neural network structure in an end-to-end manner, namely training it on a three-dimensional lung CT image dataset with a cross-entropy loss function, using stochastic gradient descent (SGD) with a momentum of 0.9 as the optimizer of the model;
in the training process, the batch size is set to 6, the number of training iterations of the model is set to 300, and all learning rates are set to 0.01; EGFR mutant is defined as 1 and EGFR wild type is defined as 0, i.e., the closer the model output is to 1, the more likely the sample is judged to be EGFR mutant; the weight decay (the L2 regularization coefficient) is set to 0.0004 to prevent overfitting;
step 4: inputting the three-dimensional lung CT image data into the trained end-to-end three-dimensional convolutional neural network, wherein the input is a whole-lung CT image and the output is the prediction result, namely the predicted EGFR gene mutation state.
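The optimizer settings in step 3 of the claim (learning rate 0.01, momentum 0.9, weight decay 0.0004) can be illustrated with a single parameter update. This is a sketch under an assumption the claim does not state: the update follows the convention common in deep learning frameworks, where the L2 decay term is added to the gradient before the momentum buffer is updated. The function name and the list-based parameters are illustrative only.

```python
def sgd_step(weights, grads, velocity,
             lr=0.01, momentum=0.9, weight_decay=0.0004):
    """One SGD update with momentum and L2 weight decay.

    Assumed update rule (per parameter):
        g <- g + weight_decay * w
        v <- momentum * v + g
        w <- w - lr * v
    """
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w   # gradient of the L2 penalty
        v = momentum * v + g       # accumulate into the momentum buffer
        new_velocity.append(v)
        new_weights.append(w - lr * v)
    return new_weights, new_velocity


# Usage: two consecutive updates on one weight with a constant gradient.
w, v = sgd_step([1.0], [0.5], [0.0])
w, v = sgd_step(w, [0.5], v)
```

With a constant gradient the momentum buffer grows from step to step, so the second update moves the weight further than the first, while the decay term pulls the weight slightly toward zero.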
CN202210583718.2A 2022-05-25 2022-05-25 Method for predicting EGFR gene mutation state based on end-to-end three-dimensional convolutional neural network Active CN115132275B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210583718.2A CN115132275B (en) 2022-05-25 2022-05-25 Method for predicting EGFR gene mutation state based on end-to-end three-dimensional convolutional neural network

Publications (2)

Publication Number Publication Date
CN115132275A CN115132275A (en) 2022-09-30
CN115132275B CN115132275B (en) 2024-02-27

Family

ID=83376354

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210583718.2A Active CN115132275B (en) 2022-05-25 2022-05-25 Method for predicting EGFR gene mutation state based on end-to-end three-dimensional convolutional neural network

Country Status (1)

Country Link
CN (1) CN115132275B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115810016B (en) * 2023-02-13 2023-04-28 四川大学 Automatic identification method, system, storage medium and terminal for CXR (Lung infection) image

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110766051A (en) * 2019-09-20 2020-02-07 四川大学华西医院 Lung nodule morphological classification method based on neural network
CN110807764A (en) * 2019-09-20 2020-02-18 成都智能迭迦科技合伙企业(有限合伙) Lung cancer screening method based on neural network
CN111814611A (en) * 2020-06-24 2020-10-23 重庆邮电大学 Multi-scale face age estimation method and system embedded with high-order information
CN113345576A (en) * 2021-06-04 2021-09-03 江南大学 Rectal cancer lymph node metastasis diagnosis method based on deep learning multi-modal CT
WO2022063200A1 (en) * 2020-09-24 2022-03-31 上海健康医学院 Non-small cell lung cancer prognosis survival prediction method, medium and electronic device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
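Lung cancer recognition method based on multi-scale and feature fusion; Shi Lukui; Du Weifang; Ma Hongqi; Zhang Jun; Computer Engineering and Design; 2020-05-16 (No. 05); full text *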

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant