CN115132275B

CN115132275B - Method for predicting EGFR gene mutation state based on end-to-end three-dimensional convolutional neural network

Info

Publication number: CN115132275B
Application number: CN202210583718.2A
Authority: CN
Inventors: 赵世杰; 刘卓岩; 韩军伟
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2022-05-25
Filing date: 2022-05-25
Publication date: 2024-02-27
Anticipated expiration: 2042-05-25
Also published as: CN115132275A

Abstract

The invention relates to a method for predicting EGFR gene mutation state based on an end-to-end three-dimensional convolutional neural network. The method performs end-to-end global feature extraction on the lungs with a densely connected three-dimensional convolutional structure and captures local lung nodule features with a proposed multi-scale dilated ("cavity") asymmetric convolution module. At the local scale, the multi-scale dilated asymmetric module handles lung nodule features of different sizes, directions and angles, searching every detail of the input three-dimensional image from small to large and connecting the results densely. At the global scale, the dense network combines fine-grained features from different stages and different scales of the target to extract comprehensive lung features; deeper lung information is then extracted through feature transformation and channel fusion, and the model's prediction is finally obtained through a fully connected layer and an activation function layer.

Description

Method for predicting EGFR gene mutation state based on end-to-end three-dimensional convolutional neural network

Technical Field

The invention belongs to the field of computer vision, relates to a method for predicting EGFR gene mutation state based on an end-to-end three-dimensional convolutional neural network, and particularly relates to a method for extracting features from CT images and carrying out gene prediction by utilizing a neural network architecture.

Background

Currently, common methods for diagnosing EGFR mutation status are tissue biopsy and liquid biopsy, but both have limitations: because of tumor heterogeneity, a biopsy may suffer from sampling bias; the patient must be eligible for an invasive biopsy; and a biopsy may increase the risk of cancer metastasis. Tissue biopsy may additionally fail because of poor tissue quality and is relatively costly. Liquid biopsy tests peripheral blood instead of tumor tissue, but may suffer from low ctDNA concentration or from ctDNA that cannot be enriched.

With the development of deep learning, some neural-network-based models have appeared, but most of them rely on accurate annotation of tumor boundaries by experienced physicians or radiologists, which is time-consuming and laborious and may introduce subjective errors into the annotations. Although some recent methods relax the annotation requirements, a radiologist is still needed to roughly locate the lung nodules. Most importantly, the extracted features come only from the interior and the annotated edges of the nodule; other important information, including the relative location of the tumor, its size, and interactions between different lung regions, is ignored.

In short, existing methods consider only information from the interior and edge of the nodule, depend on expert annotations, and cannot use all of the useful information in the lung.

Disclosure of Invention

Technical problem to be solved

To overcome the shortcomings of the prior art, the invention provides a method for predicting EGFR gene mutation state based on an end-to-end three-dimensional convolutional neural network. It combines multi-scale dilated ("cavity") convolution with an asymmetric three-dimensional convolution module to extract complete lung CT features of a lung adenocarcinoma patient, transforms and recombines these features in a high-dimensional space, and predicts the patient's EGFR gene mutation state from complete and effective lung information.

Technical solution

The method for predicting EGFR gene mutation state based on the end-to-end three-dimensional convolutional neural network is characterized by comprising the following steps:

step 1: preprocessing an original three-dimensional lung CT image: zeroing the content outside the lung region in the CT image and scaling the CT image to a uniform size of 112×112×90 by spline interpolation;

step 1a: segmenting the lung through a U-Net model, and setting the region outside the lung region to be zero;

step 1b: obtaining the preprocessing result by spline interpolation, namely an image of size 112×112×90 centered in the field of view, ensuring that every slice along the X-axis, Y-axis and Z-axis intersects the lung and that the detailed features of the lung are preserved;

step 2: building the neural network structure: a densely connected multi-layer convolutional neural network is adopted to extract the overall features of the target and to fuse multi-scale features;

step 2a: based on the Dense Block, building a four-layer densely connected convolutional neural network in which the outputs of the layers are concatenated along the channel dimension and used as the input of the next convolutional layer;

step 2b: in the layer-to-layer connections, introducing a bottleneck module for inter-channel feature fusion and dimension reduction; specifically, batch normalization normalizes the feature distribution to a mean of 0 and a variance of 1, and a rectified linear unit (ReLU) followed by a 1×1 convolution then reduces the dimension;

step 2c: embedding a multi-scale, multi-dilation asymmetric dilated ("cavity") convolution module into the baseline model to capture local fine-grained features, focusing on lung nodules that appear in the lungs of lung cancer patients in different directions, sizes and angles;

step 3: training the neural network structure end to end, namely training it on a three-dimensional lung CT image dataset with a cross-entropy loss function, using stochastic gradient descent (SGD) with a momentum of 0.9 as the optimizer of the model;

in the training process, setting the batch size to 6, the number of training iterations to 300, and all learning rates to 0.01; EGFR mutant is defined as 1 and EGFR wild type as 0, i.e., the closer the model output is to 1, the more likely the sample is judged to be EGFR mutant; the weight decay, i.e., the L2 regularization coefficient, is set to 0.0004 to prevent overfitting;

step 4: inputting three-dimensional lung CT image data into the trained end-to-end three-dimensional convolutional neural network, where the input is a whole-lung CT image and the output is the prediction result, namely the predicted EGFR gene mutation state.

Advantageous effects

The invention provides a method for predicting EGFR gene mutation state based on an end-to-end three-dimensional convolutional neural network. The method performs end-to-end global feature extraction on the lungs with a densely connected three-dimensional convolutional structure and captures local lung nodule features with a proposed multi-scale dilated ("cavity") asymmetric convolution module. At the local scale, the multi-scale dilated asymmetric module handles lung nodule features of different sizes, directions and angles, searching every detail of the input three-dimensional image from small to large and connecting the results densely. At the global scale, the dense network combines fine-grained features from different stages and different scales of the target to extract comprehensive lung features; deeper lung information is then extracted through feature transformation and channel fusion, and the model's prediction is finally obtained through a fully connected layer and an activation function layer.

In particular, the proposed method makes the following contributions. First, it is the first method in the field of EGFR mutation state prediction that requires no pre-annotation step, which greatly reduces the burden on physicians and avoids introducing human error. The model learns features directly from the complete three-dimensional CT image and predicts the EGFR mutation state of a lung adenocarcinoma patient end to end. Second, the invention is the first to predict the EGFR gene mutation state from both intact lungs as three-dimensional input, and the experiments show that EGFR mutation state is reflected not only in the lung nodules but in the patient's whole lungs. Third, the proposed model consists of densely connected modules built from three-dimensional asymmetric convolution and three-dimensional multi-dilation dense convolution. The asymmetric convolution lets the network capture lung nodule information in different directions, and the three-dimensional multi-dilation blocks enlarge the receptive field without losing resolution, so that multi-scale context information of the CT is captured and the prediction performance of the model is further improved.

The invention uses a neural network to predict the EGFR gene mutation state of lung adenocarcinoma patients end to end from three-dimensional CT images, and has the following advantages in the field of EGFR mutation state prediction:

The invention provides a deep learning model, a three-dimensional densely connected asymmetric convolution and multi-dilation dense network, for noninvasively predicting the EGFR mutation state of lung adenocarcinoma patients. It also puts forward a new view of EGFR mutation state: information about EGFR state is present in both intact lungs, not just in the lung nodules. Because a deep learning model requires input data of consistent dimensions, the invention also proposes a method for handling CT images of inconsistent dimensions.

Compared with traditional EGFR mutation state prediction models, the proposed method requires neither edge annotation of the nodule by a professional radiologist nor any rough localization, so it applies to a wider range of scenarios. In addition, the method combines a global feature processor for the target with a local feature extractor for lung nodules, both embedded in the three-dimensional prediction network. By decomposing the large-range search problem into small-range feature extraction, it addresses the difficulty of accurately capturing local features in three-dimensional prediction, and the proposed model can be trained end to end.

Drawings

FIG. 1 shows a flow chart of the method of the invention. The method aims at noninvasive, rapid and accurate prediction: it extracts features directly from the complete three-dimensional CT image of the input target and predicts the EGFR gene mutation state with the proposed end-to-end convolutional neural network.

FIG. 2 compares the method of the invention with conventional methods. The 3DDADD Net framework provided by the invention is an end-to-end prediction network without branches that predicts the EGFR mutation state of lung adenocarcinoma patients directly from complete lung features. The method takes a three-dimensional CT image as input instead of the two-dimensional lung nodule slice sequences or two-dimensional CT sequences used by conventional methods, which avoids time-consuming and labor-intensive manual annotation and the loss of lung context information. On top of a baseline densely connected network, the method adds a local feature extractor that captures nodules in different directions together with multi-scale context, and handles detailed features well.

FIG. 3 shows the model framework of the invention. The input of the model is the complete CT of a lung adenocarcinoma patient, and the input of each convolutional layer is the concatenation of the outputs of the preceding layers. Dense connectivity lets each layer reuse all features learned by earlier layers without relearning them. Each layer consists of a multi-dilation asymmetric convolution module, which improves the expressive power of standard convolution. The multi-dilation convolution places different dilation factors within a single layer to model different resolutions and avoids the aliasing problem of dense connections. The three-dimensional asymmetric convolution widens the feature extraction path and improves the robustness of the model to certain transformations of lung nodules, such as flipping and rotation.

Detailed Description

The invention will now be further described with reference to the examples and figures.

The method for obtaining the EGFR gene mutation state of a lung adenocarcinoma patient from a three-dimensional lung CT image comprises the following steps:

Step 1: The original CT image is preprocessed. To focus the model on the patient's lungs, the region outside the lungs is zeroed out and the CT is scaled to a uniform size of 112×112×90 by spline interpolation.

Step 1a: The CT is input into a U-Net model that automatically segments the lungs, and the region outside the lungs is set to zero. This focuses the model's computation on the lungs and reduces interference from outside the lung region.

Step 1b: The final preprocessing result, an image of size 112×112×90, is obtained by spline interpolation. Every slice along the X-axis, Y-axis and Z-axis intersects the lung, so the detailed features of the lung are preserved as far as possible and the laboratory hardware is used as fully as possible.

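Step 2: The neural network structure is built. A densely connected multi-layer convolutional neural network extracts the overall features of the target and fuses multi-scale features.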

Step 2a: In general, the overall three-dimensional features of an object reflect its global spatial structure. The proposed method builds a four-layer densely connected convolutional neural network based on the Dense Block, in which the outputs of the layers are concatenated along the channel dimension and used as the input of the next convolutional layer. This strengthens feature propagation, makes better use of the features, and effectively alleviates the vanishing gradient problem.
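The dense connectivity of step 2a can be sketched in plain Python. This is only an illustration under stated assumptions: the starting channel count and growth rate are hypothetical, since this description does not give them, and each "layer" is a stand-in function, not a real three-dimensional convolution.

```python
def dense_block_channels(in_channels, growth_rate, num_layers=4):
    """Channel count seen by each layer, and the count after the block.

    in_channels and growth_rate are hypothetical; only the four-layer depth
    comes from the description.
    """
    layer_inputs = [in_channels + i * growth_rate for i in range(num_layers)]
    return layer_inputs, in_channels + num_layers * growth_rate


def dense_forward(x, layers):
    """Run a dense block on a flat list of channel values.

    Every layer receives the concatenation of the block input and all
    earlier layer outputs, and its own output is appended to that list.
    """
    features = [list(x)]
    for layer in layers:
        concatenated = [c for f in features for c in f]
        features.append(layer(concatenated))
    return [c for f in features for c in f]


# Stand-in layer: produces one new channel from everything it receives.
toy_layer = lambda channels: [sum(channels)]

layer_inputs, out_channels = dense_block_channels(16, 8)
output = dense_forward([1.0, 2.0], [toy_layer] * 4)
```

Because every layer sees all earlier outputs, the input width grows linearly with depth, which is why a dimension-reducing step between layers becomes useful.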

Step 2b: In the layer-to-layer connections, a bottleneck module for inter-channel feature fusion and dimension reduction is introduced, which effectively reduces the computational cost of the model. Specifically, batch normalization normalizes the feature distribution to a mean of 0 and a variance of 1, and a rectified linear unit (ReLU) followed by a 1×1 convolution then reduces the dimension.
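A minimal sketch of the bottleneck in step 2b follows: batch normalization, then ReLU, then a 1×1 convolution. It assumes the spatial dimensions are flattened into a list of voxels, each holding one value per channel, and it omits the learnable scale and shift of batch normalization. The weights and sizes are hypothetical.

```python
import math


def batch_norm(voxels, eps=1e-5):
    """Normalize each channel to zero mean and unit variance over all voxels."""
    n = len(voxels)
    channels = len(voxels[0])
    means = [sum(v[c] for v in voxels) / n for c in range(channels)]
    variances = [sum((v[c] - means[c]) ** 2 for v in voxels) / n
                 for c in range(channels)]
    return [[(v[c] - means[c]) / math.sqrt(variances[c] + eps)
             for c in range(channels)] for v in voxels]


def relu(voxels):
    """Rectified linear unit applied element-wise."""
    return [[max(0.0, x) for x in v] for v in voxels]


def pointwise_conv(voxels, weights):
    """1x1 convolution: the same channel-mixing matrix is applied at every voxel."""
    return [[sum(w * x for w, x in zip(row, v)) for row in weights]
            for v in voxels]


def bottleneck(voxels, weights):
    return pointwise_conv(relu(batch_norm(voxels)), weights)


# Two voxels with four channels each, reduced to two channels (hypothetical weights).
voxels = [[1.0, 10.0, 0.0, 2.0], [3.0, 30.0, 0.0, 4.0]]
weights = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
normalized = batch_norm(voxels)
reduced = bottleneck(voxels, weights)
```

The 1×1 kernel has no spatial extent, so the reduction acts only across channels and leaves the spatial resolution unchanged.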

Step 2c: A multi-scale, multi-dilation asymmetric dilated ("cavity") convolution module is built and embedded in the baseline model to capture local fine-grained features, focusing on lung nodules that appear in the lungs of lung cancer patients in different directions, sizes and angles. Three-dimensional dilated convolutions with dilation factors of 1, 2 and 4 are densely connected and combined with the three-dimensional asymmetric convolution module, which improves the extraction of three-dimensional features at different scales and enables an exhaustive search for, and precise localization of, nodules in the lung. The module appears in every layer of the baseline network, improving the network's handling of small features and the prediction performance of the model.
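To see why dilation (expansion) factors of 1, 2 and 4 widen the receptive field without pooling, the standard arithmetic can be written out per axis. The kernel size of 3 is taken from the specific example later in the description; treating the three convolutions as a plain sequential chain is a simplifying assumption, since the dense connections add shorter paths as well.

```python
def effective_kernel(kernel, dilation):
    """Span covered along one axis by a dilated kernel."""
    return dilation * (kernel - 1) + 1


def stacked_receptive_field(kernel, dilations):
    """Receptive field along one axis of stride-1 dilated convolutions applied in sequence."""
    field = 1
    for d in dilations:
        field += d * (kernel - 1)
    return field


spans = [effective_kernel(3, d) for d in (1, 2, 4)]
longest_path = stacked_receptive_field(3, (1, 2, 4))
```

Each dilated layer keeps the same number of weights as an ordinary kernel of that size while covering a wider span, which is how the module gains context at no loss of resolution.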

Step 3: The data set comes from the Affiliated Hospital of Zunyi Medical University and contains 173 lung adenocarcinoma patient samples, 119 of which are EGFR mutant and 54 EGFR wild type. The experiments use five-fold cross-validation in order to obtain as much useful information as possible from limited data, reduce overfitting, and better evaluate the prediction performance and generalization ability of the model. During training, the batch size is set to 6, the number of training iterations to 300, and all learning rates to 0.01. EGFR mutant is defined as 1 and EGFR wild type as 0, i.e., the closer the model output is to 1, the more likely the sample is judged to be EGFR mutant. The network is trained with a cross-entropy loss function, using stochastic gradient descent (SGD) with a momentum of 0.9 as the optimizer. The weight decay, i.e., the L2 regularization coefficient, is set to 0.0004 to prevent overfitting.
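The five-fold protocol for 173 samples can be sketched as below. The description does not say how the folds were drawn, so the shuffling, the seed, and the absence of class stratification are all assumptions made for illustration.

```python
import random


def k_fold_indices(num_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds.

    The seed and the plain (unstratified) split are hypothetical choices.
    """
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(folds):
    """Yield (train_indices, test_indices), using each fold once as the test set."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


folds = k_fold_indices(173, k=5)
splits = list(train_test_splits(folds))
```

With 173 samples, three folds hold 35 cases and two hold 34, so every patient is tested exactly once across the five rounds.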

Step 4: The model is trained on the data set with the experimental parameters above. Note that the invention is trained end to end throughout: a CT image of the whole lung is input and a prediction result is output.

Specific examples:

The invention is further illustrated with reference to the following figures and examples; the invention includes but is not limited to these examples.

The computer hardware environment for implementation is an Intel Xeon E5-2600 v3 @ 2.6 GHz 8-core CPU with 64 GB of memory and a GeForce GTX TITAN X GPU. The software environment is a Linux 16.04 64-bit operating system. The proposed method is implemented with Python 3.6.7 and TensorFlow 2.1.0.

Step 1: a dataset is constructed.

Step 1a: A prediction image dataset is constructed. After approval by the ethics committee of Zunyi Medical University (Guizhou, China), the proposed method retrospectively analyzed data of patients with pathologically diagnosed lung adenocarcinoma at the Affiliated Hospital of Zunyi Medical University (Guizhou, China) from October 2018 to November 2020.

The inclusion criteria for collecting patient data were: (1) the patient was confirmed by CT and histological examination to have a primary lung adenocarcinoma; (2) EGFR gene mutation test results were available for the patient; (3) the CT was acquired within 1 month of the tumor pathological specimen; (4) the CT image data were complete. The exclusion criteria were: (1) the patient received anti-tumor therapy (radiotherapy, chemotherapy, or chemoradiotherapy) before surgery; (2) the interval between CT imaging and surgery was greater than 1 month; (3) the tumor boundary was difficult to identify on CT (e.g., the tumor was located in the hilum of the lung, or the lesion was combined with atelectasis); (4) the lung tumor diameter was less than 1 cm or the CT image contained artifacts; (5) the CT image quality was poor, affecting segmentation and feature extraction.

After screening, 173 patients met the criteria: 75 men and 98 women, aged 31 to 79 years (mean 58±10 years). Of these patients, 57 were smokers and 116 non-smokers. Clinical stage distribution: 80 cases in stage I, 9 in stage II, 21 in stage III, and 63 in stage IV. For comparison with prior deep learning methods that predict from ROI images, two radiologists with 5 and 10 years of diagnostic experience, respectively, also annotated the nodules on each patient's CT images. Disagreements during annotation were resolved by discussion between the two radiologists, and 50 cases were randomly selected for a second round of annotation to evaluate inter-observer consistency.

Step 1b: In the CT preprocessing stage, to use the limited device memory efficiently and to focus the model on the lungs, the proposed method removes the empty regions along the three axes, keeping only the part of the volume that contains the lungs, which is then resized to 112×112×90 by spline interpolation.
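The masking and cropping part of this preprocessing can be sketched on nested lists as follows. The lung mask is assumed to be given (in the description it comes from a U-Net), the helper names are hypothetical, and the spline resampling itself is left to a library routine such as `scipy.ndimage.zoom`; the sketch only computes the per-axis zoom factors that such a routine would need.

```python
TARGET_SHAPE = (112, 112, 90)  # target size stated in the description


def apply_mask(volume, mask):
    """Set every voxel outside the lung mask to zero."""
    return [[[v if m else 0 for v, m in zip(vrow, mrow)]
             for vrow, mrow in zip(vplane, mplane)]
            for vplane, mplane in zip(volume, mask)]


def bounding_box(mask):
    """Half-open (start, stop) range per axis enclosing all nonzero mask voxels."""
    coords = [(i, j, k)
              for i, plane in enumerate(mask)
              for j, row in enumerate(plane)
              for k, m in enumerate(row) if m]
    if not coords:
        raise ValueError("mask is empty")
    return tuple((min(c[a] for c in coords), max(c[a] for c in coords) + 1)
                 for a in range(3))


def crop(volume, box):
    """Keep only the part of the volume inside the bounding box."""
    (x0, x1), (y0, y1), (z0, z1) = box
    return [[row[z0:z1] for row in plane[y0:y1]] for plane in volume[x0:x1]]


def zoom_factors(shape, target=TARGET_SHAPE):
    """Per-axis scale factors that map the cropped shape onto the target shape."""
    return tuple(t / s for t, s in zip(target, shape))


# Toy 2x3x4 volume whose "lung" occupies the two middle columns.
volume = [[[5, 5, 5, 5] for _ in range(3)] for _ in range(2)]
mask = [[[0, 1, 1, 0] for _ in range(3)] for _ in range(2)]
masked = apply_mask(volume, mask)
box = bounding_box(mask)
cropped = crop(masked, box)
shape = (len(cropped), len(cropped[0]), len(cropped[0][0]))
factors = zoom_factors(shape)
```

Cropping before resampling means the fixed 112×112×90 grid is spent on lung tissue and not on empty margins.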

Step 2: The network structure is constructed.

Step 2a: A three-dimensional densely connected baseline network is constructed. The network follows the DenseNet design and maps the input three-dimensional CT image to a vector representing the EGFR mutation state prediction. The first convolutional layer has a convolution kernel of size 7×7, a dilation coefficient of 7 and a stride of 1, followed by an ELU activation function, and converts the 90×112×112 input into a feature map of size 80×112×112. From the second convolutional layer onward, densely connected multi-scale, multi-dilation asymmetric convolution modules are used; their outputs are densely connected and serve as the input of the next layer. Finally, the output passes through an average pooling layer and then a fully connected layer to produce the gene prediction result.

Step 2b: A multi-dilation asymmetric convolution module for capturing local features is constructed. The feature map first passes through a three-dimensional asymmetric convolution layer whose convolution kernels take four forms; the feature map is processed by the four kernels in parallel, and the layer output is the sum of the four results. The output is then normalized by a BatchNorm3d layer and an ELU layer to keep its distribution consistent and to suppress overfitting. Next comes the multi-dilation convolution layer, namely three densely connected convolutional layers with dilation coefficients of 1, 2 and 4 respectively, each with 3×3 convolution kernels. The output is again normalized by a BatchNorm3d layer and an ELU layer. Finally, the result is fed to a transition layer that halves the number of channels of the output feature map, and a max pooling layer reduces the size of the feature map.
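The "parallel kernels, summed outputs" idea of the asymmetric convolution layer can be illustrated with a two-dimensional toy. This is an analogy only: the description does not list the four three-dimensional kernel forms, so the square, horizontal and vertical kernels used here are hypothetical stand-ins.

```python
def conv2d_same(image, kernel):
    """Zero-padded 'same' cross-correlation of a 2-D list with an odd-sized kernel."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:
                        total += kernel[a][b] * image[y][x]
            out[i][j] = total
    return out


def asymmetric_block(image, kernels):
    """Apply every kernel to the same input and add the results element-wise."""
    outputs = [conv2d_same(image, k) for k in kernels]
    return [[sum(o[i][j] for o in outputs) for j in range(len(image[0]))]
            for i in range(len(image))]


# Hypothetical 2-D stand-ins for the kernel forms: square, horizontal, vertical.
square = [[1.0] * 3 for _ in range(3)]
horizontal = [[1.0, 1.0, 1.0]]
vertical = [[1.0], [1.0], [1.0]]

impulse = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
response = asymmetric_block(impulse, [square, horizontal, vertical])
```

On the impulse input the summed response is strongest along the central row and column, which shows how the elongated kernels add direction-specific emphasis on top of the square kernel.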

Step 3: The three-dimensional end-to-end EGFR gene mutation state prediction model is trained. The network uses the lung adenocarcinoma CT data set provided by Zunyi Medical University, as defined in step 1b. The network parameters are optimized with an SGD optimizer; the initial learning rate is set to 0.01, and after every 150 training epochs the learning rate is decayed to 10% of its previous value.
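The optimizer settings can be sketched for a single scalar weight as follows. The learning-rate values come from this step, and the momentum (0.9) and weight decay (0.0004) from the training settings given earlier in the description. The update form used here, with the L2 term added to the gradient before the momentum update, is one common convention and is an assumption, as is reading the decay as a factor of 0.1 applied every 150 epochs.

```python
def step_decay_lr(epoch, base_lr=0.01, decay=0.1, step=150):
    """Learning rate after `epoch` epochs with a step decay every `step` epochs."""
    return base_lr * decay ** (epoch // step)


def sgd_momentum_step(weight, grad, velocity, lr, momentum=0.9, weight_decay=0.0004):
    """One SGD update with momentum and L2 weight decay for a scalar weight.

    Assumed convention: the L2 penalty gradient (weight_decay * weight) is
    added to the loss gradient before the velocity is updated.
    """
    grad = grad + weight_decay * weight
    velocity = momentum * velocity + grad
    return weight - lr * velocity, velocity


lr_start = step_decay_lr(0)
lr_late = step_decay_lr(150)
w1, v1 = sgd_momentum_step(1.0, 0.5, 0.0, lr_start)
w2, v2 = sgd_momentum_step(w1, 0.5, v1, lr_start)
```

Frameworks differ in where exactly they apply weight decay and momentum, so the numbers from a real optimizer may not match this sketch step for step.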

Step 3a: Given a three-dimensional input CT image, the first feature extraction layer encodes the input image into a feature map of size 80×112×112.

Step 3b: The feature map obtained in step 3a is input into the model to obtain a feature vector of size 1×2, which is the prediction result of the model.

Step 3c: Cross-entropy loss is used as the loss function. The tissue biopsy result recorded in the database for the lung adenocarcinoma patient and the EGFR gene mutation state predicted by the model are input into the loss function, and the neural network is trained with the back-propagation algorithm.
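For a two-class output, the cross-entropy loss in this step can be written out as below. The sketch assumes the model emits two raw scores that are turned into probabilities by a softmax, and that the position in the output vector matches the label convention (0 for wild type, 1 for mutant); the description does not spell out either detail.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for numerical stability."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[label])


# Assumed convention: index 0 = EGFR wild type, index 1 = EGFR mutant.
uninformed = cross_entropy([0.0, 0.0], 1)
confident_right = cross_entropy([0.0, 5.0], 1)
confident_wrong = cross_entropy([5.0, 0.0], 1)
```

The loss is small when the probability of the biopsy-confirmed class is high and grows quickly when the model is confidently wrong, which is the signal that back-propagation uses.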

Step 3d: Decide whether to stop training. If training continues, return to step 3a; if training stops, proceed to step 4.

Step 4: the trained whole three-dimensional prediction network is used as a detector to predict the EGFR gene mutation state of a target on a test set.

Step 4a: A three-dimensional CT image is input into the EGFR gene mutation state prediction network trained in step 3, which outputs a feature vector of size 1×2. This vector is the prediction result of the model; its two entries represent the probabilities of the mutant type and the wild type, respectively.

Step 5: Using the five-fold cross-validation method described in "Review of cross-validation methods in model selection" (Fan Yongdong et al., 2013), with five rounds of training and testing, the proposed method achieved an AUC of 0.79 and an ACC of 0.77 on the Zunyi Medical University CT dataset of 173 lung adenocarcinoma patients.
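The two reported metrics can be computed from labels and predicted mutant probabilities as sketched below. The 0.5 decision threshold for ACC is an assumption, and the toy labels and scores are made up for illustration; they are not data from the study.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a random positive sample scores higher
    than a random negative one, counting ties as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = EGFR mutant, 0 = EGFR wild type.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_auc = auc(toy_labels, toy_scores)
toy_acc = accuracy(toy_labels, toy_scores)
```

AUC does not depend on a threshold, which makes it a useful companion to ACC on an imbalanced set such as 119 mutant versus 54 wild-type cases.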

Claims

1. The method for predicting EGFR gene mutation state based on the end-to-end three-dimensional convolutional neural network is characterized by comprising the following steps:

in the training process, setting the batch size to 6, the number of training iterations to 300, and all learning rates to 0.01; EGFR mutant is defined as 1 and EGFR wild type as 0, i.e., the closer the model output is to 1, the more likely the sample is judged to be EGFR mutant; the weight decay, i.e., the L2 regularization coefficient, is set to 0.0004 to prevent overfitting;