CN113571190A - Lung function decline prediction device and prediction method thereof - Google Patents

Lung function decline prediction device and prediction method thereof

Info

Publication number
CN113571190A
CN113571190A
Authority
CN
China
Prior art keywords
layer
convolution layer
feature
convolution
lung
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110988100.XA
Other languages
Chinese (zh)
Other versions
CN113571190B (en)
Inventor
陈舞
孙军梅
李秀梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Normal University
Original Assignee
Hangzhou Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Normal University
Priority to CN202110988100.XA
Publication of CN113571190A
Application granted
Publication of CN113571190B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
      • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
        • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
          • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
            • G16H50/30 ICT specially adapted for calculating health indices; for individual health risk assessment
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F18/00 Pattern recognition
            • G06F18/20 Analysing
              • G06F18/25 Fusion techniques
                • G06F18/253 Fusion techniques of extracted features
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N3/00 Computing arrangements based on biological models
            • G06N3/02 Neural networks
              • G06N3/04 Architecture, e.g. interconnection topology
                • G06N3/045 Combinations of networks
                • G06N3/048 Activation functions
              • G06N3/08 Learning methods
        • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T7/00 Image analysis
            • G06T7/0002 Inspection of images, e.g. flaw detection
              • G06T7/0012 Biomedical image inspection
          • G06T2207/00 Indexing scheme for image analysis or image enhancement
            • G06T2207/10 Image acquisition modality
              • G06T2207/10072 Tomographic images
                • G06T2207/10081 Computed x-ray tomography [CT]
            • G06T2207/20 Special algorithmic details
              • G06T2207/20081 Training; Learning
              • G06T2207/20084 Artificial neural networks [ANN]
            • G06T2207/30 Subject of image; Context of image processing
              • G06T2207/30004 Biomedical image processing
                • G06T2207/30061 Lung

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Public Health (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Pathology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a lung function decline prediction device and a prediction method thereof. The lung function progression prediction model provided by the invention comprises a CT (computed tomography) feature extraction network and a multi-modal feature prediction network. The CT feature extraction network performs CT feature extraction on the preprocessed lung CT images; the multi-modal feature prediction network predicts the lung function progression, taking as input the multi-modal features formed by fusing the CT features extracted by the CT feature extraction network with the clinical features, and outputting the predicted FVC values at different future week numbers. The lung function progression prediction model constructed by the invention improves the CT feature extraction capability through the CT feature extraction network and predicts lung function progression from multi-modal data through the multi-modal feature prediction network, thereby effectively improving the accuracy of model prediction.

Description

Lung function decline prediction device and prediction method thereof
Technical Field
The invention belongs to the field of artificial intelligence, and relates to a lung function decline prediction device based on multi-modal data and a non-diagnostic lung function decline prediction method thereof.
Background
Idiopathic pulmonary fibrosis (IPF) is a chronic lung disease characterized by insidious onset, unknown etiology, histological or imaging manifestations of usual interstitial pneumonia, progressive dyspnea and declining pulmonary function, with an incidence and prevalence of 0.09-1.30 and 0.33-4.51 per 10,000 people, respectively. Because IPF progresses over time and diagnostic tools are limited, it may eventually result in complete loss of lung function. The typical median survival of IPF patients is only 3-5 years, and the prognosis is poor. Although no widely adopted technique exists for estimating the progression of IPF, it is generally believed that predicting the decline of lung function in IPF patients can provide guidance and advice for the prognosis of IPF.
Disclosure of Invention
It is an object of the present invention to address the deficiencies of the prior art by providing a method for predicting lung function decline based on multi-modal data for non-diagnostic purposes.
The invention relates to a method, for non-diagnostic purposes, for predicting lung function decline based on multi-modal data, comprising the following steps:
Step (1): acquiring historical lung CT images and corresponding clinical text data; wherein the clinical text data comprise the lung influence factors, the week number at which the forced vital capacity was measured, the forced vital capacity (FVC), and the percentage of the forced vital capacity relative to the normal standard value; the lung influence factors comprise age, sex and smoking status;
Step (2): preprocessing the historical lung CT images and corresponding clinical text data to construct a data set;
Preferably, step (2) is specifically:
2-1. Preprocessing the lung CT images: removing DICOM medical image files that cannot be opened and valueless CT images that contain no lung information; resizing all images to a uniform 512 × 512.
2-2. Preprocessing the clinical text data: removing incomplete and erroneous records from the clinical text data; performing feature engineering on the clinical text data to generate more effective data features for model training; applying Min-Max normalization; finally obtaining the preprocessed clinical features.
2-3. Fitting the week numbers in the data set and the corresponding FVC values by the least-squares method to obtain the linear rate of change of FVC, which is used as one of the training-set labels.
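For illustration, this label can be obtained by an ordinary least-squares fit of FVC against week number; the sketch below is a minimal example, and the function name and sample values are illustrative rather than taken from the patent:

```python
import numpy as np

def fvc_slope(weeks: np.ndarray, fvc: np.ndarray) -> float:
    """Least-squares slope of FVC versus week number, used as the
    per-patient label for the linear rate of change of FVC."""
    slope, _intercept = np.polyfit(weeks, fvc, deg=1)  # degree-1 fit
    return float(slope)

# Example: one patient's visits (week number, FVC in ml)
weeks = np.array([0, 9, 11, 13, 15, 31])
fvc = np.array([2315, 2214, 2061, 2144, 2069, 2000])
print(fvc_slope(weeks, fvc))  # negative slope indicates declining FVC
```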
Step (3): constructing a lung function progression prediction model and training it with the data set constructed in step (2).
The lung function progression prediction model comprises a CT (computed tomography) feature extraction network and a multi-modal feature prediction network. The CT feature extraction network performs CT feature extraction on the preprocessed lung CT images; the multi-modal feature prediction network predicts the lung function progression, taking as input the multi-modal features formed by fusing the CT features extracted by the CT feature extraction network with the clinical features, and outputting the predicted FVC values at different future week numbers.
3-1. Constructing the CT feature extraction network
The CT feature extraction network takes InceptionV1 as its backbone and comprises a front-end down-sampling module and a multi-scale CT feature fusion module.
The front-end down-sampling module comprises 1×1 and 3×3 convolution layers and a max pooling layer; down-sampling yields high-dimensional features, reduces the number of network parameters and speeds up computation while helping to prevent over-fitting.
Preferably, the front-end down-sampling module comprises, cascaded in sequence, three serially connected 3×3 convolution layers, a max pooling layer, a 1×1 convolution layer, two serially connected 3×3 convolution layers and a max pooling layer;
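For illustration only, a minimal PyTorch sketch of this preferred front-end layout follows; the channel widths, strides and activations are assumptions, since the text specifies only the layer order:

```python
import torch
import torch.nn as nn

class FrontEndDownsample(nn.Module):
    """Three 3x3 convs -> max pool -> 1x1 conv -> two 3x3 convs -> max pool.
    Channel widths and strides are illustrative assumptions."""
    def __init__(self, in_ch: int = 1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2, padding=1),
            nn.Conv2d(64, 80, 1), nn.ReLU(inplace=True),
            nn.Conv2d(80, 96, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(96, 192, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # e.g. a 1x1x512x512 CT slice is reduced 8x spatially
        return self.body(x)
```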
The multi-scale CT feature fusion module comprises, cascaded in sequence, n1 serially connected multi-scale CT feature fusion modules A, a first max pooling layer, n2 serially connected multi-scale CT feature fusion modules B, a second max pooling layer, n3 serially connected multi-scale CT feature fusion modules C, an average pooling layer, a global average pooling layer, a first fully connected layer and a second fully connected layer, where n1 ≥ 1, n2 ≥ 1 and n3 ≥ 1.
Preferably, the multi-scale CT feature fusion module comprises, cascaded in sequence, 2 serially connected multi-scale CT feature fusion modules A, a first max pooling layer, 2 serially connected multi-scale CT feature fusion modules B, a second max pooling layer, 2 serially connected multi-scale CT feature fusion modules C, an average pooling layer, a global average pooling layer, a first fully connected layer and a second fully connected layer;
the multi-scale CT feature fusion module A comprises 5 parallel branches, a feature fusion layer, a residual connection layer, an improved CBAM channel attention module and a 1 x 1 convolution dimensionality increasing layer; the 1 st branch in the 5 parallel branches comprises a 1 x 1 convolution layer; the 2 nd branch comprises 1 × 1 convolution layer, 3 × 3 convolution layer, a cavity convolution layer with the cavity rate of 2 and a characteristic diagram addition layer, wherein the output end of the 1 × 1 convolution layer is connected with the input end of the 3 × 3 convolution layer, the input end of the cavity convolution layer with the cavity rate of 2, the output end of the 3 × 3 convolution layer and the output end of the cavity convolution layer with the cavity rate of 2 are connected with the characteristic diagram addition layer; the 3 rd branch comprises 1 × 1 convolution layer, 5 × 5 convolution layer, a cavity convolution layer with the cavity rate of 2 and a characteristic diagram addition layer, wherein the output end of the 1 × 1 convolution layer is connected with the input end of the 5 × 5 convolution layer, the input end of the cavity convolution layer with the cavity rate of 2, the output end of the 5 × 5 convolution layer and the output end of the cavity convolution layer with the cavity rate of 2 are connected with the characteristic diagram addition layer; the 4 th branch comprises an average pooling layer and a 1 x 1 convolution layer which are sequentially cascaded; the output feature maps of the 1 st to 4 th branches are connected with Concatenate at a feature fusion layer to form a multi-scale CT feature; the residual error connecting layer adds the multi-scale CT characteristics and the original input characteristic diagram of the 5 th branch, so that the characterization capability of the network is improved; the improved CBAM channel attention module is used for receiving the features after residual connecting layer processing, adding attention weight to the multi-scale CT features and inhibiting useless information; and (3) performing cross-channel feature fusion on the multi-scale CT features output by the improved CBAM channel attention module by the 1 × 1 convolution dimensionality increasing layer, and widening the number of network channels by using the minimum parameters.
The multi-scale CT feature fusion module B comprises 5 parallel branches, a feature fusion layer, a residual connection layer and a 1×1 convolution dimension-raising layer. The 1st, 2nd and 4th of the 5 parallel branches have the same structure as in multi-scale CT feature fusion module A. The 3rd branch comprises a 1×1 convolution layer, a first 3×3 convolution layer, a second 3×3 convolution layer, a dilated convolution layer with dilation rate 2 and a feature-map addition layer; the output of the 1×1 convolution layer is connected to the input of the first 3×3 convolution layer, the first input of the second 3×3 convolution layer and the input of the dilated convolution layer; the output of the first 3×3 convolution layer is connected to the second input of the second 3×3 convolution layer; and the outputs of the second 3×3 convolution layer and the dilated convolution layer are connected to the feature-map addition layer. The output feature maps of the 1st to 4th branches are fused by Concatenate in the feature fusion layer to form the multi-scale CT feature. The residual connection layer adds the multi-scale CT feature to the original input feature map carried by the 5th branch, improving the representational capability of the network. The 1×1 convolution dimension-raising layer performs cross-channel feature fusion on the features processed by the residual connection layer, widening the number of network channels with minimal parameters.
The multi-scale CT feature fusion module C comprises 5 parallel branches, a feature fusion layer, a residual connection layer, an improved CBAM channel attention module and a 1×1 convolution dimension-raising layer. The 1st of the 5 parallel branches has the same structure as in multi-scale CT feature fusion module A. The 2nd branch comprises a 1×1 convolution layer, a 1×3 convolution layer, a 3×1 convolution layer, a dilated convolution layer with dilation rate 2 and a feature-map addition layer; the output of the 1×1 convolution layer is connected to the input of the 1×3 convolution layer and the input of the dilated convolution layer; the output of the 1×3 convolution layer is connected to the input of the 3×1 convolution layer; and the outputs of the 3×1 convolution layer and the dilated convolution layer are connected to the feature-map addition layer. The 3rd branch comprises a 1×1 convolution layer, a 1×3 convolution layer A, a 3×1 convolution layer A, a dilated convolution layer A with dilation rate 2, a feature-map addition layer A, a 1×3 convolution layer B, a 3×1 convolution layer B, a dilated convolution layer B with dilation rate 2 and a feature-map addition layer B; the output of the 1×1 convolution layer is connected to the input of the 1×3 convolution layer A and the input of the dilated convolution layer A; the output of the 1×3 convolution layer A is connected to the input of the 3×1 convolution layer A; the outputs of the 3×1 convolution layer A and the dilated convolution layer A are connected to the inputs of the feature-map addition layer A; the output of the feature-map addition layer A is connected to the input of the 1×3 convolution layer B and the input of the dilated convolution layer B; the output of the 1×3 convolution layer B is connected to the input of the 3×1 convolution layer B; and the outputs of the 3×1 convolution layer B and the dilated convolution layer B are connected to the inputs of the feature-map addition layer B. The 4th branch comprises an average pooling layer, a max pooling layer, a feature-map addition layer and a 1×1 convolution layer; the outputs of the average pooling layer and the max pooling layer are connected to the inputs of the feature-map addition layer, and the output of the feature-map addition layer is connected to the input of the 1×1 convolution layer.
The improved CBAM channel attention module (CBAM-ICA) performs global average pooling and global max pooling on the input feature map, generates two different channel attention maps through two 1×1 convolution layers and a Sigmoid activation function, multiplies the two channel attention maps to form the final attention weight, and multiplies that weight pixel by pixel with the input feature map F to obtain the final output feature F′. The process can be expressed by formula (1):
F′ = (ε(C(P_ag(F))) × ε(C(P_mx(F)))) ⊙ F    (1)
where F denotes the input feature map, P_ag denotes global average pooling, P_mx denotes global max pooling, C denotes the two 1×1 convolution layers, ε denotes the Sigmoid activation function, and F′ denotes the output feature after the improved CBAM channel attention module.
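For illustration, a minimal PyTorch sketch of formula (1) follows. Whether the two 1×1 convolutions are shared between the two pooling paths, the channel-reduction ratio and the intermediate ReLU are assumptions not stated in the text:

```python
import torch
import torch.nn as nn

class CBAMICA(nn.Module):
    """Improved CBAM channel attention, per formula (1):
    F' = (sigmoid(C(P_ag(F))) * sigmoid(C(P_mx(F)))) applied elementwise to F."""
    def __init__(self, ch: int, reduction: int = 16):
        super().__init__()
        self.p_ag = nn.AdaptiveAvgPool2d(1)   # P_ag: global average pooling
        self.p_mx = nn.AdaptiveMaxPool2d(1)   # P_mx: global max pooling
        self.c = nn.Sequential(               # C: the two 1x1 convolution layers
            nn.Conv2d(ch, ch // reduction, 1, bias=False),
            nn.ReLU(inplace=True),            # assumption: ReLU in between
            nn.Conv2d(ch // reduction, ch, 1, bias=False),
        )
        self.eps = nn.Sigmoid()               # epsilon: Sigmoid activation

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        w = self.eps(self.c(self.p_ag(f))) * self.eps(self.c(self.p_mx(f)))
        return w * f                          # broadcast per-channel weight onto F
```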
3-2. Constructing the multi-modal feature prediction network
The multi-modal feature prediction network comprises a first multi-modal feature module and a second multi-modal feature module;
the first multi-modal feature module takes CT features extracted by a CT feature extraction network and preprocessed lung influence factors as input, takes the linear change rate of the FVC as output, and comprises a Concatenate feature fusion layer A and a full connection layer which are sequentially cascaded; the concatemate feature fusion layer A is used for carrying out concatemate fusion on CT features extracted by a CT feature extraction network and preprocessed clinical features (age, gender and smoking condition) to obtain first multi-modal features; the full-link layer is used for predicting the FVC linear change rate. The loss function used for predicting the linear change rate of the FVC is an average Absolute error MAE (mean Absolute error), which is the sum of Absolute values of the difference between a target value and a predicted value, and represents the average error amplitude of the predicted value, so that the method has better robustness. In order to relieve the appearance of an overfitting phenomenon in the network training process and enable the network to have good generalization, a dropout layer is added before a full connection layer.
The second multi-modal feature module takes the FVC linear rate of change output by the first multi-modal feature module and all clinical features as input and the predicted FVC values at different future week numbers as output; it comprises an attention module and a multilayer perceptron (MLP) cascaded in sequence.
The attention module calculation process is represented by equation 2:
F_wx = ε(M(F_x)) ⊙ F_x + F_x    (2)
where F_x denotes the FVC linear rate of change and all clinical features output by the first multi-modal feature module, M denotes two fully connected layers, ε denotes the Sigmoid activation function, and F_wx denotes the output feature after the attention module. The multi-modal feature F_x passes through the two fully connected layers and the Sigmoid activation function to obtain the attention weight; finally, the attention weight is multiplied with the input feature F_x and the result is added to F_x to obtain the final output feature F_wx.
The multilayer perceptron comprises a first fully connected layer, an ELU activation function layer, a second fully connected layer, a GELU activation function layer and a third fully connected layer cascaded in sequence; it takes the output feature map of the attention module as input and the FVC value as output.
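A minimal sketch of the second multi-modal feature module follows, combining the attention of formula (2) with the FC-ELU-FC-GELU-FC perceptron; the hidden sizes are assumptions, and the three output values follow the detailed description later in the text (Out1, Out2, Out3):

```python
import torch
import torch.nn as nn

class SecondMultiModalModule(nn.Module):
    """Attention per formula (2): F_wx = sigmoid(M(F_x)) * F_x + F_x,
    with M two fully connected layers, followed by the MLP."""
    def __init__(self, in_dim: int, hidden: int = 64):
        super().__init__()
        self.m = nn.Sequential(nn.Linear(in_dim, in_dim),
                               nn.Linear(in_dim, in_dim))   # M: two FC layers
        self.eps = nn.Sigmoid()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, 3),   # Out1, Out2 (FVC prediction), Out3
        )

    def forward(self, fx: torch.Tensor) -> torch.Tensor:
        fwx = self.eps(self.m(fx)) * fx + fx   # multiply, then add (formula 2)
        return self.mlp(fwx)
```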
Step (4): using the trained lung function progression prediction model to predict lung function progression.
Another object of the present invention is to provide a lung function decline prediction apparatus based on multi-modal data, comprising:
a lung data acquisition module, used for acquiring lung CT images and the corresponding lung influence factors, wherein the lung influence factors comprise age, sex and smoking status;
a lung data preprocessing module, used for preprocessing the lung CT images and the lung influence factors;
and a lung function progression prediction module, used for processing the preprocessed lung CT images and lung influence factors with the trained lung function progression prediction model to obtain the predicted FVC values at different future week numbers.
The invention has the beneficial effects that:
(1) High accuracy of lung function prediction
The lung function progression prediction model constructed by the invention improves the CT feature extraction capability through the CT feature extraction network and predicts lung function progression from multi-modal data through the multi-modal feature prediction network, thereby effectively improving the accuracy of model prediction.
(2) Enhanced generalization ability of the lung function progression prediction model
The lung function progression prediction model constructed by the invention adopts a series of measures to avoid over-fitting: an Adam optimizer adaptively adjusts the learning rate during training; a dropout layer added to the CT feature extraction network mitigates over-fitting during model training; the mean absolute error (MAE) used as the loss function when predicting the FVC linear rate of change offers good robustness; the FVC value is predicted in the multi-modal feature prediction network with K-fold cross-validation, reducing the risk of over-fitting to a certain extent; ELU and GELU activation functions after the fully connected layers improve robustness to noise and model generalization; and an early-stopping criterion is set, so that training stops when the loss has not decreased for more than 15 rounds. Together these measures enhance the generalization ability of the model.
(3) The lung function progression prediction model can effectively predict lung function progression
The lung function progression prediction model can predict FVC values at different future week numbers, giving a better understanding of the severity of the subject's lung function decline, which has guiding significance for the prognosis of the subject's lung function.
Drawings
FIG. 1 is a schematic structural diagram of the lung function progression prediction model;
FIG. 2 is a schematic diagram of the CT feature extraction network;
FIG. 3 is a schematic structural diagram of multi-scale CT feature fusion module A;
FIG. 4 is a schematic structural diagram of multi-scale CT feature fusion module B;
FIG. 5 is a schematic structural diagram of multi-scale CT feature fusion module C;
FIG. 6 is a schematic structural diagram of the improved CBAM channel attention module;
FIG. 7 is a schematic structural diagram of the multi-modal feature prediction network.
Detailed Description
The present invention is further described with reference to the following specific examples.
A method, for non-diagnostic purposes, for predicting lung function decline based on multi-modal data, comprising the following steps:
Step (1): acquiring historical lung CT images and corresponding clinical text data; wherein the clinical text data comprise the lung influence factors, the week number at which the forced vital capacity was measured, the forced vital capacity (FVC), and the percentage of the forced vital capacity relative to the normal standard value; the lung influence factors comprise age, sex and smoking status.
Table 1. Clinical text data
(Table values appear as an image in the original publication and are not reproduced here.)
Step (2): preprocessing the historical lung CT images and corresponding clinical text data to construct a data set, specifically:
2-1. Preprocessing the lung CT images: removing DICOM medical image files that cannot be opened and valueless CT images that contain no lung information; resizing all images to a uniform 512 × 512.
2-2. Preprocessing the clinical text data: removing incomplete and erroneous records from the clinical text data; performing feature engineering on the clinical text data to generate more effective data features for model training; applying Min-Max normalization (a minimal sketch follows step 2-3); finally obtaining the preprocessed clinical features.
2-3. Fitting the week numbers in the data set and the corresponding FVC values by the least-squares method to obtain the linear rate of change of FVC, which is used as one of the training-set labels.
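As an illustration of the Min-Max normalization in step 2-2, one clinical feature column can be scaled as follows; the function name is illustrative:

```python
import numpy as np

def min_max_normalize(col: np.ndarray) -> np.ndarray:
    """Scale one clinical feature column to [0, 1] by Min-Max normalization."""
    lo, hi = float(col.min()), float(col.max())
    if hi == lo:                      # constant column: avoid division by zero
        return np.zeros_like(col, dtype=float)
    return (col - lo) / (hi - lo)
```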
Step (3): constructing the lung function progression prediction model shown in FIG. 1 and training it with the data set constructed in step (2).
The lung function progression prediction model comprises a CT (computed tomography) feature extraction network and a multi-modal feature prediction network. The CT feature extraction network performs CT feature extraction on the preprocessed lung CT images l_i, where 0 ≤ i < N and N denotes the week number at which the forced vital capacity is predicted. The multi-modal feature prediction network predicts the lung function progression: its input is the multi-modal feature formed by fusing the CT features extracted by the CT feature extraction network with the clinical features, and its output is the predicted FVC value FVC_N at the future week number N.
3-1. Constructing the CT feature extraction network shown in FIG. 2
The CT feature extraction network takes InceptionV1 as its backbone and comprises a front-end down-sampling module and a multi-scale CT feature fusion module.
Compared with the InceptionV1 network, this network adds residual connections and an improved CBAM channel attention module to expand the receptive field of the network and focus on effective features of the lung region, and adds dilated convolution modules in parallel with the convolution layers to supplement lost detail information, finally forming three different multi-scale CT feature fusion modules, each stacked twice in series. The lung CT images thus undergo multi-scale feature extraction and fusion, which enhances the CT feature extraction capability of the network and yields more accurate and effective CT features.
The front-end down-sampling module comprises 1×1 and 3×3 convolution layers and a max pooling layer; down-sampling yields high-dimensional features, reduces the number of network parameters and speeds up computation while helping to prevent over-fitting.
The front-end down-sampling module comprises, cascaded in sequence, three serially connected 3×3 convolution layers, a max pooling layer, a 1×1 convolution layer, two serially connected 3×3 convolution layers and a max pooling layer.
The multi-scale CT feature fusion module comprises, cascaded in sequence, 2 serially connected multi-scale CT feature fusion modules A, a first max pooling layer, 2 serially connected multi-scale CT feature fusion modules B, a second max pooling layer, 2 serially connected multi-scale CT feature fusion modules C, an average pooling layer, a global average pooling layer, a first fully connected layer and a second fully connected layer.
As shown in FIG. 3, the multi-scale CT feature fusion module A comprises 5 parallel branches, a feature fusion layer, a residual connection layer, an improved CBAM channel attention module and a 1×1 convolution dimension-raising layer. Among the 5 parallel branches, the 1st branch comprises a 1×1 convolution layer. The 2nd branch comprises a 1×1 convolution layer, a 3×3 convolution layer, a dilated convolution layer with dilation rate 2 and a feature-map addition layer; the output of the 1×1 convolution layer is connected to the input of the 3×3 convolution layer and the input of the dilated convolution layer, and the outputs of the 3×3 convolution layer and the dilated convolution layer are connected to the feature-map addition layer. The 3rd branch comprises a 1×1 convolution layer, a 5×5 convolution layer, a dilated convolution layer with dilation rate 2 and a feature-map addition layer; the output of the 1×1 convolution layer is connected to the input of the 5×5 convolution layer and the input of the dilated convolution layer, and the outputs of the 5×5 convolution layer and the dilated convolution layer are connected to the feature-map addition layer. The 4th branch comprises an average pooling layer and a 1×1 convolution layer cascaded in sequence. The output feature maps of the 1st to 4th branches are fused by Concatenate in the feature fusion layer to form the multi-scale CT feature. The residual connection layer adds the multi-scale CT feature to the original input feature map carried by the 5th branch, improving the representational capability of the network. The improved CBAM channel attention module receives the features processed by the residual connection layer, adds attention weights to the multi-scale CT features and suppresses useless information. The 1×1 convolution dimension-raising layer performs cross-channel feature fusion on the multi-scale CT features output by the improved CBAM channel attention module, widening the number of network channels with minimal parameters.
As shown in FIG. 4, the multi-scale CT feature fusion module B comprises 5 parallel branches, a feature fusion layer, a residual connection layer and a 1×1 convolution dimension-raising layer. The 1st, 2nd and 4th of the 5 parallel branches have the same structure as in multi-scale CT feature fusion module A. The 3rd branch comprises a 1×1 convolution layer, a first 3×3 convolution layer, a second 3×3 convolution layer, a dilated convolution layer with dilation rate 2 and a feature-map addition layer; the output of the 1×1 convolution layer is connected to the input of the first 3×3 convolution layer, the first input of the second 3×3 convolution layer and the input of the dilated convolution layer; the output of the first 3×3 convolution layer is connected to the second input of the second 3×3 convolution layer; and the outputs of the second 3×3 convolution layer and the dilated convolution layer are connected to the feature-map addition layer. The output feature maps of the 1st to 4th branches are fused by Concatenate in the feature fusion layer to form the multi-scale CT feature. The residual connection layer adds the multi-scale CT feature to the original input feature map carried by the 5th branch, improving the representational capability of the network. The 1×1 convolution dimension-raising layer performs cross-channel feature fusion on the features processed by the residual connection layer, widening the number of network channels with minimal parameters.
As shown in FIG. 5, the multi-scale CT feature fusion module C comprises 5 parallel branches, a feature fusion layer, a residual connection layer, an improved CBAM channel attention module and a 1×1 convolution dimension-raising layer. The 1st of the 5 parallel branches has the same structure as in multi-scale CT feature fusion module A. The 2nd branch comprises a 1×1 convolution layer, a 1×3 convolution layer, a 3×1 convolution layer, a dilated convolution layer with dilation rate 2 and a feature-map addition layer; the output of the 1×1 convolution layer is connected to the input of the 1×3 convolution layer and the input of the dilated convolution layer; the output of the 1×3 convolution layer is connected to the input of the 3×1 convolution layer; and the outputs of the 3×1 convolution layer and the dilated convolution layer are connected to the feature-map addition layer. The 3rd branch comprises a 1×1 convolution layer, a 1×3 convolution layer A, a 3×1 convolution layer A, a dilated convolution layer A with dilation rate 2, a feature-map addition layer A, a 1×3 convolution layer B, a 3×1 convolution layer B, a dilated convolution layer B with dilation rate 2 and a feature-map addition layer B; the output of the 1×1 convolution layer is connected to the input of the 1×3 convolution layer A and the input of the dilated convolution layer A; the output of the 1×3 convolution layer A is connected to the input of the 3×1 convolution layer A; the outputs of the 3×1 convolution layer A and the dilated convolution layer A are connected to the inputs of the feature-map addition layer A; the output of the feature-map addition layer A is connected to the input of the 1×3 convolution layer B and the input of the dilated convolution layer B; the output of the 1×3 convolution layer B is connected to the input of the 3×1 convolution layer B; and the outputs of the 3×1 convolution layer B and the dilated convolution layer B are connected to the inputs of the feature-map addition layer B. The 4th branch comprises an average pooling layer, a max pooling layer, a feature-map addition layer and a 1×1 convolution layer; the outputs of the average pooling layer and the max pooling layer are connected to the inputs of the feature-map addition layer, and the output of the feature-map addition layer is connected to the input of the 1×1 convolution layer.
As shown in FIG. 6, the improved CBAM channel attention module is specifically as follows:
the method comprises the steps of changing two fully-connected layers of the original CBAM for calculating the attention weight into 1-1 convolutional layers, converting the original calculation process of adding the layers and calculating the weight by using an activation function into the process of directly calculating the attention weight value by using a Sigmoid activation function for two convolution output characteristics, then multiplying the two attention weights, outputting the channel attention weight value, and removing the space attention part in the CBAM attention mechanism. That is, in the improved CBAM channel attention module, first, the input feature maps are respectively subjected to global average pooling and global maximum pooling, then, two 1 × 1 convolutional layers are passed through, two different channel attention feature maps are generated through a Sigmoid activation function, finally, the two channel attention feature maps are multiplied to form a final attention weight, and the final attention weight is multiplied with the input feature map F pixel by pixel to obtain a final output feature F'. The specific process can be represented by formula (1):
F′ = (ε(C(P_ag(F))) × ε(C(P_mx(F)))) ⊙ F    (1)
where F denotes the input feature map, P_ag denotes global average pooling, P_mx denotes global max pooling, C denotes the two 1×1 convolution layers, ε denotes the Sigmoid activation function, and F′ denotes the output feature after the improved CBAM channel attention module.
3-2. Constructing the multi-modal feature prediction network shown in FIG. 7
The multi-modal feature prediction network comprises a first multi-modal feature module and a second multi-modal feature module;
the first multi-modal feature module takes CT features extracted by a CT feature extraction network and preprocessed lung influence factors as input, |iThe corresponding FVC linear change rate is taken as output and comprises a Concatenate characteristic fusion layer A, a,A fully-connected layer; the concatemate feature fusion layer A is used for carrying out concatemate fusion on CT features extracted by a CT feature extraction network and preprocessed clinical features (age, gender and smoking condition) to obtain first multi-modal features; the full-link layer is used for predicting the FVC linear change rate. The loss function used for predicting the linear change rate of the FVC is an average Absolute error MAE (mean Absolute error), which is the sum of Absolute values of the difference between a target value and a predicted value, and represents the average error amplitude of the predicted value, so that the method has better robustness. In order to relieve the appearance of an overfitting phenomenon in the network training process and enable the network to have good generalization, a dropout layer is added before a full connection layer.
The second multi-modal feature module takes as input the FVC linear rate of change output by the first multi-modal feature module together with all clinical features (the lung influence factors, the week number at which the forced vital capacity corresponding to l_i was measured, the forced vital capacity FVC, and the percentage of the forced vital capacity relative to the normal standard value), and as output the FVC value at week number N; it comprises an attention module and a multilayer perceptron (MLP) cascaded in sequence.
The attention module calculation process is represented by equation (2):
F_wx = ε(M(F_x)) ⊙ F_x + F_x    (2)
where F_x denotes the FVC linear rate of change and all clinical features output by the first multi-modal feature module, M denotes two fully connected layers, ε denotes the Sigmoid activation function, and F_wx denotes the output feature after the attention module. The multi-modal feature F_x passes through the two fully connected layers and the Sigmoid activation function to obtain the attention weight; finally, the attention weight is multiplied with the input feature F_x and the result is added to F_x to obtain the final output feature F_wx.
The multilayer perceptron comprises a first fully connected layer, an ELU activation function layer, a second fully connected layer, a GELU activation function layer and a third fully connected layer cascaded in sequence; it takes the output feature map of the attention module as input and the FVC value as output. Finally, the third fully connected layer outputs three values: Out1, Out2 and Out3. Out2 is the predicted FVC value, and Out3 minus Out1 is the standard deviation used to compute the Laplace log-likelihood score with which the model is evaluated. The loss function used for predicting the FVC values at different future week numbers is the quantile loss function, with quantile values [0.2, 0.5, 0.8]. Training uses K-fold cross-validation: (K-1)/K of all multi-modal feature samples are randomly selected as the training set and the remaining samples serve as the validation set; meanwhile an early-stopping patience of 15 is set, so that training stops when the loss has not decreased for 15 rounds, reducing the risk of over-fitting to a certain extent. The second multi-modal feature module uses K = 6: training is performed six times and the final prediction is the average of the six prediction results. ELU and GELU activation functions are used after the fully connected layers, improving robustness to noise and the generalization ability of the network.
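For illustration, the quantile (pinball) loss over the quantiles [0.2, 0.5, 0.8] can be sketched as follows; the tensor layout (one prediction column per quantile) is an assumption:

```python
import torch

def quantile_loss(pred: torch.Tensor, target: torch.Tensor,
                  quantiles=(0.2, 0.5, 0.8)) -> torch.Tensor:
    """Pinball loss. pred: (batch, 3), one column per quantile;
    target: (batch,) true FVC values."""
    losses = []
    for i, q in enumerate(quantiles):
        err = target - pred[:, i]
        # penalize under- and over-prediction asymmetrically per quantile
        losses.append(torch.max(q * err, (q - 1.0) * err).mean())
    return torch.stack(losses).mean()
```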
Step (4): using the trained lung function progression prediction model to predict lung function progression.
The preprocessed lung CT images and clinical features of the test set are input into the trained lung function progression prediction model, which predicts the FVC values at different future week numbers and finally outputs the lung function progression prediction results for the subjects in the test set.
To select appropriate quantile values for the quantile loss function and obtain better prediction results, experiments were performed on the choice of quantile values; the results are shown in Table 2.
Table 2. Comparison of results for different quantile values
(Table values appear as an image in the original publication and are not reproduced here.)
The experimental results show that using [0.2, 0.5, 0.8] as the quantile values of the quantile loss function yields a better Laplace log-likelihood score and improves the accuracy of the model's predictions.
To verify that introducing the improved CBAM channel attention mechanism into the multi-scale CT feature fusion modules yields a better performance improvement, a comparison experiment on the insertion position of the attention mechanism was performed; the results are shown in Table 3.
Table 3. Comparison of attention mechanism insertion positions
(Table values appear as an image in the original publication and are not reproduced here.)
The experimental results show that introducing the attention mechanism at a suitable position in the multi-scale CT feature fusion modules effectively improves model performance and prediction accuracy; the best effect is obtained when the attention mechanism is added to both multi-scale CT feature fusion module A and multi-scale CT feature fusion module C.
To compare the roles of different attention modules in the CT feature extraction network, comparative experiments were conducted with the SE, CBAM, ECA, scSE and CBAM-ICA attention modules. Each attention module was added after the multi-scale Concatenate in multi-scale CT feature fusion modules A and C of the CT feature extraction network; the results are shown in Table 4.
Table 4. Attention module comparison
(Table values appear as an image in the original publication and are not reproduced here.)
As can be seen from the table, the CBAM-ICA module improves model performance best among the compared attention modules, while its parameter count does not increase appreciably and equals that of the CBAM attention module.
There are two ways to combine the attention mechanism with the residual module. In structure a, after the multi-scale feature fusion is completed, attention weights are added to the multi-scale features, which are then residual-connected with the original input features. In structure b, after the multi-scale feature fusion is completed, the multi-scale features are first residual-connected with the original input features, and the attention mechanism is added afterwards. To compare the merits of the two module structures, comparative experiments were performed on structures a and b; the results are shown in Table 5.
Table 5. Comparison of residual + CBAM-ICA module structures
(Table values appear as an image in the original publication and are not reproduced here.)
As can be seen from the table, with the same number of parameters, the Laplace log-likelihood score of structure b is significantly better than that of structure a. The multi-scale CT feature fusion modules in the CT feature extraction network therefore combine the residual connection and the CBAM-ICA module in the manner of structure b.
To verify the CT feature extraction performance of the CT feature extraction network in the lung function progression prediction model, comparison experiments were performed in which different networks replaced the CT feature extraction network. The comparison networks were InceptionV1, InceptionV3, Inception-ResNet-V2, ResNet50, DenseNet121 and EfficientNetB0. The results are shown in Table 6.
Table 6. Comparison of CT feature extraction networks
(Table values appear as an image in the original publication and are not reproduced here.)
The table shows that the three networks based on the Inception multi-scale module achieve Laplace log-likelihood scores on the test set superior to most of the other networks, which is why the invention uses InceptionV1 as the backbone of the CT feature extraction network. Compared with the other networks, the model of the invention obtains a Laplace log-likelihood score of -6.8107 on the test set, the best result, with a small number of network parameters.
Comparison experiments were performed against several existing lung function progression prediction methods to verify the effectiveness of the lung function progression prediction model. The compared methods were FibrosisNet, Fibro-CoSANet, and a DNN + GBDT + NGBoost + ElasticNet ensemble model. The results are shown in Table 7.
Table 7. Comparison of lung function progression prediction methods
(Table values appear as an image in the original publication and are not reproduced here.)
The experimental results show that, compared with existing lung function progression prediction methods, the model of the invention obtains a better Laplace log-likelihood score and can therefore predict lung function progression more accurately.
The base model used in the ablation experiments is the lung function progression prediction model with the residual module, the dilated convolution module and the CBAM-ICA attention module removed from the CT feature extraction network. During the experiments the three modules were added back individually, and the results were compared with those of the base model. The results are shown in Table 8.
Table 8. Model ablation experiments
(Table values appear as an image in the original publication and are not reproduced here.)
The experimental results show that adding the residual module, the dilated convolution module or the CBAM-ICA attention module to the CT feature extraction network each improves the Laplace log-likelihood score to a different degree, and the prediction score is best when all three modules are added simultaneously. The lung function progression prediction model containing the residual module, the dilated convolution module and the CBAM-ICA attention module therefore predicts most accurately.
To verify the effectiveness of multi-modal data, the lung function progression prediction model based on multi-modal data of the invention was compared with model methods using only clinical text data or only lung CT image data; the results are shown in Table 9.
Table 9. Prediction performance with different modality data
(Table values appear as an image in the original publication and are not reproduced here.)
As can be seen from the experimental results, the Laplace log-likelihood score of the model using only lung CT image data is lower than that of the model using only clinical text data, but both single-modality methods score far below the multi-modal-data-based lung function progression prediction model. Compared with single-modality medical data, multi-modal data therefore effectively improve the accuracy of model prediction.
The above embodiments are not intended to limit the present invention; any implementation that meets the requirements of the present invention falls within its scope of protection.

Claims (6)

1. A method, for non-diagnostic purposes, for predicting lung function decline based on multi-modal data, comprising the following steps:
step (1), acquiring historical lung CT images and corresponding clinical text data; wherein the clinical text data comprise the lung influence factors, the week number at which the forced vital capacity was measured, the forced vital capacity FVC, and the percentage of the forced vital capacity relative to the normal standard value; the lung influence factors comprise age, sex and smoking status;
step (2), preprocessing the historical lung CT images and the corresponding clinical text data to construct a data set;
step (3), constructing a lung function progression prediction model and training it with the data set constructed in step (2);
the lung function progression prediction model comprises a CT (computed tomography) feature extraction network and a multi-modal feature prediction network; the CT feature extraction network is used for performing CT feature extraction on the preprocessed lung CT images; the multi-modal feature prediction network is used for predicting the lung function progression, taking as input the multi-modal features formed by fusing the CT features extracted by the CT feature extraction network with the clinical features, and outputting the predicted FVC values at different future week numbers;
3-1, constructing the CT feature extraction network:
the CT feature extraction network takes InceptionV1 as its backbone and comprises a front-end down-sampling module and a multi-scale CT feature fusion module;
the multi-scale CT feature fusion module comprises, cascaded in sequence, n1 serially connected multi-scale CT feature fusion modules A, a first max pooling layer, n2 serially connected multi-scale CT feature fusion modules B, a second max pooling layer, n3 serially connected multi-scale CT feature fusion modules C, an average pooling layer, a global average pooling layer, a first fully connected layer and a second fully connected layer, wherein n1 ≥ 1, n2 ≥ 1 and n3 ≥ 1;
the multi-scale CT feature fusion module A comprises 5 parallel branches, a feature fusion layer, a residual connection layer, an improved CBAM channel attention module and a 1×1 convolution dimension-raising layer; among the 5 parallel branches, the 1st branch comprises a 1×1 convolution layer; the 2nd branch comprises a 1×1 convolution layer, a 3×3 convolution layer, a dilated convolution layer with dilation rate 2 and a feature-map addition layer, wherein the output of the 1×1 convolution layer is connected to the input of the 3×3 convolution layer and the input of the dilated convolution layer, and the outputs of the 3×3 convolution layer and the dilated convolution layer are connected to the feature-map addition layer; the 3rd branch comprises a 1×1 convolution layer, a 5×5 convolution layer, a dilated convolution layer with dilation rate 2 and a feature-map addition layer, wherein the output of the 1×1 convolution layer is connected to the input of the 5×5 convolution layer and the input of the dilated convolution layer, and the outputs of the 5×5 convolution layer and the dilated convolution layer are connected to the feature-map addition layer; the 4th branch comprises an average pooling layer and a 1×1 convolution layer cascaded in sequence; the output feature maps of the 1st to 4th branches are fused by Concatenate in the feature fusion layer to form the multi-scale CT feature; the residual connection layer adds the multi-scale CT feature to the original input feature map carried by the 5th branch; the improved CBAM channel attention module receives the features processed by the residual connection layer and adds attention weights to the multi-scale CT features; the 1×1 convolution dimension-raising layer performs cross-channel feature fusion on the multi-scale CT features output by the improved CBAM channel attention module;
the multi-scale CT feature fusion module B comprises 5 parallel branches, a feature fusion layer, a residual connection layer and a 1×1 convolution dimension-raising layer; the 1st, 2nd and 4th of the 5 parallel branches have the same structure as in multi-scale CT feature fusion module A; the 3rd branch comprises a 1×1 convolution layer, a first 3×3 convolution layer, a second 3×3 convolution layer, a dilated convolution layer with dilation rate 2 and a feature-map addition layer, wherein the output of the 1×1 convolution layer is connected to the input of the first 3×3 convolution layer, the first input of the second 3×3 convolution layer and the input of the dilated convolution layer, the output of the first 3×3 convolution layer is connected to the second input of the second 3×3 convolution layer, and the outputs of the second 3×3 convolution layer and the dilated convolution layer are connected to the feature-map addition layer; the output feature maps of the 1st to 4th branches are fused by Concatenate in the feature fusion layer to form the multi-scale CT feature; the residual connection layer adds the multi-scale CT feature to the original input feature map carried by the 5th branch; the 1×1 convolution dimension-raising layer performs cross-channel feature fusion on the features processed by the residual connection layer;
the multi-scale CT feature fusion module C comprises 5 parallel branches, a feature fusion layer, a residual connection layer, an improved CBAM channel attention module and a 1×1 convolution dimension-raising layer; the 1st of the 5 parallel branches has the same structure as in the multi-scale CT feature fusion module A; the 2nd branch comprises a 1×1 convolution layer, a 1×3 convolution layer, a 3×1 convolution layer, a dilated convolution layer with a dilation rate of 2 and a feature map addition layer, wherein the output of the 1×1 convolution layer is connected to the inputs of the 1×3 convolution layer and the dilated convolution layer, the output of the 1×3 convolution layer is connected to the input of the 3×1 convolution layer, and the outputs of the 3×1 convolution layer and the dilated convolution layer are connected to the feature map addition layer; the 3rd branch comprises a 1×1 convolution layer, a 1×3 convolution layer A, a 3×1 convolution layer A, a dilated convolution layer A with a dilation rate of 2, a feature map addition layer A, a 1×3 convolution layer B, a 3×1 convolution layer B, a dilated convolution layer B with a dilation rate of 2 and a feature map addition layer B, wherein the output of the 1×1 convolution layer is connected to the inputs of the 1×3 convolution layer A and the dilated convolution layer A; the output of the 1×3 convolution layer A is connected to the input of the 3×1 convolution layer A; the outputs of the 3×1 convolution layer A and the dilated convolution layer A are connected to the inputs of the feature map addition layer A; the output of the feature map addition layer A is connected to the inputs of the 1×3 convolution layer B and the dilated convolution layer B; the output of the 1×3 convolution layer B is connected to the input of the 3×1 convolution layer B; and the outputs of the 3×1 convolution layer B and the dilated convolution layer B are connected to the inputs of the feature map addition layer B; the 4th branch comprises an average pooling layer, a max pooling layer, a feature map addition layer and a 1×1 convolution layer, wherein the outputs of the average pooling layer and the max pooling layer are connected to the input of the feature map addition layer, and the output of the feature map addition layer is connected to the input of the 1×1 convolution layer;
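The factorized-convolution pattern that module C's 2nd and 3rd branches repeat could look like the sketch below (one stage shown; the 3rd branch chains two such stages; channel count and paddings are assumptions):

```python
import torch.nn as nn

class FactorizedStage(nn.Module):
    # One 1x3 -> 3x1 chain summed with a parallel dilated 3x3 (rate 2),
    # as in module C's 2nd branch after its 1x1 reduction.
    def __init__(self, c):
        super().__init__()
        self.conv13 = nn.Conv2d(c, c, (1, 3), padding=(0, 1))
        self.conv31 = nn.Conv2d(c, c, (3, 1), padding=(1, 0))
        self.dil = nn.Conv2d(c, c, 3, padding=2, dilation=2)

    def forward(self, t):
        return self.conv31(self.conv13(t)) + self.dil(t)  # feature map addition layer
```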
the improved CBAM channel attention module performs global average pooling and global max pooling on the input feature map respectively, then generates two different channel attention feature maps through two 1×1 convolutional layers and a Sigmoid activation function, multiplies the two channel attention feature maps together to form the final attention weight, and multiplies this weight element-wise with the input feature map F to obtain the final output feature F';
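A minimal sketch of this improved channel attention in PyTorch; whether the two 1×1 convolutions are shared across the two pooling paths, the reduction ratio, and the intermediate ReLU are assumptions here (see also formula 1 in claim 5 below):

```python
import torch
import torch.nn as nn

class ImprovedCBAMChannelAttention(nn.Module):
    # F' = (sigmoid(C(avgpool(F))) * sigmoid(C(maxpool(F)))) ⊙ F   (cf. formula 1)
    def __init__(self, channels, reduction=16):
        super().__init__()
        # C: two 1x1 convolutional layers (assumed shared by both pooling paths)
        self.c = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),  # assumed non-linearity between the two 1x1 convs
            nn.Conv2d(channels // reduction, channels, 1),
        )

    def forward(self, f):
        w_avg = torch.sigmoid(self.c(f.mean(dim=(2, 3), keepdim=True)))  # ε(C(P_ag(F)))
        w_max = torch.sigmoid(self.c(f.amax(dim=(2, 3), keepdim=True)))  # ε(C(P_mx(F)))
        return (w_avg * w_max) * f  # multiply the two weights, then apply element-wise to F
```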
3-2, building the multi-modal feature prediction network:
The multi-modal feature prediction network comprises a first multi-modal feature module and a second multi-modal feature module;
the first multi-modal feature module takes the CT features extracted by the CT feature extraction network and the preprocessed lung influence factors as input and the linear change rate of the FVC as output, and comprises a Concatenate feature fusion layer A and a fully connected layer cascaded in sequence; the Concatenate feature fusion layer A concatenates the CT features extracted by the CT feature extraction network with the lung influence factors among the preprocessed clinical features to obtain the first multi-modal feature; the fully connected layer predicts the linear change rate of the FVC;
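A minimal sketch of the first multi-modal feature module; the feature dimensions are assumptions (three lung influence factors per claim 6: age, gender, smoking status):

```python
import torch
import torch.nn as nn

class FirstMultiModalModule(nn.Module):
    # Concatenate CT features with lung influence factors, then regress the FVC slope.
    def __init__(self, ct_dim=512, factor_dim=3):
        super().__init__()
        self.fc = nn.Linear(ct_dim + factor_dim, 1)  # fully connected layer

    def forward(self, ct_feat, lung_factors):
        fused = torch.cat([ct_feat, lung_factors], dim=1)  # Concatenate feature fusion layer A
        return self.fc(fused)  # linear change rate of the FVC
```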
the second multi-modal feature module takes the FVC linear change rate output by the first multi-modal feature module together with all clinical features as input and the predicted FVC values at different future weeks as output, and comprises an attention module and a multilayer perceptron (MLP) cascaded in sequence;
the calculation process of the attention module is represented by formula 2:
F_wx = ε(M(F_x)) ⊙ F_x (2)
in the formula, F_x represents the FVC linear change rate output by the first multi-modal feature module together with all clinical features, M represents two fully connected layers, ε represents the Sigmoid activation function, and F_wx represents the output feature after the attention module;
the multilayer perceptron comprises a first fully connected layer, an ELU activation function layer, a second fully connected layer, a GELU activation function layer and a third fully connected layer cascaded in sequence, taking the output feature map of the attention module as input and the FVC value as output;
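A sketch of the second multi-modal feature module, combining the formula-2 attention with the claimed MLP; the hidden widths are assumptions:

```python
import torch
import torch.nn as nn

class SecondMultiModalModule(nn.Module):
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.m = nn.Sequential(nn.Linear(in_dim, in_dim),
                               nn.Linear(in_dim, in_dim))   # M: two fully connected layers
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ELU(),            # first FC + ELU
            nn.Linear(hidden, hidden), nn.GELU(),           # second FC + GELU
            nn.Linear(hidden, 1),                           # third FC -> FVC value
        )

    def forward(self, f_x):
        f_wx = torch.sigmoid(self.m(f_x)) * f_x             # formula 2: F_wx = ε(M(F_x)) ⊙ F_x
        return self.mlp(f_wx)
```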
and (4) using the trained lung function progression prediction model to predict the progression of lung function.
2. The method for predicting lung function decline based on multi-modal data for non-diagnostic purposes as claimed in claim 1, wherein step (2) specifically comprises:
2-1, preprocessing the lung CT image:
removing DICOM medical image files that cannot be opened and valueless CT image data that contain no lung information, and resizing the images to a uniform size;
2-2, preprocessing clinical text data:
removing incomplete and erroneous data from the clinical text data, performing feature engineering on the clinical text data, and applying Min-Max normalization to obtain the required clinical features;
and 2-3, fitting the FVC values in the dataset against their corresponding weeks by the least squares method to obtain the linear change rate of the FVC, which serves as one label of the training set.
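The least-squares slope of step 2-3 can be computed directly, e.g. with NumPy; the week/FVC numbers below are made up for illustration:

```python
import numpy as np

def fvc_linear_change_rate(weeks, fvc_values):
    """Least-squares slope of FVC against week, used as a training label."""
    slope, _intercept = np.polyfit(np.asarray(weeks, dtype=float),
                                   np.asarray(fvc_values, dtype=float), deg=1)
    return slope

print(fvc_linear_change_rate([0, 8, 16, 24], [2800, 2750, 2690, 2655]))  # ml per week
```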
3. The method according to claim 1, wherein the front-end down-sampling module in the CT feature extraction network comprises, cascaded in sequence, three serially connected 3×3 convolutional layers, a max-pooling layer, a 1×1 convolutional layer, two serially connected 3×3 convolutional layers, and a max-pooling layer.
4. The method according to claim 1, wherein the multi-scale CT feature fusion module in the CT feature extraction network comprises, cascaded in sequence, 2 serially connected multi-scale CT feature fusion modules A, a first max-pooling layer, 2 serially connected multi-scale CT feature fusion modules B, a second max-pooling layer, 2 serially connected multi-scale CT feature fusion modules C, an average pooling layer, a global average pooling layer, a first fully connected layer, and a second fully connected layer.
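Putting claims 3 and 4 together, the CT feature extraction backbone could be assembled roughly as below; the channel counts, input channels (grayscale CT assumed), strides and FC widths are assumptions, and MultiScaleFusionA from the earlier sketch stands in for modules B and C, which differ only in their branch internals:

```python
import torch.nn as nn

def build_ct_feature_extractor():
    frontend = nn.Sequential(                                  # claim 3: front-end down-sampling
        nn.Conv2d(1, 32, 3, padding=1), nn.Conv2d(32, 32, 3, padding=1),
        nn.Conv2d(32, 32, 3, padding=1),                       # three serial 3x3 convs
        nn.MaxPool2d(2),
        nn.Conv2d(32, 64, 1),                                  # 1x1 conv
        nn.Conv2d(64, 64, 3, padding=1), nn.Conv2d(64, 64, 3, padding=1),
        nn.MaxPool2d(2),
    )
    fusion = nn.Sequential(                                    # claim 4: fusion-module stack
        MultiScaleFusionA(64, 64), MultiScaleFusionA(64, 64),  # 2x module A
        nn.MaxPool2d(2),                                       # first max-pooling layer
        MultiScaleFusionA(64, 64), MultiScaleFusionA(64, 64),  # stand-in for 2x module B
        nn.MaxPool2d(2),                                       # second max-pooling layer
        MultiScaleFusionA(64, 64), MultiScaleFusionA(64, 64),  # stand-in for 2x module C
        nn.AvgPool2d(2),                                       # average pooling layer
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),                 # global average pooling
        nn.Linear(64, 128), nn.Linear(128, 64),                # two fully connected layers
    )
    return nn.Sequential(frontend, fusion)
```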
5. The method of claim 1, wherein the improved CBAM channel attention module in the CT feature extraction network is computed as in formula 1:
F' = (ε(C(P_ag(F))) × ε(C(P_mx(F)))) ⊙ F (1)
wherein F represents the input feature map, P_ag represents global average pooling, P_mx represents global max pooling, C represents two 1×1 convolutional layers, ε represents the Sigmoid activation function, and F' represents the output feature after the improved CBAM channel attention module.
6. A lung function decline prediction apparatus based on multi-modal data, comprising:
the lung data acquisition module is used for acquiring lung CT images and the corresponding lung influence factors, wherein the lung influence factors include age, gender and smoking status;
the lung data preprocessing module is used for preprocessing the lung CT image and the lung influence factors;
and the lung function progression prediction module is used for processing the preprocessed lung CT images and lung influence factors with the trained lung function progression prediction model to obtain the predicted FVC values at different future weeks.
CN202110988100.XA 2021-08-26 2021-08-26 Device and method for predicting lung function decline Active CN113571190B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110988100.XA CN113571190B (en) 2021-08-26 2021-08-26 Device and method for predicting lung function decline

Publications (2)

Publication Number Publication Date
CN113571190A true CN113571190A (en) 2021-10-29
CN113571190B CN113571190B (en) 2023-09-19

Family

ID=78172782

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110988100.XA Active CN113571190B (en) 2021-08-26 2021-08-26 Device and method for predicting lung function decline

Country Status (1)

Country Link
CN (1) CN113571190B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116779170A (en) * 2023-08-24 2023-09-19 济南市人民医院 Pulmonary function attenuation prediction system and device based on self-adaptive deep learning
CN117454235A (en) * 2023-02-20 2024-01-26 宁夏隆基宁光仪表股份有限公司 Multi-input distributed photovoltaic arc fault diagnosis method, system and device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112447304A (en) * 2020-11-25 2021-03-05 深圳市华嘉生物智能科技有限公司 Visual inspection method and device for judging development of infectious diseases
CN112530578A (en) * 2020-12-02 2021-03-19 中国科学院大学宁波华美医院 Viral pneumonia intelligent diagnosis system based on multi-mode information fusion
CN112786189A (en) * 2021-01-05 2021-05-11 重庆邮电大学 Intelligent diagnosis system for new coronary pneumonia based on deep learning

Also Published As

Publication number Publication date
CN113571190B (en) 2023-09-19

Similar Documents

Publication Publication Date Title
CN113571190B (en) Device and method for predicting lung function decline
CN107480702B (en) Feature selection and feature fusion method for HCC pathological image recognition
CN109389171B (en) Medical image classification method based on multi-granularity convolution noise reduction automatic encoder technology
CN109994201B (en) Diabetes and hypertension probability calculation system based on deep learning
CN112085157B (en) Disease prediction method and device based on neural network and tree model
Huang et al. End-to-end continuous emotion recognition from video using 3D ConvLSTM networks
Phankokkruad COVID-19 pneumonia detection in chest X-ray images using transfer learning of convolutional neural networks
CN113012811A (en) Traditional Chinese medicine syndrome diagnosis and health evaluation method combining deep convolutional network and graph neural network
CN112489769A (en) Intelligent traditional Chinese medicine diagnosis and medicine recommendation system for chronic diseases based on deep neural network
CN115937604A (en) anti-NMDAR encephalitis prognosis classification method based on multi-modal feature fusion
CN116013449B (en) Auxiliary prediction method for cardiomyopathy prognosis by fusing clinical information and magnetic resonance image
CN113610118B (en) Glaucoma diagnosis method, device, equipment and method based on multitasking course learning
Do et al. Affective expression analysis in-the-wild using multi-task temporal statistical deep learning model
CN116680105A (en) Time sequence abnormality detection method based on neighborhood information fusion attention mechanism
Venu An ensemble-based approach by fine-tuning the deep transfer learning models to classify pneumonia from chest X-ray images
CN115290326A (en) Rolling bearing fault intelligent diagnosis method
Liu et al. Audio and video bimodal emotion recognition in social networks based on improved alexnet network and attention mechanism
Nafea et al. A Deep Learning Algorithm for Lung Cancer Detection Using EfficientNet-B3
CN116778158B (en) Multi-tissue composition image segmentation method and system based on improved U-shaped network
CN111582287B (en) Image description method based on sufficient visual information and text information
Haddada et al. Comparative study of deep learning architectures for early alzheimer detection
Dadgar et al. A hybrid method of feature selection and neural network with genetic algorithm to predict diabetes
CN115618751A (en) Steel plate mechanical property prediction method
CN115547502A (en) Hemodialysis patient risk prediction device based on time sequence data
CN115170885A (en) Brain tumor classification detection method and system based on feature pyramid network structure and channel attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant