CN116468732A - Lung CT image segmentation method and imaging method based on deep learning - Google Patents
- Publication number
- CN116468732A (application CN202310219966.3A)
- Authority
- CN
- China
- Prior art keywords
- layer
- lung
- module
- convolution
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/003—Reconstruction from projections, e.g. tomography
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30061—Lung
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2210/00—Indexing scheme for image generation or computer graphics
- G06T2210/41—Medical
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a lung CT image segmentation method based on deep learning, which comprises: acquiring an existing lung CT image data set and preprocessing it to obtain a training data set; constructing an original lung CT image segmentation model and training it to obtain a lung CT image segmentation model; and segmenting the actual lung CT image with the trained lung CT image segmentation model. The invention also discloses an imaging method comprising the lung CT image segmentation method based on deep learning. The invention achieves more accurate image segmentation results, accelerates convergence, retains strong segmentation performance on small-scale data sets, and avoids feature collapse while generating diversified features; the invention is therefore highly reliable, accurate, objective and scientifically sound.
Description
Technical Field
The invention belongs to the field of medical image processing, and particularly relates to a lung CT image segmentation method and an imaging method based on deep learning.
Background
With economic and technological development and rising living standards, people pay ever more attention to health; medical image processing techniques are therefore becoming increasingly important.
In clinical practice and experimental research, lung CT images play an important auxiliary role: they help clinicians diagnose lung diseases and help researchers carry out follow-up studies. Segmentation of lung CT images is therefore particularly important.
At present, lung CT image segmentation methods commonly use a deep learning model to learn feature representations autonomously and complete the segmentation task with the learned high-dimensional abstract features; for example, image segmentation methods based on the Transformer are now commonly used to segment lung CT images.
The Transformer was originally applied to natural language processing, where it met with significant success; since the advent of ViT, Transformers have achieved competitive performance on a range of computer vision tasks. As research has continued, however, some shortcomings of the Transformer have gradually been revealed. Most importantly, a Transformer typically requires training on a large-scale data set to perform well, whereas medical image data sets are typically small and lung CT image data sets are smaller still. Consequently, existing lung CT image segmentation methods based on the Transformer have relatively poor accuracy and low reliability.
Disclosure of Invention
One object of the invention is to provide a lung CT image segmentation method based on deep learning that is highly reliable, accurate, objective and scientifically sound.
A second object of the invention is to provide an imaging method comprising the lung CT image segmentation method based on deep learning.
The lung CT image segmentation method based on deep learning provided by the invention comprises the following steps:
S1, acquiring an existing lung CT image data set;
S2, preprocessing the data acquired in step S1 to obtain a training data set;
S3, constructing an original lung CT image segmentation model based on a convolutional encoder, a Transformer encoder, a fusion module and a corresponding decoder;
S4, training the original lung CT image segmentation model constructed in step S3 with the training data set obtained in step S2 to obtain a lung CT image segmentation model;
S5, segmenting the actual lung CT image with the lung CT image segmentation model obtained in step S4.
The preprocessing in step S2 of the data acquired in step S1 specifically comprises the following:
the preprocessing comprises image cropping, resampling and normalization;
the image cropping sets a pixel threshold and crops the image to the bounding box of the required target;
a resampling operation is then performed on the obtained bounding box;
after resampling is completed, the image is normalized using the Z-score:
$$z_i = \frac{x_i - \mu}{\sigma}$$
where $z_i$ is the normalized gray value; $x_i$ is the original gray value; $\mu$ is the mean gray value of the image; and $\sigma$ is the standard deviation of the image gray values.
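The cropping and normalization steps can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function names, the threshold value and the small `eps` added to the denominator are assumptions made for illustration, and the resampling step is omitted.

```python
import numpy as np

def crop_to_foreground(volume, threshold):
    """Crop a volume to the bounding box of the voxels above `threshold`."""
    mask = volume > threshold
    if not mask.any():
        return volume  # nothing above the threshold: leave the volume unchanged
    coords = np.argwhere(mask)
    lower = coords.min(axis=0)
    upper = coords.max(axis=0) + 1  # exclusive upper corner of the box
    return volume[tuple(slice(a, b) for a, b in zip(lower, upper))]

def z_score(volume, eps=1e-8):
    """z_i = (x_i - mu) / sigma; eps (an assumption) guards against sigma == 0."""
    mu = volume.mean()
    sigma = volume.std()
    return (volume - mu) / (sigma + eps)

# Toy volume: a small bright block inside an empty background.
volume = np.zeros((4, 6, 6), dtype=np.float32)
volume[1:3, 2:5, 1:4] = np.array([100.0, 200.0, 300.0], dtype=np.float32)

cropped = crop_to_foreground(volume, threshold=0.0)
normalized = z_score(cropped)
```

After normalization the cropped volume has approximately zero mean and unit standard deviation.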
Step S3, constructing the original lung CT image segmentation model based on a convolutional encoder, a Transformer encoder, a fusion module and a corresponding decoder, specifically comprises the following steps:
the acquired image data is input into the convolutional encoder module to learn local features, and into the Transformer encoder module to learn global features;
the fusion module performs feature fusion on the obtained local features and global features to obtain the output of the encoder;
a decoder is constructed to correspond to the convolutional encoder module, and the obtained features are input into the decoder for decoding;
the features output by the decoder are processed with a softmax function to obtain the final probabilities of the different classes.
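The final softmax step can be sketched as follows, assuming the decoder output is laid out as (classes, depth, height, width). The layout, the random logits and the argmax that turns probabilities into a label map are illustrative assumptions, not details given in the text.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax along the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

# Hypothetical decoder output: 2 classes over a 4x4x4 volume.
logits = np.random.default_rng(0).normal(size=(2, 4, 4, 4))
probs = softmax(logits, axis=0)   # per-voxel class probabilities
labels = probs.argmax(axis=0)     # assumed post-processing: most probable class per voxel
```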
The convolutional encoder module consists of a plurality of convolution modules, each comprising two Conv-IN-LeakyReLU modules;
each Conv-IN-LeakyReLU module comprises a convolution layer, an InstanceNorm layer and a LeakyReLU layer connected in series;
the processing function of the convolutional encoder module is $Y_o = \mathrm{LeakyReLU}(\mathrm{IN}(\mathrm{Conv}(X)))$, where $Y_o$ is the output feature of the encoder module, $X$ is the input feature, $\mathrm{Conv}()$ is the convolution layer processing function, $\mathrm{IN}()$ is the InstanceNorm layer processing function, and $\mathrm{LeakyReLU}()$ is the LeakyReLU layer processing function;
the convolution layer is a three-dimensional convolution layer;
the processing formula of the InstanceNorm layer is $Y = \frac{X - \mu}{\sqrt{\sigma^2 + \epsilon}} \cdot \gamma + \beta$, where $Y$ is the processing result feature, $X$ is the input feature, $\mu$ is the mean of the input feature, $\sigma^2$ is the variance of the input feature, $\epsilon$ is an error coefficient, $\gamma$ is a first learnable parameter, and $\beta$ is a second learnable parameter;
the processing formula of the LeakyReLU layer is $y_i = \begin{cases} x_i, & x_i \ge 0 \\ a\,x_i, & x_i < 0 \end{cases}$, where $y_i$ is the feature after the activation function, $x_i$ is the $i$-th value in the feature map, and $a$ is a fixed constant;
the first Conv-IN-LeakyReLU module in each convolution module performs the downsampling operation, with a convolution kernel size of 3 and a stride of 2.
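As a rough illustration of the normalization and activation in a Conv-IN-LeakyReLU module, the sketch below applies instance normalization and LeakyReLU to a single (C, D, H, W) feature map and computes the spatial size produced by a kernel-3, stride-2 convolution. The slope `a=0.01`, `eps=1e-5` and `padding=1` are assumed values, the LeakyReLU is taken in its common form (negative inputs multiplied by `a`), and the convolution itself is not implemented.

```python
import numpy as np

def instance_norm(x, gamma, beta, eps=1e-5):
    """Normalize each channel of x (C, D, H, W) over its spatial axes."""
    mu = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    g = gamma.reshape(-1, 1, 1, 1)
    b = beta.reshape(-1, 1, 1, 1)
    return (x - mu) / np.sqrt(var + eps) * g + b

def leaky_relu(x, a=0.01):
    """Identity for x >= 0, slope a for x < 0 (a is an assumed value)."""
    return np.where(x >= 0, x, a * x)

def conv_output_size(size, kernel=3, stride=2, padding=1):
    """Spatial size after a strided convolution (padding=1 is an assumption)."""
    return (size + 2 * padding - kernel) // stride + 1

x = np.random.default_rng(0).normal(size=(2, 4, 4, 4))
normed = instance_norm(x, gamma=np.ones(2), beta=np.zeros(2))
activated = leaky_relu(normed)
halved = conv_output_size(64)  # kernel 3, stride 2, padding 1
```

Under these assumptions a 64-voxel axis is reduced to 32 voxels, which is the halving expected of a downsampling stage.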
The Transformer encoder module specifically comprises a patch embedding module, a plurality of consecutive Transformer blocks and a plurality of downsampling modules; a downsampling module is connected between every two adjacent Transformer blocks;
the downsampling module uses trilinear interpolation; after interpolation, a convolution with a kernel size of 1 adjusts the number of channels of the feature map so that the number of channels is doubled;
the patch embedding module divides the image into a series of non-overlapping patches and projects each patch into a high-dimensional space;
each Transformer block comprises a first normalization layer, a ShiftDW layer, a second normalization layer and a multi-layer perceptron layer; the input features are normalized by the first normalization layer and then processed by the ShiftDW layer; the output of the ShiftDW layer is combined with the input features through a shortcut connection to obtain the second input features; the second input features are processed by the second normalization layer and then by the multi-layer perceptron layer; the output of the multi-layer perceptron layer is combined with the second input features through a shortcut connection and serves as the output feature of the current Transformer block;
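the calculation formula of the Transformer block is as follows:
$$\hat{z}_l = \mathrm{ShiftDW}(\mathrm{BN}(z_{l-1})) + z_{l-1}$$
$$z_l = \mathrm{MLP}(\mathrm{BN}(\hat{z}_l)) + \hat{z}_l$$
where $\hat{z}_l$ is the output of the ShiftDW module of layer $l$; $\mathrm{ShiftDW}()$ is the ShiftDW layer processing function; $z_l$ is the output of the layer-$l$ Transformer block; $\mathrm{BN}()$ is the processing function of the first and second normalization layers; $\mathrm{MLP}()$ is the processing function of the multi-layer perceptron layer, and the multi-layer perceptron has 2 layers; the formula of $\mathrm{MLP}()$ is:
$$\mathrm{MLP}(x) = \mathrm{Linear}_2(\mathrm{GELU}(\mathrm{Linear}_1(x)))$$
where $\mathrm{MLP}$ is the processing function; $x$ is the input feature; $\mathrm{Linear}_1$ and $\mathrm{Linear}_2$ are fully connected modules; $\mathrm{GELU}()$ is the processing function of the Gaussian error linear unit, with $\mathrm{GELU}(x) = x \cdot \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)\right]$, where $x$ is the input feature and $\mathrm{erf}()$ is the intermediate processing function $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$.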
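The data flow of one Transformer block (normalize, mix tokens, add shortcut, normalize, MLP, add shortcut) can be sketched as below. This is only an illustration under stated assumptions: features are flattened to (tokens, channels), `normalize` is a stand-in for the normalization layer without learned parameters, the hidden width of 16 is arbitrary, and `token_mixer` is a placeholder for the ShiftDW layer (an identity function in the example).

```python
import math
import numpy as np

def gelu(x):
    """GELU(x) = x * 0.5 * (1 + erf(x / sqrt(2)))."""
    erf = np.vectorize(math.erf)
    return x * 0.5 * (1.0 + erf(x / math.sqrt(2.0)))

def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron: Linear_2(GELU(Linear_1(x)))."""
    return gelu(x @ w1 + b1) @ w2 + b2

def normalize(x, eps=1e-5):
    """Stand-in for BN: per-channel standardization over tokens, no learned affine."""
    mu = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def transformer_block(z_prev, token_mixer, w1, b1, w2, b2):
    """z_hat = mixer(BN(z_prev)) + z_prev;  z = MLP(BN(z_hat)) + z_hat."""
    z_hat = token_mixer(normalize(z_prev)) + z_prev
    return mlp(normalize(z_hat), w1, b1, w2, b2) + z_hat

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 4))            # 8 tokens, 4 channels
w1 = rng.normal(size=(4, 16)) * 0.1    # hidden width 16 is an assumption
b1 = np.zeros(16)
w2 = rng.normal(size=(16, 4)) * 0.1
b2 = np.zeros(4)

identity_mixer = lambda t: t           # placeholder for the ShiftDW layer
out = transformer_block(z, identity_mixer, w1, b1, w2, b2)
```

Because both sub-layers are wrapped in shortcuts, the block preserves the input shape and reduces to `normalize(z) + z` when the MLP weights are zero and the mixer is the identity.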
The ShiftDW layer replaces the multi-head self-attention module of the Transformer block; the ShiftDW layer specifically comprises a shift operation, a first pointwise convolution operation, a depthwise convolution operation and a second pointwise convolution operation connected in series; the calculation formula of the ShiftDW layer is:
$$\mathrm{ShiftDW}(z) = \mathrm{PWConv}_2(\mathrm{DWConv}(\mathrm{PWConv}_1(\mathrm{Shift}(z))))$$
where $\mathrm{ShiftDW}(z)$ is the output feature of the ShiftDW layer; $z$ is the input feature; $\mathrm{PWConv}_1()$ is the first pointwise convolution operation function, with a convolution kernel size of 1; $\mathrm{DWConv}()$ is the depthwise convolution operation function; $\mathrm{PWConv}_2()$ is the second pointwise convolution operation function, with a convolution kernel size of 1; $\mathrm{Shift}()$ is the shift operation function.
In the calculation formula of the shift operation, $\hat{z}$ denotes the features after the shift operation; $z[\cdot]$ is the input feature; $D$ is the spatial depth; $H$ is the space width; $W$ is the space height; $\gamma$ is a scale factor; and $C$ is the number of channels.
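The exact shift formula is not recoverable from the text above, so the following NumPy sketch is an assumption rather than the patented operation. It follows the common partial-shift pattern: six groups of $\gamma C$ channels are each moved by one voxel along one of the six spatial directions with zero filling, and the remaining channels are left untouched. The function name `shift3d`, the group ordering and the zero filling are all hypothetical.

```python
import numpy as np

def shift3d(z, gamma=0.125):
    """Shift six channel groups of z (C, D, H, W) by one voxel each; assumed scheme."""
    channels = z.shape[0]
    group = int(channels * gamma)
    if 6 * group > channels:
        raise ValueError("gamma must satisfy 6 * gamma * C <= C")
    out = z.copy()
    # (axis, step): +1 moves content toward higher indices, -1 toward lower ones.
    directions = [(1, 1), (1, -1), (2, 1), (2, -1), (3, 1), (3, -1)]
    for index, (axis, step) in enumerate(directions):
        ch = slice(index * group, (index + 1) * group)
        block = z[ch]
        shifted = np.zeros_like(block)  # vacated positions are zero-filled
        src = [slice(None)] * 4
        dst = [slice(None)] * 4
        if step == 1:
            dst[axis], src[axis] = slice(1, None), slice(None, -1)
        else:
            dst[axis], src[axis] = slice(None, -1), slice(1, None)
        shifted[tuple(dst)] = block[tuple(src)]
        out[ch] = shifted
    return out

z = np.random.default_rng(0).normal(size=(16, 3, 3, 3))
z_shifted = shift3d(z, gamma=0.125)  # 2 channels per direction, 4 channels unchanged
```

The shift has no parameters and no multiplications, which is consistent with the stated aim of replacing self-attention with a cheaper operation; the subsequent pointwise and depthwise convolutions are not shown.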
Performing feature fusion on the obtained local features and global features with the fusion module to obtain the output of the encoder specifically comprises the following:
the fusion module comprises a global attention module and a local attention module;
the global attention module obtains the weights of the different channels by global average pooling and then uses pointwise convolution as a channel context aggregator; the local attention module uses pointwise convolution as a context aggregator while keeping the same resolution as the original feature map, so that more spatial information is retained;
the calculation formula of the fusion module is as follows:
$$z_a = \mathrm{Cat}(z_c, z_t)$$
$$z = \mathrm{Conv}(z_a)$$
$$z_l = \mathrm{Local}(z)$$
$$z_g = \mathrm{Global}(z)$$
$$z_o = z_l + z_g$$
where $z_a$ is the feature obtained by concatenating the output features of the convolutional encoder with the output features of the Transformer encoder; $\mathrm{Cat}()$ is the processing function that concatenates along the channel dimension; $z_c$ is the output feature of the convolutional encoder; $z_t$ is the output feature of the Transformer encoder; $z$ is the input feature; $\mathrm{Conv}()$ is a convolution operation function with a convolution kernel size of $C \times 2C \times 1 \times 1 \times 1$; $z_l$ is the feature output by the local attention module; $z_g$ is the feature output by the global attention module; $z_o$ is the output feature of the fusion module;
the calculation formula of the global attention module is as follows:
$$\mathrm{Global}(z) = \mathrm{BN}_2(\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{BN}_1(\mathrm{Conv}_1(\mathrm{GAP}(z))))))$$
where $\mathrm{Global}(z)$ is the output feature of the global attention module; $z$ is the input feature; $\mathrm{GAP}()$ is the global average pooling processing function, with $\mathrm{GAP}(z) = \frac{1}{D \times H \times W} \sum_{i=1}^{D} \sum_{j=1}^{H} \sum_{k=1}^{W} z_{[:,i,j,k]}$, where $D$ is the spatial depth, $H$ is the space width, $W$ is the space height, and $z_{[:,i,j,k]}$ is the input feature; $\mathrm{Conv}_1()$ is the first convolution layer processing function, with a convolution kernel size of $\frac{C}{r} \times C \times 1 \times 1 \times 1$, where $C$ is the number of channels of the input feature and $r$ is the scaling factor; $\mathrm{BN}_1()$ is the first batch normalization function; $\mathrm{ReLU}()$ is the activation function; $\mathrm{Conv}_2()$ is the second convolution layer processing function, with a convolution kernel size of $C \times \frac{C}{r} \times 1 \times 1 \times 1$; $\mathrm{BN}_2()$ is the second batch normalization function;
the calculation formula of the local attention module is as follows:
$$\mathrm{Local}(z) = \mathrm{BN}_2(\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{BN}_1(\mathrm{Conv}_1(z)))))$$
where $\mathrm{Local}(z)$ is the feature output by the local attention module.
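The fusion equations can be traced with a small NumPy sketch. Several simplifications are assumptions made for brevity: a single sample of shape (C, D, H, W) is used, both batch normalization layers are treated as identity, pointwise convolutions have no bias, the local and global branches have separate weights, and all weights are random.

```python
import numpy as np

def pointwise_conv(x, weight):
    """1x1x1 convolution: weight (C_out, C_in) applied to x (C_in, D, H, W)."""
    return np.einsum("oc,cdhw->odhw", weight, x)

def attention_branch(x, w1, w2):
    """Conv_1 -> ReLU -> Conv_2, with both BN layers treated as identity (assumption)."""
    hidden = np.maximum(pointwise_conv(x, w1), 0.0)
    return pointwise_conv(hidden, w2)

def global_average_pool(z):
    """Average over D, H, W, keeping singleton spatial axes for broadcasting."""
    return z.mean(axis=(1, 2, 3), keepdims=True)

def fuse(z_c, z_t, params):
    z_a = np.concatenate([z_c, z_t], axis=0)      # Cat along channels: 2C
    z = pointwise_conv(z_a, params["reduce"])     # C x 2C x 1 x 1 x 1
    z_l = attention_branch(z, params["local_w1"], params["local_w2"])
    z_g = attention_branch(global_average_pool(z),
                           params["global_w1"], params["global_w2"])
    return z_l + z_g                              # z_g broadcasts over D, H, W

rng = np.random.default_rng(0)
C, r = 8, 4
params = {
    "reduce": rng.normal(size=(C, 2 * C)),
    "local_w1": rng.normal(size=(C // r, C)),
    "local_w2": rng.normal(size=(C, C // r)),
    "global_w1": rng.normal(size=(C // r, C)),
    "global_w2": rng.normal(size=(C, C // r)),
}
z_c = rng.normal(size=(C, 2, 3, 4))   # hypothetical convolutional encoder output
z_t = rng.normal(size=(C, 2, 3, 4))   # hypothetical Transformer encoder output
z_o = fuse(z_c, z_t, params)
```

The local branch keeps the full spatial resolution, while the global branch contributes one value per channel that is broadcast across the volume.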
Constructing the decoder to correspond to the convolutional encoder module comprises the following:
the decoder comprises a plurality of convolution modules, equal in number to the convolution modules of the convolutional encoder module; each convolution module of the decoder comprises two Conv-IN-LeakyReLU modules, and an upsampling module, implemented by deconvolution, is arranged between two adjacent Conv-IN-LeakyReLU modules; each stage of the decoder receives the features of the corresponding parallel encoder through a skip connection and then passes to the next stage of the decoder through upsampling.
The training in step S4 specifically comprises the following:
the output of the original lung CT image segmentation model and the ground-truth segmentation label are input into a loss function, and the original lung CT image segmentation model is optimized according to the value of the loss function;
the loss function $L$ is calculated as $L = \alpha L_{Dice} + \beta L_{CE}$, where $L_{Dice}$ is the Dice loss, $L_{CE}$ is the cross-entropy loss, $\alpha$ is the weight of the Dice loss, and $\beta$ is the weight of the cross-entropy loss;
the Dice loss $L_{Dice}$ is calculated as $L_{Dice} = 1 - \frac{2TP}{2TP + FP + FN}$, where $TP$ is the probability that the result is a true positive, $FP$ is the probability that the result is a false positive, and $FN$ is the probability that the result is a false negative;
the cross-entropy loss $L_{CE}$ is calculated as $L_{CE} = -\frac{1}{N} \sum_{i=1}^{N} y_i \log(\hat{y}_i)$, where $N$ is the number of samples, $y_i$ is the ground-truth label, and $\hat{y}_i$ is the prediction output by the model.
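A minimal sketch of the combined loss follows, assuming soft (probability-weighted) counts for TP, FP and FN, one-hot labels of shape (N, K), and equal weights `alpha = beta = 1.0`. The weights, the `eps` terms and the categorical form of the cross entropy are assumptions; the text does not specify them.

```python
import math
import numpy as np

def dice_loss(probs, onehot, eps=1e-8):
    """1 - 2TP / (2TP + FP + FN) with soft counts computed from probabilities."""
    tp = np.sum(probs * onehot)
    fp = np.sum(probs * (1.0 - onehot))
    fn = np.sum((1.0 - probs) * onehot)
    return 1.0 - 2.0 * tp / (2.0 * tp + fp + fn + eps)

def cross_entropy_loss(probs, onehot, eps=1e-8):
    """-(1/N) * sum(y * log(y_hat)) over N samples with one-hot labels."""
    n = probs.shape[0]
    return -np.sum(onehot * np.log(probs + eps)) / n

def combined_loss(probs, onehot, alpha=1.0, beta=1.0):
    """L = alpha * L_Dice + beta * L_CE (alpha and beta are assumed values)."""
    return alpha * dice_loss(probs, onehot) + beta * cross_entropy_loss(probs, onehot)

# Four voxels, two classes.
onehot = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
perfect = combined_loss(onehot.copy(), onehot)           # prediction equals the label
uniform = combined_loss(np.full((4, 2), 0.5), onehot)    # uninformative prediction
```

A perfect prediction drives both terms to approximately zero, while a uniform two-class prediction gives a Dice term of 0.5 plus a cross-entropy term of ln 2.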
The invention also provides an imaging method comprising the lung CT image segmentation method based on deep learning, which comprises the following steps:
A. acquiring an original lung CT image to be segmented;
B. segmenting the original lung CT image acquired in step A with the lung CT image segmentation method based on deep learning;
C. marking the image segmentation result obtained in step B on the original lung CT image acquired in step A and re-imaging, outputting a lung CT image carrying the segmentation result, and completing the imaging.
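Step C can be pictured as overlaying the segmentation mask on the original slice. The sketch below is one possible rendering and not the method claimed: the red tint, the 0.4 opacity, the min-max intensity scaling and the function name `overlay_mask` are all illustrative assumptions.

```python
import numpy as np

def overlay_mask(ct_slice, mask, color=(255, 0, 0), opacity=0.4):
    """Blend a tint into the masked pixels of a gray-scale slice; returns uint8 RGB."""
    lo, hi = float(ct_slice.min()), float(ct_slice.max())
    scale = hi - lo if hi > lo else 1.0
    gray = (ct_slice - lo) / scale * 255.0           # map intensities to [0, 255]
    rgb = np.stack([gray, gray, gray], axis=-1)
    tint = np.array(color, dtype=np.float64)
    rgb[mask] = (1.0 - opacity) * rgb[mask] + opacity * tint
    return rgb.astype(np.uint8)

# Toy slice with a 2x2 segmented region in the middle.
ct_slice = np.linspace(-1000.0, 400.0, 16).reshape(4, 4)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
marked = overlay_mask(ct_slice, mask)
```

Pixels outside the mask stay gray, and pixels inside it are tinted so the segmented region is visible on the original image.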
The lung CT image segmentation method and imaging method based on deep learning provided by the invention fully exploit the advantages of both convolution and the Transformer to achieve more accurate image segmentation results; the parallel encoder structure accelerates convergence while retaining the strong global modeling capability of the Transformer, so that the model also performs strongly on small-scale data sets; the fusion module effectively fuses the features from the different branches, yielding more accurate segmentation results; in the Transformer module, a shift operation replaces the original multi-head self-attention operation, which reduces computational complexity and enables faster inference, and enhanced shortcut connections avoid feature collapse and generate diversified features; the invention is therefore highly reliable, accurate, objective and scientifically sound.
Drawings
Fig. 1 is a flow chart of the segmentation method of the present invention.
Fig. 2 is a schematic diagram of the Transformer encoder module of the segmentation method of the present invention.
Fig. 3 is a schematic structural diagram of a fusion module of the segmentation method of the present invention.
Fig. 4 is a flow chart of the imaging method of the present invention.
Detailed Description
Fig. 1 is a flow chart of the segmentation method of the present invention. The lung CT image segmentation method based on deep learning provided by the invention comprises the following steps:
S1, acquiring an existing lung CT image data set;
S2, preprocessing the data acquired in step S1 to obtain a training data set; this specifically comprises the following:
the preprocessing comprises image cropping, resampling and normalization;
the image cropping sets a pixel threshold and crops the image to the bounding box of the required target;
a resampling operation is then performed on the obtained bounding box;
after resampling is completed, the image is normalized using the Z-score:
$$z_i = \frac{x_i - \mu}{\sigma}$$
where $z_i$ is the normalized gray value; $x_i$ is the original gray value; $\mu$ is the mean gray value of the image; and $\sigma$ is the standard deviation of the image gray values;
S3, constructing an original lung CT image segmentation model based on a convolutional encoder, a Transformer encoder, a fusion module and a corresponding decoder; this specifically comprises the following steps:
the acquired image data is input into the convolutional encoder module to learn local features, and into the Transformer encoder module to learn global features;
the fusion module performs feature fusion on the obtained local features and global features to obtain the output of the encoder;
a decoder is constructed to correspond to the convolutional encoder module, and the obtained features are input into the decoder for decoding;
the features output by the decoder are processed with a softmax function to obtain the final probabilities of the different classes;
in a specific implementation, the convolutional encoder module consists of a plurality of convolution modules, each comprising two Conv-IN-LeakyReLU modules;
each Conv-IN-LeakyReLU module comprises a convolution layer, an InstanceNorm layer and a LeakyReLU layer connected in series;
the processing function of the convolutional encoder module is $Y_o = \mathrm{LeakyReLU}(\mathrm{IN}(\mathrm{Conv}(X)))$, where $Y_o$ is the output feature of the encoder module, $X$ is the input feature, $\mathrm{Conv}()$ is the convolution layer processing function, $\mathrm{IN}()$ is the InstanceNorm layer processing function, and $\mathrm{LeakyReLU}()$ is the LeakyReLU layer processing function;
the convolution layer is a three-dimensional convolution layer;
the processing formula of the InstanceNorm layer is $Y = \frac{X - \mu}{\sqrt{\sigma^2 + \epsilon}} \cdot \gamma + \beta$, where $Y$ is the processing result feature, $X$ is the input feature, $\mu$ is the mean of the input feature, $\sigma^2$ is the variance of the input feature, $\epsilon$ is an error coefficient, $\gamma$ is a first learnable parameter, and $\beta$ is a second learnable parameter;
the processing formula of the LeakyReLU layer is $y_i = \begin{cases} x_i, & x_i \ge 0 \\ a\,x_i, & x_i < 0 \end{cases}$, where $y_i$ is the feature after the activation function, $x_i$ is the $i$-th value in the feature map, and $a$ is a fixed constant;
the first Conv-IN-LeakyReLU module in each convolution module performs the downsampling operation, with a convolution kernel size of 3 and a stride of 2;
the Transformer encoder module (shown in Fig. 2) specifically comprises a patch embedding module, a plurality of consecutive Transformer blocks and a plurality of downsampling modules; a downsampling module is connected between every two adjacent Transformer blocks;
the downsampling module uses trilinear interpolation; after interpolation, a convolution with a kernel size of 1 adjusts the number of channels of the feature map so that the number of channels is doubled;
the patch embedding module divides the image into a series of non-overlapping patches and projects each patch into a high-dimensional space;
each Transformer block comprises a first normalization layer, a ShiftDW layer, a second normalization layer and a multi-layer perceptron layer; the input features are normalized by the first normalization layer and then processed by the ShiftDW layer; the output of the ShiftDW layer is combined with the input features through a shortcut connection to obtain the second input features; the second input features are processed by the second normalization layer and then by the multi-layer perceptron layer; the output of the multi-layer perceptron layer is combined with the second input features through a shortcut connection and serves as the output feature of the current Transformer block;
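the calculation formula of the Transformer block is as follows:
$$\hat{z}_l = \mathrm{ShiftDW}(\mathrm{BN}(z_{l-1})) + z_{l-1}$$
$$z_l = \mathrm{MLP}(\mathrm{BN}(\hat{z}_l)) + \hat{z}_l$$
where $\hat{z}_l$ is the output of the ShiftDW module; $\mathrm{ShiftDW}()$ is the ShiftDW layer processing function; $z_l$ is the output of the layer-$l$ Transformer block; $\mathrm{BN}()$ is the processing function of the first and second normalization layers; $\mathrm{MLP}()$ is the processing function of the multi-layer perceptron layer, and the multi-layer perceptron has 2 layers; the formula of $\mathrm{MLP}()$ is:
$$\mathrm{MLP}(x) = \mathrm{Linear}_2(\mathrm{GELU}(\mathrm{Linear}_1(x)))$$
where $\mathrm{MLP}$ is the processing function; $\mathrm{Linear}_1()$ is the first fully connected operation; $x$ is the input feature; $\mathrm{Linear}_2()$ is the second fully connected operation; $\mathrm{GELU}()$ is the processing function of the Gaussian error linear unit, with $\mathrm{GELU}(x) = x \cdot \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)\right]$, where $x$ is the input feature and $\mathrm{erf}()$ is the intermediate processing function $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$;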
the ShiftDW layer replaces the multi-head self-attention module of the Transformer block; the ShiftDW layer specifically comprises a shift operation, a first pointwise convolution operation, a depthwise convolution operation and a second pointwise convolution operation connected in series; the calculation formula of the ShiftDW layer is:
$$\mathrm{ShiftDW}(z) = \mathrm{PWConv}_2(\mathrm{DWConv}(\mathrm{PWConv}_1(\mathrm{Shift}(z))))$$
where $\mathrm{ShiftDW}(z)$ is the output feature of the ShiftDW layer; $z$ is the input feature; $\mathrm{PWConv}_1()$ is the first pointwise convolution operation function, with a convolution kernel size of 1; $\mathrm{DWConv}()$ is the depthwise convolution operation function; $\mathrm{PWConv}_2()$ is the second pointwise convolution operation function, with a convolution kernel size of 1; $\mathrm{Shift}()$ is the shift operation function;
in the calculation formula of the shift operation, $\hat{z}$ denotes the features obtained after the shift operation; $z[\cdot]$ is the input feature; $D$ is the spatial depth; $H$ is the space width; $W$ is the space height; $\gamma$ is a scale factor; and $C$ is the number of channels;
the fusion that yields the output of the encoder specifically comprises the following:
the fusion module comprises a global attention module and a local attention module (as shown in Fig. 3);
the global attention module obtains the weights of the different channels by global average pooling and then uses pointwise convolution as a channel context aggregator; the local attention module uses pointwise convolution as a context aggregator while keeping the same resolution as the original feature map, so that more spatial information is retained;
the calculation formula of the fusion module is as follows:
$$z_a = \mathrm{Cat}(z_c, z_t)$$
$$z = \mathrm{Conv}(z_a)$$
$$z_l = \mathrm{Local}(z)$$
$$z_g = \mathrm{Global}(z)$$
$$z_o = z_l + z_g$$
where $z_a$ is the feature obtained by concatenating the output features of the convolutional encoder with the output features of the Transformer encoder; $\mathrm{Cat}()$ is the processing function that concatenates along the channel dimension; $z_c$ is the output feature of the convolutional encoder; $z_t$ is the output feature of the Transformer encoder; $z$ is the input feature; $\mathrm{Conv}()$ is a convolution operation function with a convolution kernel size of $C \times 2C \times 1 \times 1 \times 1$; $z_l$ is the feature output by the local attention module; $z_g$ is the feature output by the global attention module; $z_o$ is the output feature of the fusion module;
the calculation formula of the global attention module is as follows:
$$\mathrm{Global}(z) = \mathrm{BN}_2(\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{BN}_1(\mathrm{Conv}_1(\mathrm{GAP}(z))))))$$
where $\mathrm{Global}(z)$ is the output feature of the global attention module; $z$ is the input feature; $\mathrm{GAP}()$ is the global average pooling processing function, with $\mathrm{GAP}(z) = \frac{1}{D \times H \times W} \sum_{i=1}^{D} \sum_{j=1}^{H} \sum_{k=1}^{W} z_{[:,i,j,k]}$, where $D$ is the spatial depth, $H$ is the space width, $W$ is the space height, and $z_{[:,i,j,k]}$ is the input feature; $\mathrm{Conv}_1()$ is the first convolution layer processing function, with a convolution kernel size of $\frac{C}{r} \times C \times 1 \times 1 \times 1$, where $C$ is the number of channels of the input feature and $r$ is the scaling factor; $\mathrm{BN}_1()$ is the first batch normalization function; $\mathrm{ReLU}()$ is the activation function; $\mathrm{Conv}_2()$ is the second convolution layer processing function, with a convolution kernel size of $C \times \frac{C}{r} \times 1 \times 1 \times 1$; $\mathrm{BN}_2()$ is the second batch normalization function;
the calculation formula of the local attention module is as follows:
$$\mathrm{Local}(z) = \mathrm{BN}_2(\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{BN}_1(\mathrm{Conv}_1(z)))))$$
where $\mathrm{Local}(z)$ is the feature output by the local attention module;
constructing the decoder specifically comprises the following:
the decoder comprises a plurality of convolution modules, equal in number to the convolution modules of the convolutional encoder module; each convolution module of the decoder comprises two Conv-IN-LeakyReLU modules, and an upsampling module, implemented by deconvolution, is arranged between two adjacent Conv-IN-LeakyReLU modules; each stage of the decoder receives the features of the corresponding parallel encoder through a skip connection and then passes to the next stage of the decoder through upsampling;
S4. training the lung CT image segmentation original model constructed in step S3 with the training data set obtained in step S2 to obtain a lung CT image segmentation model; this specifically comprises the following steps:
inputting the output of the lung CT image segmentation original model and the ground-truth segmentation label into a loss function, and optimizing the lung CT image segmentation original model according to the value of the loss function;
the loss function L is calculated as $L=\alpha L_{Dice}+\beta L_{CE}$, where $L_{Dice}$ is the Dice loss, $L_{CE}$ is the cross-entropy loss, $\alpha$ is the weight of the Dice loss, and $\beta$ is the weight of the cross-entropy loss;
the Dice loss $L_{Dice}$ is calculated as $L_{Dice}=1-\frac{2TP}{2TP+FP+FN}$, where TP is the probability of true positives, FP is the probability of false positives, and FN is the probability of false negatives;
the cross-entropy loss $L_{CE}$ is calculated as $L_{CE}=-\frac{1}{N}\sum_{i=1}^{N}y_i\log\hat{y}_i$, where N is the number of samples, $y_i$ is the ground-truth label, and $\hat{y}_i$ is the prediction output by the model;
S5. segmenting the actual lung CT image using the lung CT image segmentation model obtained in step S4.
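The combined loss used in step S4 can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes a two-class (foreground/background) problem in which the model outputs one foreground probability per voxel, so that the cross entropy over one-hot labels reduces to the binary form; the soft counting of TP, FP and FN, the smoothing term `eps`, and the default weights `alpha = beta = 1` are assumptions, since the text does not give these values.

```python
import math


def dice_loss(probs, labels, eps=1e-6):
    """Soft Dice loss: 1 - 2TP / (2TP + FP + FN) with probabilistic counts."""
    tp = sum(p * y for p, y in zip(probs, labels))
    fp = sum(p * (1 - y) for p, y in zip(probs, labels))
    fn = sum((1 - p) * y for p, y in zip(probs, labels))
    return 1.0 - (2 * tp + eps) / (2 * tp + fp + fn + eps)


def cross_entropy_loss(probs, labels, eps=1e-12):
    """Two-class cross entropy averaged over the N voxels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def combined_loss(probs, labels, alpha=1.0, beta=1.0):
    """L = alpha * L_Dice + beta * L_CE; the default weights are assumed."""
    return alpha * dice_loss(probs, labels) + beta * cross_entropy_loss(probs, labels)
```

The Dice term is driven by region overlap and is therefore less sensitive to the large background of a CT volume, while the cross-entropy term gives a per-voxel gradient; weighting the two is a common way to balance these behaviours.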
The effectiveness of the segmentation method of the present invention is described below through experiments.
UNETR: its encoder is a hybrid architecture of convolution and Transformer, and its decoder is a convolutional architecture. In the encoder, features are extracted by a Transformer, and convolution is then used to obtain features at different scales for the skip connections.
nnFormer: it proposes a new strategy for combining a Transformer with convolution. In the encoder, it uses a three-dimensional Swin Transformer for feature extraction, but between adjacent Swin Transformer blocks it uses convolution to downsample the feature map and obtain a multi-scale feature representation. In the decoder, it also uses a three-dimensional Swin Transformer, and uses deconvolution to restore the resolution of the image.
nnUNet: it is a purely convolutional segmentation network with the same network architecture as UNet. nnUNet can adaptively adjust the depth of the model and some other model hyper-parameters based on the dataset.
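The performance of the method was evaluated in the experiments using the Dice, Precision, Sensitivity, HD95 and ASSD indices. The Dice index is commonly used to evaluate the segmentation quality of medical images; the closer its value is to 1, the more accurate the segmentation result. The calculation formula is as follows:
$$\mathrm{Dice}=\frac{2TP}{2TP+FP+FN}$$
HD95 and ASSD measure the distance between the boundary point sets of two segmentations, with smaller values representing better segmentation performance. The calculation formulas are as follows:
$$\mathrm{HD95}(G,M)=\max\left\{\underset{x\in\partial G}{\mathrm{P}_{95}}\,d(x,\partial M),\ \underset{y\in\partial M}{\mathrm{P}_{95}}\,d(y,\partial G)\right\}$$
$$\mathrm{ASSD}(G,M)=\frac{1}{|\partial G|+|\partial M|}\left(\sum_{x\in\partial G}d(x,\partial M)+\sum_{y\in\partial M}d(y,\partial G)\right)$$
where $\partial G$ and $\partial M$ represent the boundary points of G and M, respectively; $|\partial G|$ is the cardinality of the boundary of G; $d(x,\partial M)$ is the shortest distance between the point x and all points in $\partial M$; and $\mathrm{P}_{95}$ denotes the 95th percentile. The formulas of Precision and Sensitivity are as follows:
$$\mathrm{Precision}=\frac{TP}{TP+FP}$$
$$\mathrm{Sensitivity}=\frac{TP}{TP+FN}$$
where TP, FN and FP represent the probability of true positives, the probability of false negatives and the probability of false positives, respectively.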
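The evaluation indices can be illustrated with a short Python sketch. It is a simplified illustration under several assumptions that do not come from the text: masks are given as flattened binary lists, boundaries are given as explicit lists of coordinate tuples, both masks contain foreground, HD95 is taken as the larger of the two directed 95th percentiles, and the percentile uses linear interpolation. The function names are hypothetical.

```python
import math


def overlap_metrics(pred, truth):
    """Dice, Precision and Sensitivity from two flattened binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    dice = 2 * tp / (2 * tp + fp + fn)
    return dice, tp / (tp + fp), tp / (tp + fn)


def directed_distances(src, dst):
    """Shortest distance from every point in src to the point set dst."""
    return [min(math.dist(x, y) for y in dst) for x in src]


def percentile(values, q):
    """q-th percentile with linear interpolation between sorted values."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def surface_metrics(boundary_g, boundary_m):
    """HD95 and ASSD between two lists of boundary points (tuples)."""
    d_gm = directed_distances(boundary_g, boundary_m)
    d_mg = directed_distances(boundary_m, boundary_g)
    hd95 = max(percentile(d_gm, 95), percentile(d_mg, 95))
    assd = (sum(d_gm) + sum(d_mg)) / (len(d_gm) + len(d_mg))
    return hd95, assd
```

For example, `overlap_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 0])` counts TP = 2, FP = 1 and FN = 0, so Sensitivity is 1 while Precision is 2/3. The brute-force distance search is quadratic in the number of boundary points and is only meant for small examples.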
The data used in this experiment is a lung cancer dataset. The present method and the comparison methods were evaluated on the same test set, and the experimental results are shown in Table 1, where the numbers in parentheses are standard deviations.
Table 1. Comparison of the segmentation performance of the method of the invention and the comparison methods
Method | Dice | Precision | Sensitivity (Recall) | HD95 | ASSD |
---|---|---|---|---|---|
UNETR | 0.533(0.239) | 0.588(0.284) | 0.595(0.241) | 103.8(70.50) | 22.90(21.85) |
nnFormer | 0.664(0.205) | 0.724(0.237) | 0.685(0.229) | 36.50(45.54) | 8.06(10.83) |
nnUNet | 0.641(0.228) | 0.644(0.274) | 0.737(0.204) | 74.15(78.63) | 17.40(22.15) |
The present invention | 0.714(0.169) | 0.803(0.166) | 0.701(0.230) | 27.77(43.96) | 5.59(9.64) |
It can be seen from Table 1 that the method proposed by the present invention achieves the best Dice, Precision, HD95 and ASSD among all compared methods, while nnUNet attains the highest Recall. Model evaluation on the test set over the five experiments gave average Dice, Precision, Sensitivity, HD and ASSD of 0.520 (0.341), 0.686 (0.306), 0.530 (0.351), 33.44 (50.35) and 13.60 (28.50), respectively.
Among the comparison methods, nnFormer and nnUNet also achieve relatively good segmentation. The segmentation result of nnUNet shows that feature extraction using convolution alone in the encoder makes it difficult to model the global features of the image, resulting in lower segmentation performance. The segmentation results of nnFormer and CFC demonstrate that a hybrid architecture can effectively model the global features of the image, but that different ways of combining the two lead to different segmentation performance. On the lung cancer segmentation dataset, the parallel architecture effectively combines the Transformer and convolution and exploits the advantages of both.
Fig. 4 is a flow chart of the imaging method of the present invention. The imaging method, which includes the deep-learning-based lung CT image segmentation method provided by the invention, comprises the following steps:
A. acquiring an original lung CT image to be segmented;
B. segmenting the original lung CT image acquired in step A using the deep-learning-based lung CT image segmentation method;
C. marking the image segmentation result obtained in step B on the original lung CT image acquired in step A and performing secondary imaging, outputting a lung CT image carrying the lung CT image segmentation result, and completing the imaging.
Specifically, the imaging method of the invention can be used with existing lung CT image acquisition equipment, such as CT scanners. In use, the imaging method is integrated into the existing CT image acquisition equipment: the original lung CT image is acquired using existing techniques, the imaging method then performs secondary imaging on it to obtain a lung CT image carrying the segmentation result, and this image is output directly. In this way, medical workers (including clinicians, radiologists and laboratory staff) can directly obtain lung CT images with segmentation results, which greatly facilitates their work.
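The marking in step C can be pictured as overlaying the segmentation mask on the original slice. The following Python sketch shows one simple way to do this and is not taken from the invention: the function name `overlay_mask`, the blending factor `alpha`, the highlight colour, and the representation of a slice as nested lists of 8-bit gray values are all assumptions made for illustration.

```python
def overlay_mask(gray_slice, mask, alpha=0.5, color=(255, 0, 0)):
    """Blend `color` into the masked pixels of a grayscale slice.

    gray_slice: rows of gray values in 0..255; mask: rows of 0/1 labels.
    Returns rows of (R, G, B) tuples; unmasked pixels stay gray.
    """
    out = []
    for gray_row, mask_row in zip(gray_slice, mask):
        row = []
        for g, m in zip(gray_row, mask_row):
            if m:
                # Weighted mix of the original intensity and the highlight colour.
                row.append(tuple(round((1 - alpha) * g + alpha * c) for c in color))
            else:
                row.append((g, g, g))
        out.append(row)
    return out
```

Because the original gray value still contributes to every marked pixel, the underlying anatomy remains visible beneath the highlighted region.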
Claims (10)
1. A lung CT image segmentation method based on deep learning, comprising the following steps:
S1. acquiring an existing lung CT image data set;
S2. preprocessing the data acquired in step S1 to obtain a training data set;
S3. constructing a lung CT image segmentation original model based on a convolutional encoder, a Transformer encoder, a fusion module and a corresponding decoder;
S4. training the lung CT image segmentation original model constructed in step S3 with the training data set obtained in step S2 to obtain a lung CT image segmentation model;
S5. segmenting the actual lung CT image using the lung CT image segmentation model obtained in step S4.
2. The lung CT image segmentation method based on deep learning according to claim 1, wherein the preprocessing in step S2 of the data acquired in step S1 specifically comprises the following steps:
the preprocessing comprises image cropping, resampling and normalization;
the image cropping obtains the bounding box of the required target by cropping with a set pixel threshold;
a resampling operation is carried out on the obtained bounding box;
after resampling is completed, the image is normalized using the Z-score:
$$z_i=\frac{x_i-\mu}{\sigma}$$
where $z_i$ is the normalized gray value; $x_i$ is the original gray value; $\mu$ is the mean of the image gray values; $\sigma$ is the standard deviation of the image gray values.
3. The lung CT image segmentation method based on deep learning according to claim 2, wherein constructing the lung CT image segmentation original model in step S3 based on a convolutional encoder, a Transformer encoder, a fusion module and a corresponding decoder specifically comprises the following steps:
the acquired image data is input into the convolutional encoder module to learn local features, and is input into the Transformer encoder module to learn global features;
feature fusion is carried out on the obtained local features and global features by the fusion module to obtain the output of the encoder;
a decoder is constructed correspondingly according to the convolutional encoder module, and the obtained features are input to the decoder for decoding;
the features output by the decoder are processed with a softmax function to obtain the final probabilities of the different classes.
4. The lung CT image segmentation method based on deep learning according to claim 3, wherein the convolutional encoder module comprises a plurality of convolution blocks, each of which comprises two Conv-IN-LeakyReLU modules;
each Conv-IN-LeakyReLU module comprises a convolution layer, an InstanceNorm layer and a LeakyReLU layer connected in series in sequence;
the processing function of the convolutional encoder module is $Y_o=\mathrm{LeakyReLU}(\mathrm{IN}(\mathrm{Conv}(X)))$, where $Y_o$ is the output feature of the encoder module, X is the input feature, Conv() is the convolution layer processing function, IN() is the InstanceNorm layer processing function, and LeakyReLU() is the LeakyReLU layer processing function;
the convolution layer is a three-dimensional convolution layer;
the InstanceNorm layer is calculated as $Y=\gamma\frac{X-\mu}{\sqrt{\sigma^2+\epsilon}}+\beta$, where Y is the output feature, X is the input feature, $\mu$ is the mean of the input feature, $\sigma^2$ is the variance of the input feature, $\epsilon$ is a small constant for numerical stability, $\gamma$ is a first learnable parameter, and $\beta$ is a second learnable parameter;
the LeakyReLU layer is calculated as $y_i=\begin{cases}x_i, & x_i\ge 0\\ a\,x_i, & x_i<0\end{cases}$, where $y_i$ is the feature after the activation function; $x_i$ is the i-th value in the feature map; a is a fixed constant;
the first Conv-IN-LeakyReLU module in each convolution block performs the downsampling operation, with a convolution kernel size of 3 and a stride of 2.
5. The lung CT image segmentation method based on deep learning according to claim 4, wherein the Transformer encoder module comprises a patch embedding module, a plurality of consecutive Transformer blocks, and a plurality of downsampling modules; a downsampling module is connected between every two adjacent Transformer blocks;
the downsampling module adopts trilinear interpolation; after interpolation is completed, a convolution with a kernel size of 1 adjusts the number of channels of the feature map so that the number of channels is doubled;
the patch embedding module divides the image into a series of non-overlapping patches and projects each patch into a high-dimensional space;
each Transformer block comprises a first normalization layer, a ShiftDW layer, a second normalization layer and a multi-layer perceptron layer; the input feature is normalized by the first normalization layer and then processed by the ShiftDW layer; the output of the ShiftDW layer is combined with the input feature through a residual connection to obtain a second input feature; the second input feature is processed by the second normalization layer and then by the multi-layer perceptron layer; the output of the multi-layer perceptron layer is combined with the second input feature through a residual connection and serves as the output feature of the current Transformer block;
the calculation formula of the Transformer block is as follows:
$$\hat{z}_l=\mathrm{ShiftDW}(\mathrm{BN}(z_{l-1}))+z_{l-1}$$
$$z_l=\mathrm{MLP}(\mathrm{BN}(\hat{z}_l))+\hat{z}_l$$
where $\hat{z}_l$ is the output of the ShiftDW module; ShiftDW() is the ShiftDW layer processing function; $z_l$ is the output of the l-th Transformer layer; BN() is the processing function of the first normalization layer and the second normalization layer; MLP() is the multi-layer perceptron processing function, and the multi-layer perceptron has 2 layers; the formula of MLP() is:
$$\mathrm{MLP}(x)=\mathrm{Linear}_2(\mathrm{GELU}(\mathrm{Linear}_1(x)))$$
where MLP() is the processing function; $\mathrm{Linear}_1()$ is the first fully connected operation; x is the input feature; $\mathrm{Linear}_2()$ is the second fully connected operation; GELU() is the Gaussian error linear unit, $\mathrm{GELU}(x)=\frac{x}{2}\left(1+\mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)\right)$, in which x is the input feature and erf() is the error function, $\mathrm{erf}(x)=\frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$.
6. The lung CT image segmentation method based on deep learning according to claim 5, wherein the ShiftDW layer replaces the multi-head self-attention module in the Transformer block; the ShiftDW layer specifically comprises a shift operation, a first point-wise convolution operation, a depth-wise convolution operation and a second point-wise convolution operation connected in series in sequence; the calculation formula of the ShiftDW layer is:
$$\mathrm{ShiftDW}(x)=\mathrm{PWConv}_2(\mathrm{DWConv}(\mathrm{PWConv}_1(\mathrm{Shift}(x))))$$
where ShiftDW(x) is the output feature of the ShiftDW layer; x is the input feature; $\mathrm{PWConv}_1()$ is the first point-wise convolution operation function, with a convolution kernel size of 1; DWConv() is the depth-wise convolution operation function; $\mathrm{PWConv}_2()$ is the second point-wise convolution operation function, with a convolution kernel size of 1; Shift() is the shift operation function, and the calculation formula of the shift operation is:
where $\hat{z}$ is the feature obtained after the shift operation; z[·] is the input feature; D is the spatial depth; H is the spatial width; W is the spatial height; $\gamma$ is a scale factor; C is the number of channels.
7. The lung CT image segmentation method based on deep learning according to claim 6, wherein carrying out feature fusion on the obtained local features and global features by the fusion module to obtain the output of the encoder specifically comprises the following steps:
the fusion module comprises a global attention module and a local attention module;
the global attention module obtains the weights of different channels by global average pooling, and then uses point-wise convolution as a channel context aggregator; the local attention module uses point-wise convolution as a context aggregator while keeping the same resolution as the original feature map, so that more spatial information is retained;
the calculation formula of the fusion module is as follows:
$$z_a=\mathrm{Cat}(z_c,z_t)$$
$$z=\mathrm{Conv}(z_a)$$
$$z_l=\mathrm{Local}(z)$$
$$z_g=\mathrm{Global}(z)$$
$$z_o=z_l+z_g$$
where $z_a$ is the feature obtained by concatenating the output features of the convolutional encoder with the output features of the Transformer encoder; Cat() is the concatenation function along the channel dimension; $z_c$ is the output feature of the convolutional encoder; $z_t$ is the output feature of the Transformer encoder; z is the input feature of the attention modules; Conv() is a convolution operation with convolution kernel size $C\times 2C\times 1\times 1\times 1$; $z_l$ is the output feature of the local attention module; $z_g$ is the output feature of the global attention module; $z_o$ is the output feature of the fusion module;
the calculation formula of the global attention module is as follows:
$$\mathrm{Global}(z)=\mathrm{BN}_2(\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{BN}_1(\mathrm{Conv}_1(\mathrm{GAP}(z))))))$$
where Global(z) is the output feature of the global attention module; z is the input feature; GAP() is the global average pooling function, $\mathrm{GAP}(z)=\frac{1}{D\times H\times W}\sum_{i=1}^{D}\sum_{j=1}^{H}\sum_{k=1}^{W}z_{[:,i,j,k]}$, in which D is the spatial depth, H is the spatial width, W is the spatial height, and $z_{[:,i,j,k]}$ is the input feature at position (i, j, k); $\mathrm{Conv}_1()$ is the processing function of the first convolution layer, with convolution kernel size $\frac{C}{r}\times C\times 1\times 1\times 1$, where C is the number of channels of the input feature and r is the scaling factor; $\mathrm{BN}_1()$ is the first batch normalization function; ReLU() is the activation function; $\mathrm{Conv}_2()$ is the processing function of the second convolution layer, with convolution kernel size $C\times\frac{C}{r}\times 1\times 1\times 1$; $\mathrm{BN}_2()$ is the second batch normalization function;
the calculation formula of the local attention module is as follows:
$$\mathrm{Local}(z)=\mathrm{BN}_2(\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{BN}_1(\mathrm{Conv}_1(z)))))$$
where Local(z) is the output feature of the local attention module.
8. The lung CT image segmentation method based on deep learning according to claim 7, wherein constructing the decoder correspondingly according to the convolutional encoder module specifically comprises the following:
the decoder comprises a plurality of convolution blocks, the number of which is the same as in the convolutional encoder module; each convolution block of the decoder comprises two Conv-IN-LeakyReLU modules, and an upsampling module, implemented by deconvolution, is arranged between two adjacent Conv-IN-LeakyReLU modules; each stage of the decoder is joined with the features of the corresponding parallel encoder stage through a skip connection and then passes to the next decoder stage through upsampling.
9. The lung CT image segmentation method based on deep learning according to claim 8, wherein the training of step S4 specifically comprises the following steps:
inputting the output of the lung CT image segmentation original model and the ground-truth segmentation label into a loss function, and optimizing the lung CT image segmentation original model according to the value of the loss function;
the loss function L is calculated as $L=\alpha L_{Dice}+\beta L_{CE}$, where $L_{Dice}$ is the Dice loss, $L_{CE}$ is the cross-entropy loss, $\alpha$ is the weight of the Dice loss, and $\beta$ is the weight of the cross-entropy loss;
the Dice loss $L_{Dice}$ is calculated as $L_{Dice}=1-\frac{2TP}{2TP+FP+FN}$, where TP is the probability of true positives, FP is the probability of false positives, and FN is the probability of false negatives;
the cross-entropy loss $L_{CE}$ is calculated as $L_{CE}=-\frac{1}{N}\sum_{i=1}^{N}y_i\log\hat{y}_i$, where N is the number of samples, $y_i$ is the ground-truth label, and $\hat{y}_i$ is the prediction output by the model.
10. An imaging method comprising the lung CT image segmentation method based on deep learning according to any one of claims 1 to 9, comprising the following steps:
A. acquiring an original lung CT image to be segmented;
B. segmenting the original lung CT image acquired in step A using the lung CT image segmentation method based on deep learning according to any one of claims 1 to 9;
C. marking the image segmentation result obtained in step B on the original lung CT image acquired in step A and performing secondary imaging, outputting a lung CT image carrying the lung CT image segmentation result, and completing the imaging.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310219966.3A CN116468732A (en) | 2023-03-09 | 2023-03-09 | Lung CT image segmentation method and imaging method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116468732A true CN116468732A (en) | 2023-07-21 |
Family
ID=87183163
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310219966.3A Pending CN116468732A (en) | 2023-03-09 | 2023-03-09 | Lung CT image segmentation method and imaging method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116468732A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116825363A (en) * | 2023-08-29 | 2023-09-29 | 济南市人民医院 | Early lung adenocarcinoma pathological type prediction system based on fusion deep learning network |
CN117132606A (en) * | 2023-10-24 | 2023-11-28 | 四川大学 | Segmentation method for lung lesion image |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116825363A (en) * | 2023-08-29 | 2023-09-29 | 济南市人民医院 | Early lung adenocarcinoma pathological type prediction system based on fusion deep learning network |
CN116825363B (en) * | 2023-08-29 | 2023-12-12 | 济南市人民医院 | Early lung adenocarcinoma pathological type prediction system based on fusion deep learning network |
CN117132606A (en) * | 2023-10-24 | 2023-11-28 | 四川大学 | Segmentation method for lung lesion image |
CN117132606B (en) * | 2023-10-24 | 2024-01-09 | 四川大学 | Segmentation method for lung lesion image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||