CN116468732A - Lung CT image segmentation method and imaging method based on deep learning


Info

Publication number
CN116468732A
CN116468732A (application CN202310219966.3A)
Authority
CN
China
Prior art keywords: layer, lung, module, convolution, image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310219966.3A
Other languages
Chinese (zh)
Inventor
周建存
匡湖林
王建新
刘权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhixiang Health Technology Beijing Co ltd
Central South University
Hunan City University
Original Assignee
Zhixiang Health Technology Beijing Co ltd
Central South University
Hunan City University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhixiang Health Technology Beijing Co ltd, Central South University, Hunan City University filed Critical Zhixiang Health Technology Beijing Co ltd
Priority to CN202310219966.3A priority Critical patent/CN116468732A/en
Publication of CN116468732A publication Critical patent/CN116468732A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/003Reconstruction from projections, e.g. tomography
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30061Lung
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00Indexing scheme for image generation or computer graphics
    • G06T2210/41Medical
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems


Abstract

The invention discloses a lung CT image segmentation method based on deep learning, which comprises the steps of acquiring an existing lung CT image data set and preprocessing it to obtain a training data set; constructing an original lung CT image segmentation model and training it to obtain a lung CT image segmentation model; and segmenting the actual lung CT image with the lung CT image segmentation model. The invention also discloses an imaging method comprising the lung CT image segmentation method based on deep learning. The invention can achieve more accurate image segmentation results, accelerate convergence, show strong segmentation performance on small-scale data sets, avoid feature collapse and generate diversified features; the invention is therefore highly reliable, accurate, objective and scientific.

Description

Lung CT image segmentation method and imaging method based on deep learning
Technical Field
The invention belongs to the field of medical image processing, and particularly relates to a lung CT image segmentation method and an imaging method based on deep learning.
Background
With economic and technological development and rising living standards, people pay ever greater attention to health; medical image processing techniques are therefore becoming increasingly important.
In clinical applications and experimental research, lung CT images play an important auxiliary role: they can effectively help clinicians diagnose lung diseases and help researchers carry out subsequent studies. The segmentation of lung CT images is therefore particularly important.
At present, lung CT image segmentation methods often adopt a deep learning model to learn feature representations autonomously and complete the segmentation task using the learned high-dimensional abstractions; for example, image segmentation methods based on the Transformer architecture are now commonly used to segment lung CT images.
The Transformer was originally applied to natural language processing and achieved significant success; since the advent of ViT, Transformers have achieved competitive performance on a range of computer vision tasks. However, as research has continued, some shortcomings of the Transformer have gradually been revealed. Most importantly, a Transformer typically requires training on a large data set to exhibit good performance. Medical image data sets, however, are typically small, and lung CT data sets are smaller still. Therefore, existing lung CT image segmentation methods based on the Transformer architecture have relatively poor accuracy and low reliability.
Disclosure of Invention
The invention aims to provide a lung CT image segmentation method based on deep learning that is highly reliable, accurate, objective and scientific.
The second object of the invention is to provide an imaging method comprising the lung CT image segmentation method based on deep learning.
The lung CT image segmentation method based on the deep learning provided by the invention comprises the following steps:
s1, acquiring an existing lung CT image data set;
s2, preprocessing the data acquired in the step S1, so as to acquire a training data set;
s3, constructing an original lung CT image segmentation model based on a convolution encoder, a Transformer encoder, a fusion module and a corresponding decoder;
s4, training the lung CT image segmentation original model constructed in the step S3 by adopting the training data set obtained in the step S2 to obtain a lung CT image segmentation model;
s5, segmenting the actual lung CT image by adopting the lung CT image segmentation model obtained in the step S4.
The preprocessing of the data acquired in the step S1 in the step S2 specifically includes the following steps:
preprocessing comprises image cropping, resampling and normalization;
the image clipping is to obtain a boundary frame of a required target through setting a pixel threshold clipping;
resampling operation is carried out on the obtained bounding box;
after resampling is completed, the image is normalized using Z-score:
z in i The normalized gray value; x is x i Is the original gray value; mu is the average value of the gray scale of the image; sigma is the standard deviation of the image gray scale.
The step S3 of constructing the original lung CT image segmentation model based on a convolution encoder, a Transformer encoder, a fusion module and a corresponding decoder specifically comprises the following steps:
the acquired image data is input into the convolution encoder module to learn local features, and into the Transformer encoder module to learn global features;
carrying out feature fusion on the obtained local features and global features by adopting a fusion module to obtain the output of the encoder;
constructing a decoder correspondingly according to the convolutional encoder module; inputting the acquired characteristics to a decoder for decoding;
and processing the characteristics output by the decoder by adopting a softmax function to obtain the final probabilities of the characteristics of different categories.
The convolution encoder module is composed of a plurality of convolution modules, and each convolution module comprises two Conv-IN-LeakyReLU modules;
each Conv-IN-LeakyReLU module comprises a convolution layer, an InstanceNorm layer and a LeakyReLU layer connected in series;
the processing function of the convolution encoder module is $Y_o = \text{LeakyReLU}(\text{IN}(\text{Conv}(X)))$, where $Y_o$ is the output feature of the encoder module, $X$ is the input feature, $\text{Conv}()$ is the convolution layer processing function, $\text{IN}()$ is the InstanceNorm layer processing function, and $\text{LeakyReLU}()$ is the LeakyReLU layer processing function;
the convolution layer is a three-dimensional convolution layer;
the InstanceNorm layer is processed according to the formula $Y = \gamma\dfrac{X-\mu}{\sqrt{\sigma+\epsilon}} + \beta$, where $Y$ is the processed feature, $X$ is the input feature, $\mu$ is the mean of the input feature, $\sigma$ is the variance of the input feature, $\epsilon$ is an error coefficient, $\gamma$ is a first learnable parameter and $\beta$ is a second learnable parameter;
the processing formula of the LeakyReLU layer is $y_i = \begin{cases} x_i, & x_i \ge 0 \\ a x_i, & x_i < 0 \end{cases}$, where $y_i$ is the feature after the activation function, $x_i$ is the $i$-th value in the feature map, and $a$ is a fixed constant;
the first Conv-IN-LeakyReLU module in each convolution module is used for the downsampling operation, with a convolution kernel size of 3 and a convolution kernel sliding stride of 2.
The Transformer encoder module specifically comprises a patch embedding module, a plurality of consecutive Transformer blocks and a plurality of downsampling modules; a downsampling module is connected between every two adjacent Transformer blocks;
the downsampling module adopts trilinear interpolation, and after interpolation is completed, a convolution with kernel size 1 adjusts the number of channels of the feature map so that the number of channels is doubled;
the patch embedding module divides the image into a series of non-overlapping patches and projects each patch into a high-dimensional space;
each Transformer block comprises a first normalization layer, a ShiftDW layer, a second normalization layer and a multi-layer perceptron layer; the input features are normalized by the first normalization layer and then processed by the ShiftDW layer; the output of the ShiftDW layer is combined with the input features through a shortcut connection to obtain the second input features; the second input features are processed by the second normalization layer and then by the multi-layer perceptron layer; the output of the multi-layer perceptron layer is combined with the second input features through a shortcut connection to give the output features of the current Transformer block;
the calculation formulas of the Transformer block are:

$$\hat{z}^l = \text{ShiftDW}(\text{BN}(z^{l-1})) + z^{l-1}$$
$$z^l = \text{MLP}(\text{BN}(\hat{z}^l)) + \hat{z}^l$$

where $\hat{z}^l$ is the output of the $l$-th ShiftDW module, $\text{ShiftDW}()$ is the ShiftDW layer processing function, $z^l$ is the output of the $l$-th Transformer block, $\text{BN}()$ is the processing function of the first and second normalization layers, and $\text{MLP}()$ is the multi-layer perceptron processing function with 2 layers; the formula of $\text{MLP}()$ is

$$\text{MLP}(x) = \text{Linear}_2(\text{GELU}(\text{Linear}_1(x)))$$

where $x$ is the input feature, $\text{Linear}_1$ and $\text{Linear}_2$ are fully connected modules, and $\text{GELU}()$ is the Gaussian error linear unit, $\text{GELU}(x) = 0.5x\left(1 + \text{erf}\left(x/\sqrt{2}\right)\right)$ with $\text{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$.
The ShiftDW layer replaces the multi-head self-attention module of the Transformer block; the ShiftDW layer specifically comprises a shift operation, a first point-wise convolution, a depth-wise convolution and a second point-wise convolution connected in series; the calculation formula of the ShiftDW layer is

$$\text{ShiftDW}(x) = \text{PWConv}_2(\text{DWConv}(\text{PWConv}_1(\text{Shift}(x))))$$

where $\text{ShiftDW}(x)$ is the output feature of the ShiftDW layer, $x$ is the input feature, $\text{PWConv}_1()$ is the first point-wise convolution with kernel size 1, $\text{DWConv}()$ is the depth-wise convolution, $\text{PWConv}_2()$ is the second point-wise convolution with kernel size 1, and $\text{Shift}()$ is the shift operation; the shift operation moves a fraction $\gamma$ of the $C$ channels of the input feature $z$ by one position along each of the spatial dimensions, where $\hat{z}$ denotes the feature after the shift operation, $D$ is the spatial depth, $H$ is the spatial width, $W$ is the spatial height, $\gamma$ is a scale factor and $C$ is the number of channels.
The fusion module performs feature fusion on the obtained local features and global features to obtain the output of the encoder, specifically as follows:
the fusion module comprises a global attention module and a local attention module;
the global attention module uses global average pooling to obtain the weights of different channels and then uses point-wise convolution as a channel context aggregator; the local attention module uses point-wise convolution as a context aggregator while keeping the same resolution as the original feature map, thereby retaining more spatial information;
the calculation formulas of the fusion module are:

$$z_a = \text{Cat}(z_c, z_t)$$
$$z = \text{Conv}(z_a)$$
$$z_l = \text{Local}(z)$$
$$z_g = \text{Global}(z)$$
$$z_o = z_l + z_g$$

where $z_a$ is the feature obtained by concatenating the output features of the convolution encoder and the Transformer encoder, $\text{Cat}()$ is concatenation along the channel dimension, $z_c$ is the output feature of the convolution encoder, $z_t$ is the output feature of the Transformer encoder, $z$ is the fused input feature, $\text{Conv}()$ is a convolution with kernel size $C \times 2C \times 1 \times 1 \times 1$, $z_l$ is the feature output by the local attention module, $z_g$ is the feature output by the global attention module, and $z_o$ is the output feature of the fusion module;
the calculation formula of the global attention module is:

$$\text{Global}(z) = \text{BN}_2(\text{Conv}_2(\text{ReLU}(\text{BN}_1(\text{Conv}_1(\text{GAP}(z))))))$$

where $\text{Global}(z)$ is the output feature of the global attention module, $z$ is the input feature, $\text{GAP}()$ is global average pooling, $\text{GAP}(z) = \frac{1}{D \times H \times W}\sum_{i=1}^{D}\sum_{j=1}^{H}\sum_{k=1}^{W} z_{[:,i,j,k]}$ with $D$ the spatial depth, $H$ the spatial width and $W$ the spatial height, $\text{Conv}_1()$ is the first convolution layer with kernel size $\frac{C}{r} \times C \times 1 \times 1 \times 1$, $C$ is the number of channels of the input feature, $r$ is the scaling factor, $\text{BN}_1()$ is the first batch normalization, $\text{ReLU}()$ is the activation function, $\text{Conv}_2()$ is the second convolution layer with kernel size $C \times \frac{C}{r} \times 1 \times 1 \times 1$, and $\text{BN}_2()$ is the second batch normalization;
the calculation formula of the local attention module is:

$$\text{Local}(z) = \text{BN}_2(\text{Conv}_2(\text{ReLU}(\text{BN}_1(\text{Conv}_1(z)))))$$

where $\text{Local}(z)$ is the feature output by the local attention module.
Constructing the decoder corresponding to the convolution encoder module specifically comprises the following steps:
the decoder comprises a plurality of convolution modules, the same number as in the convolution encoder module; each convolution module of the decoder comprises two Conv-IN-LeakyReLU modules, an upsampling module is further arranged between two adjacent Conv-IN-LeakyReLU modules, and the upsampling module is implemented with deconvolution; each stage of the decoder is connected to the features of the corresponding parallel encoder stage through a skip connection and then passes to the next stage of the decoder through upsampling.
The training in step S4 specifically comprises the following steps:
the output of the original lung CT image segmentation model and the real segmentation labels are input into a loss function, and the original lung CT image segmentation model is optimized according to the value of the loss function;
the loss function is calculated as $L = \alpha L_{Dice} + \beta L_{CE}$, where $L_{Dice}$ is the Dice loss, $L_{CE}$ is the cross-entropy loss, $\alpha$ is the weight of the Dice loss, and $\beta$ is the weight of the cross-entropy loss;
the Dice loss $L_{Dice}$ is calculated as $L_{Dice} = 1 - \dfrac{2TP}{2TP + FP + FN}$, where $TP$ is the probability that a result is a true positive, $FP$ the probability of a false positive, and $FN$ the probability of a false negative;
the cross-entropy loss $L_{CE}$ is calculated as $L_{CE} = -\dfrac{1}{N}\sum_{i=1}^{N} y_i \log \hat{y}_i$, where $N$ is the number of samples, $y_i$ is the real label and $\hat{y}_i$ is the prediction output by the model.
The invention also provides an imaging method comprising the lung CT image segmentation method based on deep learning, which comprises the following steps:
A. acquiring an original lung CT image to be segmented;
B. adopting the lung CT image segmentation method based on the deep learning to segment the original lung CT image to be segmented, which is obtained in the step A;
C. marking the image segmentation result obtained in step B on the original lung CT image obtained in step A and performing secondary imaging, then outputting a lung CT image carrying the lung CT image segmentation result, thus completing the imaging.
The lung CT image segmentation method and the imaging method based on deep learning provided by the invention can fully exploit the advantages of both convolution and the Transformer and achieve more accurate image segmentation results; the parallel encoder structure designed by the invention accelerates convergence while retaining the strong global modeling capability of the Transformer, so that it also shows a strong segmentation effect on small-scale data sets; the fusion module can effectively fuse the features from the different branches, thereby achieving more accurate segmentation results; in the Transformer module, the shift operation replaces the original multi-head self-attention operation, which reduces computational complexity and enables faster inference, and the enhanced shortcut connection avoids feature collapse and generates diversified features; therefore, the invention is highly reliable, accurate, objective and scientific.
Drawings
Fig. 1 is a flow chart of the dividing method according to the present invention.
Fig. 2 is a schematic diagram of the Transformer encoder module of the segmentation method of the present invention.
Fig. 3 is a schematic structural diagram of a fusion module of the segmentation method of the present invention.
Fig. 4 is a flow chart of the imaging method of the present invention.
Detailed Description
Fig. 1 is a flow chart of the segmentation method according to the present invention: the lung CT image segmentation method based on the deep learning provided by the invention comprises the following steps:
s1, acquiring an existing lung CT image data set;
s2, preprocessing the data acquired in the step S1, so as to acquire a training data set; the method specifically comprises the following steps:
preprocessing comprises image cropping, resampling and normalization;
the image cropping obtains a bounding box of the required target by cropping with a set pixel threshold;
a resampling operation is carried out on the obtained bounding box;
after resampling is completed, the image is normalized using the Z-score:

$$z_i = \frac{x_i - \mu}{\sigma}$$

where $z_i$ is the normalized gray value, $x_i$ is the original gray value, $\mu$ is the mean of the image gray values, and $\sigma$ is the standard deviation of the image gray values;
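A minimal sketch of this preprocessing pipeline is given below. The HU crop threshold, the target voxel spacing and the use of scipy's zoom for resampling are illustrative assumptions and are not specified in the patent.

```python
import numpy as np
from scipy import ndimage

def preprocess_ct(volume, spacing, target_spacing=(1.0, 1.0, 1.0), hu_threshold=-300):
    """Sketch of crop -> resample -> Z-score normalization for one CT volume."""
    # Crop: bounding box of voxels above an assumed HU threshold.
    coords = np.argwhere(volume > hu_threshold)
    lo, hi = coords.min(axis=0), coords.max(axis=0) + 1
    cropped = volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]

    # Resample the cropped bounding box to the target voxel spacing (trilinear interpolation).
    zoom = [s / t for s, t in zip(spacing, target_spacing)]
    resampled = ndimage.zoom(cropped, zoom, order=1)

    # Z-score normalization: z_i = (x_i - mu) / sigma.
    mu, sigma = resampled.mean(), resampled.std()
    return (resampled - mu) / (sigma + 1e-8)
```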
s3, constructing an original lung CT image segmentation model based on a convolution encoder, a Transformer encoder, a fusion module and a corresponding decoder; this specifically comprises the following steps:
the acquired image data is input into the convolution encoder module to learn local features, and into the Transformer encoder module to learn global features;
carrying out feature fusion on the obtained local features and global features by adopting a fusion module to obtain the output of the encoder;
constructing a decoder correspondingly according to the convolutional encoder module; inputting the acquired characteristics to a decoder for decoding;
processing the characteristics output by the decoder by adopting a softmax function to obtain the final probability of different types of characteristics;
In a specific implementation, the convolution encoder module is composed of a plurality of convolution modules, and each convolution module comprises two Conv-IN-LeakyReLU modules;
each Conv-IN-LeakyReLU module comprises a convolution layer, an InstanceNorm layer and a LeakyReLU layer connected in series;
the processing function of the convolution encoder module is $Y_o = \text{LeakyReLU}(\text{IN}(\text{Conv}(X)))$, where $Y_o$ is the output feature of the encoder module, $X$ is the input feature, $\text{Conv}()$ is the convolution layer processing function, $\text{IN}()$ is the InstanceNorm layer processing function, and $\text{LeakyReLU}()$ is the LeakyReLU layer processing function;
the convolution layer is a three-dimensional convolution layer;
the InstanceNorm layer is processed according to the formula $Y = \gamma\dfrac{X-\mu}{\sqrt{\sigma+\epsilon}} + \beta$, where $Y$ is the processed feature, $X$ is the input feature, $\mu$ is the mean of the input feature, $\sigma$ is the variance of the input feature, $\epsilon$ is an error coefficient, $\gamma$ is a first learnable parameter and $\beta$ is a second learnable parameter;
the processing formula of the LeakyReLU layer is $y_i = \begin{cases} x_i, & x_i \ge 0 \\ a x_i, & x_i < 0 \end{cases}$, where $y_i$ is the feature after the activation function, $x_i$ is the $i$-th value in the feature map, and $a$ is a fixed constant;
the first Conv-IN-LeakyReLU module in each convolution module is used for the downsampling operation, with a convolution kernel size of 3 and a convolution kernel sliding stride of 2;
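The convolution encoder stage described above can be sketched in PyTorch roughly as follows; the channel counts, padding and LeakyReLU negative slope are illustrative assumptions rather than values fixed by the patent.

```python
import torch
import torch.nn as nn

class ConvINLeakyReLU(nn.Module):
    """Conv3d -> InstanceNorm3d -> LeakyReLU, i.e. Y = LeakyReLU(IN(Conv(X)))."""
    def __init__(self, in_ch, out_ch, stride=1, negative_slope=0.01):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1)
        self.norm = nn.InstanceNorm3d(out_ch, affine=True)   # learnable gamma / beta
        self.act = nn.LeakyReLU(negative_slope, inplace=True)

    def forward(self, x):
        return self.act(self.norm(self.conv(x)))

class ConvBlock(nn.Module):
    """One encoder stage: the first module downsamples (kernel 3, stride 2), the second refines."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            ConvINLeakyReLU(in_ch, out_ch, stride=2),   # downsampling module
            ConvINLeakyReLU(out_ch, out_ch, stride=1),
        )

    def forward(self, x):
        return self.block(x)
```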
the transducer encoder module (shown in fig. 2) specifically comprises a patchebedding module, a plurality of continuous transducer blocks and a plurality of downsampling modules; a downsampling module is connected between every two adjacent Transformer blocks;
the downsampling module adopts a tri-linear interpolation operation, and after interpolation is completed, the convolution with the convolution kernel size of 1 is adopted to adjust the channel number of the feature map, so that the channel number of the feature map is ensured to be increased by 1 time;
the patch enabling module is used for dividing the image into a series of non-overlapping patches, and projecting each patch into a high-dimensional space;
each transducer block comprises a first normalization layer, a shift DW layer, a second normalization layer and a multi-layer perceptron layer; after the input features are normalized by the first normalization layer, the input features are processed by the Shift DW layer; the processing result of the Shift DW layer is connected with the input feature to obtain a second input feature; the second input features are input to a second normalization layer for processing, and then are processed through a multi-layer perceptron layer; the processing result of the multi-layer perceptron layer is connected with the second input characteristic and then used as the output characteristic of the current transducer block;
the calculation formula of the transducer block is as follows:
in the middle ofIs the output of the shift dw module; shiftDW () is a ShiftDW layer processing function; z l An output for the layer I transducer; BN () is a processing function of the first normalization layer and the second normalization layer; MLP () is a layer processing function of the multi-layer perceptron, and the number of layers of the multi-layer perceptron is 2; the formula of the MLP () is:
MLP=Linear 2 (GELU(Linear 1 (x)))
wherein MLP is a processing function; linear 1 () For the first full join operation; x is an input feature; linear 2 () For a second full join operation; gel () is a processing function of a gaussian error linear unit, andx is the input feature, erf () is the intermediate processing function and +.>
The Shift DW layer is used for replacing a multi-head self-attention module in the transducer block; the Shift DW layer specifically comprises a shift operation, a first point-by-point convolution operation, a depth convolution operation and a second point-by-point convolution operation which are sequentially connected in series; the calculation formula of the ShiftDW layer is:
ShiftDW(z)=PWConv 2 (DWConv(PWConv 1 (Shift(x))))
wherein the Shift DW (z) is the output characteristic of the Shift DW layer; x is an input feature; pwconv 1 () The convolution kernel size is 1 for the first point-by-point convolution operation function; DWConv () is a depth convolution operation function; pwconv 2 () The convolution kernel size is 1 for the second point-by-point convolution operation function; shift () is a Shift operation function, and the calculation formula of the Shift operation is:
in the middle ofIs a feature obtained after the shift operation; z []Is an input feature; d is the spatial depth; h is the space width; w is the space height; gamma is a scale factor; c is the number of channels;
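A rough PyTorch sketch of the ShiftDW layer and the Transformer block built from it is given below. The fraction of shifted channels (gamma), the border handling of the shift, the depth-wise kernel size of 3 and the MLP expansion ratio are assumptions; the two-layer MLP is written with 1×1 convolutions, which act as per-voxel fully connected layers.

```python
import torch
import torch.nn as nn

def shift3d(x, gamma=1 / 12):
    """Shift a gamma-fraction of channels by one voxel along each of +/-D, +/-H, +/-W."""
    out = x.clone()
    B, C, D, H, W = x.shape
    g = int(C * gamma)
    out[:, 0*g:1*g, 1:, :, :] = x[:, 0*g:1*g, :-1, :, :]   # forward along depth
    out[:, 1*g:2*g, :-1, :, :] = x[:, 1*g:2*g, 1:, :, :]   # backward along depth
    out[:, 2*g:3*g, :, 1:, :] = x[:, 2*g:3*g, :, :-1, :]   # forward along height
    out[:, 3*g:4*g, :, :-1, :] = x[:, 3*g:4*g, :, 1:, :]   # backward along height
    out[:, 4*g:5*g, :, :, 1:] = x[:, 4*g:5*g, :, :, :-1]   # forward along width
    out[:, 5*g:6*g, :, :, :-1] = x[:, 5*g:6*g, :, :, 1:]   # backward along width
    return out  # remaining channels are left unchanged

class ShiftDW(nn.Module):
    """Shift -> PWConv1 -> DWConv -> PWConv2, replacing multi-head self-attention."""
    def __init__(self, dim, gamma=1 / 12):
        super().__init__()
        self.gamma = gamma
        self.pw1 = nn.Conv3d(dim, dim, kernel_size=1)
        self.dw = nn.Conv3d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.pw2 = nn.Conv3d(dim, dim, kernel_size=1)

    def forward(self, x):
        return self.pw2(self.dw(self.pw1(shift3d(x, self.gamma))))

class ShiftDWTransformerBlock(nn.Module):
    """BN -> ShiftDW -> shortcut, then BN -> two-layer MLP -> shortcut."""
    def __init__(self, dim, mlp_ratio=4):
        super().__init__()
        self.bn1 = nn.BatchNorm3d(dim)
        self.token_mixer = ShiftDW(dim)
        self.bn2 = nn.BatchNorm3d(dim)
        self.mlp = nn.Sequential(
            nn.Conv3d(dim, dim * mlp_ratio, 1), nn.GELU(),
            nn.Conv3d(dim * mlp_ratio, dim, 1),
        )

    def forward(self, x):
        x = x + self.token_mixer(self.bn1(x))
        return x + self.mlp(self.bn2(x))
```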
the output of the encoder is obtained by fusion, and then the method comprises the following steps:
the fusion module includes a global attention module and a local attention module (as shown in fig. 3);
the global attention module acquires weights of different channels by adopting global average pooling, and then adopts point-by-point convolution as a channel context aggregator; the local attention module adopts point-by-point convolution as a context aggregator, and simultaneously maintains the same resolution as the original feature map, so that more space information is reserved;
the calculation formula of the fusion module is as follows:
z a =Cat(z c ,z t )
z=Conv(z a )
z l =Local(z)
z g =Global(z)
z o =z l +z g
z in a The characteristics after the output characteristics of the convolution encoder are connected with the output characteristics of the transform encoder in series; cat () is done in the channel dimensionA serially connected processing function; z c Is an output characteristic of the convolutional encoder; z t Is an output characteristic of the transducer encoder; z is an input feature; conv () is a convolution operation function, and the convolution kernel size is Cx2Cx1x1x1; z l Features that are output by the local attention module; z g Output characteristics of the global attention module; z o Is an output characteristic of the fusion module;
the calculation formula of the global attention module is as follows:
Global(z)=BN 2 (Conv 2 (ReLU(BN 1 (Conv 1 (GAP(z))))))
wherein Global (z) is the output characteristic of the Global attention module; z is an input feature; GAP () is a global average pooling processing function, andd is the spatial depth, H is the spatial width, W is the spatial height, z [:,i,j,k] Is an input feature; conv 1 () Processing the function for the first convolution layer, and the convolution kernel size is +.>C is the channel number of the input feature, r is the scaling factor; BN (BN) 1 () Normalizing the function for a first batch; reLU () is an activation function; conv 2 () Processing function for the second convolution layer, and convolution kernel size is +.>BN 2 () Normalizing the function for a second batch;
the calculation formula of the local attention module is as follows:
Local(z)=BN 2 (Conv 2 (ReLU(BN 1 (Conv 1 (z)))))
wherein Local (z) is the characteristic of the Local attention module output;
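A sketch of the fusion module along these formulas might look as follows in PyTorch; the reduction ratio r and the use of AdaptiveAvgPool3d for the global average pooling are assumptions.

```python
import torch
import torch.nn as nn

class AttentionBranch(nn.Module):
    """Conv(C->C/r) -> BN -> ReLU -> Conv(C/r->C) -> BN; the 'global' variant pools first."""
    def __init__(self, channels, r=4, global_branch=False):
        super().__init__()
        self.global_branch = global_branch
        self.pool = nn.AdaptiveAvgPool3d(1)                  # GAP over D, H, W
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels // r, kernel_size=1),
            nn.BatchNorm3d(channels // r),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels // r, channels, kernel_size=1),
            nn.BatchNorm3d(channels),
        )

    def forward(self, z):
        return self.body(self.pool(z) if self.global_branch else z)

class FusionModule(nn.Module):
    """Concatenate CNN and Transformer features, reduce channels, sum local and global attention."""
    def __init__(self, channels, r=4):
        super().__init__()
        self.reduce = nn.Conv3d(2 * channels, channels, kernel_size=1)  # C x 2C x 1 x 1 x 1
        self.local = AttentionBranch(channels, r, global_branch=False)
        self.global_ = AttentionBranch(channels, r, global_branch=True)

    def forward(self, z_c, z_t):
        z = self.reduce(torch.cat([z_c, z_t], dim=1))   # z_a = Cat(z_c, z_t); z = Conv(z_a)
        return self.local(z) + self.global_(z)          # z_o = Local(z) + Global(z)
```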
the construction decoder comprises in particular the following steps:
the decoder comprises a plurality of convolution blocks, and the number of the convolution blocks is the same as that of the convolution encoder module; each convolution module of the decoder comprises two Conv-IN-LeakyReLU modules, an up-sampling module is further arranged between two adjacent Conv-IN-LeakyReLU modules, and the up-sampling module is realized by adopting deconvolution; each stage of the decoder is connected with the characteristics of the corresponding parallel encoder through jump, and then enters the next stage of the decoder through up-sampling;
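One decoder stage could be sketched as below; the exact placement of the deconvolution relative to the two Conv-IN-LeakyReLU modules and the concatenation of the skip features follow common U-Net practice here and are assumptions rather than details fixed by the patent.

```python
import torch
import torch.nn as nn

def conv_in_lrelu(in_ch, out_ch):
    """Conv3d -> InstanceNorm3d -> LeakyReLU building block (kernel 3, stride 1)."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.InstanceNorm3d(out_ch, affine=True),
        nn.LeakyReLU(0.01, inplace=True),
    )

class DecoderStage(nn.Module):
    """Deconvolution upsampling, concatenation with the encoder skip features,
    then two Conv-IN-LeakyReLU modules."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose3d(in_ch, out_ch, kernel_size=2, stride=2)  # deconvolution
        self.conv = nn.Sequential(
            conv_in_lrelu(out_ch + skip_ch, out_ch),
            conv_in_lrelu(out_ch, out_ch),
        )

    def forward(self, x, skip):
        x = self.up(x)                                   # restore resolution
        return self.conv(torch.cat([x, skip], dim=1))    # skip connection from parallel encoder
```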
s4, training the lung CT image segmentation original model constructed in the step S3 by adopting the training data set obtained in the step S2 to obtain a lung CT image segmentation model; the method specifically comprises the following steps:
inputting the result output by the lung CT image segmentation original model and a real segmentation label into a loss function, and optimizing the lung CT image segmentation original model according to the result of the loss function;
the loss function is calculated as $L = \alpha L_{Dice} + \beta L_{CE}$, where $L_{Dice}$ is the Dice loss, $L_{CE}$ is the cross-entropy loss, $\alpha$ is the weight of the Dice loss, and $\beta$ is the weight of the cross-entropy loss;
the Dice loss $L_{Dice}$ is calculated as $L_{Dice} = 1 - \dfrac{2TP}{2TP + FP + FN}$, where $TP$ is the probability of a true positive, $FP$ the probability of a false positive, and $FN$ the probability of a false negative;
the cross-entropy loss $L_{CE}$ is calculated as $L_{CE} = -\dfrac{1}{N}\sum_{i=1}^{N} y_i \log \hat{y}_i$, where $N$ is the number of samples, $y_i$ is the real label and $\hat{y}_i$ is the prediction output by the model;
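A hedged sketch of the combined loss is shown below; the soft (probabilistic) form of TP, FP and FN, the smoothing term and the default weights alpha = beta = 1 are assumptions, not values stated in the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiceCELoss(nn.Module):
    """L = alpha * L_Dice + beta * L_CE, with a soft (probabilistic) Dice term."""
    def __init__(self, alpha=1.0, beta=1.0, eps=1e-6):
        super().__init__()
        self.alpha, self.beta, self.eps = alpha, beta, eps

    def forward(self, logits, target):
        # logits: (B, K, D, H, W) raw scores; target: (B, D, H, W) integer labels.
        ce = F.cross_entropy(logits, target)

        probs = torch.softmax(logits, dim=1)
        one_hot = F.one_hot(target, probs.shape[1]).permute(0, 4, 1, 2, 3).float()
        dims = (0, 2, 3, 4)
        tp = (probs * one_hot).sum(dims)           # soft true positives per class
        fp = (probs * (1 - one_hot)).sum(dims)     # soft false positives
        fn = ((1 - probs) * one_hot).sum(dims)     # soft false negatives
        dice = (2 * tp + self.eps) / (2 * tp + fp + fn + self.eps)
        dice_loss = 1.0 - dice.mean()

        return self.alpha * dice_loss + self.beta * ce
```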
s5, segmenting the actual lung CT image by adopting the lung CT image segmentation model obtained in the step S4.
The effectiveness of the segmentation method of the present invention is described below through experiments.
UNETR: its encoder is a hybrid architecture of convolution and Transformer, and its decoder is a convolutional architecture. In the encoder, features are extracted with a Transformer, and convolutions are then used in the skip connections to obtain features at different scales.
nnFormer: it proposes a new strategy for combining the Transformer with convolution. In the encoder it uses a three-dimensional Swin Transformer for feature extraction, but between adjacent Swin Transformer blocks it uses convolution to downsample the feature map and obtain a multi-scale feature representation. In the decoder it also uses the three-dimensional Swin Transformer and uses deconvolution to restore the resolution of the image.
nnUNet: this is a purely convolutional segmentation network with the same network architecture as UNet. nnUNet can adaptively adjust the depth of the model and some other model hyper-parameters based on the data set.
The performance of the method was evaluated in the experiments using the Dice, Precision, Sensitivity, HD95 and ASSD metrics. The Dice index is widely used to evaluate the segmentation quality of medical images; the closer the score is to 1, the more accurate the segmentation result. Its calculation formula is:

$$\text{Dice} = \frac{2TP}{2TP + FP + FN}$$

HD95 and ASSD measure distances between the boundary point sets of the prediction and the ground truth; smaller values indicate better segmentation performance. Let $\partial G$ and $\partial M$ denote the boundary points of the ground truth $G$ and the prediction $M$, $|\partial G|$ the cardinality of the boundary of $G$, and $d(x, \partial M) = \min_{y \in \partial M}\lVert x - y \rVert$ the shortest distance between a point $x$ and all points in $\partial M$. Then

$$\text{HD95} = \max\left\{\underset{x \in \partial G}{\mathrm{P}_{95}}\, d(x, \partial M),\ \underset{y \in \partial M}{\mathrm{P}_{95}}\, d(y, \partial G)\right\}, \qquad \text{ASSD} = \frac{\sum_{x \in \partial G} d(x, \partial M) + \sum_{y \in \partial M} d(y, \partial G)}{|\partial G| + |\partial M|}$$

where $\mathrm{P}_{95}$ denotes the 95th percentile. The formulas for Precision and Sensitivity are:

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Sensitivity} = \frac{TP}{TP + FN}$$

where TP, FN and FP denote the probability of a true positive, false negative and false positive, respectively.
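For illustration, the overlap-based metrics can be computed from voxel-wise counts as sketched below; the surface-distance metrics HD95 and ASSD additionally require boundary extraction and are usually taken from an existing evaluation library. The epsilon terms are an assumption to avoid division by zero.

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, gt: np.ndarray):
    """Dice, Precision and Sensitivity from voxel-wise TP/FP/FN of two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    dice = 2 * tp / (2 * tp + fp + fn + 1e-8)
    precision = tp / (tp + fp + 1e-8)
    sensitivity = tp / (tp + fn + 1e-8)
    return dice, precision, sensitivity
```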
the data used in this experiment is a lung cancer dataset. The present method and the comparative method were evaluated on the same test set, and the experimental results are shown in table 1. Wherein the numbers within () are standard deviations.
TABLE 1 comparison of the segmentation Performance of the segmentation method of the invention and the comparison method
Method Dice Precision Recall HD95 ASSD
UNETR 0.533(0.239) 0.588(0.284) 0.595(0.241) 103.8(70.50) 22.90(21.85)
nnFormer 0.664(0.205) 0.724(0.237) 0.685(0.229) 36.50(45.54) 8.06(10.83)
nnUNet 0.641(0.228) 0.644(0.274) 0.737(0.204) 74.15(78.63) 17.40(22.15)
The present invention 0.714(0.169) 0.803(0.166) 0.701(0.230) 27.77(43.96) 5.59(9.64)
As Table 1 shows, the method proposed by the present invention achieves the best performance among all methods on Dice and ASSD. Model evaluation on the test set gave average Dice, Precision, Sensitivity, HD95 and ASSD of 0.520 (0.341), 0.686 (0.306), 0.530 (0.351), 33.44 (50.35) and 13.60 (28.50), respectively, over the five experiments.
Among the comparison methods, nnFormer and nnUNet also achieve relatively good segmentation results. The segmentation results of nnUNet show that extracting features with convolution alone in the encoder makes it difficult to model the global features of the image, resulting in lower segmentation performance. The segmentation results of nnFormer and CFC demonstrate that a hybrid architecture can effectively model the global features of the image, but different combinations lead to different segmentation performance. On the lung cancer segmentation data set, the parallel architecture can effectively combine the Transformer and convolution and effectively exploit the advantages of both.
Fig. 4 is a flow chart of the imaging method of the present invention: the imaging method comprising the lung CT image segmentation method based on the deep learning provided by the invention comprises the following steps of:
A. acquiring an original lung CT image to be segmented;
B. adopting the lung CT image segmentation method based on the deep learning to segment the original lung CT image to be segmented, which is obtained in the step A;
C. marking the image segmentation result obtained in step B on the original lung CT image obtained in step A and performing secondary imaging, then outputting a lung CT image carrying the lung CT image segmentation result, thus completing the imaging.
In particular, the imaging method of the invention can be used with existing lung CT image acquisition equipment such as CT scanners. In practice, the imaging method is integrated into the existing CT image acquisition equipment; the original lung CT image is first acquired with the existing technology, the imaging method of the invention then performs secondary imaging on the original lung CT image, and a lung CT image carrying the segmentation result is obtained and output directly. In this way, medical workers (including clinicians, radiologists and experimenters) can directly obtain lung CT images with segmentation results, which greatly facilitates their work.

Claims (10)

1. A lung CT image segmentation method based on deep learning comprises the following steps:
s1, acquiring an existing lung CT image data set;
s2, preprocessing the data acquired in the step S1, so as to acquire a training data set;
s3, constructing an original lung CT image segmentation model based on a convolution encoder, a Transformer encoder, a fusion module and a corresponding decoder;
s4, training the lung CT image segmentation original model constructed in the step S3 by adopting the training data set obtained in the step S2 to obtain a lung CT image segmentation model;
s5, segmenting the actual lung CT image by adopting the lung CT image segmentation model obtained in the step S4.
2. The lung CT image segmentation method based on deep learning according to claim 1, wherein the preprocessing of the data acquired in step S1 in step S2 specifically comprises the following steps:
preprocessing comprises image cropping, resampling and normalization;
the image cropping obtains a bounding box of the required target by cropping with a set pixel threshold;
a resampling operation is carried out on the obtained bounding box;
after resampling is completed, the image is normalized using the Z-score:

$$z_i = \frac{x_i - \mu}{\sigma}$$

where $z_i$ is the normalized gray value, $x_i$ is the original gray value, $\mu$ is the mean of the image gray values, and $\sigma$ is the standard deviation of the image gray values.
3. The lung CT image segmentation method based on deep learning according to claim 2, wherein the construction of the original lung CT image segmentation model in step S3 based on a convolution encoder, a Transformer encoder, a fusion module and a corresponding decoder specifically comprises the following steps:
the acquired image data is input into the convolution encoder module to learn local features, and into the Transformer encoder module to learn global features;
carrying out feature fusion on the obtained local features and global features by adopting a fusion module to obtain the output of the encoder;
constructing a decoder correspondingly according to the convolutional encoder module; inputting the acquired characteristics to a decoder for decoding;
and processing the characteristics output by the decoder by adopting a softmax function to obtain the final probabilities of the characteristics of different categories.
4. The lung CT image segmentation method based on deep learning according to claim 3, wherein the convolution encoder module comprises a plurality of convolution modules, each of which comprises two Conv-IN-LeakyReLU modules;
each Conv-IN-LeakyReLU module comprises a convolution layer, an InstanceNorm layer and a LeakyReLU layer connected in series;
the processing function of the convolution encoder module is $Y_o = \text{LeakyReLU}(\text{IN}(\text{Conv}(X)))$, where $Y_o$ is the output feature of the encoder module, $X$ is the input feature, $\text{Conv}()$ is the convolution layer processing function, $\text{IN}()$ is the InstanceNorm layer processing function, and $\text{LeakyReLU}()$ is the LeakyReLU layer processing function;
the convolution layer is a three-dimensional convolution layer;
the InstanceNorm layer is processed according to the formula $Y = \gamma\dfrac{X-\mu}{\sqrt{\sigma+\epsilon}} + \beta$, where $Y$ is the processed feature, $X$ is the input feature, $\mu$ is the mean of the input feature, $\sigma$ is the variance of the input feature, $\epsilon$ is an error coefficient, $\gamma$ is a first learnable parameter and $\beta$ is a second learnable parameter;
the processing formula of the LeakyReLU layer is $y_i = \begin{cases} x_i, & x_i \ge 0 \\ a x_i, & x_i < 0 \end{cases}$, where $y_i$ is the feature after the activation function, $x_i$ is the $i$-th value in the feature map, and $a$ is a fixed constant;
the first Conv-IN-LeakyReLU module in each convolution module is used for the downsampling operation, with a convolution kernel size of 3 and a convolution kernel sliding stride of 2.
5. The lung CT image segmentation method based on deep learning according to claim 4, wherein the Transformer encoder module comprises a patch embedding module, a plurality of consecutive Transformer blocks and a plurality of downsampling modules; a downsampling module is connected between every two adjacent Transformer blocks;
the downsampling module adopts trilinear interpolation, and after interpolation is completed, a convolution with kernel size 1 adjusts the number of channels of the feature map so that the number of channels is doubled;
the patch embedding module divides the image into a series of non-overlapping patches and projects each patch into a high-dimensional space;
each Transformer block comprises a first normalization layer, a ShiftDW layer, a second normalization layer and a multi-layer perceptron layer; the input features are normalized by the first normalization layer and then processed by the ShiftDW layer; the output of the ShiftDW layer is combined with the input features through a shortcut connection to obtain the second input features; the second input features are processed by the second normalization layer and then by the multi-layer perceptron layer; the output of the multi-layer perceptron layer is combined with the second input features through a shortcut connection to give the output features of the current Transformer block;
the calculation formulas of the Transformer block are:

$$\hat{z}^l = \text{ShiftDW}(\text{BN}(z^{l-1})) + z^{l-1}$$
$$z^l = \text{MLP}(\text{BN}(\hat{z}^l)) + \hat{z}^l$$

where $\hat{z}^l$ is the output of the $l$-th ShiftDW module, $\text{ShiftDW}()$ is the ShiftDW layer processing function, $z^l$ is the output of the $l$-th Transformer block, $\text{BN}()$ is the processing function of the first and second normalization layers, and $\text{MLP}()$ is the multi-layer perceptron processing function with 2 layers; the formula of $\text{MLP}()$ is

$$\text{MLP}(x) = \text{Linear}_2(\text{GELU}(\text{Linear}_1(x)))$$

where $x$ is the input feature, $\text{Linear}_1()$ is the first fully connected operation, $\text{Linear}_2()$ is the second fully connected operation, and $\text{GELU}()$ is the Gaussian error linear unit, $\text{GELU}(x) = 0.5x\left(1 + \text{erf}\left(x/\sqrt{2}\right)\right)$ with $\text{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$.
6. The lung CT image segmentation method based on deep learning according to claim 5, wherein the ShiftDW layer is used to replace the multi-head self-attention module in the Transformer block; the ShiftDW layer specifically comprises a shift operation, a first point-wise convolution, a depth-wise convolution and a second point-wise convolution connected in series; the calculation formula of the ShiftDW layer is

$$\text{ShiftDW}(x) = \text{PWConv}_2(\text{DWConv}(\text{PWConv}_1(\text{Shift}(x))))$$

where $\text{ShiftDW}(x)$ is the output feature of the ShiftDW layer, $x$ is the input feature, $\text{PWConv}_1()$ is the first point-wise convolution with kernel size 1, $\text{DWConv}()$ is the depth-wise convolution, $\text{PWConv}_2()$ is the second point-wise convolution with kernel size 1, and $\text{Shift}()$ is the shift operation; the shift operation moves a fraction $\gamma$ of the $C$ channels of the input feature $z$ by one position along each of the spatial dimensions, where $\hat{z}$ denotes the feature after the shift operation, $D$ is the spatial depth, $H$ is the spatial width, $W$ is the spatial height, $\gamma$ is a scale factor and $C$ is the number of channels.
7. The lung CT image segmentation method based on deep learning according to claim 6, wherein the feature fusion is performed on the obtained local feature and the global feature by using a fusion module, so as to obtain the output of the encoder, and the method specifically comprises the following steps:
the fusion module comprises a global attention module and a local attention module;
the global attention module acquires weights of different channels by adopting global average pooling, and then adopts point-by-point convolution as a channel context aggregator; the local attention module adopts point-by-point convolution as a context aggregator, and simultaneously maintains the same resolution as the original feature map, so that more space information is reserved;
the calculation formula of the fusion module is as follows:
z a =Cat(z c ,z t )
z=Conv(z a )
z l =Local(z)
z g =Global(z)
z o =z l +z g
where $z_a$ is the feature obtained by concatenating the output features of the convolution encoder and the Transformer encoder, $\text{Cat}()$ is concatenation along the channel dimension, $z_c$ is the output feature of the convolution encoder, $z_t$ is the output feature of the Transformer encoder, $z$ is the fused input feature, $\text{Conv}()$ is a convolution with kernel size $C \times 2C \times 1 \times 1 \times 1$, $z_l$ is the feature output by the local attention module, $z_g$ is the feature output by the global attention module, and $z_o$ is the output feature of the fusion module;
the calculation formula of the global attention module is:

$$\text{Global}(z) = \text{BN}_2(\text{Conv}_2(\text{ReLU}(\text{BN}_1(\text{Conv}_1(\text{GAP}(z))))))$$

where $\text{Global}(z)$ is the output feature of the global attention module, $z$ is the input feature, $\text{GAP}()$ is global average pooling, $\text{GAP}(z) = \frac{1}{D \times H \times W}\sum_{i=1}^{D}\sum_{j=1}^{H}\sum_{k=1}^{W} z_{[:,i,j,k]}$ with $D$ the spatial depth, $H$ the spatial width and $W$ the spatial height, $\text{Conv}_1()$ is the first convolution layer with kernel size $\frac{C}{r} \times C \times 1 \times 1 \times 1$, $C$ is the number of channels of the input feature, $r$ is the scaling factor, $\text{BN}_1()$ is the first batch normalization, $\text{ReLU}()$ is the activation function, $\text{Conv}_2()$ is the second convolution layer with kernel size $C \times \frac{C}{r} \times 1 \times 1 \times 1$, and $\text{BN}_2()$ is the second batch normalization;
the calculation formula of the local attention module is as follows:
Local(z)=BN 2 (Conv 2 (ReLU(BN 1 (Conv 1 (z)))))
where Local (z) is the characteristic of the Local attention module output.
8. The pulmonary CT image segmentation method based on deep learning as claimed in claim 7, wherein the constructing the decoder according to the convolutional encoder module comprises the following steps:
the decoder comprises a plurality of convolution modules, the same number as in the convolution encoder module; each convolution module of the decoder comprises two Conv-IN-LeakyReLU modules, an upsampling module is further arranged between two adjacent Conv-IN-LeakyReLU modules, and the upsampling module is implemented with deconvolution; each stage of the decoder is connected to the features of the corresponding parallel encoder stage through a skip connection and then passes to the next stage of the decoder through upsampling.
9. The lung CT image segmentation method based on deep learning according to claim 8, wherein the training of step S4 comprises the following steps:
inputting the result output by the lung CT image segmentation original model and a real segmentation label into a loss function, and optimizing the lung CT image segmentation original model according to the result of the loss function;
the loss function is calculated as $L = \alpha L_{Dice} + \beta L_{CE}$, where $L_{Dice}$ is the Dice loss, $L_{CE}$ is the cross-entropy loss, $\alpha$ is the weight of the Dice loss, and $\beta$ is the weight of the cross-entropy loss;
the Dice loss $L_{Dice}$ is calculated as $L_{Dice} = 1 - \dfrac{2TP}{2TP + FP + FN}$, where $TP$ is the probability of a true positive, $FP$ the probability of a false positive, and $FN$ the probability of a false negative;
the cross-entropy loss $L_{CE}$ is calculated as $L_{CE} = -\dfrac{1}{N}\sum_{i=1}^{N} y_i \log \hat{y}_i$, where $N$ is the number of samples, $y_i$ is the real label and $\hat{y}_i$ is the prediction output by the model.
10. An imaging method comprising the lung CT image segmentation method based on deep learning according to any one of claims 1 to 9, comprising the steps of:
A. acquiring an original lung CT image to be segmented;
B. adopting the lung CT image segmentation method based on deep learning as claimed in one of claims 1-9 to segment the original lung CT image to be segmented acquired in the step A;
C. marking the image segmentation result obtained in step B on the original lung CT image obtained in step A and performing secondary imaging, then outputting a lung CT image carrying the lung CT image segmentation result, thus completing the imaging.
CN202310219966.3A 2023-03-09 2023-03-09 Lung CT image segmentation method and imaging method based on deep learning Pending CN116468732A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310219966.3A CN116468732A (en) 2023-03-09 2023-03-09 Lung CT image segmentation method and imaging method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310219966.3A CN116468732A (en) 2023-03-09 2023-03-09 Lung CT image segmentation method and imaging method based on deep learning

Publications (1)

Publication Number Publication Date
CN116468732A true CN116468732A (en) 2023-07-21

Family

ID=87183163

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310219966.3A Pending CN116468732A (en) 2023-03-09 2023-03-09 Lung CT image segmentation method and imaging method based on deep learning

Country Status (1)

Country Link
CN (1) CN116468732A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116825363A (en) * 2023-08-29 2023-09-29 济南市人民医院 Early lung adenocarcinoma pathological type prediction system based on fusion deep learning network
CN116825363B (en) * 2023-08-29 2023-12-12 济南市人民医院 Early lung adenocarcinoma pathological type prediction system based on fusion deep learning network
CN117132606A (en) * 2023-10-24 2023-11-28 四川大学 Segmentation method for lung lesion image
CN117132606B (en) * 2023-10-24 2024-01-09 四川大学 Segmentation method for lung lesion image


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination