CN116468732A - Lung CT image segmentation method and imaging method based on deep learning

Info

Publication number: CN116468732A
Application number: CN202310219966.3A
Authority: CN
Inventors: 周建存; 匡湖林; 王建新; 刘权
Original assignee: Zhixiang Health Technology Beijing Co ltd; Central South University; Hunan City University
Current assignee: Zhixiang Health Technology Beijing Co ltd; Central South University; Hunan City University
Priority date: 2023-03-09
Filing date: 2023-03-09
Publication date: 2023-07-21

Abstract

The invention discloses a lung CT image segmentation method based on deep learning, which comprises: acquiring an existing lung CT image data set and preprocessing it to obtain a training data set; constructing an original lung CT image segmentation model and training it to obtain a lung CT image segmentation model; and segmenting actual lung CT images with the trained lung CT image segmentation model. The invention also discloses an imaging method comprising the lung CT image segmentation method based on deep learning. The invention achieves more accurate image segmentation results, accelerates convergence, segments well on small-scale data sets, avoids feature collapse and generates diversified features; the invention therefore has high reliability and good accuracy, and is objective and scientific.

Description

Lung CT image segmentation method and imaging method based on deep learning

Technical Field

The invention belongs to the field of medical image processing, and particularly relates to a lung CT image segmentation method and an imaging method based on deep learning.

Background

With the development of the economy and technology and the improvement of living standards, people pay increasing attention to health; medical image processing techniques are therefore becoming increasingly important.

In clinical applications and experimental research, lung CT images play an important auxiliary role: they help clinical staff diagnose lung diseases and help researchers conduct subsequent studies. The segmentation of lung CT images is therefore particularly important.

At present, lung CT image segmentation methods usually adopt a deep learning model that learns feature representations autonomously and completes the segmentation task using the learned high-dimensional abstract features; for example, Transformer-based image segmentation methods are now commonly used to segment lung CT images.

The Transformer was originally applied to natural language processing, where it achieved significant success; since the advent of ViT, Transformers have achieved competitive performance on a range of computer vision tasks. However, as research has continued, some shortcomings of the Transformer have gradually been revealed. Most importantly, a Transformer typically requires training on a large-scale data set to perform well. Medical image data sets, however, are typically small, and lung CT image data sets are smaller still. Therefore, existing Transformer-based lung CT image segmentation methods have relatively poor accuracy and low reliability.

Disclosure of Invention

The first object of the invention is to provide a lung CT image segmentation method based on deep learning that has high reliability and good accuracy and is objective and scientific.

The second object of the invention is to provide an imaging method comprising the lung CT image segmentation method based on deep learning.

The lung CT image segmentation method based on the deep learning provided by the invention comprises the following steps:

S1, acquiring an existing lung CT image data set;

S2, preprocessing the data acquired in step S1 to obtain a training data set;

S3, constructing an original lung CT image segmentation model based on a convolutional encoder, a Transformer encoder, a fusion module and a corresponding decoder;

S4, training the original lung CT image segmentation model constructed in step S3 with the training data set obtained in step S2 to obtain a lung CT image segmentation model;

S5, segmenting the actual lung CT image with the lung CT image segmentation model obtained in step S4.

The preprocessing of the data acquired in step S1, performed in step S2, specifically comprises the following steps:

the preprocessing comprises image cropping, resampling and normalization;

the image cropping obtains a bounding box of the required target by cropping with a set pixel threshold;

a resampling operation is performed on the obtained bounding box;

after resampling is completed, the image is normalized using the Z-score:

$$z_i = \frac{x_i - \mu}{\sigma}$$

where $z_i$ is the normalized gray value; $x_i$ is the original gray value; $\mu$ is the mean gray value of the image; and $\sigma$ is the standard deviation of the image gray values.
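The cropping and normalization steps above can be illustrated with a short Python sketch. It is only a minimal example under stated assumptions, not the patented implementation: the function names `bounding_box` and `zscore_normalize`, the "greater than threshold" foreground test, the use of the population standard deviation, and the 2-D nested-list image layout are all chosen for illustration, and the resampling step is omitted.

```python
import math


def bounding_box(image, threshold):
    """Return (row_min, row_max, col_min, col_max) of pixels above threshold.

    `image` is a 2-D list of gray values; returns None if no pixel qualifies.
    The "> threshold" rule is an assumption made for this sketch.
    """
    rows = [r for r, row in enumerate(image) if any(p > threshold for p in row)]
    cols = [c for row in image for c, p in enumerate(row) if p > threshold]
    if not rows:
        return None
    return min(rows), max(rows), min(cols), max(cols)


def zscore_normalize(values):
    """Apply z_i = (x_i - mu) / sigma to a flat list of gray values."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)  # population std
    if sigma == 0:
        return [0.0 for _ in values]  # constant image: avoid division by zero
    return [(v - mu) / sigma for v in values]
```

After normalization the gray values have zero mean and unit standard deviation, which keeps the input range comparable across scans.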

Constructing the original lung CT image segmentation model in step S3, based on a convolutional encoder, a Transformer encoder, a fusion module and a corresponding decoder, specifically comprises the following steps:

the acquired image data is input into the convolutional encoder module to learn local features, and into the Transformer encoder module to learn global features;

the fusion module performs feature fusion on the obtained local features and global features to obtain the output of the encoder;

a decoder is constructed to correspond to the convolutional encoder module, and the acquired features are input to the decoder for decoding;

the features output by the decoder are processed with a softmax function to obtain the final probabilities of the different categories.

The convolutional encoder module is composed of a plurality of convolution modules, and each convolution module comprises two Conv-IN-LeakyReLU modules;

each Conv-IN-LeakyReLU module comprises a convolution layer, an InstanceNorm layer and a LeakyReLU layer connected in series;

the processing function of the convolutional encoder module is $Y_o = \mathrm{LeakyReLU}(\mathrm{IN}(\mathrm{Conv}(X)))$, where $Y_o$ is the output feature of the encoder module, $X$ is the input feature, Conv() is the convolution layer processing function, IN() is the InstanceNorm layer processing function, and LeakyReLU() is the LeakyReLU layer processing function;

the convolution layer is a three-dimensional convolution layer;

the InstanceNorm layer is computed as

$$Y = \frac{X - \mu}{\sqrt{\sigma^2 + \varepsilon}}\,\gamma + \beta$$

where $Y$ is the resulting feature, $X$ is the input feature, $\mu$ is the mean of the input feature, $\sigma^2$ is the variance of the input feature, $\varepsilon$ is an error coefficient, $\gamma$ is the first learnable parameter, and $\beta$ is the second learnable parameter;

the LeakyReLU layer is computed as

$$y_i = \begin{cases} x_i, & x_i \ge 0 \\ a\,x_i, & x_i < 0 \end{cases}$$

where $y_i$ is the feature after the activation function; $x_i$ is the i-th value in the feature map; and $a$ is a fixed constant;

the first Conv-IN-LeakyReLU module in each convolution module performs the downsampling operation, with a convolution kernel size of 3 and a stride of 2.
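To make the Conv-IN-LeakyReLU description more concrete, the Python sketch below shows instance normalization and LeakyReLU on a single flattened channel, plus the standard output-size formula for a strided convolution. It is a hedged illustration only: the function names, the default values `eps=1e-5` and `a=0.01`, the convention that negative inputs are multiplied by `a`, and the padding of 1 are assumptions that the text does not specify.

```python
import math


def instance_norm(channel, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel of one sample to zero mean and unit variance,
    then scale by gamma and shift by beta (both learnable in a real layer)."""
    n = len(channel)
    mu = sum(channel) / n
    var = sum((v - mu) ** 2 for v in channel) / n
    return [gamma * (v - mu) / math.sqrt(var + eps) + beta for v in channel]


def leaky_relu(values, a=0.01):
    """Keep non-negative values; multiply negative values by the constant a
    (assumed convention for this sketch)."""
    return [v if v >= 0 else a * v for v in values]


def conv_out_size(size, kernel=3, stride=2, padding=1):
    """Spatial size after a convolution along one axis (padding is assumed)."""
    return (size + 2 * padding - kernel) // stride + 1
```

With these assumed settings, a kernel size of 3 and a stride of 2 halve each spatial dimension, for example from 64 to 32 voxels per axis.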

The Transformer encoder module specifically comprises a patch embedding module, a plurality of consecutive Transformer blocks and a plurality of downsampling modules; a downsampling module is connected between every two adjacent Transformer blocks;

the downsampling module uses trilinear interpolation; after interpolation, a convolution with a kernel size of 1 adjusts the number of channels of the feature map so that the number of channels is doubled;

the patch embedding module divides the image into a series of non-overlapping patches and projects each patch into a high-dimensional space;

each Transformer block comprises a first normalization layer, a ShiftDW layer, a second normalization layer and a multi-layer perceptron layer; the input feature is normalized by the first normalization layer and then processed by the ShiftDW layer; the output of the ShiftDW layer is added to the input feature through a shortcut connection to obtain a second input feature; the second input feature is processed by the second normalization layer and then by the multi-layer perceptron layer; the output of the multi-layer perceptron layer is added to the second input feature and used as the output feature of the current Transformer block;

the calculation formula of the Transformer block is as follows:

$$\hat{z}^{l} = \mathrm{ShiftDW}(\mathrm{BN}(z^{l-1})) + z^{l-1}$$

$$z^{l} = \mathrm{MLP}(\mathrm{BN}(\hat{z}^{l})) + \hat{z}^{l}$$

where $\hat{z}^{l}$ is the output of the ShiftDW module of layer $l$; ShiftDW() is the ShiftDW layer processing function; $z^{l}$ is the output of the layer-$l$ Transformer block; BN() is the processing function of the first and second normalization layers; MLP() is the processing function of the multi-layer perceptron layer, and the multi-layer perceptron has 2 layers; the formula of MLP() is:

$$\mathrm{MLP}(x) = \mathrm{Linear}_2(\mathrm{GELU}(\mathrm{Linear}_1(x)))$$

where MLP() is the processing function; $x$ is the input feature; $\mathrm{Linear}_1$ and $\mathrm{Linear}_2$ are fully connected modules; GELU() is the processing function of the Gaussian error linear unit, with $\mathrm{GELU}(x) = \frac{x}{2}\left[1 + \mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)\right]$, where $x$ is the input feature and erf() is an intermediate processing function with $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$.
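The two-step structure of the Transformer block (normalize, transform, add the shortcut; then repeat with the multi-layer perceptron) can be sketched in Python as follows. This is a minimal sketch under assumptions: the sub-layers are passed in as plain functions on flat lists, the names `transformer_block`, `shift_dw`, `mlp`, `norm1` and `norm2` are hypothetical, and GELU is written in its exact erf form using the standard library.

```python
import math


def gelu(x):
    """Gaussian error linear unit in its exact form: x * Phi(x)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def transformer_block(z, shift_dw, mlp, norm1, norm2):
    """One block with two shortcut connections.

    z_hat = shift_dw(norm1(z)) + z
    out   = mlp(norm2(z_hat)) + z_hat
    All arguments after `z` are callables mapping a list to a list of equal
    length; they stand in for the real layers in this sketch.
    """
    z_hat = [a + b for a, b in zip(shift_dw(norm1(z)), z)]
    return [a + b for a, b in zip(mlp(norm2(z_hat)), z_hat)]
```

Passing identity functions for every sub-layer doubles the input twice, which is a quick way to check that both shortcut connections are wired as described.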

The ShiftDW layer replaces the multi-head self-attention module in the Transformer block; the ShiftDW layer specifically comprises a shift operation, a first point-wise convolution operation, a depth-wise convolution operation and a second point-wise convolution operation connected in series; the calculation formula of the ShiftDW layer is:

$$\mathrm{ShiftDW}(z) = \mathrm{PWConv}_2(\mathrm{DWConv}(\mathrm{PWConv}_1(\mathrm{Shift}(z))))$$

where ShiftDW(z) is the output feature of the ShiftDW layer; $z$ is the input feature; $\mathrm{PWConv}_1()$ is the first point-wise convolution operation function, with a convolution kernel size of 1; DWConv() is the depth-wise convolution operation function; $\mathrm{PWConv}_2()$ is the second point-wise convolution operation function, with a convolution kernel size of 1; Shift() is the shift operation function, and the calculation formula of the shift operation is:

where $\hat{z}$ denotes the features after the shift operation; $z[\cdot]$ is the input feature; $D$ is the spatial depth; $H$ is the spatial width; $W$ is the spatial height; $\gamma$ is a scale factor; and $C$ is the number of channels.
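The general idea of a channel-group shift can be illustrated with the simplified Python sketch below. It is not the patent's shift formula, whose exact index ranges are not reproduced here: the sketch assumes a single spatial axis instead of three, assumes that a fraction `gamma` of the channels is shifted one position forward and another fraction `gamma` one position backward, and assumes zero filling at the border. The function name `shift_1d` is hypothetical.

```python
def shift_1d(z, gamma):
    """Shift channel groups along one spatial axis (simplified illustration).

    z     : list of C channels, each a list of W values
    gamma : fraction of channels in each shifted group
    The first gamma*C channels move one step to the right, the next gamma*C
    channels move one step to the left, and the remaining channels are copied.
    """
    c = len(z)
    g = int(gamma * c)  # number of channels per shifted group
    out = []
    for idx, channel in enumerate(z):
        if idx < g:
            out.append([0.0] + channel[:-1])   # shift right, zero-fill
        elif idx < 2 * g:
            out.append(channel[1:] + [0.0])    # shift left, zero-fill
        else:
            out.append(list(channel))          # unchanged
    return out
```

Because the shift has no parameters and no multiplications, the following point-wise convolution is what mixes the information that the shift brought in from neighbouring positions.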

Performing feature fusion on the obtained local features and global features with the fusion module to obtain the output of the encoder specifically comprises the following steps:

the fusion module comprises a global attention module and a local attention module;

the global attention module obtains the weights of the different channels by global average pooling and then uses point-wise convolution as a channel context aggregator; the local attention module uses point-wise convolution as a context aggregator while maintaining the same resolution as the original feature map, so that more spatial information is retained;

the calculation formula of the fusion module is as follows:

$$z_a = \mathrm{Cat}(z_c, z_t)$$

$$z = \mathrm{Conv}(z_a)$$

$$z_l = \mathrm{Local}(z)$$

$$z_g = \mathrm{Global}(z)$$

$$z_o = z_l + z_g$$

where $z_a$ is the feature obtained by concatenating the output feature of the convolutional encoder with the output feature of the Transformer encoder; Cat() is the function that concatenates along the channel dimension; $z_c$ is the output feature of the convolutional encoder; $z_t$ is the output feature of the Transformer encoder; $z$ is the input feature; Conv() is a convolution operation function with a convolution kernel size of $C \times 2C \times 1 \times 1 \times 1$; $z_l$ is the feature output by the local attention module; $z_g$ is the feature output by the global attention module; $z_o$ is the output feature of the fusion module;

the calculation formula of the global attention module is as follows:

$$\mathrm{Global}(z) = \mathrm{BN}_2(\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{BN}_1(\mathrm{Conv}_1(\mathrm{GAP}(z))))))$$

where Global(z) is the output feature of the global attention module; $z$ is the input feature; GAP() is the global average pooling function, with $\mathrm{GAP}(z) = \frac{1}{D \times H \times W}\sum_{i=1}^{D}\sum_{j=1}^{H}\sum_{k=1}^{W} z_{[:,i,j,k]}$, where $D$ is the spatial depth, $H$ is the spatial width, $W$ is the spatial height and $z_{[:,i,j,k]}$ is the input feature; $\mathrm{Conv}_1()$ is the processing function of the first convolution layer, with a convolution kernel size of $\frac{C}{r} \times C \times 1 \times 1 \times 1$, where $C$ is the number of channels of the input feature and $r$ is the scaling factor; $\mathrm{BN}_1()$ is the first batch normalization function; ReLU() is an activation function; $\mathrm{Conv}_2()$ is the processing function of the second convolution layer, with a convolution kernel size of $C \times \frac{C}{r} \times 1 \times 1 \times 1$; $\mathrm{BN}_2()$ is the second batch normalization function;

the calculation formula of the local attention module is as follows:

$$\mathrm{Local}(z) = \mathrm{BN}_2(\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{BN}_1(\mathrm{Conv}_1(z)))))$$

where Local(z) is the feature output by the local attention module.
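The data flow of the fusion module (concatenate, mix channels, run a local and a global branch, add the results) can be traced with the pure-Python sketch below. It is a simplified illustration under assumptions, not the patented implementation: features are stored as `[channels][voxels]` nested lists, each point-wise convolution is written as a matrix product over channels, batch normalization is treated as the identity, and all function and parameter names are hypothetical.

```python
def pointwise_conv(x, weight):
    """1x1x1 convolution: x is [C_in][N], weight is [C_out][C_in]."""
    n_vox = len(x[0])
    return [[sum(w * x[c][n] for c, w in enumerate(row)) for n in range(n_vox)]
            for row in weight]


def relu(x):
    return [[max(0.0, v) for v in channel] for channel in x]


def gap(x):
    """Global average pooling: one mean per channel, shape [C][1]."""
    return [[sum(channel) / len(channel)] for channel in x]


def attention_branch(x, w1, w2):
    """Conv1 -> ReLU -> Conv2 (batch norm omitted in this sketch)."""
    return pointwise_conv(relu(pointwise_conv(x, w1)), w2)


def fuse(z_c, z_t, w_mix, w1_l, w2_l, w1_g, w2_g):
    z_a = z_c + z_t                               # concatenate along channels
    z = pointwise_conv(z_a, w_mix)                # 2C channels -> C channels
    z_l = attention_branch(z, w1_l, w2_l)         # keeps full resolution
    z_g = attention_branch(gap(z), w1_g, w2_g)    # one value per channel
    # broadcast the per-channel global value over every voxel and add
    return [[v + g[0] for v in local] for local, g in zip(z_l, z_g)]
```

The local branch keeps one value per voxel while the global branch produces one value per channel, so the final addition broadcasts the global term across all spatial positions.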

Constructing the decoder to correspond to the convolutional encoder module specifically comprises the following:

the decoder comprises a plurality of convolution modules, the same number as in the convolutional encoder module; each convolution module of the decoder comprises two Conv-IN-LeakyReLU modules; an upsampling module, implemented by deconvolution, is further arranged between two adjacent Conv-IN-LeakyReLU modules; each stage of the decoder receives the features of the corresponding stage of the parallel encoder through a skip connection and then passes to the next stage of the decoder through upsampling.

The training of step S4 specifically includes the following steps:

the output of the original lung CT image segmentation model and the ground-truth segmentation label are input into a loss function, and the original lung CT image segmentation model is optimized according to the value of the loss function;

the loss function $L$ is calculated as $L = \alpha L_{Dice} + \beta L_{CE}$, where $L_{Dice}$ is the Dice loss, $L_{CE}$ is the cross-entropy loss, $\alpha$ is the weight of the Dice loss, and $\beta$ is the weight of the cross-entropy loss;

the Dice loss $L_{Dice}$ is calculated as $L_{Dice} = 1 - \frac{2TP}{2TP + FP + FN}$, where $TP$ is the probability that a result is a true positive, $FP$ is the probability that a result is a false positive, and $FN$ is the probability that a result is a false negative;

the cross-entropy loss $L_{CE}$ is calculated as $L_{CE} = -\frac{1}{N}\sum_{i=1}^{N} y_i \log \hat{y}_i$, where $N$ is the number of samples, $y_i$ is the ground-truth label, and $\hat{y}_i$ is the prediction output by the model.
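A weighted sum of a Dice loss and a cross-entropy loss can be sketched in Python as follows. The sketch is illustrative only and rests on assumptions: it handles a single foreground class with per-voxel foreground probabilities, uses the common "one minus soft Dice" form, evaluates the cross entropy on the probability assigned to the true class, and adds small `eps` terms for numerical stability. The function names and the default weights `alpha=1.0` and `beta=1.0` are hypothetical.

```python
import math


def soft_dice_loss(probs, targets, eps=1e-8):
    """1 - 2TP / (2TP + FP + FN) with soft (probabilistic) counts."""
    tp = sum(p * t for p, t in zip(probs, targets))
    fp = sum(p * (1 - t) for p, t in zip(probs, targets))
    fn = sum((1 - p) * t for p, t in zip(probs, targets))
    return 1.0 - 2.0 * tp / (2.0 * tp + fp + fn + eps)


def cross_entropy(probs, targets, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for p, t in zip(probs, targets):
        p_true = p if t == 1 else 1.0 - p
        total += -math.log(max(p_true, eps))
    return total / len(targets)


def combined_loss(probs, targets, alpha=1.0, beta=1.0):
    """L = alpha * L_Dice + beta * L_CE."""
    return alpha * soft_dice_loss(probs, targets) + beta * cross_entropy(probs, targets)
```

The Dice term responds to region overlap and is less sensitive to class imbalance, while the cross-entropy term gives a per-voxel gradient; weighting the two lets training balance both signals.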

The invention also provides an imaging method comprising the lung CT image segmentation method based on deep learning, which comprises the following steps:

A. acquiring an original lung CT image to be segmented;

B. segmenting the original lung CT image acquired in step A with the lung CT image segmentation method based on deep learning;

C. marking the image segmentation result obtained in step B on the original lung CT image obtained in step A and performing secondary imaging, outputting a lung CT image carrying the segmentation result, and thereby completing the imaging.
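Step C, marking the segmentation result on the original image, could for example be realized as a colour overlay on each 2-D slice. The Python sketch below shows one such possibility; it is an assumption made for illustration, since the text does not specify how the marking is rendered. The function name `overlay_mask`, the blending factor `alpha` and the red marking colour are hypothetical.

```python
def overlay_mask(gray_slice, mask, alpha=0.5, color=(255, 0, 0)):
    """Blend a marking colour into the segmented pixels of a grayscale slice.

    gray_slice : 2-D list of gray values in 0..255
    mask       : 2-D list of 0/1 labels with the same shape
    Returns a 2-D list of (R, G, B) tuples.
    """
    out = []
    for gray_row, mask_row in zip(gray_slice, mask):
        row = []
        for g, m in zip(gray_row, mask_row):
            if m:
                # segmented pixel: mix the original gray value with the colour
                row.append(tuple(round((1 - alpha) * g + alpha * c) for c in color))
            else:
                # background pixel: keep the original gray value
                row.append((g, g, g))
        out.append(row)
    return out
```

With `alpha` below 1 the underlying anatomy stays visible beneath the marking, which is usually what a reader of the output image needs.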

The lung CT image segmentation method based on deep learning and the imaging method provided by the invention make full use of the advantages of both convolution and the Transformer and achieve more accurate image segmentation results; the parallel encoder structure designed by the invention accelerates convergence while retaining the strong global modeling capability of the Transformer, so that the model also segments well on small-scale data sets; the fusion module effectively fuses the features from the different branches, yielding more accurate segmentation results; in the Transformer module, the shift operation replaces the original multi-head self-attention operation, which reduces computational complexity and speeds up inference, and the enhanced shortcut connection avoids feature collapse and generates diversified features; the invention therefore has high reliability and good accuracy, and is objective and scientific.

Drawings

Fig. 1 is a flow chart of the segmentation method of the present invention.

Fig. 2 is a schematic diagram of the Transformer encoder module of the segmentation method of the present invention.

Fig. 3 is a schematic structural diagram of a fusion module of the segmentation method of the present invention.

Fig. 4 is a flow chart of the imaging method of the present invention.

Detailed Description

Fig. 1 is a flow chart of the segmentation method according to the present invention: the lung CT image segmentation method based on the deep learning provided by the invention comprises the following steps:

S1, acquiring an existing lung CT image data set;

S2, preprocessing the data acquired in step S1 to obtain a training data set; this specifically comprises the following steps:

the preprocessing comprises image cropping, resampling and normalization;

a resampling operation is performed on the obtained bounding box;

after resampling is completed, the image is normalized using the Z-score:

$$z_i = \frac{x_i - \mu}{\sigma}$$

where $z_i$ is the normalized gray value; $x_i$ is the original gray value; $\mu$ is the mean gray value of the image; and $\sigma$ is the standard deviation of the image gray values;

S3, constructing an original lung CT image segmentation model based on a convolutional encoder, a Transformer encoder, a fusion module and a corresponding decoder; this specifically comprises the following steps:

the features output by the decoder are processed with a softmax function to obtain the final probabilities of the different categories;

in a specific implementation, the convolutional encoder module is composed of a plurality of convolution modules, and each convolution module comprises two Conv-IN-LeakyReLU modules;

the convolution layer is a three-dimensional convolution layer;

the first Conv-IN-LeakyReLU module in each convolution module performs the downsampling operation, with a convolution kernel size of 3 and a stride of 2;

the Transformer encoder module (shown in Fig. 2) specifically comprises a patch embedding module, a plurality of consecutive Transformer blocks and a plurality of downsampling modules; a downsampling module is connected between every two adjacent Transformer blocks;

the calculation formula of the Transformer block is as follows:

$$\hat{z}^{l} = \mathrm{ShiftDW}(\mathrm{BN}(z^{l-1})) + z^{l-1}$$

$$z^{l} = \mathrm{MLP}(\mathrm{BN}(\hat{z}^{l})) + \hat{z}^{l}$$

where $\hat{z}^{l}$ is the output of the ShiftDW module; ShiftDW() is the ShiftDW layer processing function; $z^{l}$ is the output of the layer-$l$ Transformer block; BN() is the processing function of the first and second normalization layers; MLP() is the processing function of the multi-layer perceptron layer, and the multi-layer perceptron has 2 layers; the formula of MLP() is:

$$\mathrm{MLP}(x) = \mathrm{Linear}_2(\mathrm{GELU}(\mathrm{Linear}_1(x)))$$

where MLP() is the processing function; $\mathrm{Linear}_1()$ is the first fully connected operation; $x$ is the input feature; $\mathrm{Linear}_2()$ is the second fully connected operation; GELU() is the processing function of the Gaussian error linear unit, with $\mathrm{GELU}(x) = \frac{x}{2}\left[1 + \mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)\right]$, where $x$ is the input feature and erf() is an intermediate processing function with $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$;

$$\mathrm{ShiftDW}(z) = \mathrm{PWConv}_2(\mathrm{DWConv}(\mathrm{PWConv}_1(\mathrm{Shift}(z))))$$

where ShiftDW(z) is the output feature of the ShiftDW layer; $z$ is the input feature; $\mathrm{PWConv}_1()$ is the first point-wise convolution operation function, with a convolution kernel size of 1; DWConv() is the depth-wise convolution operation function; $\mathrm{PWConv}_2()$ is the second point-wise convolution operation function, with a convolution kernel size of 1; Shift() is the shift operation function, and the calculation formula of the shift operation is:

where $\hat{z}$ is the feature obtained after the shift operation; $z[\cdot]$ is the input feature; $D$ is the spatial depth; $H$ is the spatial width; $W$ is the spatial height; $\gamma$ is a scale factor; and $C$ is the number of channels;

the output of the encoder is obtained by fusion, which specifically comprises the following steps:

the fusion module comprises a global attention module and a local attention module (as shown in Fig. 3);

the calculation formula of the fusion module is as follows:

$$z_a = \mathrm{Cat}(z_c, z_t)$$

$$z = \mathrm{Conv}(z_a)$$

$$z_l = \mathrm{Local}(z)$$

$$z_g = \mathrm{Global}(z)$$

$$z_o = z_l + z_g$$

where $z_a$ is the feature obtained by concatenating the output feature of the convolutional encoder with the output feature of the Transformer encoder; Cat() is the function that concatenates along the channel dimension; $z_c$ is the output feature of the convolutional encoder; $z_t$ is the output feature of the Transformer encoder; $z$ is the input feature; Conv() is a convolution operation function with a convolution kernel size of $C \times 2C \times 1 \times 1 \times 1$; $z_l$ is the feature output by the local attention module; $z_g$ is the feature output by the global attention module; $z_o$ is the output feature of the fusion module;

the calculation formula of the global attention module is as follows:

$$\mathrm{Global}(z) = \mathrm{BN}_2(\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{BN}_1(\mathrm{Conv}_1(\mathrm{GAP}(z))))))$$

where Global(z) is the output feature of the global attention module; $z$ is the input feature; GAP() is the global average pooling function, with $\mathrm{GAP}(z) = \frac{1}{D \times H \times W}\sum_{i=1}^{D}\sum_{j=1}^{H}\sum_{k=1}^{W} z_{[:,i,j,k]}$, where $D$ is the spatial depth, $H$ is the spatial width, $W$ is the spatial height and $z_{[:,i,j,k]}$ is the input feature; $\mathrm{Conv}_1()$ is the processing function of the first convolution layer, with a convolution kernel size of $\frac{C}{r} \times C \times 1 \times 1 \times 1$, where $C$ is the number of channels of the input feature and $r$ is the scaling factor; $\mathrm{BN}_1()$ is the first batch normalization function; ReLU() is an activation function; $\mathrm{Conv}_2()$ is the processing function of the second convolution layer, with a convolution kernel size of $C \times \frac{C}{r} \times 1 \times 1 \times 1$; $\mathrm{BN}_2()$ is the second batch normalization function;

the calculation formula of the local attention module is as follows:

$$\mathrm{Local}(z) = \mathrm{BN}_2(\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{BN}_1(\mathrm{Conv}_1(z)))))$$

where Local(z) is the feature output by the local attention module;

constructing the decoder specifically comprises the following:

the decoder comprises a plurality of convolution modules, the same number as in the convolutional encoder module; each convolution module of the decoder comprises two Conv-IN-LeakyReLU modules; an upsampling module, implemented by deconvolution, is further arranged between two adjacent Conv-IN-LeakyReLU modules; each stage of the decoder receives the features of the corresponding stage of the parallel encoder through a skip connection and then passes to the next stage of the decoder through upsampling;

S4, training the original lung CT image segmentation model constructed in step S3 with the training data set obtained in step S2 to obtain a lung CT image segmentation model; this specifically comprises the following steps:

the Dice loss $L_{Dice}$ is calculated as $L_{Dice} = 1 - \frac{2TP}{2TP + FP + FN}$, where $TP$ is the probability of a true positive, $FP$ is the probability of a false positive, and $FN$ is the probability of a false negative;

the cross-entropy loss $L_{CE}$ is calculated as $L_{CE} = -\frac{1}{N}\sum_{i=1}^{N} y_i \log \hat{y}_i$, where $N$ is the number of samples, $y_i$ is the ground-truth label, and $\hat{y}_i$ is the prediction output by the model;

The effectiveness of the segmentation method of the present invention is described below through experiments.

UNETR: its encoder is a hybrid architecture of convolution and Transformer, and its decoder is a convolutional architecture. In the encoder, features are extracted with a Transformer, and convolutions on the skip connections then produce features at different scales.

nnFormer: it proposes a new strategy for combining a Transformer with convolution. In the encoder, it uses a three-dimensional Swin Transformer for feature extraction, and between adjacent Swin Transformer blocks it uses convolution to downsample the feature map to obtain a multi-scale feature representation. In the decoder, it also uses a three-dimensional Swin Transformer and uses deconvolution to restore the resolution of the image.

nnUNet: a purely convolutional segmentation network with the same network architecture as UNet. nnUNet adaptively adjusts the depth of the model and some other model hyper-parameters based on the data set.

The performance of the method was evaluated in the experiments using the evaluation indexes Dice, Precision, Sensitivity, HD95 and ASSD. The Dice index is commonly used to evaluate the segmentation quality of medical images; the closer the score is to 1, the more accurate the segmentation result. The calculation formula is as follows:

$$\mathrm{Dice} = \frac{2TP}{2TP + FP + FN}$$

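HD95 and ASSD measure distances between two point sets, with smaller values representing better segmentation performance. The calculation formulas are as follows:

$$\mathrm{HD95}(G, M) = \max\left\{ \underset{x \in \partial G}{P_{95}}\, d(x, \partial M),\ \underset{y \in \partial M}{P_{95}}\, d(y, \partial G) \right\}$$

$$\mathrm{ASSD}(G, M) = \frac{\sum_{x \in \partial G} d(x, \partial M) + \sum_{y \in \partial M} d(y, \partial G)}{|\partial G| + |\partial M|}$$

where $\partial G$ and $\partial M$ denote the boundary points of $G$ and $M$, respectively; $|\partial G|$ is the cardinality of the boundary of $G$; $d(x, \partial M)$ is the shortest distance between point $x$ and all points in $\partial M$; and $P_{95}$ denotes the 95th percentile. The formulas of Precision and Sensitivity are as follows:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$

where $TP$, $FN$ and $FP$ represent the probability of a true positive, the probability of a false negative and the probability of a false positive, respectively;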
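The overlap-based and surface-based evaluation indexes can be computed as in the Python sketch below. It is an illustrative sketch under assumptions rather than the evaluation code used in the experiments: masks are flat lists of 0/1 labels, boundaries are given directly as lists of coordinate tuples, ASSD is taken in its usual symmetric form, and the function names `overlap_metrics` and `assd` are hypothetical. HD95 is left out because percentile conventions differ between implementations.

```python
import math


def overlap_metrics(pred, truth):
    """Return (dice, precision, sensitivity) for two flat binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    dice = 2 * tp / denom if denom else 1.0   # both masks empty: perfect match
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return dice, precision, sensitivity


def assd(boundary_g, boundary_m):
    """Average symmetric surface distance between two boundary point lists."""
    def nearest(point, points):
        return min(math.dist(point, other) for other in points)

    total = sum(nearest(x, boundary_m) for x in boundary_g)
    total += sum(nearest(y, boundary_g) for y in boundary_m)
    return total / (len(boundary_g) + len(boundary_m))
```

Dice, Precision and Sensitivity reward overlap and lie between 0 and 1, whereas ASSD is a distance in voxel or millimetre units for which lower is better.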

The data used in this experiment is a lung cancer data set. The present method and the comparison methods were evaluated on the same test set, and the experimental results are shown in Table 1, where the numbers in parentheses are standard deviations.

Table 1. Comparison of the segmentation performance of the segmentation method of the invention and the comparison methods

Method	Dice	Precision	Recall	HD95	ASSD
UNETR	0.533(0.239)	0.588(0.284)	0.595(0.241)	103.8(70.50)	22.90(21.85)
nnFormer	0.664(0.205)	0.724(0.237)	0.685(0.229)	36.50(45.54)	8.06(10.83)
nnUNet	0.641(0.228)	0.644(0.274)	0.737(0.204)	74.15(78.63)	17.40(22.15)
The present invention	0.714(0.169)	0.803(0.166)	0.701(0.230)	27.77(43.96)	5.59(9.64)

It can be seen from Table 1 that the method proposed by the present invention achieves the best performance among all methods on Dice and ASSD. Evaluating the model on the test set over the five experiments gave average Dice, Precision, Sensitivity, HD95 and ASSD of 0.520 (0.341), 0.686 (0.306), 0.530 (0.351), 33.44 (50.35) and 13.60 (28.50), respectively.

Among the comparison methods, nnFormer and nnUNet also achieve relatively good segmentation results. The segmentation result of nnUNet shows that feature extraction using convolution alone in the encoder makes it difficult to model the global features of the image, resulting in lower segmentation performance. The segmentation results of nnFormer and CFC demonstrate that a hybrid architecture can effectively model the global features of the image, but that different ways of combining the components lead to different segmentation performance. On the lung cancer segmentation data set, the parallel architecture effectively combines the Transformer and convolution and exploits the advantages of both.

Fig. 4 is a flow chart of the imaging method of the present invention: the imaging method provided by the invention, which comprises the lung CT image segmentation method based on deep learning, comprises the following steps:

A. acquiring an original lung CT image to be segmented;

In a specific implementation, the imaging method of the invention can be used with existing lung CT image acquisition equipment, such as CT machines. In use, the imaging method is integrated into the existing CT image acquisition equipment; the original lung CT image is acquired using existing technology, the imaging method then performs secondary imaging on the original lung CT image to obtain a lung CT image carrying the segmentation result, and this image is output directly. In this way, medical workers (including clinicians, radiologists, experimenters, etc.) obtain lung CT images carrying segmentation results, which greatly facilitates their work.

Claims

1. A lung CT image segmentation method based on deep learning, comprising the following steps:

S1, acquiring an existing lung CT image data set;

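2. The lung CT image segmentation method based on deep learning according to claim 1, wherein the preprocessing of the data acquired in step S1, performed in step S2, specifically comprises the following steps:

the preprocessing comprises image cropping, resampling and normalization;

a resampling operation is performed on the obtained bounding box;

after resampling is completed, the image is normalized using the Z-score:

$$z_i = \frac{x_i - \mu}{\sigma}$$

where $z_i$ is the normalized gray value; $x_i$ is the original gray value; $\mu$ is the mean gray value of the image; and $\sigma$ is the standard deviation of the image gray values.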

3. The lung CT image segmentation method based on deep learning according to claim 2, wherein constructing the original lung CT image segmentation model in step S3, based on a convolutional encoder, a Transformer encoder, a fusion module and a corresponding decoder, specifically comprises the following steps:

4. The lung CT image segmentation method based on deep learning according to claim 3, wherein the convolutional encoder module comprises a plurality of convolution modules, each of which comprises two Conv-IN-LeakyReLU modules;

the convolution layer is a three-dimensional convolution layer;

5. The lung CT image segmentation method based on deep learning according to claim 4, wherein the Transformer encoder module comprises a patch embedding module, a plurality of consecutive Transformer blocks and a plurality of downsampling modules; a downsampling module is connected between every two adjacent Transformer blocks;

the patch embedding module divides the image into a series of non-overlapping patches and projects each patch into a high-dimensional space;

the calculation formula of the Transformer block is as follows:

$$\hat{z}^{l} = \mathrm{ShiftDW}(\mathrm{BN}(z^{l-1})) + z^{l-1}$$

$$z^{l} = \mathrm{MLP}(\mathrm{BN}(\hat{z}^{l})) + \hat{z}^{l}$$

the formula of MLP() is:

$$\mathrm{MLP}(x) = \mathrm{Linear}_2(\mathrm{GELU}(\mathrm{Linear}_1(x)))$$

6. The lung CT image segmentation method based on deep learning according to claim 5, wherein the ShiftDW layer replaces the multi-head self-attention module in the Transformer block; the ShiftDW layer specifically comprises a shift operation, a first point-wise convolution operation, a depth-wise convolution operation and a second point-wise convolution operation connected in series; the calculation formula of the ShiftDW layer is:

$$\mathrm{ShiftDW}(z) = \mathrm{PWConv}_2(\mathrm{DWConv}(\mathrm{PWConv}_1(\mathrm{Shift}(z))))$$

where $\hat{z}$ is the feature obtained after the shift operation; $z[\cdot]$ is the input feature; $D$ is the spatial depth; $H$ is the spatial width; $W$ is the spatial height; $\gamma$ is a scale factor; and $C$ is the number of channels.

7. The lung CT image segmentation method based on deep learning according to claim 6, wherein performing feature fusion on the obtained local features and global features with the fusion module to obtain the output of the encoder specifically comprises the following steps:

the calculation formula of the fusion module is as follows:

$$z_a = \mathrm{Cat}(z_c, z_t)$$

$$z = \mathrm{Conv}(z_a)$$

$$z_l = \mathrm{Local}(z)$$

$$z_g = \mathrm{Global}(z)$$

$$z_o = z_l + z_g$$

where $z_a$ is the feature obtained by concatenating the output feature of the convolutional encoder with the output feature of the Transformer encoder; Cat() is the function that concatenates along the channel dimension; $z_c$ is the output feature of the convolutional encoder; $z_t$ is the output feature of the Transformer encoder; $z$ is the input feature; Conv() is a convolution operation function with a convolution kernel size of $C \times 2C \times 1 \times 1 \times 1$; $z_l$ is the feature output by the local attention module; $z_g$ is the feature output by the global attention module; $z_o$ is the output feature of the fusion module;

the calculation formula of the global attention module is as follows:

$$\mathrm{Global}(z) = \mathrm{BN}_2(\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{BN}_1(\mathrm{Conv}_1(\mathrm{GAP}(z))))))$$

the calculation formula of the local attention module is as follows:

$$\mathrm{Local}(z) = \mathrm{BN}_2(\mathrm{Conv}_2(\mathrm{ReLU}(\mathrm{BN}_1(\mathrm{Conv}_1(z)))))$$

where Local(z) is the feature output by the local attention module.

8. The lung CT image segmentation method based on deep learning according to claim 7, wherein constructing the decoder to correspond to the convolutional encoder module specifically comprises the following steps:

9. The lung CT image segmentation method based on deep learning according to claim 8, wherein the training of step S4 specifically comprises the following steps:

10. An imaging method comprising the lung CT image segmentation method based on deep learning according to any one of claims 1 to 9, comprising the following steps:

A. acquiring an original lung CT image to be segmented;

B. segmenting the original lung CT image acquired in step A with the lung CT image segmentation method based on deep learning according to any one of claims 1 to 9;