CN114565626A - Lung CT image segmentation algorithm based on PSPNet improvement - Google Patents

Lung CT image segmentation algorithm based on PSPNet improvement

Info

Publication number
CN114565626A
Authority
CN
China
Prior art keywords
pspnet
lung
network
feature
image
Prior art date
Legal status
Withdrawn
Application number
CN202210221919.8A
Other languages
Chinese (zh)
Inventor
王灏睿
宋博
Current Assignee
Jiangsu Normal University
Original Assignee
Jiangsu Normal University
Priority date
Filing date
Publication date
Application filed by Jiangsu Normal University
Priority to CN202210221919.8A
Publication of CN114565626A
Status: Withdrawn

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30061Lung

Abstract

The invention discloses a lung CT image segmentation algorithm based on an improved PSPNet, which comprises the following steps: lung CT image data are collected from Kaggle and preprocessed (image enhancement, geometric transformation, image cropping and the like), and the samples are then divided proportionally into a training set and a test set; an improved PSPNet network model is constructed, in which feature information is extracted with MobileNetV3, the latest algorithm of the lightweight MobileNet series, the core pyramid pooling module is then introduced to improve the ability to acquire global information, and finally operations such as upsampling bring the classified image closer to the real contour of the target. The invention aims to apply the improved PSPNet algorithm to the field of lung CT segmentation; introducing MobileNetV3 as the backbone feature extraction network of the PSPNet makes the network lighter while further improving accuracy, and the faster processing speed improves the efficiency of diagnosis and treatment by doctors and experts.

Description

Lung CT image segmentation algorithm based on PSPNet improvement
Technical Field
The invention relates to the technical field of deep learning medical image segmentation, in particular to a lung CT image segmentation algorithm based on PSPNet improvement.
Background
Traditional medical image segmentation methods include threshold segmentation, region growing, edge segmentation and other image segmentation algorithms. However, these algorithms place extremely high requirements on the samples, because they segment simply from information such as the gray scale and contrast of the image, and they perform very poorly on images with complex scenes. With the continuous development and application of deep learning in various fields, convolutional neural networks have gradually been applied to image processing. Convolutional neural networks perform well in image recognition and feature extraction, and have greatly improved the precision and accuracy of traditional algorithms in medical image segmentation. PSPNet is an improved convolutional neural network: its pyramid scene parsing network aggregates context information from different regions through a pyramid pooling module, improving the ability to acquire global context information. MobileNet is a lightweight neural network aimed at mobile terminals and embedded devices; after years of development it has proved effective at improving feature extraction precision while reducing running time, so introducing it into the PSPNet yields a good segmentation effect.
Disclosure of Invention
In order to improve the segmentation capability of existing neural networks in medical image segmentation, the invention provides a lung CT image segmentation algorithm based on an improved PSPNet, which improves segmentation performance on medical images and reduces the prediction time.
In order to achieve the above object, the present invention provides an improved lung CT image segmentation algorithm based on PSPNet, which includes the following steps:
step one, making a lung CT image data set, and dividing it into training and test sets of different proportions with a data set preprocessing program;
step two, inputting the processed lung CT image samples into the lightweight deep neural network MobileNetV3 for feature extraction, with 4 downsampling operations, to obtain a global feature layer (Feature Map);
step three, dividing the extracted global feature layer into regions of different sizes: the input feature layer is divided into 6x6, 3x3, 2x2 and 1x1 regions, and average pooling is performed within each region to obtain local feature layers;
step four, applying a 1x1 convolution to the local feature layers of different scales and upsampling them to obtain four feature layers with the same dimensions as the global feature layer obtained in step two, and finally stacking the global feature layer and the local feature layers;
and step five, integrating the feature layers obtained in step four with a 3x3 convolution, adjusting the channels to the 2 classes with a 1x1 convolution to output the prediction result, and finally upsampling with resize so that the final output layer has the same width and height as the input picture.
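For illustration only, the following PyTorch-style sketch reconstructs steps two to five under stated assumptions: torchvision's mobilenet_v3_large is taken as the backbone (its 960-channel feature map standing in for the global feature layer), the pooling regions are 6x6, 3x3, 2x2 and 1x1, the output has 2 classes, and bilinear interpolation performs the final resize. It is not the exact implementation of the invention.

```python
# Illustrative sketch only: an improved-PSPNet-style model with a MobileNetV3-Large
# backbone (torchvision) and a pyramid pooling head with 6/3/2/1 bins, 2 classes.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import mobilenet_v3_large


class PyramidPooling(nn.Module):
    """Pool the global feature map over 6x6/3x3/2x2/1x1 regions, reduce each
    branch with a 1x1 convolution, upsample, and concatenate with the input."""

    def __init__(self, in_channels, bins=(6, 3, 2, 1)):
        super().__init__()
        branch_ch = in_channels // len(bins)
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(b),                       # average pooling per region
                nn.Conv2d(in_channels, branch_ch, 1, bias=False),
                nn.BatchNorm2d(branch_ch),
                nn.ReLU(inplace=True),
            )
            for b in bins
        )

    def forward(self, x):
        h, w = x.shape[2:]
        feats = [x] + [
            F.interpolate(branch(x), size=(h, w), mode="bilinear", align_corners=False)
            for branch in self.branches
        ]
        return torch.cat(feats, dim=1)                         # global + local context


class MobileNetV3PSPNet(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        # MobileNetV3-Large feature extractor (960-channel output feature map).
        self.backbone = mobilenet_v3_large(weights=None).features
        in_ch = 960
        self.psp = PyramidPooling(in_ch)
        self.head = nn.Sequential(
            nn.Conv2d(in_ch * 2, 256, 3, padding=1, bias=False),   # 3x3 integration
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, num_classes, 1),                        # 1x1 channel adjustment
        )

    def forward(self, x):
        h, w = x.shape[2:]
        feat = self.backbone(x)         # global feature layer
        feat = self.psp(feat)           # fuse pyramid context
        out = self.head(feat)
        # resize the prediction back to the input width and height
        return F.interpolate(out, size=(h, w), mode="bilinear", align_corners=False)


if __name__ == "__main__":
    model = MobileNetV3PSPNet(num_classes=2)
    print(model(torch.randn(1, 3, 473, 473)).shape)  # torch.Size([1, 2, 473, 473])
```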
Preferably, the PSPNet-based improved lung CT image segmentation algorithm uses MobileNetV3 pre-training weights for training, so that even for CT images with complex backgrounds the prediction result is not adversely affected and relatively good accuracy can be achieved.
Preferably, in step one, an existing public medical image data set can be used, or a data set can be made in cooperation with a medical institution, with the lung regions manually segmented and labeled by a professional doctor or specialist, after which the images are preprocessed.
Preferably, in step one, the image preprocessing method letterbox_image is adopted to add gray bars to input images of different sizes, so that the resized image is not distorted.
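A minimal sketch of such letterbox preprocessing, assuming Pillow; only the 473 × 473 target size is taken from this description, while the gray value and the resampling mode are assumptions:

```python
# Illustrative letterbox-style preprocessing; the 473x473 target comes from the
# description, while the gray value (128) and bicubic resampling are assumptions.
from PIL import Image


def letterbox_image(image, target_size=(473, 473), fill=(128, 128, 128)):
    """Resize without distortion: scale to fit, then pad the rest with gray bars."""
    iw, ih = image.size
    tw, th = target_size
    scale = min(tw / iw, th / ih)                 # preserve the aspect ratio
    nw, nh = int(iw * scale), int(ih * scale)

    resized = image.resize((nw, nh), Image.BICUBIC)
    canvas = Image.new("RGB", target_size, fill)  # gray background
    canvas.paste(resized, ((tw - nw) // 2, (th - nh) // 2))
    return canvas


if __name__ == "__main__":
    slice_img = Image.new("RGB", (512, 320))      # stand-in for a CT slice
    print(letterbox_image(slice_img).size)        # (473, 473)
```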
Preferably, when the picture is input into MobileNetV3 for feature extraction in step two, the Inverted Residual Block structure is adopted: a dimension-increasing operation is performed first, then a 3 × 3 depthwise convolution is applied, the result is passed to the SE module (whose number of channels is 1/4 of that of the expansion layer), where a lightweight attention model adjusts the weight of each channel; finally a 1 × 1 convolution performs the dimension-reducing operation, and the global Feature Map is output through a linear unit.
Preferably, in the construction of the MobileNetV3 module in step two, an auxiliary loss is introduced at the output of the third-to-last block and is back-propagated together with the total loss to jointly optimize the parameters, which effectively accelerates convergence.
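A hedged sketch of how such an auxiliary branch can be combined with the main loss; the auxiliary head layout and the 0.4 weight below are common PSPNet conventions assumed for illustration, not values stated in this patent:

```python
# Illustrative sketch of an auxiliary supervision branch; the 0.4 weight and the
# small auxiliary head below are common PSPNet conventions, not values from this patent.
import torch.nn as nn
import torch.nn.functional as F


class AuxHead(nn.Module):
    """Small segmentation head attached to an intermediate backbone feature map."""

    def __init__(self, in_channels, num_classes=2):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Dropout2d(0.1),
            nn.Conv2d(64, num_classes, 1),
        )

    def forward(self, feat, out_size):
        # predict from the intermediate feature and resize to the label resolution
        return F.interpolate(self.block(feat), size=out_size,
                             mode="bilinear", align_corners=False)


def joint_loss(main_logits, aux_logits, target, aux_weight=0.4):
    """Total loss = main loss + weighted auxiliary loss, back-propagated together."""
    ce = nn.CrossEntropyLoss()
    return ce(main_logits, target) + aux_weight * ce(aux_logits, target)
```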
Preferably, in step three, the PSP module fuses features of 4 different pyramid scales: the coarsest level is a single-bin output produced by global pooling, and the following three levels are pooled features of different scales. To preserve the weight of the global features, if the pyramid has N levels in total, a 1 × 1 convolution after each level reduces that level's channels to 1/N of the original; the features are then restored to the pre-pooling size by bilinear interpolation and finally concatenated together, so that the resulting features contain both global and local context information.
The invention can obtain the following beneficial effects:
(1) according to the invention, the MobileNet V3 is used as a main feature extraction network of the PSPNet, so that the accuracy is ensured, the parameter quantity is reduced, and the prediction time is reduced;
(2) the global feature map obtained by extraction is divided by a pyramid self-adaptive average pooling module, and pooling is carried out in different areas to obtain local feature maps, so that the obtained feature maps not only contain local features but also contain global features.
Drawings
FIG. 1 is a flow chart of an improved lung CT image segmentation algorithm based on PSPNet according to the present invention;
FIG. 2 is a schematic diagram of a MobileNet V3 model network structure according to the present invention;
FIG. 3 is a schematic diagram of an overall network structure of a PSPNet-based improved lung CT image segmentation algorithm according to the present invention;
fig. 4 is a schematic diagram of the recognition result of the PSPNet-based improved lung CT image segmentation algorithm of the present invention.
Detailed Description
To make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
Aiming at the problems above, the invention provides a lung CT image segmentation method based on an improved PSPNet. The invention is a semantic segmentation method based on a neural network: the lung regions in CT images are segmented with an improved PSPNet network model. A picture is input into the improved PSPNet network model; the encoder module gradually reduces the size of the feature map through convolution layers of different sizes and extracts high-level semantic information, and the decoder module then gradually restores the size of the feature map through operations such as upsampling, completing the extraction of spatial information and yielding a prediction that segments the lung from clinical 3D computed tomography. The lung CT image segmentation method based on the improved PSPNet comprises the following steps:
Step one, prepare a lung CT image data set; a data set published on Kaggle can be selected, or lung CT scans can be obtained through a medical institution and the lung regions annotated manually with tools such as Labelme, LabelImage or Photoshop. Put the processed samples into the Sample folder, put the manually annotated label files into the Label folder, and run the Image_annotation script to divide the training set and the validation set;
Step two, perform an undistorted resize on the lung CT images with the letterbox_image function, with the input image size set to 473 × 473 and 3 channels. num_classes is set according to the number of required segmentation classes, here num_classes = 2. The down-sampling factor downsampling_factor, which determines the size of the extracted features, is set according to the computer configuration; other parameters, such as the number of iterations Epoch, the batch_size and the learning rate learning_rate, can also be adjusted.
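These hyperparameters can be gathered into a single configuration object; in the sketch below only the input size and num_classes come from this description, and every other value is a placeholder:

```python
# Illustrative configuration; only input_shape and num_classes are taken from the
# description above, the remaining values are placeholders to be tuned per machine.
config = {
    "input_shape": (473, 473, 3),    # undistorted resize target, 3 channels
    "num_classes": 2,                # lung vs. background
    "downsampling_factor": 16,       # placeholder; chosen according to hardware
    "Epoch": 100,                    # placeholder number of iterations
    "batch_size": 8,                 # placeholder
    "learning_rate": 1e-4,           # placeholder
}
```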
Step three, construct the characteristic bneck structure of MobileNetV3: the channel number is adjusted with a 1 × 1 convolution, followed by normalization and an activation function. The activation function used in the first 6 bneck structures is ReLU6:

ReLU6(x) = \min(\max(0, x), 6)

that is, when the input is greater than 6 the return value is 6; this nonlinearity ensures the robustness of subsequent calculations. The h-swish function is adopted in the last 8 bneck structures:

h\text{-}swish(x) = x \cdot \frac{ReLU6(x + 3)}{6}

The h-swish function reduces the computation cost and the number of parameters, and gives better classification performance when pixel values are returned.
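These two activations translate directly from the formulas above; a minimal PyTorch sketch:

```python
# Direct transcription of the two activation formulas above into PyTorch.
import torch


def relu6(x: torch.Tensor) -> torch.Tensor:
    # ReLU6(x) = min(max(0, x), 6): any input greater than 6 returns 6
    return torch.clamp(x, min=0.0, max=6.0)


def h_swish(x: torch.Tensor) -> torch.Tensor:
    # h-swish(x) = x * ReLU6(x + 3) / 6, a piecewise-linear approximation of swish
    return x * relu6(x + 3.0) / 6.0


if __name__ == "__main__":
    t = torch.tensor([-4.0, 0.0, 3.0, 8.0])
    print(relu6(t))    # tensor([0., 0., 3., 6.])
    print(h_swish(t))  # tensor([-0., 0., 3., 8.])
```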
Feature extraction is performed with depthwise separable convolution, and the attention mechanism is applied after normalization: a squeeze operation, i.e. global average pooling,

z_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j)

is first used to obtain a channel descriptor, and an excitation operation, i.e. two fully connected layers,

s = \sigma\big(W_2\, \delta(W_1 z)\big)

then produces the channel weights; multiplying each feature map u_c by its weight s_c channel by channel completes the construction of the attention mechanism. A 1 × 1 convolution then performs the dimension-reduction and normalization operation; whether a residual edge is used is judged, and if so the residual edge is added to the returned features, which completes the building of the bneck structure. The network is built according to the MobileNetV3-Large structure; the features obtained from the third-to-last bneck structure are taken out as an auxiliary training branch aux_branch, normalization and dropout operations are applied, and after resizing the features are compared with the labels to calculate a loss for training.
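A minimal sketch of one such bneck block with its squeeze-and-excitation branch, under the assumptions noted in the comments (the exact per-block expansion sizes, strides and activation choices of MobileNetV3-Large are omitted):

```python
# Illustrative bneck sketch (inverted residual with an optional SE branch). The SE
# reduction of 4 follows the description; expansion sizes, strides, and the use of
# nn.Hardswish for h-swish are assumptions rather than the exact MobileNetV3-Large table.
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze (global average pooling) + excitation (two FC layers) + channel scaling."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        s = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * s                                   # reweight each channel


class Bneck(nn.Module):
    """1x1 expansion -> 3x3 depthwise conv -> (SE) -> 1x1 linear projection (+ residual)."""

    def __init__(self, in_ch, exp_ch, out_ch, stride=1, use_se=True, act=nn.Hardswish):
        super().__init__()
        self.use_res = stride == 1 and in_ch == out_ch
        layers = [
            nn.Conv2d(in_ch, exp_ch, 1, bias=False), nn.BatchNorm2d(exp_ch), act(),
            nn.Conv2d(exp_ch, exp_ch, 3, stride=stride, padding=1,
                      groups=exp_ch, bias=False),      # depthwise convolution
            nn.BatchNorm2d(exp_ch), act(),
        ]
        if use_se:
            layers.append(SEBlock(exp_ch))
        layers += [nn.Conv2d(exp_ch, out_ch, 1, bias=False),   # dimension reduction
                   nn.BatchNorm2d(out_ch)]             # linear unit: no activation here
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_res else out        # residual edge when shapes match


if __name__ == "__main__":
    blk = Bneck(in_ch=40, exp_ch=120, out_ch=40)
    print(blk(torch.randn(1, 40, 59, 59)).shape)       # torch.Size([1, 40, 59, 59])
```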
Step four, input the picture into the constructed MobileNetV3 for global feature extraction to obtain the global feature map; divide the extracted feature layer into regions of different sizes, applying pooling over 6x6, 3x3, 2x2 and 1x1 regions of the input global feature layer to generate the local feature layers of each part; apply a 1 × 1 convolution to the local feature layers of different scales and upsample them to obtain four feature layers with the same dimensions as the global feature layer, and stack the global and local feature layers along the channel dimension; integrate the resulting feature layers with a 3x3 convolution, adjust the channels to the 2 classes with a 1x1 convolution to output the prediction result, and finally upsample with resize so that the width and height of the final output layer are the same as those of the input picture.
Step five, run the training file using a MobileNetV3 pre-training weight file published online, or set model_path to empty and train from scratch. Training uses Cross Entropy Loss and Dice Loss. Cross Entropy Loss is the cross-entropy function:

L_{CE} = -\sum_{i}\big[\, y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \,\big]

where y_i is the label value and \hat{y}_i is the predicted value. The formula of Dice Loss is:

Dice\ Loss = 1 - \frac{2\,|X \cap Y|}{|X| + |Y|}

The total loss is calculated from the above formulas for training, and the .pth weight file generated by training is loaded into the PSPNet for prediction.
Step six, run the get_miou file and calculate the mean intersection over union according to the formula

MIoU = \frac{1}{k+1} \sum_{i=0}^{k} \frac{TP_i}{TP_i + FP_i + FN_i}

where TP (true positive) means predicted as the positive class and actually positive; FP (false positive) means predicted as the positive class but actually negative; FN (false negative) means predicted as the negative class but actually positive; TN (true negative) means predicted as the negative class and actually negative. The mean pixel accuracy is calculated according to the formula

MPA = \frac{1}{k+1} \sum_{i=0}^{k} \frac{TP_i}{TP_i + FN_i}

Check the evaluation indexes of the training result, optimize and adjust the parameters according to these indexes, and find the most suitable parameters.
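A minimal sketch of computing these two evaluation indexes from a confusion matrix, following the formulas above:

```python
# Illustrative computation of MIoU and MPA from a confusion matrix, following the
# formulas above (per-class TP, FP, FN are read off the matrix).
import numpy as np


def confusion_matrix(pred, label, num_classes=2):
    """Rows = true class, columns = predicted class, counted over flat pixel arrays."""
    mask = (label >= 0) & (label < num_classes)
    return np.bincount(num_classes * label[mask] + pred[mask],
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)


def miou_and_mpa(cm):
    tp = np.diag(cm)                          # predicted i, actually i
    fp = cm.sum(axis=0) - tp                  # predicted i, actually another class
    fn = cm.sum(axis=1) - tp                  # actually i, predicted another class
    iou = tp / np.maximum(tp + fp + fn, 1)    # per-class intersection over union
    pa = tp / np.maximum(tp + fn, 1)          # per-class pixel accuracy
    return iou.mean(), pa.mean()              # MIoU, MPA


if __name__ == "__main__":
    pred = np.array([0, 0, 1, 1, 1, 0])
    label = np.array([0, 1, 1, 1, 0, 0])
    print(miou_and_mpa(confusion_matrix(pred, label)))   # (0.5, 0.666...)
```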
Testing shows that, compared with the original PSPNet network, the embodiment of the invention achieves a good lung CT image segmentation effect on the Kaggle data set, with the mean pixel accuracy (MPA) reaching 92.68%, giving a better semantic segmentation result for lung CT images.
The invention designs a lung CT image segmentation method based on PSPNet improvement, which simplifies PSPNet model parameters, selects a lightweight neural network MobileNet V3 for feature extraction, and the improved lightweight neural network algorithm can be applied to a mobile terminal or a small embedded device to assist a professional doctor in diagnosing in time and improve the diagnosis efficiency.

Claims (5)

1. A lung CT image segmentation algorithm based on PSPNet improvement, characterized in that MobileNetV3 is introduced as the backbone feature extraction network, the method comprising the following steps:
step one, making a lung CT image data set, and dividing it into training and test sets of different proportions with a data set preprocessing program;
step two, inputting the processed lung CT image samples into the lightweight deep neural network MobileNetV3 for feature extraction, with 4 downsampling operations, to obtain a global feature layer (Feature Map);
step three, dividing the extracted global feature layer into regions of different sizes: the input feature layer is divided into 6x6, 3x3, 2x2 and 1x1 regions, and average pooling is performed within each region to obtain local feature layers;
step four, applying a 1x1 convolution to the local feature layers of different scales and upsampling them to obtain four feature layers with the same dimensions as the global feature layer obtained in step two, and finally stacking the global feature layer and the local feature layers;
and step five, integrating the feature layers obtained in step four with a 3x3 convolution, adjusting the channels to the 2 classes with a 1x1 convolution to output the prediction result, and finally upsampling with resize so that the final output layer has the same width and height as the input picture.
2. The PSPNet-based improved lung CT image segmentation algorithm as claimed in claim 1, wherein in step two, Google's recently proposed lightweight network MobileNetV3 is used as the backbone feature extraction network in place of the original ResNet101 network, greatly increasing the extraction speed while improving feature extraction accuracy.
3. The PSPNet-based improved lung CT image segmentation algorithm as claimed in claim 2, wherein a SE (Squeeze-and-Excite) module is introduced after the Depthwise convolution.
4. The PSPNet-based improved lung CT image segmentation algorithm as claimed in claim 1, wherein the ReLU6 and h-swish activation functions are introduced into the network; ReLU6 is an ordinary ReLU with the maximum output limited to 6, which retains good numerical resolution even at the low precision (float16/int8) of mobile devices, with the formula:

ReLU6(x) = \min(\max(0, x), 6)

h-swish is a recent improved version of the swish nonlinearity; the swish activation function is

swish(x) = x \cdot \sigma(x)

where \sigma is the sigmoid function, and the h-swish function is

h\text{-}swish(x) = x \cdot \frac{ReLU6(x + 3)}{6}

The h-swish function replaces the sigmoid with a piecewise linear function; the ReLU6 it uses is available in many deep learning frameworks, and the loss of numerical accuracy during quantization is reduced.
5. The PSPNet-based improved lung CT image segmentation algorithm as claimed in claim 1, wherein the loss used herein consists of two parts, Cross Entropy Loss and Dice Loss,
and every pixel is evaluated; Cross Entropy Loss is the cross-entropy function, and Dice Loss uses the semantic segmentation evaluation index Dice as the loss: the Dice coefficient is a set-similarity measure, generally used to calculate the similarity of two samples, with a value range of [0, 1], calculated as

Dice = \frac{2\,|X \cap Y|}{|X| + |Y|}, \qquad Dice\ Loss = 1 - Dice

where X is the predicted result and Y is the true result.
CN202210221919.8A 2022-03-07 2022-03-07 Lung CT image segmentation algorithm based on PSPNet improvement Withdrawn CN114565626A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210221919.8A CN114565626A (en) 2022-03-07 2022-03-07 Lung CT image segmentation algorithm based on PSPNet improvement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210221919.8A CN114565626A (en) 2022-03-07 2022-03-07 Lung CT image segmentation algorithm based on PSPNet improvement

Publications (1)

Publication Number Publication Date
CN114565626A true CN114565626A (en) 2022-05-31

Family

ID=81717908

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210221919.8A Withdrawn CN114565626A (en) 2022-03-07 2022-03-07 Lung CT image segmentation algorithm based on PSPNet improvement

Country Status (1)

Country Link
CN (1) CN114565626A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115311317A (en) * 2022-10-12 2022-11-08 广州中平智能科技有限公司 Laparoscope image segmentation method and system based on ScaleFormer algorithm


Similar Documents

Publication Publication Date Title
CN113077471B (en) Medical image segmentation method based on U-shaped network
CN111798462B (en) Automatic delineation method of nasopharyngeal carcinoma radiotherapy target area based on CT image
CN111784671B (en) Pathological image focus region detection method based on multi-scale deep learning
CN111798416B (en) Intelligent glomerulus detection method and system based on pathological image and deep learning
CN111612754B (en) MRI tumor optimization segmentation method and system based on multi-modal image fusion
CN113240691B (en) Medical image segmentation method based on U-shaped network
CN112446891A (en) Medical image segmentation method based on U-Net network brain glioma
CN113223005B (en) Thyroid nodule automatic segmentation and grading intelligent system
CN113569724B (en) Road extraction method and system based on attention mechanism and dilation convolution
CN111899259A (en) Prostate cancer tissue microarray classification method based on convolutional neural network
CN114445356A (en) Multi-resolution-based full-field pathological section image tumor rapid positioning method
CN114266794B (en) Pathological section image cancer region segmentation system based on full convolution neural network
CN114758137A (en) Ultrasonic image segmentation method and device and computer readable storage medium
CN114549538A (en) Brain tumor medical image segmentation method based on spatial information and characteristic channel
CN113643297B (en) Computer-aided age analysis method based on neural network
CN114565626A (en) Lung CT image segmentation algorithm based on PSPNet improvement
CN113538363A (en) Lung medical image segmentation method and device based on improved U-Net
CN116883341A (en) Liver tumor CT image automatic segmentation method based on deep learning
CN114693671B (en) Lung nodule semi-automatic segmentation method, device, equipment and medium based on deep learning
CN115661029A (en) Pulmonary nodule detection and identification system based on YOLOv5
CN114332278A (en) OCTA image motion correction method based on deep learning
CN116524315A (en) Mask R-CNN-based lung cancer pathological tissue section identification and segmentation method
CN113902738A (en) Heart MRI segmentation method and system
CN116563615B (en) Bad picture classification method based on improved multi-scale attention mechanism
CN117523193A (en) Segmentation model training method, follicle measurement method, electronic device, and storage medium

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20220531)