CN117764948A - Liver tumor segmentation method based on mixed attention and multi-scale supervision - Google Patents

Liver tumor segmentation method based on mixed attention and multi-scale supervision

Info

Publication number
CN117764948A
Authority
CN
China
Prior art keywords
liver tumor
convunext
attention
feature map
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311787545.7A
Other languages
Chinese (zh)
Inventor
李会萍
李北辰
陈鑫林
胡尚瑜
何雯欣
胡贤
杨淑琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University
Original Assignee
Henan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan University filed Critical Henan University
Priority to CN202311787545.7A
Publication of CN117764948A
Current legal status: Pending

Landscapes

  • Apparatus For Radiation Diagnosis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the field of medical image processing, and in particular to a liver tumor segmentation method based on mixed attention and multi-scale supervision, which comprises the following steps: preprocessing an acquired CT image of a liver tumor to be segmented; and performing liver tumor segmentation on the CT image of the liver tumor to be segmented according to a MAS-ConvUNeXt liver tumor segmentation model comprising a mixed attention module and a multi-scale supervision module. The training method of the segmentation model comprises: acquiring an image dataset composed of liver tumor CT images, preprocessing the dataset, and training the constructed segmentation model on the preprocessed image dataset. The MAS-ConvUNeXt liver tumor segmentation model provided by the invention enlarges the receptive field of the neural network while reducing the number of parameters, and the mixed attention module enhances segmentation quality, so that liver tumors can be segmented efficiently and rapidly and the segmentation accuracy of liver tumor images is improved.

Description

Liver tumor segmentation method based on mixed attention and multi-scale supervision
Technical Field
The invention relates to the field of medical image processing, in particular to a liver tumor segmentation method based on mixed attention and multi-scale supervision.
Background
Liver cancer is one of the most common cancers worldwide. CT (Computed Tomography) can help doctors detect liver cancer early. Accurate liver tumor segmentation results help doctors observe the condition of a liver tumor, and can assist them in making treatment plans, planning operations and monitoring therapy.
Because medical image data are complex and numerous, traditional manual segmentation methods are time-consuming and error-prone. An automatic liver tumor segmentation algorithm can greatly improve efficiency and lighten the workload of doctors. With a liver tumor segmentation algorithm, doctors can analyze medical images more rapidly and accurately, improving the sensitivity and specificity of diagnosis.
In recent years, deep learning segmentation networks, particularly UNet-like frameworks, have made remarkable progress in liver tumor segmentation tasks. A deep learning model can automatically learn features and extract information hierarchically, and therefore adapts better to complex and variable liver structures and tumor morphologies. However, the conventional UNet obtains much redundant information from simple feature-map concatenation and loses many semantic features during up-sampling, so its segmentation accuracy for liver tumors is often low.
Disclosure of Invention
In order to solve the technical problem of low liver tumor image segmentation accuracy, the invention provides a liver tumor segmentation method based on mixed attention and multi-scale supervision.
The invention provides a liver tumor segmentation method based on mixed attention and multi-scale supervision, which comprises the following steps:
acquiring a CT image of the liver tumor to be segmented, and preprocessing the CT image of the liver tumor to be segmented;
performing liver tumor segmentation on the CT image of the liver tumor to be segmented according to a pre-trained MAS-ConvUNeXt liver tumor segmentation model, wherein the training method of the MAS-ConvUNeXt liver tumor segmentation model comprises: acquiring an image dataset composed of liver tumor CT images, preprocessing the image dataset, constructing a MAS-ConvUNeXt liver tumor segmentation model, and training the constructed model on the preprocessed image dataset to obtain the trained MAS-ConvUNeXt liver tumor segmentation model, wherein the MAS-ConvUNeXt liver tumor segmentation model comprises a mixed attention module and a multi-scale supervision module.
Optionally, the acquiring an image dataset composed of liver tumor CT images comprises:
processing a 3D liver tumor CT volume into 2D images containing the liver and the liver tumor, using each 2D image as a liver tumor CT image, and forming the image dataset from all obtained liver tumor CT images.
Optionally, the preprocessing includes image enhancement.
Optionally, the constructing a MAS-ConvUNeXt liver tumor segmentation model includes:
constructing a ConvUNeXt model, wherein the ConvUNeXt model comprises: an encoder, a decoder, and an attention gating module;
on the basis of the ConvUNeXt model, replacing the attention gating module with a mixed attention module, and adding a multi-scale supervision module;
and taking the finally updated ConvUNeXt model as a constructed MAS-ConvUNeXt liver tumor segmentation model before training.
Optionally, the encoder and the decoder are based on the encoder and decoder of UNet and are obtained by replacing the convolution blocks of UNet with ConvUNeXt convolution blocks; the stage ratio of the encoder is 1:1:3:1, and a convolutional layer replaces the pooling layer in the downsampling step; the attention gating module is located at the skip connection.
Optionally, the mixed attention module applies a linear transformation to the result obtained after up-sampling the output feature map x1 of the lower ConvUNeXt convolution block, and splits the obtained feature map along the channel dimension at a ratio of 1:2; the 1-part feature map is input into a spatial attention module and the 2-part feature map is input into a channel attention module; the attention weights alpha and beta obtained by the two attention modules are then multiplied with the output feature map x2 of the ConvUNeXt convolution block of the corresponding layer in the encoder, and the result is concatenated with the up-sampled input feature map x1 after its dimension has been transformed by a linear layer.
Optionally, the spatial attention module inputs the input feature map into a linear layer, adds the result to the linearly transformed output feature map x2 of the ConvUNeXt convolution block in the corresponding layer of the encoder, activates the sum with a ReLU activation function, maps the feature map into an intermediate space through a linear layer, and finally applies a Sigmoid activation function to obtain the attention weight alpha.
Optionally, the channel attention module performs global average pooling on the input feature map, adds the pooled result to the output feature map x2 of the ConvUNeXt convolution block in the corresponding layer of the encoder, activates the sum with a ReLU activation function, compresses and then re-expands the feature map dimension through two linear layers, and finally applies a Sigmoid activation function to obtain the attention weight beta.
Optionally, the multi-scale supervision module performs a 1x1 convolution on the feature map after each up-sampling convolution in the decoder, then up-samples it to the original size to obtain a segmentation map for each up-sampling stage, and establishes a loss function with the real labels to realize multi-scale supervision.
Optionally, the loss function in the MAS-ConvUNeXt liver tumor segmentation model training process is a cross entropy loss function.
The invention has the following beneficial effects:
the liver tumor segmentation method based on mixed attention and multi-scale supervision of the invention enlarges the receptive field of the neural network while reducing the number of parameters, and enhances segmentation quality through the mixed attention module, so that the liver tumor segmentation problem can be solved more effectively.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is apparent that the drawings described below show only some embodiments of the invention, and other drawings can be derived from them by a person skilled in the art without inventive effort.
FIG. 1 is a flow chart of a liver tumor segmentation method based on mixed attention and multi-scale supervision according to the present invention;
FIG. 2 is a schematic diagram of a MAS-ConvUNeXt liver tumor segmentation model framework of the present invention;
FIG. 3 is a schematic diagram of a ConvUNeXt convolution block structure according to the present invention;
FIG. 4 is a schematic diagram of a mixed attention module structure according to the present invention;
FIG. 5 is a schematic diagram of a multi-scale supervision module according to the present invention;
FIG. 6 is a further flowchart of the present invention.
Detailed Description
In order to further describe the technical means and effects adopted by the present invention to achieve the intended purpose, specific implementations, structures, features and effects of the technical solution of the present invention are described in detail below with reference to the accompanying drawings and preferred embodiments. In the following description, references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The invention provides a liver tumor segmentation method based on mixed attention and multi-scale supervision, which comprises the following steps:
acquiring a CT image of the liver tumor to be segmented, and preprocessing the CT image of the liver tumor to be segmented;
performing liver tumor segmentation on the CT image of the liver tumor to be segmented according to a pre-trained MAS-ConvUNeXt liver tumor segmentation model, wherein the training method of the MAS-ConvUNeXt liver tumor segmentation model comprises: acquiring an image dataset composed of liver tumor CT images, preprocessing the image dataset, constructing a MAS-ConvUNeXt liver tumor segmentation model, and training the constructed model on the preprocessed image dataset to obtain the trained MAS-ConvUNeXt liver tumor segmentation model, wherein the MAS-ConvUNeXt liver tumor segmentation model comprises a mixed attention module and a multi-scale supervision module.
Each step is developed in detail below:
Referring to FIG. 1, a flow of some embodiments of a liver tumor segmentation method based on mixed attention and multi-scale supervision according to the present invention is shown. The liver tumor segmentation method based on mixed attention and multi-scale supervision comprises the following steps:
step S1, acquiring a CT image of the liver tumor to be segmented, and preprocessing the CT image of the liver tumor to be segmented.
In some embodiments, a liver tumor CT image can be acquired by a CT apparatus and used as the CT image of the liver tumor to be segmented. Histogram equalization and contrast stretching are then performed on the CT image of the liver tumor to be segmented to enhance it, thereby realizing the preprocessing of the CT image of the liver tumor to be segmented.
The CT image of the liver tumor to be segmented is a CT (Computed Tomography) image of the liver tumor to be segmented. The preprocessing includes image enhancement.
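By way of illustration only, the following is a minimal Python sketch of this enhancement step on a single 2D slice; the use of OpenCV, the function name and the 2nd/98th-percentile stretch limits are assumptions of this sketch and are not specified in the patent.

```python
import numpy as np
import cv2  # OpenCV, assumed available for histogram equalization


def enhance_slice(slice_u8: np.ndarray) -> np.ndarray:
    """Histogram equalization followed by contrast stretching (illustrative)."""
    # Histogram equalization on the 8-bit grayscale slice.
    equalized = cv2.equalizeHist(slice_u8)

    # Contrast stretching: map the 2nd..98th percentile range to 0..255.
    lo, hi = np.percentile(equalized, (2, 98))
    stretched = np.clip((equalized - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
    return (stretched * 255).astype(np.uint8)
```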
Step S2, performing liver tumor segmentation on the CT image of the liver tumor to be segmented according to the pre-trained MAS-ConvUNeXt liver tumor segmentation model.
In some embodiments, the CT image of the liver tumor to be segmented may be input into the pre-trained MAS-ConvUNeXt liver tumor segmentation model, through which the liver tumor segmentation of the CT image is achieved. The overall framework of the MAS-ConvUNeXt liver tumor segmentation model is shown in FIG. 2. MAS-ConvUNeXt stands for Mixed Attention and multi-scale Supervision ConvUNeXt.
Optionally, the training method of the MAS-ConvUNeXt liver tumor segmentation model comprises the following steps:
in a first step, an image dataset is acquired, which is made up of CT images of liver tumors.
For example, 3D liver tumor CT volumes are processed into 2D images containing the liver and the liver tumor, which includes adjusting the window width and window level; each obtained 2D image is used as a liver tumor CT image, and all obtained liver tumor CT images form the image dataset. A liver tumor CT image may be a CT image labeled with the liver tumor region, for example a region marked manually.
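One possible sketch of this conversion, assuming NIfTI volumes read with nibabel; the window level/width defaults and the label convention (1 = liver, 2 = tumor) are illustrative assumptions, not values taken from the patent.

```python
import numpy as np
import nibabel as nib  # assumed NIfTI input format


def volume_to_slices(ct_path: str, label_path: str, level: float = 40.0, width: float = 200.0):
    """Yield (image, mask) 2D slices that contain liver or tumor pixels (illustrative)."""
    volume = nib.load(ct_path).get_fdata()     # HU values, shape (H, W, D)
    labels = nib.load(label_path).get_fdata()  # 0 = background, 1 = liver, 2 = tumor (assumed)

    lo, hi = level - width / 2, level + width / 2  # window level/width -> HU range
    volume = np.clip(volume, lo, hi)
    volume = (volume - lo) / (hi - lo)             # normalize windowed HU to [0, 1]

    for z in range(volume.shape[2]):
        if labels[:, :, z].max() > 0:              # keep slices containing liver or tumor
            yield volume[:, :, z].astype(np.float32), labels[:, :, z].astype(np.uint8)
```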
Secondly, the image dataset is preprocessed.
For example, histogram equalization and contrast stretching may be performed on the processed 2D images, that is, the liver tumor CT images in the image dataset, to enhance them and thereby enhance the image dataset. Data augmentation is then applied to the enhanced image dataset to amplify it, and the resulting dataset is taken as the preprocessed image dataset and divided into a training set, a validation set and a test set. The data augmentation may include random horizontal flipping, random scaling and random cropping. Data augmentation generates new data in order to expand and diversify the training set, which can improve the generalization capability and robustness of the model.
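The random horizontal flipping, random scaling and random cropping mentioned above could be realized as follows, applied jointly to the image and its mask so that the annotation stays aligned; the 256x256 output size and the scaling range are illustrative assumptions.

```python
import random

from PIL import Image
from torchvision.transforms import InterpolationMode
import torchvision.transforms.functional as TF


def augment(image: Image.Image, mask: Image.Image, out_size: int = 256):
    """Random horizontal flip, random scaling and random cropping, applied jointly (illustrative)."""
    if random.random() < 0.5:                         # random horizontal flip
        image, mask = TF.hflip(image), TF.hflip(mask)

    scale = random.uniform(0.8, 1.2)                  # random scaling
    new_size = int(out_size * scale)
    image = TF.resize(image, [new_size, new_size])
    mask = TF.resize(mask, [new_size, new_size], interpolation=InterpolationMode.NEAREST)

    pad = max(out_size - new_size, 0)                 # pad if the scaled image is too small
    if pad > 0:
        image, mask = TF.pad(image, pad), TF.pad(mask, pad)

    top = random.randint(0, image.height - out_size)  # random crop
    left = random.randint(0, image.width - out_size)
    image = TF.crop(image, top, left, out_size, out_size)
    mask = TF.crop(mask, top, left, out_size, out_size)
    return image, mask
```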
Thirdly, constructing a MAS-ConvUNeXt liver tumor segmentation model.
Wherein, the MAS-ConvUNeXt liver tumor segmentation model comprises: a hybrid attention module and a multi-scale supervision module.
For example, constructing a MAS-ConvUNeXt liver tumor segmentation model may include the sub-steps of:
the first substep, build ConvUNeXt model.
Wherein, convUNeXt model includes: encoder, decoder and attention gating module. The encoder and decoder are obtained by replacing the convolutional block of UNet with the ConvUNeXt convolutional block on the basis of the encoder and decoder of UNet. The ConvUNext convolution block structure is shown in fig. 3. The stage ratio of the encoder is 1:1:3:1 and the convolutional layer is used to replace the pooling layer in the downsampling step. The attention gating module is located at the jump connection.
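Since FIG. 3 is not reproduced here, the block internals in the sketch below (a 7x7 depthwise convolution followed by pointwise layers and a residual connection, in the spirit of ConvNeXt-style blocks) are an assumption; only the 1:1:3:1 stage ratio and the use of a strided convolution instead of pooling come from the text, and the channel widths are arbitrary.

```python
import torch
import torch.nn as nn


class ConvUNeXtBlock(nn.Module):
    """Assumed ConvNeXt-style block: 7x7 depthwise conv, pointwise layers, residual connection."""

    def __init__(self, channels: int):
        super().__init__()
        self.dwconv = nn.Conv2d(channels, channels, kernel_size=7, padding=3, groups=channels)
        self.norm = nn.BatchNorm2d(channels)
        self.pw1 = nn.Conv2d(channels, 4 * channels, kernel_size=1)
        self.act = nn.GELU()
        self.pw2 = nn.Conv2d(4 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.pw2(self.act(self.pw1(self.norm(self.dwconv(x)))))


def make_encoder_stages(widths=(32, 64, 128, 256), depths=(1, 1, 3, 1), in_ch: int = 1):
    """Encoder stages with the 1:1:3:1 stage ratio; a stride-2 conv replaces the pooling layer."""
    stages = nn.ModuleList()
    for i, (width, depth) in enumerate(zip(widths, depths)):
        if i == 0:
            # Stem: bring the single-channel CT slice to the first stage width without downsampling.
            down = nn.Conv2d(in_ch, width, kernel_size=3, padding=1)
        else:
            # Downsampling step: a stride-2 convolution instead of UNet's pooling layer.
            down = nn.Conv2d(in_ch, width, kernel_size=2, stride=2)
        stages.append(nn.Sequential(down, *[ConvUNeXtBlock(width) for _ in range(depth)]))
        in_ch = width
    return stages
```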
In the second sub-step, on the basis of the ConvUNeXt model, the attention gating module is replaced with a mixed attention module, and a multi-scale supervision module is added.
The structure of the mixed attention module may be as shown in FIG. 4. The mixed attention module applies a linear transformation to the result obtained after up-sampling the output feature map x1 of the lower ConvUNeXt convolution block, and splits the obtained feature map along the channel dimension at a ratio of 1:2. The 1-part feature map is input into the spatial attention module, and the 2-part feature map is input into the channel attention module. Finally, the attention weights alpha and beta obtained by the two attention modules are multiplied with the output feature map x2 of the ConvUNeXt convolution block of the corresponding layer in the encoder to obtain the output of the mixed attention module, and this result is concatenated with the up-sampled input feature map x1 after its dimension has been transformed by a linear layer. The concatenated feature map is input into the next ConvUNeXt convolution block. Here, the feature map x1 refers to the output feature map of the decoder convolution block in the layer below the mixed attention module in the structural diagram, and the feature map x2 refers to the output feature map of the encoder convolution block in the same layer as the mixed attention module.
The spatial attention module inputs the input feature map into a linear layer, adds the result to the linearly transformed output feature map x2 of the ConvUNeXt convolution block in the corresponding layer of the encoder, activates the sum with a ReLU activation function, maps the feature map into an intermediate space through a linear layer, and finally applies a Sigmoid activation function to obtain the attention weight alpha.
The channel attention module performs global average pooling on the input feature map, adds the pooled result to the output feature map x2 of the ConvUNeXt convolution block in the corresponding layer of the encoder, activates the sum with a ReLU activation function, compresses and then re-expands the feature map dimension through two linear layers, and finally applies a Sigmoid activation function to obtain the attention weight beta.
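Putting the three preceding paragraphs together, below is a minimal PyTorch sketch of the mixed attention module and its two sub-modules. FIG. 4 is not reproduced here, so several details are assumptions: the "linear layers" are implemented as 1x1 convolutions, the intermediate widths are arbitrary, the 1:2 split is interpreted as a channel ratio, and the encoder feature map x2 is globally pooled before the addition in the channel branch so that the shapes match.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialAttention(nn.Module):
    """Spatial branch: produces a per-pixel weight alpha (assumed 1x1-conv 'linear layers')."""

    def __init__(self, in_ch: int, skip_ch: int, mid_ch: int = 16):
        super().__init__()
        self.proj_g = nn.Conv2d(in_ch, mid_ch, kernel_size=1)    # transform gating features
        self.proj_x = nn.Conv2d(skip_ch, mid_ch, kernel_size=1)  # transform encoder features x2
        self.out = nn.Conv2d(mid_ch, 1, kernel_size=1)           # map to intermediate space -> 1 channel

    def forward(self, g, x2):
        a = F.relu(self.proj_g(g) + self.proj_x(x2))
        return torch.sigmoid(self.out(a))                        # alpha: (B, 1, H, W)


class ChannelAttention(nn.Module):
    """Channel branch: produces a per-channel weight beta."""

    def __init__(self, in_ch: int, skip_ch: int, reduction: int = 4):
        super().__init__()
        self.proj_g = nn.Linear(in_ch, skip_ch)
        self.fc1 = nn.Linear(skip_ch, skip_ch // reduction)      # compress
        self.fc2 = nn.Linear(skip_ch // reduction, skip_ch)      # re-expand

    def forward(self, g, x2):
        g_vec = g.mean(dim=(2, 3))                               # global average pooling of the input
        x_vec = x2.mean(dim=(2, 3))                              # pooled x2 (assumption for shape match)
        b = F.relu(self.proj_g(g_vec) + x_vec)
        b = torch.sigmoid(self.fc2(F.relu(self.fc1(b))))
        return b[:, :, None, None]                               # beta: (B, C, 1, 1)


class MixedAttention(nn.Module):
    """Split the up-sampled decoder features 1:2, gate x2 spatially and channel-wise, then concatenate."""

    def __init__(self, dec_ch: int, skip_ch: int):
        super().__init__()
        self.up = nn.ConvTranspose2d(dec_ch, dec_ch, kernel_size=2, stride=2)
        self.pre = nn.Conv2d(dec_ch, dec_ch, kernel_size=1)      # linear transform before the split
        s_ch, c_ch = dec_ch // 3, dec_ch - dec_ch // 3           # 1:2 split along channels (ratio assumed)
        self.spatial = SpatialAttention(s_ch, skip_ch)
        self.channel = ChannelAttention(c_ch, skip_ch)
        self.post = nn.Conv2d(dec_ch, skip_ch, kernel_size=1)    # dimension transform of x1 before concat

    def forward(self, x1, x2):
        x1_up = self.up(x1)
        g = self.pre(x1_up)
        g_s, g_c = torch.split(g, [g.shape[1] // 3, g.shape[1] - g.shape[1] // 3], dim=1)
        alpha = self.spatial(g_s, x2)                            # per-pixel weight
        beta = self.channel(g_c, x2)                             # per-channel weight
        gated = x2 * alpha * beta                                # multiply both weights with x2
        return torch.cat([gated, self.post(x1_up)], dim=1)       # concatenate with transformed x1
```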
The structure of the multi-scale supervision module may be as shown in FIG. 5. The multi-scale supervision module performs a 1x1 convolution on the feature map after each up-sampling convolution in the decoder, then up-samples it to the original size to obtain a segmentation map for each up-sampling stage, and establishes a loss function with the real labels to realize multi-scale supervision.
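A minimal sketch of such supervision heads is given below; the number of decoder stages, their channel widths, the class count (background, liver, tumor) and the use of bilinear up-sampling are illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleSupervision(nn.Module):
    """1x1 conv heads on each decoder stage, up-sampled to the input resolution (illustrative)."""

    def __init__(self, decoder_channels=(256, 128, 64, 32), num_classes: int = 3):
        super().__init__()
        self.heads = nn.ModuleList(
            [nn.Conv2d(ch, num_classes, kernel_size=1) for ch in decoder_channels]
        )

    def forward(self, decoder_features, out_size):
        # decoder_features: list of feature maps, one per up-sampling stage (coarse to fine).
        side_outputs = []
        for head, feat in zip(self.heads, decoder_features):
            logits = head(feat)                                   # 1x1 convolution
            logits = F.interpolate(logits, size=out_size, mode="bilinear", align_corners=False)
            side_outputs.append(logits)                           # segmentation map of this stage
        return side_outputs
```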
In the third sub-step, the finally updated ConvUNeXt model is taken as the constructed MAS-ConvUNeXt liver tumor segmentation model before training.
Fourthly, training the constructed MAS-ConvUNeXt liver tumor segmentation model according to the preprocessed image data set to obtain a trained MAS-ConvUNeXt liver tumor segmentation model.
The loss function in the MAS-ConvUNeXt liver tumor segmentation model training process is a cross entropy loss function.
For example, a loss function is constructed, the MAS-ConvUNeXt liver tumor segmentation model is trained on the training set, the segmentation model parameters are optimized by gradient descent, and the optimal segmentation model is selected. The loss function is the cross entropy loss function, given by:
L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c} y_{i,c}\,\log\hat{y}_{i,c}

wherein L is the loss function value, N is the total number of pixel points in the segmentation map, c is the category index, y_{i,c} is the true label for class c at pixel i, and \hat{y}_{i,c} is the predicted probability output by the model for class c at pixel i.
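A sketch of one training step applying this cross entropy loss to every side output of the multi-scale supervision module and summing the terms; equal weighting of the scales and the assumption that the model returns a list of per-stage logits are choices of this sketch, not requirements of the patent.

```python
import torch.nn.functional as F


def train_step(model, optimizer, images, labels):
    """One gradient-descent step with multi-scale cross entropy supervision (illustrative)."""
    model.train()
    optimizer.zero_grad()
    side_outputs = model(images)          # assumed: list of per-stage logits at full resolution
    loss = sum(F.cross_entropy(logits, labels) for logits in side_outputs)  # equal weights assumed
    loss.backward()
    optimizer.step()                      # gradient descent update of the model parameters
    return loss.item()
```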
Optionally, the images in the test set may be input into the trained MAS-ConvUNeXt liver tumor segmentation model to obtain liver tumor segmentation results for the images in the test set.
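For illustration, the inference step could look as follows, taking the last (full-resolution) side output and converting per-class probabilities into a label map; the label convention and the list-of-side-outputs interface follow the sketches above and are assumptions.

```python
import torch


@torch.no_grad()
def predict(model, image):
    """Return a per-pixel class map (0 = background, 1 = liver, 2 = tumor, assumed labels)."""
    model.eval()
    logits = model(image.unsqueeze(0))[-1]   # last side output = full-resolution prediction (assumed)
    return logits.softmax(dim=1).argmax(dim=1).squeeze(0)
```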
A further flowchart of the present invention is shown in FIG. 6.
In summary, the invention adopts the mixed attention module as the attention gating module and adds a multi-scale supervision module. The model can enlarge the receptive field of the neural network while reducing the number of parameters, and enhances the quality of the segmented region through the mixed attention module, thereby realizing efficient and rapid segmentation of liver tumors.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention and are intended to be included within the scope of the invention.

Claims (10)

1. A liver tumor segmentation method based on mixed attention and multi-scale supervision, which is characterized by comprising the following steps:
acquiring a CT image of the liver tumor to be segmented, and preprocessing the CT image of the liver tumor to be segmented;
performing liver tumor segmentation on the CT image of the liver tumor to be segmented according to a pre-trained MAS-ConvUNeXt liver tumor segmentation model, wherein the training method of the MAS-ConvUNeXt liver tumor segmentation model comprises: acquiring an image dataset composed of liver tumor CT images, preprocessing the image dataset, constructing a MAS-ConvUNeXt liver tumor segmentation model, and training the constructed model on the preprocessed image dataset to obtain the trained MAS-ConvUNeXt liver tumor segmentation model, wherein the MAS-ConvUNeXt liver tumor segmentation model comprises a mixed attention module and a multi-scale supervision module.
2. The liver tumor segmentation method based on mixed attention and multi-scale supervision according to claim 1, wherein the acquiring an image dataset composed of liver tumor CT images comprises:
processing a 3D liver tumor CT volume into 2D images containing the liver and the liver tumor, using each 2D image as a liver tumor CT image, and forming the image dataset from all obtained liver tumor CT images.
3. The liver tumor segmentation method based on mixed attention and multi-scale supervision according to claim 1, wherein the preprocessing comprises image enhancement.
4. The liver tumor segmentation method based on mixed attention and multi-scale supervision according to claim 1, wherein the constructing a MAS-ConvUNeXt liver tumor segmentation model comprises:
constructing a ConvUNeXt model, wherein the ConvUNeXt model comprises: an encoder, a decoder, and an attention gating module;
on the basis of the ConvUNeXt model, replacing the attention gating module with a mixed attention module, and adding a multi-scale supervision module;
and taking the finally updated ConvUNeXt model as a constructed MAS-ConvUNeXt liver tumor segmentation model before training.
5. The liver tumor segmentation method based on mixed attention and multi-scale supervision according to claim 4, wherein the encoder and the decoder are based on the encoder and decoder of UNet and are obtained by replacing the convolution blocks of UNet with ConvUNeXt convolution blocks; the stage ratio of the encoder is 1:1:3:1, and a convolutional layer replaces the pooling layer in the downsampling step; the attention gating module is located at the skip connection.
6. The liver tumor segmentation method based on mixed attention and multi-scale supervision according to claim 4, wherein the mixed attention module applies a linear transformation to the result obtained by up-sampling the output feature map x1 of the lower ConvUNeXt convolution block, and splits the obtained feature map along the channel dimension at a ratio of 1:2; the 1-part feature map is input into a spatial attention module and the 2-part feature map is input into a channel attention module; the attention weights alpha and beta obtained by the two attention modules are then multiplied with the output feature map x2 of the ConvUNeXt convolution block of the corresponding layer in the encoder, and the result is concatenated with the up-sampled input feature map x1 after its dimension has been transformed by a linear layer.
7. The liver tumor segmentation method based on mixed attention and multi-scale supervision according to claim 6, wherein the spatial attention module inputs the input feature map into a linear layer, adds the result to the linearly transformed output feature map x2 of the ConvUNeXt convolution block in the corresponding layer of the encoder, activates the sum with a ReLU activation function, maps the feature map into an intermediate space through a linear layer, and finally applies a Sigmoid activation function to obtain the attention weight alpha.
8. The liver tumor segmentation method based on mixed attention and multi-scale supervision according to claim 6, wherein the channel attention module performs global average pooling on the input feature map, adds the pooled result to the output feature map x2 of the ConvUNeXt convolution block in the corresponding layer of the encoder, activates the sum with a ReLU activation function, compresses and then re-expands the feature map dimension through two linear layers, and finally applies a Sigmoid activation function to obtain the attention weight beta.
9. The liver tumor segmentation method based on mixed attention and multi-scale supervision according to claim 4, wherein the multi-scale supervision module performs a 1x1 convolution on the feature map after each up-sampling convolution in the decoder, then up-samples it to the original size to obtain a segmentation map for each up-sampling stage, and establishes a loss function with the real labels to realize multi-scale supervision.
10. The liver tumor segmentation method based on mixed attention and multi-scale supervision according to claim 1, wherein the loss function in the MAS-ConvUNeXt liver tumor segmentation model training process is a cross entropy loss function.
CN202311787545.7A 2023-12-23 2023-12-23 Liver tumor segmentation method based on mixed attention and multi-scale supervision Pending CN117764948A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311787545.7A CN117764948A (en) 2023-12-23 2023-12-23 Liver tumor segmentation method based on mixed attention and multi-scale supervision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311787545.7A CN117764948A (en) 2023-12-23 2023-12-23 Liver tumor segmentation method based on mixed attention and multi-scale supervision

Publications (1)

Publication Number Publication Date
CN117764948A true CN117764948A (en) 2024-03-26

Family

ID=90323176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311787545.7A Pending CN117764948A (en) 2023-12-23 2023-12-23 Liver tumor segmentation method based on mixed attention and multi-scale supervision

Country Status (1)

Country Link
CN (1) CN117764948A (en)

Similar Documents

Publication Publication Date Title
CN110287849B (en) Lightweight depth network image target detection method suitable for raspberry pi
CN111563902B (en) Lung lobe segmentation method and system based on three-dimensional convolutional neural network
CN111242288B (en) Multi-scale parallel deep neural network model construction method for lesion image segmentation
WO2022001623A1 (en) Image processing method and apparatus based on artificial intelligence, and device and storage medium
CN113674253A (en) Rectal cancer CT image automatic segmentation method based on U-transducer
CN111445474B (en) Kidney CT image segmentation method based on bidirectional re-attention depth network
WO2024104035A1 (en) Long short-term memory self-attention model-based three-dimensional medical image segmentation method and system
CN110599502B (en) Skin lesion segmentation method based on deep learning
CN113192076B (en) MRI brain tumor image segmentation method combining classification prediction and multi-scale feature extraction
CN112348830B (en) Multi-organ segmentation method based on improved 3D U-Net
CN115409846A (en) Colorectal cancer focus region lightweight segmentation method based on deep learning
CN111951289B (en) Underwater sonar image data segmentation method based on BA-Unet
CN115311194A (en) Automatic CT liver image segmentation method based on transformer and SE block
CN116228792A (en) Medical image segmentation method, system and electronic device
CN114596317A (en) CT image whole heart segmentation method based on deep learning
CN115526829A (en) Honeycomb lung focus segmentation method and network based on ViT and context feature fusion
CN113160229A (en) Pancreas segmentation method and device based on hierarchical supervision cascade pyramid network
Li et al. A deep learning method for material performance recognition in laser additive manufacturing
CN113436198A (en) Remote sensing image semantic segmentation method for collaborative image super-resolution reconstruction
CN117437423A (en) Weak supervision medical image segmentation method and device based on SAM collaborative learning and cross-layer feature aggregation enhancement
CN116188509A (en) High-efficiency three-dimensional image segmentation method
CN116310329A (en) Skin lesion image segmentation method based on lightweight multi-scale UNet
CN116883341A (en) Liver tumor CT image automatic segmentation method based on deep learning
CN116797609A (en) Global-local feature association fusion lung CT image segmentation method
CN117036380A (en) Brain tumor segmentation method based on cascade transducer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination