CN115457051A - Liver CT image segmentation method based on global self-attention and multi-scale feature fusion

Info

Publication number: CN115457051A
Application number: CN202211064580.1A
Authority: CN (China)
Prior art keywords: attention, features, feature, convolution, scale
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 刘利军, 戴舒婷, 乔伟晨, 黄青松
Current Assignee: Kunming University of Science and Technology
Original Assignee: Kunming University of Science and Technology
Application filed by: Kunming University of Science and Technology
Priority date / filing date: 2022-08-31
Publication date: 2022-12-09

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/90 Dynamic range modification of images or parts thereof
    • G06T5/94 Dynamic range modification of images or parts thereof based on local image properties, e.g. for local contrast enhancement
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30056 Liver; Hepatic

Abstract

The invention relates to a liver CT image segmentation method based on global self-attention and multi-scale feature fusion, and belongs to the technical field of medical image processing. The method comprises the following steps: (1) acquiring an abdominal CT data set and preprocessing it; (2) extracting multi-scale features with a ResNeXt convolutional neural network, thereby introducing multi-scale spatial information; (3) passing the multi-scale features through a global self-attention module to obtain globally fused self-attention features; (4) extracting the fused features through an improved convolution module and finally up-sampling to obtain the segmentation result. The method is validated on the public LiTS data set: the average Dice value of the overlap between the predicted and ground-truth segmentations reaches 96.4%, 4.3% higher than the classical UNet model.

Description

Liver CT image segmentation method based on global self-attention and multi-scale feature fusion
Technical Field
The invention relates to a liver CT image segmentation method based on global self-attention and multi-scale feature fusion, and belongs to the technical field of medical image processing.
Background
Liver cancer is among the cancers with the fastest-growing morbidity and mortality worldwide. Computed Tomography (CT) is a common clinical method for tumor diagnosis, mainly because CT imaging largely avoids the organ-overlap problem of other imaging techniques and is therefore better suited to tumor identification. Liver segmentation is a key step in the clinical diagnosis and analysis of interventional liver cancer; accurate liver segmentation results can greatly improve a doctor's efficiency in reading CT images, so that a diagnosis and treatment plan can be made as early as possible.
With the growing volume of CT imaging, the scan data of a single case typically comprises hundreds of CT slices, and analyzing them manually one by one suffers from subjective interference, inconsistent standards, complex workflow, heavy time and labor cost, and poor repeatability. Accurate automatic segmentation of the liver in abdominal CT images therefore has far higher value than manual segmentation. The main difficulties of liver segmentation are the low internal contrast of the liver, the small intensity difference between the liver and neighboring organs, the blurred boundaries between adjacent organs, and the large variation in liver shape. Liver segmentation from CT images is thus a challenging task.
Automatic liver segmentation is mainly addressed by three families of methods. 1) Traditional image segmentation methods: the segmentation task is completed using shallow features such as grayscale and texture; as a result these methods are relatively sensitive to noisy pixels and struggle to exploit deeper image features. 2) Machine learning methods: data patterns are analyzed from large-scale data; however, most machine learning algorithms require carefully hand-crafted image features, and both the expressiveness of the features and the final segmentation result are limited by how the features are chosen. 3) Deep learning methods: increasingly abstract features can be extracted without additional intermediate steps, and the feature selection is continuously adjusted according to the results, greatly improving accuracy. Existing deep learning methods generally outperform traditional image processing, but they remain insufficient for segmenting the liver and liver tumors, and do not adequately account for characteristics such as the blurred boundaries and variable positions that the liver and its tumors exhibit in CT images. During down-sampling, many extracted features contribute little or nothing to the segmentation result, yet they are not attenuated and are expressed in the same way as the key segmentation features, which harms the result. In addition, the skip connections of the traditional U-Net introduce a semantic gap that causes feature mismatching, and some multi-scale methods do not fully consider the associations among features, which degrades segmentation performance.
Disclosure of Invention
To solve the above problems, the invention provides a liver CT image segmentation method based on global self-attention and multi-scale feature fusion. ResNeXt, which uses grouped convolution, is selected as the image feature extraction network, obtaining richer image features without increasing computation time. The problem of blurred liver boundaries is addressed by extracting and fusing features at different scales through a multi-scale architecture. Since the liver necessarily bears some relation to other organs in a CT image, a self-attention mechanism is introduced to capture the relations among the extracted features. Finally, the features are fused through a residual convolution block with an improved attention method, so that they are better expressed and a better liver segmentation result is obtained.
The technical scheme of the invention is as follows: a liver CT image segmentation method based on global self-attention and multi-scale feature fusion comprises the following specific steps:
Step1, image preprocessing: process the CT images in the LiTS data set according to the HU value range to increase contrast, and expand the data set by random flipping and similar augmentations.
Step2, acquiring same-dimension features and multi-scale features: after the Step1 preprocessing operation, image features are extracted using a ResNeXt convolutional neural network, and convolution features of uniform dimension, plus multi-scale features based on them, are obtained through linear transformation.
Step3, obtaining the global self-attention fusion feature: the multi-scale features obtained in Step2 are passed through a global self-attention module (Non-Local) to obtain self-attention fusion features containing global information, capturing the relations between target features and surrounding features.
Step4, extracting features from the self-attention fusion features obtained in Step3 through an improved convolution module, highlighting the role of important semantic features in the channel dimension, and finally up-sampling to obtain the segmentation result.
Further, the specific steps of Step1 are as follows:
Step1.1, process the CT images in the LiTS data set according to the HU value range corresponding to the liver organ to increase contrast: process according to CT values ranging from -130HU to 230HU, namely a window width of 360HU and a window level of 50HU, and then normalize the processed CT images.
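As an illustrative sketch of this preprocessing step (the function name and the normalization to [0, 1] are assumptions; the text only specifies the -130HU to 230HU window followed by a normalization operation):

    import numpy as np

    def window_and_normalize(ct_slice, hu_min=-130.0, hu_max=230.0):
        # Clip the slice to the liver window (width 360HU, level 50HU),
        # then linearly rescale the clipped values to [0, 1].
        clipped = np.clip(ct_slice.astype(np.float32), hu_min, hu_max)
        return (clipped - hu_min) / (hu_max - hu_min)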
Step1.2, data expansion: data enhancement uses random horizontal flipping, vertical flipping, scaling and cropping; after random expansion, the data are split, with 82% as the training set and the remaining 18% as the test set, and the training set is further divided into training data and validation data in a ratio of 8:2.
Further, the specific steps of Step2 are as follows:
Step2.1: after image preprocessing, the first five layers of the ResNeXt-101 network are used as the feature extraction layers. The convolution in each ResNeXt block is divided into 32 paths, each path processing an intermediate channel dimension of 4; different paths are equivalent to different feature subspaces used to extract different semantic features, while the convolution kernels of different paths have sparser relations, reducing the risk of overfitting.
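A minimal PyTorch sketch of one such grouped-convolution block follows; the cardinality of 32 and path width of 4 come from the text, while the 256-channel width, batch normalization and identity shortcut follow the standard ResNeXt design and are assumptions where the text is silent:

    import torch.nn as nn

    class ResNeXtBlock(nn.Module):
        # 32-path grouped convolution; each path has an intermediate width
        # of 4, so the grouped 3x3 convolution runs on 32 * 4 = 128 channels.
        def __init__(self, channels=256, cardinality=32, path_width=4):
            super().__init__()
            mid = cardinality * path_width
            self.body = nn.Sequential(
                nn.Conv2d(channels, mid, 1, bias=False),
                nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
                nn.Conv2d(mid, mid, 3, padding=1, groups=cardinality, bias=False),
                nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
                nn.Conv2d(mid, channels, 1, bias=False),
                nn.BatchNorm2d(channels),
            )
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.relu(self.body(x) + x)  # identity shortcut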
Step2.2 unifies the channel dimensions of the Layer1-4 outputs of the ResNeXt network structure to 64 through linear transformations and upsamples the feature maps to the size of Layer1. The four features are concatenated, and the concatenated features are compressed back to 64 channels by a 1 × 1 convolution to obtain the multi-scale features, whose channel number and feature-map size are consistent with the dimensions of the unified Layer1-4 features.
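A sketch of this fusion step follows; the Layer1-4 channel counts (256/512/1024/2048 for a standard ResNeXt-101) and the bilinear upsampling are assumptions not fixed by the text:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiScaleFusion(nn.Module):
        # Unify Layer1-4 outputs to 64 channels, upsample to the Layer1
        # resolution, concatenate, and compress back to 64 channels.
        def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=64):
            super().__init__()
            self.reduce = nn.ModuleList(
                [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels])
            self.compress = nn.Conv2d(4 * out_channels, out_channels, kernel_size=1)

        def forward(self, feats):  # feats: [layer1, layer2, layer3, layer4]
            size = feats[0].shape[-2:]
            unified = [F.interpolate(conv(f), size=size, mode='bilinear',
                                     align_corners=False)
                       for conv, f in zip(self.reduce, feats)]
            return self.compress(torch.cat(unified, dim=1))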
Further, the Step3 comprises the following specific steps:
Step3.1: different organs in an abdominal CT image bear certain relations to one another, and obtaining these relations can improve the liver segmentation effect. Inspired by the idea, used in the non-local mean algorithm, of calculating the correlation between the current position and other positions in the image, three separate linear mappings are performed starting from the multi-scale features obtained in Step2 to obtain the Key, Query and Value embedded-space features; each linear mapping is implemented with a 1 × 1 convolution.
Step3.2 calculates the similarity between the Key and Query features. The correlation function follows the Gaussian function used in the non-local mean and is computed as

    f(x_i, x_j) = e^{\theta(x_i)^{\top} \phi(x_j)}

where x_i is the i-th position of the input feature map, j ranges over all positions possibly associated with i, and \theta and \phi denote the Query and Key embeddings. The calculated similarities are applied as weights on Value to obtain the self-attention feature.
Step3.3 passes the self-attention feature through a Softmax layer to obtain the self-attention-weighted output, thereby integrating the learned long-range dependencies into the output feature. The overall calculation is

    y_i = \frac{1}{C(x)} \sum_{\forall j} f(x_i, x_j) g(x_j)

where C(x) is a Softmax normalization function, the function g linearly maps the representation at input position j (typically by 1 × 1 convolution), and the function f computes the correlation between input positions i and j.
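A compact PyTorch sketch of such a Non-Local block is given below; the intermediate channel width, the output 1 × 1 convolution and the residual addition follow the common Non-Local design and are assumptions where the text is silent:

    import torch
    import torch.nn as nn

    class NonLocalBlock(nn.Module):
        # Query/Key/Value are 1x1-convolution embeddings; the Softmax over
        # positions j plays the role of the normalization C(x).
        def __init__(self, channels=64, inter_channels=32):
            super().__init__()
            self.query = nn.Conv2d(channels, inter_channels, 1)
            self.key = nn.Conv2d(channels, inter_channels, 1)
            self.value = nn.Conv2d(channels, inter_channels, 1)
            self.out = nn.Conv2d(inter_channels, channels, 1)

        def forward(self, x):
            b, _, h, w = x.shape
            q = self.query(x).flatten(2).transpose(1, 2)  # (B, HW, C')
            k = self.key(x).flatten(2)                    # (B, C', HW)
            v = self.value(x).flatten(2).transpose(1, 2)  # (B, HW, C')
            attn = torch.softmax(q @ k, dim=-1)           # f(x_i, x_j) / C(x)
            y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
            return self.out(y) + x                        # residual fusion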
Further, the Step4 comprises the following specific steps:
Step4.1 takes the fusion features containing multi-scale information and self-attention relations from Step3 and performs further feature extraction through an improved convolution module. The multi-scale self-attention fusion feature is first passed through a 1 × 1 convolution that maps the feature channels to a specified dimension; the result is then fed through a 1 × 1 convolution branch and a 3 × 3 convolution branch, and the two branch outputs are summed.
Step4.2 applies an attention module acting on the channel dimension (Channel Attention, CA) to recalibrate the feature channels, and uses a residual path to fuse the original feature with the channel-attention feature to obtain the output feature of the residual module, computed as

    Y_{MRA}(X) = Y_{CA}(W_L X + W_E X) + X

where Y_{MRA}(X) denotes the multi-level residual attention convolution operation and X denotes the input feature; W_L is a 1 × 1 convolution matrix that linearly maps the original input, equivalent to a residual path; W_E is a 3 × 3 convolution matrix for feature extraction on the input features; and Y_{CA} denotes the channel attention operation.
Step4.3 applies the multi-path parallel idea of ensemble learning to the extracted features to obtain four groups of segmentation outputs, and the four groups of outputs are averaged as the final output result.
The data preprocessing of Step1 and the loss design used in training are further explained below:
1) The data preprocessing method comprises the following steps:
the original CT image contains a CT value in a large range, the contrast is poor as a whole, and the gray level difference among organs in the image is small and difficult to distinguish. On medical CT images, the HU value range corresponding to the target organ is usually used for processing to increase the contrast. The liver part is usually processed by adopting a window width of 150 and a window level of 30, but the liver organ and the liver tumor have gray level difference, and if the method is processed according to the HU value of the liver, the gray level loss of part of the liver tumor area is inevitably caused, so that important information is lost, and the training effect is not good. Aiming at the problem, the CT value range is-130 HU to 230HU, namely the window width 360HU and the window level 50HU, which are obtained by analyzing the distribution of the HU values of the histogram. And obtaining a processed CT image after normalization operation, wherein the processed image enhances the contrast between organs while retaining the information of the target region to the maximum extent, and is more beneficial to training a model, the image before processing is as shown in fig. 3 (a), and the image after processing is as shown in fig. 3 (b).
2) Designing a loss function:
To handle the imbalance between positive and negative samples in the data set, a weighted combination of Binary Cross Entropy (BCE) and Dice Loss (DL) serves as the training loss. The loss function L is calculated as

    L = \omega L_{BCE} + (1 - \omega) L_{DL}

    L_{BCE} = -\frac{1}{N} \sum_{i=1}^{N} [ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) ]

    L_{DL} = 1 - \frac{2 \sum y \hat{y} + \epsilon}{\sum y + \sum \hat{y} + \epsilon}

where y denotes the ground-truth segmentation map values and \hat{y} the segmentation map values predicted by the model; the weight \omega balancing the two losses is set to 0.5, and the smoothing term \epsilon, set to 1.0, avoids a zero denominator.
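A direct PyTorch sketch of this combined loss, assuming `pred` already holds probabilities (e.g. after a Sigmoid):

    import torch.nn.functional as F

    def bce_dice_loss(pred, target, omega=0.5, eps=1.0):
        # L = omega * BCE + (1 - omega) * Dice loss; eps smooths the
        # Dice ratio so the denominator is never zero.
        bce = F.binary_cross_entropy(pred, target)
        inter = (pred * target).sum()
        dice = 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
        return omega * bce + (1.0 - omega) * dice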
The invention has the beneficial effects that:
1. Aiming at the characteristics of the liver segmentation task, the liver CT image segmentation method based on global self-attention and multi-scale feature fusion adopts a multi-scale strategy to extract diverse features, using the multi-scale features extracted by different network layers to introduce multi-scale spatial information and alleviate problems such as blurred boundaries in liver segmentation. A global self-attention mechanism is introduced to build relations between the image features of different semantic categories, so that the model better captures the associations between the semantic features of the liver and those of other organs, addressing the large variation in liver shape.
2. Each channel dimension corresponds to one type of semantic information, mapped to one type of image feature in the original image. The image features corresponding to the liver organ should, however, carry higher importance, and identical channel weights are clearly unfavorable for expressing the key features. To address this, the invention designs a multi-level residual attention convolution (MRA) module to highlight important semantic features in the channel dimension.
In summary, the liver CT image segmentation method based on global self-attention and multi-scale feature fusion first uses a ResNeXt convolutional neural network to obtain multi-scale features from the abdominal CT image, then uses the global self-attention module to capture spatial position relations, combined with the multi-level residual attention module to highlight important semantic features in the channel dimension; the accuracy of liver image segmentation is thereby improved.
Drawings
FIG. 1 is a diagram of the liver CT image segmentation method based on global self-attention and multi-scale feature fusion;
FIG. 2 is a schematic diagram of a global self-attention-based module according to the present invention;
FIG. 3 compares images before and after the preprocessing of the invention; wherein (a) is the image before processing and (b) the image after processing;
FIG. 4 is a visual comparison of the segmentation results of the invention; wherein (a) is the CT picture; (b) the reference segmentation standard; (c) the original model's segmentation result; (d) the segmentation result after adding the self-attention module; (e) the segmentation result after adding the improved convolution module.
Detailed Description
Example 1: as shown in Figs. 1-4, a liver CT image segmentation method based on global self-attention and multi-scale feature fusion specifically comprises the following steps:
Step1, image preprocessing: process the CT images in the LiTS data set according to the HU value range to increase contrast, and expand the data set by random flipping and similar augmentations.
Further, the specific steps of Step1 are as follows:
Step1.1 processes the CT images in the LiTS data set according to the HU value range corresponding to the liver organ to increase contrast: processing uses CT values ranging from -130HU to 230HU, namely a window width of 360HU and a window level of 50HU, followed by normalization of the processed CT images.
Step1.2, data expansion: data enhancement uses random horizontal flipping, vertical flipping, scaling and cropping; after random expansion, the data are split, with 82% as the training set and the remaining 18% as the test set, and the training set is further divided into training data and validation data in a ratio of 8:2.
Step2, acquiring the same dimension characteristic and the multi-scale characteristic: after the Step1 preprocessing operation, extracting image features by using a ResNeXt convolution neural network, and obtaining convolution features with uniform dimensionality and multi-scale features based on the convolution features through linear transformation.
Further, the specific steps of Step2 are as follows:
Step2.1: after image preprocessing, the first five layers of the ResNeXt-101 network are used as the feature extraction layers; the convolution in each ResNeXt block is divided into 32 paths, each path processing an intermediate channel dimension of 4, and different paths are equivalent to different feature subspaces used to extract different semantic features.
Step2.2 unifies the channel dimensions of the Layer1-4 outputs of the ResNeXt network structure to 64 through linear transformations and upsamples the feature maps to the size of Layer1. The four features are concatenated and compressed back to 64 channels by a 1 × 1 convolution to obtain the multi-scale features, whose channel number and feature-map size are consistent with the dimensions of the unified Layer1-4 features.
Step3, obtaining a global self-attention fusion characteristic: and obtaining a self-attention fusion feature containing global information through a global self-attention module (Non-Local) by using the multi-scale feature obtained in Step2 so as to capture the relation between the target feature and the surrounding features.
Further, the specific steps of Step3 are as follows:
Step3.1: certain relations exist among the different organs in an abdominal CT image, and acquiring these relations can improve the liver organ segmentation effect. Inspired by the Non-Local mean algorithm, the invention selects Non-Local global self-attention: starting from the multi-scale features obtained in Step2, three linear mappings are performed to obtain the Key, Query and Value embedded-space features, each implemented with a 1 × 1 convolution. The multi-scale fusion features under different attention methods are compared in Table 1.
TABLE 1 Comparison of multi-scale fusion features using different attention methods
(table content appears as an image in the original publication and is not reproduced here)
Step3.2 calculates the similarity between the Key and Query features. The correlation function follows the Gaussian function used in the non-local mean and is computed as

    f(x_i, x_j) = e^{\theta(x_i)^{\top} \phi(x_j)}

where x_i is the i-th position of the input feature map, j ranges over all positions possibly associated with i, and \theta and \phi denote the Query and Key embeddings. The calculated similarities are applied as weights on Value to obtain the self-attention feature.
Step3.3 passes the self-attention feature through a Softmax layer to obtain the self-attention-weighted output, thereby integrating the learned long-range dependencies into the output feature; the overall calculation is

    y_i = \frac{1}{C(x)} \sum_{\forall j} f(x_i, x_j) g(x_j)

where C(x) is a Softmax normalization function, the function g linearly maps the representation at input position j (typically by 1 × 1 convolution), and the function f computes the correlation between input positions i and j.
And Step4, extracting the features of the self-attention fusion features obtained in Step3 through an improved convolution module, highlighting the effect of important semantic features in channel dimensions, and finally performing up-sampling to obtain a segmentation result.
Further, the specific steps of Step4 are as follows:
step4.1 extracting the fusion feature containing multi-scale information and self-attention relation from Step3, and further extracting the information in the fusion feature through an improved convolution module. The multi-scale self-attention fusion feature is subjected to 1 x 1 convolution, the feature channel is mapped to the specified dimension, and then the feature summation of the two is obtained through 1 x 1 convolution and 3 x 3 convolution.
Step4.2 applies an attention module acting on the channel dimension (Channel Attention, CA) to recalibrate the feature channels, and uses a residual path to fuse the original feature with the channel-attention feature to obtain the output feature of the residual module, computed as

    Y_{MRA}(X) = Y_{CA}(W_L X + W_E X) + X

where Y_{MRA}(X) denotes the multi-level residual attention convolution operation and X denotes the input feature; W_L is a 1 × 1 convolution matrix that linearly maps the original input, equivalent to a residual path; W_E is a 3 × 3 convolution matrix for feature extraction on the input features; and Y_{CA} denotes the channel attention operation.
Step4.3 applies the multi-path parallel idea of ensemble learning to the extracted features to obtain four groups of segmentation outputs, and the four groups of outputs are averaged as the final output result.
The hardware environment used for the experiments comprises an Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz and an NVIDIA TITAN XP GPU; the operating system is Ubuntu 18.04.1, and the software platform comprises the CUDA GPU parallel computing architecture and the PyTorch deep learning framework based on the Python programming language. The Adam optimizer is used, and the learning-rate schedule is a cosine annealing strategy with an initial learning rate of 0.001 and a minimum of 0.00001, reset every 30 rounds. The total number of training rounds is 80 and the batch size is 4.
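A sketch of this training configuration (the one-layer stand-in model is hypothetical; the optimizer, learning-rate schedule and epoch count come from the text, and the warm-restart scheduler is an assumed reading of "reset every 30 rounds"):

    import torch
    import torch.nn as nn

    model = nn.Conv2d(1, 1, 1)  # hypothetical stand-in for the segmentation network
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
        optimizer, T_0=30, eta_min=0.00001)  # anneal 1e-3 -> 1e-5, restart every 30

    for epoch in range(80):  # 80 rounds in total; batch size 4 in the data loader
        ...                  # one training epoch minimizing bce_dice_loss
        scheduler.step()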
Table 2 compares the method of the invention with other methods from the medical image segmentation field on the LiTS data set, including classical segmentation algorithms such as UNet and FCN, methods using multi-scale feature fusion similar to the invention such as DAF and msAUNet, and well-known liver segmentation algorithms such as H-DenseUNet and Multiple UNet. The liver segmentation results of the method far exceed classical segmentation algorithms including FCN; the average Dice value of the overlap between the predicted and ground-truth segmentations reaches 96.4%, 4.3% higher than the classical UNet model. Since the invention achieves results comparable to 3D methods while using a 2D model, it is highly competitive in the liver segmentation task.
TABLE 2 Comparison with existing methods
(table content appears as an image in the original publication and is not reproduced here)
Fig. 4 shows the experimental segmentation results of the invention on the LiTS data set, where (a) is the CT picture; (b) the reference segmentation standard; (c) the original model's segmentation result; (d) the segmentation result after adding the self-attention module; (e) the segmentation result after adding the improved convolution module. After the self-attention mechanism is added, erroneous predictions in regions outside the liver are alleviated to a certain extent and some false-positive predictions are eliminated, further demonstrating the effect of self-attention in the liver segmentation task. Further adding the MRA module with its refined channel attention mechanism successfully eliminates most false-positive predictions through the enhancement or suppression of semantic features along the channel dimension, while bringing the segmentation edges closer to the ground-truth segmentation.
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit and scope of the present invention.

Claims (5)

1. The liver CT image segmentation method based on the fusion of global self-attention and multi-scale features is characterized by comprising the following specific operation steps of:
step1, image preprocessing: processing the CT image in the LiTS data set according to the HU value range to increase the contrast, and then expanding the data set;
step2, acquiring the same dimension characteristic and the multi-scale characteristic: after the Step1 preprocessing operation, extracting image features by using a ResNeXt convolutional neural network, and obtaining convolution features of uniform dimensions and multi-scale features based on the convolution features through linear transformation;
step3, obtaining a global self-attention fusion characteristic: obtaining a self-attention fusion feature containing global information through a global self-attention module Non-Local by using the multi-scale feature obtained in Step2 so as to capture the relation between the target feature and the surrounding features;
and Step4, extracting the features of the self-attention fusion features obtained in Step3 through an improved convolution module, highlighting the effect of important semantic features in channel dimensions, and finally performing up-sampling to obtain a segmentation result.
2. The liver CT image segmentation method based on the fusion of the global self-attention and the multi-scale features as claimed in claim 1, wherein Step1 comprises the following specific steps:
step1.1 processing the CT image in the LiTS data set according to the HU value range corresponding to the liver organ to increase the contrast; processing according to the CT value ranging from-130 HU to 230HU, namely the window width 360HU and the window level 50HU, and then performing normalization operation on the processed CT image;
the Step1.2 data expansion adopts the modes of random horizontal turning, vertical turning, zooming and cutting to carry out data enhancement; after random expansion, the data are divided, wherein 82% is used as a training set, the rest 18% is used as a test set, and the training set is further divided into training data and verification data according to the proportion of 8.
3. The liver CT image segmentation method based on the fusion of the global self-attention and the multi-scale features as claimed in claim 1, wherein Step2 comprises the following specific steps:
Step2.1, after image preprocessing, using the first five layers of a ResNeXt-101 network as the feature extraction layers; the convolution in each ResNeXt block is divided into 32 paths, each path processing an intermediate channel dimension of 4; different paths are equivalent to different feature subspaces used to extract different semantic features, while the convolution kernels of different paths have sparser relations, reducing the risk of overfitting;
Step2.2, unifying the channel dimensions of the Layer1-4 outputs in the ResNeXt network structure to 64 through linear transformations and upsampling the feature maps to the size of Layer1; the four features are concatenated and compressed back to 64 channels by a 1 × 1 convolution to obtain the multi-scale features, whose channel number and feature-map size are consistent with the dimensions of the unified Layer1-4 features.
4. The liver CT image segmentation method based on the fusion of the global self-attention and the multi-scale features as claimed in claim 1, wherein Step3 comprises the following specific steps:
Step3.1, certain relations exist among different organs in the abdominal CT image, and acquiring these relations can improve the liver organ segmentation effect; inspired by the idea, used in the non-local mean algorithm, of calculating the correlation between the current position and other positions in the image, three linear mappings are performed starting from the multi-scale features obtained in Step2 to obtain the Key, Query and Value embedded-space features, each linear mapping being implemented with a 1 × 1 convolution;
Step3.2, calculating the similarity between the Key and Query features; the correlation function follows the Gaussian function used in the non-local mean and is computed as

    f(x_i, x_j) = e^{\theta(x_i)^{\top} \phi(x_j)}

where x_i is the i-th position of the input feature map, j ranges over all positions possibly associated with i, and \theta and \phi denote the Query and Key embeddings; the calculated similarities are applied as weights on Value to obtain the self-attention feature;
Step3.3, passing the self-attention feature through a Softmax layer to obtain the self-attention-weighted output, thereby integrating the learned long-range dependencies into the output feature; the overall calculation is

    y_i = \frac{1}{C(x)} \sum_{\forall j} f(x_i, x_j) g(x_j)

where C(x) is a Softmax normalization function, the function g linearly maps the representation at input position j (typically by 1 × 1 convolution), and the function f computes the correlation between input positions i and j.
5. The liver CT image segmentation method based on the fusion of the global self-attention and the multi-scale features as claimed in claim 1, wherein Step4 comprises the following specific steps:
Step4.1, taking the fusion features containing multi-scale information and self-attention relations from Step3 and further extracting the information in them through an improved convolution module; the multi-scale self-attention fusion feature is first passed through a 1 × 1 convolution that maps the feature channels to a specified dimension, and the result is then fed through a 1 × 1 convolution branch and a 3 × 3 convolution branch whose outputs are summed;
Step4.2, using an attention module acting on the channel dimension to recalibrate the feature channels, and using a residual path to fuse the original feature with the channel-attention feature to obtain the output feature of the residual module, computed as

    Y_{MRA}(X) = Y_{CA}(W_L X + W_E X) + X

where Y_{MRA}(X) denotes the multi-level residual attention convolution operation and X denotes the input feature; W_L is a 1 × 1 convolution matrix for linear mapping of the original input, equivalent to a residual path; W_E is a 3 × 3 convolution matrix for feature extraction on the input features; and Y_{CA} denotes the channel attention operation;
the characteristics extracted by Step4.3 adopt a multi-path parallel idea of ensemble learning to obtain four groups of segmentation outputs, and the four groups of outputs are calculated and averaged to be used as a final output result.
CN202211064580.1A 2022-08-31 2022-08-31 Liver CT image segmentation method based on global self-attention and multi-scale feature fusion Pending CN115457051A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211064580.1A CN115457051A (en) 2022-08-31 2022-08-31 Liver CT image segmentation method based on global self-attention and multi-scale feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211064580.1A CN115457051A (en) 2022-08-31 2022-08-31 Liver CT image segmentation method based on global self-attention and multi-scale feature fusion

Publications (1)

Publication Number Publication Date
CN115457051A true CN115457051A (en) 2022-12-09

Family

ID=84299992

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211064580.1A Pending CN115457051A (en) 2022-08-31 2022-08-31 Liver CT image segmentation method based on global self-attention and multi-scale feature fusion

Country Status (1)

Country Link
CN (1) CN115457051A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115984574A (en) * 2023-03-20 2023-04-18 北京航空航天大学 Image information extraction model and method based on cyclic transform and application thereof
CN115984574B (en) * 2023-03-20 2023-09-19 北京航空航天大学 Image information extraction model and method based on cyclic transducer and application thereof
CN116152278A (en) * 2023-04-17 2023-05-23 杭州堃博生物科技有限公司 Medical image segmentation method and device and nonvolatile storage medium
CN116248959A (en) * 2023-05-12 2023-06-09 深圳市橙视科技发展有限公司 Network player fault detection method, device, equipment and storage medium
CN116681958A (en) * 2023-08-04 2023-09-01 首都医科大学附属北京妇产医院 Fetal lung ultrasonic image maturity prediction method based on machine learning
CN116681958B (en) * 2023-08-04 2023-10-20 首都医科大学附属北京妇产医院 Fetal lung ultrasonic image maturity prediction method based on machine learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination