CN115424103A - Improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion - Google Patents

Improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion Download PDF

Info

Publication number
CN115424103A
CN115424103A (application CN202210990335.7A)
Authority
CN
China
Prior art keywords
net
brain tumor
improved
attention mechanism
feature fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210990335.7A
Other languages
Chinese (zh)
Inventor
黄同愿
刘瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Technology
Original Assignee
Chongqing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Technology filed Critical Chongqing University of Technology
Priority to CN202210990335.7A priority Critical patent/CN115424103A/en
Publication of CN115424103A publication Critical patent/CN115424103A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/0002: Inspection of images, e.g. flaw detection
    • G06T7/0012: Biomedical image inspection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; Edge detection
    • G06T7/11: Region-based segmentation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/26: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/30: Noise filtering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74: Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761: Proximity, similarity or dissimilarity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10072: Tomographic images
    • G06T2207/10088: Magnetic resonance imaging [MRI]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20084: Artificial neural networks [ANN]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30004: Biomedical image processing
    • G06T2207/30016: Brain
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30004: Biomedical image processing
    • G06T2207/30096: Tumor; Lesion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)

Abstract

The invention discloses an improved U-Net brain tumor segmentation method based on an attention mechanism and multi-scale feature fusion, belonging to the field of semantic segmentation and comprising the following steps: S1, data acquisition and data preprocessing; S2, constructing an improved U-Net brain tumor segmentation model based on an attention mechanism and multi-scale feature fusion; S3, constructing a mixed loss function, training the improved U-Net brain tumor segmentation model, and storing the optimal model; S4, predicting with the optimal model, storing the prediction results, performing online validation, obtaining the evaluation indexes, and finally comparing the results. A residual module is added on the basis of the U-Net model, ordinary convolution is replaced by depthwise over-parameterized convolution (DO-Conv), and a multi-scale feature fusion module and an attention mechanism module are added; the segmentation effect is then evaluated quantitatively with the Dice similarity coefficient. Experimental results show that the improved U-Net network can effectively improve the segmentation accuracy of MRI brain tumor images and has good segmentation performance.

Description

Improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion
Technical Field
The invention relates to deep neural networks, in particular to an improved U-Net brain tumor segmentation method based on an attention mechanism and multi-scale feature fusion, and belongs to the field of semantic segmentation.
Background
According to the global cancer statistics report, new brain tumor cases in 2018 accounted for 1.6% of all new tumor cases worldwide. In China in 2018, about 64,000 deaths were attributed to brain tumors, ranking 9th among all cancer deaths. From onset to treatment, brain tumor patients often develop a series of physiological and psychological symptoms such as headache, weakness, epilepsy, cognitive impairment, anxiety and depression, and multiple symptoms frequently coexist, greatly affecting the patient's functional status and quality of life; early diagnosis therefore plays a crucial role.
Magnetic Resonance Imaging (MRI), as a non-invasive brain tumor imaging technique, can generate high-resolution, multi-modal brain images, provides comprehensive and accurate information for clinicians to diagnose brain tumors, and is one of the important technical means for identifying brain tumors. In addition, MRI can visualize different structures of the same tissue under different imaging settings. Four modalities are commonly available: the T1-weighted, T1ce-weighted, T2-weighted and FLAIR images; the brain tissue characteristics captured by the different modalities complement one another, which facilitates more accurate localization and segmentation of the tumor. Computer-aided diagnosis in radiology has the advantages of low cost, high efficiency and time saving. In recent years, computer-aided diagnosis techniques based on deep learning have been applied in many medical fields and have achieved major breakthroughs.
Because brain tumors vary widely in shape, are unevenly distributed in position and size, and have complex boundaries, brain tumor images are currently segmented mainly by experts by hand, which is time-consuming and labor-intensive. Moreover, the segmentation results differ considerably depending on each physician's clinical experience, accumulated knowledge and working hours, so segmentation accuracy is difficult to guarantee. Therefore, research on efficient and reliable fully automatic brain tumor segmentation methods has important clinical significance for the diagnosis of brain tumor diseases. BraTS has the longest history of all MICCAI challenges and attracts nearly the largest number of participants every year, so it is a good platform for understanding state-of-the-art brain tumor segmentation methods. The 2018 BraTS champion Myronenko et al. proposed an asymmetric U-Net that uses a larger encoder to extract brain tumor image features and a smaller decoder to reconstruct the segmentation mask, with an additional decoder branch added to reconstruct the original image; the 2019 champion Jiang et al. used two cascaded U-Net networks to segment brain tumors, first performing coarse segmentation with the first network and then feeding its output into the second network to complete the refined segmentation; in 2020, the champion Isensee et al. proposed the nnU-Net model on the basis of the traditional U-Net by improving the data preprocessing pipeline; nnU-Net is an adaptive framework built on three models (2D U-Net, 3D U-Net and U-Net Cascade) that automatically adjusts its structure to the geometry of the input data, realizing a fully automatic segmentation process.
Disclosure of Invention
In order to solve the problems that existing brain tumor segmentation results vary and that segmentation accuracy is difficult to guarantee, the invention provides an improved U-Net brain tumor segmentation method based on an attention mechanism and multi-scale feature fusion; the method can effectively improve the segmentation accuracy of MRI brain tumor images and has good segmentation performance. The specific scheme of the invention is as follows:
an improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion comprises the following steps:
S1, data acquisition and data preprocessing;
S2, constructing an improved U-Net brain tumor segmentation model based on attention mechanism and multi-scale feature fusion;
S3, constructing a mixed loss function, training the improved U-Net brain tumor segmentation model, and storing the optimal model;
S4, predicting with the optimal model, storing the prediction results, performing online validation, obtaining the evaluation indexes, and finally comparing the results.
Further, in step S1, the experimental data set adopts the brain glioma public data set BraTS2018 provided for the Medical Image Computing and Computer Assisted Intervention (MICCAI) conference, in which the labeled data are divided into a tumor enhancement area, an edema area, a necrosis area and a background; the brain tumor data are preprocessed, and noise signals are removed from the multi-modal MRI brain tumor image data. Beneficial effects: gray-scale non-uniformity and outliers in the brain tumor images are eliminated.
Further, in step S2, improvements are made on the basis of the U-Net network structure, specifically:
S21, replacing the ordinary units in the original U-Net with residual units;
S22, adding a multi-scale feature fusion mechanism in the skip connections, and concatenating the obtained result with the up-sampled high-level features;
S23, adding an attention mechanism in the decoding stage of the U-Net structure.
Further, in step S21, replacing the ordinary units in the original U-Net with residual units means adding a batch normalization layer before each convolution layer in the ordinary convolution module, replacing the ReLU activation function with the PReLU activation function, and then adding a shortcut connection to prevent the gradient from vanishing; one extra 1 × 1 convolution is added in the decoding stage of the U-Net model to keep the dimensions consistent. Beneficial effects: replacing the original convolution layers with residual modules enhances the feature extraction capability of the network, improves its segmentation performance, and alleviates the gradient vanishing problem faced when the network is deepened.
Further, in step S22, multi-scale feature fusion is added to the skip connections of U-Net; the specific structure adopts an improved atrous spatial pyramid pooling module that uses three dilated convolutions with sampling rates of 2, 4 and 8, one ordinary convolution and one max pooling, applies these convolution and pooling operations to the encoder-stage features, and fuses the five resulting feature maps, thereby obtaining more detailed features. Beneficial effects: dilated convolutions with different sampling rates enlarge the receptive field over the brain tumor features, capture the more abstract feature information in the deeper layers of the network, and improve the network's use of multi-scale information, addressing the fact that the tumor position is not fixed and that tumor sizes vary.
Further, the attention mechanism in step S23 uses channel attention, comprising a squeeze (compression) operation and an excitation operation; in the squeeze operation, global average pooling (GAP) is performed on a feature map U of input size C × W × H (where (W, H) is the size of the feature map and C is the number of channels) to obtain a global compressed feature vector Z of size 1 × 1 × C; in the excitation operation, the result Z obtained by the squeeze operation is passed through a gate mechanism formed by two fully connected layers to obtain the channel weight matrix S of the feature map U, and finally the weight matrix S is multiplied by the original feature map U to obtain U'; the specific calculation formulas are as follows:
S = F_se(U) = σ(W2 δ(W1 GAP(U)))
U' = S · U
where W1 ∈ R^((C/r)×C) is the weight matrix of the first fully connected layer, W2 ∈ R^(C×(C/r)) is the weight matrix of the second fully connected layer, r is the scaling parameter (reduction ratio) with a default value of 16, δ is the ReLU activation function, and σ is the sigmoid activation function. Beneficial effects: the input features are recalibrated with the channel weights, which enhances the acquisition of effective information and avoids information redundancy, addressing the problems that tumor features are not obvious and that brain tumor images contain many interfering factors such as noise and background.
Further, all convolutions in the U-Net network use depthwise over-parameterized convolution (DO-Conv) instead of ordinary convolution. Beneficial effects: DO-Conv speeds up the convergence of the network without increasing computational complexity.
Further, the mixed loss function in step S3 comprises a generalized Dice loss (GDL) function and a categorical cross-entropy loss function, and is calculated as:
L = L_g + λL_c
where λ is a hyper-parameter controlling the balance between L_g and L_c, set to 1.25 in this experiment;
the GDL function is a multi-class loss function that assigns an adaptive weight to each class to handle the class imbalance problem; the GDL function is calculated as:
L_g = 1 - 2·(Σ_{j=1..L} W_j Σ_{i=1..N} g_ij·p_ij) / (Σ_{j=1..L} W_j Σ_{i=1..N} (g_ij + p_ij) + ε)
where L is the number of tumor class labels, N is the number of pixels, ε is a smoothing term set to prevent calculation errors caused by a zero denominator, W_j denotes the weight of the j-th class, g_ij is the ground-truth label of the j-th class at the i-th pixel, and p_ij is the model's prediction at the corresponding position;
the multi-class cross-entropy loss function is calculated as:
L_c = -(1/N)·Σ_{i=1..N} Σ_{j=1..C} g_ij·log(p_ij)
where N is the number of all samples, C is the number of all labels, g_ij is the ground-truth value of the j-th element of the i-th sample, and p_ij is the predicted value of the j-th element of the i-th sample. Beneficial effects: the mixed loss function addresses the class imbalance problem of brain tumor images.
Further, in step S4, the BraTS2018 validation set is input into the network, segmentation prediction is performed with the trained model, and the segmentation prediction map of each patient is stored; the official website is then logged into, the online validation link for the BraTS2018 validation set is located, the validation-set segmentation results are uploaded through that link for online validation, and finally a table of prediction results for all patients in the validation set is obtained.
Further, the Dice similarity coefficient is used as the performance evaluation index to quantitatively evaluate the segmentation results of the model; Dice measures the degree of overlap between the segmentation region produced by the model and the real segmentation region of the label, ranges over [0,1], and the larger the value, the closer the tumor segmentation result is to the annotation and the better the segmentation effect; it is defined as:
Dice = 2|P_1 ∩ T_1| / (|P_1| + |T_1|)
where ∩ is the logical AND operator, |·| is the size of a set, and P_1 and T_1 represent the sets of voxels with P = 1 and T = 1, respectively.
The invention has the beneficial effects that:
the invention discloses an improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion, which is characterized in that a residual error module is added on the basis of a U-Net model, the depth parameterization convolution is used for replacing common convolution, and a multi-scale feature fusion module and an attention mechanism module are added, and then a Dice similarity coefficient is adopted to quantitatively evaluate the segmentation effect, and the experimental result shows that: the improved U-Net network can effectively improve the segmentation precision of the MRI brain tumor image and has good segmentation performance.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a schematic diagram of a complete model structure of an improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion according to the present invention;
FIG. 2 is a schematic diagram of depth over-parameterization convolution in an improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion;
FIG. 3 (a) is a schematic diagram of an ordinary convolution block in the improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion;
FIG. 3 (b) is a schematic diagram of a residual module in the improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion;
FIG. 3 (c) is a schematic diagram of the residual module of FIG. 3 (b) with an added 1 × 1 DO-Conv in the shortcut;
FIG. 4 is a schematic structural diagram of multi-scale feature fusion in an improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion;
FIG. 5 is a schematic diagram of the attention mechanism (SE block) in the improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion;
FIG. 6 is a schematic diagram of the calculation of evaluation indexes in the improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion;
FIG. 7 is a schematic diagram of a BraTS2018 validation set segmentation result in an improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion.
Detailed Description
The following is a more detailed description of the present invention by way of specific embodiments.
The complete model of the improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion as shown in FIG. 1 comprises the following steps:
s1, data acquisition and data preprocessing
S11, a data acquisition stage:
using BraTS2018 as the experimental dataset of the present invention, the data of each patient contains MR images of four modalities (Flair, T1ce and T2) and true segmentation labels, and the size of each MR image is 240 × 240 × 155. All MRI-tagged images were manually segmented by 1 to 4 experienced neuroradiologists, where the labeling data can be divided into four parts: tumor enhancement zone, edema zone, necrosis zone and background. To better assess the segmentation, these segments were divided into three segmentation tasks, including segmentation of WT (tumor region), TC (tumor core region), and ET (enhanced tumor region), where WT includes tumor enhancement, edema, and necrosis regions, TC includes tumor enhancement and necrosis regions, and ET includes only tumor enhancement regions.
S12, data preprocessing stage:
The brain tumor data are preprocessed to remove noise signals from the multi-modal MRI brain tumor image data. First, the regions of the MR image that do not contain brain tissue are cropped away, and the top and bottom 1% of slices, which contain no brain tumor, are removed. The original 240 × 240 × 155 MR images are then cropped to 128 × 128 × 144, and finally the MR images of each modality are normalized by subtracting the mean and dividing by the standard deviation of the intensities within the slice, which alleviates the gray-scale non-uniformity of the brain tumor images.
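A minimal preprocessing sketch along the lines described above is given below; the exact crop offsets and masking strategy are not specified in the text, so a simple center crop and a z-score over nonzero voxels are assumed.

```python
import numpy as np

def preprocess_modality(volume: np.ndarray, crop=(128, 128, 144)) -> np.ndarray:
    """Center-crop a 240x240x155 MR volume and z-score normalize it (sketch)."""
    # Center crop to the target spatial size.
    starts = [(s - c) // 2 for s, c in zip(volume.shape, crop)]
    cropped = volume[starts[0]:starts[0] + crop[0],
                     starts[1]:starts[1] + crop[1],
                     starts[2]:starts[2] + crop[2]]
    # Z-score normalization over brain voxels (nonzero intensities assumed to be brain).
    brain = cropped[cropped > 0]
    if brain.size > 0:
        cropped = (cropped - brain.mean()) / (brain.std() + 1e-8)
    return cropped
```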
S2, constructing an improved U-Net brain tumor segmentation model based on attention mechanism and multi-scale feature fusion. The improvement is carried out on the basis of a U-Net network structure, and the specific steps are as follows:
S21, replacing the ordinary units in the original U-Net with residual units. At present, image segmentation based on deep learning mainly deepens the network by continuously stacking convolution and pooling layers so that feature information can be extracted more fully. Meanwhile, as the network model deepens, the number of trainable parameters increases and the gradient vanishing problem becomes more serious. To address these problems, the residual modules shown in FIG. 3 (b) and FIG. 3 (c) are adopted to replace the ordinary convolution block shown in FIG. 3 (a); the short-range skip connection inside each residual unit and the long-range skip connections between residual units act together, so that the feature extraction capability of the network is enhanced and its segmentation performance is improved without increasing the network depth.
Compared with an ordinary convolution module, the residual unit adds a BN layer, replaces ReLU with PReLU, replaces ordinary convolution with DO-Conv, and includes an identity-mapping shortcut structure, which allows data to be transmitted between layers without affecting the learning capability of the model and alleviates the drop in model prediction accuracy caused by vanishing gradients. The difference between FIG. 3 (b) and FIG. 3 (c) is that the shortcut structure of (c) contains one extra 1 × 1 DO-Conv, because the input and output feature dimensions of the residual module differ in the decoding stage and the 1 × 1 convolution operation is needed to make the dimensions consistent.
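The following sketch illustrates one possible form of such a residual unit in PyTorch, assuming 3D inputs; standard nn.Conv3d layers stand in for the DO-Conv layers, and the exact kernel sizes and layer ordering are assumptions rather than the patented design.

```python
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    """Pre-activation residual unit sketched from the description above.

    Each 3x3x3 convolution is preceded by BatchNorm and PReLU; a 1x1x1
    convolution is placed on the shortcut whenever the channel count
    changes (as in the decoding stage).  nn.Conv3d is used as a stand-in
    for DO-Conv.
    """

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm3d(in_ch), nn.PReLU(),
            nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm3d(out_ch), nn.PReLU(),
            nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
        )
        # Identity shortcut, or 1x1x1 convolution when dimensions differ.
        self.shortcut = (nn.Identity() if in_ch == out_ch
                         else nn.Conv3d(in_ch, out_ch, kernel_size=1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x) + self.shortcut(x)
```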
S22, adding the multi-scale feature fusion shown in FIG. 4 into the skip connections of U-Net. At present, most convolutional neural networks extract target features through layer-by-layer abstraction; as is well known, shallow features have higher resolution and contain more positional and detail information, but having passed through few convolutions they carry less semantic meaning and more noise, while high-level features carry stronger semantic information but have very low resolution and poor perception of details. In summary, the information of both the deep and shallow layers is essential, and how to fuse it effectively is the key to improving the segmentation model.
The multi-scale feature fusion module is constructed on the basis of an atrous spatial pyramid pooling structure: three 3 × 3 dilated convolutions with sampling rates of 2, 4 and 8, one 1 × 1 ordinary convolution and one max pooling are used to capture multi-scale information; the encoder-stage features are passed through these convolution and pooling branches, the five resulting feature maps are concatenated, and finally a 1 × 1 convolution is applied to keep the dimensionality unchanged. Parallel dilated convolutions with different sampling rates acquire the contextual multi-scale information of the input features and improve the ability to capture global information, while adding the max pooling layer increases the network's invariance to translation, which is critical to improving its generalization capability.
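A sketch of an ASPP-style fusion module with the stated sampling rates is given below; the stride-1 max-pooling window and the channel counts are assumptions made so that the five branches can be concatenated, and ordinary convolutions again stand in for DO-Conv.

```python
import torch
import torch.nn as nn

class MultiScaleFusion(nn.Module):
    """ASPP-style multi-scale feature fusion sketched from the text above.

    Three 3x3x3 dilated convolutions (rates 2, 4, 8), one 1x1x1 convolution
    and one max-pooling branch are applied in parallel to the encoder
    feature map; the five outputs are concatenated and a final 1x1x1
    convolution restores the original channel count.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv3d(channels, channels, 3, padding=r, dilation=r)
            for r in (2, 4, 8)
        ])
        self.branches.append(nn.Conv3d(channels, channels, kernel_size=1))
        self.branches.append(nn.MaxPool3d(kernel_size=3, stride=1, padding=1))
        self.project = nn.Conv3d(5 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [branch(x) for branch in self.branches]   # five same-size maps
        return self.project(torch.cat(feats, dim=1))      # concat, then 1x1x1 conv
```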
S23, adding the attention mechanism shown in FIG. 5 in the decoding stage of the U-Net structure. When U-Net segments brain tumor images, it splices the feature vectors of the expansive path and the contracting path through skip connections and thus obtains better results. However, simple concatenation of image features relies on fixed weights, information is lost as it propagates through the hierarchy, and the segmentation accuracy is limited. The original U-Net model fuses low-level and high-level features using skip connections, but directly concatenating these high-level and low-level feature maps without considering their importance is not the best way to integrate them effectively. In addition, inaccurate or ambiguous information in some feature maps may confuse the network and lead to erroneous tumor segmentation. These problems are solved with a channel attention mechanism: an SE block is employed after each concatenation layer. It uses the semantic information in the high-level feature maps to strengthen the feature weights of the brain tumor region in the low-level feature maps, so that more detail information is retained.
The processing of an SE block is divided into two steps: squeeze (compression) and excitation. First, the squeeze operation is performed: global average pooling (GAP) is applied to a feature map U of input size C × W × H (where (W, H) is the size of the feature map and C is the number of channels), yielding a global compressed feature vector Z of size 1 × 1 × C. The excitation operation passes the result Z of the squeeze operation through a gate mechanism formed by two fully connected layers to obtain the channel weight matrix S of the feature map U; finally the weight matrix S is multiplied by the original feature map U to obtain U'. The specific calculation formulas are as follows:
S = F_se(U) = σ(W2 δ(W1 GAP(U)))
U' = S · U
where W1 ∈ R^((C/r)×C) is the weight matrix of the first fully connected layer, W2 ∈ R^(C×(C/r)) is the weight matrix of the second fully connected layer, r is the scaling parameter (reduction ratio) with a default value of 16, δ is the ReLU activation function, and σ is the sigmoid activation function.
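The channel attention described by these formulas can be sketched as follows; the 3D pooling and the use of two nn.Linear layers are implementation assumptions.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Channel attention (squeeze-and-excitation) matching the formulas above:
    S = sigmoid(W2 * delta(W1 * GAP(U))), U' = S . U, with reduction ratio r = 16.
    Written for 3D feature maps of shape (B, C, D, H, W).
    """

    def __init__(self, channels: int, r: int = 16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool3d(1)              # GAP -> (B, C, 1, 1, 1)
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // r), nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels), nn.Sigmoid(),
        )

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        b, c = u.shape[:2]
        z = self.squeeze(u).view(b, c)                      # global descriptor Z
        s = self.excite(z).view(b, c, 1, 1, 1)              # channel weights S
        return u * s                                        # U' = S . U
```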
Meanwhile, DO-Conv is used in place of ordinary convolution; DO-Conv speeds up the convergence of the network without increasing computational complexity and improves the performance of the brain tumor segmentation network.
DO-Conv is a core component of the DO-UNet network architecture. Unlike conventional convolution, DO-Conv implements an over-parameterized convolution layer by adding an "extra" component, namely a depthwise convolution operation. The depthwise convolution has a trainable kernel D ∈ R^((M×N)×D_mul×C_in) and the ordinary convolution has a trainable kernel W ∈ R^(D_mul×C_out×C_in), where M and N are the spatial dimensions of the convolution kernel, C_in is the number of input channels, C_out is the number of output channels, and D_mul is the depth multiplier of the depthwise convolution. First, the depthwise convolution operator "∘" is applied to the depthwise kernel D and the input feature P to obtain the transformed feature P' = D ∘ P; then the conventional convolution operator "∗" is applied to the ordinary convolution kernel W and the feature P', giving the transformed output feature O = W ∗ P'.
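As an illustration of this decomposition, the sketch below keeps D and W as separate trainable tensors and composes them into a single effective kernel before each convolution (the kernel-composition view of DO-Conv); the tensor layouts and the near-identity initialization of D are assumptions, not the patent's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DOConv3d(nn.Module):
    """Sketch of a depthwise over-parameterized 3D convolution (DO-Conv).

    During training, the extra depthwise kernel D and the conventional
    kernel W are composed into one effective kernel, so inference cost
    matches an ordinary convolution.
    """

    def __init__(self, in_ch: int, out_ch: int, k: int = 3, padding: int = 1, d_mul=None):
        super().__init__()
        self.k, self.padding = k, padding
        kv = k * k * k                                    # number of kernel positions
        d_mul = d_mul if d_mul is not None else kv
        # W: (C_out, C_in, D_mul), D: (C_in, D_mul, k^3)
        self.W = nn.Parameter(torch.randn(out_ch, in_ch, d_mul) * 0.02)
        d_init = torch.zeros(in_ch, d_mul, kv)
        idx = torch.arange(min(d_mul, kv))
        d_init[:, idx, idx] = 1.0                         # start close to plain conv
        self.D = nn.Parameter(d_init)
        self.bias = nn.Parameter(torch.zeros(out_ch))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Compose D and W into one kernel of shape (C_out, C_in, k, k, k).
        weight = torch.einsum("oid,idv->oiv", self.W, self.D)
        weight = weight.view(weight.shape[0], weight.shape[1], self.k, self.k, self.k)
        return F.conv3d(x, weight, self.bias, padding=self.padding)
```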
S3, constructing a mixed loss function, training the improved U-Net brain tumor segmentation model, and storing the optimal model
Since gliomas generally occupy only about 2% of the whole brain, the brain tumor data set is extremely unbalanced. Convolutional neural networks are very sensitive to data imbalance: during training the network is biased toward the classes with larger sample sizes, which greatly reduces the accuracy of the algorithm. To deal with the class imbalance of brain tumor images, a mixed loss function is adopted.
The mixed loss function comprises a generalized Dice loss (GDL) function and a categorical cross-entropy loss function and is calculated as:
L = L_g + λL_c
where λ is a hyper-parameter controlling the balance between L_g and L_c, set to 1.25 in this experiment.
The GDL function is a multi-class loss function that assigns an adaptive weight to each class to handle the class imbalance problem. The GDL function is calculated as:
L_g = 1 - 2·(Σ_{j=1..L} W_j Σ_{i=1..N} g_ij·p_ij) / (Σ_{j=1..L} W_j Σ_{i=1..N} (g_ij + p_ij) + ε)
where L is the number of tumor class labels, N is the number of pixels, ε is a smoothing term set to prevent calculation errors caused by a zero denominator, W_j denotes the weight of the j-th class, g_ij is the ground-truth label of the j-th class at the i-th pixel, and p_ij is the model's prediction at the corresponding position.
The multi-class cross-entropy loss function is calculated as:
L_c = -(1/N)·Σ_{i=1..N} Σ_{j=1..C} g_ij·log(p_ij)
where N is the number of all samples, C is the number of all labels, g_ij is the ground-truth value of the j-th element of the i-th sample, and p_ij is the predicted value of the j-th element of the i-th sample.
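A sketch of this mixed loss is given below; the inverse squared class-volume weights W_j are the usual generalized Dice choice and are assumed here, as the text does not state how the adaptive weights are computed.

```python
import torch
import torch.nn.functional as F

def mixed_loss(logits: torch.Tensor, target: torch.Tensor,
               lam: float = 1.25, eps: float = 1e-5) -> torch.Tensor:
    """Hybrid loss L = L_g + lambda * L_c sketched from the formulas above.

    `logits` has shape (B, C, D, H, W); `target` holds integer class indices
    of shape (B, D, H, W).
    """
    num_classes = logits.shape[1]
    probs = torch.softmax(logits, dim=1)
    onehot = F.one_hot(target, num_classes).permute(0, 4, 1, 2, 3).float()

    # Generalized Dice loss with adaptive per-class weights W_j (assumed 1 / volume^2).
    dims = (0, 2, 3, 4)
    w = 1.0 / (onehot.sum(dims) ** 2 + eps)
    intersect = (w * (probs * onehot).sum(dims)).sum()
    union = (w * (probs + onehot).sum(dims)).sum()
    l_g = 1.0 - 2.0 * intersect / (union + eps)

    # Multi-class cross-entropy L_c.
    l_c = F.cross_entropy(logits, target)
    return l_g + lam * l_c
```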
And S4, predicting by using the optimal model, storing a prediction result, performing online verification, obtaining an evaluation index, and finally comparing the results.
The BraTS2018 validation set is input into the network, segmentation prediction is performed with the trained model, and the segmentation prediction map of each patient is stored; the official website is then logged into, the online validation link for the BraTS2018 validation set is located, the validation-set segmentation results are uploaded through that link for online validation, and finally a table of prediction results for all patients in the validation set is obtained.
The Dice similarity coefficient illustrated in FIG. 6 is used as the performance evaluation index to quantitatively evaluate the segmentation results of the model shown in FIG. 7; Dice measures the degree of overlap between the segmentation region produced by the model and the real segmentation region of the label, ranges over [0,1], and the larger the value, the closer the tumor segmentation result is to the annotation and the better the segmentation effect; it is defined as:
Dice = 2|P_1 ∩ T_1| / (|P_1| + |T_1|)
where ∩ is the logical AND operator, |·| is the size of a set, and P_1 and T_1 represent the sets of voxels with P = 1 and T = 1, respectively.
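For reference, the Dice coefficient defined above can be computed for one region as follows (a minimal sketch; the epsilon guard for empty masks is an implementation assumption).

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-8) -> float:
    """Dice = 2 |P_1 ∩ T_1| / (|P_1| + |T_1|) for binary masks, as defined above.

    `pred` and `truth` are boolean (or 0/1) volumes for one of the three
    evaluation regions (WT, TC or ET).
    """
    p = pred.astype(bool)
    t = truth.astype(bool)
    intersection = np.logical_and(p, t).sum()
    return float(2.0 * intersection / (p.sum() + t.sum() + eps))
```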
To show more intuitively how adding the different modules influences the segmentation results, an ablation experiment is carried out on the BraTS2018 data set: the different modules are added to the U-Net model in turn and trained with the same parameters. The specific results are listed in Table 1, where Res denotes the residual module, DO-Conv the depthwise over-parameterized convolution, ASPP the multi-scale feature fusion module and SE the attention mechanism module.
TABLE 1 comparison of segmentation results of U-Net with different added modules
(Table 1 is reproduced as an image in the original publication; it lists the Dice scores of ET, WT and TC for each combination of the Res, DO-Conv, ASPP and SE modules.)
As can be seen from Table 1, after the residual module is added to the U-Net model the segmentation performance improves, with the Dice of ET, WT and TC rising by 1%, 1% and 2%, respectively; after the multi-scale feature fusion module ASPP is then added, only WT improves further, by 2%; when the convolution mode is changed and all ordinary convolutions in the model are replaced by DO-Conv, the Dice of ET, WT and TC rise by a further 2%, 1% and 2%, respectively; finally, after the attention mechanism module is added, the Dice coefficients of the model are optimal in all three segmentation regions. The experimental results show that each module of the proposed network structure improves the brain tumor segmentation accuracy, and that the method can effectively improve the segmentation accuracy of MRI brain tumor images and has good segmentation performance.

Claims (10)

1. An improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion is characterized by comprising the following steps:
s1, data acquisition and data preprocessing;
s2, constructing an improved U-Net brain tumor segmentation model based on attention mechanism and multi-scale feature fusion;
s3, constructing a mixed loss function, training the improved U-Net brain tumor segmentation model, and storing the optimal model;
and S4, predicting by using the optimal model, storing a prediction result, performing online verification, obtaining an evaluation index, and finally comparing the results.
2. The improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion of claim 1, characterized in that: in step S1, the experimental data set adopts the brain glioma public data set BraTS2018 provided for the Medical Image Computing and Computer Assisted Intervention (MICCAI) conference, in which the labeled data are divided into a tumor enhancement area, an edema area, a necrosis area and a background; the brain tumor data are preprocessed, and noise signals are removed from the multi-modal MRI brain tumor image data.
3. The improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion of claim 1, characterized in that: step S2 makes improvements on the basis of the U-Net network structure, specifically:
S21, replacing the ordinary units in the original U-Net with residual units;
S22, adding a multi-scale feature fusion mechanism in the skip connections, and concatenating the obtained result with the up-sampled high-level features;
S23, adding an attention mechanism in the decoding stage of the U-Net structure.
4. The improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion of claim 3, characterized in that: in step S21, replacing the ordinary units in the original U-Net with residual units means adding a batch normalization layer before each convolution layer in the ordinary convolution module, replacing the ReLU activation function with the PReLU activation function, and then adding a shortcut connection to prevent the gradient from vanishing; one extra 1 × 1 convolution is added in the decoding stage of the U-Net model to keep the dimensions consistent.
5. The improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion of claim 3, characterized in that: in step S22, multi-scale feature fusion is added to the skip connections of U-Net; the specific structure adopts an improved atrous spatial pyramid pooling module that uses three dilated convolutions with sampling rates of 2, 4 and 8, one ordinary convolution and one max pooling, applies these convolution and pooling operations to the encoder-stage features, and fuses the five resulting feature maps, thereby obtaining more detailed features.
6. The improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion of claim 3, characterized in that: the attention mechanism in step S23 uses channel attention, comprising a squeeze (compression) operation and an excitation operation; in the squeeze operation, global average pooling (GAP) is performed on a feature map U of input size C × W × H (where (W, H) is the size of the feature map and C is the number of channels) to obtain a global compressed feature vector Z of size 1 × 1 × C; the excitation operation passes the result Z obtained by the squeeze operation through a gate mechanism formed by two fully connected layers to obtain the channel weight matrix S of the feature map U, and finally the weight matrix S is multiplied by the original feature map U to obtain U'; the specific calculation formulas are as follows:
S = F_se(U) = σ(W2 δ(W1 GAP(U)))
U' = S · U
where W1 ∈ R^((C/r)×C) is the weight matrix of the first fully connected layer, W2 ∈ R^(C×(C/r)) is the weight matrix of the second fully connected layer, r is the scaling parameter (reduction ratio) with a default value of 16, δ is the ReLU activation function, and σ is the sigmoid activation function.
7. The improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion of claim 3, characterized in that: all convolutions in the U-Net network use depthwise over-parameterized convolution (DO-Conv) instead of ordinary convolution.
8. The improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion of claim 1, characterized in that: the mixed loss function in step S3 comprises a generalized Dice loss (GDL) function and a categorical cross-entropy loss function and is calculated as:
L = L_g + λL_c
where λ is a hyper-parameter controlling the balance between L_g and L_c, set to 1.25 in this experiment;
the GDL function is a multi-class loss function that assigns an adaptive weight to each class to handle the class imbalance problem; the GDL function is calculated as:
L_g = 1 - 2·(Σ_{j=1..L} W_j Σ_{i=1..N} g_ij·p_ij) / (Σ_{j=1..L} W_j Σ_{i=1..N} (g_ij + p_ij) + ε)
where L is the number of tumor class labels, N is the number of pixels, ε is a smoothing term set to prevent calculation errors caused by a zero denominator, W_j denotes the weight of the j-th class, g_ij is the ground-truth label of the j-th class at the i-th pixel, and p_ij is the model's prediction at the corresponding position;
the multi-class cross-entropy loss function is calculated as:
L_c = -(1/N)·Σ_{i=1..N} Σ_{j=1..C} g_ij·log(p_ij)
where N is the number of all samples, C is the number of all labels, g_ij is the ground-truth value of the j-th element of the i-th sample, and p_ij is the predicted value of the j-th element of the i-th sample.
9. The improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion of claim 1, characterized in that: in step S4, the BraTS2018 validation set is input into the network, segmentation prediction is performed with the trained model, and the segmentation prediction map of each patient is stored; the official website is then logged into, the online validation link for the BraTS2018 validation set is located, the validation-set segmentation results are uploaded through that link for online validation, and finally a table of prediction results for all patients in the validation set is obtained.
10. The improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion of claim 9, characterized in that: the Dice similarity coefficient is adopted as the performance evaluation index to quantitatively evaluate the segmentation results of the model; Dice measures the degree of overlap between the segmentation region produced by the model and the real segmentation region of the label, ranges over [0,1], and the larger the value, the closer the tumor segmentation result is to the annotation and the better the segmentation effect; it is defined as:
Dice = 2|P_1 ∩ T_1| / (|P_1| + |T_1|)
where ∩ is the logical AND operator, |·| is the size of a set, and P_1 and T_1 represent the sets of voxels with P = 1 and T = 1, respectively.
CN202210990335.7A 2022-08-18 2022-08-18 Improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion Pending CN115424103A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210990335.7A CN115424103A (en) 2022-08-18 2022-08-18 Improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210990335.7A CN115424103A (en) 2022-08-18 2022-08-18 Improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion

Publications (1)

Publication Number Publication Date
CN115424103A true CN115424103A (en) 2022-12-02

Family

ID=84197971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210990335.7A Pending CN115424103A (en) 2022-08-18 2022-08-18 Improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion

Country Status (1)

Country Link
CN (1) CN115424103A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116030347A (en) * 2023-01-06 2023-04-28 山东建筑大学 High-resolution remote sensing image building extraction method based on attention network
CN116152278A (en) * 2023-04-17 2023-05-23 杭州堃博生物科技有限公司 Medical image segmentation method and device and nonvolatile storage medium
CN116594061A (en) * 2023-07-18 2023-08-15 吉林大学 Seismic data denoising method based on multi-scale U-shaped attention network
CN117078692A (en) * 2023-10-13 2023-11-17 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Medical ultrasonic image segmentation method and system based on self-adaptive feature fusion
CN117392392A (en) * 2023-12-13 2024-01-12 河南科技学院 Rubber cutting line identification and generation method

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116030347A (en) * 2023-01-06 2023-04-28 山东建筑大学 High-resolution remote sensing image building extraction method based on attention network
CN116030347B (en) * 2023-01-06 2024-01-26 山东建筑大学 High-resolution remote sensing image building extraction method based on attention network
CN116152278A (en) * 2023-04-17 2023-05-23 杭州堃博生物科技有限公司 Medical image segmentation method and device and nonvolatile storage medium
CN116594061A (en) * 2023-07-18 2023-08-15 吉林大学 Seismic data denoising method based on multi-scale U-shaped attention network
CN116594061B (en) * 2023-07-18 2023-09-22 吉林大学 Seismic data denoising method based on multi-scale U-shaped attention network
CN117078692A (en) * 2023-10-13 2023-11-17 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Medical ultrasonic image segmentation method and system based on self-adaptive feature fusion
CN117078692B (en) * 2023-10-13 2024-02-06 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Medical ultrasonic image segmentation method and system based on self-adaptive feature fusion
CN117392392A (en) * 2023-12-13 2024-01-12 河南科技学院 Rubber cutting line identification and generation method
CN117392392B (en) * 2023-12-13 2024-02-13 河南科技学院 Rubber cutting line identification and generation method

Similar Documents

Publication Publication Date Title
Zhang et al. Automatic segmentation of acute ischemic stroke from DWI using 3-D fully convolutional DenseNets
CN115424103A (en) Improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion
Ma et al. Multiple sclerosis lesion analysis in brain magnetic resonance images: techniques and clinical applications
CN112674720B (en) Alzheimer disease pre-judgment method based on 3D convolutional neural network
CN113888555A (en) Multi-modal brain tumor image segmentation system based on attention mechanism
Li et al. Brain tumor segmentation using 3D generative adversarial networks
Zhao et al. Effective Combination of 3D-DenseNet's Artificial Intelligence Technology and Gallbladder Cancer Diagnosis Model
Sadeghibakhi et al. Multiple sclerosis lesions segmentation using attention-based CNNs in FLAIR images
Yuan et al. ResD-Unet research and application for pulmonary artery segmentation
Dhanagopal et al. An efficient retinal segmentation-based deep learning framework for disease prediction
Khalifa et al. Deep Learning for Image Segmentation: A Focus on Medical Imaging
CN117036288A (en) Tumor subtype diagnosis method for full-slice pathological image
Ait Mohamed et al. Hybrid method combining superpixel, supervised learning, and random walk for glioma segmentation
Cui et al. Automatic Segmentation of Kidney Volume Using Multi-Module Hybrid Based U-Shape in Polycystic Kidney Disease
Pallawi et al. Study of Alzheimer’s disease brain impairment and methods for its early diagnosis: a comprehensive survey
Zhang et al. ETUNet: Exploring efficient transformer enhanced UNet for 3D brain tumor segmentation
Feng et al. DAUnet: A U-shaped network combining deep supervision and attention for brain tumor segmentation
Rezaee et al. SkinNet: A Hybrid Convolutional Learning Approach and Transformer Module Through Bi-directional Feature Fusion
Atiyah et al. Segmentation of human brain gliomas tumour images using u-net architecture with transfer learning
Soh et al. HUT: Hybrid UNet transformer for brain lesion and tumour segmentation
Zequan et al. Brain Tumor MRI Image Segmentation Via Combining Pyramid Convolution and Attention Gate
CN115619810B (en) Prostate partition segmentation method, system and equipment
Dhanagopal et al. Research Article An Efficient Retinal Segmentation-Based Deep Learning Framework for Disease Prediction
He et al. Automatic aid diagnosis report generation for lumbar disc MR image based on lightweight artificial neural networks
CN117372689A (en) MRI brain tumor segmentation method based on triple high-efficiency attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination