CN111681252B - Medical image automatic segmentation method based on multipath attention fusion - Google Patents
Medical image automatic segmentation method based on multipath attention fusion
- Publication number: CN111681252B
- Application number: CN202010479507.5A
- Authority: CN (China)
- Legal status: Active
Classifications
- G06T7/11 — Region-based segmentation (G06T7/10 Segmentation; Edge detection)
- G06N3/045 — Combinations of networks
- G06N3/048 — Activation functions
- G06N3/08 — Learning methods
- G06T2207/10081 — Computed x-ray tomography [CT]
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30061 — Lung
- G06T2207/30088 — Skin; Dermal
Abstract
The invention belongs to the technical field of medical image processing and computer vision, and relates to a medical image automatic segmentation method based on multi-path attention fusion. The method comprises: acquiring a medical image data set, dividing it into a training set and a verification set, augmenting the images in the training set, and normalizing the augmented training images and the verification images; inputting the training images into a multi-path attention fusion network model and training under the guidance of a cross-entropy loss function to obtain segmentation result maps; selecting the model with the highest accuracy on the verification set, loading it into the multi-path attention fusion network, and inputting the test set to obtain the segmentation result maps of the images. The invention addresses two shortcomings of existing networks in medical image segmentation that lead to poor segmentation results: the encoder cannot effectively improve feature quality at different scales, and it is difficult to control the inter-layer dependence between the low-level structural features and high-level semantic features of the network.
Description
Technical Field
The invention belongs to the technical field of medical image processing and computer vision, and particularly relates to a medical image automatic segmentation method based on multipath attention fusion.
Background
Medical images play a key role in medical treatment and diagnosis. The goal of computer-aided diagnosis (CAD) systems is to provide physicians with an accurate interpretation of medical images so that a large number of patients can be treated better. Moreover, automated processing of medical images reduces the time, cost and error of human-based processing. One of the main areas of research in this field is medical image segmentation, which is a key step in many medical imaging studies.
Like other research domains in computer vision, deep learning networks achieve excellent results in medical imaging and outperform non-deep techniques. Deep neural networks are mainly used for classification tasks, where the network outputs the probability of a single label, or of a set of labels, for a given input image. These networks benefit from structural advances such as improved activation functions, efficient optimization algorithms and regularization techniques. Given their large number of parameters, these networks require a large amount of training data to achieve good generalization behavior.
The fully convolutional network (FCN) was one of the earliest deep networks applied to image segmentation. U-Net extends the FCN framework with a standard encoder-decoder structure and has become one of the most popular architectures for medical images; given sufficient training data, such deep networks achieve good segmentation results. The U-Net consists of an encoding path, in which feature maps of successively reduced size are extracted from the input data, and a decoding path, which generates a segmentation map of the same size as the input by up-convolution. Its most important modification concerns the skip connections: feature maps extracted in the encoder are fed to the decoder and concatenated with the corresponding decoder features. However, U-Net has several shortcomings. In its hierarchical conversions, the learning processes at different pooling levels share the same data path, so the generated multi-scale feature maps may not separate the target as well as expected, and the encoder may lose part of the spatial information because of the pooling layers. A single U-Net uses two simple consecutive 3 × 3 convolutional layers, and such standard convolutional layers struggle to preserve rich spatial information. A further disadvantage is that U-Net uses simple skip connections that merely splice encoder features with decoder features, so the features output by the decoder contain redundancy, which affects the final segmentation result.
Disclosure of Invention
In order to improve the segmentation performance of medical images, the invention provides a medical image automatic segmentation method based on multi-path attention fusion, which comprises the following steps:
s1, acquiring a medical picture data set, dividing the data set into a training set and a verification set, augmenting the pictures in the training set, and normalizing the augmented training-set pictures and the verification-set pictures;
s2, inputting the pictures in the training set into the multi-path attention fusion network model, and outputting under the guidance of the cross entropy loss function to obtain a segmentation result graph;
s3, verifying the accuracy of the multi-path attention fusion network model after each iterative training by using verification set data, and taking the network parameter with the highest accuracy as the network parameter of the multi-path attention fusion network model;
and S4, inputting the image data which is subjected to the normalization processing and needs to be segmented into the multipath attention fusion network model to obtain a segmentation result graph.
Further, the process of augmenting the training set picture includes:
rotating the pictures in the training set by 10 degrees, 20 degrees, -10 degrees and-20 degrees, and storing the rotated pictures;
turning the pictures in the training set up and down and left and right, and storing the turned pictures;
performing elastic transformation on the pictures in the training set, and storing the pictures after the elastic transformation;
scaling the pictures in the training set by a random factor in the (20%, 80%) range, and storing the scaled pictures;
taking the original pictures in the training set together with the pictures produced by the above processing as the training set, completing the augmentation.
Further, the multi-path attention fusion network model comprises a multi-path encoder, an attention fusion module and a decoder with reconstructed upsampling, wherein:
the attention fusion module takes two features as input each time; the two features are concatenated and passed through a convolution operation to obtain a combined feature A. Feature A is then processed sequentially by a convolution operation, a ReLU activation function and a second convolution operation, followed by a sigmoid, yielding a feature map of dimension 1 × 1 × C, where C is the number of channels of the feature. This feature map is multiplied with feature A to selectively filter it, and the selectively filtered feature is summed with feature A to obtain the final output feature;
the decoder with reconstruction upsampling comprises three reconstruction-upsampling layers. In the first layer, the bottommost feature of the first path of the multi-path encoder is upsampled, spliced with the feature output by the third attention fusion module, and the spliced feature is input into a decoding module. In the second layer, the feature output by the first decoding module is upsampled, spliced with the feature output by the second attention fusion module, and input into a decoding module. In the third layer, the feature output by the second decoding module is upsampled, spliced with the feature output by the first attention fusion module, and input into a decoding module. Finally, a 1 × 1 convolution operation and a sigmoid activation function are applied to the feature output by the third decoding module to obtain the final segmentation result map.
Further, the operations performed in the residual module include:
201: inputting a feature map of size h × w × c into a 3 × 3 convolutional layer;
202: inputting the convolution result of 201 into standard normalization and a ReLU activation function;
203: inputting the result of 202 into a 3 × 3 convolutional layer, standard normalization and a ReLU activation function;
204: inputting the result of 203 into a 3 × 3 convolutional layer;
205: summing the convolution result of 204 and the convolution result of 201;
206: inputting the result of 205 into standard normalization and a ReLU activation function to obtain a feature of size h × w × c;
where h denotes the height of the feature map, w denotes the width of the feature map, and c denotes the number of channels of the feature map.
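Steps 201-206 can be sketched in plain NumPy. The kernels `k1`-`k3` are illustrative placeholders, and a simple per-map standardization stands in for learned batch normalization; this is not the patent's exact trained layer, only a shape-faithful sketch:

```python
import numpy as np

def conv3x3(x, k):
    """'Same'-padded 3x3 convolution; x: (h, w, c_in), k: (3, 3, c_in, c_out)."""
    h, w, _ = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, w, k.shape[-1]))
    for i in range(3):
        for j in range(3):
            out += np.einsum('hwc,co->hwo', xp[i:i + h, j:j + w], k[i, j])
    return out

def bn_relu(x):
    """Stand-in for standard normalization + ReLU on one feature map."""
    x = (x - x.mean(axis=(0, 1))) / (x.std(axis=(0, 1)) + 1e-5)
    return np.maximum(x, 0.0)

def residual_block(x, k1, k2, k3):
    """Steps 201-206: conv (201), BN+ReLU (202), conv+BN+ReLU (203),
    conv (204), sum with the first conv's output (205), BN+ReLU (206)."""
    y1 = conv3x3(x, k1)
    y = bn_relu(y1)
    y = bn_relu(conv3x3(y, k2))
    y = conv3x3(y, k3)
    return bn_relu(y1 + y)  # output keeps the h x w x c shape
```

Note that the identity shortcut is taken from the output of the first convolution (step 201), so input and output shapes match whenever the three kernels preserve the channel count.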
Further, the operations at the attention fusion module include:
211: for a feature M of resolution h × w × c from the multi-path encoder and a feature F of resolution h/2 × w/2 × 2c output by the previous attention fusion module, first perform an upsampling operation on feature F so that its resolution becomes h × w, then splice the upsampled F with M and input the spliced feature into a 3 × 3 convolutional layer;
212: inputting the convolution result of 211 into standard normalization and a ReLU activation function;
213: inputting the result of 212 into a global average pooling function to obtain a feature map of dimension 1 × 1 × c, where c is the number of channels of the feature;
214: inputting the result of 213 sequentially into a 1 × 1 convolutional layer, a ReLU activation function and a second 1 × 1 convolutional layer, and finally into a sigmoid function, obtaining a feature map of dimension 1 × 1 × c;
215: multiplying the feature map of 214 with the convolution result of 211 to obtain the selectively filtered feature;
216: summing the result of 215 and the convolution result of 211 to obtain the final output feature.
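A NumPy sketch of steps 211-216. As an assumption for brevity, a 1 × 1 channel-mixing matrix `w_mix` stands in for the 3 × 3 convolution of step 211, and `w1`/`w2` are illustrative 1 × 1 projection weights:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of an (h, w, c) feature map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def attention_fuse(m, f, w_mix, w1, w2):
    """Steps 211-216: splice the encoder feature m (h x w x c) with the
    upsampled previous fusion output f (h/2 x w/2 x 2c), mix channels
    with ReLU (211-212), squeeze by global average pooling (213), build
    a sigmoid channel gate (214), then gate and add back (215-216)."""
    u = np.maximum(np.concatenate([m, upsample2x(f)], axis=-1) @ w_mix, 0.0)
    s = u.mean(axis=(0, 1))                                    # 1 x 1 x c descriptor
    a = 1.0 / (1.0 + np.exp(-(np.maximum(s @ w1, 0.0) @ w2)))  # gate in (0, 1)
    return u * a + u                                           # filter, then sum
```

Because the gate `a` lies in (0, 1) per channel, the residual addition `u * a + u` can only rescale channels between 1x and 2x, so the fusion never suppresses a channel entirely.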
Further, the operations at the decoder module include:
221: for a feature F of resolution h × w × c from the attention fusion module and a feature D of resolution h/2 × w/2 × c output by the previous decoder module, first perform an upsampling operation on feature D so that its resolution becomes h × w × c, then splice the upsampled D with F, and finally input the spliced feature into a 3 × 3 convolutional layer;
222: inputting the resulting feature of 221 into a 3 × 3 convolutional layer;
223: inputting the convolution result of 222 into standard normalization and a ReLU activation function;
224: inputting the result of 223 into a 3 × 3 convolutional layer, standard normalization and a ReLU activation function;
225: inputting the result of 224 into a 3 × 3 convolutional layer;
226: summing the convolution result of 225 and the convolution result of 222;
227: inputting the result of 226 into standard normalization and a ReLU activation function to obtain the h × w × c feature.
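The shape bookkeeping of the decoder module can be sketched as follows; the single 1 × 1 mixing matrix `w` is an illustrative stand-in for the block's 3 × 3 convolutions, normalizations and residual sum, which are assumptions of this sketch rather than the patent's exact layers:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of an (h, w, c) feature map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def decode_block(f, d, w):
    """Steps 221-227, condensed: upsample the coarser decoder feature d
    (h/2 x w/2 x c), splice it with the attention-fusion feature f
    (h x w x c), and project back to c channels with ReLU."""
    x = np.concatenate([f, upsample2x(d)], axis=-1)  # (h, w, 2c)
    return np.maximum(x @ w, 0.0)                    # (h, w, c)
```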
The invention has the beneficial effects that:
1) for medical image segmentation, sufficient training samples are difficult to obtain, so the training data set is augmented by rotation, flipping and elastic transformation, which greatly increases the effective number of training pictures;
2) the invention improves on the single-structure encoder by using a four-path encoding structure, which effectively improves feature quality at different scales and controls the inter-layer dependence between low-level structural features and high-level semantic features;
3) the invention improves the convolution blocks of the standard U-Net network by replacing them with residual modules, which accelerates network training and improves segmentation accuracy;
4) the invention improves on simple skip connections by using an attention fusion mechanism to effectively and progressively accumulate image representations from different semantic levels and then pass the recombined features to the back end of the network; the novel attention-based feature fusion establishes visual associations among different paths and explores skip connections in a data-driven manner;
5) the invention has good robustness and accuracy, performing well on skin cancer lesion segmentation and on lung CT images.
Drawings
FIG. 1 is a flow chart of a medical image automatic segmentation method based on multi-path attention fusion according to the present invention;
FIG. 2 is a schematic diagram of the multi-path attention fusion network structure according to the present invention;
FIG. 3 is a schematic view of an attention fusion structure according to the present invention;
FIG. 4 is a schematic diagram of a test data set of the present invention and the segmentation results obtained by the method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a medical image automatic segmentation method based on multipath attention fusion, as shown in figure 1, comprising the following steps:
s1, acquiring a medical picture data set, dividing the data set into a training set and a verification set, augmenting the pictures in the training set, and normalizing the augmented training-set pictures and the verification-set pictures;
s2, inputting the pictures in the training set into the multi-path attention fusion network model, and outputting under the guidance of the cross entropy loss function to obtain a segmentation result graph;
s3, verifying the accuracy of the multi-path attention fusion network model after each iterative training by using verification set data, and taking the network parameter with the highest accuracy as the network parameter of the multi-path attention fusion network model;
and S4, inputting the image data which is subjected to the normalization processing and needs to be segmented into the multipath attention fusion network model to obtain a segmentation result graph.
Example 1
The images in the medical image data set are divided into a training set and a verification set, the training set is used for training the model, the verification set is used for optimizing various indexes of the model, and for medical image segmentation, enough training samples are not easy to obtain, so that the images in the training set are augmented, and the augmentation operation comprises the following steps:
rotating the pictures in the training set by 10 degrees, 20 degrees, -10 degrees and-20 degrees, and storing the rotated pictures;
turning the pictures in the training set up and down and left and right, and storing the turned pictures;
performing elastic transformation on the pictures in the training set, and storing the pictures after the elastic transformation;
scaling the pictures in the training set by a random factor in the (20%, 80%) range, and storing the scaled pictures;
taking the original pictures in the training set together with the pictures produced by the above processing as the training set, completing the augmentation.
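The flip and scaling augmentations above can be sketched with NumPy alone; rotation by arbitrary angles and elastic transformation would typically use an image library such as scipy.ndimage and are not shown. The function names and the nearest-neighbour `center_zoom` are illustrative stand-ins, not the patent's exact procedure:

```python
import numpy as np

def flip_augment(img):
    """Return up-down and left-right flipped copies of a 2-D image."""
    return [np.flipud(img), np.fliplr(img)]

def center_zoom(img, factor):
    """Crude zoom-in: crop the central (factor*h, factor*w) region and
    resize it back to the original size by nearest-neighbour indexing."""
    h, w = img.shape
    ch, cw = int(h * factor), int(w * factor)   # crop size
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = img[top:top + ch, left:left + cw]
    rows = np.arange(h) * ch // h               # nearest-neighbour map
    cols = np.arange(w) * cw // w
    return crop[rows][:, cols]
```

Each transformed copy is stored alongside the originals, so the effective training-set size grows by one picture per transformation applied.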
The method for normalizing the pictures in the training set and the verification set after augmentation comprises the following steps:
I = (I - M) / Std
where I denotes the image intensity, M denotes the mean of the image data, and Std denotes the standard deviation of the image data.
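A minimal sketch of this normalization (the function name is illustrative):

```python
import numpy as np

def normalize(img):
    """Zero-mean, unit-variance normalization: I = (I - M) / Std,
    where M and Std are the mean and standard deviation of the image."""
    m, std = img.mean(), img.std()
    return (img - m) / std
```

After this step every picture has approximately zero mean and unit standard deviation, which keeps the input scale consistent across the training, verification and test sets.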
Training data are input into the multi-path attention fusion network model, which is trained under the guidance of a cross-entropy loss function to output segmentation result maps. The multi-path attention fusion network model mainly comprises a multi-path encoder, an attention fusion module and a decoder with reconstruction upsampling, wherein:
the multi-path encoder comprises 4 paths with different lengths: the first path comprises 4 residual network modules and 3 maximum pooling operations, the second path comprises 3 residual network modules and 2 maximum pooling operations, the third path comprises 2 residual network modules and 1 maximum pooling operation, and the fourth path comprises only 1 residual network module. As shown in fig. 2, the multi-path encoder includes four paths, i.e., a first, second, third and fourth path from the left side to the right side of the diagram, and a max-pooling operation is performed between two residual blocks in each path;
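The feature-map resolutions along each path follow directly from the block counts; a small sketch, under the assumption of 2 × 2 max-pooling and resolution-preserving ('same'-padded) residual blocks:

```python
def path_resolutions(input_hw, n_res, n_pool):
    """Feature-map sizes along one encoder path: a residual block keeps
    the resolution, and each 2x2 max-pool between blocks halves it."""
    h, w = input_hw
    sizes = []
    for i in range(n_res):
        sizes.append((h, w))
        if i < n_pool:
            h, w = h // 2, w // 2
    return sizes

# the four paths of the multi-path encoder: (residual blocks, poolings)
paths = {1: (4, 3), 2: (3, 2), 3: (2, 1), 4: (1, 0)}
```

For a 256 × 256 input, the first path therefore produces features at 256, 128, 64 and 32 pixels per side, while the fourth path keeps the full resolution throughout.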
two features are input into the attention fusion module each time. As shown in fig. 3, this embodiment concatenates the two features and applies a 3 × 3 convolution to obtain a combined feature A. Feature A is then processed by standard normalization and a ReLU activation function, followed by global average pooling, then passed sequentially through a 1 × 1 convolutional layer, a ReLU activation function and a second 1 × 1 convolutional layer, and finally processed by a sigmoid to obtain a feature map of dimension 1 × 1 × C, where C is the number of channels of the feature. This feature map is multiplied with feature A to selectively filter it, and the selectively filtered feature is summed with feature A to obtain the final output feature. As shown in fig. 2, the invention performs fusion three times: the input of the first fusion is the output of the first path and the output of the second path; the input of the second fusion is the output of the third path and the output of the first-layer attention module; the input of the third fusion is the output of the fourth path and the output of the second-layer attention module;
the decoder with reconstruction upsampling comprises three reconstruction-upsampling layers. In the first layer, the bottommost feature of the first path of the multi-path encoder is upsampled, spliced with the feature output by the third attention fusion module, and the spliced feature is input into a decoding module. In the second layer, the feature output by the first decoding module is upsampled, spliced with the feature output by the second attention fusion module, and input into a decoding module. In the third layer, the feature output by the second decoding module is upsampled, spliced with the feature output by the first attention fusion module, and input into a decoding module. Finally, a 1 × 1 convolution operation and a sigmoid activation function are applied to the feature output by the third decoding module to obtain the final segmentation result map.
The above operations in the attention fusion module can be expressed as:
u_i = C([e_i, S(x_{i-1})])
s_i = P_avg(R(B(u_i)))
a_i = σ(C_{1×1}(R(C_{1×1}(s_i))))
x_i = a_i ⊗ u_i + u_i
wherein u_i denotes the combined feature after splicing and convolution; C(·) denotes a 3 × 3 convolutional layer, S(·) denotes an upsampling operation, and [·,·] denotes a splicing operation; e_i and x_{i-1} denote the feature of the i-th layer encoder and the output of the (i-1)-th attention fusion module, respectively; s_i denotes the channel descriptor after pooling; B(·) and R(·) denote the standard normalization and ReLU activation functions, respectively, and P_avg denotes global average pooling; a_i denotes the channel attention weights, C_{1×1} denotes a 1 × 1 convolutional layer, and σ denotes the sigmoid activation function; x_i denotes the output feature of the module, ⊗ denotes multiplication, and + denotes pixel summation.
During training, the error between the prediction result and the label is computed by a cross-entropy loss function and propagated backwards through the gradient; the Adam optimization algorithm is adopted to update the model parameters, with the learning rate set to 0.0001. The cross-entropy loss function is expressed as:
L = -Σ_i [ y_i log ŷ_i + (1 - y_i) log(1 - ŷ_i) ]
wherein y_i denotes the label of the picture and ŷ_i denotes the prediction result of the picture.
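The binary cross-entropy above can be sketched as follows; the clipping constant `eps` is an assumption added for numerical stability, not part of the formula:

```python
import numpy as np

def bce_loss(y, p, eps=1e-7):
    """Pixel-wise binary cross entropy between a label map y (0/1) and
    a predicted probability map p, averaged over all pixels."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
```

An uninformative prediction of 0.5 everywhere gives a loss of ln 2 ≈ 0.693, while a perfect prediction drives the loss toward zero.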
During testing, the model with the highest accuracy of the verification set is selected, the model is loaded, then the test set is subjected to normalization processing, and the test set is input into a network to obtain a final segmentation result.
Example 2
The segmentation method of embodiment 1 is adopted. The implementation uses the Keras and TensorFlow open-source deep learning libraries, trains on an NVIDIA GeForce RTX 2080 Ti GPU, adopts the Adam optimization algorithm, and sets the learning rate to 0.0001. The 2018 ISIC skin cancer lesion segmentation data set and the LUNA lung CT data set are used.
One data set of this example, provided by the 2018 skin cancer lesion segmentation challenge, contains 2954 skin cancer lesion pictures in total, each of size 700 × 900 with a corresponding segmentation label map. 1815 pictures are used as the training set, 59 pictures as the verification set, and the remaining 520 pictures as the test set. To facilitate network training, all pictures are resized to 256 × 256. The test data are shown in fig. 4: the first row is the original data and the second row the corresponding labels; the first and second columns are from the LUNA data set, and the third and fourth columns from the ISIC data set.
In this example, four evaluation indexes are used: F1-score, Accuracy, Sensitivity and Specificity; the larger these indexes, the more accurate the segmentation. F1-score is used to assess the lesion area. As can be seen from Table 1, on the 2018 ISIC skin cancer lesion segmentation data set, compared with U-Net, R2-Unet, BCD-Unet and U-Net++, the present method improves the main indexes F1-score, Accuracy and Sensitivity while using fewer parameters; F1-score, the most important index, is improved by 2.98% over the U-Net method.
The fourth row of fig. 4 shows the result of the inventive network segmentation.
TABLE 1 Comparison of experimental results of this method and other methods on the ISIC 2018 data set

| Method | F1-score | Accuracy | Sensitivity | Specificity | Parameters (M) |
| --- | --- | --- | --- | --- | --- |
| U-Net | 0.8607 | 0.9417 | 0.8092 | 0.9796 | 9 |
| R2-Unet | 0.8740 | 0.9479 | 0.8104 | 0.9873 | 17.6 |
| BCD-Unet | 0.8637 | 0.9444 | 0.7822 | 0.9878 | 20.6 |
| U-Net++ | 0.8756 | 0.9472 | 0.8343 | 0.9795 | 9 |
| The present method | 0.8905 | 0.9527 | 0.8649 | 0.9779 | 4.3 |
Example 3
The segmentation method of embodiment 1 is used. Unlike embodiment 2, this embodiment uses the LUNA data set provided by the 2017 Kaggle lung nodule competition, which comprises 730 pictures and 730 corresponding segmentation label maps; each picture is 512 × 512 pixels. 70% of the pictures are used as the training set, 10% as the verification set, and the remaining 20% of the data as the test set.
Because the data are few, the training data set is augmented using rotation, flipping, elastic transformation and similar techniques, so that the network achieves good robustness and segmentation accuracy.
Four evaluation indexes are used: F1-score, Accuracy, Sensitivity and Specificity; the larger these indexes, the more accurate the segmentation. As can be seen from Table 2, on the LUNA data set, compared with U-Net, R2-Unet, BCD-Net and U-Net++, the present method improves the main indexes F1-score, Accuracy, Sensitivity and Specificity while using fewer parameters.
TABLE 2 Comparison of the results of this method and other methods on the LUNA data set

| Method | F1-score | Accuracy | Sensitivity | Specificity | Parameters (M) |
| --- | --- | --- | --- | --- | --- |
| U-Net | 0.9658 | 0.9872 | 0.9696 | 0.9872 | 9 |
| R2-Unet | 0.9823 | 0.9918 | 0.9832 | 0.9944 | 17.6 |
| BCD-Net | 0.9904 | 0.9972 | 0.9910 | 0.9982 | 20.6 |
| U-Net++ | 0.9899 | 0.9971 | 0.9942 | 0.9975 | 9 |
| The present method | 0.9924 | 0.9978 | 0.9944 | 0.9984 | 4.3 |
From the above table it can be seen that the present invention is effective and has significant advantages over the prior art methods of the same type.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (6)
1. A medical image automatic segmentation method based on multipath attention fusion is characterized by comprising the following steps:
s1, acquiring a medical picture data set, dividing the data set into a training set and a verification set, augmenting the pictures in the training set, and normalizing the augmented training-set pictures and the verification-set pictures;
s2, inputting the pictures in the training set into a multipath attention fusion network model, and outputting the pictures under the guidance of a cross entropy loss function to obtain a segmentation result graph, wherein the multipath attention fusion network model comprises a multipath encoder, an attention fusion module and a decoder with reconstruction upsampling, and the method comprises the following steps:
the multi-path encoder comprises 4 paths with different lengths, wherein the first path comprises 4 residual network modules and 3 maximum pooling operations, the second path comprises 3 residual network modules and 2 maximum pooling operations, the third path comprises 2 residual network modules and 1 maximum pooling operation, and the fourth path comprises 1 residual network module; wherein the operations performed in the residual module include:
201: inputting a feature map of size h × w × c into a 3 × 3 convolutional layer;
202: inputting the convolution result of 201 into standard normalization and a ReLU activation function;
203: inputting the result of 202 into a 3 × 3 convolutional layer, standard normalization and a ReLU activation function;
204: inputting the result of 203 into a 3 × 3 convolutional layer;
205: summing the convolution result of 204 and the convolution result of 201;
206: inputting the result of 205 into standard normalization and a ReLU activation function to obtain a feature of size h × w × c;
wherein h denotes the height of the feature map, w denotes the width of the feature map, and c denotes the number of channels of the feature map;
the attention fusion module takes two features as input at a time; the two features are concatenated and passed through a convolution operation to obtain a combined feature A; a convolution operation, a ReLU activation function and a further convolution operation are applied to the combined feature A in sequence, followed by a sigmoid, to obtain a feature map of dimension 1 × 1 × C, where C is the number of channels of the feature; this feature map is multiplied by the feature A to selectively filter it, and the selectively filtered feature is summed with the feature A to obtain the final output feature;
the decoder with reconstruction upsampling comprises three layers of reconstruction upsampling; the first layer upsamples the bottommost feature of the first path of the multipath encoder, splices it with the feature output by the third-layer attention fusion module, and inputs the spliced feature into a decoding module; the second layer upsamples the feature output by the first-layer decoding module, splices it with the feature output by the second-layer attention fusion module, and inputs the result into a decoding module; the third layer upsamples the feature output by the second-layer decoding module, splices it with the feature output by the first-layer attention fusion module, and inputs the result into a decoding module; finally, a 1 × 1 convolution operation and a sigmoid activation function are applied to the feature output by the third-layer decoding module to obtain the final segmentation result map;
s3, verifying the accuracy of the multipath attention fusion network model on the verification set data after each training iteration, and taking the network parameters with the highest accuracy as the parameters of the multipath attention fusion network model;
s4, inputting normalized image data to be segmented into the multipath attention fusion network model to obtain a segmentation result map.
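The residual module recited in steps 201-206 of claim 1 can be sketched as follows. This is a minimal PyTorch sketch under assumptions, not the patented implementation: "standard normalization" is read as batch normalization, padding of 1 is assumed so the 3 × 3 convolutions preserve the h × w resolution, and the class name `ResidualModule` is a placeholder.

```python
import torch
import torch.nn as nn


class ResidualModule(nn.Module):
    """Residual module of steps 201-206: the skip connection sums the
    output of the first 3x3 convolution (step 201) with the output of
    the third (step 204), keeping the h x w x c shape throughout."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)  # step 201
        self.bn1 = nn.BatchNorm2d(channels)                       # step 202
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)  # step 203
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)  # step 204
        self.bn3 = nn.BatchNorm2d(channels)                       # step 206
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y1 = self.conv1(x)                       # 201
        y = self.relu(self.bn1(y1))              # 202
        y = self.relu(self.bn2(self.conv2(y)))   # 203
        y = self.conv3(y)                        # 204
        y = y + y1                               # 205: sum with result of 201
        return self.relu(self.bn3(y))            # 206
```

Note that, unlike a textbook residual block, the skip here starts after the first convolution rather than at the module input, following the literal wording of step 205.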
2. The method for automatic segmentation of medical images based on multi-path attention fusion as claimed in claim 1, wherein the augmenting process of the training set picture comprises:
rotating the pictures in the training set by 10 degrees, 20 degrees, -10 degrees and -20 degrees, and storing the rotated pictures;
flipping the pictures in the training set vertically and horizontally, and storing the flipped pictures;
performing elastic transformation on the pictures in the training set, and storing the elastically transformed pictures;
scaling the pictures in the training set by factors within the (20%, 80%) range, and storing the scaled pictures;
taking the original pictures in the training set together with the pictures obtained by the above processing as the training set, completing the augmentation.
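The augmentation of claim 2 can be sketched with NumPy and SciPy as below. This is an illustrative sketch under assumptions: the elastic transformation is omitted for brevity, and the three scale factors are arbitrary picks inside the (20%, 80%) range the claim specifies.

```python
import numpy as np
from scipy import ndimage


def augment(img: np.ndarray) -> list:
    """Return the original picture plus its augmented copies:
    rotations of +/-10 and +/-20 degrees, vertical and horizontal
    flips, and scaled versions (elastic transform omitted here)."""
    out = [img]
    # rotations by 10, 20, -10, -20 degrees, keeping the original size
    for angle in (10, 20, -10, -20):
        out.append(ndimage.rotate(img, angle, reshape=False, mode="nearest"))
    out.append(np.flipud(img))   # vertical flip
    out.append(np.fliplr(img))   # horizontal flip
    # example scale factors inside the (20%, 80%) range of the claim
    for s in (0.2, 0.5, 0.8):
        out.append(ndimage.zoom(img, s))
    return out
```

In practice the augmented copies would be written to disk alongside the originals, as the claim describes.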
3. The method for automatic segmentation of medical images based on multi-path attention fusion as claimed in claim 1, wherein the normalization process is expressed as:
I=(I-M)/Std;
where I denotes the image intensity, M denotes the mean of the image data, and Std denotes the standard deviation of the image data.
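The z-score normalization of claim 3 is a one-liner; a minimal NumPy sketch, reading I as the array of image intensities:

```python
import numpy as np


def normalize(img: np.ndarray) -> np.ndarray:
    """I = (I - M) / Std: subtract the mean intensity and divide by
    the standard deviation, giving zero mean and unit variance."""
    return (img - img.mean()) / img.std()
```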
4. The method for automatic segmentation of medical images based on multi-path attention fusion as claimed in claim 1, wherein the operation in the attention fusion module comprises:
211: for a feature M of resolution h × w × c from the multipath encoder and a feature F of resolution h/2 × w/2 × 2c output by the previous attention fusion module, first performing an upsampling operation on the feature F so that its spatial resolution becomes h × w, then splicing the upsampled feature F with the feature M, and inputting the spliced feature into a 3 × 3 convolutional layer;
212: inputting the convolution result of 211 into batch normalization and a ReLU activation function;
213: inputting the result of 212 into a global average pooling function to obtain a feature map of dimension 1 × 1 × c, where c is the number of channels of the feature;
214: inputting the result of 213 into a 1 × 1 convolutional layer, a ReLU activation function and a further 1 × 1 convolutional layer in sequence, and finally into a sigmoid function, to obtain a feature map of dimension 1 × 1 × c;
215: multiplying the feature map of 214 by the convolution result of 211 to obtain a selectively filtered feature;
216: summing the selectively filtered feature of 215 and the convolution result of 211 to obtain the final output feature.
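Steps 211-216 can be sketched in PyTorch as below. Several details are assumptions not fixed by the claim: the 3 × 3 convolution after concatenation is taken to reduce the c + 2c input channels back to c, the width of the bottleneck between the two 1 × 1 convolutions (the `reduction` parameter) is a free choice, and bilinear interpolation is used for the upsampling.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F_


class AttentionFusion(nn.Module):
    """Attention fusion module of steps 211-216: fuse an encoder
    feature M (h x w x c) with the previous fusion output F
    (h/2 x w/2 x 2c) via a channel-attention reweighting."""

    def __init__(self, c: int, reduction: int = 4):
        super().__init__()
        self.conv = nn.Conv2d(3 * c, c, 3, padding=1)  # 211: concat has c + 2c channels
        self.bn = nn.BatchNorm2d(c)                    # 212
        self.fc1 = nn.Conv2d(c, c // reduction, 1)     # 214: first 1x1 conv
        self.fc2 = nn.Conv2d(c // reduction, c, 1)     # 214: second 1x1 conv
        self.relu = nn.ReLU(inplace=True)

    def forward(self, m: torch.Tensor, f: torch.Tensor) -> torch.Tensor:
        # 211: upsample f to the resolution of m, concatenate, 3x3 conv
        f = F_.interpolate(f, size=m.shape[2:], mode="bilinear", align_corners=False)
        a = self.conv(torch.cat([m, f], dim=1))
        a = self.relu(self.bn(a))                       # 212
        w = F_.adaptive_avg_pool2d(a, 1)                # 213: 1 x 1 x c descriptor
        w = torch.sigmoid(self.fc2(self.relu(self.fc1(w))))  # 214
        return a * w + a                                # 215 + 216
```

The final line applies the per-channel weights (step 215) and adds back the unweighted feature (step 216), so channels the sigmoid suppresses are attenuated rather than zeroed out.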
5. The method of claim 1, wherein the operations at the decoder module comprise:
221: for a feature F of resolution h × w × c from the attention fusion module and a feature D of resolution h/2 × w/2 × 2c output by the previous decoder module, first performing an upsampling operation on the feature D so that its spatial resolution becomes h × w, then splicing the feature F with the upsampled feature D, and finally inputting the spliced feature into a 3 × 3 convolutional layer;
222: inputting the resulting feature of 221 into a 3 × 3 convolutional layer;
223: inputting the convolution result of 222 into batch normalization and a ReLU activation function;
224: inputting the result of 223 into a further 3 × 3 convolutional layer, batch normalization and a ReLU activation function;
225: inputting the result of 224 into a 3 × 3 convolutional layer;
226: summing the convolution result of 225 and the convolution result of 222;
227: inputting the result of 226 into batch normalization and a ReLU activation function to obtain the h × w × c feature.
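Steps 221-227 mirror the residual module of claim 1 after a fuse-and-upsample step, and can be sketched in PyTorch as below. Assumptions not fixed by the claim: the fusing 3 × 3 convolution reduces the c + 2c concatenated channels back to c, bilinear interpolation is used for the upsampling, and the skip of step 226 is read as summing the outputs of the first and last 3 × 3 convolutions of the residual-style part (steps 222 and 225).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F_


class DecoderModule(nn.Module):
    """Decoder module of steps 221-227: upsample the deeper feature D,
    splice with the attention-fusion feature F, then apply a
    residual-style block of 3x3 convolutions."""

    def __init__(self, c: int):
        super().__init__()
        self.fuse = nn.Conv2d(3 * c, c, 3, padding=1)   # 221: concat has c + 2c channels
        self.conv1 = nn.Conv2d(c, c, 3, padding=1)      # 222
        self.bn1 = nn.BatchNorm2d(c)                    # 223
        self.conv2 = nn.Conv2d(c, c, 3, padding=1)      # 224
        self.bn2 = nn.BatchNorm2d(c)
        self.conv3 = nn.Conv2d(c, c, 3, padding=1)      # 225
        self.bn3 = nn.BatchNorm2d(c)                    # 227
        self.relu = nn.ReLU(inplace=True)

    def forward(self, f: torch.Tensor, d: torch.Tensor) -> torch.Tensor:
        # 221: upsample d to the resolution of f, concatenate, 3x3 conv
        d = F_.interpolate(d, size=f.shape[2:], mode="bilinear", align_corners=False)
        x = self.fuse(torch.cat([f, d], dim=1))
        y1 = self.conv1(x)                              # 222
        y = self.relu(self.bn1(y1))                     # 223
        y = self.relu(self.bn2(self.conv2(y)))          # 224
        y = self.conv3(y)                               # 225
        y = y + y1                                      # 226: skip from 222
        return self.relu(self.bn3(y))                   # 227
```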
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010479507.5A CN111681252B (en) | 2020-05-30 | 2020-05-30 | Medical image automatic segmentation method based on multipath attention fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111681252A CN111681252A (en) | 2020-09-18 |
CN111681252B true CN111681252B (en) | 2022-05-03 |
Family
ID=72453028
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010479507.5A Active CN111681252B (en) | 2020-05-30 | 2020-05-30 | Medical image automatic segmentation method based on multipath attention fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111681252B (en) |
Families Citing this family (46)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112164074B (en) * | 2020-09-22 | 2021-08-10 | 江南大学 | 3D CT bed fast segmentation method based on deep learning |
CN112132813B (en) * | 2020-09-24 | 2022-08-05 | 中国医学科学院生物医学工程研究所 | Skin ultrasonic image segmentation method based on improved UNet network model |
CN112348839B (en) * | 2020-10-27 | 2024-03-15 | 重庆大学 | Image segmentation method and system based on deep learning |
CN112183561B (en) * | 2020-11-09 | 2024-04-30 | 山东中医药大学 | Combined fusion-subtraction automatic encoder algorithm for image feature extraction |
CN112258129A (en) * | 2020-11-12 | 2021-01-22 | 拉扎斯网络科技(上海)有限公司 | Distribution path prediction network training and distribution resource scheduling method and device |
CN112330542B (en) * | 2020-11-18 | 2022-05-03 | 重庆邮电大学 | Image reconstruction system and method based on CRCSAN network |
CN112216371B (en) * | 2020-11-20 | 2022-07-12 | 中国科学院大学 | Multi-path multi-scale parallel coding and decoding network image segmentation method, system and medium |
CN112614112B (en) * | 2020-12-24 | 2023-05-12 | 苏州大学 | Segmentation method for stripe damage in MCSLI image |
CN112734762B (en) * | 2020-12-31 | 2022-10-11 | 西华师范大学 | Dual-path UNet network tumor segmentation method based on covariance self-attention mechanism |
CN112950639B (en) * | 2020-12-31 | 2024-05-10 | 山西三友和智慧信息技术股份有限公司 | SA-Net-based MRI medical image segmentation method |
CN112767502B (en) * | 2021-01-08 | 2023-04-07 | 广东中科天机医疗装备有限公司 | Image processing method and device based on medical image model |
CN112651979B (en) * | 2021-01-11 | 2023-10-10 | 华南农业大学 | Lung X-ray image segmentation method, system, computer equipment and storage medium |
CN112862830B (en) * | 2021-01-28 | 2023-12-22 | 陕西师范大学 | Multi-mode image segmentation method, system, terminal and readable storage medium |
CN112927236B (en) * | 2021-03-01 | 2021-10-15 | 南京理工大学 | Clothing analysis method and system based on channel attention and self-supervision constraint |
CN112927209B (en) * | 2021-03-05 | 2022-02-11 | 重庆邮电大学 | CNN-based significance detection system and method |
CN113065578B (en) * | 2021-03-10 | 2022-09-23 | 合肥市正茂科技有限公司 | Image visual semantic segmentation method based on double-path region attention coding and decoding |
CN113139972A (en) * | 2021-03-22 | 2021-07-20 | 杭州电子科技大学 | Cerebral apoplexy MRI image focus region segmentation method based on artificial intelligence |
US11580646B2 (en) | 2021-03-26 | 2023-02-14 | Nanjing University Of Posts And Telecommunications | Medical image segmentation method based on U-Net |
CN113129316A (en) * | 2021-04-15 | 2021-07-16 | 重庆邮电大学 | Heart MRI image multi-task segmentation method based on multi-mode complementary information exploration |
CN113128583B (en) * | 2021-04-15 | 2022-08-23 | 重庆邮电大学 | Medical image fusion method and medium based on multi-scale mechanism and residual attention |
CN113012155B (en) * | 2021-05-07 | 2023-05-05 | 刘慧烨 | Bone segmentation method in hip joint image, electronic equipment and storage medium |
CN113343995A (en) * | 2021-05-07 | 2021-09-03 | 西安智诊智能科技有限公司 | Image segmentation method based on reverse attention network |
CN113379773B (en) * | 2021-05-28 | 2023-04-28 | 陕西大智慧医疗科技股份有限公司 | Segmentation model establishment and segmentation method and device based on dual-attention mechanism |
CN113744279B (en) * | 2021-06-09 | 2023-11-14 | 东北大学 | Image segmentation method based on FAF-Net network |
CN113240691B (en) * | 2021-06-10 | 2023-08-01 | 南京邮电大学 | Medical image segmentation method based on U-shaped network |
CN113361445B (en) * | 2021-06-22 | 2023-06-20 | 华南理工大学 | Attention mechanism-based document binarization processing method and system |
CN113421276B (en) * | 2021-07-02 | 2023-07-21 | 深圳大学 | Image processing method, device and storage medium |
CN113256641B (en) * | 2021-07-08 | 2021-10-01 | 湖南大学 | Skin lesion image segmentation method based on deep learning |
CN113393469A (en) * | 2021-07-09 | 2021-09-14 | 浙江工业大学 | Medical image segmentation method and device based on cyclic residual convolutional neural network |
CN113744275B (en) * | 2021-07-26 | 2023-10-20 | 重庆邮电大学 | Feature transformation-based three-dimensional CBCT tooth image segmentation method |
CN113570611A (en) * | 2021-07-27 | 2021-10-29 | 华北理工大学 | Mineral real-time segmentation method based on multi-feature fusion decoder |
CN113642581B (en) * | 2021-08-12 | 2023-09-22 | 福州大学 | Image semantic segmentation method and system based on coding multipath semantic cross network |
CN113706544B (en) * | 2021-08-19 | 2023-08-29 | 天津师范大学 | Medical image segmentation method based on complete attention convolutional neural network |
CN113850821A (en) * | 2021-09-17 | 2021-12-28 | 武汉兰丁智能医学股份有限公司 | Attention mechanism and multi-scale fusion leukocyte segmentation method |
CN114022486A (en) * | 2021-10-19 | 2022-02-08 | 西安工程大学 | Medical image segmentation method based on improved U-net network |
CN114119627B (en) * | 2021-10-19 | 2022-05-17 | 北京科技大学 | High-temperature alloy microstructure image segmentation method and device based on deep learning |
CN114155231A (en) * | 2021-12-08 | 2022-03-08 | 电子科技大学 | Medical image fusion algorithm based on improved Unet network |
CN114332535B (en) * | 2021-12-30 | 2022-07-15 | 宁波大学 | sMRI image classification method based on high-resolution complementary attention UNet classifier |
CN114581859B (en) * | 2022-05-07 | 2022-09-13 | 北京科技大学 | Converter slag discharging monitoring method and system |
CN114757938B (en) * | 2022-05-16 | 2023-09-15 | 国网四川省电力公司电力科学研究院 | Transformer oil leakage identification method and system |
CN114782440B (en) * | 2022-06-21 | 2022-10-14 | 杭州三坛医疗科技有限公司 | Medical image segmentation method and electronic equipment |
CN115239716B (en) * | 2022-09-22 | 2023-01-24 | 杭州影想未来科技有限公司 | Medical image segmentation method based on shape prior U-Net |
CN116402780B (en) * | 2023-03-31 | 2024-04-02 | 北京长木谷医疗科技股份有限公司 | Thoracic vertebra image segmentation method and device based on double self-attention and deep learning |
CN116612131B (en) * | 2023-05-22 | 2024-02-13 | 山东省人工智能研究院 | Cardiac MRI structure segmentation method based on ADC-UNet model |
CN117422880B (en) * | 2023-12-18 | 2024-03-22 | 齐鲁工业大学(山东省科学院) | Segmentation method and system combining improved attention mechanism and CV model |
CN117974670B (en) * | 2024-04-02 | 2024-06-04 | 齐鲁工业大学(山东省科学院) | Image analysis method, device, equipment and medium for fusing scattering network |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106937113A (en) * | 2011-12-05 | 2017-07-07 | 同济大学 | Method for compressing image and device based on mixing colourity sample rate |
CN109191476A (en) * | 2018-09-10 | 2019-01-11 | 重庆邮电大学 | The automatic segmentation of Biomedical Image based on U-net network structure |
CN109671094A (en) * | 2018-11-09 | 2019-04-23 | 杭州电子科技大学 | A kind of eye fundus image blood vessel segmentation method based on frequency domain classification |
CN109785336A (en) * | 2018-12-18 | 2019-05-21 | 深圳先进技术研究院 | Image partition method and device based on multipath convolutional neural networks model |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10755391B2 (en) * | 2018-05-15 | 2020-08-25 | Adobe Inc. | Digital image completion by learning generation and patch matching jointly |
- 2020-05-30: CN application CN202010479507.5A, patent CN111681252B (en), status Active
Non-Patent Citations (4)
Title |
---|
Deep Learning Techniques for Medical Image Segmentation: Achievements and Challenges;Mohammad Hesam Hesamian等;《J Digit Imaging》;20190529;第32卷(第4期);582-596 * |
LVC-Net: Medical image segmentation with noisy label based on local visual cues;Yucheng Shu等;《International Conference on Medical Image Computing and Computer-Assisted Intervention》;20191010;558-566 * |
Weight-Adjusted Image Semantic Segmentation Algorithm Based on Multipath Networks;Qin Xiaofei et al.;《Optical Instruments (光学仪器)》;20200228;Vol. 42 (No. 1);46-51 *
Multimodal Brain Image Fusion Method Based on Adaptive Cloud Model;Zhao Jia et al.;《Computer Science (计算机科学)》;20161130;Vol. 43 (No. 11);391-296, 321 *
Also Published As
Publication number | Publication date |
---|---|
CN111681252A (en) | 2020-09-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111681252B (en) | Medical image automatic segmentation method based on multipath attention fusion | |
WO2022047625A1 (en) | Image processing method and system, and computer storage medium | |
CN110889852B (en) | Liver segmentation method based on residual error-attention deep neural network | |
CN110889853B (en) | Tumor segmentation method based on residual error-attention deep neural network | |
CN113012172A (en) | AS-UNet-based medical image segmentation method and system | |
CN111860528B (en) | Image segmentation model based on improved U-Net network and training method | |
CN112465827A (en) | Contour perception multi-organ segmentation network construction method based on class-by-class convolution operation | |
CN110223304B (en) | Image segmentation method and device based on multipath aggregation and computer-readable storage medium | |
CN113870335A (en) | Monocular depth estimation method based on multi-scale feature fusion | |
CN110930378B (en) | Emphysema image processing method and system based on low data demand | |
CN113468996B (en) | Camouflage object detection method based on edge refinement | |
CN112184582B (en) | Attention mechanism-based image completion method and device | |
CN113706545A (en) | Semi-supervised image segmentation method based on dual-branch nerve discrimination dimensionality reduction | |
CN114663440A (en) | Fundus image focus segmentation method based on deep learning | |
CN110599495B (en) | Image segmentation method based on semantic information mining | |
CN114049314A (en) | Medical image segmentation method based on feature rearrangement and gated axial attention | |
Xu et al. | AutoSegNet: An automated neural network for image segmentation | |
CN114821050A (en) | Named image segmentation method based on transformer | |
CN116524307A (en) | Self-supervision pre-training method based on diffusion model | |
CN117058307A (en) | Method, system, equipment and storage medium for generating heart three-dimensional nuclear magnetic resonance image | |
CN112990359B (en) | Image data processing method, device, computer and storage medium | |
Zhu et al. | Brain tumor segmentation for missing modalities by supplementing missing features | |
Lai et al. | Generative focused feedback residual networks for image steganalysis and hidden information reconstruction | |
CN110458849B (en) | Image segmentation method based on feature correction | |
CN116779091A (en) | Automatic generation method of multi-mode network interconnection and fusion chest image diagnosis report |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||