CN111325751A - CT image segmentation system based on attention convolution neural network - Google Patents
- Publication number
- CN111325751A (application CN202010190946.4A)
- Authority
- CN
- China
- Prior art keywords
- module
- attention
- convolution
- feature
- pooling
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/12—Edge-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20172—Image enhancement details
- G06T2207/20192—Edge enhancement; Edge preservation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
Abstract
The invention provides a CT image segmentation system based on an attention convolutional neural network, comprising a feature encoding module, a semantic information extraction attention module, a feature fusion pooling attention module, and a feature map decoding module. The feature encoding module gradually reduces the size of the feature map of an input image using a parallel convolutional neural network, and extracts image semantic information and spatial information simultaneously through network-layer multiplexing and the interception and fusion of features at each layer. The semantic information extraction attention module generates attention features using pooling and further refines the semantic information features extracted by the feature encoding module. The feature fusion pooling attention module combines the refined semantic information features with the semantic and spatial information features concatenated in parallel by the feature encoding module to form an attention feature map. The feature map decoding module uses convolution and up-sampling modules to restore the attention feature map, step by step and finely, to the size of the input image. By fusing attention modules, the invention achieves efficient and accurate image segmentation.
Description
Technical Field
The invention relates to the technical field of image understanding, and in particular to a CT image segmentation system based on an attention convolutional neural network.
Background
Image segmentation is an important fundamental research problem in the field of computer vision, and medical image segmentation is one of its applications: it can accurately and rapidly locate large numbers of patient lesions in a short time. How to apply image segmentation techniques effectively to medical images has therefore become a major task for researchers.
Medical image segmentation classifies the semantic content of an image pixel by pixel by extracting medical image features. It must accurately locate each object and determine the class to which it belongs, and clearly delineate object boundaries so that objects of different classes can be distinguished.
At present, many medical image segmentation methods are in wide use at home and abroad, of which the traditional methods mainly include the following. Threshold-based segmentation is relatively simple to implement, but it is unsuitable for multi-channel images and for images whose feature values differ little; it is difficult to obtain accurate results when the image contains no obvious grey-level differences or when the grey-value ranges of the objects overlap substantially. Edge-based segmentation offers fast search and good edge localization, but it cannot recover good region structure, and at detection time there is a trade-off between noise resistance and detection precision. The active contour model method, also known as the Snake model, starts from an initial curve carrying an energy function; through energy minimization the curve gradually deforms and moves toward the contour of the target to be detected, finally converging to the target boundary to yield a smooth, continuous contour. The original Snake model has difficulty capturing concave target boundaries and is sensitive to the initial contour, among other shortcomings, so many improved methods followed.
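The limitation of threshold segmentation described above can be seen in a few lines of code. The sketch below (hypothetical values, plain Python for illustration only) binarizes a toy "CT slice" with a single global threshold; it works here only because the object and background grey ranges are disjoint, which is exactly the assumption that fails on real CT data with overlapping grey values.

```python
def threshold_segment(image, t):
    """Binary segmentation: label a pixel 1 (foreground) if its grey value exceeds t."""
    return [[1 if px > t else 0 for px in row] for row in image]

# A toy 4x4 "slice": a bright object (~200) on a dark background (~50).
slice_ = [
    [50, 52, 48, 51],
    [49, 200, 210, 50],
    [51, 205, 198, 49],
    [50, 48, 52, 50],
]
mask = threshold_segment(slice_, 128)  # clean separation: grey ranges do not overlap
```

If the object and background grey values overlapped around 128, no single threshold could separate them, which is the failure mode the text describes.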
In addition, neural-network-based segmentation methods have popularized end-to-end convolutional networks for semantic segmentation since Long et al. proposed the FCN (Fully Convolutional Networks) algorithm in 2014. FCN reuses a pretrained ImageNet network for the segmentation problem, uses deconvolution layers for up-sampling, and introduces skip connections to reduce the coarseness of the up-sampled output; nevertheless, FCN results still fall short of practical requirements. Although the skip structure improves accuracy, the model does not handle image edge information well, and in classifying pixels one by one FCN does not fully consider the relationships between pixels and thus lacks spatial consistency. Vijay et al. proposed the SegNet algorithm in 2015, which transfers the max-pooling indices to the decoder, improving segmentation resolution. In an FCN network a coarse segmentation map is generated by convolutional layers and some skip connections, and more skip connections are introduced to improve the result; however, FCN copies the full encoder features, while SegNet copies only the max-pooling indices, which makes SegNet more memory-efficient than FCN.
The U-Net proposed by Ronneberger et al. combines shallow and deep semantic information and segments medical images with an encoder-decoder architecture, but its feature extraction is limited. Yu et al. proposed dilated convolutions in 2016, which enlarge the receptive field exponentially without reducing spatial dimensions. In DeepLab, discussed next, dilated convolution is called atrous convolution. The last two pooling layers are removed from the pretrained classification network (here VGG, the Visual Geometry Group network) and the subsequent convolutional layers are replaced with dilated convolutions. DeepLabV2 and V3 use dilated convolution, implement Atrous Spatial Pyramid Pooling (ASPP) in the spatial dimension, and apply a fully connected conditional random field; dilated convolution enlarges the receptive field without increasing the number of parameters.
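The claim that dilated convolution enlarges the receptive field exponentially without adding parameters can be checked numerically. The helper below (an illustrative sketch, not code from the patent) applies the standard receptive-field recurrence r ← r + (k − 1) · d · j, where j is the product of the strides of earlier layers: three 3 × 3 layers with dilations 1, 2, 4 already see a 15 × 15 window, versus 7 × 7 without dilation, with identical parameter counts.

```python
def receptive_field(layers):
    """Receptive field of a stack of conv layers.
    Each layer is (kernel_size, stride, dilation): the field grows by
    (k - 1) * dilation * jump, and jump multiplies by the stride."""
    rf, jump = 1, 1
    for k, s, d in layers:
        rf += (k - 1) * d * jump
        jump *= s
    return rf

# Three 3x3 stride-1 layers, dilations doubling as in Yu et al.:
dilated = receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)])   # 15
ordinary = receptive_field([(3, 1, 1), (3, 1, 1), (3, 1, 1)])  # 7
```

Doubling the dilation at each layer keeps the receptive field growing geometrically while each layer still holds only 3 × 3 weights.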
Zhao et al. proposed PSPNet (Pyramid Scene Parsing Network) in 2017. The algorithm introduces a pyramid pooling module to aggregate background information and uses an auxiliary loss. Global scene classification matters because it provides clues to the distribution of the classes to be segmented, and the pyramid pooling module uses large-kernel pooling layers to capture this information. Like the dilated convolution methods mentioned above, PSPNet also improves the ResNet structure with dilated convolution and adds a pyramid pooling module that concatenates the ResNet feature map with the up-sampled outputs of parallel pooling layers, whose kernels cover the whole image, half of it, and small regions respectively.
Chen et al. again proposed the DeepLabV3+ model in 2018, applying a spatial pyramid pooling module and an encoder-decoder structure to deep neural networks for the semantic segmentation task. The former encodes multi-scale context information by probing the input features with filters or pooling operations at multiple rates and multiple effective fields of view, while the latter captures sharper object boundaries by gradually restoring spatial information. The algorithm combines the advantages of both, extending DeepLabV3 by adding a simple and efficient decoder module to refine the segmentation results, especially along object boundaries. By further exploring the Xception model and applying depthwise separable convolution to the ASPP and decoder modules, a faster and stronger encoder-decoder network is constructed, though at the cost of heavy computing-resource consumption. The pyramid structure, used as a module for semantic segmentation, integrates well, can easily be added to any neural network structure, and obtains excellent results in extracting context information. However, the pyramid structure also has shortcomings: it does not explain well which of the extracted information the network should actually value.
Disclosure of Invention
Aiming at the above technical problems in the prior art, the invention provides a CT image segmentation system based on an attention convolutional neural network. Using a deep learning method and fused attention modules, it designs an accurate and efficient segmentation model, improving the execution efficiency of existing CT image segmentation methods and obtaining more accurate segmentation results.
In order to solve the technical problems, the invention adopts the following technical scheme:
a CT image segmentation system based on an attention convolution neural network comprises a feature coding module, a semantic information extraction attention module, a feature fusion pooling attention module and a feature graph code module; the feature coding module gradually reduces the size of a feature map of an input image by using a parallel convolution neural network, and realizes the simultaneous extraction of semantic information features and spatial information features of the image through network layer multiplexing and interception and fusion of features of each layer; the semantic information extraction attention module generates attention features by using pooling, and further refines and refines the semantic information features extracted by the feature coding module; the feature fusion pooling attention module is connected in parallel with the average pooling by using maximum pooling and average pooling, and combines semantic information features refined by the semantic information extraction attention module with semantic information and spatial information features spliced by the feature coding module to form an attention feature map; and the feature map decoding module gradually and finely restores the attention feature map fused by the feature fusion pooling attention module into the size of the input image by using a convolution module and an up-sampling module.
Compared with the prior art, the CT image segmentation system based on an attention convolutional neural network provided by the invention first gradually reduces the size of the feature map of the input image using a convolutional neural network, extracting rich semantic information features to optimize the classification task, while the network design reduces the loss caused by compressing spatial information features during semantic feature extraction. It then optimizes semantic information extraction with the semantic information extraction attention module. Next, the feature fusion pooling attention module combines the semantic information features refined by the semantic information extraction attention module with the semantic and spatial information features concatenated by the feature encoding module, and fuses them through pooling attention to obtain an attention feature map. Finally, the feature map decoding module performs up-sampling and convolution operations to restore the attention feature map, step by step and finely, to the size of the input image. Moreover, compared with current typical segmentation networks, the segmentation model provided by the invention adapts better to CT image data set segmentation.
Further, the feature encoding module comprises a first convolution module, a second convolution module, first to fourth bottleneck channels and a first concatenation operation module, arranged in sequence. The first convolution module comprises a convolutional layer followed by batch normalization; the second convolution module comprises a convolutional layer, batch normalization and a ReLU activation function in sequence. The first to fourth bottleneck channels are arranged in parallel: from the first channel to the fourth, the number of bottleneck layers per channel decreases, the output feature maps of the second to fourth channels are successively smaller than that of the first, and the number of channels of the feature map finally output by each bottleneck path increases with the number of layers. The semantic information features and spatial information features extracted by the four bottleneck channels are concatenated by the first concatenation operation module.
Further, the convolution kernel size of each convolutional layer is 3 × 3 and the stride is 2.
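The effect of a 3 × 3, stride-2 convolution on feature-map size follows the usual formula o = ⌊(i + 2p − k)/s⌋ + 1. The snippet below (assuming padding 1, which the text does not state explicitly, and an illustrative 512 × 512 input) shows that each such layer halves the spatial size, so the two stacked convolution modules reduce a 512 × 512 input to 128 × 128.

```python
def conv_out(i, k=3, s=2, p=1):
    """Spatial output size of a convolution: floor((i + 2p - k) / s) + 1."""
    return (i + 2 * p - k) // s + 1

after_first = conv_out(512)           # 256: one 3x3 stride-2 layer halves the size
after_second = conv_out(after_first)  # 128: the second convolution module halves it again
```

Halving twice before the bottleneck channels keeps the subsequent computation cheap, consistent with the stated goal of reducing the calculation amount.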
Further, the numbers of bottleneck layers in the first to fourth bottleneck channels are 4, 3, 2 and 1 respectively; the feature maps output by the second to fourth bottleneck channels are 1/2, 1/4 and 1/8 the size of that of the first; and the numbers of channels of the output feature maps of the first to fourth bottleneck channels are 128, 256, 512 and 1024 respectively.
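The stated size ratios and channel counts of the four bottleneck channels can be tabulated with a small helper (a bookkeeping sketch of the ratios above; the base spatial size of 64 is an assumed example, not a value from the text):

```python
def encoder_path_shapes(base_size):
    """(spatial size, channel count) at the output of each bottleneck channel,
    following the ratios 1, 1/2, 1/4, 1/8 and channels 128..1024 stated above."""
    ratios = [1, 2, 4, 8]
    channels = [128, 256, 512, 1024]
    return [(base_size // r, c) for r, c in zip(ratios, channels)]

shapes = encoder_path_shapes(64)
# First channel keeps full resolution with 128 channels; the fourth is 1/8 size with 1024 channels.
```

The pattern mirrors the usual encoder trade-off: as spatial resolution drops along a path, channel depth grows, so the shallow paths carry spatial detail while the deep path carries semantics.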
Further, each bottleneck layer comprises three convolution units, an addition unit and a ReLU activation function unit arranged in sequence; each convolution unit comprises a convolution kernel, batch normalization and a ReLU activation function arranged in sequence; and the addition unit is also connected by a skip connection to the feature map input to the convolution kernel of the first convolution unit.
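The bottleneck layer's skip connection is the standard residual pattern y = ReLU(F(x) + x). A minimal pure-Python sketch (operating on flat vectors rather than feature maps, with a stand-in transform, purely for illustration):

```python
def relu(v):
    """Elementwise ReLU on a flat vector."""
    return [max(0.0, x) for x in v]

def bottleneck_unit(x, transform):
    """Residual unit: y = ReLU(F(x) + x).
    The skip connection adds the input back, so the network can learn to
    fall back to (near-)identity when the transform is unhelpful."""
    fx = transform(x)
    return relu([a + b for a, b in zip(fx, x)])

# Stand-in for the three convolution units: a simple scaling.
y = bottleneck_unit([1.0, -2.0, 3.0], lambda v: [0.5 * t for t in v])
```

The addition before the final ReLU is what lets gradients flow directly through the skip path, which is the usual justification for this layout.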
Further, the semantic information extraction attention module comprises a first channel attention module, a second channel attention module, a global pooling module, a multiplication operation module and a second concatenation operation module. The first and second channel attention modules are arranged in parallel. Each channel attention module comprises, in sequence: global average pooling, which captures the contextual semantic feature information in the input feature map; a convolution module, which computes the semantic information weights; batch normalization and a Sigmoid activation function, which refine the semantic information extraction; and a multiplication operation, which multiplies the refined semantic information with the input feature map. The multiplication operation module multiplies the feature map output by the second channel attention module with the output feature map processed by the global pooling module, and the second concatenation operation module concatenates the feature map output by the first channel attention module with the output feature map of the multiplication operation module. The input feature maps of the two channel attention modules are obtained by connecting the semantic information features extracted by the feature encoding module.
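The per-channel attention computation described above — global average pooling followed by a Sigmoid gate that rescales each channel — can be sketched in pure Python. Note this sketch omits the learned convolution and batch normalization that produce the weights in the actual module; the Sigmoid is applied directly to the channel mean, purely to illustrate the gating mechanism.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap):
    """fmap: a list of channels, each a 2-D list of floats.
    Each channel is rescaled by sigmoid(its global average) — a simplified
    gate standing in for the module's learned conv + BN + Sigmoid weight."""
    out = []
    for ch in fmap:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        w = sigmoid(mean)                       # attention weight in (0, 1)
        out.append([[w * px for px in row] for row in ch])
    return out

# Two 2x2 channels: one strongly activated, one suppressed.
fm = [[[2.0, 2.0], [2.0, 2.0]], [[-2.0, -2.0], [-2.0, -2.0]]]
gated = channel_attention(fm)
```

The effect is that channels with strong average response are passed through nearly unchanged while weak channels are attenuated, which is the "refining" role the module plays.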
Further, the feature fusion pooling attention module comprises a third convolution module, an average pooling path, a max pooling path and a two-path pooling multiplication operation module. The third convolution module extracts the mixed information features of the fused semantic and spatial information features while converting the channels of the information; the average pooling path and the max pooling path, arranged in parallel, each process the features extracted by the third convolution module; and the two-path pooling multiplication operation module multiplies the two processed features from the average pooling path and the max pooling path to form the attention feature map.
Further, the average pooling path processes the features with two serially connected average pooling modules as the first feature extraction path, and the max pooling path processes the features with two serially connected max pooling modules as the second feature extraction path.
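A toy version of the two pooling paths and their elementwise product (a pure-Python sketch on a single-channel 4 × 4 map with a single 2 × 2 pooling stage per path; the real module stacks two pooling modules per path and operates on multi-channel feature maps after the third convolution module):

```python
def pool2x2(grid, op):
    """Apply op to each non-overlapping 2x2 block (even dimensions assumed)."""
    out = []
    for i in range(0, len(grid), 2):
        row = []
        for j in range(0, len(grid[0]), 2):
            block = [grid[i][j], grid[i][j + 1], grid[i + 1][j], grid[i + 1][j + 1]]
            row.append(op(block))
        out.append(row)
    return out

def fused_pool_attention(grid):
    """Average pooling path and max pooling path in parallel,
    then elementwise multiplication of the two results."""
    avg = pool2x2(grid, lambda b: sum(b) / 4.0)
    mx = pool2x2(grid, max)
    return [[a * m for a, m in zip(ra, rm)] for ra, rm in zip(avg, mx)]

grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
att = fused_pool_attention(grid)
```

Multiplying the two pooled maps amplifies regions where both the average response and the peak response are strong, which is one plausible reading of why the module combines the two pooling statistics rather than using either alone.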
Further, the feature map decoding module comprises a first up-sampling module, a fourth convolution module, a second up-sampling module, a fifth convolution module and a sixth convolution module arranged in sequence. The feature maps output by the first up-sampling module and the fourth convolution module are the same size, and the feature maps output by the second up-sampling module, the fifth convolution module and the sixth convolution module are all the same size as the input image.
Further, the sampling coefficients of the first and second up-sampling modules are both 2.
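Up-sampling with coefficient 2 can be illustrated with nearest-neighbour interpolation (an assumption for this sketch — the text does not specify the interpolation mode), which duplicates each pixel into a 2 × 2 block so the feature map doubles in both dimensions:

```python
def upsample2x(grid):
    """Nearest-neighbour 2x up-sampling: each pixel becomes a 2x2 block."""
    out = []
    for row in grid:
        wide = [px for px in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                       # duplicate each row
    return out

up = upsample2x([[1, 2], [3, 4]])
```

Two such stages interleaved with convolutions, as in the decoding module, restore the attention feature map to the input-image size while the convolutions smooth the blocky artefacts that raw duplication introduces.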
Drawings
FIG. 1 is a schematic block diagram of a CT image segmentation system based on an attention convolution neural network according to the present invention.
Fig. 2 is a schematic structural diagram of the feature encoding module of fig. 1.
Fig. 3 is a schematic diagram of the structure of each bottleneck layer in the feature encoding module of fig. 2.
FIG. 4 is a block diagram of a channel attention module of the semantic information extraction attention module of FIG. 1.
FIG. 5 is a schematic diagram of the structure of the feature fusion pooling attention module of FIG. 1.
Fig. 6 is a schematic structural diagram of the feature map decoding module of fig. 1.
FIG. 7 is a graph illustrating the FCN and FEM training process.
FIG. 8 is a schematic diagram of an image comparison of pancreas segmentation test results provided by the present invention.
Detailed Description
In order to make the technical means, creative features, objectives and effects of the invention easy to understand, the invention is further explained below with reference to the specific drawings.
Referring to fig. 1, the present invention provides a CT image segmentation system based on an attention convolutional neural network, which comprises a feature encoding module, a semantic information extraction attention module, a feature fusion pooling attention module and a feature map decoding module. The feature encoding module gradually reduces the size of the feature map of an input image using a parallel convolutional neural network, and extracts the semantic information features and spatial information features of the image simultaneously through network-layer multiplexing and the interception and fusion of features at each layer. The semantic information extraction attention module generates attention features using pooling and further refines the semantic information features extracted by the feature encoding module. The feature fusion pooling attention module places max pooling and average pooling in parallel, and combines the semantic information features refined by the semantic information extraction attention module with the semantic and spatial information features concatenated by the feature encoding module to form an attention feature map. The feature map decoding module uses convolution and up-sampling modules to restore the attention feature map, step by step and finely, to the size of the input image.
Compared with the prior art, the CT image segmentation system based on an attention convolutional neural network provided by the invention first gradually reduces the size of the feature map of the input image using a convolutional neural network, extracting rich semantic information features to optimize the classification task, while the network design reduces the loss caused by compressing spatial information features during semantic feature extraction. It then optimizes semantic information extraction with the semantic information extraction attention module. Next, the feature fusion pooling attention module combines the semantic information features refined by the semantic information extraction attention module with the semantic and spatial information features concatenated by the feature encoding module, and fuses them through pooling attention to obtain an attention feature map. Finally, the feature map decoding module performs up-sampling and convolution operations to restore the attention feature map, step by step and finely, to the size of the input image. Moreover, compared with current typical segmentation networks, the segmentation model provided by the invention adapts better to CT image data set segmentation.
Specifically, the design background of the feature encoding module is as follows. As is well known, for semantic segmentation tasks spatial information and semantic information are equally important. The traditional deep learning approach uses serially stacked convolutions, reducing the feature map size step by step through convolution and pooling to extract semantic and spatial information — for example FCN, SegNet, U-Net and DeepLab. However, spatial information is inevitably lost as the feature map shrinks, so many models improve on this point: DeepLabV3 and PSPNet extract spatial information using pyramid pooling and dilated convolution; BiSeNet extracts spatial features by adding a separate shallow path; DenseASPP minimizes the loss of spatial features using a dense connection structure; and PAN adds attention modules at the tail and middle of the backbone network to strengthen the network's spatial feature extraction. Yet over-emphasizing spatial information prevents very accurate semantic information from being obtained, creating a dilemma. The invention designs a network that performs the two complex tasks of semantic information extraction and spatial information extraction simultaneously: with only a small increase in network parameters, spatial and semantic information features are extracted at the same time through network-layer multiplexing and the interception and fusion of features at each layer, without additional loss.
As a specific embodiment, please refer to fig. 2. The feature encoding module comprises a first convolution module, a second convolution module, first to fourth bottleneck channels and a first concatenation (concat) operation module, arranged in sequence. The first convolution module comprises a convolutional layer (Conv) followed by batch normalization (BN); the second convolution module comprises a convolutional layer (Conv), batch normalization (BN) and a ReLU activation function in sequence. The first to fourth bottleneck channels are arranged in parallel: from the first channel to the fourth, the number of bottleneck layers (Bottleneck) per channel decreases, the output feature maps of the second to fourth channels are successively smaller than that of the first, and the number of channels of the feature map finally output by each bottleneck path increases with the number of layers. The semantic information features and spatial information features extracted by the four bottleneck channels are concatenated by the first concatenation (concat) operation module. In the design of the feature encoding module provided by this embodiment, the traditional serial convolution scheme is replaced by a parallel scheme that extracts semantic and spatial information features simultaneously. The bottleneck layers are arranged as four parallel paths when the network is designed: spatial information features are retained because the feature map size does not change along each path; the combination of multi-scale feature maps is achieved because the feature map sizes of the channels differ; and because the feature map size of each successive path is gradually reduced, semantic information features are extracted at the top layer of each path.
As a preferred embodiment, referring to fig. 2, the convolution kernel size of the convolution layers is 3 × 3 with a stride of 2, so that the first and second convolution modules shrink the feature map of the input image and reduce the amount of computation.
As a preferred embodiment, please refer to fig. 2, the number of the bottleneck layers in the first to fourth bottleneck paths is 4, 3, 2, 1, respectively, the sizes of the feature maps output by the second to fourth bottleneck paths are 1/2, 1/4, 1/8, respectively, compared to the first bottleneck path, and the number of channels of the output feature maps in the first to fourth bottleneck paths is 128, 256, 512, 1024, respectively, thereby better extracting the semantic information features and the spatial information features at the same time.
As a specific embodiment, referring to fig. 3, each bottleneck layer includes three convolution units, an addition unit (Add) and a ReLU activation function unit arranged in sequence; each convolution unit includes a convolution kernel (Conv2D), batch normalization (BN) and a ReLU activation function arranged in sequence. The addition unit is also skip-connected to the feature map that is input to the first convolution unit, so that with the skip connection and the ReLU activation added to the convolution layers, the network can learn to select its own path through the convolutional neural network, further improving accuracy.
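The bottleneck layer just described can be sketched roughly as follows. The Conv-BN-ReLU units, the Add of the block input (skip connection) and the final ReLU follow the text; the 1x1-3x3-1x1 layout and the channel widths are assumptions.

```python
# Hedged sketch of one bottleneck layer: three Conv2D-BN-ReLU units,
# an Add of the block input (skip connection), then a final ReLU.
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        def unit(c_in, c_out, k):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, k, padding=k // 2, bias=False),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
            )
        self.body = nn.Sequential(
            unit(channels, channels // 2, 1),       # reduce channels
            unit(channels // 2, channels // 2, 3),  # spatial mixing
            unit(channels // 2, channels, 1),       # restore channels
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + x)  # Add the skip, then ReLU
```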
Specifically, for the semantic information features, the invention designs a dedicated Semantic Information Extraction Attention Module (SIEAM) for this task. As a specific embodiment, referring to fig. 1 and 4, the semantic information extraction attention module includes a first channel attention module, a second channel attention module, a global pooling module, a multiplication operation module and a second splicing operation module. The first and second channel attention modules are arranged in parallel, and each includes: a global average pooling that captures context semantic feature information in the input feature map; a convolution (Conv2D) that computes semantic information weights; batch normalization (BN) and a Sigmoid activation function that refine the semantic information after the convolution; and a multiplication (Mul) operation that multiplies the refined semantic information with the input feature map, so that the refined map acts as a weight on the input and accomplishes the task of sharpening the semantic information. The multiplication (Mul) operation module multiplies the feature map output by the second channel attention module with the output feature map processed by the global pooling module, and the second splicing (concat) operation module splices the feature map output by the first channel attention module with the output of the multiplication operation module. The input feature maps of the two channel attention modules are taken from the semantic information features extracted by the feature encoding module: as shown in fig. 2, the leftmost bottleneck layer (Bottleneck) and the upper bottleneck layer second from the left are rich in semantic information features, so the two channel attention modules of the SIEAM are connected to these two bottleneck layers in one-to-one correspondence: the leftmost bottleneck layer connects to the second channel attention module, and the second-from-left upper bottleneck layer connects to the first channel attention module. The semantic information features extracted by the two bottleneck layers thus serve as the input feature maps of the two channel attention modules; after being refined by the semantic information extraction attention module, they are sent to the feature fusion pooling attention module for integration. In this way the SIEAM integrates a large amount of global context semantic information at only a small additional computational cost.
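One channel-attention branch of the SIEAM can be sketched as below. The sequence of global average pooling, weight convolution, BN, Sigmoid and the Mul against the input follows the text; the exact layer sizes are assumptions.

```python
# Hedged sketch of one SIEAM channel-attention branch: global average
# pooling -> 1x1 convolution (semantic weights) -> BN -> Sigmoid, and
# a Mul that applies the refined weights to the input feature map.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.weight = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),           # capture global context
            nn.Conv2d(channels, channels, 1),  # semantic-weight conv
            nn.BatchNorm2d(channels),
            nn.Sigmoid(),                      # refine into (0, 1)
        )

    def forward(self, x):
        return x * self.weight(x)  # Mul: weights rescale the input map
```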
Specifically, the design background of the feature fusion pooling attention module is as follows. Although the feature encoding module fully extracts the spatial information of the image features, and the semantic information extraction attention module extracts refined semantic information, the two kinds of information are not yet matched, and a module is needed to integrate them properly rather than fusing them crudely. The invention therefore provides a Feature Fusion Pooling Attention Module (FFPAM): semantic information features and spatial information features are fused by the FFPAM and applied to the feature map as attention information, so that context semantic information and spatial information are fully fused and segmentation precision is improved.
As a specific embodiment, referring to fig. 5, the feature fusion pooling attention module includes a third convolution module (a Conv2D-BN-ReLU unit) that extracts the mixed features of the fused semantic and spatial information while converting the number of channels, an average pooling path, a maximum pooling path, and a two-path pooling multiplication operation module. The average pooling path and the maximum pooling path are arranged in parallel and each processes the features extracted by the third convolution module; the two-path pooling multiplication operation module multiplies the two processed feature maps to form an attention feature map. By fusing the spatial and semantic information features through the two parallel pooling paths, the invention enlarges the receptive field of the model and strengthens its feature extraction capability, and the attention feature map formed by multiplying the two paths carries the characteristics of both average pooling and maximum pooling. This attention map is multiplied with the input feature map and superimposed on it as a weight, and finally a skip-connection structure as in ResNet is used, which reduces any negative influence of the attention module on the input feature map before the final feature map is output.
The feature fusion pooling attention module of this embodiment successfully combines context semantic information and image spatial information through the multiplication of the two routes, yielding higher precision. To verify the effectiveness of average pooling and maximum pooling, the invention tested five configurations: single-path maximum pooling, single-path average pooling, two-path pooling addition, two-path pooling concatenation, and two-path pooling multiplication. Experiments confirm that two-path pooling multiplication indeed gives the best precision; this module alone improves the Dice similarity score by 2.71%.
As a preferred embodiment, referring to fig. 5, the average pooling path processes features as the first feature extraction path using two serially connected average pooling modules (each an AvgPool-Conv2D-ReLU unit): the output of the ReLU activation in the second average pooling module is multiplied with the input feature map of the path, and the resulting product is added to that input feature map to give the final output of the path. The maximum pooling path processes features as the second feature extraction path using two serially connected maximum pooling modules (each a MaxPool-Conv2D-ReLU unit) in the same way: the output of the ReLU activation in the second maximum pooling module is multiplied with the input feature map of the path, and the product is added to that input to give the final output of the path. Finally, the outputs of the two paths are multiplied with the features extracted by the third convolution module (i.e. the output of its ReLU activation), the product is added (Add) to the features extracted by the third convolution module, and the result passes through a ReLU activation function to form the attention feature map.
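The two pooling paths and their combination can be sketched as below. The two serial (pool -> Conv2D -> ReLU) units per path, the multiply-then-add within each path, and the final two-path multiply -> Add -> ReLU follow the text; stride-1 pooling with padding is an assumption made so the element-wise Mul and Add against the path input stay shape-compatible.

```python
# Hedged sketch of the FFPAM pooling paths (assumed stride-1 pooling).
import torch
import torch.nn as nn

def pool_unit(channels, pool_cls):
    return nn.Sequential(
        pool_cls(kernel_size=3, stride=1, padding=1),  # size-preserving
        nn.Conv2d(channels, channels, 1),
        nn.ReLU(inplace=True),
    )

class PoolingPath(nn.Module):
    def __init__(self, channels, pool_cls):
        super().__init__()
        self.units = nn.Sequential(pool_unit(channels, pool_cls),
                                   pool_unit(channels, pool_cls))

    def forward(self, x):
        return x * self.units(x) + x  # weight the input, residual add

def ffpam_attention(feat):
    # feat: features from the third convolution module
    c = feat.shape[1]
    avg = PoolingPath(c, nn.AvgPool2d)(feat)
    mx = PoolingPath(c, nn.MaxPool2d)(feat)
    return torch.relu(avg * mx * feat + feat)  # multiply, Add, ReLU
```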
As a specific embodiment, referring to fig. 6, the feature map decoding module includes, arranged in sequence, a first upsampling module (Upsample), a fourth convolution module (a Conv-BN-ReLU unit), a second upsampling module (Upsample), a fifth convolution module (a Conv-BN-ReLU unit), and a sixth convolution module (a Conv-BN-ReLU unit). The feature maps output by the first upsampling module and the fourth convolution module have the same size (e.g. 96, 128), and the feature maps output by the second upsampling module, the fifth convolution module and the sixth convolution module (e.g. 192, 256) all have the same size as the input image. In this embodiment the three convolution modules refine the upsampled information, refining the segmentation result step by step and ultimately improving precision.
As a specific embodiment, the sampling coefficient of the first and second upsampling modules is 2. Specifically, the existing bilinear interpolation method can be used, i.e. 2x upsampling by bilinear interpolation, while the following convolution module refines away part of the spatial information loss that bilinear upsampling introduces, thereby reducing the spatial information lost in sampling.
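One decoder step, as just described, can be sketched as a 2x bilinear upsample followed by a refining Conv-BN-ReLU module; the channel counts in the sketch are assumptions.

```python
# Hedged sketch of one decoder step: 2x bilinear upsampling followed
# by a Conv-BN-ReLU module that refines the interpolation result.
import torch
import torch.nn as nn
import torch.nn.functional as F

class UpRefine(nn.Module):
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, padding=1, bias=False),
            nn.BatchNorm2d(c_out),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        x = F.interpolate(x, scale_factor=2, mode="bilinear",
                          align_corners=False)  # 2x bilinear upsample
        return self.refine(x)  # convolution refines interpolation loss
```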
When designing the CT image (e.g. pancreas image) segmentation system model provided by the present invention, a data set must first be prepared and preprocessed into the input required by the model, which also improves the robustness of the model. Specifically, the data preprocessing includes processing each slice and clipping all pixel values greater than 240 to 240 and all values less than -100 to -100, according to the formulas:
imagePixel[imagePixel < low_range] = low_range
imagePixel[imagePixel > high_range] = high_range
wherein imagePixel is an image pixel, low_range is -100, and high_range is 240. Each slice is then normalized so that its pixel intensities are mapped into (-1, 1).
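The clipping and normalisation above can be written as a NumPy sketch. The (-100, 240) window comes from the text; mapping that window linearly onto (-1, 1) is an assumption about the exact normalisation used.

```python
# Hedged preprocessing sketch: clip to the stated window, then map
# the window linearly onto [-1, 1].
import numpy as np

LOW_RANGE, HIGH_RANGE = -100.0, 240.0

def preprocess(slice_hu: np.ndarray) -> np.ndarray:
    clipped = np.clip(slice_hu, LOW_RANGE, HIGH_RANGE)  # window pixels
    # map [LOW_RANGE, HIGH_RANGE] linearly onto [-1, 1]
    return 2.0 * (clipped - LOW_RANGE) / (HIGH_RANGE - LOW_RANGE) - 1.0
```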
The data set preparation includes: adopting the NIH pancreas segmentation dataset and using 4-fold cross-validation, the data set is divided into three parts, namely a training set, a validation set and a test set. The training and validation sets together contain 62 samples, and the test set contains 20 samples. During training an Adam optimizer is used, with the initial learning rate set to 10^-5; every 10 epochs (one epoch being one pass over all samples in the training set) the learning rate decays by a factor of 0.2, for a total of 100 epochs in the experiment. The results show that training on the medical images from scratch achieves better performance and shorter training time than fine-tuning a model pre-trained on natural images.
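The stated step-decay schedule can be written as a small pure-Python sketch; reading "decayed by 0.2" as a multiplicative factor applied every 10 epochs is an assumption, matching common step-decay conventions.

```python
# Hedged sketch of the schedule: lr = 1e-5 * 0.2 ** (epoch // 10).
def learning_rate(epoch: int, base_lr: float = 1e-5,
                  gamma: float = 0.2, step: int = 10) -> float:
    # the rate is constant within each 10-epoch window
    return base_lr * gamma ** (epoch // step)
```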
Compared with the prior art, the CT image segmentation system based on the attention convolution neural network has the following advantages:
First, in the feature encoding module, an experiment was run against FCN as the backbone baseline. With strategies such as learning-rate decay, parameter initialization, input regularization and overfitting prevention, and the training process repeated for 100 epochs, the object segmentation in the scheme images of the proposed system reaches a high Dice value. Because semantic information and spatial information are considered simultaneously, convergence is very fast and the loss value is lower than the baseline FCN, which is likewise reflected in a Dice value higher than FCN's.
Second, the cross-parallel network used in the invention learns more features than FCN when extracting image information. As shown in table 1 below, while its parameter count is far smaller than that of FCN with a VGG16 backbone, the network of the invention scores much higher than FCN in precision, recall and Dice score, which proves the effectiveness of the feature encoding module used in the invention.
TABLE 1
Model | Average dice% | Maximum dice% | Minimum dice% | Precision | Recall | Parameters |
FCN | 69.02±6.3 | 76.14 | 49.48 | 0.7092 | 0.6754 | 134.3M |
FEM | 78.93±5.6 | 86.54 | 65.15 | 0.8339 | 0.7543 | 16.15M |
Third, for the feature fusion pooling attention module, as shown in table 2 below, experiments were run first with a single path and then with the average pooling path multiplied by the maximum pooling path. The multiplication greatly improves every index, all of which are higher than in the preceding configuration; multiplying the results of the two paths successfully combines the context semantic information with the image spatial information and brings high precision.
TABLE 2
Fourth, as shown in table 3 below, the framework used in the invention achieves a large increase in Dice value while its parameter count is much smaller than those of the current typical networks FCN and U-Net.
TABLE 3
Model | Backbone | Dice% | Parameters |
FCN | VGG16 | 80.3 | 134.3M |
U-Net | VGG16 | 79.7 | 23.3M |
BiSeNet | XceptionV1 | 82.8 | 44.8M |
Framework used in this system | FEM | 86.6 | 18.9M |
Fifth, as shown in table 4 below, the invention is compared with current typical networks to observe each model's adaptability to the pancreatic CT dataset. Of the 82 available samples, most models use a 62/20 training/test split, with #Folds being the number of folds for cross-validation; it can be seen that the system model of the invention scores higher than these current typical models.
TABLE 4
Sixth, the ablation experiment for each module uses the same 20 samples as the test set as in the previous experiments, and the precision, recall and Dice value are then measured on those 20 samples. As shown in table 5 below, Base + Decoder + ARM + GAM is much higher than the others in recall and Dice value, with only its precision slightly lower than that of Base + Decoder + ARM, which verifies the validity of stacking all the modules.
TABLE 5
Model | Average dice% | Maximum dice% | Minimum dice% | Precision | Recall | Parameters |
FCN (baseline) | 69.02±6.3 | 76.14 | 49.48 | 0.7092 | 0.6754 | 134.3M |
FEM+FDM | 82.81±4.2 | 88.54 | 74.07 | 0.8477 | 0.8115 | 16.15M |
FEM+FDM+SIEAM | 83.91±4.4 | 89.70 | 73.89 | 0.8726 | 0.8106 | 18.96M |
FEM+FDM+SIEAM+FFPAM | 86.62±3.6 | 91.31 | 78.91 | 0.8607 | 0.8737 | 19.8M |
Referring to fig. 8, row 1 shows the images before segmentation, row 2 the ground-truth labels (GT), row 3 the test results of FCN segmentation, row 4 the test results of U-Net segmentation, row 5 the test results of FEM + FDM segmentation, and row 6 the test results of FEM + FDM + SIEAM + FFPAM, i.e. the final algorithm. As the figure shows, because FCN directly upsamples the small segmented feature map with a transposed convolution, its results lack edge smoothness and present a mosaic-like segmentation. U-Net, with its gentler upsampling, smooths away FCN's hard edges well, but it generates many small extra fragments in the detailed segmentation, although no such small fragments appear on the 2nd, 3rd and 4th segmentation prediction maps of row 4. In row 5, the FEM + FDM used by the invention effectively preserves the spatial and semantic information of the image, so the fragments produced during U-Net's segmentation are effectively reduced and the whole picture becomes clean; in the detailed segmentation, however, some shortcomings remain: for example, the folds of the pancreas are not effectively segmented, in one case the pancreas is not segmented at all, and in another too much of the pancreatic area is segmented. On this basis the invention adds the two attention modules, focusing on resolving these detail defects. In row 6, the final model used by the invention effectively resolves the fragmentation around the segmented target, is more complete in the detail regions than FEM + FDM, and is closer to GT overall.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such modifications should be covered by the claims of the present invention.
Claims (10)
1. A CT image segmentation system based on an attention convolutional neural network, characterized by comprising a feature encoding module, a semantic information extraction attention module, a feature fusion pooling attention module and a feature map decoding module; the feature encoding module gradually reduces the size of the feature map of an input image using a parallel convolutional neural network, and extracts semantic information features and spatial information features of the image simultaneously through network layer multiplexing and the interception and fusion of the features of each layer; the semantic information extraction attention module generates attention features using pooling, and further refines the semantic information features extracted by the feature encoding module; the feature fusion pooling attention module uses maximum pooling and average pooling connected in parallel, and combines the semantic information features refined by the semantic information extraction attention module with the semantic and spatial information features spliced by the feature encoding module to form an attention feature map; and the feature map decoding module gradually and finely restores the attention feature map fused by the feature fusion pooling attention module to the size of the input image using convolution modules and upsampling modules.
2. The CT image segmentation system based on the attention convolutional neural network according to claim 1, characterized in that the feature encoding module comprises a first convolution module, a second convolution module, first to fourth bottleneck paths and a first splicing operation module; the first convolution module comprises a convolution layer and batch regularization arranged in sequence; the second convolution module comprises a convolution layer, batch regularization and a ReLu activation function arranged in sequence; the first to fourth bottleneck paths are arranged in parallel, the number of bottleneck layers in each path decreasing from the first to the fourth bottleneck path, the sizes of the feature maps output by the second to fourth bottleneck paths decreasing successively compared with the first bottleneck path, and the number of channels of the feature map finally output by each bottleneck path increasing with the number of layers; and the first splicing operation module splices the semantic information features and spatial information features extracted by the four bottleneck paths.
3. The attention convolution neural network-based CT image segmentation system of claim 2, wherein the convolution kernel size of the convolution layer is 3 × 3 with a step size of 2.
4. The attention convolution neural network-based CT image segmentation system according to claim 2, wherein the number of bottleneck layers in the first to fourth bottleneck paths is 4, 3, 2, 1, respectively, and the sizes of feature maps output by the second to fourth bottleneck paths compared with the first bottleneck path are 1/2, 1/4, 1/8, respectively, and the number of channels of output feature maps in the first to fourth bottleneck paths is 128, 256, 512 and 1024, respectively.
5. The attention convolution neural network-based CT image segmentation system according to claim 2, wherein each bottleneck layer comprises three convolution units, an addition unit and a ReLu activation function unit which are sequentially arranged, each convolution unit comprises a convolution kernel, a batch regularization and a ReLu activation function which are sequentially arranged, and the addition unit is also in jump connection with a feature map in the convolution kernel input to the first convolution unit.
6. The CT image segmentation system based on the attention convolution neural network according to claim 1, characterized in that the semantic information extraction attention module comprises a first channel attention module, a second channel attention module, a global pooling module, a multiplication operation module and a second splicing operation module; the first channel attention module and the second channel attention module are arranged in parallel, and each of the channel attention modules comprises a global average pooling for capturing context semantic feature information in an input feature map, a convolution for calculating semantic information weights, a batch regularization and Sigmoid activation function for refining semantic information extraction, and a multiplication operation for multiplying the refined semantic information with the input feature map; the multiplication operation module is used for multiplying the feature map output by the second channel attention module with an output feature map processed by the global pooling module; the second splicing operation module is used for splicing the feature map output by the first channel attention module and the output feature map of the multiplication operation module; and the input feature maps of the two channel attention modules are obtained from the semantic information features extracted by the feature encoding module.
7. The CT image segmentation system based on the attention convolutional neural network of claim 1, wherein the feature fusion pooling attention module comprises a third convolution module, an average pooling path, a maximum pooling path and a two-way pooling multiplication operation module, the third convolution module is used for extracting mixed information features of the fused semantic information features and spatial information features and simultaneously converting channels of information, the average pooling path and the maximum pooling path are arranged in parallel and are respectively used for processing the features extracted by the third convolution module, and the two-way pooling multiplication operation module is used for multiplying the two paths of features processed by the average pooling path and the maximum pooling path to form an attention feature map.
8. The attention convolution neural network-based CT image segmentation system of claim 7, wherein the average pooling pass uses two serially connected average pooling modules to process features as a first pass of feature extraction and the max pooling pass uses two serially connected max pooling modules to process features as a second pass of feature extraction.
9. The CT image segmentation system based on the attention convolution neural network according to claim 1, characterized in that the feature map decoding module comprises a first upsampling module, a fourth convolution module, a second upsampling module, a fifth convolution module and a sixth convolution module arranged in sequence; the feature maps output by the first upsampling module and the fourth convolution module have the same size, and the feature maps output by the second upsampling module, the fifth convolution module and the sixth convolution module all have the same size as the input image.
10. The attention convolution neural network-based CT image segmentation system of claim 9, wherein a sampling coefficient of the first and second upsampling modules is 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010190946.4A CN111325751B (en) | 2020-03-18 | 2020-03-18 | CT image segmentation system based on attention convolution neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010190946.4A CN111325751B (en) | 2020-03-18 | 2020-03-18 | CT image segmentation system based on attention convolution neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111325751A true CN111325751A (en) | 2020-06-23 |
CN111325751B CN111325751B (en) | 2022-05-27 |
Family
ID=71171544
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010190946.4A Expired - Fee Related CN111325751B (en) | 2020-03-18 | 2020-03-18 | CT image segmentation system based on attention convolution neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111325751B (en) |
Cited By (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111798428A (en) * | 2020-07-03 | 2020-10-20 | 南京信息工程大学 | Automatic segmentation method for multiple tissues of skin pathological image |
CN111914947A (en) * | 2020-08-20 | 2020-11-10 | 华侨大学 | Image instance segmentation method, device and equipment based on feature fusion and storage medium |
CN112085741A (en) * | 2020-09-04 | 2020-12-15 | 厦门大学 | Stomach cancer pathological section segmentation algorithm based on deep learning |
CN112085760A (en) * | 2020-09-04 | 2020-12-15 | 厦门大学 | Prospect segmentation method of laparoscopic surgery video |
CN112084911A (en) * | 2020-08-28 | 2020-12-15 | 安徽清新互联信息科技有限公司 | Human face feature point positioning method and system based on global attention |
CN112287940A (en) * | 2020-10-30 | 2021-01-29 | 西安工程大学 | Semantic segmentation method of attention mechanism based on deep learning |
CN112365480A (en) * | 2020-11-13 | 2021-02-12 | 哈尔滨市科佳通用机电股份有限公司 | Brake pad loss fault identification method for brake clamp device |
CN112446891A (en) * | 2020-10-23 | 2021-03-05 | 浙江工业大学 | Medical image segmentation method based on U-Net network brain glioma |
CN112509052A (en) * | 2020-12-22 | 2021-03-16 | 苏州超云生命智能产业研究院有限公司 | Method and device for detecting fovea maculata, computer equipment and storage medium |
CN112580654A (en) * | 2020-12-25 | 2021-03-30 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | Semantic segmentation method for ground objects of remote sensing image |
CN112598650A (en) * | 2020-12-24 | 2021-04-02 | 苏州大学 | Combined segmentation method for optic cup optic disk in fundus medical image |
CN112767502A (en) * | 2021-01-08 | 2021-05-07 | 广东中科天机医疗装备有限公司 | Image processing method and device based on medical image model |
CN112927255A (en) * | 2021-02-22 | 2021-06-08 | 武汉科技大学 | Three-dimensional liver image semantic segmentation method based on context attention strategy |
CN113033572A (en) * | 2021-04-23 | 2021-06-25 | 上海海事大学 | Obstacle segmentation network based on USV and generation method thereof |
CN113065412A (en) * | 2021-03-12 | 2021-07-02 | 武汉大学 | Improved Deeplabv3+ based aerial image electromagnetic medium semantic recognition method and device |
CN113112465A (en) * | 2021-03-31 | 2021-07-13 | 上海深至信息科技有限公司 | System and method for generating carotid intima-media segmentation model |
CN113129321A (en) * | 2021-04-20 | 2021-07-16 | 重庆邮电大学 | Turbine blade CT image segmentation method based on full convolution neural network |
CN113158802A (en) * | 2021-03-22 | 2021-07-23 | 安徽理工大学 | Smart scene segmentation technique |
CN113298174A (en) * | 2021-06-10 | 2021-08-24 | 东南大学 | Semantic segmentation model improvement method based on progressive feature fusion |
CN113298825A (en) * | 2021-06-09 | 2021-08-24 | 东北大学 | Image segmentation method based on MSF-Net network |
CN113361537A (en) * | 2021-07-23 | 2021-09-07 | 人民网股份有限公司 | Image semantic segmentation method and device based on channel attention |
CN113378791A (en) * | 2021-07-09 | 2021-09-10 | 合肥工业大学 | Cervical cell classification method based on double-attention mechanism and multi-scale feature fusion |
CN113436094A (en) * | 2021-06-24 | 2021-09-24 | 湖南大学 | Gray level image automatic coloring method based on multi-view attention mechanism |
CN113610164A (en) * | 2021-08-10 | 2021-11-05 | 北京邮电大学 | Fine-grained image recognition method and system based on attention balance |
CN113689326A (en) * | 2021-08-06 | 2021-11-23 | 西南科技大学 | Three-dimensional positioning method based on two-dimensional image segmentation guidance |
CN113689434A (en) * | 2021-07-14 | 2021-11-23 | 淮阴工学院 | Image semantic segmentation method based on strip pooling |
CN113744279A (en) * | 2021-06-09 | 2021-12-03 | 东北大学 | Image segmentation method based on FAF-Net network |
CN114038037A (en) * | 2021-11-09 | 2022-02-11 | 合肥工业大学 | Expression label correction and identification method based on separable residual attention network |
CN114049315A (en) * | 2021-10-29 | 2022-02-15 | 北京长木谷医疗科技有限公司 | Joint recognition method, electronic device, storage medium, and computer program product |
CN114638256A (en) * | 2022-02-22 | 2022-06-17 | 合肥华威自动化有限公司 | Transformer fault detection method and system based on sound wave signals and attention network |
CN116229065A (en) * | 2023-02-14 | 2023-06-06 | 湖南大学 | Multi-branch fusion-based robotic surgical instrument segmentation method |
CN116630626A (en) * | 2023-06-05 | 2023-08-22 | 吉林农业科技学院 | Connected double-attention multi-scale fusion semantic segmentation network |
CN116630626B (en) * | 2023-06-05 | 2024-04-26 | 吉林农业科技学院 | Connected double-attention multi-scale fusion semantic segmentation network |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170124432A1 (en) * | 2015-11-03 | 2017-05-04 | Baidu Usa Llc | Systems and methods for attention-based configurable convolutional neural networks (abc-cnn) for visual question answering |
CN107506774A (en) * | 2017-10-09 | 2017-12-22 | 深圳市唯特视科技有限公司 | A kind of segmentation layered perception neural networks method based on local attention mask |
US20180144208A1 (en) * | 2016-11-18 | 2018-05-24 | Salesforce.Com, Inc. | Adaptive attention model for image captioning |
CN110211127A (en) * | 2019-08-01 | 2019-09-06 | 成都考拉悠然科技有限公司 | Image segmentation method based on bicoherence network |
US10482603B1 (en) * | 2019-06-25 | 2019-11-19 | Artificial Intelligence, Ltd. | Medical image segmentation using an integrated edge guidance module and object segmentation network |
CN110490891A (en) * | 2019-08-23 | 2019-11-22 | 杭州依图医疗技术有限公司 | Method, device and computer-readable storage medium for segmenting an object of interest in an image |
CN110490813A (en) * | 2019-07-05 | 2019-11-22 | 特斯联(北京)科技有限公司 | Feature map enhancement method, apparatus, device and medium for convolutional neural networks |
CN110532955A (en) * | 2019-08-30 | 2019-12-03 | 中国科学院宁波材料技术与工程研究所 | Instance segmentation method and device based on feature attention and sub-upsampling |
US20200027211A1 (en) * | 2018-07-17 | 2020-01-23 | International Business Machines Corporation | Knockout Autoencoder for Detecting Anomalies in Biomedical Images |
US20200065969A1 (en) * | 2018-08-27 | 2020-02-27 | Siemens Healthcare Gmbh | Medical image segmentation from raw data using a deep attention neural network |
- 2020-03-18: CN application CN202010190946.4A filed; granted as CN111325751B; status: not active (Expired - Fee Related)
Non-Patent Citations (3)
Title |
---|
Changqian Yu et al., "BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation", arXiv:1808.00897 * |
Yong An, Jianwu Long, "A segmentation network with multi-attention and its application to SAR image analysis", IEEJ Transactions on Electrical and Electronic Engineering * |
Tao Yongcai et al., "A news text classification method combining pooling and attention", Journal of Chinese Computer Systems * |
Cited By (53)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111798428B (en) * | 2020-07-03 | 2023-05-30 | 南京信息工程大学 | Automatic segmentation method for multiple tissues of skin pathology image |
CN111798428A (en) * | 2020-07-03 | 2020-10-20 | 南京信息工程大学 | Automatic segmentation method for multiple tissues of skin pathological image |
CN111914947B (en) * | 2020-08-20 | 2024-04-16 | 华侨大学 | Image instance segmentation method, device, equipment and storage medium based on feature fusion |
CN111914947A (en) * | 2020-08-20 | 2020-11-10 | 华侨大学 | Image instance segmentation method, device and equipment based on feature fusion and storage medium |
CN112084911A (en) * | 2020-08-28 | 2020-12-15 | 安徽清新互联信息科技有限公司 | Human face feature point positioning method and system based on global attention |
CN112084911B (en) * | 2020-08-28 | 2023-03-07 | 安徽清新互联信息科技有限公司 | Human face feature point positioning method and system based on global attention |
CN112085760A (en) * | 2020-09-04 | 2020-12-15 | 厦门大学 | Foreground segmentation method for laparoscopic surgery video |
CN112085741A (en) * | 2020-09-04 | 2020-12-15 | 厦门大学 | Gastric cancer pathological section segmentation algorithm based on deep learning |
CN112085741B (en) * | 2020-09-04 | 2024-03-26 | 厦门大学 | Gastric cancer pathological section segmentation algorithm based on deep learning |
CN112085760B (en) * | 2020-09-04 | 2024-04-26 | 厦门大学 | Foreground segmentation method for laparoscopic surgery video |
CN112446891A (en) * | 2020-10-23 | 2021-03-05 | 浙江工业大学 | Brain glioma medical image segmentation method based on U-Net network |
CN112446891B (en) * | 2020-10-23 | 2024-04-02 | 浙江工业大学 | Brain glioma medical image segmentation method based on U-Net network |
CN112287940A (en) * | 2020-10-30 | 2021-01-29 | 西安工程大学 | Attention-mechanism semantic segmentation method based on deep learning |
CN112365480A (en) * | 2020-11-13 | 2021-02-12 | 哈尔滨市科佳通用机电股份有限公司 | Brake pad loss fault identification method for brake clamp device |
CN112509052B (en) * | 2020-12-22 | 2024-04-23 | 苏州超云生命智能产业研究院有限公司 | Method, device, computer equipment and storage medium for detecting the macular fovea |
CN112509052A (en) * | 2020-12-22 | 2021-03-16 | 苏州超云生命智能产业研究院有限公司 | Method and device for detecting the macular fovea, computer equipment and storage medium |
CN112598650A (en) * | 2020-12-24 | 2021-04-02 | 苏州大学 | Joint segmentation method for optic cup and optic disc in fundus medical images |
CN112580654A (en) * | 2020-12-25 | 2021-03-30 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | Semantic segmentation method for ground objects of remote sensing image |
CN112767502B (en) * | 2021-01-08 | 2023-04-07 | 广东中科天机医疗装备有限公司 | Image processing method and device based on medical image model |
CN112767502A (en) * | 2021-01-08 | 2021-05-07 | 广东中科天机医疗装备有限公司 | Image processing method and device based on medical image model |
CN112927255B (en) * | 2021-02-22 | 2022-06-21 | 武汉科技大学 | Three-dimensional liver image semantic segmentation method based on context attention strategy |
CN112927255A (en) * | 2021-02-22 | 2021-06-08 | 武汉科技大学 | Three-dimensional liver image semantic segmentation method based on context attention strategy |
CN113065412A (en) * | 2021-03-12 | 2021-07-02 | 武汉大学 | Improved Deeplabv3+ based aerial image electromagnetic medium semantic recognition method and device |
CN113158802A (en) * | 2021-03-22 | 2021-07-23 | 安徽理工大学 | Smart scene segmentation technique |
CN113112465A (en) * | 2021-03-31 | 2021-07-13 | 上海深至信息科技有限公司 | System and method for generating carotid intima-media segmentation model |
CN113129321A (en) * | 2021-04-20 | 2021-07-16 | 重庆邮电大学 | Turbine blade CT image segmentation method based on full convolution neural network |
CN113033572B (en) * | 2021-04-23 | 2024-04-05 | 上海海事大学 | Obstacle segmentation network based on USV and generation method thereof |
CN113033572A (en) * | 2021-04-23 | 2021-06-25 | 上海海事大学 | Obstacle segmentation network based on USV and generation method thereof |
CN113744279A (en) * | 2021-06-09 | 2021-12-03 | 东北大学 | Image segmentation method based on FAF-Net network |
CN113744279B (en) * | 2021-06-09 | 2023-11-14 | 东北大学 | Image segmentation method based on FAF-Net network |
CN113298825B (en) * | 2021-06-09 | 2023-11-14 | 东北大学 | Image segmentation method based on MSF-Net network |
CN113298825A (en) * | 2021-06-09 | 2021-08-24 | 东北大学 | Image segmentation method based on MSF-Net network |
CN113298174B (en) * | 2021-06-10 | 2022-04-29 | 东南大学 | Semantic segmentation model improvement method based on progressive feature fusion |
CN113298174A (en) * | 2021-06-10 | 2021-08-24 | 东南大学 | Semantic segmentation model improvement method based on progressive feature fusion |
CN113436094A (en) * | 2021-06-24 | 2021-09-24 | 湖南大学 | Gray level image automatic coloring method based on multi-view attention mechanism |
CN113378791A (en) * | 2021-07-09 | 2021-09-10 | 合肥工业大学 | Cervical cell classification method based on double-attention mechanism and multi-scale feature fusion |
CN113378791B (en) * | 2021-07-09 | 2022-08-05 | 合肥工业大学 | Cervical cell classification method based on double-attention mechanism and multi-scale feature fusion |
CN113689434A (en) * | 2021-07-14 | 2021-11-23 | 淮阴工学院 | Image semantic segmentation method based on strip pooling |
CN113689434B (en) * | 2021-07-14 | 2022-05-27 | 淮阴工学院 | Image semantic segmentation method based on strip pooling |
CN113361537B (en) * | 2021-07-23 | 2022-05-10 | 人民网股份有限公司 | Image semantic segmentation method and device based on channel attention |
CN113361537A (en) * | 2021-07-23 | 2021-09-07 | 人民网股份有限公司 | Image semantic segmentation method and device based on channel attention |
CN113689326B (en) * | 2021-08-06 | 2023-08-04 | 西南科技大学 | Three-dimensional positioning method based on two-dimensional image segmentation guidance |
CN113689326A (en) * | 2021-08-06 | 2021-11-23 | 西南科技大学 | Three-dimensional positioning method based on two-dimensional image segmentation guidance |
CN113610164A (en) * | 2021-08-10 | 2021-11-05 | 北京邮电大学 | Fine-grained image recognition method and system based on attention balance |
CN113610164B (en) * | 2021-08-10 | 2023-12-22 | 北京邮电大学 | Fine granularity image recognition method and system based on attention balance |
CN114049315A (en) * | 2021-10-29 | 2022-02-15 | 北京长木谷医疗科技有限公司 | Joint recognition method, electronic device, storage medium, and computer program product |
CN114038037B (en) * | 2021-11-09 | 2024-02-13 | 合肥工业大学 | Expression label correction and identification method based on separable residual attention network |
CN114038037A (en) * | 2021-11-09 | 2022-02-11 | 合肥工业大学 | Expression label correction and identification method based on separable residual attention network |
CN114638256A (en) * | 2022-02-22 | 2022-06-17 | 合肥华威自动化有限公司 | Transformer fault detection method and system based on sound wave signals and attention network |
CN116229065A (en) * | 2023-02-14 | 2023-06-06 | 湖南大学 | Multi-branch fusion-based robotic surgical instrument segmentation method |
CN116229065B (en) * | 2023-02-14 | 2023-12-01 | 湖南大学 | Multi-branch fusion-based robotic surgical instrument segmentation method |
CN116630626B (en) * | 2023-06-05 | 2024-04-26 | 吉林农业科技学院 | Connected double-attention multi-scale fusion semantic segmentation network |
CN116630626A (en) * | 2023-06-05 | 2023-08-22 | 吉林农业科技学院 | Connected double-attention multi-scale fusion semantic segmentation network |
Also Published As
Publication number | Publication date |
---|---|
CN111325751B (en) | 2022-05-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111325751B (en) | CT image segmentation system based on attention convolution neural network | |
Zhou et al. | GMNet: Graded-feature multilabel-learning network for RGB-thermal urban scene semantic segmentation | |
CN109241982B (en) | Target detection method based on deep and shallow layer convolutional neural network | |
CN111340814B (en) | RGB-D image semantic segmentation method based on multi-mode self-adaptive convolution | |
CN113642390B (en) | Street view image semantic segmentation method based on local attention network | |
CN111325165B (en) | Urban remote sensing image scene classification method considering spatial relationship information | |
CN111612008A (en) | Image segmentation method based on convolution network | |
CN110223304B (en) | Image segmentation method and device based on multipath aggregation and computer-readable storage medium | |
CN113807355A (en) | Image semantic segmentation method based on coding and decoding structure | |
CN111797841B (en) | Visual saliency detection method based on depth residual error network | |
CN116309648A (en) | Medical image segmentation model construction method based on multi-attention fusion | |
CN110866938B (en) | Full-automatic video moving object segmentation method | |
CN112149526B (en) | Lane line detection method and system based on long-distance information fusion | |
CN114119975A (en) | Language-guided cross-modal instance segmentation method | |
CN115620010A (en) | Semantic segmentation method for RGB-T bimodal feature fusion | |
CN113076957A (en) | RGB-D image saliency target detection method based on cross-modal feature fusion | |
CN113963170A (en) | RGBD image saliency detection method based on interactive feature fusion | |
CN116129289A (en) | Attention edge interaction optical remote sensing image saliency target detection method | |
CN114219824A (en) | Visible light-infrared target tracking method and system based on deep network | |
CN110599495B (en) | Image segmentation method based on semantic information mining | |
Chen et al. | Adaptive fusion network for RGB-D salient object detection | |
Zhang et al. | CSNet: a ConvNeXt-based Siamese network for RGB-D salient object detection | |
CN113870286A (en) | Foreground segmentation method based on multi-level feature and mask fusion | |
CN111612803B (en) | Vehicle image semantic segmentation method based on image definition | |
TWI809957B (en) | Object detection method and electronic apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20220527 |