CN115187621A - Automatic U-Net medical image contour extraction network integrating attention mechanism - Google Patents

Automatic U-Net medical image contour extraction network integrating attention mechanism

Info

Publication number
CN115187621A
Authority
CN
China
Prior art keywords
attention
module
feature
mlp
medical image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111184138.8A
Other languages
Chinese (zh)
Inventor
吕巨建
陈豪源
赵慧民
战荫伟
陈荣军
熊建斌
林凯瀚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Polytechnic Normal University
Original Assignee
Guangdong Polytechnic Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Polytechnic Normal University filed Critical Guangdong Polytechnic Normal University
Priority to CN202111184138.8A priority Critical patent/CN115187621A/en
Publication of CN115187621A publication Critical patent/CN115187621A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/181Segmentation; Edge detection involving edge growing; involving edge linking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20076Probabilistic image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a U-Net medical image contour automatic extraction network fused with an attention mechanism, comprising an RGB image input module whose output end is connected to the input end of a feature extraction module; the feature extraction module comprises a feature encoding module, a feature decoding module and an attention module. The output end of the feature extraction module is connected to the input end of an MLP; the attention module comprises spatial attention and channel attention and is used to suppress neurons in non-attention regions. The output layer of the MLP is set to 2 neurons, representing the foreground and background probabilities respectively, followed in turn by Softmax and Marching Squares. By fusing the attention module, the invention improves edge contour extraction accuracy, preliminarily solves the blurred-edge problem of traditional frameworks, and reduces interference from background noise, thereby essentially meeting the medical field's accuracy requirements for medical image contour extraction; it also simplifies the workflow of traditional frameworks, greatly saving the time and cost of obtaining the target model.

Description

U-Net Medical Image Contour Automatic Extraction Network Fused with an Attention Mechanism

Technical Field

The present invention relates to the technical field of medical image processing, and in particular to a U-Net medical image contour automatic extraction network fused with an attention mechanism.

Background Art

Medical images reflect the anatomical structures or functional tissues of the human body. Dividing a medical image into several disjoint regions according to some similarity feature, i.e., medical image segmentation, is the most important foundation of medical image analysis. Accurate, robust, and fast image segmentation is the most important step before subsequent stages such as quantitative analysis and 3D visualization, and it lays the most fundamental groundwork for important clinical applications such as image-guided surgery, radiotherapy planning, and treatment evaluation.

In recent years, with the development of deep neural networks in the field of medical image processing, deep learning has become the mainstream approach to medical image segmentation tasks. The practice of many researchers has demonstrated that segmentation methods based on deep learning have strong application potential in this field. Deep learning segmentation methods achieve medical image segmentation by classifying pixels. Unlike traditional pixel or superpixel classification methods that use hand-crafted features, deep learning methods automatically learn task-relevant features from medical images and classify pixels based on these features, thereby achieving end-to-end segmentation. Among them, U-Net is currently the most widely used framework in the field of medical image segmentation.

Prior art: in the U-Net network structure, the encoder part captures the detail and contour information of the image; the extracted features are then passed to the decoder part through skip connections; finally, the decoder part combines features at multiple scales to perform feature recovery. Owing to its U-shaped structure, U-Net can be trained on relatively few images and still yield a well-performing model. The U-Net network can be divided into a feature extraction network and a feature fusion network: the feature extraction network uses convolution and pooling layers to perform downsampling, while the feature fusion network performs upsampling, restoring the image resolution while the network gradually converges onto the target region. In the feature fusion stage, features extracted at the same level are fused again to avoid loss of detail.

Although U-Net-based medical image segmentation methods have achieved remarkable results, obtaining accurate segmentation results remains very difficult because of noise: most methods still suffer from blurred edges, neglected details, and the need for manual parameter tuning. To this end, we introduce a U-Net medical image contour automatic extraction network fused with an attention mechanism.

Summary of the Invention

The purpose of the present invention is to provide a U-Net medical image contour automatic extraction network fused with an attention mechanism, so as to solve the problems raised in the background art above.

To achieve the above object, the present invention provides the following technical solution: a U-Net medical image contour automatic extraction network fused with an attention mechanism, comprising an RGB image input module, the output of which is connected to the input of a feature extraction module; the feature extraction module comprises a feature encoding module, a feature decoding module, and an attention module.

The output of the feature extraction module is connected to the input of the MLP. The attention module comprises spatial attention and channel attention and is used to suppress neurons in non-attention regions.

The MLP is used to classify the extracted features. Its output layer is set to 2 neurons, representing the foreground and background probabilities respectively, and is followed in turn by Softmax and Marching Squares.

The RGB image input module is used to input an RGB image.

The feature extraction module is used to extract features from the RGB image. After the feature extraction module obtains these features, each pixel has a C-dimensional feature representation fusing local and global information, so only a single MLP inference is needed on each pixel's C-dimensional feature. Since the task at this stage is binary classification, 2-dimensional information is finally obtained, representing the target and non-target probabilities respectively. By comparing the target and non-target probabilities, a binary image is obtained; finally, a binary-image contour extraction algorithm is applied.

The attention module is used to remove interference information from the RGB image.

The feature encoding module adopts ResNet18.

The channel attention is given by:

Mc(F) = F * Sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F)))

The spatial attention is calculated as follows, where [A; B] denotes concatenating A and B along the channel dimension:

Ms(F) = F * Sigmoid(Conv([AvgPool(F); MaxPool(F)]))

Combining the channel attention and the spatial attention yields the CBAM formula:

M(F) = Ms(Mc(G(F))) + F, where G(F) = Conv2(Conv1(F)).

The MLP is a multilayer perceptron with 3 hidden layers.

Compared with the prior art, the beneficial effects of the present invention are: by fusing the attention module in a specific way, the present invention improves edge contour extraction accuracy, preliminarily solves the blurred-edge problem of traditional frameworks, and reduces interference from background noise, thereby essentially meeting the accuracy requirements for medical image contour extraction in the medical field.

The present invention simplifies the workflow of the traditional framework, so that training and inference take relatively little time, greatly saving the time and cost of obtaining the target model.

The present invention performs the final contour extraction with the Marching Squares algorithm, which is simple and fast to implement and can be processed in parallel.

Brief Description of the Drawings

Fig. 1 is a schematic diagram of the framework of the U-Net medical image contour automatic extraction network fused with an attention mechanism according to the present invention;

Fig. 2 is a schematic diagram of the Backbone framework;

Fig. 3 is a schematic diagram of the MLP framework;

Fig. 4 is a schematic diagram of the MLP calling process;

Fig. 5 is a schematic diagram of the first basic case of Marching Squares;

Fig. 6 is a schematic diagram of the second basic case of Marching Squares;

Fig. 7 is a schematic diagram of the third basic case of Marching Squares;

Fig. 8 is a schematic diagram of the fourth basic case of Marching Squares;

Fig. 9 is a schematic diagram of the fifth basic case of Marching Squares;

Fig. 10 is a schematic diagram of the sixth basic case of Marching Squares.

Detailed Description of the Embodiments

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.

Referring to Figs. 1-10, the present invention provides a technical solution: a U-Net medical image contour automatic extraction network fused with an attention mechanism, comprising an RGB image input module, the output of which is connected to the input of a feature extraction module; the feature extraction module comprises a feature encoding module, a feature decoding module, and an attention module.

The output of the feature extraction module is connected to the input of the MLP. The attention module comprises spatial attention and channel attention and is used to suppress neurons in non-attention regions.

The MLP is used to classify the extracted features. Its output layer is set to 2 neurons, representing the foreground and background probabilities respectively, and is followed in turn by Softmax and Marching Squares.

An RGB image is input and features are first extracted from it. The feature extraction module is similar to U-Net: one part is the encoding network (the feature encoding module) and the other is the decoding network (the feature decoding module), with an attention module fused in. Subsequent ablation experiments show that the added attention module effectively removes interference information from the image.

After the features are obtained, each pixel has a C-dimensional feature representation fusing local and global information, so only a single MLP inference is needed on each pixel's C-dimensional feature. Since the task at this stage is binary classification, 2-dimensional information is finally obtained, representing the target and non-target probabilities respectively. By comparing the target and non-target probabilities, a binary image is obtained; finally, any binary-image contour extraction algorithm can be applied. In the experiments, the Marching Squares algorithm is used to extract the contour.

The feature extraction module is divided into two parts: a feature encoding module and a feature decoding module. The feature encoding module adopts ResNet18, which is easy to train, easy to implement, and has relatively few parameters; more importantly, it has a downsampling structure, making it well suited to fast feature extraction.
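
As an illustration, the feature encoding module could be implemented in PyTorch roughly as follows. This is a minimal sketch assuming the standard torchvision ResNet18; the FeatureEncoder wrapper name and the stage grouping are illustrative assumptions rather than the exact configuration of the invention.

import torch.nn as nn
from torchvision.models import resnet18

class FeatureEncoder(nn.Module):
    # Keeps the output of every ResNet18 downsampling stage so the decoding
    # module can fuse features at multiple scales, as in U-Net.
    def __init__(self):
        super().__init__()
        net = resnet18()  # load pretrained weights here if desired
        self.stem = nn.Sequential(net.conv1, net.bn1, net.relu)  # 1/2 resolution, 64 channels
        self.pool = net.maxpool                                  # 1/4 resolution
        self.stages = nn.ModuleList([net.layer1, net.layer2, net.layer3, net.layer4])

    def forward(self, x):
        feats = [self.stem(x)]
        x = self.pool(feats[-1])
        for stage in self.stages:  # 64, 128, 256, 512 channels at 1/4 ... 1/32 resolution
            x = stage(x)
            feats.append(x)
        return feats               # multi-scale features for the feature decoding module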

The feature decoding module is similar to the feature decoding part of U-Net, but an attention module is added before each upsampling step. The CBAM module (i.e., the attention module) is a network with an attention mechanism, comprising spatial attention and channel attention, used to suppress neurons in non-attention regions.

The channel attention is given by:

Mc(F) = F * Sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F)))

The spatial attention is calculated as follows, where [A; B] denotes concatenating A and B along the channel dimension:

Ms(F) = F * Sigmoid(Conv([AvgPool(F); MaxPool(F)]))

Combining the channel attention and the spatial attention yields the CBAM formula:

M(F) = Ms(Mc(G(F))) + F, where G(F) = Conv2(Conv1(F)).
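
As an illustration, the three formulas above could be implemented in PyTorch as in the sketch below. The reduction ratio, the 7x7 spatial kernel, and the 3x3 convolutions standing in for Conv1 and Conv2 in G(F) are assumptions borrowed from the published CBAM design; the text does not fix these hyperparameters.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Mc(F) = F * Sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F)))
    def __init__(self, channels, reduction=16):  # reduction ratio is an assumption
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, f):
        b, c, _, _ = f.shape
        w = torch.sigmoid(self.mlp(f.mean(dim=(2, 3))) + self.mlp(f.amax(dim=(2, 3))))
        return f * w.view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    # Ms(F) = F * Sigmoid(Conv([AvgPool(F); MaxPool(F)])), pooling along the channel axis
    def __init__(self, kernel_size=7):  # kernel size is an assumption
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, f):
        pooled = torch.cat([f.mean(dim=1, keepdim=True), f.amax(dim=1, keepdim=True)], dim=1)
        return f * torch.sigmoid(self.conv(pooled))

class CBAM(nn.Module):
    # M(F) = Ms(Mc(G(F))) + F, the residual term keeping the original features
    def __init__(self, channels):
        super().__init__()
        self.g = nn.Sequential(  # G(F) = Conv2(Conv1(F))
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.mc = ChannelAttention(channels)
        self.ms = SpatialAttention()

    def forward(self, f):
        return self.ms(self.mc(self.g(f))) + f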

What feature encoding yields is often image details, and these details include both target and non-target details. If attention processing is applied at this point, suppressing the features of the non-target part in advance, the predicted target region is more easily judged as 1 by the subsequent MLP module, which benefits the later MLP inference.

The feature extraction module is similar to U-Net, but unlike U-Net, the output of this module is not a probability map but a feature description. Assuming the input image is W*H*3, the dimensions obtained after feature extraction are W*H*C, where C is the number of feature descriptors; experiments found that setting C=512 makes the network more robust. Each pixel thus has a C-dimensional feature vector. These features describe not only the information around the pixel but also fuse global information, so the subsequent judgment only needs to run inference on that pixel's vector.

The multilayer perceptron (MLP) designed in the present invention is used to classify the extracted features. PIFU samples the features and then classifies them; unlike PIFU, the framework of the present invention does not sample the features, because this is two-dimensional image information and the amount of computation is small. Moreover, the computing power and GPU memory of current devices have improved substantially, so all features can be trained on without sampling. For each pixel, a C-dimensional vector describes that pixel's global feature information.

The framework of the present invention has 3 hidden layers, of sizes [512, 256, 128]. Since the purpose of image segmentation here is binary classification, the output layer is set to 2 neurons, representing the foreground and background probabilities respectively, followed by Softmax and Marching Squares.

Finally, the foreground and background probabilities are compared: whenever the foreground probability exceeds the background probability, the pixel is set to 1. Through the above process, a clean binary image is obtained.
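
A minimal sketch of this per-pixel classification head follows, using the sizes stated in the text (C=512 input features, hidden layers [512, 256, 128], 2 outputs); the PixelMLP name and the reshaping convention are illustrative assumptions.

import torch
import torch.nn as nn

class PixelMLP(nn.Module):
    # Classifies each pixel's C-dimensional feature vector as foreground or background.
    def __init__(self, in_dim=512, hidden=(512, 256, 128)):
        super().__init__()
        layers, d = [], in_dim
        for h in hidden:
            layers += [nn.Linear(d, h), nn.ReLU(inplace=True)]
            d = h
        layers.append(nn.Linear(d, 2))  # 2 output neurons: foreground and background
        self.mlp = nn.Sequential(*layers)

    def forward(self, feat):            # feat: (B, C, H, W) from the feature extractor
        b, c, h, w = feat.shape
        x = feat.permute(0, 2, 3, 1).reshape(-1, c)  # one C-dimensional vector per pixel
        probs = torch.softmax(self.mlp(x), dim=1)
        # A pixel is foreground whenever its foreground probability exceeds its
        # background probability, so no hand-tuned threshold is involved.
        return probs.argmax(dim=1).reshape(b, h, w)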

The benefits of this design:

(1) Improved accuracy. Previous image segmentation methods all produce a probability map, computing probabilities for both foreground and background. The ablation experiments show that introducing the MLP improves accuracy by about 0.3%.

(2) No manually set threshold. Previous image contour extraction often requires a manually set threshold on the probability map to obtain a clean binary image. The design of this framework eliminates this hyperparameter, avoiding any influence of a manually set threshold on the results.

Once a clean binary image is obtained, in theory any binary-image contour extraction algorithm can extract the contour; the present invention uses the Marching Squares algorithm. Similar to Marching Cubes, Marching Squares is an algorithm for extracting iso-lines: given a two-dimensional probability map and a threshold, linear interpolation yields the curve on which the threshold value lies.

Since the output of the present invention is a binary image, the Marching Squares algorithm gives the same result for any threshold lying strictly between 0 and 1. The reason for using Marching Squares is that it is simple to implement and computationally light. For the four corner points of a cell, there are 6 basic cases, and 16 cases are obtained through rotation and mirroring. Only the case needs to be identified to construct the edge; moreover, the algorithm can run in parallel, which means it can be further optimized and accelerated.
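
For illustration, an off-the-shelf marching-squares implementation such as scikit-image's find_contours can be applied directly to the binary map; the sketch below assumes that library and uses a toy 4x4 map, with the 0.5 level chosen arbitrarily since any level strictly between the two binary values yields the same contours.

import numpy as np
from skimage import measure  # find_contours implements marching squares

# Toy binary map: a 2x2 foreground block inside a 4x4 background.
binary_map = np.zeros((4, 4))
binary_map[1:3, 1:3] = 1.0

# On a 0/1 map, any level strictly between 0 and 1 gives identical contours.
for contour in measure.find_contours(binary_map, 0.5):
    print(contour)  # (row, col) vertices of the iso-line enclosing the block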

Key points:

1. On the basis of the U-Net framework, the present invention introduces an attention module and uses a fully convolutional network to obtain a binary mask of the object in a medical image, segmenting the object from the background. This improves the accuracy of contour extraction and provides better features to the contour extraction stage.

2. The present invention uses a multilayer perceptron with 3 hidden layers as the classifier, performing a second correction on the mask information from the previous step. This makes the result more accurate and the contour boundaries clear, which facilitates contour extraction with the Marching Squares algorithm.

3. The present invention uses Marching Squares as the final contour extraction algorithm. Compared with neural network methods, this improves the performance of contour extraction, and the process can be accelerated by running it in parallel.

Protection points:

1. On the basis of the traditional U-Net, the invention introduces an attention module in the upsampling stage of U-Net and uses a fully convolutional network to obtain a binary mask of the object in a medical image, segmenting the object from the background; this shall fall within the protection scope of the present invention.

2. The present invention uses a multilayer perceptron with 3 hidden layers as the classifier to perform a second correction on the binary mask, making the result more accurate and the contour boundaries clear, thereby better meeting the medical field's requirements for image contour extraction; this shall fall within the protection scope of the present invention.

3. The present invention uses Marching Squares as the final contour extraction algorithm; compared with neural network methods, this improves the performance of contour extraction, and the process can be accelerated in parallel. The application of the Marching Squares algorithm to medical contour extraction shall fall within the protection scope of the present invention.

As described in the Background Art section above, although U-Net-based medical image segmentation methods have achieved remarkable results, noise makes accurate segmentation difficult, and most methods still suffer from blurred edges, neglected details, and the need for manual parameter tuning.

Disadvantages of the prior art:

Disadvantages of existing medical image contour extraction networks:

1. Detection accuracy is relatively low, and false detections or missed detections occur easily.

2. Large training datasets are required, consuming substantial cost and time.

3. The edges of the output are blurred and details are ignored.

4. Traditional networks have one or more hyperparameters that require manual adjustment, and this adjustment directly affects the results.

The reasons for the above shortcomings are:

1. The model does not attend to certain important features and is easily disturbed by noise.

2. The model itself is designed in a relatively complicated way, so training consumes a great deal of time.

3. The convolution operations on tensors in a CNN tend to ignore details, driving the results toward a stable state between certainty and uncertainty.

4. Determining the contour edge requires manually setting a threshold, usually 0.5, but this value is not optimal; different data have one or more different optimal values.

To solve problems such as blurred edges and neglected details, improve the accuracy and training efficiency of contour extraction, and reduce the human influence on hyperparameters, the present invention focuses on a pixel-judgment network realized through an attention mechanism that does not require ACMs. Combined with the attention mechanism, it converges faster and can be trained to good effect without a large number of data samples. The network infers the probability that a pixel lies in the interior region and, through certain steps, can directly output the binary image of the object without a manually set threshold. The main improvements of the present invention are as follows:

(1) To solve the problem of interference, a U-Net module with an attention mechanism is proposed, which effectively removes interference; on the Drosophila embryo dataset, its recall is 6.3% higher than U-Net's.

(2) An MLP is used to represent the probability that a pixel lies inside the target, recomputing the features extracted by U-Net rather than taking the result directly from U-Net. On the Drosophila embryo dataset, the accuracy is 0.3% higher than U-Net's.

(3) The size of a manually set threshold strongly affects the results. To avoid manually setting a threshold, the one-hot encoding method is used to solve this problem, which is simpler and more effective than adaptive threshold-setting methods.
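
A small sketch of this one-hot scheme with toy values: training uses the two-class targets directly, and inference compares the two class scores, so no probability threshold appears anywhere. The tensor shapes are illustrative assumptions.

import torch
import torch.nn.functional as F

logits = torch.randn(6, 2)                 # per-pixel (background, foreground) logits
labels = torch.tensor([0, 1, 1, 0, 1, 0])  # class indices, i.e. one-hot targets

# Training: cross-entropy against the one-hot encoded classes; no threshold in the loss.
loss = F.cross_entropy(logits, labels)

# Inference: the predicted class is whichever probability is larger, replacing the
# usual hand-picked cutoff (e.g. 0.5) on a single foreground probability map.
binary = logits.argmax(dim=1)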

By fusing the attention module in a specific way, the invention improves edge contour extraction accuracy, preliminarily solves the blurred-edge problem of traditional frameworks, and reduces interference from background noise, thereby essentially meeting the accuracy requirements for medical image contour extraction in the medical field.

The invention simplifies the workflow of the traditional framework, so that training and inference take relatively little time, greatly saving the time and cost of obtaining the target model.

The invention performs the final contour extraction with the Marching Squares algorithm, which is simple and fast to implement and can be processed in parallel.

Technical problems solved by the present invention:

1. An attention module is added; the CBAM module makes the network focus on the region of interest and removes interference.

2. The one-hot encoding method is used to solve the problem of manually setting a threshold.

3. The existing model is improved and optimized so that training time is relatively short and the model is as lightweight as possible.

Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and spirit of the present invention; the scope of the present invention is defined by the appended claims and their equivalents.

Claims (7)

1. A U-Net medical image contour automatic extraction network fused with an attention mechanism, comprising an RGB image input module, characterized in that: the output end of the RGB image input module is connected to the input end of a feature extraction module, and the feature extraction module comprises a feature encoding module, a feature decoding module and an attention module;
the output end of the feature extraction module is connected to the input end of an MLP, and the attention module comprises spatial attention and channel attention and is used for suppressing neurons in non-attention areas;
the MLP is used for classifying the extracted features, the output layer of the MLP is set to 2 neurons which respectively represent the probability of the foreground and the probability of the background, and the MLP is followed in turn by Softmax and Marching Squares.
2. The attention mechanism-fused U-Net medical image contour automatic extraction network according to claim 1, wherein: the RGB image input module is used for inputting an RGB image.
3. The attention mechanism-fused U-Net medical image contour automatic extraction network according to claim 1, wherein: the feature extraction module is used for extracting features of the RGB image; after the features of the RGB image are obtained, each pixel has a C-dimensional feature representation fusing local and global information, so only one MLP inference needs to be carried out on the C-dimensional features of each pixel; since the task at this stage is binary classification, 2-dimensional information is finally obtained, respectively representing the probability of a target and of a non-target; a binary image is obtained by comparing the target with the non-target, and finally a binary-image contour extraction algorithm is adopted.
4. The attention mechanism-fused U-Net medical image contour automatic extraction network according to claim 1, wherein: the attention module is used for removing interference information in the RGB image.
5. The attention mechanism-fused U-Net medical image contour automatic extraction network according to claim 1, wherein: the feature encoding module uses ResNet18.
6. The attention mechanism-fused U-Net medical image contour automatic extraction network according to claim 1, wherein: the channel attention is given by:
Mc(F) = F * Sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F)));
the spatial attention is calculated as follows, where [A; B] denotes concatenating A and B along the channel dimension:
Ms(F) = F * Sigmoid(Conv([AvgPool(F); MaxPool(F)]));
combining the channel attention and the spatial attention, the CBAM formula is obtained:
M(F) = Ms(Mc(G(F))) + F, wherein G(F) = Conv2(Conv1(F)).
7. The attention mechanism-fused U-Net medical image contour automatic extraction network according to claim 1, wherein: the MLP employs a multi-layer perceptron with 3 hidden layers.
CN202111184138.8A 2021-10-11 2021-10-11 Automatic U-Net medical image contour extraction network integrating attention mechanism Pending CN115187621A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111184138.8A CN115187621A (en) 2021-10-11 2021-10-11 Automatic U-Net medical image contour extraction network integrating attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111184138.8A CN115187621A (en) 2021-10-11 2021-10-11 Automatic U-Net medical image contour extraction network integrating attention mechanism

Publications (1)

Publication Number Publication Date
CN115187621A true CN115187621A (en) 2022-10-14

Family

ID=83511597

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111184138.8A Pending CN115187621A (en) 2021-10-11 2021-10-11 Automatic U-Net medical image contour extraction network integrating attention mechanism

Country Status (1)

Country Link
CN (1) CN115187621A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102095731B1 (en) * 2019-06-25 2020-05-18 주식회사 딥노이드 Mra image learning method and assistance diagnosis method for blood vessel lesion of deep learning based assistance diagnosis system
CN111932550A (en) * 2020-07-01 2020-11-13 浙江大学 3D ventricle nuclear magnetic resonance video segmentation system based on deep learning
CN111862056A (en) * 2020-07-23 2020-10-30 东莞理工学院 A segmentation method of retinal blood vessels based on deep learning
CN112017192A (en) * 2020-08-13 2020-12-01 杭州师范大学 Glandular cell image segmentation method and system based on improved U-Net network
CN112651978A (en) * 2020-12-16 2021-04-13 广州医软智能科技有限公司 Sublingual microcirculation image segmentation method and device, electronic equipment and storage medium
CN113066026A (en) * 2021-03-26 2021-07-02 重庆邮电大学 Endoscope image smoke purification method based on deep neural network

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116797614A (en) * 2023-03-23 2023-09-22 天津大学 Dual attention fast tongue contour extraction method and system based on CBAUnet
CN116797614B (en) * 2023-03-23 2024-02-06 天津大学 Dual attention fast tongue contour extraction method and system based on CBAUnet
CN117710734A (en) * 2023-12-13 2024-03-15 北京百度网讯科技有限公司 Methods, devices, electronic equipment, and media for obtaining semantic data

Similar Documents

Publication Publication Date Title
CN105551036B (en) A kind of training method and device of deep learning network
CN111368846B (en) Road ponding identification method based on boundary semantic segmentation
CN112150493B (en) Semantic guidance-based screen area detection method in natural scene
Li et al. Single image snow removal via composition generative adversarial networks
CN108038519B (en) A cervical image processing method and device based on dense feature pyramid network
CN112446892B (en) A cell nucleus segmentation method based on attention learning
CN112950477B (en) A High Resolution Salient Object Detection Method Based on Dual Path Processing
CN109064405A (en) A kind of multi-scale image super-resolution method based on dual path network
CN114283162B (en) Real scene image segmentation method based on contrast self-supervision learning
CN110059768A (en) The semantic segmentation method and system of the merging point and provincial characteristics that understand for streetscape
Su et al. Physical model and image translation fused network for single-image dehazing
CN113838064A (en) A Cloud Removal Method Using Multitemporal Remote Sensing Data Based on Branch GAN
CN116486273B (en) A Method for Extracting Water Body Information from Small Sample Remote Sensing Images
CN116935044B (en) Endoscopic polyp segmentation method with multi-scale guidance and multi-level supervision
CN114638768A (en) A method, system and device for image rain removal based on dynamic association learning network
CN114399519A (en) A method and system for 3D semantic segmentation of MR images based on multimodal fusion
CN116977208A (en) Low-illumination image enhancement method for double-branch fusion
CN110717921A (en) Full convolution neural network semantic segmentation method of improved coding and decoding structure
Fu et al. DEAU-Net: Attention networks based on dual encoder for Medical Image Segmentation
CN115187621A (en) Automatic U-Net medical image contour extraction network integrating attention mechanism
WO2024099026A1 (en) Image processing method and apparatus, device, storage medium and program product
CN117975002A (en) Weak supervision image segmentation method based on multi-scale pseudo tag fusion
Wu et al. Cascaded fully convolutional DenseNet for automatic kidney segmentation in ultrasound images
Huang et al. Adf-net: A novel adaptive dual-stream encoding and focal attention decoding network for skin lesion segmentation
Wei et al. TANet: Triple Attention Network for medical image segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination