CN114581467A - Image segmentation method based on residual error expansion space pyramid network algorithm - Google Patents


Info

Publication number
CN114581467A
Authority
CN
China
Prior art keywords
image
residual
method based
extraction module
image segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210210185.3A
Other languages
Chinese (zh)
Inventor
徐超
韩俱宝
李正平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University
Priority to CN202210210185.3A
Publication of CN114581467A
Legal status: Pending (current)

Classifications

    • G06T 7/12: Image analysis; Segmentation, edge detection; Edge-based segmentation
    • G06N 3/045: Computing arrangements based on biological models; Neural networks; Architecture, e.g. interconnection topology; Combinations of networks
    • G06N 3/08: Computing arrangements based on biological models; Neural networks; Learning methods
    • G06T 2207/20016: Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/30032: Biomedical image processing; Colon polyp

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image segmentation method based on a residual dilated spatial pyramid network algorithm, relating to the technical field of image segmentation. The method comprises the following steps: acquiring an image to be processed; extracting features from the image to obtain an original feature map; inputting the original feature map into a residual atrous spatial pyramid model to extract a global feature map; aggregating the global feature map with the original feature map to obtain a fused feature map; performing segmentation on the fused feature map; and refining the segmented feature map with an attention module and a conditional random field. The method extracts information from the general features at different scales using dilated convolutions with different sampling rates, and uses a residual block to extract higher-dimensional image information while avoiding gradient explosion or vanishing gradients. The resulting features contain image information at different scales together with deeper semantic information, allowing image region boundaries to be segmented more finely.

Description

Image segmentation method based on residual error expansion space pyramid network algorithm
Technical Field
The invention relates to the technical field of image segmentation, and in particular to an image segmentation method based on a residual dilated spatial pyramid network algorithm.
Background
Image segmentation is the first step of image analysis. Traditional image segmentation methods mainly include thresholding, edge detection, region-based segmentation and histogram-based methods. With the continuous development of deep learning, more and more deep-learning-based segmentation methods have been proposed. Although these methods have made some progress, many of them can only label the detected shape with a bounding box and cannot effectively determine the boundary contour of a polyp; in a polypectomy procedure in the medical field, for example, this can prevent the physician from accurately resecting the polyp tissue. To address this problem, an FCN network based on a pre-trained model was proposed to identify and segment polyps. Later, Akbari et al. proposed an improved FCN-based network to further improve the accuracy of polyp segmentation. Since the U-Net network was proposed in 2015, it has been widely applied in medical image segmentation. Many new convolutional neural network designs still follow the core design concept of U-Net, adding new modules or blending in other design ideas to improve performance on medical image segmentation. U-Net++ and ResUNet++ perform well on the polyp segmentation task, but they focus on segmenting the entire outline of a polyp and neglect the constraints on polyp region boundaries, so the segmented edges are not very fine. Therefore, developing a method that can accurately determine the boundary of an image region is an urgent problem for those skilled in the art.
Disclosure of Invention
In view of this, the present invention provides an image segmentation method based on a residual dilated spatial pyramid network algorithm, which overcomes the above drawbacks.
To achieve the above purpose, the invention provides the following technical solution:
an image segmentation method based on a residual expanded space pyramid network algorithm comprises the following specific steps:
acquiring an image to be processed;
extracting the features of the image to be processed to obtain an original feature map;
inputting the original characteristic diagram into a residual cavity space pyramid model, and extracting a global characteristic diagram;
aggregating the global characteristic graph and the original characteristic graph to obtain a characteristic mapping graph;
carrying out segmentation processing on the feature mapping graph;
and refining the segmented feature mapping by using an attention module and a conditional random field.
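For illustration, a minimal PyTorch sketch of how these steps could be wired together is given below. The patent does not publish source code; the class names, channel sizes and the simple attention gate are assumptions, the RASP branch is a single-layer stand-in (a fuller sketch appears under Example 1), and the conditional random field refinement is left as a post-processing step. The sketch also uses the default ResNet strides, whereas the disclosed encoder keeps a 60 × 60 feature resolution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision


class RASPNetPipeline(nn.Module):
    """Hypothetical sketch of the disclosed pipeline; not the patented code."""

    def __init__(self, num_classes=2):
        super().__init__()
        # Step 2: extract general features with a ResNet-50 backbone (2048 channels)
        backbone = torchvision.models.resnet50(weights=None)
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        # Step 3: residual atrous spatial pyramid -- single-layer stand-in here;
        # a fuller sketch of the module appears under Example 1 below
        self.rasp = nn.Sequential(
            nn.Conv2d(2048, 256, 3, padding=6, dilation=6),
            nn.BatchNorm2d(256), nn.ReLU())
        # Steps 4-5: aggregate global and original features, then classify per pixel
        self.project = nn.Conv2d(2048 + 256, 256, kernel_size=1)
        self.classifier = nn.Conv2d(256, num_classes, kernel_size=1)
        # Step 6: simple spatial attention gate; CRF refinement would follow as
        # post-processing on the logits and is omitted in this sketch
        self.attention = nn.Sequential(nn.Conv2d(256, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, x):
        feats = self.encoder(x)                         # original feature map
        global_feats = self.rasp(feats)                 # global feature map
        fused = self.project(torch.cat([feats, global_feats], dim=1))
        fused = fused * self.attention(fused)           # attention refinement
        logits = self.classifier(fused)
        return F.interpolate(logits, size=x.shape[2:],
                             mode="bilinear", align_corners=False)


# Usage (assumed 480x480 RGB input as in Example 1):
# model = RASPNetPipeline()
# out = model(torch.randn(1, 3, 480, 480))   # out: (1, 2, 480, 480)
```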
Optionally, the original feature map consists of general features extracted at a fixed resolution by a ResNet50-based backbone network.
Optionally, the residual atrous spatial pyramid model includes a global feature extraction module and a semantic information extraction module arranged in parallel; the global feature extraction module extracts multi-scale image information from the original feature map, and the semantic information extraction module extracts deep semantic information from the original feature map.
Optionally, the global feature extraction module includes several groups of dilated convolution layers with different sampling rates, the sampling rates increasing by a fixed multiple.
Optionally, the semantic information extraction module includes a squeeze-and-excitation layer, a dilated convolution layer, batch normalization and a ReLU activation function; the squeeze-and-excitation layer is connected to the dilated convolution layer, and the dilated convolution layer is interleaved with the batch normalization and ReLU activation functions.
Optionally, the semantic information extraction module adopts a residual structure, its input being connected to its output through a skip connection.
Optionally, the global feature map includes global feature information and deep semantic information.
According to the above technical solution, compared with the prior art, the image segmentation method based on the residual dilated spatial pyramid network algorithm extracts information from the general features at different scales using dilated convolutions with different sampling rates, and uses a residual block to extract higher-dimensional image information while avoiding gradient explosion or vanishing gradients. The final features contain image information at different scales and deeper semantic information, so image region boundaries can be segmented more finely.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the following drawings show only embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a flow chart of the algorithm of the present invention;
FIG. 3 is a structural diagram of the residual atrous spatial pyramid model of the present invention;
FIG. 4 is a comparison of the results of Example 3 of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from the given embodiments without creative effort shall fall within the protection scope of the present invention.
The embodiment of the invention discloses an image segmentation method based on a residual dilated spatial pyramid network algorithm. A ResNet network is used as the backbone for extracting general features, and a new global feature extraction module (the RASP module) is designed to further obtain global information of the image from the general features extracted by the fully convolutional network. The RASP module extracts information from the general features at different scales using dilated convolutions with different sampling rates, and a residual block further extracts higher-dimensional image information, so that the resulting features contain image information at different scales together with deeper semantic information.
Example 1
As shown in fig. 1, the image segmentation method based on the residual expanded space pyramid network algorithm disclosed in this embodiment specifically includes the following steps: acquiring an image to be processed; extracting the features of the image to be processed to obtain an original feature map; inputting the original characteristic diagram into a residual cavity space pyramid model, and extracting a global characteristic diagram; aggregating the global characteristic graph and the original characteristic graph to obtain a characteristic mapping graph; carrying out segmentation processing on the feature mapping graph; and refining the segmented feature mapping by using an attention module and a conditional random field.
In this embodiment, the RASPNet architecture is applied to intestinal polyp images. As shown in FIG. 2, the network is built on an FCN architecture. In the encoding part, the intestinal polyp image is encoded by a 50-layer residual network whose residual blocks are formed by 3 × 3 convolutions with skip connections. In the feature extraction part, a feature map containing rich global information and deeper semantic information is obtained through the residual atrous spatial pyramid model. As shown in FIG. 3, this model consists of two parallel paths: one path is composed of three or more groups of dilated convolution layers whose dilation rates are set to 6, 12 and 18 in turn; the other path adopts a residual structure with a squeeze-and-excitation layer, comprising a 3 × 3 dilated convolution, batch normalization and a ReLU activation function, with the input finally connected to the output through a skip connection (a minimal sketch of such a module follows this example). The input images are resized to a uniform 480 × 480, and the feature map output after encoding has size 2048 × 60 × 60. In the result refinement part, the segmentation result is refined using an attention block and a conditional random field.
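Below is a minimal sketch, assuming PyTorch, of a residual atrous spatial pyramid module consistent with this description: one path applies 3 × 3 dilated convolutions at rates 6, 12 and 18, the other applies a squeeze-and-excitation layer followed by a 3 × 3 dilated convolution, batch normalization and ReLU inside a skip connection. The channel counts, the dilation rate on the residual path and the 1 × 1 fusion convolution are assumptions, not details taken from the patent.

```python
import torch
import torch.nn as nn


class SqueezeExcite(nn.Module):
    """Standard squeeze-and-excitation channel attention."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w


class RASPModule(nn.Module):
    """Two parallel paths: multi-rate dilated convs + an SE/residual semantic path."""

    def __init__(self, in_ch=2048, out_ch=256, rates=(6, 12, 18)):
        super().__init__()
        # Path 1: dilated (atrous) convolutions at increasing rates for multi-scale context
        self.pyramid = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
            for r in rates])
        # Path 2: squeeze-and-excitation, then a 3x3 dilated conv + BN + ReLU,
        # wrapped in a residual (skip) connection
        self.se = SqueezeExcite(in_ch)
        self.semantic = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=2, dilation=2, bias=False),
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True))
        self.reduce = nn.Conv2d(in_ch, out_ch, 1)
        # Fuse the concatenated branch outputs back to out_ch channels
        self.fuse = nn.Sequential(
            nn.Conv2d(out_ch * (len(rates) + 1), out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        branches = [conv(x) for conv in self.pyramid]
        residual_path = self.reduce(x + self.semantic(self.se(x)))  # skip connection
        branches.append(residual_path)
        return self.fuse(torch.cat(branches, dim=1))
```

With in_ch=2048 and out_ch=256 this maps a 2048 × 60 × 60 encoder output to a 256 × 60 × 60 global feature map; the spatial size is preserved because each dilated convolution uses padding equal to its dilation rate.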
Example 2
In this embodiment, as shown in FIG. 1 and FIG. 3, a ResNet50-based backbone network extracts general features at a resolution of 60 × 60, which are then fed into the residual atrous spatial pyramid module. Global information of the feature map is extracted by three dilated convolution layers with sampling rates of 6, 12 and 18, high-level semantic information is extracted by two dilated convolution layers, and the obtained global feature information and high-level semantic information are aggregated with the general features to form a new feature map (a small aggregation sketch follows this example). The feature map is then segmented, and the segmentation result is refined by an attention block and a conditional random field.
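A small, hypothetical illustration of the aggregation step in PyTorch (the channel counts and the 1 × 1 projection are assumptions for illustration only):

```python
import torch
import torch.nn as nn

# Illustrative aggregation: concatenate the general features with the global and
# semantic feature maps produced by the RASP branches, then project with a 1x1 conv.
general = torch.randn(1, 2048, 60, 60)                        # backbone output
globals_ = [torch.randn(1, 256, 60, 60) for _ in range(3)]    # rates 6 / 12 / 18
semantic = torch.randn(1, 256, 60, 60)                        # residual semantic path
fused = torch.cat([general, *globals_, semantic], dim=1)      # 2048 + 4*256 channels
project = nn.Conv2d(fused.shape[1], 256, kernel_size=1)
new_feature_map = project(fused)                              # aggregated feature map
print(new_feature_map.shape)                                  # torch.Size([1, 256, 60, 60])
```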
Example 3
To validate the effectiveness of the RASPNet network, experiments were performed on the Kvasir-SEG dataset. For comparison, three of the currently most popular segmentation networks, U-Net, ResUNet and ResUNet++, were selected. Mean intersection-over-union (mIoU), precision and the Dice coefficient were chosen as indices for evaluating the quality of the network models. The calculation formulas are as follows:
$$\mathrm{mIoU} = \frac{1}{k+1}\sum_{i=0}^{k}\frac{TP_i}{TP_i + FP_i + FN_i}$$
$$\mathrm{precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Dice} = \frac{2\,TP}{2\,TP + FP + FN}$$
where TP indicates that the model classifies a pixel as polyp and the pixel is indeed polyp; FP indicates that the model classifies a pixel as polyp but it is in fact non-polyp; and FN indicates that the model classifies a pixel as non-polyp but it is in fact polyp. The results obtained are shown in Table 1:
table 1: evaluation results of different models on Kvasir-SEG dataset
Figure BDA0003530685030000054
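As a worked illustration of the three metrics defined above (standard per-image formulas; the exact averaging protocol used in the experiments is not specified in the text):

```python
import numpy as np


def segmentation_metrics(pred, target):
    """IoU, precision and Dice for a binary polyp mask (values 0/1).

    mIoU in the paper averages this IoU over classes/images.
    """
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    iou = tp / (tp + fp + fn + 1e-8)
    precision = tp / (tp + fp + 1e-8)
    dice = 2 * tp / (2 * tp + fp + fn + 1e-8)
    return iou, precision, dice


# Example: a prediction that covers the ground-truth mask plus two extra pixels
pred = np.zeros((4, 4), dtype=np.uint8); pred[1:3, 1:4] = 1   # 6 predicted pixels
gt = np.zeros((4, 4), dtype=np.uint8); gt[1:3, 1:3] = 1       # 4 true pixels
print(segmentation_metrics(pred, gt))   # IoU ~0.667, precision ~0.667, Dice ~0.8
```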
The RASPNet architecture proposed in the present application produces satisfactory results on the Kvasir-SEG dataset. As shown in FIG. 4, from left to right: image 1 is the original image, image 2 is the ground-truth mask, image 3 is the result of the U-Net network, image 4 is the result of the ResUNet network, image 5 is the result of the ResUNet++ network, and image 6 is the result of the RASPNet network proposed in the present application. It can be seen from FIG. 4 that RASPNet produces better segmentation maps than the other compared models.
On the Kvasir-SEG dataset, the segmentation maps generated by RASPNet capture shape information better than those of the other architectures, which means that the segmentation masks generated by RASPNet are closer to the ground truth than those of existing models.
Therefore, the RASPNet structure based on the FCN framework can be used for precise polyp segmentation, and with the residual atrous spatial pyramid (RASP) module, intestinal polyp regions can be segmented more finely. To extract multi-scale information from intestinal endoscopy images, the RASP module is composed of two parallel paths: one path consists of dilated convolution layers with several sampling rates, and the other adopts a residual structure. The dilated convolutions enlarge the receptive field as much as possible without reducing the image resolution, while the residual structure extracts deeper semantic information of the image and prevents gradient explosion or vanishing gradients.
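For intuition on the receptive-field claim: a 3 × 3 convolution with dilation rate d covers an effective extent of 3 + 2(d - 1) pixels, and with padding equal to d the output keeps the input resolution. The following check is an illustration only, not code from the patent:

```python
import torch
import torch.nn as nn


def effective_kernel(k, d):
    """Effective spatial extent of a k x k convolution with dilation d."""
    return k + (k - 1) * (d - 1)


for d in (1, 6, 12, 18):
    conv = nn.Conv2d(1, 1, kernel_size=3, dilation=d, padding=d)
    out = conv(torch.randn(1, 1, 60, 60))
    print(f"rate {d:2d}: effective extent {effective_kernel(3, d):2d}, "
          f"output size {tuple(out.shape[2:])}")   # spatial size stays (60, 60)
```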
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (7)

1. An image segmentation method based on a residual dilated spatial pyramid network algorithm, characterized by comprising the following specific steps:
acquiring an image to be processed;
extracting features from the image to be processed to obtain an original feature map;
inputting the original feature map into a residual atrous spatial pyramid model and extracting a global feature map;
aggregating the global feature map with the original feature map to obtain a fused feature map;
performing segmentation on the fused feature map; and
refining the segmented feature map using an attention module and a conditional random field.
2. The image segmentation method based on the residual dilated spatial pyramid network algorithm according to claim 1, wherein the original feature map is formed from general features extracted by a ResNet50-based backbone network.
3. The image segmentation method based on the residual dilated spatial pyramid network algorithm according to claim 1, wherein the residual atrous spatial pyramid model comprises a global feature extraction module and a semantic information extraction module; the global feature extraction module and the semantic information extraction module adopt a parallel structure; the global feature extraction module is configured to extract multi-scale image information from the original feature map; and the semantic information extraction module extracts deep semantic information from the original feature map.
4. The image segmentation method based on the residual dilated spatial pyramid network algorithm according to claim 3, wherein the global feature extraction module comprises several groups of dilated convolution layers with different sampling rates, the sampling rates increasing by a fixed multiple.
5. The image segmentation method based on the residual dilated spatial pyramid network algorithm according to claim 3, wherein the semantic information extraction module comprises a squeeze-and-excitation layer, a dilated convolution layer, batch normalization and a ReLU activation function; the squeeze-and-excitation layer is connected to the dilated convolution layer; and the dilated convolution layer is interleaved with the batch normalization and ReLU activation functions.
6. The image segmentation method based on the residual dilated spatial pyramid network algorithm according to claim 5, wherein the semantic information extraction module adopts a residual structure and is connected to the output through a skip connection.
7. The image segmentation method based on the residual dilated spatial pyramid network algorithm according to claim 1, wherein the global feature map includes global feature information and deep semantic information.
CN202210210185.3A, filed 2022-03-03, priority 2022-03-03: Image segmentation method based on residual error expansion space pyramid network algorithm. Published as CN114581467A (en); status: Pending.

Priority Applications (1)

Application number: CN202210210185.3A (published as CN114581467A, en); priority date: 2022-03-03; filing date: 2022-03-03; title: Image segmentation method based on residual error expansion space pyramid network algorithm

Applications Claiming Priority (1)

Application number: CN202210210185.3A (published as CN114581467A, en); priority date: 2022-03-03; filing date: 2022-03-03; title: Image segmentation method based on residual error expansion space pyramid network algorithm

Publications (1)

Publication Number Publication Date
CN114581467A true CN114581467A (en) 2022-06-03

Family

ID=81779438

Family Applications (1)

Application number: CN202210210185.3A (published as CN114581467A, en; pending); title: Image segmentation method based on residual error expansion space pyramid network algorithm

Country Status (1)

Country Link
CN (1) CN114581467A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188479A (en) * 2023-02-21 2023-05-30 北京长木谷医疗科技有限公司 Hip joint image segmentation method and system based on deep learning
CN116188479B (en) * 2023-02-21 2024-04-02 北京长木谷医疗科技股份有限公司 Hip joint image segmentation method and system based on deep learning

Similar Documents

Publication Publication Date Title
CN111369565B (en) Digital pathological image segmentation and classification method based on graph convolution network
Qiu et al. BDG-Net: boundary distribution guided network for accurate polyp segmentation
CN112150428A (en) Medical image segmentation method based on deep learning
CN111612008A (en) Image segmentation method based on convolution network
CN115457021A (en) Skin disease image segmentation method and system based on joint attention convolution neural network
CN110689599A (en) 3D visual saliency prediction method for generating countermeasure network based on non-local enhancement
CN114170167B (en) Polyp segmentation method and computer device based on attention-guided context correction
CN112132827A (en) Pathological image processing method and device, electronic equipment and readable storage medium
Yamanakkanavar et al. MF2-Net: A multipath feature fusion network for medical image segmentation
CN115620010A (en) Semantic segmentation method for RGB-T bimodal feature fusion
CN112949654A (en) Image detection method and related device and equipment
CN114266894A (en) Image segmentation method and device, electronic equipment and storage medium
CN113838047A (en) Large intestine polyp segmentation method and system based on endoscope image and related components
CN114298971A (en) Coronary artery segmentation method, system, terminal and storage medium
Zhou et al. Attention transfer network for nature image matting
CN114418987B (en) Retina blood vessel segmentation method and system with multi-stage feature fusion
CN114581467A (en) Image segmentation method based on residual error expansion space pyramid network algorithm
CN111723852A (en) Robust training method for target detection network
CN111429468A (en) Cell nucleus segmentation method, device, equipment and storage medium
Li et al. Rethinking natural adversarial examples for classification models
CN110992320B (en) Medical image segmentation network based on double interleaving
CN112418220A (en) Single word detection method, device, equipment and medium
CN115546149B (en) Liver segmentation method and device, electronic equipment and storage medium
CN116386803A (en) Cytopathology report generation method based on graph
CN116091458A (en) Pancreas image segmentation method based on complementary attention

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination