CN115439470A - Polyp image segmentation method, computer-readable storage medium, and computer device - Google Patents
Polyp image segmentation method, computer-readable storage medium, and computer device
- Publication number: CN115439470A (application CN202211261125.0A)
- Authority
- CN
- China
- Prior art keywords
- channel
- image
- feature map
- polyp
- semantic information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformation in the plane of the image
- G06T3/40—Scaling the whole image or part thereof
- G06T3/4038—Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformation in the plane of the image
- G06T3/40—Scaling the whole image or part thereof
- G06T3/4046—Scaling the whole image or part thereof using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/32—Indexing scheme for image data processing or generation, in general involving image mosaicing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20021—Dividing image into blocks, subimages or windows
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
Abstract
The application provides a polyp image segmentation method, a computer-readable storage medium, and a computer device. The method comprises: inputting a polyp image to be segmented into a multi-scale semantic fusion model to obtain a semantic information image; down-sampling the semantic information image and feeding the down-sampled result back into the multi-scale semantic fusion model, repeating this cycle multiple times to obtain a high-dimensional semantic information image; inputting the high-dimensional semantic information image into a context-aware pyramid aggregation model and outputting a fused feature map; and alternately up-sampling the fused feature map and extracting features by convolution, repeating multiple times until a final feature map with the same channel size as the polyp image to be segmented is obtained. The method adapts to variation in polyp size and shape, achieving accurate polyp segmentation.
Description
Technical Field
The present application relates to the field of image segmentation, and more particularly, to a polyp image segmentation method, a computer-readable storage medium, and a computer device.
Background
Colorectal cancer typically develops over a long period from polyps (raised masses in the gastrointestinal tract) that form in the intestine. If polyps can be detected and resected at an early stage, colorectal cancer can largely be prevented. Colorectal endoscopy is currently the most effective and most widely used method for screening and diagnosing colorectal cancer.
However, although current diagnostic methods are advanced and accurate, problems remain. According to professional research reports, roughly one in four polyps is missed during endoscopy, leading to incomplete resection and leaving hidden risks. In addition, polyps vary widely in shape, making fine-grained judgments with the naked eye difficult, especially when a polyp is barely distinguishable from the gastrointestinal background. Finally, manual identification alone cannot be performed quickly; it demands considerable time and effort and, under the current medical system, adds a heavy workload for gastroenterologists.
Disclosure of Invention
The application aims to provide a polyp image segmentation method, a computer-readable storage medium, and a computer device that address the hidden risks left by incomplete resection when polyps are missed during endoscopy.
In a first aspect, the present application provides a polyp image segmentation method, comprising:
acquiring a polyp image to be segmented;
inputting the polyp image to be segmented into a multi-scale semantic fusion model to obtain a semantic information image; down-sampling the semantic information image to obtain a down-sampled semantic information image; inputting the down-sampled semantic information image into the multi-scale semantic fusion model again and down-sampling again, repeating multiple times to obtain a high-dimensional semantic information image; wherein the multi-scale semantic fusion model extracts features from the polyp image to be segmented to obtain an initial feature map of the same size, divides the initial feature map into 4 feature maps with equal channel counts, passes 3 selected feature maps through a convolution and batch normalization algorithm before concatenating them in sequence with the remaining feature map along the channel dimension, and fuses the concatenated feature map with the polyp image to be segmented through a residual connection to obtain the semantic information image;
inputting the high-dimensional semantic information image into a context-aware pyramid aggregation model and outputting a fused feature map; up-sampling the fused feature map and extracting features by convolution, then up-sampling the feature map after feature extraction again and extracting features by convolution, repeating multiple times until a final feature map with the same channel size as the polyp image to be segmented is obtained;
wherein the context-aware pyramid aggregation model performs pooling operations at multiple different scales on the input high-dimensional semantic information image to extract four feature maps with an unchanged number of channels and different resolutions; after dimensionality reduction, the four feature maps are up-sampled in turn to obtain up-sampled feature maps of the same size as the high-dimensional semantic information image, which are concatenated along the channel dimension to obtain a concatenated feature map; a convolution reduces the channel dimension of the concatenated feature map, a Sigmoid activation function yields an attention weight map, and attention matrix multiplication with this weight map reweights the concatenated feature map to obtain a feature map based on a spatial attention mechanism; features are extracted from the concatenated feature map and input into a channel attention mechanism to obtain channel weights and a feature map based on the channel attention mechanism; and the feature map based on the spatial attention mechanism and the feature map based on the channel attention mechanism are fused to obtain the fused feature map.
Further, the specific flow of the multi-scale semantic fusion model is as follows:
The polyp image to be segmented is defined as X ∈ R^(C×H×W). Feature extraction through W_1(·) yields an initial feature map X′ of the same size as the polyp image to be segmented: X′ ∈ R^(C×H×W);
W_1(·) comprises a 1×1 convolution, a batch normalization algorithm, and a ReLU nonlinear activation function;
Along the channel dimension, the initial feature map X′ ∈ R^(C×H×W) is divided into 4 feature maps with equal channel counts: X_0, X_1, X_2, X_3 ∈ R^((C/4)×H×W);
The 3 feature maps X_1, X_2, X_3 are transformed by W_2(·), and the transformed feature maps W_2(X_1), W_2(X_2), W_2(X_3) are concatenated in sequence with the remaining feature map X_0 along the channel dimension, giving a concatenated feature map X_Cat with the same number of channels as the polyp image to be segmented, namely:
X_Cat = CONCAT(W_2(X_1), W_2(X_2), W_2(X_3), X_0);
W_2(·) comprises a 3×3 convolution and a batch normalization algorithm;
The concatenated feature map is fused with the polyp image to be segmented through a residual connection, and the semantic information image X_Out is output, namely:
X_Out = W_3(X_Cat) ⊕ X;
where R denotes a three-dimensional array (image); C, H, and W denote the number of channels, the height, and the width of the image respectively; ⊕ denotes pixel-wise addition; CONCAT denotes concatenation along the channel dimension; and W_3(·) comprises a 1×1 convolution, a batch normalization algorithm, and a ReLU nonlinear activation function.
Further, the context-aware pyramid aggregation model includes a context-aware fusion model and an attention correction model.
Further, the specific operation flow of the context-aware fusion model is as follows:
The input high-dimensional semantic information image is defined as D ∈ R^(C×H×W). Several pooling operations of different scales extract four feature maps with an unchanged number of channels and different resolutions: D_0 ∈ R^(C×6×6), D_1 ∈ R^(C×3×3), D_2 ∈ R^(C×2×2), and D_3 ∈ R^(C×1×1);
The four feature maps are each reduced in dimension by a 1×1 convolution, a batch normalization algorithm, and a ReLU nonlinear activation function, compressing the number of channels to one quarter, yielding D′_i ∈ R^((C/4)×H_i×W_i);
The reduced feature maps are then up-sampled to obtain up-sampled feature maps D″_i of the same size as the high-dimensional semantic information image D, namely:
D″_i = Up(D′_i, β_i);
The up-sampled feature maps are concatenated along the channel dimension to obtain the concatenated feature map D_Cat, namely:
D_Cat = CONCAT(D″_0, D″_1, D″_2, D″_3);
where i denotes a natural number, β_i denotes a correlation coefficient, Up is conventional bilinear interpolation up-sampling, and CONCAT denotes concatenation along the channel dimension.
Further, the specific operation flow of the attention correction model is as follows:
A 1×1 convolution reduces the channel dimension of the concatenated feature map, and a Sigmoid activation function produces an attention weight map; attention matrix multiplication with the attention weight map reweights the concatenated feature map, modeling a spatial attention mechanism and yielding the feature map D_Spatial based on the spatial attention mechanism, namely:
D_Spatial = σ(S_0(D_Cat)) ⊗ D_Cat;
where ⊗ denotes attention matrix multiplication, σ(·) is the Sigmoid activation function, S_0 denotes a 1×1 convolution operation, and α is a coefficient associated with S_0;
Features are extracted from the concatenated feature map, and the extracted feature map is input into a channel attention mechanism to obtain channel weights and the feature map D_Channel based on the channel attention mechanism, namely:
D_Channel = F_Adaptive(G(D_Cat)) ⊗ D_Cat;
where F_Adaptive(·) achieves local cross-channel information interaction with different convolution kernel sizes; G(·) denotes global average pooling over the pixel spatial coordinates H′, W′; D_Channel denotes the channel-dimension attention mechanism; i and j denote natural numbers; and θ is a correlation coefficient of G(·);
The feature map based on the spatial attention mechanism and the feature map based on the channel attention mechanism are fused to obtain the fused feature map D_Out, namely:
D_Out = D_Spatial ⊕ D_Channel.
in a second aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the polyp image segmentation method.
In a third aspect, the present application provides a computer device comprising: one or more processors, a memory, and one or more computer programs, the processors and the memory being connected by a bus, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, which when executing the computer programs implement the steps of the polyp image segmentation method.
In this application, a multi-scale semantic fusion model is designed that collects semantic information images at different scales through multiple filters to improve representation capability, adapting to variation in polyp size, with finer granularity levels for smaller polyps in particular, and enlarging the network's receptive field by extracting features with convolution kernels of different scales. A context-aware pyramid aggregation model is also designed that guides the fusion of feature information from different regions; its built-in dual attention mechanism further strengthens important features and effectively suppresses features of unimportant regions, achieving accurate polyp segmentation while remaining real-time.
Drawings
Fig. 1 is a flowchart of a polyp image segmentation method according to an embodiment of the present application.
Fig. 2 is a flowchart of another polyp image segmentation method according to an embodiment of the present application.
Fig. 3 is a flowchart of a multi-scale semantic fusion model provided in an embodiment of the present application.
Fig. 4 is a flowchart of a context-aware fusion model according to an embodiment of the present application.
Fig. 5 is a flowchart of an attention correction model according to an embodiment of the present application.
Fig. 6 is a table of data analysis provided by an embodiment of the present application in contrast to current advanced polyp image segmentation methods.
Fig. 7 is a block diagram illustrating a specific structure of a computer device according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present application more clearly understood, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are merely illustrative and do not limit the application.
In order to explain the technical solution described in the present application, the following description will be given by way of specific examples.
Referring to Fig. 1, a polyp image segmentation method according to an embodiment of the present application includes the following steps. Note that the method is not limited to the flow sequence shown in Fig. 1 if substantially the same result is obtained.
S101. Acquiring a polyp image to be segmented.
S102. Inputting the polyp image to be segmented into a multi-scale semantic fusion model to obtain a semantic information image; down-sampling the semantic information image to obtain a down-sampled semantic information image; inputting the down-sampled semantic information image into the multi-scale semantic fusion model again and down-sampling again, repeating multiple times to obtain a high-dimensional semantic information image. The multi-scale semantic fusion model extracts features from the polyp image to be segmented to obtain an initial feature map of the same size, divides the initial feature map into 4 feature maps with equal channel counts, passes 3 selected feature maps through a convolution and batch normalization algorithm before concatenating them in sequence with the remaining feature map along the channel dimension, and fuses the concatenated feature map with the polyp image to be segmented through a residual connection to obtain the semantic information image.
S103. Inputting the high-dimensional semantic information image into a context-aware pyramid aggregation model and outputting a fused feature map; up-sampling the fused feature map and extracting features by convolution, then up-sampling the feature map after feature extraction again and extracting features by convolution, repeating multiple times until a final feature map with the same channel size as the polyp image to be segmented is obtained.
S104. The context-aware pyramid aggregation model performs pooling operations at multiple different scales on the input high-dimensional semantic information image to extract four feature maps with an unchanged number of channels and different resolutions; after dimensionality reduction, the four feature maps are up-sampled in turn to obtain up-sampled feature maps of the same size as the high-dimensional semantic information image, which are concatenated along the channel dimension into a concatenated feature map. A convolution reduces the channel dimension of the concatenated feature map, a Sigmoid activation function yields an attention weight map, and attention matrix multiplication with this weight map reweights the concatenated feature map to obtain a feature map based on a spatial attention mechanism. Features are extracted from the concatenated feature map and input into a channel attention mechanism to obtain channel weights and a feature map based on the channel attention mechanism. The feature map based on the spatial attention mechanism and the feature map based on the channel attention mechanism are fused to obtain the fused feature map.
Referring to Fig. 2: 001 denotes the multi-scale semantic fusion model, 002 down-sampling, 003 the context-aware pyramid aggregation model, and 004 up-sampling; CAF denotes the context-aware fusion model and APO the attention correction model; 005 denotes convolutional feature extraction. The left and right sides are symmetric: the left region is the encoder, the right region is the decoder, and dashed arrows denote skip connections.
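The repeated fuse-then-down-sample loop of step S102 can be sketched as follows. This is only an illustrative reading of the encoder side of Fig. 2: the fusion blocks are passed in as generic callables, max pooling stands in for the down-sampling operation (the patent does not specify the operator), and the per-stage outputs are kept for the skip connections.

```python
import torch
import torch.nn.functional as F

def encode(x, fusion_blocks):
    """Apply a fusion block, down-sample by 2, and repeat (a sketch of S102).

    fusion_blocks: list of callables standing in for the multi-scale
    semantic fusion model at each encoder stage. Stage outputs are kept
    for the skip connections shown in Fig. 2.
    """
    skip_feats = []
    for block in fusion_blocks:
        x = block(x)                # semantic information image at this scale
        skip_feats.append(x)
        x = F.max_pool2d(x, 2)      # down-sampling (operator assumed, not given in the text)
    return x, skip_feats            # x: high-dimensional semantic information image

# usage with identity stand-ins for the fusion blocks
x0 = torch.randn(1, 3, 32, 32)
high_dim, skips = encode(x0, [lambda t: t] * 3)
```

With three stages, a 32×32 input is reduced to 4×4, while the three stage outputs remain available for the decoder's jump connections.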
Referring to Fig. 3, in an embodiment of the present application, the specific flow of the multi-scale semantic fusion model (MSFM) is as follows:
The polyp image to be segmented is defined as X ∈ R^(C×H×W). Feature extraction through W_1(·) yields an initial feature map X′ of the same size as the polyp image to be segmented: X′ ∈ R^(C×H×W);
W_1(·) comprises a 1×1 convolution, a batch normalization algorithm, and a ReLU nonlinear activation function;
Along the channel dimension, the initial feature map X′ ∈ R^(C×H×W) is divided into 4 feature maps with equal channel counts: X_0, X_1, X_2, X_3 ∈ R^((C/4)×H×W);
The 3 feature maps X_1, X_2, X_3 are transformed by W_2(·), and the transformed feature maps W_2(X_1), W_2(X_2), W_2(X_3) are concatenated in sequence with the remaining feature map X_0 along the channel dimension, giving a concatenated feature map X_Cat with the same number of channels as the polyp image to be segmented, namely:
X_Cat = CONCAT(W_2(X_1), W_2(X_2), W_2(X_3), X_0);
W_2(·) comprises a 3×3 convolution and a batch normalization algorithm;
The concatenated feature map is fused with the polyp image to be segmented through a residual connection, and the semantic information image X_Out is output, namely:
X_Out = W_3(X_Cat) ⊕ X;
where R denotes a three-dimensional array (image); C, H, and W denote the number of channels, the height, and the width of the image respectively; ⊕ denotes pixel-wise addition; CONCAT denotes concatenation along the channel dimension; and W_3(·) comprises a 1×1 convolution, a batch normalization algorithm, and a ReLU nonlinear activation function.
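A minimal PyTorch sketch of the flow above. Assumptions are flagged in the comments: `BatchNorm2d` stands in for the "batch normalization algorithm", and W_3 is applied before the residual addition, since the exact order around the residual is not fully legible in the original.

```python
import torch
import torch.nn as nn

class MSFM(nn.Module):
    """Multi-scale semantic fusion model (sketch); channels must be divisible by 4."""
    def __init__(self, channels):
        super().__init__()
        c4 = channels // 4
        # W1: 1x1 conv + batch norm + ReLU
        self.w1 = nn.Sequential(nn.Conv2d(channels, channels, 1),
                                nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        # W2: 3x3 conv + batch norm, one per transformed channel group
        self.w2 = nn.ModuleList(
            nn.Sequential(nn.Conv2d(c4, c4, 3, padding=1), nn.BatchNorm2d(c4))
            for _ in range(3))
        # W3: 1x1 conv + batch norm + ReLU
        self.w3 = nn.Sequential(nn.Conv2d(channels, channels, 1),
                                nn.BatchNorm2d(channels), nn.ReLU(inplace=True))

    def forward(self, x):
        # split X' into 4 equal channel groups X0..X3
        x0, x1, x2, x3 = torch.chunk(self.w1(x), 4, dim=1)
        # X_Cat = CONCAT(W2(X1), W2(X2), W2(X3), X0)
        x_cat = torch.cat([self.w2[0](x1), self.w2[1](x2),
                           self.w2[2](x3), x0], dim=1)
        return self.w3(x_cat) + x   # residual fusion with the input (assumed placement)
```

Input and output shapes match, so the block can be stacked between down-sampling stages as in Fig. 2.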
In an embodiment of the present application, the context-aware pyramid aggregation model (i.e., CPAM) includes a context-aware fusion model and an attention-correction model.
In an embodiment of the present application, the specific operation flow of the context-aware fusion model is as follows:
The input high-dimensional semantic information image is defined as D ∈ R^(C×H×W). Several pooling operations of different scales extract four feature maps with an unchanged number of channels and different resolutions: D_0 ∈ R^(C×6×6), D_1 ∈ R^(C×3×3), D_2 ∈ R^(C×2×2), and D_3 ∈ R^(C×1×1);
The four feature maps are each reduced in dimension by a 1×1 convolution, a batch normalization algorithm, and a ReLU nonlinear activation function, compressing the number of channels to one quarter, yielding D′_i ∈ R^((C/4)×H_i×W_i);
The reduced feature maps are then up-sampled to obtain up-sampled feature maps D″_i of the same size as the high-dimensional semantic information image D, namely:
D″_i = Up(D′_i, β_i);
The up-sampled feature maps are concatenated along the channel dimension to obtain the concatenated feature map D_Cat, namely:
D_Cat = CONCAT(D″_0, D″_1, D″_2, D″_3);
where i denotes a natural number, β_i denotes a correlation coefficient, Up is conventional bilinear interpolation up-sampling, and CONCAT denotes concatenation along the channel dimension.
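The pyramid pooling and fusion described above can be sketched as follows. The 6×6, 3×3, 2×2, and 1×1 bin sizes are taken from the text; adaptive average pooling is an assumption (the pooling type is not stated), and the β_i coefficients are folded into the learned convolutions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CAF(nn.Module):
    """Context-aware fusion (sketch): pyramid pooling at four scales,
    channel compression to C/4, bilinear up-sampling, concatenation."""
    def __init__(self, channels, bins=(6, 3, 2, 1)):
        super().__init__()
        c4 = channels // 4
        self.pools = nn.ModuleList(nn.AdaptiveAvgPool2d(b) for b in bins)
        # CBR: 1x1 conv + batch norm + ReLU, compressing channels to a quarter
        self.cbrs = nn.ModuleList(
            nn.Sequential(nn.Conv2d(channels, c4, 1),
                          nn.BatchNorm2d(c4), nn.ReLU(inplace=True))
            for _ in bins)

    def forward(self, d):
        h, w = d.shape[-2:]
        # D''_i = Up(D'_i): up-sample each reduced map back to the input size
        ups = [F.interpolate(cbr(pool(d)), size=(h, w),
                             mode='bilinear', align_corners=False)
               for pool, cbr in zip(self.pools, self.cbrs)]
        return torch.cat(ups, dim=1)  # D_Cat: four C/4 maps, C channels total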
Referring to Fig. 4, CBR denotes a 1×1 convolution, a batch normalization algorithm, and a ReLU nonlinear activation function.
In an embodiment of the present application, referring to Fig. 5, the specific operation flow of the attention correction model is as follows:
A 1×1 convolution reduces the channel dimension of the concatenated feature map, and a Sigmoid activation function produces an attention weight map; attention matrix multiplication with the attention weight map reweights the concatenated feature map, modeling a spatial attention mechanism and yielding the feature map D_Spatial based on the spatial attention mechanism, namely:
D_Spatial = σ(S_0(D_Cat)) ⊗ D_Cat;
where ⊗ denotes attention matrix multiplication, σ(·) is the Sigmoid activation function, S_0 denotes a 1×1 convolution operation, and α is a coefficient associated with S_0;
Features are extracted from the concatenated feature map, and the extracted feature map is input into a channel attention mechanism to obtain channel weights and the feature map D_Channel based on the channel attention mechanism, namely:
D_Channel = F_Adaptive(G(D_Cat)) ⊗ D_Cat;
where F_Adaptive(·) achieves local cross-channel information interaction with different convolution kernel sizes; G(·) denotes global average pooling over the pixel spatial coordinates H′, W′; D_Channel denotes the channel-dimension attention mechanism; i and j denote natural numbers; and θ is a correlation coefficient of G(·);
The feature map based on the spatial attention mechanism and the feature map based on the channel attention mechanism are fused to obtain the fused feature map D_Out, namely:
D_Out = D_Spatial ⊕ D_Channel.
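A sketch of the two attention branches and their fusion. The channel branch is read here as an ECA-style local 1-D convolution over pooled channel descriptors, which is an assumption: the text only says F_Adaptive realizes local cross-channel interaction with different kernel sizes. The α and θ coefficients are folded into the learned weights, and the branch fusion is taken as pixel-wise addition.

```python
import torch
import torch.nn as nn

class APO(nn.Module):
    """Attention correction model (sketch): a spatial branch (1x1 conv +
    Sigmoid weight map) and a channel branch (global average pooling +
    local 1-D conv across channels), fused by element-wise addition."""
    def __init__(self, channels, k=3):
        super().__init__()
        self.s0 = nn.Conv2d(channels, 1, 1)                   # S_0: 1x1 conv
        # F_Adaptive: local cross-channel interaction (kernel size k assumed)
        self.f_adaptive = nn.Conv1d(1, 1, k, padding=k // 2, bias=False)

    def forward(self, d_cat):
        # spatial attention: sigmoid weight map reweights all channels
        d_spatial = torch.sigmoid(self.s0(d_cat)) * d_cat
        # channel attention: global average pooling over H, W -> (N, C)
        g = d_cat.mean(dim=(2, 3))
        w = torch.sigmoid(self.f_adaptive(g.unsqueeze(1))).squeeze(1)
        d_channel = d_cat * w[:, :, None, None]
        return d_spatial + d_channel                          # D_Out
```

The module preserves the input shape, so it can sit directly after the context-aware fusion output D_Cat.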
Fig. 6 is a data analysis table comparing the proposed method with current state-of-the-art polyp image segmentation methods, showing the various performance indicators more intuitively.
An embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of a polyp image segmentation method as provided by an embodiment of the present application.
Fig. 7 shows a specific structural block diagram of a computer device provided in an embodiment of the present application, where a computer device 100 includes: one or more processors 101, a memory 102, and one or more computer programs, wherein the processors 101 and the memory 102 are connected by a bus, the one or more computer programs being stored in the memory 102 and configured to be executed by the one or more processors 101, the processor 101 implementing the steps of the polyp image segmentation method as provided by an embodiment of the present application when executing the computer programs.
The computer device includes servers, terminals, and the like. The computer device may be a desktop computer, a mobile terminal, or a vehicle-mounted device; the mobile terminal includes at least one of a mobile phone, a tablet computer, a personal digital assistant, or a wearable device.
In the embodiments of the application, a multi-scale semantic fusion model is designed that collects semantic information images at different scales through multiple filters to improve representation capability, adapting to variation in polyp size, with finer granularity levels for smaller polyps in particular, and enlarging the network's receptive field by extracting features with convolution kernels of different scales. A context-aware pyramid aggregation model is also designed that guides the fusion of feature information from different regions; its built-in dual attention mechanism further strengthens important features and effectively suppresses features of unimportant regions, achieving accurate polyp segmentation while remaining real-time.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium. The storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic disks, optical disks, and the like.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.
Claims (7)
1. A method of segmenting a polyp image, comprising:
acquiring a polyp image to be segmented;
inputting a polyp image to be segmented into a multi-scale semantic fusion model to obtain a semantic information image, then performing down-sampling on the semantic information image to obtain a down-sampled semantic information image, inputting the down-sampled semantic information image into the multi-scale semantic fusion model again, then performing down-sampling, and repeating for multiple times to obtain a high-dimensional semantic information image; the multi-scale semantic fusion model is characterized in that an initial feature map with the same size as a polyp image to be segmented is obtained by extracting features of the polyp image to be segmented, the initial feature map is divided into 4 feature maps with the same channel number, 3 feature maps are selected to be spliced with the remaining feature map in channel dimension in sequence after being subjected to convolution and batch regularization algorithm, and the feature maps obtained after residual connection and splicing are fused with the polyp image to be segmented to obtain a semantic information image;
inputting the high-dimensional semantic information image into a context-aware pyramid aggregation model and outputting a fused feature map, up-sampling the fused feature map and extracting features through convolution, up-sampling the feature map after feature extraction again and extracting features through convolution, and repeating multiple times until a final feature map with the same size and number of channels as the polyp image to be segmented is obtained;
wherein the context-aware pyramid aggregation model performs pooling operations at multiple different scales on the input high-dimensional semantic information image, extracting four feature maps with an unchanged number of channels and different resolutions; after dimensionality reduction, the four feature maps are up-sampled in turn to obtain up-sampled feature maps of the same size as the high-dimensional semantic information image, which are spliced along the channel dimension to obtain a spliced feature map; a convolution reduces the channel dimension of the spliced feature map and a Sigmoid activation function yields an attention weight map; attention matrix multiplication is performed with the attention weight map to reshape the weights of the spliced feature map, giving a feature map based on a spatial attention mechanism; features are extracted from the spliced feature map and input into a channel attention mechanism to obtain channel weights and a feature map based on the channel attention mechanism; and the feature map based on the spatial attention mechanism and the feature map based on the channel attention mechanism are fused to obtain the fused feature map.
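The overall pipeline of claim 1 (repeated fusion plus down-sampling to a high-dimensional semantic map, then repeated up-sampling plus convolution back to the input resolution) can be sketched as an encoder-decoder skeleton. The following is a minimal PyTorch illustration only: plain strided convolutions stand in for the multi-scale semantic fusion and context-aware pyramid aggregation models, and all module names, channel counts, and stage counts are assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolypSegmenter(nn.Module):
    """Skeleton of the claimed pipeline. Placeholder convolutions stand in
    for the patent's fusion and aggregation models: repeat (fusion block ->
    down-sample) to build the high-dimensional semantic map, then repeat
    (up-sample -> conv) until the input resolution is restored, ending with
    a 1-channel segmentation mask."""
    def __init__(self, stages: int = 3, base: int = 16):
        super().__init__()
        chans = [base * 2 ** i for i in range(stages + 1)]
        self.stem = nn.Conv2d(3, base, 3, padding=1)
        # encoder: each stage halves resolution and doubles channels
        self.down = nn.ModuleList(
            nn.Conv2d(chans[i], chans[i + 1], 3, 2, 1) for i in range(stages))
        # decoder: each stage doubles resolution and halves channels
        self.up = nn.ModuleList(
            nn.Conv2d(chans[i + 1], chans[i], 3, padding=1)
            for i in reversed(range(stages)))
        self.head = nn.Conv2d(base, 1, 1)

    def forward(self, x):
        f = self.stem(x)
        for down in self.down:          # fusion model + down-sampling, repeated
            f = F.relu(down(f))
        for up in self.up:              # up-sampling + conv extraction, repeated
            f = F.relu(up(F.interpolate(f, scale_factor=2, mode='bilinear',
                                        align_corners=False)))
        return torch.sigmoid(self.head(f))  # input resolution, 1 channel
```

With the default three stages, a 64×64 input is reduced to 8×8 in the encoder and restored to 64×64 by the decoder, matching the claim's requirement that the final map have the same spatial size as the input.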
2. The polyp image segmentation method as set forth in claim 1, wherein the specific flow of the multi-scale semantic fusion model is:
the polyp image X to be segmented is defined as: x is formed by R C×H×W Passing the polyp image to be segmented through W 1 (. The) carries on the characteristic extraction, get an initial characteristic map X' with the same size of the polyp picture to be cut apart: x' is belonged to R C×H×W ;
wherein W_1(·) includes a 1×1 convolution, a batch regularization algorithm, and a ReLU nonlinear activation function;
According to the channel dimension, the initial feature map X′ ∈ R^(C×H×W) is divided into 4 feature maps with the same number of channels, X_0, X_1, X_2, X_3 ∈ R^((C/4)×H×W);
3 of the feature maps, X_1, X_2, and X_3, are transformed through W_2(·), and the transformed feature maps W_2(X_1), W_2(X_2), W_2(X_3) are spliced in sequence with the remaining feature map X_0 along the channel dimension to obtain a spliced feature map X_Cat with the same number of channels as the polyp image to be segmented, namely:

X_Cat = CONCAT(W_2(X_1), W_2(X_2), W_2(X_3), X_0);
wherein W_2(·) includes a 3×3 convolution and a batch regularization algorithm;
The spliced feature map is connected by a residual and fused with the polyp image to be segmented, and the semantic information image X_Out is output, namely:

X_Out = W_3(X_Cat) ⊕ X;

wherein R denotes a three-dimensional array (image space); C, H, and W respectively denote the number of channels, length, and width of the image; ⊕ denotes pixel-level addition; CONCAT denotes splicing along the channel dimension; and W_3(·) includes a 1×1 convolution, a batch regularization algorithm, and a ReLU nonlinear activation function.
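The split-transform-concatenate-residual flow of claim 2 can be sketched directly in PyTorch. This is a minimal sketch under stated assumptions: the class and attribute names are illustrative, and `W_1`, `W_2`, `W_3` are realized with the layer compositions the claim lists (1×1 conv + BN + ReLU, 3×3 conv + BN, and 1×1 conv + BN + ReLU, respectively).

```python
import torch
import torch.nn as nn

class MultiScaleSemanticFusion(nn.Module):
    """Sketch of claim 2's fusion block: split the initial feature map into
    4 channel groups, transform 3 of them with 3x3 conv + BN (W_2), splice
    with the untouched group along channels, then fuse back with the input
    through a residual (pixel-level) addition via W_3."""
    def __init__(self, channels: int):
        super().__init__()
        assert channels % 4 == 0, "channel count must split into 4 groups"
        c4 = channels // 4
        # W_1: 1x1 conv + batch regularization + ReLU (initial extraction)
        self.w1 = nn.Sequential(nn.Conv2d(channels, channels, 1),
                                nn.BatchNorm2d(channels), nn.ReLU())
        # W_2: 3x3 conv + batch regularization, one per transformed group
        self.w2 = nn.ModuleList(
            nn.Sequential(nn.Conv2d(c4, c4, 3, padding=1), nn.BatchNorm2d(c4))
            for _ in range(3))
        # W_3: 1x1 conv + batch regularization + ReLU on the spliced map
        self.w3 = nn.Sequential(nn.Conv2d(channels, channels, 1),
                                nn.BatchNorm2d(channels), nn.ReLU())

    def forward(self, x):
        # X' = W_1(X), then split into X_0..X_3 along the channel dimension
        x0, x1, x2, x3 = torch.chunk(self.w1(x), 4, dim=1)
        # X_Cat = CONCAT(W_2(X_1), W_2(X_2), W_2(X_3), X_0)
        x_cat = torch.cat([self.w2[0](x1), self.w2[1](x2),
                           self.w2[2](x3), x0], dim=1)
        # X_Out = W_3(X_Cat) ⊕ X  (residual fusion with the input)
        return self.w3(x_cat) + x
```

Because every branch preserves spatial size and the groups re-concatenate to C channels, the output has the same shape as the input, as the claim requires.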
3. The polyp image segmentation method of claim 1, wherein the context-aware pyramid aggregation model comprises a context-aware fusion model and an attention correction model.
4. The polyp image segmentation method as set forth in claim 3, wherein the specific operation flow of the context-aware fusion model is:
The input high-dimensional semantic information image D is defined as D ∈ R^(C×H×W). Pooling operations at several different scales extract four feature maps with an unchanged number of channels and different resolutions, respectively: D_0 ∈ R^(C×6×6), D_1 ∈ R^(C×3×3), D_2 ∈ R^(C×2×2), and D_3 ∈ R^(C×1×1);
The four feature maps are each reduced in dimension through a 1×1 convolution, a batch regularization algorithm, and a ReLU nonlinear activation function, compressing the number of channels to one quarter;
The reduced feature maps are then up-sampled to obtain up-sampled feature maps D″_i of the same size as the high-dimensional semantic information image D, namely:

D″_i = Up(D′_i, β_i);
The up-sampled feature maps are spliced along the channel dimension to obtain a spliced feature map D_Cat, namely:

D_Cat = CONCAT(D″_0, D″_1, D″_2, D″_3);
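The pyramid flow of claim 4 (multi-scale pooling, per-branch channel reduction to C/4, up-sampling, channel splicing) can be sketched as follows. This is a hedged illustration only: the class name is an assumption, adaptive average pooling and bilinear up-sampling are assumed as the concrete pooling/up-sampling operators since the claim does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextAwareFusion(nn.Module):
    """Sketch of claim 4: pool the input to 6x6, 3x3, 2x2, and 1x1 grids,
    reduce each branch to C/4 channels with 1x1 conv + BN + ReLU, up-sample
    each branch back to the input size, and splice along channels so the
    output again has C channels."""
    def __init__(self, channels: int, bins=(6, 3, 2, 1)):
        super().__init__()
        assert channels % 4 == 0
        c4 = channels // 4
        self.bins = bins
        # one 1x1-conv reduction head per pyramid branch
        self.reduce = nn.ModuleList(
            nn.Sequential(nn.Conv2d(channels, c4, 1),
                          nn.BatchNorm2d(c4), nn.ReLU())
            for _ in bins)

    def forward(self, d):
        h, w = d.shape[2:]
        branches = []
        for bin_size, red in zip(self.bins, self.reduce):
            di = F.adaptive_avg_pool2d(d, bin_size)   # D_i: C x bin x bin
            di = red(di)                              # D'_i: C/4 channels
            # D''_i: up-sample back to the input resolution
            branches.append(F.interpolate(di, size=(h, w), mode='bilinear',
                                          align_corners=False))
        return torch.cat(branches, dim=1)             # D_Cat: C channels
```

Note the four C/4-channel branches splice back to exactly C channels, so D_Cat matches the input's channel count while mixing context from four receptive-field scales.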
5. The polyp image segmentation method as set forth in claim 4, wherein the specific operation flow of the attention correction model is:
A 1×1 convolution is adopted to reduce the channel dimension of the spliced feature map, and an attention weight map is obtained through a Sigmoid activation function; attention matrix multiplication is performed with the attention weight map to reshape the weights of the spliced feature map, modeling a spatial attention mechanism and obtaining a feature map D_Spatial based on the spatial attention mechanism, namely:

D_Spatial = σ(αS_0(D_Cat)) ⊗ D_Cat;

wherein ⊗ denotes attention matrix multiplication, σ(·) is the Sigmoid activation function, S_0 denotes a 1×1 convolution operation, and α is a coefficient correlated with S_0;
Features are extracted from the spliced feature map to obtain an extracted feature map, and the extracted feature map is input into a channel attention mechanism to obtain channel weights and a feature map D_Channel based on the channel attention mechanism, namely:

D_Channel = F_Adaptive(G(D_Cat)) ⊗ D_Cat;

wherein F_Adaptive(·) achieves local cross-channel information interaction with different convolution kernel sizes, G(·) denotes global average pooling over the pixel spatial coordinates H′ and W′, D_Channel denotes the channel-dimension attention mechanism, i and j denote natural numbers, and θ is a correlation coefficient of G(·);
The feature map based on the spatial attention mechanism and the feature map based on the channel attention mechanism are fused to obtain the fused feature map D_Out, namely:

D_Out = D_Spatial ⊕ D_Channel.
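The dual attention correction of claim 5 can be sketched as below. This is a minimal, non-authoritative sketch: the class name is illustrative, the spatial branch uses element-wise reweighting by the Sigmoid attention map (one common reading of the claim's "attention matrix multiplication"), and the channel branch assumes an ECA-style 1D convolution across channels as the concrete form of F_Adaptive's local cross-channel interaction.

```python
import torch
import torch.nn as nn

class AttentionCorrection(nn.Module):
    """Sketch of claim 5's dual attention: a spatial branch (1x1 conv S_0
    -> Sigmoid -> per-pixel reweighting of D_Cat) and a channel branch
    (global average pooling G -> 1D conv across channels F_Adaptive ->
    Sigmoid -> per-channel reweighting), fused by pixel-level addition."""
    def __init__(self, channels: int, k: int = 3):
        super().__init__()
        self.s0 = nn.Conv2d(channels, 1, 1)       # S_0: channel reduction to 1
        # F_Adaptive: local cross-channel interaction via a 1D conv of size k
        self.f_adaptive = nn.Conv1d(1, 1, k, padding=k // 2)

    def forward(self, d_cat):
        # spatial branch: one Sigmoid weight per pixel, broadcast over channels
        d_spatial = torch.sigmoid(self.s0(d_cat)) * d_cat
        # channel branch: G(.) global average pooling, then conv over channels
        g = d_cat.mean(dim=(2, 3))                               # B x C
        w = torch.sigmoid(self.f_adaptive(g.unsqueeze(1))).squeeze(1)  # B x C
        d_channel = d_cat * w.unsqueeze(-1).unsqueeze(-1)
        # D_Out = D_Spatial ⊕ D_Channel (pixel-level addition)
        return d_spatial + d_channel
```

Both branches preserve the input shape, so the fused output D_Out can be handed directly to the decoder's up-sampling stages described in claim 1.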
6. A computer-readable storage medium in which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the polyp image segmentation method according to any one of claims 1 to 5.
7. A computer device, comprising:
one or more processors;
a memory; and one or more computer programs, wherein the processor and the memory are connected by a bus, the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the steps of the polyp image segmentation method according to any one of claims 1 to 5 are implemented when the computer programs are executed by the processors.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211261125.0A CN115439470B (en) | 2022-10-14 | 2022-10-14 | Polyp image segmentation method, computer readable storage medium and computer device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115439470A true CN115439470A (en) | 2022-12-06 |
CN115439470B CN115439470B (en) | 2023-05-26 |