CN115439470A - Polyp image segmentation method, computer-readable storage medium, and computer device - Google Patents



Publication number
CN115439470A
CN115439470A
Authority
CN
China
Prior art keywords
channel
image
feature map
polyp
semantic information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211261125.0A
Other languages
Chinese (zh)
Other versions
CN115439470B (en
Inventor
施连焘
李正国
王玉峰
李建阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Polytechnic
Original Assignee
Shenzhen Polytechnic
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Polytechnic filed Critical Shenzhen Polytechnic
Priority to CN202211261125.0A priority Critical patent/CN115439470B/en
Publication of CN115439470A publication Critical patent/CN115439470A/en
Application granted granted Critical
Publication of CN115439470B publication Critical patent/CN115439470B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T 7/0012: Biomedical image inspection
    • G06N 3/02, 3/08: Neural networks; learning methods
    • G06T 3/4038: Scaling the whole image or part thereof for image mosaicing
    • G06T 3/4046: Scaling the whole image or part thereof using neural networks
    • G06T 5/50: Image enhancement or restoration by the use of more than one image
    • G06T 7/11: Region-based segmentation
    • G06T 2200/32: Indexing scheme involving image mosaicing
    • G06T 2207/20021: Dividing image into blocks, subimages or windows
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/30096: Tumor; Lesion

Abstract

The application provides a polyp image segmentation method, a computer-readable storage medium, and a computer device. The method comprises: inputting a polyp image to be segmented into a multi-scale semantic fusion model to obtain a semantic information image, down-sampling the semantic information image, inputting the down-sampled semantic information image into the multi-scale semantic fusion model again followed by down-sampling, and repeating multiple times to obtain a high-dimensional semantic information image; inputting the high-dimensional semantic information image into a context-aware pyramid aggregation model and outputting a fused feature map; up-sampling the fused feature map and extracting features through convolution, then up-sampling the feature map after feature extraction again and extracting features through convolution, repeating multiple times until a final feature map with the same number of channels and size as the polyp image to be segmented is obtained. The method can adapt to polyp variation and thereby achieve accurate polyp segmentation.

Description

Polyp image segmentation method, computer-readable storage medium, and computer device
Technical Field
The present application relates to the field of image segmentation, and more particularly, to a polyp image segmentation method, a computer-readable storage medium, and a computer device.
Background
Colorectal cancer develops over a long period and for many reasons, but its earliest stage is the formation of polyps (raised masses in the gastrointestinal tract) in the intestine. If polyps can be detected and resected through early intervention, colorectal cancer can be prevented. The most effective method for screening and diagnosing colorectal cancer is colorectal endoscopy, which is currently the mainstream method for diagnosing the disease.
However, although the current diagnostic method is advanced and accurate, some problems remain. According to some professional research reports, roughly one in four polyps is missed during endoscopy, causing incomplete resection and leaving hidden risks. In addition, polyps differ in shape and are changeable, so fine judgments are difficult to make with the naked eye, especially when a polyp is hardly distinguishable from the gastrointestinal background. Finally, rapid identification cannot be achieved by human observation alone: it demands considerable time and energy and adds a heavy workload for gastroenterologists under the current medical system.
Disclosure of Invention
The application aims to provide a polyp image segmentation method, a computer-readable storage medium, and a computer device, so as to solve the problem that polyps missed during endoscopy lead to incomplete resection and leave hidden risks.
In a first aspect, the present application provides a polyp image segmentation method, comprising:
acquiring a polyp image to be segmented;
inputting the polyp image to be segmented into a multi-scale semantic fusion model to obtain a semantic information image, down-sampling the semantic information image to obtain a down-sampled semantic information image, inputting the down-sampled semantic information image into the multi-scale semantic fusion model again followed by down-sampling, and repeating multiple times to obtain a high-dimensional semantic information image; the multi-scale semantic fusion model extracts features from the polyp image to be segmented to obtain an initial feature map of the same size as the polyp image to be segmented, divides the initial feature map into 4 feature maps with the same number of channels, selects 3 of the feature maps to pass through a convolution and batch normalization and splices them with the remaining feature map in sequence along the channel dimension, and fuses the spliced feature map, via a residual connection, with the polyp image to be segmented to obtain the semantic information image;
inputting the high-dimensional semantic information image into a context-aware pyramid aggregation model and outputting a fused feature map; up-sampling the fused feature map and extracting features through convolution, then up-sampling the feature map after feature extraction again and extracting features through convolution, repeating multiple times until a final feature map with the same number of channels and size as the polyp image to be segmented is obtained;
the context-aware pyramid aggregation model performs pooling operations of multiple different scales on the input high-dimensional semantic information image, extracting four feature maps with an unchanged number of channels and different resolutions; after dimension reduction, the four feature maps are up-sampled in sequence to obtain up-sampled feature maps of the same size as the high-dimensional semantic information image, which are spliced along the channel dimension to obtain a spliced feature map; a convolution reduces the channel dimension of the spliced feature map, a Sigmoid activation function yields an attention weight map, attention matrix multiplication is performed with the attention weight map, and the weights of the spliced feature map are reshaped to obtain a feature map based on a spatial attention mechanism; features are extracted from the spliced feature map and input into a channel attention mechanism to obtain channel weights and a feature map based on the channel attention mechanism; and the feature map based on the spatial attention mechanism and the feature map based on the channel attention mechanism are fused to obtain the fused feature map.
Further, the specific process of the multi-scale semantic fusion model is as follows:
The polyp image to be segmented is defined as X ∈ R^(C×H×W). Passing it through W_1(·) for feature extraction yields an initial feature map X′ of the same size as the polyp image to be segmented, X′ ∈ R^(C×H×W);
W_1(·) comprises a 1×1 convolution, batch normalization, and a ReLU nonlinear activation function;
along the channel dimension, the initial feature map X′ ∈ R^(C×H×W) is divided into 4 feature maps with the same number of channels, X_0, X_1, X_2, X_3 ∈ R^((C/4)×H×W);
3 of the feature maps, X_1, X_2, X_3, are transformed by W_2(·), and the transformed feature maps W_2(X_1), W_2(X_2), W_2(X_3) are spliced with the remaining feature map X_0 in sequence along the channel dimension to obtain a spliced feature map X_Cat with the same number of channels as the polyp image to be segmented, namely:
X_Cat = CONCAT(W_2(X_1), W_2(X_2), W_2(X_3), X_0);
W_2(·) comprises a 3×3 convolution and batch normalization;
the spliced feature map is passed through a residual connection and fused with the polyp image to be segmented, outputting the semantic information image X_Out, namely:
X_Out = W_3(X_Cat) ⊕ X;
where R denotes a three-dimensional array image; C, H, and W denote the number of channels, height, and width of the image, respectively; ⊕ denotes pixel-wise addition; CONCAT denotes splicing along the channel dimension; and W_3(·) comprises a 1×1 convolution, batch normalization, and a ReLU nonlinear activation function.
Further, the context-aware pyramid aggregation model includes a context-aware fusion model and an attention correction model.
Further, the specific operation flow of the context-aware fusion model is as follows:
The input high-dimensional semantic information image is defined as D ∈ R^(C×H×W). Several pooling operations of different scales extract four feature maps with an unchanged number of channels and different resolutions, respectively: D_0 ∈ R^(C×6×6), D_1 ∈ R^(C×3×3), D_2 ∈ R^(C×2×2), and D_3 ∈ R^(C×1×1);
the four feature maps are each reduced in dimension by a 1×1 convolution, batch normalization, and a ReLU nonlinear activation function, compressing the number of channels to one quarter, namely:
D′_i = CBR(D_i), with D′_0 ∈ R^((C/4)×6×6), D′_1 ∈ R^((C/4)×3×3), D′_2 ∈ R^((C/4)×2×2), and D′_3 ∈ R^((C/4)×1×1);
the dimension-reduced feature maps are then up-sampled to obtain up-sampled feature maps D″_i of the same size as the high-dimensional semantic information image D, namely:
D″_i = Up(D′_i; β_i);
the up-sampled feature maps are spliced along the channel dimension to obtain the spliced feature map D_Cat, namely:
D_Cat = CONCAT(D″_0, D″_1, D″_2, D″_3);
where D″_i ∈ R^((C/4)×H×W); i denotes a natural number; β_i denotes a correlation coefficient; Up is conventional bilinear interpolation up-sampling; and CONCAT is splicing along the channel dimension.
Further, the specific operation flow of the attention correction model is as follows:
A 1×1 convolution is applied to the spliced feature map to reduce the channel dimension, a Sigmoid activation function yields an attention weight map, attention matrix multiplication is performed with the attention weight map, and the weights of the spliced feature map are reshaped, modeling a spatial attention mechanism and obtaining the feature map D_Spatial based on the spatial attention mechanism, namely:
D_Spatial = σ(S_0(D_Cat; α)) ⊗ D_Cat;
where ⊗ denotes attention matrix multiplication; σ(·) is the Sigmoid activation function; S_0 denotes a 1×1 convolution operation; and α is a coefficient associated with S_0;
features are extracted from the spliced feature map to obtain an extracted feature map, which is input into a channel attention mechanism to obtain channel weights, acquiring the feature map based on the channel attention mechanism, namely:
D_Channel = σ(F_Adaptive(G(D_Cat); θ)) ⊗ D_Cat;
where F_Adaptive(·) achieves local cross-channel information interaction with different convolution kernel sizes, and G(·) denotes global average pooling,
G(D_Cat) = (1/(H′×W′)) Σ_(i=1..H′) Σ_(j=1..W′) D_Cat(i, j);
H′ and W′ refer to the pixel spatial dimensions; D_Channel denotes the channel-dimension attention mechanism; i and j denote natural numbers; and θ is a coefficient associated with G(·);
the feature map based on the spatial attention mechanism and the feature map based on the channel attention mechanism are fused to obtain the fused feature map D_Out, namely:
D_Out = D_Spatial ⊕ D_Channel.
in a second aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the polyp image segmentation method.
In a third aspect, the present application provides a computer device comprising: one or more processors, a memory, and one or more computer programs, the processors and the memory being connected by a bus, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, which when executing the computer programs implement the steps of the polyp image segmentation method.
In the application, a multi-scale semantic fusion model is designed that collects semantic information images of different scales through multiple filters to improve representation capability, thereby adapting to changes in polyp size (in particular, smaller polyps are handled internally at finer granularity levels), and the receptive field of the network is enlarged by extracting features with convolution kernels of different scales. A context-aware pyramid aggregation model is designed that guides the fusion of feature information from different regions; the dual attention mechanism it contains further strengthens important features and effectively suppresses features of unimportant regions, achieving accurate polyp segmentation while maintaining real-time performance.
Drawings
Fig. 1 is a flowchart of a polyp image segmentation method according to an embodiment of the present application.
Fig. 2 is a flowchart of another polyp image segmentation method according to an embodiment of the present application.
Fig. 3 is a flowchart of a multi-scale semantic fusion model provided in an embodiment of the present application.
Fig. 4 is a flowchart of a context-aware fusion model according to an embodiment of the present application.
Fig. 5 is a flowchart of an attention correction model according to an embodiment of the present application.
Fig. 6 is a table of data analysis provided by an embodiment of the present application in contrast to current advanced polyp image segmentation methods.
Fig. 7 is a block diagram illustrating a specific structure of a computer device according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the application and are not intended to limit it.
In order to explain the technical solution described in the present application, the following description will be given by way of specific examples.
Referring to fig. 1, a polyp image segmentation method according to an embodiment of the present application includes the following steps. Note that the method is not limited to the flow sequence shown in fig. 1 if substantially the same result is obtained.
S101, obtaining a polyp image to be segmented;
S102, inputting the polyp image to be segmented into a multi-scale semantic fusion model to obtain a semantic information image, down-sampling the semantic information image to obtain a down-sampled semantic information image, inputting the down-sampled semantic information image into the multi-scale semantic fusion model again followed by down-sampling, and repeating multiple times to obtain a high-dimensional semantic information image; the multi-scale semantic fusion model extracts features from the polyp image to be segmented to obtain an initial feature map of the same size as the polyp image to be segmented, divides the initial feature map into 4 feature maps with the same number of channels, selects 3 of the feature maps to pass through a convolution and batch normalization and splices them with the remaining feature map in sequence along the channel dimension, and fuses the spliced feature map, via a residual connection, with the polyp image to be segmented to obtain the semantic information image;
S103, inputting the high-dimensional semantic information image into a context-aware pyramid aggregation model and outputting a fused feature map; up-sampling the fused feature map and extracting features through convolution, then up-sampling the feature map after feature extraction again and extracting features through convolution, repeating multiple times until a final feature map with the same number of channels and size as the polyp image to be segmented is obtained;
s104, performing pooling operation on an input high-dimensional semantic information image in multiple different scales by using the context-aware pyramid aggregation model, extracting four feature maps with unchanged channel number and different resolutions, performing dimensionality reduction on the four feature maps, then sequentially performing upsampling on the feature maps to obtain an upsampled feature map with the same size as the high-dimensional semantic information image, and splicing the upsampled feature maps by using channel dimensionality to obtain a spliced feature map; performing convolution on the spliced feature graph to reduce the dimension of a channel, obtaining an attention weight graph by using a Sigmoid activation function, performing attention moment matrix multiplication on the attention weight graph, and reshaping the weight of the spliced feature graph to obtain a feature graph based on a space attention mechanism; performing feature extraction on the spliced feature map, and inputting the feature map into a channel attention mechanism to obtain channel weight and a feature map based on the channel attention mechanism; and fusing the characteristic diagram based on the space attention mechanism and the characteristic diagram based on the channel attention mechanism to obtain a fused characteristic diagram.
Referring to fig. 2, 001 denotes the multi-scale semantic fusion model, 002 down-sampling, 003 the context-aware pyramid aggregation model, and 004 up-sampling; CAF denotes the context-aware fusion model, and APO the attention correction model; 005 denotes convolutional feature extraction. The left and right sides are symmetrical: the left area is the encoding region, the right the decoding region, and dashed arrows denote skip-connection operations.
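As a rough illustration of this symmetric encoder-decoder flow, the following NumPy sketch tracks only tensor shapes through the repeated downsample/upsample loop; `msfm` is a shape-preserving placeholder for the multi-scale semantic fusion model, and the pooling, upsampling, stage count, and input size are illustrative assumptions, not the patented implementation:

```python
import numpy as np

def msfm(x):
    # placeholder for the multi-scale semantic fusion model: shape-preserving
    return x

def downsample(x):
    # 2x2 max pooling halves the spatial resolution
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def upsample(x):
    # nearest-neighbour upsampling doubles the spatial resolution
    return x.repeat(2, axis=1).repeat(2, axis=2)

def encode(x, stages=4):
    # repeat: fuse semantics, then downsample
    for _ in range(stages):
        x = downsample(msfm(x))
    return x

def decode(x, stages=4):
    # repeat: upsample (followed by a convolution in the real model)
    for _ in range(stages):
        x = upsample(x)
    return x

x = np.random.rand(3, 96, 96)   # polyp image to be segmented, C×H×W
high_dim = encode(x)            # high-dimensional semantic information image
print(high_dim.shape)           # (3, 6, 6)
out = decode(high_dim)          # final map matches the input spatial size
print(out.shape)                # (3, 96, 96)
```

After four stages the spatial size shrinks from 96×96 to 6×6, which is consistent with the 6×6 largest pooling scale used later in the context-aware fusion model.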
Referring to fig. 3, in an embodiment of the present application, a specific process of the multi-scale semantic fusion model (i.e., MSFM) is as follows:
The polyp image to be segmented is defined as X ∈ R^(C×H×W). Passing it through W_1(·) for feature extraction yields an initial feature map X′ of the same size as the polyp image to be segmented, X′ ∈ R^(C×H×W);
W_1(·) comprises a 1×1 convolution, batch normalization, and a ReLU nonlinear activation function;
along the channel dimension, the initial feature map X′ ∈ R^(C×H×W) is divided into 4 feature maps with the same number of channels, X_0, X_1, X_2, X_3 ∈ R^((C/4)×H×W);
3 of the feature maps, X_1, X_2, X_3, are transformed by W_2(·), and the transformed feature maps W_2(X_1), W_2(X_2), W_2(X_3) are spliced with the remaining feature map X_0 in sequence along the channel dimension to obtain a spliced feature map X_Cat with the same number of channels as the polyp image to be segmented, namely:
X_Cat = CONCAT(W_2(X_1), W_2(X_2), W_2(X_3), X_0);
W_2(·) comprises a 3×3 convolution and batch normalization;
the spliced feature map is passed through a residual connection and fused with the polyp image to be segmented, outputting the semantic information image X_Out, namely:
X_Out = W_3(X_Cat) ⊕ X;
where R denotes a three-dimensional array image; C, H, and W denote the number of channels, height, and width of the image, respectively; ⊕ denotes pixel-wise addition; CONCAT denotes splicing along the channel dimension; and W_3(·) comprises a 1×1 convolution, batch normalization, and a ReLU nonlinear activation function.
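The channel bookkeeping of the MSFM split-and-splice step can be checked with a small NumPy sketch; `w2` is a shape-preserving stand-in for the 3×3 convolution plus batch normalization, and the sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
X_prime = rng.random((C, H, W))   # initial feature map X' after W_1

# divide along the channel dimension into 4 maps with C/4 channels each
X0, X1, X2, X3 = np.split(X_prime, 4, axis=0)
assert X1.shape == (C // 4, H, W)

def w2(x):
    # stand-in for the 3x3 conv + batch-norm branch W_2: shape-preserving
    return x

# splice the three transformed maps with the remaining one, channel-wise
X_cat = np.concatenate([w2(X1), w2(X2), w2(X3), X0], axis=0)
print(X_cat.shape)                # back to (8, 16, 16), i.e. C×H×W
```

The split/concat pair restores the original channel count, which is why X_Cat can be residually fused with the input image of the same shape.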
In an embodiment of the present application, the context-aware pyramid aggregation model (i.e., CPAM) includes a context-aware fusion model and an attention-correction model.
In an embodiment of the present application, a specific operation flow of the context-aware fusion model is as follows:
The input high-dimensional semantic information image is defined as D ∈ R^(C×H×W). Several pooling operations of different scales extract four feature maps with an unchanged number of channels and different resolutions, respectively: D_0 ∈ R^(C×6×6), D_1 ∈ R^(C×3×3), D_2 ∈ R^(C×2×2), and D_3 ∈ R^(C×1×1);
the four feature maps are each reduced in dimension by a 1×1 convolution, batch normalization, and a ReLU nonlinear activation function, compressing the number of channels to one quarter, namely:
D′_i = CBR(D_i), with D′_0 ∈ R^((C/4)×6×6), D′_1 ∈ R^((C/4)×3×3), D′_2 ∈ R^((C/4)×2×2), and D′_3 ∈ R^((C/4)×1×1);
the dimension-reduced feature maps are then up-sampled to obtain up-sampled feature maps D″_i of the same size as the high-dimensional semantic information image D, namely:
D″_i = Up(D′_i; β_i);
the up-sampled feature maps are spliced along the channel dimension to obtain the spliced feature map D_Cat, namely:
D_Cat = CONCAT(D″_0, D″_1, D″_2, D″_3);
where D″_i ∈ R^((C/4)×H×W); i denotes a natural number; β_i denotes a correlation coefficient; Up is conventional bilinear interpolation up-sampling; and CONCAT is splicing along the channel dimension.
Referring to FIG. 4, CBR represents a 1×1 convolution, batch normalization, and a ReLU nonlinear activation function.
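The pyramid pooling, channel compression, and splicing described above can be sketched in NumPy; nearest-neighbour upsampling and channel-group averaging are admittedly crude stand-ins for bilinear Up(·) and the learned CBR block, and the sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 8, 12, 12
D = rng.random((C, H, W))         # high-dimensional semantic information image

def adaptive_avg_pool(x, size):
    # average-pool each channel down to size×size
    c, h, w = x.shape
    return x.reshape(c, size, h // size, size, w // size).mean(axis=(2, 4))

def reduce_channels(x):
    # stand-in for the CBR block that compresses channels to C/4:
    # here, average groups of 4 channels
    c, h, w = x.shape
    return x.reshape(c // 4, 4, h, w).mean(axis=1)

def upsample_to(x, h, w):
    # nearest-neighbour stand-in for bilinear Up(·)
    return x.repeat(h // x.shape[1], axis=1).repeat(w // x.shape[2], axis=2)

# four pooling scales; then reduce, upsample, and splice channel-wise
branches = []
for size in (6, 3, 2, 1):
    d = adaptive_avg_pool(D, size)   # C × size × size
    d = reduce_channels(d)           # C/4 × size × size
    branches.append(upsample_to(d, H, W))
D_cat = np.concatenate(branches, axis=0)
print(D_cat.shape)                   # (8, 12, 12): back to C×H×W
```

Because each branch carries C/4 channels, splicing the four branches restores the original channel count C at full resolution.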
In an embodiment of the present application, referring to fig. 5, a specific operation flow of the attention correction model includes:
A 1×1 convolution is applied to the spliced feature map to reduce the channel dimension, a Sigmoid activation function yields an attention weight map, attention matrix multiplication is performed with the attention weight map, and the weights of the spliced feature map are reshaped, modeling a spatial attention mechanism and obtaining the feature map D_Spatial based on the spatial attention mechanism, namely:
D_Spatial = σ(S_0(D_Cat; α)) ⊗ D_Cat;
where ⊗ denotes attention matrix multiplication; σ(·) is the Sigmoid activation function; S_0 denotes a 1×1 convolution operation; and α is a coefficient associated with S_0;
features are extracted from the spliced feature map to obtain an extracted feature map, which is input into a channel attention mechanism to obtain channel weights, acquiring the feature map based on the channel attention mechanism, namely:
D_Channel = σ(F_Adaptive(G(D_Cat); θ)) ⊗ D_Cat;
where F_Adaptive(·) achieves local cross-channel information interaction with different convolution kernel sizes, and G(·) denotes global average pooling,
G(D_Cat) = (1/(H′×W′)) Σ_(i=1..H′) Σ_(j=1..W′) D_Cat(i, j);
H′ and W′ refer to the pixel spatial dimensions; D_Channel denotes the channel-dimension attention mechanism; i and j denote natural numbers; and θ is a coefficient associated with G(·);
the feature map based on the spatial attention mechanism and the feature map based on the channel attention mechanism are fused to obtain the fused feature map D_Out, namely:
D_Out = D_Spatial ⊕ D_Channel.
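The dual attention correction can be sketched in NumPy as follows; the channel-mean and global-average-pooling stand-ins replace the learned S_0 and F_Adaptive operators, so this only illustrates the data flow, not the trained behaviour:

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 8, 12, 12
D_cat = rng.random((C, H, W))     # spliced feature map

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# spatial branch: collapse channels (stand-in for the 1x1 conv S_0),
# apply Sigmoid to get an H×W attention weight map, reweight every position
weight_map = sigmoid(D_cat.mean(axis=0))            # H × W
D_spatial = weight_map[None, :, :] * D_cat          # broadcast over channels

# channel branch: global average pooling G, then Sigmoid channel weights
# (stand-in for the adaptive cross-channel interaction F_Adaptive)
channel_weights = sigmoid(D_cat.mean(axis=(1, 2)))  # C
D_channel = channel_weights[:, None, None] * D_cat

# fuse the two attention-corrected maps by pixel-wise addition
D_out = D_spatial + D_channel
print(D_out.shape)                # (8, 12, 12)
```

The spatial branch reweights positions uniformly across channels while the channel branch reweights whole channels uniformly across positions; their sum gives the fused feature map D_Out.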
Fig. 6 is a data analysis table, provided by an embodiment of the present application, comparing the method with current advanced polyp image segmentation methods; it shows the various performance indicators more intuitively.
An embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of a polyp image segmentation method as provided by an embodiment of the present application.
Fig. 7 shows a specific structural block diagram of a computer device provided in an embodiment of the present application, where a computer device 100 includes: one or more processors 101, a memory 102, and one or more computer programs, wherein the processors 101 and the memory 102 are connected by a bus, the one or more computer programs being stored in the memory 102 and configured to be executed by the one or more processors 101, the processor 101 implementing the steps of the polyp image segmentation method as provided by an embodiment of the present application when executing the computer programs.
The computer equipment comprises a server, a terminal and the like. The computer device may be a desktop computer, a mobile terminal or a vehicle-mounted device, and the mobile terminal includes at least one of a mobile phone, a tablet computer, a personal digital assistant or a wearable device.
In the embodiment of the application, a multi-scale semantic fusion model is designed that collects semantic information images of different scales through multiple filters to improve representation capability, thereby adapting to changes in polyp size (in particular, smaller polyps are handled internally at finer granularity levels), and the receptive field of the network is enlarged by extracting features with convolution kernels of different scales. A context-aware pyramid aggregation model is designed that guides the fusion of feature information from different regions; the dual attention mechanism it contains further strengthens important features and effectively suppresses features of unimportant regions, achieving accurate polyp segmentation while maintaining real-time performance.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium; the storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (7)

1. A method of segmenting a polyp image, comprising:
acquiring a polyp image to be segmented;
inputting the polyp image to be segmented into a multi-scale semantic fusion model to obtain a semantic information image, down-sampling the semantic information image to obtain a down-sampled semantic information image, inputting the down-sampled semantic information image into the multi-scale semantic fusion model again followed by down-sampling, and repeating multiple times to obtain a high-dimensional semantic information image; wherein the multi-scale semantic fusion model extracts features from the polyp image to be segmented to obtain an initial feature map of the same size as the polyp image to be segmented, divides the initial feature map into 4 feature maps with the same number of channels, selects 3 of the feature maps to pass through a convolution and batch normalization and splices them with the remaining feature map in sequence along the channel dimension, and fuses the spliced feature map, via a residual connection, with the polyp image to be segmented to obtain the semantic information image;
inputting the high-dimensional semantic information image into a context-aware pyramid aggregation model to output a fused feature map, up-sampling the fused feature map and extracting features through convolution, then up-sampling the resulting feature map again and extracting features through convolution, repeating this multiple times until a final feature map with the same size and number of channels as the polyp image to be segmented is obtained;
wherein the context-aware pyramid aggregation model performs pooling operations at multiple different scales on the input high-dimensional semantic information image to extract four feature maps with an unchanged number of channels and different resolutions, reduces their dimensionality and up-samples each of the four feature maps in turn to obtain up-sampled feature maps of the same size as the high-dimensional semantic information image, and concatenates the up-sampled feature maps along the channel dimension to obtain a concatenated feature map; performs convolution on the concatenated feature map to reduce the channel dimension, obtains an attention weight map using a Sigmoid activation function, performs attention matrix multiplication with the attention weight map to reshape the weights of the concatenated feature map, obtaining a feature map based on a spatial attention mechanism; performs feature extraction on the concatenated feature map and inputs the result into a channel attention mechanism to obtain channel weights and a feature map based on the channel attention mechanism; and fuses the feature map based on the spatial attention mechanism with the feature map based on the channel attention mechanism to obtain the fused feature map.
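The repeated fuse/down-sample encoder and up-sample/convolve decoder of claim 1 can be wired up roughly as follows. This is a minimal PyTorch sketch, not the patent's implementation: `msf`, `capa` and `dec` are placeholder callables standing in for the multi-scale semantic fusion model, the context-aware pyramid aggregation model and the per-stage decoder convolutions, and the max-pooling choice and the number of stages are assumptions.

```python
import torch
import torch.nn.functional as F

def segment(x, msf, capa, dec, steps=3):
    # msf, capa and dec are stand-ins (assumptions) for the multi-scale
    # semantic fusion model, the context-aware pyramid aggregation model
    # and the per-stage decoder convolution described in claim 1.
    for _ in range(steps):                       # repeated fusion + down-sampling
        x = F.max_pool2d(msf(x), kernel_size=2)
    y = capa(x)                                  # fused high-dimensional feature map
    for _ in range(steps):                       # repeated up-sampling + convolution
        y = dec(F.interpolate(y, scale_factor=2, mode="bilinear",
                              align_corners=False))
    return y
```

With identity stand-ins, an input of size H×W comes back at H×W after `steps` halvings and `steps` doublings, matching the claim's requirement that the final feature map has the same size as the input.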
2. The polyp image segmentation method as set forth in claim 1, wherein the specific flow of the multi-scale semantic fusion model is:
the polyp image X to be segmented is defined as: x is formed by R C×H×W Passing the polyp image to be segmented through W 1 (. The) carries on the characteristic extraction, get an initial characteristic map X' with the same size of the polyp picture to be cut apart: x' is belonged to R C×H×W
W is 1 (. 1) includes a 1 × 1 convolution, a batch regularization algorithm, and a ReLU nonlinear activation function;
according to the channel dimensionDegree is to make the initial characteristic diagram X' be equal to R C×H×W Divided into 4 characteristic graphs with same channel number
Figure FDA0003891592040000021
3 feature maps X in the three 1 ,X 2 ,X 3 Via W 2 (. The) making a transition, and making the transformed characteristic diagram W 2 (X 1 ),W 2 (X 2 ),W 2 (X 3 ) With the remaining one of the feature maps X 0 Sequentially splicing according to the channel dimension to obtain a spliced characteristic diagram X with the same number as the channels of the polyp image to be segmented Cat Namely:
X Cat =CONCAT(W 2 (X 1 ),W 2 (X 2 ),W 2 (X 3 ),X 0 );
the W is 2 (. H) includes 3 × 3 convolution and batch regularization algorithms;
residual errors are connected and spliced to form a characteristic diagram, then the characteristic diagram is fused with a polyp image to be segmented, and a semantic information image X is output Out Namely:
Figure FDA0003891592040000022
wherein, R represents a three-dimensional array image, C, H and W respectively represent the channel number, length and width of the image;
Figure FDA0003891592040000023
the addition and summation operation of the pixel level is represented, and CONCAT represents splicing on the channel dimension; w 3 (. Cndot.) includes a 1 × 1 convolution, a batch regularization algorithm, and a ReLU nonlinear activation function.
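The flow of claim 2 maps onto a compact PyTorch module; the following is a hedged sketch (the class and attribute names are illustrative, and the channel count c is assumed divisible by 4):

```python
import torch
import torch.nn as nn

class MultiScaleSemanticFusion(nn.Module):
    # Sketch of the multi-scale semantic fusion model of claim 2.
    # W1/W3: 1x1 conv + batch norm + ReLU; W2: 3x3 conv + batch norm.
    def __init__(self, c):
        super().__init__()
        self.w1 = nn.Sequential(nn.Conv2d(c, c, 1), nn.BatchNorm2d(c), nn.ReLU())
        self.w2 = nn.ModuleList([
            nn.Sequential(nn.Conv2d(c // 4, c // 4, 3, padding=1),
                          nn.BatchNorm2d(c // 4))
            for _ in range(3)])
        self.w3 = nn.Sequential(nn.Conv2d(c, c, 1), nn.BatchNorm2d(c), nn.ReLU())

    def forward(self, x):
        # split X' into four equal channel groups X_0..X_3
        x0, x1, x2, x3 = torch.chunk(self.w1(x), 4, dim=1)
        # concatenate W2(X_1), W2(X_2), W2(X_3) with X_0 along channels
        x_cat = torch.cat([self.w2[0](x1), self.w2[1](x2),
                           self.w2[2](x3), x0], dim=1)
        return self.w3(x_cat) + x   # residual fusion with the input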
3. The polyp image segmentation method of claim 1, wherein the context-aware pyramid aggregation model comprises a context-aware fusion model and an attention correction model.
4. The polyp image segmentation method as set forth in claim 3, wherein the specific operation flow of the context-aware fusion model is:
the input high-dimensional semantic information image D is defined as D ∈ R^(C×H×W); pooling operations at several different scales are used to extract four feature maps with an unchanged number of channels and different resolutions, respectively:
D_0 ∈ R^(C×6×6), D_1 ∈ R^(C×3×3), D_2 ∈ R^(C×2×2) and D_3 ∈ R^(C×1×1);
the four feature maps are respectively reduced in dimension through a 1×1 convolution, a batch normalization algorithm and a ReLU nonlinear activation function, compressing the number of channels to one fourth, namely:
D′_0 ∈ R^((C/4)×6×6), D′_1 ∈ R^((C/4)×3×3), D′_2 ∈ R^((C/4)×2×2) and D′_3 ∈ R^((C/4)×1×1);
the reduced feature maps are then up-sampled to obtain up-sampled feature maps D″_i of the same size as the high-dimensional semantic information image D, namely:
D″_i = Up(D′_i, β_i);
the up-sampled feature maps are concatenated along the channel dimension to obtain a concatenated feature map D_Cat, namely:
D_Cat = CONCAT(D″_0, D″_1, D″_2, D″_3);
wherein D″_i ∈ R^((C/4)×H×W), i denotes a natural number, β_i denotes the corresponding up-sampling coefficient, Up denotes conventional bilinear interpolation up-sampling, and CONCAT denotes concatenation along the channel dimension.
5. The polyp image segmentation method as set forth in claim 4, wherein the specific operation flow of the attention correction model is:
a 1×1 convolution is used to reduce the channel dimension of the concatenated feature map, an attention weight map is obtained through a Sigmoid activation function, attention matrix multiplication is performed with the attention weight map to reshape the weights of the concatenated feature map, modeling a spatial attention mechanism and obtaining a feature map D_Spatial based on the spatial attention mechanism, namely:
D_Spatial = σ(S_0(D_Cat)) ⊗ D_Cat;
wherein ⊗ denotes attention matrix multiplication, σ(·) is the Sigmoid activation function, S_0 denotes a 1×1 convolution operation, and α is a coefficient associated with S_0;
feature extraction is performed on the concatenated feature map to obtain an extracted feature map, which is input into a channel attention mechanism to obtain channel weights and the feature map based on the channel attention mechanism, namely:
D_Channel = σ(F_Adaptive(G(D_Cat))) ⊗ D_Cat;
wherein F_Adaptive(·) achieves local cross-channel information interaction with different convolution kernel sizes, and G(·) denotes global average pooling:
G(D) = (1/(H′×W′)) Σ_{i=1}^{H′} Σ_{j=1}^{W′} D(i, j);
wherein H′, W′ refer to the pixel spatial coordinates, D_Channel denotes the channel-dimension attention feature map, i, j denote natural numbers, and θ is a coefficient associated with G(·);
the feature map based on the spatial attention mechanism and the feature map based on the channel attention mechanism are fused to obtain the fused feature map D_Out, namely:
D_Out = D_Spatial ⊕ D_Channel.
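The dual-attention flow of claim 5 can be sketched as below. The spatial branch is the 1×1 convolution S_0 followed by a Sigmoid weight map; for the channel branch, an ECA-style 1-D convolution over globally averaged channel descriptors is assumed for F_Adaptive (the kernel size k and the class name are assumptions, not taken from the claim):

```python
import torch
import torch.nn as nn

class AttentionCorrection(nn.Module):
    # Sketch of the attention correction model of claim 5.
    def __init__(self, c, k=3):
        super().__init__()
        self.s0 = nn.Conv2d(c, 1, 1)                        # reduce channels to 1
        self.f_adaptive = nn.Conv1d(1, 1, k, padding=k // 2)  # assumed F_Adaptive

    def forward(self, d_cat):
        # spatial attention: Sigmoid weight map, broadcast-multiplied
        d_spatial = torch.sigmoid(self.s0(d_cat)) * d_cat
        # channel attention: global average pooling G(.), then local
        # cross-channel interaction and Sigmoid channel weights
        g = d_cat.mean(dim=(2, 3))                          # B x C
        w = torch.sigmoid(self.f_adaptive(g.unsqueeze(1))).squeeze(1)
        d_channel = w[:, :, None, None] * d_cat
        return d_spatial + d_channel                        # pixel-level fusion
```

Both branches preserve the input shape, so the fused output D_Out has the same dimensions as D_Cat.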
6. a computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the polyp image segmentation method according to any one of claims 1 to 5.
7. A computer device, comprising:
one or more processors;
a memory; and one or more computer programs, wherein the processor and the memory are connected by a bus, the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the steps of the polyp image segmentation method according to any one of claims 1 to 5 are implemented when the computer programs are executed by the processors.
CN202211261125.0A 2022-10-14 2022-10-14 Polyp image segmentation method, computer readable storage medium and computer device Active CN115439470B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211261125.0A CN115439470B (en) 2022-10-14 2022-10-14 Polyp image segmentation method, computer readable storage medium and computer device


Publications (2)

Publication Number Publication Date
CN115439470A true CN115439470A (en) 2022-12-06
CN115439470B CN115439470B (en) 2023-05-26

Family

ID=84250185

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211261125.0A Active CN115439470B (en) 2022-10-14 2022-10-14 Polyp image segmentation method, computer readable storage medium and computer device

Country Status (1)

Country Link
CN (1) CN115439470B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465827A (en) * 2020-12-09 2021-03-09 北京航空航天大学 Contour perception multi-organ segmentation network construction method based on class-by-class convolution operation
CN113506300A (en) * 2021-06-25 2021-10-15 江苏大学 Image semantic segmentation method and system based on rainy complex road scene
CN113538313A (en) * 2021-07-22 2021-10-22 深圳大学 Polyp segmentation method and device, computer equipment and storage medium
CN114170167A (en) * 2021-11-29 2022-03-11 深圳职业技术学院 Polyp segmentation method and computer device based on attention-guided context correction
CN114581662A (en) * 2022-02-17 2022-06-03 华南理工大学 Method, system, device and storage medium for segmenting brain tumor image
CN115018824A (en) * 2022-07-21 2022-09-06 湘潭大学 Colonoscope polyp image segmentation method based on CNN and Transformer fusion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIANTAO SHI: "FRCNet: Feature Refining and Context-Guided Network for Efficient Polyp Segmentation", Frontiers in Bioengineering and Biotechnology *
FAN Runze et al.: "Semantic segmentation model for road scenes based on a multi-scale attention mechanism", Computer Engineering *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116486230A (en) * 2023-04-21 2023-07-25 哈尔滨工业大学(威海) Image detection method based on semi-recursion characteristic pyramid structure and storage medium
CN116486230B (en) * 2023-04-21 2024-02-02 哈尔滨工业大学(威海) Image detection method based on semi-recursion characteristic pyramid structure and storage medium

Also Published As

Publication number Publication date
CN115439470B (en) 2023-05-26

Similar Documents

Publication Publication Date Title
TWI728465B (en) Method, device and electronic apparatus for image processing and storage medium thereof
CN107665491B (en) Pathological image identification method and system
CN111369440B (en) Model training and image super-resolution processing method, device, terminal and storage medium
EP3923233A1 (en) Image denoising method and apparatus
CN114170167B (en) Polyp segmentation method and computer device based on attention-guided context correction
AU2021354030B2 (en) Processing images using self-attention based neural networks
CN113012140A (en) Digestive endoscopy video frame effective information region extraction method based on deep learning
CN110866938B (en) Full-automatic video moving object segmentation method
CN112700460A (en) Image segmentation method and system
CN113486890A (en) Text detection method based on attention feature fusion and cavity residual error feature enhancement
CN114004811A (en) Image segmentation method and system based on multi-scale residual error coding and decoding network
CN115439470A (en) Polyp image segmentation method, computer-readable storage medium, and computer device
CN112150470A (en) Image segmentation method, image segmentation device, image segmentation medium, and electronic device
CN112633260B (en) Video motion classification method and device, readable storage medium and equipment
CN114399510A (en) Skin lesion segmentation and classification method and system combining image and clinical metadata
CN113392791A (en) Skin prediction processing method, device, equipment and storage medium
CN117252890A (en) Carotid plaque segmentation method, device, equipment and medium
CN116542988A (en) Nodule segmentation method, nodule segmentation device, electronic equipment and storage medium
CN111369564B (en) Image processing method, model training method and model training device
CN114332574A (en) Image processing method, device, equipment and storage medium
CN112001479B (en) Processing method and system based on deep learning model and electronic equipment
CN114022458A (en) Skeleton detection method and device, electronic equipment and computer readable storage medium
CN111833991A (en) Auxiliary interpretation method and device based on artificial intelligence, terminal and storage medium
Patel et al. Deep Learning in Medical Image Super-Resolution: A Survey
CN115861604B (en) Cervical tissue image processing method, cervical tissue image processing device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant