CN114170167A - Polyp segmentation method and computer device based on attention-guided context correction - Google Patents

Polyp segmentation method and computer device based on attention-guided context correction

Info

Publication number
CN114170167A
CN114170167A
Authority
CN
China
Prior art keywords
feature
semantic information
polyp
information image
channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111434451.2A
Other languages
Chinese (zh)
Other versions
CN114170167B (en)
Inventor
施连焘
李正国
王玉峰
郭玉宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Polytechnic
Original Assignee
Shenzhen Polytechnic
University of Science and Technology Liaoning USTL
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Polytechnic and University of Science and Technology Liaoning
Priority to CN202111434451.2A
Publication of CN114170167A
Application granted
Publication of CN114170167B
Legal status: Active
Anticipated expiration

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10068 Endoscopic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30028 Colon; Small intestine
    • G06T 2207/30032 Colon polyp

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a polyp segmentation method based on attention-guided context correction, a computer-readable storage medium, and a computer device. The method comprises: inputting a polyp picture to be segmented into an enhanced context correction model for training and then downsampling, repeating the model training and downsampling several times to obtain a final semantic information image; inputting the final semantic information image into a progressive context fusion model for training and outputting a feature-fused semantic information image; upsampling the feature-fused semantic information image and performing enhanced context correction model training to obtain a feature map, then upsampling and training again, repeating several times to obtain a final feature map with the same channel size as the polyp picture to be segmented; and inputting the final feature map into a multi-level feature fusion model for training and outputting a polyp segmentation picture. The invention can thereby identify polyps more accurately.

Description

Polyp segmentation method and computer device based on attention-guided context correction
Technical Field
The present invention belongs to the medical field, and in particular, to a polyp segmentation method based on attention-guided context correction, a computer-readable storage medium, and a computer device.
Background
Colorectal cancer most often develops from polyps (raised masses in the gastrointestinal tract) that form in the intestine over a long period and for many reasons. If polyps can be detected and resected at an early stage, colorectal cancer can be prevented. The most effective method for screening and diagnosing colorectal cancer is colorectal endoscopy, which is currently the mainstream method for diagnosing the disease.
However, although current diagnostic methods are advanced and accurate, problems remain. According to professional research reports, roughly one in four polyps is missed during endoscopy, so that resection is incomplete and hidden dangers remain. In addition, polyps vary widely in shape, making fine judgments by the naked eye difficult, especially when a polyp differs little from the background of the gastrointestinal tract. Finally, rapid identification cannot be achieved by humans alone; it requires a great deal of time and effort and, under the current medical system, substantially increases the workload of gastroenterologists.
Disclosure of Invention
The present invention aims to provide a polyp segmentation method based on attention-guided context correction, a computer-readable storage medium, and a computer device, so as to solve the problem that polyps missed during endoscopy lead to incomplete resection and leave hidden dangers.
In a first aspect, the present invention provides a polyp segmentation method based on attention-guided context correction, comprising:
acquiring a polyp picture to be segmented;
inputting the polyp picture to be segmented into an enhanced context correction model for training to obtain a semantic information image, downsampling the semantic information image to obtain a downsampled semantic information image, inputting the downsampled semantic information image into the enhanced context correction model again for training and then downsampling, and repeating multiple times to obtain a final semantic information image; the enhanced context correction model divides the input polyp picture to be segmented along the channel dimension into two feature maps with an equal number of channels, passes one feature map through an attention mechanism to obtain a first feature map, extracts features from the other feature map with a depthwise separable convolution to obtain a second feature map, concatenates the first and second feature maps, and then fuses them through a residual connection to output a semantic information image;
inputting the final semantic information image into a progressive context fusion model for training, and outputting a feature-fused semantic information image; the progressive context fusion model extracts features from the final semantic information image through a dilated convolution and a conventional convolution, respectively, to obtain two feature maps, concatenates the two feature maps and inputs the result into a context-modeling channel attention mechanism to obtain channel weights, fuses the channel weights with the final semantic information image along the channel dimension, and outputs the feature-fused semantic information image;
upsampling the feature-fused semantic information image to obtain an upsampled semantic information image, performing enhanced context correction model training on the upsampled semantic information image to obtain a feature map, upsampling the feature map again and performing enhanced context correction model training, and repeating multiple times to obtain a final feature map with the same channel size as the polyp picture to be segmented;
inputting the final feature map into a multi-level feature fusion model for training, and outputting a polyp segmentation picture; the multi-level feature fusion model upsamples the final feature maps one by one to the same resolution, concatenates the upsampled feature maps and inputs them into a channel attention mechanism to obtain channel weights, and models the channel weights and the final feature map output by pixel-level multiplication to obtain the polyp segmentation picture.
Further, after the upsampling of the feature-fused semantic information image, the method further includes: adding a skip connection structure during the enhanced context correction model training, so that the representation information of the shallow coding layers complements the spatial fine-grained detail of the deep semantic information in the decoding layers.
Further, the inputting of the polyp picture to be segmented into the enhanced context correction model training specifically includes:
defining the polyp picture to be segmented X_in as X_in ∈ R^(C×H×W), extracting features from the polyp picture to be segmented by a 1×1 convolution, and outputting two feature maps X_1 and X_2 with an equal number of channels, X_1 ∈ R^((C/2)×H×W) and X_2 ∈ R^((C/2)×H×W);
passing the two feature maps X_1 and X_2 through the attention mechanism and the depthwise separable convolution, respectively, to obtain the first feature map X_att and the second feature map X_2' (the corresponding equations are rendered as images in the original publication);
concatenating the first feature map X_att and the second feature map X_2', and then obtaining the semantic information image X_out by residual connection and fusion (equation rendered as an image in the original publication);
wherein X̃_1 is the feature map obtained by sending X_1 through a 1×1 convolution, a batch normalization algorithm, and a ReLU nonlinear activation function; R denotes a three-dimensional array image, C is the number of channels, H is the height, and W is the width; σ and ⊕ denote the sigmoid activation function and pixel-level summation, respectively; ⊗ denotes pixel-level multiplication; Up is conventional bilinear-interpolation upsampling and Down is downsampling; Cat denotes concatenation along the channel dimension; X_out is the output feature map; and F_3×3 denotes a 3×3 convolution followed by batch normalization and a nonlinear activation function.
Further, the training of the progressive context fusion model specifically includes:
defining the final semantic information image X' as X' ∈ R^(C×H×W), extracting features by a 1×1 convolution, and then extracting two feature maps X_s and X_l by a conventional convolution and a dilated convolution, respectively (equations rendered as images in the original publication);
concatenating the two feature maps X_s and X_l to obtain the concatenated feature map X_cat:
X_cat = Cat(X_l, X_s);
inputting the concatenated feature map X_cat into the channel attention mechanism with global context modeling to obtain channel weights, fusing the channel weights with the final semantic information image along the channel dimension, and outputting the feature-fused semantic information image y_i (equation rendered as an image in the original publication);
wherein T_p = H·W denotes the number of positions in X_cat, j denotes the index of the summation, β_j denotes the global attention pooling weight for context modeling, P = S_3(ReLU(LN(S_2(·)))) denotes a bottleneck layer for capturing the dependencies between channels, ReLU denotes a nonlinear activation function, S_2 and S_3 model the information interaction between channel dimensions, and LN denotes LayerNorm normalization.
Further, the inputting of the final feature maps into the multi-level feature fusion model for training specifically includes:
defining the final feature maps L as L ∈ R^(C×H×W), upsampling the final feature maps one by one to the same resolution and concatenating them, and sending the concatenated feature map into a 1×1 convolution W_1 for feature extraction to obtain the extracted feature map G:
G = W_1(Cat(B(l_1, l_2, l_3, l_4), l_0));
inputting G into a channel attention mechanism to obtain channel weights, and modeling the channel weights and the final feature maps by pixel-level multiplication to obtain the polyp segmentation picture Y (equation rendered as an image in the original publication);
wherein L = (l_0, l_1, l_2, l_3, l_4), l_0-l_4 respectively denote the decoder-layer feature maps from large to small resolution, B(l_1, l_2, l_3, l_4), l_0 denotes that l_1-l_4 are upsampled to the same resolution as l_0 and then concatenated, ξ is a correlation coefficient related to G, g denotes global average pooling, and δ denotes an activation function.
In a second aspect, the present invention provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the polyp segmentation method based on attention-guided context correction according to the first aspect.
In a third aspect, the present invention provides a computer device, comprising: one or more processors, a memory, and one or more computer programs, the processors and the memory being connected by a bus, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the processor, when executing the computer programs, implements the steps of the polyp segmentation method based on attention-guided context correction according to the first aspect.
In the present invention, repeatedly training the enhanced context correction model and then downsampling yields deeper semantic information and effectively suppresses the interference of background noise; the progressive context fusion model training addresses the problem of large-scale polyps in the polyp recognition process; and the multi-level feature fusion model training effectively yields accurate segmentation results, improving the accuracy of polyp recognition.
Drawings
Fig. 1 is a flowchart of a polyp segmentation method based on attention-guided context correction according to an embodiment of the present invention.
Fig. 2 is a flowchart of another polyp segmentation method based on attention-guided context correction according to an embodiment of the present invention.
FIG. 3 is a flowchart of enhanced context correction model training according to an embodiment of the present invention.
FIG. 4 is a flowchart of progressive context fusion model training according to an embodiment of the present invention.
Fig. 5 is a flowchart of multi-level feature fusion model training according to an embodiment of the present invention.
Fig. 6 is a block diagram of a specific structure of a computer device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
Referring to fig. 1, a polyp segmentation method based on attention-guided context correction according to an embodiment of the present invention includes the following steps. It should be noted that the method is not limited to the flow sequence shown in fig. 1, as long as the results are substantially the same.
S1, obtaining a polyp picture to be segmented;
S2, inputting the polyp picture to be segmented into an enhanced context correction model for training to obtain a semantic information image, downsampling the semantic information image to obtain a downsampled semantic information image, inputting the downsampled semantic information image into the enhanced context correction model again for training and then downsampling, and repeating multiple times to obtain a final semantic information image; the enhanced context correction model divides the input polyp picture to be segmented along the channel dimension into two feature maps with an equal number of channels, passes one feature map through an attention mechanism to obtain a first feature map, extracts features from the other feature map with a depthwise separable convolution to obtain a second feature map, concatenates the first and second feature maps, and then fuses them through a residual connection to output a semantic information image;
S3, inputting the final semantic information image into a progressive context fusion model for training, and outputting a feature-fused semantic information image; the progressive context fusion model extracts features from the final semantic information image through a dilated convolution and a conventional convolution, respectively, to obtain two feature maps, concatenates the two feature maps and inputs the result into a context-modeling channel attention mechanism to obtain channel weights, fuses the channel weights with the final semantic information image along the channel dimension, and outputs the feature-fused semantic information image;
S4, upsampling the feature-fused semantic information image to obtain an upsampled semantic information image, performing enhanced context correction model training on the upsampled semantic information image to obtain a feature map, upsampling the feature map again and performing enhanced context correction model training, and repeating multiple times to obtain a final feature map with the same channel size as the polyp picture to be segmented;
S5, inputting the final feature map into a multi-level feature fusion model for training, and outputting a polyp segmentation picture; the multi-level feature fusion model upsamples the final feature maps one by one to the same resolution, concatenates the upsampled feature maps and inputs them into a channel attention mechanism to obtain channel weights, and models the channel weights and the final feature map output by pixel-level multiplication to obtain the polyp segmentation picture.
Fig. 2 is a flowchart illustrating a polyp segmentation method based on attention-guided context correction according to an embodiment of the present invention, where Input is the input picture to be segmented, ECC denotes the enhanced context correction model, Down ×2 denotes a downsampling operation (with a convolution kernel size of 2), PCF denotes the progressive context fusion model, Up ×2 denotes an upsampling operation (with a convolution kernel size of 2), MPA denotes the multi-level feature fusion model, Output is the output segmentation result, and skip connection denotes the skip (residual) connection.
In an embodiment of the present invention, after the upsampling of the feature-fused semantic information image, the method further includes: adding a skip connection structure during the enhanced context correction model training, so that the representation information of the shallow coding layers complements the spatial fine-grained detail of the deep semantic information in the decoding layers.
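To make the data flow of fig. 2 concrete, the following is a minimal PyTorch sketch of the encoder-decoder pipeline, assuming four encoder/decoder stages and using plain 3×3 convolution blocks as stand-ins for the internals of the ECC, PCF, and MPA modules (detailed further below). The class and parameter names are illustrative and do not come from the patent.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvBNReLU(nn.Sequential):
    """3x3 convolution + batch normalization + ReLU, used as a stand-in block."""
    def __init__(self, cin, cout):
        super().__init__(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.BatchNorm2d(cout),
                         nn.ReLU(inplace=True))

class PolypSegNet(nn.Module):
    """Fig. 2 data flow: [ECC -> Down x2] repeated, PCF bottleneck,
    [Up x2 -> skip concat -> ECC] repeated, then a 1x1 head standing in
    for the multi-level feature fusion (MPA) module."""
    def __init__(self, cin=3, width=32, depth=4):
        super().__init__()
        chs = [width * 2 ** i for i in range(depth + 1)]  # e.g. 32, 64, 128, 256, 512
        self.stem = ConvBNReLU(cin, chs[0])               # stand-in for the first ECC stage
        self.enc = nn.ModuleList(ConvBNReLU(chs[i], chs[i + 1]) for i in range(depth))
        self.pcf = ConvBNReLU(chs[-1], chs[-1])           # stand-in for the PCF bottleneck
        self.dec = nn.ModuleList(ConvBNReLU(chs[i + 1] + chs[i], chs[i])
                                 for i in reversed(range(depth)))
        self.head = nn.Conv2d(chs[0], 1, 1)               # stand-in for the MPA fusion head

    def forward(self, x):
        x = self.stem(x)
        skips = []
        for enc in self.enc:
            skips.append(x)                    # keep the pre-downsampling map for the skip link
            x = enc(F.max_pool2d(x, 2))        # Down x2 stage followed by the next ECC stage
        x = self.pcf(x)
        for dec, skip in zip(self.dec, reversed(skips)):
            x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
            x = dec(torch.cat([x, skip], dim=1))   # skip connection + decoder ECC stage
        return torch.sigmoid(self.head(x))         # per-pixel polyp probability

if __name__ == "__main__":
    net = PolypSegNet()
    print(net(torch.randn(1, 3, 256, 256)).shape)  # torch.Size([1, 1, 256, 256])

Max pooling stands in here for the patent's stride-2 downsampling convolution; substituting a strided convolution leaves the data flow unchanged.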
In an embodiment of the present invention, the inputting of the polyp picture to be segmented into the enhanced context correction model training specifically includes:
defining the polyp picture to be segmented X_in as X_in ∈ R^(C×H×W), extracting features from the polyp picture to be segmented by a 1×1 convolution, and outputting two feature maps X_1 and X_2 with an equal number of channels, X_1 ∈ R^((C/2)×H×W) and X_2 ∈ R^((C/2)×H×W);
passing the two feature maps X_1 and X_2 through the attention mechanism and the depthwise separable convolution, respectively, to obtain the first feature map X_att and the second feature map X_2' (the corresponding equations are rendered as images in the original publication);
concatenating the first feature map X_att and the second feature map X_2', and then obtaining the semantic information image X_out by residual connection and fusion (equation rendered as an image in the original publication);
wherein X̃_1 is the feature map obtained by sending X_1 through a 1×1 convolution, a batch normalization algorithm, and a ReLU nonlinear activation function; R denotes a three-dimensional array image, C is the number of channels, H is the height, and W is the width; σ and ⊕ denote the sigmoid activation function and pixel-level summation, respectively; ⊗ denotes pixel-level multiplication; Up is conventional bilinear-interpolation upsampling and Down is downsampling; Cat denotes concatenation along the channel dimension; X_out is the output feature map; and F_3×3 denotes a 3×3 convolution followed by batch normalization and a nonlinear activation function.
Fig. 3 is a flowchart of the enhanced context correction model training, in which ⊕ denotes pixel-level summation, ⊗ denotes the pixel-level multiplication operation, σ denotes the sigmoid activation function, Down denotes downsampling, Cat denotes concatenation along the channel dimension, and Up denotes conventional bilinear-interpolation upsampling.
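As a concrete reading of the ECC module, the following is a minimal PyTorch sketch. The channel split, the depthwise separable convolution branch, the concatenation, and the residual fusion follow the description above; because the exact attention formula is rendered only as an image in the original publication, the gate below (a sigmoid of an upsampled 3×3 convolution of the downsampled X_1, multiplied and then summed with X̃_1) is an assumed plausible form built from the listed symbols σ, Up, Down, F_3×3, ⊗, and ⊕.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ECC(nn.Module):
    """Enhanced context correction block (sketch). The input is split along
    the channel dimension into two halves: one half goes through an attention
    branch, the other through a depthwise separable convolution; the results
    are concatenated and fused with a residual connection."""
    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.split = nn.Conv2d(channels, channels, 1)          # 1x1 feature extraction before the split
        self.pre = nn.Sequential(nn.Conv2d(half, half, 1),     # X1~: 1x1 conv + BN + ReLU
                                 nn.BatchNorm2d(half), nn.ReLU(inplace=True))
        self.f3x3 = nn.Sequential(nn.Conv2d(half, half, 3, padding=1),  # F_3x3
                                  nn.BatchNorm2d(half), nn.ReLU(inplace=True))
        self.dsc = nn.Sequential(                              # depthwise separable convolution branch
            nn.Conv2d(half, half, 3, padding=1, groups=half),  # depthwise
            nn.Conv2d(half, half, 1),                          # pointwise
            nn.BatchNorm2d(half), nn.ReLU(inplace=True))
        self.fuse = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                                  nn.BatchNorm2d(channels), nn.ReLU(inplace=True))

    def forward(self, x):
        x1, x2 = torch.chunk(self.split(x), 2, dim=1)          # two maps with equal channel count
        x1t = self.pre(x1)
        # Assumed attention form (the patent's exact formula is only an image):
        # a sigmoid gate computed at half resolution and upsampled back.
        gate = torch.sigmoid(F.interpolate(self.f3x3(F.avg_pool2d(x1, 2)),
                                           size=x1.shape[-2:], mode="bilinear",
                                           align_corners=False))
        x_att = x1t * gate + x1t                               # pixel-level multiply, then add
        x2p = self.dsc(x2)                                     # second feature map X_2'
        return self.fuse(torch.cat([x_att, x2p], dim=1)) + x   # concat, 3x3 fuse, residual link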
In an embodiment of the present invention, referring to fig. 4, the training of the progressive context fusion model specifically includes:
defining the final semantic information image X' as X' ∈ R^(C×H×W), extracting features by a 1×1 convolution, and then extracting two feature maps X_s and X_l by a conventional convolution and a dilated convolution, respectively (equations rendered as images in the original publication);
concatenating the two feature maps X_s and X_l to obtain the concatenated feature map X_cat:
X_cat = Cat(X_l, X_s);
inputting the concatenated feature map X_cat into the channel attention mechanism with global context modeling to obtain channel weights, fusing the channel weights with the final semantic information image along the channel dimension, and outputting the feature-fused semantic information image y_i (equation rendered as an image in the original publication);
wherein T_p = H·W denotes the number of positions in X_cat, j denotes the index of the summation, β_j denotes the global attention pooling weight for context modeling, P = S_3(ReLU(LN(S_2(·)))) denotes a bottleneck layer for capturing the dependencies between channels, ReLU denotes a nonlinear activation function, S_2 and S_3 model the information interaction between channel dimensions, and LN denotes LayerNorm normalization.
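A minimal PyTorch sketch of the PCF module follows. The parallel conventional and dilated convolutions and the concatenation X_cat = Cat(X_l, X_s) follow the text; because the y_i equation is rendered as an image, the channel attention below is written in the GCNet-style global-context form suggested by the listed symbols (β_j pooling weights over the T_p = H·W positions and the S_3(ReLU(LN(S_2(·)))) bottleneck). The dilation rate and reduction ratio are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PCF(nn.Module):
    """Progressive context fusion block (sketch). Parallel conventional and
    dilated convolutions capture small- and large-scale context; their
    concatenation feeds a global-context channel attention whose output is
    fused back into the input features."""
    def __init__(self, channels, dilation=3, reduction=4):
        super().__init__()
        half = channels // 2
        self.proj = nn.Conv2d(channels, channels, 1)                 # initial 1x1 feature extraction
        self.conv_s = nn.Sequential(nn.Conv2d(channels, half, 3, padding=1),
                                    nn.BatchNorm2d(half), nn.ReLU(inplace=True))
        self.conv_l = nn.Sequential(nn.Conv2d(channels, half, 3, padding=dilation,
                                              dilation=dilation),   # dilated branch
                                    nn.BatchNorm2d(half), nn.ReLU(inplace=True))
        self.attn = nn.Conv2d(channels, 1, 1)                       # beta_j: global attention pooling
        self.bottleneck = nn.Sequential(                            # P = S3(ReLU(LN(S2(.))))
            nn.Conv2d(channels, channels // reduction, 1),          # S2
            nn.LayerNorm([channels // reduction, 1, 1]),            # LN
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1))          # S3

    def forward(self, x):
        xp = self.proj(x)
        x_cat = torch.cat([self.conv_l(xp), self.conv_s(xp)], dim=1)   # Xcat = Cat(Xl, Xs)
        b, c, h, w = x_cat.shape
        beta = F.softmax(self.attn(x_cat).view(b, 1, h * w), dim=-1)   # weights over Tp = H*W positions
        ctx = torch.bmm(x_cat.view(b, c, h * w), beta.transpose(1, 2)) # (b, c, 1) global context
        ctx = self.bottleneck(ctx.view(b, c, 1, 1))                    # per-channel weights
        return x + ctx                                                 # fuse back into the input

Fusing the channel-wise context back by broadcast addition mirrors the GCNet fusion; the patent's exact fusion along the channel dimension may differ in detail.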
In an embodiment of the present invention, the inputting of the final feature maps into the multi-level feature fusion model for training specifically includes:
defining the final feature maps L as L ∈ R^(C×H×W), upsampling the final feature maps one by one to the same resolution and concatenating them, and sending the concatenated feature map into a 1×1 convolution W_1 for feature extraction to obtain the extracted feature map G:
G = W_1(Cat(B(l_1, l_2, l_3, l_4), l_0));
inputting G into a channel attention mechanism to obtain channel weights, and modeling the channel weights and the final feature maps by pixel-level multiplication to obtain the polyp segmentation picture Y (equation rendered as an image in the original publication);
wherein L = (l_0, l_1, l_2, l_3, l_4), l_0-l_4 respectively denote the decoder-layer feature maps from large to small resolution, B(l_1, l_2, l_3, l_4), l_0 denotes that l_1-l_4 are upsampled to the same resolution as l_0 and then concatenated, ξ is a correlation coefficient related to G, g denotes global average pooling, and δ denotes an activation function.
Fig. 5 is a flowchart of the multi-level feature fusion model training, in which Up denotes bilinear-interpolation upsampling and ⊗ denotes the pixel-level multiplication operation.
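To make the fusion head concrete, the following is a minimal PyTorch sketch. Upsampling l_1-l_4 to the resolution of l_0, the concatenation, and the 1×1 convolution W_1 follow the text; because the Y equation is rendered as an image, the channel attention is written in a standard squeeze-and-excitation form (global average pooling g followed by activations, the assumed reading of ξ and δ). The channel counts in the demo are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MFA(nn.Module):
    """Multi-level feature fusion head (sketch). Decoder maps l1..l4 are
    upsampled to the resolution of l0, concatenated with l0, and reduced by
    a 1x1 convolution W1 to G; a channel attention reweights G, and a final
    1x1 convolution produces the segmentation map."""
    def __init__(self, in_channels, mid_channels=64, reduction=4):
        super().__init__()
        self.w1 = nn.Conv2d(sum(in_channels), mid_channels, 1)      # W1
        self.fc = nn.Sequential(                                    # channel attention weights
            nn.Conv2d(mid_channels, mid_channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels // reduction, mid_channels, 1),
            nn.Sigmoid())
        self.out = nn.Conv2d(mid_channels, 1, 1)

    def forward(self, feats):
        l0, rest = feats[0], feats[1:]
        size = l0.shape[-2:]
        ups = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
               for f in rest]                                       # B(l1..l4): upsample to l0's size
        g = self.w1(torch.cat([*ups, l0], dim=1))                   # G = W1(Cat(B(l1..l4), l0))
        w = self.fc(F.adaptive_avg_pool2d(g, 1))                    # g: global average pooling
        return torch.sigmoid(self.out(g * w))                       # pixel-level reweighting -> mask Y

if __name__ == "__main__":
    feats = [torch.randn(1, c, 256 // 2 ** i, 256 // 2 ** i)
             for i, c in enumerate([32, 64, 128, 256, 512])]        # hypothetical decoder maps l0..l4
    print(MFA([32, 64, 128, 256, 512])(feats).shape)                # torch.Size([1, 1, 256, 256])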
An embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the polyp segmentation method based on attention-guided context correction as provided by the embodiments of the present invention.
Fig. 6 shows a specific block diagram of a computer device according to an embodiment of the present invention. The computer device 100 includes: one or more processors 101, a memory 102, and one or more computer programs, wherein the processors 101 and the memory 102 are connected by a bus, the one or more computer programs are stored in the memory 102 and configured to be executed by the one or more processors 101, and the processor 101, when executing the computer programs, implements the steps of the polyp segmentation method based on attention-guided context correction as provided by the embodiments of the present invention.
The computer device includes servers, terminals, and the like. The computer device may be a desktop computer, a mobile terminal, or a vehicle-mounted device, where the mobile terminal includes at least one of a mobile phone, a tablet computer, a personal digital assistant, or a wearable device.
In the embodiments of the present invention, repeatedly training the enhanced context correction model and then downsampling yields deeper semantic information and effectively suppresses the interference of background noise; the progressive context fusion model training addresses the problem of large-scale polyps in the polyp recognition process; and the multi-level feature fusion model training effectively yields accurate segmentation results, improving the accuracy of polyp recognition.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (7)

1. A polyp segmentation method based on attention-guided context correction, comprising:
acquiring a polyp picture to be segmented;
inputting the polyp picture to be segmented into an enhanced context correction model for training to obtain a semantic information image, downsampling the semantic information image to obtain a downsampled semantic information image, inputting the downsampled semantic information image into the enhanced context correction model again for training and then downsampling, and repeating multiple times to obtain a final semantic information image; the enhanced context correction model divides the input polyp picture to be segmented along the channel dimension into two feature maps with an equal number of channels, passes one feature map through an attention mechanism to obtain a first feature map, extracts features from the other feature map with a depthwise separable convolution to obtain a second feature map, concatenates the first and second feature maps, and then fuses them through a residual connection to output a semantic information image;
inputting the final semantic information image into a progressive context fusion model for training, and outputting a feature-fused semantic information image; the progressive context fusion model extracts features from the final semantic information image through a dilated convolution and a conventional convolution, respectively, to obtain two feature maps, concatenates the two feature maps and inputs the result into a context-modeling channel attention mechanism to obtain channel weights, fuses the channel weights with the final semantic information image along the channel dimension, and outputs the feature-fused semantic information image;
upsampling the feature-fused semantic information image to obtain an upsampled semantic information image, performing enhanced context correction model training on the upsampled semantic information image to obtain a feature map, upsampling the feature map again and performing enhanced context correction model training, and repeating multiple times to obtain a final feature map with the same channel size as the polyp picture to be segmented;
inputting the final feature map into a multi-level feature fusion model for training, and outputting a polyp segmentation picture; the multi-level feature fusion model upsamples the final feature maps one by one to the same resolution, concatenates the upsampled feature maps and inputs them into a channel attention mechanism to obtain channel weights, and models the channel weights and the final feature map output by pixel-level multiplication to obtain the polyp segmentation picture.
2. The polyp segmentation method according to claim 1, wherein after the upsampling of the feature-fused semantic information image, the method further comprises: adding a skip connection structure during the enhanced context correction model training, so that the representation information of the shallow coding layers complements the spatial fine-grained detail of the deep semantic information in the decoding layers.
3. The polyp segmentation method according to claim 1, wherein the inputting of the polyp picture to be segmented into the enhanced context correction model training specifically comprises:
defining the polyp picture to be segmented X_in as X_in ∈ R^(C×H×W), extracting features from the polyp picture to be segmented by a 1×1 convolution, and outputting two feature maps X_1 and X_2 with an equal number of channels, X_1 ∈ R^((C/2)×H×W) and X_2 ∈ R^((C/2)×H×W);
passing the two feature maps X_1 and X_2 through the attention mechanism and the depthwise separable convolution, respectively, to obtain the first feature map X_att and the second feature map X_2' (the corresponding equations are rendered as images in the original publication);
concatenating the first feature map X_att and the second feature map X_2', and then obtaining the semantic information image X_out by residual connection and fusion (equation rendered as an image in the original publication);
wherein X̃_1 is the feature map obtained by sending X_1 through a 1×1 convolution, a batch normalization algorithm, and a ReLU nonlinear activation function; R denotes a three-dimensional array image, C is the number of channels, H is the height, and W is the width; σ and ⊕ denote the sigmoid activation function and pixel-level summation, respectively; ⊗ denotes pixel-level multiplication; Up is conventional bilinear-interpolation upsampling and Down is downsampling; Cat denotes concatenation along the channel dimension; X_out is the output feature map; and F_3×3 denotes a 3×3 convolution followed by batch normalization and a nonlinear activation function.
4. The polyp segmentation method according to claim 1, wherein the training of the progressive context fusion model specifically comprises:
defining the final semantic information image X' as X' ∈ R^(C×H×W), extracting features by a 1×1 convolution, and then extracting two feature maps X_s and X_l by a conventional convolution and a dilated convolution, respectively (equations rendered as images in the original publication);
concatenating the two feature maps X_s and X_l to obtain the concatenated feature map X_cat:
X_cat = Cat(X_l, X_s);
inputting the concatenated feature map X_cat into the channel attention mechanism with global context modeling to obtain channel weights, fusing the channel weights with the final semantic information image along the channel dimension, and outputting the feature-fused semantic information image y_i (equation rendered as an image in the original publication);
wherein T_p = H·W denotes the number of positions in X_cat, j denotes the index of the summation, β_j denotes the global attention pooling weight for context modeling, P = S_3(ReLU(LN(S_2(·)))) denotes a bottleneck layer for capturing the dependencies between channels, ReLU denotes a nonlinear activation function, S_2 and S_3 model the information interaction between channel dimensions, and LN denotes LayerNorm normalization.
5. The polyp segmentation method according to claim 1, wherein the inputting of the final feature maps into the multi-level feature fusion model for training specifically comprises:
defining the final feature maps L as L ∈ R^(C×H×W), upsampling the final feature maps one by one to the same resolution and concatenating them, and sending the concatenated feature map into a 1×1 convolution W_1 for feature extraction to obtain the extracted feature map G:
G = W_1(Cat(B(l_1, l_2, l_3, l_4), l_0));
inputting G into a channel attention mechanism to obtain channel weights, and modeling the channel weights and the final feature maps by pixel-level multiplication to obtain the polyp segmentation picture Y (equation rendered as an image in the original publication);
wherein L = (l_0, l_1, l_2, l_3, l_4), l_0-l_4 respectively denote the decoder-layer feature maps from large to small resolution, B(l_1, l_2, l_3, l_4), l_0 denotes that l_1-l_4 are upsampled to the same resolution as l_0 and then concatenated, ξ is a correlation coefficient related to G, g denotes global average pooling, and δ denotes an activation function.
6. A computer-readable storage medium, in which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the polyp segmentation method based on attention-guided context correction according to any one of claims 1 to 5.
7. A computer device, comprising: one or more processors, a memory, and one or more computer programs, the processors and the memory being connected by a bus, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the steps of the polyp segmentation method based on attention-guided context correction according to any one of claims 1 to 5 are implemented when the computer programs are executed by the processors.
CN202111434451.2A 2021-11-29 2021-11-29 Polyp segmentation method and computer device based on attention-guided context correction Active CN114170167B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111434451.2A CN114170167B (en) 2021-11-29 2021-11-29 Polyp segmentation method and computer device based on attention-guided context correction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111434451.2A CN114170167B (en) 2021-11-29 2021-11-29 Polyp segmentation method and computer device based on attention-guided context correction

Publications (2)

Publication Number Publication Date
CN114170167A 2022-03-11
CN114170167B 2022-11-18

Family

ID=80481501

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111434451.2A Active CN114170167B (en) 2021-11-29 2021-11-29 Polyp segmentation method and computer device based on attention-guided context correction

Country Status (1)

Country Link
CN (1) CN114170167B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114742848A (en) * 2022-05-20 2022-07-12 深圳大学 Method, device, equipment and medium for segmenting polyp image based on residual double attention
CN114913325A (en) * 2022-03-24 2022-08-16 北京百度网讯科技有限公司 Semantic segmentation method, device and computer program product
CN115439470A (en) * 2022-10-14 2022-12-06 深圳职业技术学院 Polyp image segmentation method, computer-readable storage medium, and computer device
CN115578341A (en) * 2022-09-30 2023-01-06 深圳大学 Large intestine polypus segmentation method based on attention-guided pyramid context network

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110197182A (en) * 2019-06-11 2019-09-03 中国电子科技集团公司第五十四研究所 Remote sensing image semantic segmentation method based on contextual information and attention mechanism
CN111462126A (en) * 2020-04-08 2020-07-28 武汉大学 Semantic image segmentation method and system based on edge enhancement
WO2021031066A1 (en) * 2019-08-19 2021-02-25 中国科学院深圳先进技术研究院 Cartilage image segmentation method and apparatus, readable storage medium, and terminal device
CN112541503A (en) * 2020-12-11 2021-03-23 南京邮电大学 Real-time semantic segmentation method based on context attention mechanism and information fusion
CN112927255A (en) * 2021-02-22 2021-06-08 武汉科技大学 Three-dimensional liver image semantic segmentation method based on context attention strategy
CN113298818A (en) * 2021-07-09 2021-08-24 大连大学 Remote sensing image building segmentation method based on attention mechanism and multi-scale features
CN113486890A (en) * 2021-06-16 2021-10-08 湖北工业大学 Text detection method based on attention feature fusion and cavity residual error feature enhancement
CN113538313A (en) * 2021-07-22 2021-10-22 深圳大学 Polyp segmentation method and device, computer equipment and storage medium

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110197182A (en) * 2019-06-11 2019-09-03 中国电子科技集团公司第五十四研究所 Remote sensing image semantic segmentation method based on contextual information and attention mechanism
WO2021031066A1 (en) * 2019-08-19 2021-02-25 中国科学院深圳先进技术研究院 Cartilage image segmentation method and apparatus, readable storage medium, and terminal device
CN111462126A (en) * 2020-04-08 2020-07-28 武汉大学 Semantic image segmentation method and system based on edge enhancement
CN112541503A (en) * 2020-12-11 2021-03-23 南京邮电大学 Real-time semantic segmentation method based on context attention mechanism and information fusion
CN112927255A (en) * 2021-02-22 2021-06-08 武汉科技大学 Three-dimensional liver image semantic segmentation method based on context attention strategy
CN113486890A (en) * 2021-06-16 2021-10-08 湖北工业大学 Text detection method based on attention feature fusion and cavity residual error feature enhancement
CN113298818A (en) * 2021-07-09 2021-08-24 大连大学 Remote sensing image building segmentation method based on attention mechanism and multi-scale features
CN113538313A (en) * 2021-07-22 2021-10-22 深圳大学 Polyp segmentation method and device, computer equipment and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
QUAN TANG 等: "Attention-guided Chained Context Aggregation for Semantic Segmentation", 《ARXIV》 *
文凯等: "基于多级上下文引导的实时语义分割网络", 《计算机应用研究》 *
肖建桥: "基于深度学习的道路场景语义分割", 《中国硕士学位论文全文数据库》 *
胡文俊等: "基于上下文的多路径空间编码图像语义分割方法", 《工业控制计算机》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114913325A (en) * 2022-03-24 2022-08-16 北京百度网讯科技有限公司 Semantic segmentation method, device and computer program product
CN114913325B (en) * 2022-03-24 2024-05-10 北京百度网讯科技有限公司 Semantic segmentation method, semantic segmentation device and computer program product
CN114742848A (en) * 2022-05-20 2022-07-12 深圳大学 Method, device, equipment and medium for segmenting polyp image based on residual double attention
CN115578341A (en) * 2022-09-30 2023-01-06 深圳大学 Large intestine polypus segmentation method based on attention-guided pyramid context network
CN115578341B (en) * 2022-09-30 2023-05-12 深圳大学 Method for segmenting large intestine polyps based on attention-directed pyramid context network
CN115439470A (en) * 2022-10-14 2022-12-06 深圳职业技术学院 Polyp image segmentation method, computer-readable storage medium, and computer device
CN115439470B (en) * 2022-10-14 2023-05-26 深圳职业技术学院 Polyp image segmentation method, computer readable storage medium and computer device

Also Published As

Publication number Publication date
CN114170167B (en) 2022-11-18

Similar Documents

Publication Publication Date Title
CN114170167B (en) Polyp segmentation method and computer device based on attention-guided context correction
CN113240580B (en) Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation
US20200380695A1 (en) Methods, systems, and media for segmenting images
CN111898701B (en) Model training, frame image generation and frame insertion methods, devices, equipment and media
CN110211045B (en) Super-resolution face image reconstruction method based on SRGAN network
CN109493350B (en) Portrait segmentation method and device
CN111369440B (en) Model training and image super-resolution processing method, device, terminal and storage medium
CN112132959B (en) Digital rock core image processing method and device, computer equipment and storage medium
CN112465828A (en) Image semantic segmentation method and device, electronic equipment and storage medium
CN111476719B (en) Image processing method, device, computer equipment and storage medium
KR20200084434A (en) Machine Learning Method for Restoring Super-Resolution Image
CN115439470B (en) Polyp image segmentation method, computer readable storage medium and computer device
KR101977067B1 (en) Method for reconstructing diagnosis map by deep neural network-based feature extraction and apparatus using the same
CN106127689A (en) Image/video super-resolution method and device
CN116091313A (en) Image super-resolution network model and reconstruction method
CN112700460A (en) Image segmentation method and system
WO2024041235A1 (en) Image processing method and apparatus, device, storage medium and program product
CN113838047A (en) Large intestine polyp segmentation method and system based on endoscope image and related components
CN116757930A (en) Remote sensing image super-resolution method, system and medium based on residual separation attention mechanism
CN115358952A (en) Image enhancement method, system, equipment and storage medium based on meta-learning
CN112633260B (en) Video motion classification method and device, readable storage medium and equipment
CN116681888A (en) Intelligent image segmentation method and system
CN115293966A (en) Face image reconstruction method and device and storage medium
CN114418987A (en) Retinal vessel segmentation method and system based on multi-stage feature fusion
CN117522896A (en) Self-attention-based image segmentation method and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220429

Address after: 518000 Guangdong city of Shenzhen province Nanshan District Xili Lake

Applicant after: SHENZHEN POLYTECHNIC

Address before: 518000 Guangdong city of Shenzhen province Nanshan District Xili Lake

Applicant before: SHENZHEN POLYTECHNIC

Applicant before: University of Science and Technology Liaoning

GR01 Patent grant