CN115424023B - Self-attention method for enhancing small target segmentation performance - Google Patents

Self-attention method for enhancing small target segmentation performance

Info

Publication number
CN115424023B
Authority
CN
China
Prior art keywords
phase
characteristic
channel
characteristic diagram
expression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211381902.5A
Other languages
Chinese (zh)
Other versions
CN115424023A (en)
Inventor
王博
赵威
申建虎
张伟
徐正清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Precision Diagnosis Medical Technology Co ltd
Original Assignee
Beijing Precision Diagnosis Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Precision Diagnosis Medical Technology Co ltd filed Critical Beijing Precision Diagnosis Medical Technology Co ltd
Priority to CN202211381902.5A priority Critical patent/CN115424023B/en
Publication of CN115424023A publication Critical patent/CN115424023A/en
Application granted granted Critical
Publication of CN115424023B publication Critical patent/CN115424023B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a self-attention mechanism module for enhancing small target segmentation performance, which comprises the following steps: the multi-phase feature map X is divided into C branches and input into the self-attention mechanism module, each branch having the same structure; the feature map input to each branch is named the i-th phase feature map; each i-th phase feature map passes through a single-phase attention-enhancing unit to obtain the i-th phase feature expression; and the phase-1 through phase-C feature expressions are concatenated to obtain the final feature expression, whose size is H×W×D×C. The channels containing small target feature information can thus be weighted, enhancing the ability to segment small targets.

Description

Self-attention method for enhancing small target segmentation performance
Technical Field
The invention belongs to the technical field of applying deep learning to medical image classification, and relates to a self-attention method for enhancing small target segmentation performance.
Background
An Attention Mechanism helps a model assign different weights to each part of its input and extract more critical, important information, enabling the model to make more accurate judgments without adding significant computational or memory overhead. Attention mechanisms are simple yet give a model stronger discrimination capability, and current deep learning network architectures generally incorporate them.
An attention mechanism helps improve the feature expression of a small target segmentation network, namely attending to essential features while suppressing unnecessary ones, and integrating an attention mechanism into convolution blocks can effectively improve performance on computer vision tasks such as image classification, target segmentation and instance segmentation. In detection algorithms based on a spatial attention mechanism, however, because the Ratio of the small object Bounding box area to the Image area (RBI) is only between 0.08% and 0.58%, edge features are blurred or even lost, resolution and available feature information are limited, and successive multi-layer downsampling convolutions lose small target feature information, so the performance of target segmentation algorithms based on a spatial attention mechanism is limited.
In view of these defects in the prior art, and in order to capture the position of a tiny object and perceive its global spatial structure without increasing computational complexity, the invention provides a self-attention method for enhancing small target segmentation performance.
Disclosure of Invention
The invention aims to provide a self-attention method that enhances small target segmentation performance.
The technical scheme adopted by the invention is as follows:
a self-attention method for enhancing small target segmentation performance, in which the self-attention mechanism module comprises the following steps:
inputting a multi-phase feature map X into the self-attention mechanism module, where X ∈ R^(C×H×W×D) and C, H, W and D respectively denote the number of phase channels, the spatial height, the spatial width and the spatial depth of the multi-phase map; on input, the multi-phase feature map X is divided into C branches, named in sequence the phase-1 branch, the phase-2 branch, and so on, the last branch being the phase-C branch; each branch has the same structure, and the feature map input to each branch is named the i-th phase feature map X_i ∈ R^(H×W×D), with 2 ≤ i ≤ C; each i-th phase feature map passes through a single-phase attention-enhancing unit to obtain the i-th phase feature expression, each of size H×W×D; the phase-1 through phase-C feature expressions are spliced together by concat to obtain the final feature expression, whose size is H×W×D×C.
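To make the branch-and-concatenate structure concrete, the following is a minimal PyTorch sketch of the overall module. The class names (MultiPhaseSelfAttention, SinglePhaseAttentionUnit), the batch dimension, and the identity placeholder used for the per-phase unit are assumptions for illustration only; a fuller sketch of the single-phase unit follows the step descriptions below.

```python
import torch
import torch.nn as nn


class SinglePhaseAttentionUnit(nn.Module):
    """Placeholder for the single-phase attention-enhancing unit (steps A-E);
    a fuller sketch appears after the step descriptions below."""

    def forward(self, x_i: torch.Tensor) -> torch.Tensor:
        return x_i


class MultiPhaseSelfAttention(nn.Module):
    """Split the multi-phase feature map X (B, C, H, W, D) into C phase branches,
    pass each branch through its own single-phase attention-enhancing unit, and
    concatenate the C per-phase feature expressions back into H x W x D x C."""

    def __init__(self, num_phases: int):
        super().__init__()
        self.units = nn.ModuleList([SinglePhaseAttentionUnit() for _ in range(num_phases)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W, D), C = number of phases, D = spatial depth
        outputs = [unit(x[:, i]) for i, unit in enumerate(self.units)]   # each (B, H, W, D)
        return torch.stack(outputs, dim=-1)                              # (B, H, W, D, C)


# usage sketch
x = torch.randn(2, 3, 32, 32, 16)          # B=2, C=3 phases, H=W=32, D=16
module = MultiPhaseSelfAttention(num_phases=3)
print(module(x).shape)                     # torch.Size([2, 32, 32, 16, 3])
```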
Further, the single-phase attention-enhancing unit comprises the following steps:
A. The i-th phase feature map X_i ∈ R^(H×W×D) first undergoes a convolution block operation; the result is divided into D branches along the channel dimension D of the phase, and each branch is sequentially input into 2 standard convolution layers with convolution kernel size k = 1 to obtain the i-th phase standard convolution feature map;
B. The i-th phase standard convolution feature map is input into a global average pooling (GAP) module to obtain the compressed feature map x of phase i, where x ∈ R^(1×D) and x = [x_1, x_2, ..., x_D], each feature x_i gradually capturing a specific feature response during training;
C. For the compressed feature maps x_1 ~ x_{D-1} of the first D-1 branches, the weights are normalized by a dot product operation to obtain the channel autocorrelation weight matrix X^T; this process can be expressed as follows:
the normalized weights use a sigmoid activation function, and the process can be expressed as equation (2):
a, b, ..., D-1 = σ(W_2 δ(W_1 {x_1, x_2, ..., x_{D-1}}))    (2)
where a, b, ..., D-1 respectively correspond to x_1, x_2, ..., x_{D-1}, σ is the sigmoid activation function, δ denotes the ReLU activation function, and W_1 and W_2 are two one-dimensional convolution layers;
the formula defining the channel autocorrelation weight matrix X^T is given as image GDA0004021122370000031 in the original publication;
D. The channel autocorrelation weight matrix X^T and x_D are dot-multiplied to obtain the channel adaptive weight X_s by the following equation:
X_s = X^T · x_D
E. In order to capture the long-distance dependence of the channels and obtain an effective small target semantic feature representation, the channel adaptive weight X_s is multiplied with the given phase-i feature map X_i ∈ R^(H×W×D):
X_D = X_s × X_i
where X_D, of size H×W×D, is the feature expression of phase i.
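As a concrete reading of steps A to E, here is a minimal PyTorch sketch of the single-phase attention-enhancing unit. It assumes the phase feature map is laid out as (B, D, H, W) with the spatial depth D acting as channels, models the "convolution block" as a 3x3 convolution, models the D per-branch pairs of k = 1 convolutions as grouped 1x1 convolutions, and, since the formula image for X^T is not reproduced, approximates steps C and D with a squeeze-and-excitation-style gate in which the D-th channel is passed through unchanged. All of these choices are assumptions, not the patent's exact formulation.

```python
import torch
import torch.nn as nn


class SinglePhaseAttentionUnit(nn.Module):
    """One possible reading of steps A-E for a single phase (input: B x D x H x W)."""

    def __init__(self, depth_d: int):
        super().__init__()
        # step A: "convolution block", then 2 standard k=1 convolutions per depth branch
        self.conv_block = nn.Sequential(nn.Conv2d(depth_d, depth_d, kernel_size=3, padding=1),
                                        nn.ReLU(inplace=True))
        self.pointwise = nn.Sequential(nn.Conv2d(depth_d, depth_d, kernel_size=1, groups=depth_d),
                                       nn.Conv2d(depth_d, depth_d, kernel_size=1, groups=depth_d))
        self.gap = nn.AdaptiveAvgPool2d(1)                    # step B: global average pooling
        self.w1 = nn.Conv1d(1, 1, kernel_size=3, padding=1)   # W_1 (1-D convolution, kernel size assumed)
        self.w2 = nn.Conv1d(1, 1, kernel_size=3, padding=1)   # W_2 (1-D convolution, kernel size assumed)

    def forward(self, x_i: torch.Tensor) -> torch.Tensor:
        b, d, h, w = x_i.shape
        feat = self.pointwise(self.conv_block(x_i))           # i-th phase standard convolution feature map
        x = self.gap(feat).view(b, 1, d)                      # step B: compressed feature map x in R^(1 x D)
        x_head, x_last = x[..., :d - 1], x[..., d - 1:]       # x_1 ... x_{D-1} and x_D
        w_head = torch.sigmoid(self.w2(torch.relu(self.w1(x_head))))   # step C, equation (2)
        x_s = w_head * x_last                                 # step D: X_s = X^T . x_D (scaling by x_D)
        # assumption: give the D-th channel a pass-through weight of 1 so the gate covers all D channels
        gate = torch.cat([x_s, torch.ones_like(x_last)], dim=-1).view(b, d, 1, 1)
        return x_i * gate                                     # step E: X_D = X_s x X_i


# usage sketch: one phase with H = W = 32 and spatial depth D = 16
unit = SinglePhaseAttentionUnit(depth_d=16)
print(unit(torch.randn(2, 16, 32, 32)).shape)                 # torch.Size([2, 16, 32, 32])
```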
Drawings
FIG. 1 is a schematic diagram of the overall structure of a self-attention mechanism module according to the present invention;
FIG. 2 is a schematic diagram of a single-phase enhanced attention unit according to the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
In a self-attention method for enhancing small target segmentation performance, as shown in fig. 1, the self-attention mechanism module includes the following steps:
inputting a multi-phase feature map X into the self-attention mechanism module, where X ∈ R^(C×H×W×D) and C, H, W and D respectively denote the number of phase channels, the spatial height, the spatial width and the spatial depth of the multi-phase map; on input, the multi-phase feature map X is divided into C branches, named in sequence the phase-1 branch, the phase-2 branch, and so on, the last branch being the phase-C branch; each branch has the same structure, and the feature map input to each branch is named the i-th phase feature map X_i ∈ R^(H×W×D), with 2 ≤ i ≤ C; each i-th phase feature map passes through a single-phase attention-enhancing unit to obtain the i-th phase feature expression, each of size H×W×D; the phase-1 through phase-C feature expressions are spliced together by concat to obtain the final feature expression, whose size is H×W×D×C.
Further, as shown in fig. 2, the single-phase attention-enhancing unit includes the following steps:
A. The channel weight calculation formula (1) of the single-phase attention-enhancing unit is given as image GDA0004021122370000051 in the original publication,
where ω denotes the channel weight, σ is the sigmoid activation function, and F is the convolution operation;
as can be seen from fig. 2, for each single phase, i.e. the i-th phase, the spatial depth D of the multi-phase feature map serves as the channel dimension of that phase; the i-th phase feature map X_i ∈ R^(H×W×D) first undergoes a convolution block operation; the result is divided into D branches along the channel dimension D of the phase, and each branch is sequentially input into 2 standard convolution layers with convolution kernel size k = 1 to obtain the i-th phase standard convolution feature map;
B. The i-th phase standard convolution feature map is input into a global average pooling (GAP) module to obtain the compressed feature map x of phase i, where x ∈ R^(1×D) and x = [x_1, x_2, ..., x_D], each feature x_i gradually capturing a specific feature response during training;
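As an illustration of steps A and B, the sketch below assumes the i-th phase feature map is stored as (B, D, H, W) with the spatial depth D used as the channel dimension, treats the "convolution block" as a 3x3 convolution, and models the D per-branch pairs of k = 1 convolutions with grouped 1x1 convolutions; these layout choices are assumptions.

```python
import torch
import torch.nn as nn

B, D, H, W = 2, 16, 32, 32                 # D: spatial depth of the phase, used here as channels
x_i = torch.randn(B, D, H, W)              # the i-th phase feature map X_i

# step A: convolution block, then 2 standard k=1 convolution layers per depth branch (groups=D)
conv_block = nn.Sequential(nn.Conv2d(D, D, kernel_size=3, padding=1), nn.ReLU())
pointwise = nn.Sequential(nn.Conv2d(D, D, kernel_size=1, groups=D),
                          nn.Conv2d(D, D, kernel_size=1, groups=D))
feat = pointwise(conv_block(x_i))          # i-th phase standard convolution feature map

# step B: global average pooling over (H, W) gives the compressed feature map x in R^(1 x D)
x = nn.AdaptiveAvgPool2d(1)(feat).view(B, 1, D)
print(x.shape)                             # torch.Size([2, 1, 16])
```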
C. For the compressed feature maps x_1 ~ x_{D-1} of the first D-1 branches, the weights are normalized by a dot product operation to obtain the channel autocorrelation weight matrix X^T; this process can be expressed as follows:
the normalized weights use a sigmoid activation function, and the process can be expressed as equation (2):
a, b, ..., D-1 = σ(W_2 δ(W_1 {x_1, x_2, ..., x_{D-1}}))    (2)
where a, b, ..., D-1 respectively correspond to x_1, x_2, ..., x_{D-1}, σ is the sigmoid activation function, δ denotes the ReLU activation function, and W_1 and W_2 are two one-dimensional convolution layers;
the formula defining the channel autocorrelation weight matrix X^T is given as image GDA0004021122370000052 in the original publication;
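A sketch of equation (2) on the first D-1 pooled values; W_1 and W_2 are written here as 1-D convolutions with an assumed kernel size of 3, since the kernel size is not stated, and the result is kept as a (D-1)-length weight vector because the formula image defining X^T is not reproduced.

```python
import torch
import torch.nn as nn

B, D = 2, 16
x = torch.randn(B, 1, D)                   # compressed feature map x = [x_1, ..., x_D] from the GAP step
x_head = x[..., :D - 1]                    # x_1 ... x_{D-1}

w1 = nn.Conv1d(1, 1, kernel_size=3, padding=1)   # W_1
w2 = nn.Conv1d(1, 1, kernel_size=3, padding=1)   # W_2
weights = torch.sigmoid(w2(torch.relu(w1(x_head))))   # a, b, ..., D-1 = sigma(W_2 delta(W_1 {x_1 ... x_{D-1}}))
print(weights.shape)                       # torch.Size([2, 1, 15])
```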
D. The channel autocorrelation weight matrix X^T and x_D are dot-multiplied to obtain the channel adaptive weight X_s by the following equation:
X_s = X^T · x_D
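Read literally, step D scales the normalized weights by the pooled value x_D of the last depth channel; a self-contained sketch with placeholder tensors follows, where the shapes mirror the sketches above and are assumptions.

```python
import torch

B, D = 2, 16
weights = torch.rand(B, 1, D - 1)          # normalized weights a, b, ..., D-1 from step C
x_d = torch.rand(B, 1, 1)                  # pooled value x_D of the last depth channel

x_s = weights * x_d                        # channel adaptive weight X_s = X^T . x_D
print(x_s.shape)                           # torch.Size([2, 1, 15])
```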
E. In order to capture the long-distance dependence of the channels and obtain an effective small target semantic feature representation, the channel adaptive weight X_s is multiplied with the given phase-i feature map X_i ∈ R^(H×W×D):
X_D = X_s · X_i
where X_D, of size H×W×D, is the feature expression of phase i.
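Step E broadcasts the channel adaptive weight over the spatial positions of the phase feature map. Because X_s as sketched above has only D-1 entries, this example passes the D-th channel through with a weight of 1; that padding, like the tensor layout, is an assumption.

```python
import torch

B, D, H, W = 2, 16, 32, 32
x_i = torch.randn(B, D, H, W)              # phase-i feature map X_i with the spatial depth D as channels
x_s = torch.rand(B, 1, D - 1)              # channel adaptive weight X_s from step D

gate = torch.cat([x_s, torch.ones(B, 1, 1)], dim=-1).view(B, D, 1, 1)   # pad the D-th channel with weight 1
x_d_out = x_i * gate                       # X_D = X_s . X_i, the H x W x D feature expression of phase i
print(x_d_out.shape)                       # torch.Size([2, 16, 32, 32])
```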
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (1)

1. A self-attention method for enhancing small target segmentation performance, characterized by comprising a self-attention mechanism module and specifically comprising the following steps:
inputting a multi-phase feature map X into the self-attention mechanism module, where X ∈ R^(C×H×W×D) and C, H, W and D respectively denote the number of phase channels, the spatial height, the spatial width and the spatial depth of the multi-phase map; on input, the multi-phase feature map X is divided into C channels, named in sequence the phase-1 channel, the phase-2 channel, and so on, the last channel being the phase-C channel; each channel has the same structure, and the feature map input to each channel is named the i-th phase feature map X_i ∈ R^(H×W×D), with 2 ≤ i ≤ C; each i-th phase feature map passes through a single-phase attention-enhancing unit to obtain the i-th phase feature expression, each of size H×W×D; the phase-1 through phase-C feature expressions are spliced by concat to obtain the final feature expression, whose size is H×W×D×C;
the single-phase attention enhancing unit comprises the following steps:
s1, for each single-phase, firstly, an i-th phase characteristic diagram X is obtained i ∈R H×W×D The operation is carried out by a convolution block, the operation result is divided into D branches along the space depth D of the phase, and each branch is firstly and sequentially arrangedInputting 2 standard convolution layers with convolution kernel size k =1 to obtain an i-th phase standard convolution characteristic diagram;
s2, inputting the standard convolution feature map of the phase i into a global average pooling GAP module to obtain a compressed feature map x of the phase i, wherein x belongs to R 1×D ,x=[x 1 ,x 2 ,...,x D ]Wherein the characteristic x D Gradually capturing specific characteristic response in the training process;
s3, compression characteristic diagram x of front D-1 branches 1 ~x D-1 Normalizing the weight through dot product operation to obtain a channel autocorrelation weight matrix X T
The normalization uses a sigmoid activation function, denoted as a, b 2 δ(W 1 {x 1 ,x 2 ,...,x D-1 }), where: a, b, D-1 respectively represent a compression characteristic diagram x 1 ,x 2 ,...,x D-1 Sigma is a sigmoid activation function, and delta represents a ReLU activation function; w is a group of 1 And W 2 Two one-dimensional convolution layers;
the channel autocorrelation weight matrix
Figure FDA0004116257230000011
S4. Perform a dot product operation on the channel autocorrelation weight matrix X^T and x_D to obtain the channel adaptive weight X_s, i.e. X_s = X^T · x_D;
S5. In order to capture the long-distance dependence of the channels and obtain the small target semantic feature representation, the channel adaptive weight X_s is multiplied with the phase-i feature map X_i ∈ R^(H×W×D):
X_D = X_s · X_i
where X_D, of size H×W×D, is the feature expression of phase i.
CN202211381902.5A 2022-11-07 2022-11-07 Self-attention method for enhancing small target segmentation performance Active CN115424023B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211381902.5A CN115424023B (en) 2022-11-07 2022-11-07 Self-attention method for enhancing small target segmentation performance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211381902.5A CN115424023B (en) 2022-11-07 2022-11-07 Self-attention method for enhancing small target segmentation performance

Publications (2)

Publication Number Publication Date
CN115424023A CN115424023A (en) 2022-12-02
CN115424023B true CN115424023B (en) 2023-04-18

Family

ID=84207858

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211381902.5A Active CN115424023B (en) 2022-11-07 2022-11-07 Self-attention method for enhancing small target segmentation performance

Country Status (1)

Country Link
CN (1) CN115424023B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113743505A (en) * 2021-09-06 2021-12-03 辽宁工程技术大学 Improved SSD target detection method based on self-attention and feature fusion
CN114119993A (en) * 2021-10-30 2022-03-01 南京理工大学 Salient object detection method based on self-attention mechanism
CN115240049A (en) * 2022-06-08 2022-10-25 江苏师范大学 Deep learning model based on attention mechanism

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111950467B (en) * 2020-08-14 2021-06-25 清华大学 Fusion network lane line detection method based on attention mechanism and terminal equipment
CN113283435B (en) * 2021-05-14 2023-08-22 陕西科技大学 Remote sensing image semantic segmentation method based on multi-scale attention fusion
CN114897780B (en) * 2022-04-12 2023-04-07 南通大学 MIP sequence-based mesenteric artery blood vessel reconstruction method
CN114881962B (en) * 2022-04-28 2024-04-19 桂林理工大学 Retina image blood vessel segmentation method based on improved U-Net network
CN114943876A (en) * 2022-06-20 2022-08-26 南京信息工程大学 Cloud and cloud shadow detection method and device for multi-level semantic fusion and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113743505A (en) * 2021-09-06 2021-12-03 辽宁工程技术大学 Improved SSD target detection method based on self-attention and feature fusion
CN114119993A (en) * 2021-10-30 2022-03-01 南京理工大学 Salient object detection method based on self-attention mechanism
CN115240049A (en) * 2022-06-08 2022-10-25 江苏师范大学 Deep learning model based on attention mechanism

Also Published As

Publication number Publication date
CN115424023A (en) 2022-12-02

Similar Documents

Publication Publication Date Title
CN111639692B (en) Shadow detection method based on attention mechanism
Xu et al. Inter/intra-category discriminative features for aerial image classification: A quality-aware selection model
CN112926396B (en) Action identification method based on double-current convolution attention
CN110188705B (en) Remote traffic sign detection and identification method suitable for vehicle-mounted system
CN110569738B (en) Natural scene text detection method, equipment and medium based on densely connected network
CN111160249A (en) Multi-class target detection method of optical remote sensing image based on cross-scale feature fusion
CN113642634A (en) Shadow detection method based on mixed attention
CN109522831B (en) Real-time vehicle detection method based on micro-convolution neural network
CN111899203A (en) Real image generation method based on label graph under unsupervised training and storage medium
CN115565043A (en) Method for detecting target by combining multiple characteristic features and target prediction method
CN116091979A (en) Target tracking method based on feature fusion and channel attention
CN115222998A (en) Image classification method
CN115880495A (en) Ship image target detection method and system under complex environment
CN118212417A (en) Medical image segmentation model based on light attention module and training method of model
CN114066844A (en) Pneumonia X-ray image analysis model and method based on attention superposition and feature fusion
CN115424023B (en) Self-attention method for enhancing small target segmentation performance
CN114723733B (en) Class activation mapping method and device based on axiom explanation
CN114863132A (en) Method, system, equipment and storage medium for modeling and capturing image spatial domain information
CN110689071B (en) Target detection system and method based on structured high-order features
CN113780305A (en) Saliency target detection method based on interaction of two clues
CN117808784B (en) Flexible film fold prediction method, system, electronic equipment and medium
CN114240991B (en) Instance segmentation method of RGB image
CN118212496B (en) Image fusion method based on noise reduction and complementary information enhancement
CN116503603B (en) Training method of inter-class shielding target detection network model based on weak supervision semantic segmentation and feature compensation
CN114694119B (en) Traffic sign detection method and related device based on heavy parameterization and feature weighting

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant