CN116310693A - Camouflage target detection method based on edge feature fusion and high-order space interaction - Google Patents

Camouflage target detection method based on edge feature fusion and high-order space interaction

Info

Publication number
CN116310693A
Authority
CN
China
Prior art keywords
edge
module
convolution
feature map
feature
Prior art date
Legal status
Pending
Application number
CN202310356445.2A
Other languages
Chinese (zh)
Inventor
牛玉贞
张家榜
杨立芬
Current Assignee
Fuzhou University
Original Assignee
Fuzhou University
Priority date
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN202310356445.2A
Publication of CN116310693A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07: Target detection
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a camouflage target detection method based on edge feature fusion and high-order space interaction, comprising the following steps: performing data preprocessing, including data pairing and data enhancement, to obtain a training data set; designing a camouflage target detection network based on edge feature fusion and high-order space interaction, wherein the network comprises an edge perception module, an edge enhancement module, an edge feature fusion module, a high-order space interaction module and a context aggregation module; designing a loss function to guide the parameter optimization of the designed network; training the camouflage target detection network based on edge feature fusion and high-order space interaction with the obtained training data set until it converges to a Nash equilibrium, obtaining a trained camouflage target detection model based on edge feature fusion and high-order space interaction; and inputting an image to be detected into the trained model and outputting a mask image of the camouflage target.

Description

Camouflage target detection method based on edge feature fusion and high-order space interaction
Technical Field
The invention relates to the technical fields of image and video processing and computer vision, in particular to a camouflage target detection method based on edge feature fusion and high-order space interaction.
Background
With the development of technology, digital image processing has been widely applied to many aspects of human society and scientific research. Camouflage target detection is an emerging digital image processing task that aims to accurately and efficiently detect camouflage targets embedded in their surroundings, dividing an image into camouflage target and background so as to find the camouflage target. Camouflage is widespread in nature: living things use their own structures and physiological characteristics to blend into the surrounding environment and thus avoid predators. Camouflage target detection can help discover camouflaged organisms in nature and help scientists better study natural organisms. Its applicable fields are quite wide; beyond its academic value, camouflage object detection also helps to promote the search and detection of camouflage targets in the military field, the judgment of disease conditions in the medical field, the detection of locust invasions in agricultural remote sensing, and the like.
Early camouflage target detection methods distinguished camouflage targets from the background using hand-crafted low-level features such as color, texture, geometric gradients, frequency-domain cues and motion. However, most camouflage targets are very similar in color to the background, and color-based methods only handle situations where the object differs in color from the background. Texture-based methods detect well when the object color is very close to the background, but perform poorly when the texture of the camouflage target is similar to the background. Motion-based detection methods rely on motion information, locating a camouflage target from the differences in background color and texture created by motion; however, such methods are strongly affected by interference factors, and illumination changes or background motion can cause missed and false detections. Camouflage target detection methods based on hand-designed features can achieve a certain effect, but they often fail in complex scenes.
In recent years, with the deep application of deep learning to various fields of computer vision, many camouflage target detection models based on convolutional neural networks have appeared. These models exploit strong feature extraction and autonomous learning capabilities to model camouflage target information, which improves the accuracy of camouflage target detection and enhances the generalization of the detection models, with clearly better results than traditional camouflage detection methods. The mainstream approach is to input an image into a backbone network, extract image features with the backbone, and then predict the mask of the camouflage target from these features, thereby finding the camouflage target. Such methods make full use of the semantic information of convolutional neural networks and enlarge the receptive field to detect camouflage targets. However, because a camouflage target is highly similar to the background in color and texture, convolutional-neural-network-based camouflage target detection models have difficulty learning features of the camouflage target that distinguish the foreground from the background. Other methods therefore introduce additional cues, such as edge information, to help the convolutional-neural-network-based camouflage target detection model better distinguish the camouflage target from the background; using such additional information can effectively improve the accuracy of camouflage target detection. The invention designs a camouflage target detection method based on edge feature fusion and high-order space interaction: image features are first extracted by a backbone network, an edge perception module is designed to generate an edge mask and edge features, an edge enhancement module and an edge feature fusion module are then designed, a high-order space interaction module and a context aggregation module are constructed, and finally the designed network is used to generate the camouflage target mask.
Disclosure of Invention
In view of the above, the present invention aims to provide a camouflage target detection method based on edge feature fusion and higher-order spatial interaction, which is beneficial to significantly improving the performance of camouflage target detection by fusing edge features and performing higher-order spatial interaction.
In order to achieve the above purpose, the invention adopts the following technical scheme: a camouflage target detection method based on edge feature fusion and high-order space interaction comprises the following steps:
step A, data preprocessing, including data pairing and data enhancement processing, is carried out, and a training data set is obtained;
step B, designing a camouflage target detection network based on edge feature fusion and high-order space interaction, wherein the camouflage target detection network consists of an edge perception module, an edge enhancement module, an edge feature fusion module, a high-order space interaction module and a context aggregation module;
c, designing a loss function, and guiding parameter optimization of the network designed in the step B;
step D, training the camouflage target detection network based on edge feature fusion and high-order space interaction of step B with the training data set obtained in step A until it converges to a Nash equilibrium, obtaining a trained camouflage target detection model based on edge feature fusion and high-order space interaction;
And E, inputting the image to be detected into a trained camouflage target detection model based on edge feature fusion and high-order space interaction, and outputting a mask image of the camouflage target.
In a preferred embodiment, the step a is implemented as follows:
a1, forming an image triplet by each original image, a label image corresponding to the original image and an edge label image;
step A2, applying random horizontal flipping, random cropping and random rotation to each group of image triplets; performing color enhancement on the original image by adjusting its brightness, contrast, saturation and sharpness with randomly chosen parameter values; and adding random black or white points as random noise to the label image corresponding to the original image;
and A3, scaling each image in the data set to the same size H×W.
In a preferred embodiment, the step B is implemented as follows:
step B1, constructing an image feature extraction network, and extracting image features by using the constructed network;
step B2, designing an edge perception module, and generating an edge mask and edge characteristics by using the designed module;
step B3, designing an edge enhancement module and an edge feature fusion module, enhancing the feature representation with camouflage target edge structure semantics by using the edge enhancement module, and generating features of fusion edge information by using the edge feature fusion module;
Step B4, constructing a high-order space interaction module and a context aggregation module, using the high-order space interaction module to inhibit the attention to the background and promote the attention to the foreground, and using the context aggregation module to mine context semantics to enhance object detection;
and B5, designing a camouflage target detection network based on edge feature fusion and high-order space interaction, wherein the camouflage target detection network comprises an edge perception module, an edge feature fusion module, an edge enhancement module, a high-order space interaction module and a context aggregation module, and generating a final camouflage target mask by using the designed network.
In a preferred embodiment, the step B1 is implemented as follows:
step B1, Res2Net-50 is taken as the backbone network to extract features from an input original image I of size H×W×3; specifically, the feature maps output by the first, second, third and fourth stages of the backbone are denoted F_1, F_2, F_3 and F_4 respectively, where the feature map F_1 has size (H/4)×(W/4)×C, the feature map F_2 has size (H/8)×(W/8)×2C, the feature map F_3 has size (H/16)×(W/16)×4C, the feature map F_4 has size (H/32)×(W/32)×8C, and C=256.
In a preferred embodiment, the step B2 is implemented as follows:
step B21, design an edge perception module, whose inputs are the first-stage feature map F_1 and the fourth-stage feature map F_4 extracted in step B1, and whose outputs are an edge feature map F_e and an edge mask M_e;
step B22, design the feature fusion block in the edge perception module; its inputs are the feature maps F_1 and F_4 extracted in step B1. The input feature map F_1 is passed through a 1×1 convolution, a BN layer and a ReLU activation function in sequence to reduce the number of channels, giving a feature map F'_1; the input feature map F_4 is likewise passed through a 1×1 convolution, a BN layer and a ReLU activation function to reduce the number of channels, giving a feature map F'_4. The width and height of F'_4 are adjusted by bilinear interpolation to match those of F'_1, giving a feature map F''_4. F'_1 and F''_4 are concatenated along the channel dimension and passed through a channel attention module to obtain the edge feature map F_e. The specific formulas are as follows:
F'_1 = ReLU(BN(Conv1(F_1)))
F'_4 = ReLU(BN(Conv1(F_4)))
F''_4 = Up(F'_4)
F_e = SE(Concat(F'_1, F''_4))
where Conv1(·) is a convolution layer with convolution kernel size 1×1, BN(·) is the batch normalization operation, ReLU(·) is the ReLU activation function, Up(·) is bilinear interpolation upsampling, Concat(·,·) is the concatenation operation along the channel dimension, and SE(·) is the channel attention module;
step B23, design the convolution block in the edge perception module; the edge feature map F_e obtained in step B22 is passed through a 3×3 convolution, a BN layer, a ReLU activation function, a second 3×3 convolution, a BN layer, a ReLU activation function and finally a 1×1 convolution to generate the edge mask M_e. The specific formula is as follows:
M_e = Conv1(ReLU(BN(Conv3(ReLU(BN(Conv3(F_e)))))))
where Conv3(·) is a convolution layer with convolution kernel size 3×3, BN(·) is the batch normalization operation, ReLU(·) is the activation function, and Conv1(·) is a convolution with convolution kernel size 1×1.
In a preferred embodiment, the step B3 is implemented as follows:
step B31, design an edge enhancement module; first design the edge guiding operation in the edge enhancement module. Its inputs are the edge mask M_e obtained in step B2 and a feature map F_i obtained in step B1. The input edge mask M_e is downsampled by bilinear interpolation to the same width and height as the feature map F_i, giving a mask M'_e. The mask M'_e is multiplied with the feature map F_i, the result is added to F_i, and a 3×3 convolution, a BN layer and a ReLU activation function are applied in sequence to obtain the edge-guided feature map F_guide. The specific formulas are as follows:
M'_e = Down(M_e)
F_guide = ReLU(BN(Conv3((M'_e ⊗ F_i) ⊕ F_i)))
where Down(·) is the bilinear interpolation downsampling operation, ⊗ is element-wise multiplication, ⊕ is element-wise addition, Conv3(·) is a convolution layer with convolution kernel size 3×3, BN(·) is the batch normalization operation, and ReLU(·) is the activation function;
step B32, construct the CBAM attention sub-module in the edge enhancement module, which consists of a serial channel attention SE and a spatial attention SA; its input is the feature map F_guide obtained in step B31, and its output is the edge enhancement feature F_ee. The specific formula is as follows:
F_ee = SA(SE(F_guide))
where SE(·) is the channel attention module and SA(·) is the spatial attention module;
step B33, design an edge feature fusion module, whose inputs are the first-stage feature map F_1 extracted in step B1, the edge feature map F_e obtained in step B2 and the edge mask M_e obtained in step B2. The edge mask M_e is multiplied with the feature map F_1 and the result is added to F_1 to obtain a feature map F_M. The edge feature map F_e is passed through a 3×3 convolution, a BN layer and a ReLU activation function in sequence to obtain a channel-reduced feature map F'_e. F_M and F'_e are concatenated along the channel dimension, passed through a 3×3 convolution, a Swish activation function, an SE module and a 3×3 convolution in sequence, and then added to the feature map F'_e to obtain a feature map F''_e. The feature map F'_e is passed through the SE module, concatenated with F''_e along the channel dimension, and then passed through a 3×3 convolution to obtain a feature map F'''_e. Finally, the feature map F'''_e and the feature map F_1 are added to obtain the feature map of the final fused edge information F_fe. The specific formulas are as follows:
F_M = (M_e ⊗ F_1) ⊕ F_1
F'_e = ReLU(BN(Conv3(F_e)))
F''_e = Conv3(SE(Swish(Conv3(Concat(F_M, F'_e))))) ⊕ F'_e
F'''_e = Conv3(Concat(SE(F'_e), F''_e))
F_fe = F'''_e ⊕ F_1
where ⊗ is element-wise multiplication, ⊕ is element-wise addition, Conv3(·) is a convolution layer with convolution kernel size 3×3, BN(·) is the batch normalization operation, ReLU(·) is the activation function, Swish(·) is the Swish activation function, SE(·) is the channel attention module, and Concat(·,·) is the concatenation operation along the channel dimension.
In a preferred embodiment, the step B4 is implemented as follows:
step B41, first construct the gated convolution module in the high-order space interaction module, and denote the feature map input to this module as F_α. The input feature map F_α is layer-normalized (LN_1) to obtain a normalized feature map F'_α. Then F'_α is passed through a 1×1 convolution that expands the number of channels to twice the original, giving a feature map F''_α, which is split along the channel dimension into two feature maps p_0 and q. q is input into a depth-separable convolution to obtain a feature map Q, which is split into n feature maps q_0, q_1, ..., q_{n-1}, where n is the order. The feature map p_0 and the feature map q_0 are multiplied, and a 1×1 convolution expands the channels to twice the original, giving the first spatial interaction feature map p_1; the feature map p_1 and the feature map q_1 are multiplied, and a 1×1 convolution expands the channels to twice the original, giving the second spatial interaction feature map p_2. Iterating in this way, the feature map p_{n-1} and the feature map q_{n-1} are multiplied and passed through a convolution layer with the same number of input and output channels and convolution kernel size 1×1, giving the n-th spatial interaction feature map p_n. Finally, the input feature map F_α and p_n are added to obtain the intermediate output feature map F_mid. The specific formulas are as follows:
F'_α = LN_1(F_α)
F''_α = Conv1(F'_α)
[p_0, q] = Split(F''_α)
Q = DWConv(q)
[q_0, q_1, ..., q_{n-1}] = Split(Q)
p_{i+1} = Conv1(p_i ⊗ q_i), i = 0, 1, ..., n-1
F_mid = F_α ⊕ p_n
where Split(·) is the split operation along the channel dimension, DWConv(·) is the depth-separable convolution, Conv1(·) is a convolution layer with convolution kernel size 1×1, ⊗ is element-wise multiplication, and ⊕ is element-wise addition;
step B42, construct the feed-forward module in the high-order space interaction module. Its input is the feature map F_mid obtained in step B41. F_mid is layer-normalized (denoted LN_2) and then input into two fully-connected layers (denoted MLP); the output of the two fully-connected layers is added to the feature map F_mid to obtain the high-order space interaction feature F_hsi. The specific formula is as follows:
F_hsi = MLP(LN_2(F_mid)) ⊕ F_mid
where ⊕ is element-wise addition;
step B43, construct the channel reduction module in the high-order space interaction module. Its input is F_hsi obtained in step B42; F_hsi is passed through a 1×1 convolution, a BN layer and a ReLU activation function in sequence to obtain the channel-reduced high-order space interaction feature map F'_hsi. The specific formula is as follows:
F'_hsi = ReLU(BN(Conv1(F_hsi)))
where Conv1(·) is a convolution layer with convolution kernel size 1×1, BN(·) is the batch normalization operation, and ReLU(·) is the activation function;
step B44, first construct the convolution block in the context aggregation module. The context aggregation module takes as input two feature maps of different scales, F_low and F_high. First, the feature map F_high is upsampled by bilinear interpolation so that its width and height match those of F_low, it is concatenated with F_low along the channel dimension, and a 1×1 convolution, a BN layer and a ReLU activation function are applied in sequence to obtain a feature map F_cat. F_cat is then divided equally along the channel dimension into four feature maps x_1, x_2, x_3 and x_4. x_1 and x_2 are added and passed through a 3×3 convolution, a BN layer and a ReLU activation function in sequence to obtain a feature map f_1; x_2, f_1 and x_3 are added and passed through a 3×3 convolution with dilation rate 2, a BN layer and a ReLU activation function to obtain a feature map f_2; x_3, f_2 and x_4 are added and passed through a 3×3 convolution with dilation rate 3, a BN layer and a ReLU activation function to obtain a feature map f_3; x_4 and f_3 are added and passed through a 3×3 convolution with dilation rate 4, a BN layer and a ReLU activation function to obtain a feature map f_4. Then f_1, f_2, f_3 and f_4 are concatenated along the channel dimension and passed through a 1×1 convolution, a BN layer and a ReLU activation function in sequence to obtain a feature map F'_cat. Finally, F_cat and F'_cat are added and passed through a 3×3 convolution, a BN layer and a ReLU activation function in sequence to obtain the context feature map F_ctx. The specific formulas are as follows:
F_cat = ReLU(BN(Conv1(Concat(F_low, Up(F_high)))))
[x_1, x_2, x_3, x_4] = Split(F_cat)
f_1 = ReLU(BN(Conv3(x_1 ⊕ x_2)))
f_2 = ReLU(BN(Conv3_{d=2}(x_2 ⊕ f_1 ⊕ x_3)))
f_3 = ReLU(BN(Conv3_{d=3}(x_3 ⊕ f_2 ⊕ x_4)))
f_4 = ReLU(BN(Conv3_{d=4}(x_4 ⊕ f_3)))
F'_cat = ReLU(BN(Conv1(Concat(f_1, f_2, f_3, f_4))))
F_ctx = ReLU(BN(Conv3(F_cat ⊕ F'_cat)))
where Up(·) is the bilinear interpolation upsampling operation, Concat(·,·) is the concatenation operation along the channel dimension, ⊕ is element-wise addition, Conv3(·) is a convolution layer with convolution kernel size 3×3, Conv3_{d=i}(·) is a 3×3 convolution with dilation rate i, Conv1(·) is a convolution layer with convolution kernel size 1×1, BN(·) is the batch normalization operation, ReLU(·) is the activation function, and Split(·) is the equal split operation along the channel dimension.
In a preferred embodiment, the step B5 is implemented as follows:
step B5, design a camouflage target detection network based on edge feature fusion and high-order space interaction, comprising the edge perception module, the edge feature fusion module, the edge enhancement module, the high-order space interaction module and the context aggregation module. An original image is input and passed through the backbone network of step B1 to obtain four feature maps of different scales, F_1, F_2, F_3 and F_4. F_1 and F_4 are input into the edge perception module of step B2 to obtain the edge feature map F_e and the edge mask M_e. Then three edge enhancement modules of step B3 are constructed, denoted EEM_1, EEM_2 and EEM_3, where the inputs of EEM_1 are the fourth-stage feature map F_4 extracted in step B1 and the edge mask M_e obtained in step B2, and its output is the edge enhancement feature F_ee^1; the inputs of EEM_2 are the third-stage feature map F_3 extracted in step B1 and the edge mask M_e obtained in step B2, and its output is the edge enhancement feature F_ee^2; the inputs of EEM_3 are the second-stage feature map F_2 extracted in step B1 and the edge mask M_e obtained in step B2, and its output is the edge enhancement feature F_ee^3. Then the edge feature fusion module of step B3 is constructed; its inputs are the first-stage feature map F_1 extracted in step B1, the edge feature map F_e obtained in step B2 and the edge mask M_e, and its output is the feature map of fused edge information F_fe. Next, four high-order space interaction modules of step B4 are constructed, denoted HSIM_1, HSIM_2, HSIM_3 and HSIM_4; their inputs are the feature maps F_ee^1, F_ee^2, F_ee^3 and F_fe obtained in step B3, and their outputs are F_hsi^1, F_hsi^2, F_hsi^3 and F_hsi^4 respectively. Then three context aggregation modules of step B4 are constructed, denoted CAM_1, CAM_2 and CAM_3, where the inputs of CAM_1 are the feature maps F_hsi^1 and F_hsi^2 and its output is the context feature map F_ctx^1; the inputs of CAM_2 are the output F_ctx^1 of CAM_1 and the feature map F_hsi^3, and its output is the context feature map F_ctx^2; the inputs of CAM_3 are the output F_ctx^2 of CAM_2 and the feature map F_hsi^4, and its output is the context feature map F_ctx^3. The edge mask M_e is upsampled by bilinear interpolation with a factor of 4 to obtain the final edge mask M_edge. The context feature map F_ctx^1 is compressed to 1 channel by a 1×1 convolution and then upsampled by bilinear interpolation with a factor of 16 to obtain the first-stage camouflage target mask P_1. The context feature map F_ctx^2 is compressed to 1 channel by a 1×1 convolution and then upsampled by bilinear interpolation with a factor of 8 to obtain the second-stage camouflage target mask P_2. The context feature map F_ctx^3 is compressed to 1 channel by a 1×1 convolution and then upsampled by bilinear interpolation with a factor of 4 to obtain the final camouflage target mask P. The specific formulas are as follows:
M_edge = Up_{scale=4}(M_e)
P_1 = Up_{scale=16}(Conv1(F_ctx^1))
P_2 = Up_{scale=8}(Conv1(F_ctx^2))
P = Up_{scale=4}(Conv1(F_ctx^3))
where Up_{scale=4}(·) is bilinear interpolation upsampling with a factor of 4, Up_{scale=8}(·) is bilinear interpolation upsampling with a factor of 8, Up_{scale=16}(·) is bilinear interpolation upsampling with a factor of 16, and Conv1(·) is a convolution layer with convolution kernel size 1×1 and output channel number 1.
In a preferred embodiment, the step C is implemented as follows:
step C, designing a loss function as constraint to optimize a camouflage target detection network based on edge feature fusion and high-order space interaction, wherein the specific formula is as follows:
L_total = L_wBCE(P, G_camo) + L_wIoU(P, G_camo) + L_wBCE(P_1, G_camo) + L_wIoU(P_1, G_camo) + L_wBCE(P_2, G_camo) + L_wIoU(P_2, G_camo) + λ·L_Dice(M_edge, G_edge)
wherein G_camo represents the label image corresponding to the original image I, G_edge represents the edge label image corresponding to the original image I, L_total represents the total loss function, L_wBCE represents the weighted binary cross-entropy loss, L_wIoU represents the weighted intersection-over-union loss, L_Dice represents the Dice coefficient loss, and λ represents the weight of the loss.
In a preferred embodiment, the step D is implemented as follows:
step D1, randomly dividing the training data set obtained in the step A into a plurality of batches, wherein each batch comprises N pairs of images;
step D2, input an original image I; after passing through the camouflage target detection network based on edge feature fusion and high-order space interaction of step B, the edge mask M_edge and the camouflage target masks P_1, P_2 and P are obtained, and the loss L_total is calculated using the formula in step C;
Step D3, calculating the gradient of the parameters in the network by using a back propagation method according to the loss, and updating the network parameters by using an Adam optimization method;
step D4, repeat steps D1 to D3 batch by batch until the target loss function value of the network converges to a Nash equilibrium, and save the network parameters to obtain the camouflage target detection model based on edge feature fusion and high-order space interaction; for a test camouflage target image, the highest-resolution one of the three camouflage target masks predicted by the model, P, is taken as the final camouflage target mask.
Compared with the prior art, the invention has the following beneficial effects: building on the use of edge information, the invention fuses edge information more effectively with the backbone features and performs high-order space interaction on the features, so that the relationship between the camouflage target and the background in the image can be better learned. The invention provides a camouflage target detection method based on edge feature fusion and high-order space interaction, in which edge features and an edge mask are generated by the edge perception module, edge information is fused in the edge enhancement module and the edge feature fusion module, high-order space interaction is performed on the fused features in the high-order space interaction module, and finally features of different levels are aggregated in the context aggregation module, so that a high-quality camouflage target mask can be output.
Drawings
FIG. 1 is a flow chart of an implementation of the method in a preferred embodiment of the invention.
FIG. 2 is a block diagram of a camouflage object detection network based on edge feature fusion and higher order spatial interaction in a preferred embodiment of the invention.
Fig. 3 is a block diagram of an edge-aware module in a preferred embodiment of the present invention.
Fig. 4 is a block diagram of an edge enhancement module in a preferred embodiment of the present invention.
Fig. 5 is a block diagram of an edge feature fusion module in a preferred embodiment of the invention.
Fig. 6 is a block diagram of a high-order spatial interaction module in a preferred embodiment of the present invention.
FIG. 7 is a block diagram of a context aggregation module in accordance with a preferred embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the present application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments in accordance with the present application; as used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
The invention provides a camouflage target detection method based on edge feature fusion and high-order space interaction, which is shown in fig. 1-7 and comprises the following steps:
step A, data preprocessing, including data pairing and data enhancement processing, is carried out, and a training data set is obtained;
step B, designing a camouflage target detection network based on edge feature fusion and high-order space interaction, wherein the network comprises an edge perception module, an edge enhancement module, an edge feature fusion module, a high-order space interaction module and a context aggregation module;
c, designing a loss function, and guiding parameter optimization of the network designed in the step B;
step D, training the camouflage target detection network based on edge feature fusion and high-order space interaction of step B with the training data set obtained in step A until it converges to a Nash equilibrium, obtaining a trained camouflage target detection model based on edge feature fusion and high-order space interaction;
and E, inputting the image to be detected into a trained camouflage target detection model based on edge feature fusion and high-order space interaction, and outputting a mask image of the camouflage target.
Further, the step a includes the steps of:
and A1, forming an image triplet by each original image, the corresponding label image and the corresponding edge label image.
Step A2, applying random horizontal flipping, random cropping and random rotation to each group of image triplets; performing color enhancement on the original image by adjusting its brightness, contrast, saturation and sharpness with randomly chosen parameter values; and adding random black or white points as random noise to the label image corresponding to the original image.
And A3, scaling each image in the data set to the same size H×W. A PyTorch-style sketch of steps A1 to A3 is given below.
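The following sketch shows one possible way to pair, augment and resize a triplet; the augmentation ranges, the noise ratio and the 416×416 target size are illustrative assumptions and are not values fixed by this description, and random cropping would be added analogously.

import random
import torch
import torchvision.transforms.functional as TF

def build_triplet(image, label, edge, size=(416, 416)):
    """Step A1: an original image, its label image and its edge label image form a triplet.
    Step A2: joint geometric augmentation, image-only color enhancement, label-only noise.
    Step A3: every image in the triplet is scaled to the same H x W."""
    # joint random horizontal flip
    if random.random() < 0.5:
        image, label, edge = TF.hflip(image), TF.hflip(label), TF.hflip(edge)
    # joint random rotation (angle range is an assumption)
    angle = random.uniform(-15, 15)
    image, label, edge = (TF.rotate(t, angle) for t in (image, label, edge))
    # color enhancement on the original image only, with random parameter values
    image = TF.adjust_brightness(image, random.uniform(0.8, 1.2))
    image = TF.adjust_contrast(image, random.uniform(0.8, 1.2))
    image = TF.adjust_saturation(image, random.uniform(0.8, 1.2))
    image = TF.adjust_sharpness(image, random.uniform(0.8, 1.2))
    # random black or white points (noise) on the label image
    label_t = TF.to_tensor(label)
    noise = torch.rand_like(label_t)
    label_t[noise < 0.005] = 0.0
    label_t[noise > 0.995] = 1.0
    label = TF.to_pil_image(label_t)
    # scale each image in the triplet to the same size H x W
    image, label, edge = (TF.resize(t, list(size)) for t in (image, label, edge))
    return image, label, edge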
Further, the step B includes the steps of:
and B1, constructing an image feature extraction network, and extracting image features by using the constructed network.
And B2, designing an edge perception module, and generating an edge mask and edge characteristics by using the designed module.
And B3, designing an edge enhancement module and an edge feature fusion module, enhancing the feature representation with camouflage target edge structure semantics by using the edge enhancement module, and generating features of fusion edge information by using the edge feature fusion module.
And B4, constructing a high-order space interaction module and a context aggregation module, inhibiting the attention to the background by using the high-order space interaction module, promoting the attention to the foreground, and mining context semantics by using the context aggregation module to enhance object detection.
And B5, designing a camouflage target detection network based on edge feature fusion and high-order space interaction, wherein the camouflage target detection network comprises an edge perception module, an edge feature fusion module, an edge enhancement module, a high-order space interaction module and a context aggregation module, and generating a final camouflage target mask by using the designed network.
Further, step B1 includes the steps of:
step B1, Res2Net-50 is taken as the backbone network to extract features from an input original image I of size H×W×3; specifically, the feature maps output by the first, second, third and fourth stages of the backbone are denoted F_1, F_2, F_3 and F_4 respectively, where the feature map F_1 has size (H/4)×(W/4)×C, the feature map F_2 has size (H/8)×(W/8)×2C, the feature map F_3 has size (H/16)×(W/16)×4C, the feature map F_4 has size (H/32)×(W/32)×8C, and C=256.
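A minimal sketch of the four-stage feature extraction of step B1. The patent specifies Res2Net-50; because a Res2Net implementation may not be available everywhere, a torchvision ResNet-50 is used below as a stand-in with the same stage layout (strides 4/8/16/32, channels 256/512/1024/2048), and any Res2Net-50 exposing its four stage outputs can be substituted.

import torch.nn as nn
from torchvision.models import resnet50

class Backbone(nn.Module):
    """Four-stage feature extractor for step B1; ResNet-50 stands in for Res2Net-50."""
    def __init__(self):
        super().__init__()
        net = resnet50(weights=None)
        self.stem = nn.Sequential(net.conv1, net.bn1, net.relu, net.maxpool)
        self.layer1, self.layer2 = net.layer1, net.layer2
        self.layer3, self.layer4 = net.layer3, net.layer4

    def forward(self, x):                     # x: B x 3 x H x W
        x = self.stem(x)
        f1 = self.layer1(x)                   # B x  256 x H/4  x W/4
        f2 = self.layer2(f1)                  # B x  512 x H/8  x W/8
        f3 = self.layer3(f2)                  # B x 1024 x H/16 x W/16
        f4 = self.layer4(f3)                  # B x 2048 x H/32 x W/32
        return f1, f2, f3, f4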
Further, as shown in fig. 3, step B2 includes the steps of:
Step B21, design an edge perception module, whose inputs are the first-stage feature map F_1 and the fourth-stage feature map F_4 extracted in step B1, and whose outputs are an edge feature map F_e and an edge mask M_e.
Step B22, design the feature fusion block in the edge perception module. Its inputs are the feature maps F_1 and F_4 extracted in step B1. The input feature map F_1 is passed through a 1×1 convolution, a BN layer and a ReLU activation function in sequence to reduce the number of channels, giving a feature map F'_1; the input feature map F_4 is likewise passed through a 1×1 convolution, a BN layer and a ReLU activation function to reduce the number of channels, giving a feature map F'_4. The width and height of F'_4 are adjusted by bilinear interpolation to match those of F'_1, giving a feature map F''_4. F'_1 and F''_4 are concatenated along the channel dimension and passed through a channel attention module to obtain the edge feature map F_e. The specific formulas are as follows:
F'_1 = ReLU(BN(Conv1(F_1)))
F'_4 = ReLU(BN(Conv1(F_4)))
F''_4 = Up(F'_4)
F_e = SE(Concat(F'_1, F''_4))
where Conv1(·) is a convolution layer with convolution kernel size 1×1, BN(·) is the batch normalization operation, ReLU(·) is the ReLU activation function, Up(·) is bilinear interpolation upsampling, Concat(·,·) is the concatenation operation along the channel dimension, and SE(·) is the channel attention module.
Step B23, design the convolution block in the edge perception module. The edge feature map F_e obtained in step B22 is passed through a 3×3 convolution, a BN layer, a ReLU activation function, a second 3×3 convolution, a BN layer, a ReLU activation function and finally a 1×1 convolution to generate the edge mask M_e. The specific formula is as follows:
M_e = Conv1(ReLU(BN(Conv3(ReLU(BN(Conv3(F_e)))))))
where Conv3(·) is a convolution layer with convolution kernel size 3×3, BN(·) is the batch normalization operation, ReLU(·) is the activation function, and Conv1(·) is a convolution with convolution kernel size 1×1.
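The following sketch shows one way steps B21 to B23 could be realized in PyTorch; the intermediate channel width `mid` and the SE reduction ratio are assumptions, and the edge mask is returned as logits.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SEBlock(nn.Module):
    """Squeeze-and-excitation channel attention, used as SE(.) in the text."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))                    # global average pooling + two FC layers
        return x * w.unsqueeze(-1).unsqueeze(-1)

class EdgePerceptionModule(nn.Module):
    """Steps B21-B23 sketch: fuse F_1 and F_4 into an edge feature map F_e and
    predict a 1-channel edge mask M_e (logits)."""
    def __init__(self, c1=256, c4=2048, mid=64):
        super().__init__()
        self.reduce1 = nn.Sequential(nn.Conv2d(c1, mid, 1), nn.BatchNorm2d(mid), nn.ReLU(inplace=True))
        self.reduce4 = nn.Sequential(nn.Conv2d(c4, mid, 1), nn.BatchNorm2d(mid), nn.ReLU(inplace=True))
        self.se = SEBlock(2 * mid)
        self.mask_head = nn.Sequential(
            nn.Conv2d(2 * mid, mid, 3, padding=1), nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1), nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, 1, 1))

    def forward(self, f1, f4):
        f1p = self.reduce1(f1)                                             # F'_1
        f4p = self.reduce4(f4)                                             # F'_4
        f4pp = F.interpolate(f4p, size=f1p.shape[2:],
                             mode='bilinear', align_corners=False)         # F''_4
        fe = self.se(torch.cat([f1p, f4pp], dim=1))                        # F_e
        return fe, self.mask_head(fe)                                      # F_e, M_e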
Further, as shown in fig. 4 and 5, step B3 includes the steps of:
and step B31, designing an edge enhancement module, and firstly designing an edge guiding operation in the edge enhancement module. Input is the edge mask M obtained in step B2 e And the feature map obtained in the step B1
Figure BDA0004163444400000174
Masking the edges of the input M e Downsampling bilinear interpolation to and from feature map F i The same width and height, resulting in a mask +.>
Figure BDA0004163444400000175
Mask M' e And feature map F i Multiplying by F i Adding, and sequentially performing 3×3 convolution, BN layer, and ReLU activation function to obtain edge-guided feature map ++>
Figure BDA0004163444400000176
Figure BDA0004163444400000181
The specific formula is as follows:
M' e =Down(M e )
Figure BDA0004163444400000182
where Down (·) is a bilinear interpolation downsampling operation,
Figure BDA0004163444400000183
is a matrix multiplication, +.>
Figure BDA0004163444400000184
Is a matrix addition operation, conv3 (·) is a convolution layer with a convolution kernel size of 3×3, BN (·) is a batch normalization operation, and ReLU (·) is an activation function.
Step B32, constructing a CBAM attention sub-module in the edge enhancement module, wherein the module consists of serial channel attention SE and spatial attention SA, and the input feature map is a feature map F obtained in the step B32 guide Obtaining edge enhancement features
Figure BDA0004163444400000185
The specific formula is as follows:
F ee =SA(SE(F guide ))
where SE (-) is the channel attention module and SA (-) is the spatial attention module.
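A sketch of the edge enhancement module of steps B31 and B32, reusing the SEBlock channel attention defined with the edge perception sketch; applying a sigmoid to the edge-mask logits before multiplication and the 7×7 spatial-attention kernel are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention SA(.): channel-wise mean/max maps -> conv -> sigmoid gate."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        stats = torch.cat([x.mean(dim=1, keepdim=True),
                           x.max(dim=1, keepdim=True).values], dim=1)
        return x * torch.sigmoid(self.conv(stats))

class EdgeEnhancementModule(nn.Module):
    """Steps B31-B32 sketch: edge-guide a backbone feature F_i with the edge mask M_e,
    then apply serial channel (SE) and spatial (SA) attention."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                                  nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.se = SEBlock(channels)
        self.sa = SpatialAttention()

    def forward(self, feat, edge_mask):
        m = F.interpolate(torch.sigmoid(edge_mask), size=feat.shape[2:],
                          mode='bilinear', align_corners=False)     # M'_e at the resolution of F_i
        guided = self.conv(feat * m + feat)                          # F_guide
        return self.sa(self.se(guided))                              # F_ee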
Step B33, design an edge feature fusion module, whose inputs are the first-stage feature map F_1 extracted in step B1, the edge feature map F_e obtained in step B2 and the edge mask M_e obtained in step B2. The edge mask M_e is multiplied with the feature map F_1 and the result is added to F_1 to obtain a feature map F_M. The edge feature map F_e is passed through a 3×3 convolution, a BN layer and a ReLU activation function in sequence to obtain a channel-reduced feature map F'_e. F_M and F'_e are concatenated along the channel dimension, passed through a 3×3 convolution, a Swish activation function, an SE module and a 3×3 convolution in sequence, and then added to the feature map F'_e to obtain a feature map F''_e. The feature map F'_e is passed through the SE module, concatenated with F''_e along the channel dimension, and then passed through a 3×3 convolution to obtain a feature map F'''_e. Finally, the feature map F'''_e and the feature map F_1 are added to obtain the feature map of the final fused edge information F_fe. The specific formulas are as follows:
F_M = (M_e ⊗ F_1) ⊕ F_1
F'_e = ReLU(BN(Conv3(F_e)))
F''_e = Conv3(SE(Swish(Conv3(Concat(F_M, F'_e))))) ⊕ F'_e
F'''_e = Conv3(Concat(SE(F'_e), F''_e))
F_fe = F'''_e ⊕ F_1
where ⊗ is element-wise multiplication, ⊕ is element-wise addition, Conv3(·) is a convolution layer with convolution kernel size 3×3, BN(·) is the batch normalization operation, ReLU(·) is the activation function, Swish(·) is the Swish activation function, SE(·) is the channel attention module, and Concat(·,·) is the concatenation operation along the channel dimension.
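A sketch of the edge feature fusion module of step B33, again reusing SEBlock; the channel widths and the sigmoid on the mask logits are assumptions, and nn.SiLU implements the Swish activation.

import torch
import torch.nn as nn

class EdgeFeatureFusionModule(nn.Module):
    """Step B33 sketch: fuse the first-stage feature F_1 with the edge feature
    map F_e under guidance of the edge mask M_e."""
    def __init__(self, c1=256, ce=128):
        super().__init__()
        self.reduce_e = nn.Sequential(nn.Conv2d(ce, c1, 3, padding=1),
                                      nn.BatchNorm2d(c1), nn.ReLU(inplace=True))   # F'_e
        self.fuse = nn.Sequential(nn.Conv2d(2 * c1, c1, 3, padding=1), nn.SiLU(),
                                  SEBlock(c1), nn.Conv2d(c1, c1, 3, padding=1))
        self.se_e = SEBlock(c1)
        self.merge = nn.Conv2d(2 * c1, c1, 3, padding=1)

    def forward(self, f1, fe, me):
        m = torch.sigmoid(me)                              # edge mask as a soft gate
        f_m = f1 * m + f1                                  # F_M
        fe1 = self.reduce_e(fe)                            # F'_e, channel-reduced edge feature
        fe2 = self.fuse(torch.cat([f_m, fe1], dim=1)) + fe1
        fe3 = self.merge(torch.cat([self.se_e(fe1), fe2], dim=1))
        return fe3 + f1                                    # F_fe, feature of fused edge information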
Further, as shown in fig. 6 and 7, step B4 includes the steps of:
Step B41, first construct the gated convolution module in the high-order space interaction module, and denote the feature map input to this module as F_α. The input feature map F_α is layer-normalized (LN_1) to obtain a normalized feature map F'_α. Then F'_α is passed through a 1×1 convolution that expands the number of channels to twice the original, giving a feature map F''_α, which is split along the channel dimension into two feature maps p_0 and q. q is input into a depth-separable convolution to obtain a feature map Q, which is split into n feature maps q_0, q_1, ..., q_{n-1}, where n is the order. The feature map p_0 and the feature map q_0 are multiplied, and a 1×1 convolution expands the channels to twice the original, giving the first spatial interaction feature map p_1; the feature map p_1 and the feature map q_1 are multiplied, and a 1×1 convolution expands the channels to twice the original, giving the second spatial interaction feature map p_2. Iterating in this way, the feature map p_{n-1} and the feature map q_{n-1} are multiplied and passed through a convolution layer with the same number of input and output channels and convolution kernel size 1×1, giving the n-th spatial interaction feature map p_n. Finally, the input feature map F_α and p_n are added to obtain the intermediate output feature map F_mid. The specific formulas are as follows:
F'_α = LN_1(F_α)
F''_α = Conv1(F'_α)
[p_0, q] = Split(F''_α)
Q = DWConv(q)
[q_0, q_1, ..., q_{n-1}] = Split(Q)
p_{i+1} = Conv1(p_i ⊗ q_i), i = 0, 1, ..., n-1
F_mid = F_α ⊕ p_n
where Split(·) is the split operation along the channel dimension, DWConv(·) is the depth-separable convolution, Conv1(·) is a convolution layer with convolution kernel size 1×1, ⊗ is element-wise multiplication, and ⊕ is element-wise addition.
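A sketch of the n-th order gated convolution block of step B41, in the spirit of gnConv. The channel bookkeeping is simplified here as an assumption: every p_i and q_i keeps C channels, so the 1×1 expansion produces (n+1)·C channels, whereas the text describes doubling the channels at each interaction; the 7×7 depthwise kernel is also an assumption.

import torch
import torch.nn as nn

class GatedConvBlock(nn.Module):
    """Step B41 sketch: n-th order gated spatial interaction (gnConv-style)."""
    def __init__(self, channels, order=3):
        super().__init__()
        self.order = order
        self.norm = nn.LayerNorm(channels)                               # LN_1 over the channel dimension
        self.expand = nn.Conv2d(channels, (order + 1) * channels, 1)     # 1x1 channel expansion
        self.dwconv = nn.Conv2d(order * channels, order * channels, 7,
                                padding=3, groups=order * channels)      # depthwise conv (DWConv)
        self.proj = nn.ModuleList(nn.Conv2d(channels, channels, 1) for _ in range(order))

    def forward(self, x):                                                # x: B x C x H x W
        y = self.norm(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)         # normalized F'_alpha
        y = self.expand(y)                                               # F''_alpha
        c = y.shape[1] // (self.order + 1)
        p, q = torch.split(y, [c, self.order * c], dim=1)                # p_0 and q
        qs = torch.chunk(self.dwconv(q), self.order, dim=1)              # q_0 ... q_{n-1}
        for i in range(self.order):
            p = self.proj[i](p * qs[i])                                  # p_{i+1} = Conv1(p_i * q_i)
        return x + p                                                     # F_mid = F_alpha + p_n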
Step B42, construct the feed-forward module in the high-order space interaction module. Its input is the feature map F_mid obtained in step B41. F_mid is layer-normalized (denoted LN_2) and then input into two fully-connected layers (denoted MLP); the output of the two fully-connected layers is added to the feature map F_mid to obtain the high-order space interaction feature F_hsi. The specific formula is as follows:
F_hsi = MLP(LN_2(F_mid)) ⊕ F_mid
where ⊕ is element-wise addition.
Step B43, construct the channel reduction module in the high-order space interaction module. Its input is F_hsi obtained in step B42; F_hsi is passed through a 1×1 convolution, a BN layer and a ReLU activation function in sequence to obtain the channel-reduced high-order space interaction feature map F'_hsi. The specific formula is as follows:
F'_hsi = ReLU(BN(Conv1(F_hsi)))
where Conv1(·) is a convolution layer with convolution kernel size 1×1, BN(·) is the batch normalization operation, and ReLU(·) is the activation function.
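Steps B41 to B43 assembled into one module, with GatedConvBlock being the step-B41 sketch above; the GELU and the expansion factor inside the MLP are assumptions (the text only specifies two fully-connected layers), and `out_channels` for the 1×1 reduction is a free choice.

import torch
import torch.nn as nn

class HighOrderSpatialInteractionModule(nn.Module):
    """Steps B41-B43 sketch: gated convolution block, feed-forward block with
    LayerNorm LN_2 and two fully connected layers (residual), then 1x1 channel reduction."""
    def __init__(self, channels, out_channels=64, order=3, expansion=4):
        super().__init__()
        self.gated = GatedConvBlock(channels, order)
        self.norm = nn.LayerNorm(channels)                                 # LN_2
        self.mlp = nn.Sequential(nn.Linear(channels, expansion * channels), nn.GELU(),
                                 nn.Linear(expansion * channels, channels))
        self.reduce = nn.Sequential(nn.Conv2d(channels, out_channels, 1),
                                    nn.BatchNorm2d(out_channels), nn.ReLU(inplace=True))

    def forward(self, x):
        f_mid = self.gated(x)                                              # step B41
        y = f_mid.permute(0, 2, 3, 1)                                      # B x H x W x C
        f_hsi = (self.mlp(self.norm(y)) + y).permute(0, 3, 1, 2)           # step B42: F_hsi
        return self.reduce(f_hsi)                                          # step B43: F'_hsi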
Step B44, first construct the convolution block in the context aggregation module. The context aggregation module takes as input two feature maps of different scales, F_low and F_high. First, the feature map F_high is upsampled by bilinear interpolation so that its width and height match those of F_low, it is concatenated with F_low along the channel dimension, and a 1×1 convolution, a BN layer and a ReLU activation function are applied in sequence to obtain a feature map F_cat. F_cat is then divided equally along the channel dimension into four feature maps x_1, x_2, x_3 and x_4. x_1 and x_2 are added and passed through a 3×3 convolution, a BN layer and a ReLU activation function in sequence to obtain a feature map f_1; x_2, f_1 and x_3 are added and passed through a 3×3 convolution with dilation rate 2, a BN layer and a ReLU activation function to obtain a feature map f_2; x_3, f_2 and x_4 are added and passed through a 3×3 convolution with dilation rate 3, a BN layer and a ReLU activation function to obtain a feature map f_3; x_4 and f_3 are added and passed through a 3×3 convolution with dilation rate 4, a BN layer and a ReLU activation function to obtain a feature map f_4. Then f_1, f_2, f_3 and f_4 are concatenated along the channel dimension and passed through a 1×1 convolution, a BN layer and a ReLU activation function in sequence to obtain a feature map F'_cat. Finally, F_cat and F'_cat are added and passed through a 3×3 convolution, a BN layer and a ReLU activation function in sequence to obtain the context feature map F_ctx. The specific formulas are as follows:
F_cat = ReLU(BN(Conv1(Concat(F_low, Up(F_high)))))
[x_1, x_2, x_3, x_4] = Split(F_cat)
f_1 = ReLU(BN(Conv3(x_1 ⊕ x_2)))
f_2 = ReLU(BN(Conv3_{d=2}(x_2 ⊕ f_1 ⊕ x_3)))
f_3 = ReLU(BN(Conv3_{d=3}(x_3 ⊕ f_2 ⊕ x_4)))
f_4 = ReLU(BN(Conv3_{d=4}(x_4 ⊕ f_3)))
F'_cat = ReLU(BN(Conv1(Concat(f_1, f_2, f_3, f_4))))
F_ctx = ReLU(BN(Conv3(F_cat ⊕ F'_cat)))
where Up(·) is the bilinear interpolation upsampling operation, Concat(·,·) is the concatenation operation along the channel dimension, ⊕ is element-wise addition, Conv3(·) is a convolution layer with convolution kernel size 3×3, Conv3_{d=i}(·) is a 3×3 convolution with dilation rate i, Conv1(·) is a convolution layer with convolution kernel size 1×1, BN(·) is the batch normalization operation, ReLU(·) is the activation function, and Split(·) is the equal split operation along the channel dimension.
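A sketch of the context aggregation module of step B44. The exact wiring of the four split feature maps into the dilated branches is an assumption reconstructed from the description, and the branch width is channels // 4.

import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_relu(cin, cout, k, dilation=1):
    pad = dilation * (k - 1) // 2
    return nn.Sequential(nn.Conv2d(cin, cout, k, padding=pad, dilation=dilation),
                         nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class ContextAggregationModule(nn.Module):
    """Step B44 sketch: merge a coarse feature F_high into a finer feature F_low
    and aggregate context with progressive dilated branches (dilation 1/2/3/4)."""
    def __init__(self, c_low, c_high, channels=64):
        super().__init__()
        assert channels % 4 == 0
        c = channels // 4
        self.merge = conv_bn_relu(c_low + c_high, channels, 1)
        self.branch1 = conv_bn_relu(c, c, 3, dilation=1)
        self.branch2 = conv_bn_relu(c, c, 3, dilation=2)
        self.branch3 = conv_bn_relu(c, c, 3, dilation=3)
        self.branch4 = conv_bn_relu(c, c, 3, dilation=4)
        self.fuse = conv_bn_relu(channels, channels, 1)
        self.out = conv_bn_relu(channels, channels, 3)

    def forward(self, f_low, f_high):
        up = F.interpolate(f_high, size=f_low.shape[2:], mode='bilinear', align_corners=False)
        f_cat = self.merge(torch.cat([f_low, up], dim=1))           # F_cat
        x1, x2, x3, x4 = torch.chunk(f_cat, 4, dim=1)
        f1 = self.branch1(x1 + x2)
        f2 = self.branch2(x2 + f1 + x3)
        f3 = self.branch3(x3 + f2 + x4)
        f4 = self.branch4(x4 + f3)
        f_cat2 = self.fuse(torch.cat([f1, f2, f3, f4], dim=1))      # F'_cat
        return self.out(f_cat + f_cat2)                             # context feature map F_ctx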
Further, as shown in fig. 2, step B5 includes the following steps:

Step B5, designing a camouflage target detection network based on edge feature fusion and high-order space interaction, wherein the camouflage target detection network comprises an edge perception module, an edge feature fusion module, an edge enhancement module, a high-order space interaction module and a context aggregation module. An original image is input and passed through the backbone network in step B1 to obtain four feature maps with different scales, F_1, F_2, F_3 and F_4. F_1 and F_4 are input into the edge perception module in step B2 to obtain the edge feature map F_e and the edge mask M_e. Then three edge enhancement modules in step B3 are constructed, respectively denoted EEM_1, EEM_2 and EEM_3. The input of EEM_1 is the fourth-stage feature map F_4 extracted in step B1 and the edge mask M_e obtained in step B2, and its output is the edge enhancement feature F_ee^4; the input of EEM_2 is the third-stage feature map F_3 extracted in step B1 and the edge mask M_e obtained in step B2, and its output is the edge enhancement feature F_ee^3; the input of EEM_3 is the second-stage feature map F_2 extracted in step B1 and the edge mask M_e obtained in step B2, and its output is the edge enhancement feature F_ee^2. Then the edge feature fusion module in step B3 is constructed, whose inputs are the first-stage feature map F_1 extracted in step B1 and the edge feature map F_e and edge mask M_e obtained in step B2, and whose output is the feature map of fused edge information F_ef. Then four high-order space interaction modules in step B4 are constructed, respectively denoted HSIM_1, HSIM_2, HSIM_3 and HSIM_4; their inputs are respectively the feature maps F_ee^4, F_ee^3, F_ee^2 and F_ef obtained in step B3, and their outputs are respectively F_hsi^1, F_hsi^2, F_hsi^3 and F_hsi^4. Immediately afterwards, three context aggregation modules in step B4 are constructed, respectively denoted CAM_1, CAM_2 and CAM_3. The inputs of CAM_1 are the feature maps F_hsi^1 and F_hsi^2, and its output is the context feature map F_ctx^1; the inputs of CAM_2 are the output F_ctx^1 of CAM_1 and the feature map F_hsi^3, and its output is the context feature map F_ctx^2; the inputs of CAM_3 are the output F_ctx^2 of CAM_2 and the feature map F_hsi^4, and its output is the context feature map F_ctx^3. The edge mask M_e is amplified 4 times by bilinear interpolation upsampling to obtain the final edge mask M_edge. The context feature map F_ctx^1 is compressed into 1 channel by a 1×1 convolution and then amplified 16 times by bilinear interpolation upsampling to obtain the first-stage camouflage target mask M_camo^1. The context feature map F_ctx^2 is compressed into 1 channel by a 1×1 convolution and then amplified 8 times by bilinear interpolation upsampling to obtain the second-stage camouflage target mask M_camo^2. The context feature map F_ctx^3 is compressed into 1 channel by a 1×1 convolution and then amplified 4 times by bilinear interpolation upsampling to obtain the final camouflage target mask M_camo^3. The specific formulas are as follows:

M_edge = Up_{scale=4}(M_e)
M_camo^1 = Up_{scale=16}(Conv1(F_ctx^1))
M_camo^2 = Up_{scale=8}(Conv1(F_ctx^2))
M_camo^3 = Up_{scale=4}(Conv1(F_ctx^3))

wherein Up_{scale=4}(·) is bilinear interpolation upsampling with a multiple of 4, Up_{scale=8}(·) is bilinear interpolation upsampling with a multiple of 8, Up_{scale=16}(·) is bilinear interpolation upsampling with a multiple of 16, and Conv1(·) is a convolution layer with a convolution kernel size of 1×1 and an output channel number of 1.
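To make the mask-generation step concrete, here is a small sketch of the output heads implied by the formulas above (one 1×1 convolution per context feature map followed by bilinear upsampling); the class name and channel width are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PredictionHeads(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        # One 1x1 convolution per context feature map, compressing to 1 channel.
        self.head1 = nn.Conv2d(ch, 1, kernel_size=1)
        self.head2 = nn.Conv2d(ch, 1, kernel_size=1)
        self.head3 = nn.Conv2d(ch, 1, kernel_size=1)

    @staticmethod
    def up(x, scale):
        return F.interpolate(x, scale_factor=scale, mode="bilinear",
                             align_corners=False)

    def forward(self, m_e, f_ctx1, f_ctx2, f_ctx3):
        m_edge = self.up(m_e, 4)                   # M_edge = Up_{x4}(M_e)
        m_camo1 = self.up(self.head1(f_ctx1), 16)  # first-stage mask
        m_camo2 = self.up(self.head2(f_ctx2), 8)   # second-stage mask
        m_camo3 = self.up(self.head3(f_ctx3), 4)   # final mask
        return m_edge, m_camo1, m_camo2, m_camo3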
Further, step C comprises the following steps:

Step C, designing a loss function as a constraint to optimize the camouflage target detection network based on edge feature fusion and high-order space interaction. The specific formula is as follows:

L = Σ_{k=1}^{3} ( L_wbce(M_camo^k, G_camo) + L_wiou(M_camo^k, G_camo) ) + λ·L_dice(M_edge, G_edge)

wherein G_camo represents the label image corresponding to the original image I, G_edge represents the edge label image corresponding to the original image I, L is the total loss function, L_wbce represents the weighted binary cross entropy loss, L_wiou represents the weighted intersection-over-union (IoU) loss, L_dice represents the Dice coefficient loss, and λ is the weight of the loss.
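The following sketch shows one plausible realization of the three loss terms; the boundary-aware pixel weighting, the summation over the three camouflage masks and the default λ are assumptions consistent with, but not dictated by, the description above. Predictions are assumed to be logits, labels binary maps of the same size.

import torch
import torch.nn.functional as F

def _pixel_weights(gt, k=31):
    # Emphasize pixels near object boundaries (a common choice for wBCE/wIoU).
    return 1.0 + 5.0 * torch.abs(
        F.avg_pool2d(gt, k, stride=1, padding=k // 2) - gt)

def weighted_bce_iou(pred, gt):
    w = _pixel_weights(gt)
    bce = F.binary_cross_entropy_with_logits(pred, gt, reduction="none")
    wbce = (w * bce).sum(dim=(2, 3)) / w.sum(dim=(2, 3))
    p = torch.sigmoid(pred)
    inter = (p * gt * w).sum(dim=(2, 3))
    union = ((p + gt) * w).sum(dim=(2, 3))
    wiou = 1.0 - (inter + 1.0) / (union - inter + 1.0)
    return (wbce + wiou).mean()

def dice_loss(pred, gt, eps=1.0):
    p = torch.sigmoid(pred)
    inter = (p * gt).sum(dim=(2, 3))
    return (1.0 - (2.0 * inter + eps) /
            (p.sum(dim=(2, 3)) + gt.sum(dim=(2, 3)) + eps)).mean()

def total_loss(m_camo_list, m_edge, g_camo, g_edge, lam=3.0):
    # lam is an assumed default for the edge-loss weight λ.
    seg = sum(weighted_bce_iou(m, g_camo) for m in m_camo_list)
    return seg + lam * dice_loss(m_edge, g_edge)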
Further, step D is implemented as follows:

Step D1, randomly dividing the training data set obtained in step A into a plurality of batches, each batch containing N pairs of images.

Step D2, inputting an original image I; after the original image I passes through the camouflage target detection network based on edge feature fusion and high-order space interaction in step B, the edge mask M_edge and the camouflage target masks M_camo^1, M_camo^2 and M_camo^3 are obtained, and the loss L is calculated using the formula in step C.

Step D3, calculating the gradients of the parameters in the network by the back propagation method according to the loss, and updating the network parameters using the Adam optimization method.

Step D4, repeating steps D1 to D3 batch by batch until the target loss function value of the network converges to Nash equilibrium, and saving the network parameters to obtain the trained camouflage target detection model based on edge feature fusion and high-order space interaction. For a tested camouflage target image, the highest-resolution one of the three camouflage target masks predicted by the model, M_camo^3, is taken as the final camouflage target mask.
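A minimal training-loop sketch for steps D1 to D4 follows; the data-loader output format, batch size, learning rate and checkpoint name are placeholders, and criterion stands for a loss function such as the step-C sketch.

import torch
from torch.utils.data import DataLoader

def train(model, dataset, criterion, epochs=100, batch_size=8, lr=1e-4,
          device="cuda"):
    model = model.to(device).train()
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True,
                        num_workers=4, drop_last=True)          # step D1
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)     # Adam optimizer
    for epoch in range(epochs):                                  # step D4
        for image, g_camo, g_edge in loader:                     # step D2
            image = image.to(device)
            g_camo, g_edge = g_camo.to(device), g_edge.to(device)
            m_edge, m1, m2, m3 = model(image)
            loss = criterion([m1, m2, m3], m_edge, g_camo, g_edge)
            optimizer.zero_grad()
            loss.backward()                                      # step D3
            optimizer.step()
    torch.save(model.state_dict(), "cod_edge_hsi.pth")           # save weights
    return model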
The above is a preferred embodiment of the present invention, and all changes made according to the technical solution of the present invention belong to the protection scope of the present invention when the generated functional effects do not exceed the scope of the technical solution of the present invention.

Claims (10)

1. The camouflage target detection method based on edge feature fusion and high-order space interaction is characterized by comprising the following steps of:
step A, data preprocessing, including data pairing and data enhancement processing, is carried out, and a training data set is obtained;
step B, designing a camouflage target detection network based on edge feature fusion and high-order space interaction, wherein the camouflage target detection network consists of an edge perception module, an edge enhancement module, an edge feature fusion module, a high-order space interaction module and a context aggregation module;
C, designing a loss function, and guiding parameter optimization of the network designed in the step B;
step D, training the camouflage target detection network based on edge feature fusion and high-order space interaction in step B by using the training data set obtained in step A until it converges to Nash equilibrium, obtaining a trained camouflage target detection model based on edge feature fusion and high-order space interaction;
and E, inputting the image to be detected into a trained camouflage target detection model based on edge feature fusion and high-order space interaction, and outputting a mask image of the camouflage target.
2. The method for detecting the camouflage target based on the edge feature fusion and the higher-order spatial interaction according to claim 1, wherein the specific implementation steps of the step A are as follows:
step A1, forming an image triplet from each original image, the label image corresponding to the original image, and the edge label image;
step A2, randomly flipping left and right, randomly cropping and randomly rotating each group of image triplets; performing color enhancement on the original image by adjusting its brightness, contrast, saturation and sharpness with randomly set parameter values; adding random black or white points as random noise to the label image corresponding to the original image;
step A3, scaling each image in the data set to the same size H×W.
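A sketch of how the preprocessing of claim 2 could be implemented is given below, assuming a recent torchvision (for adjust_sharpness) and single-channel label and edge masks; the flip probability, rotation range, crop ratio, jitter ranges and noise ratio are illustrative.

import random
import numpy as np
from PIL import Image
import torchvision.transforms.functional as TF

def augment_triplet(image, label, edge, out_size=(352, 352), noise_ratio=0.001):
    # Step A2: identical geometric transforms for all three images.
    if random.random() < 0.5:
        image, label, edge = TF.hflip(image), TF.hflip(label), TF.hflip(edge)
    angle = random.uniform(-15, 15)
    image, label, edge = (TF.rotate(im, angle) for im in (image, label, edge))
    w, h = image.size
    crop_w, crop_h = int(w * 0.9), int(h * 0.9)
    left, top = random.randint(0, w - crop_w), random.randint(0, h - crop_h)
    image, label, edge = (TF.crop(im, top, left, crop_h, crop_w)
                          for im in (image, label, edge))
    # Color enhancement on the original image only.
    image = TF.adjust_brightness(image, random.uniform(0.8, 1.2))
    image = TF.adjust_contrast(image, random.uniform(0.8, 1.2))
    image = TF.adjust_saturation(image, random.uniform(0.8, 1.2))
    image = TF.adjust_sharpness(image, random.uniform(0.8, 1.2))
    # Random black/white points on the (single-channel) label image.
    lab = np.array(label.convert("L"))
    mask = np.random.rand(*lab.shape) < noise_ratio
    lab[mask] = np.where(np.random.rand(mask.sum()) < 0.5, 0, 255).astype(lab.dtype)
    label = Image.fromarray(lab)
    # Step A3: scale every image to the same H x W.
    image, label, edge = (TF.resize(im, list(out_size))
                          for im in (image, label, edge))
    return image, label, edge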
3. The method for detecting the camouflage target based on the edge feature fusion and the higher-order spatial interaction according to claim 1, wherein the specific implementation steps of the step B are as follows:
step B1, constructing an image feature extraction network, and extracting image features by using the constructed network;
step B2, designing an edge perception module, and generating an edge mask and edge characteristics by using the designed module;
step B3, designing an edge enhancement module and an edge feature fusion module, enhancing the feature representation with camouflage target edge structure semantics by using the edge enhancement module, and generating features of fusion edge information by using the edge feature fusion module;
step B4, constructing a high-order space interaction module and a context aggregation module, using the high-order space interaction module to inhibit the attention to the background and promote the attention to the foreground, and using the context aggregation module to mine context semantics to enhance object detection;
and B5, designing a camouflage target detection network based on edge feature fusion and high-order space interaction, wherein the camouflage target detection network comprises an edge perception module, an edge feature fusion module, an edge enhancement module, a high-order space interaction module and a context aggregation module, and generating a final camouflage target mask by using the designed network.
4. The method for detecting a camouflage target based on edge feature fusion and higher-order spatial interaction according to claim 3, wherein the step B1 is specifically implemented as follows:
step B1, taking Res2Net-50 as a backbone network to extract features from an original image I with an input size of H×W×3; specifically, the feature maps output by the original image I in the first, second, third and fourth stages are respectively denoted F_1, F_2, F_3 and F_4, wherein the feature map F_1 has a size of (H/4)×(W/4)×C, the feature map F_2 has a size of (H/8)×(W/8)×2C, the feature map F_3 has a size of (H/16)×(W/16)×4C, the feature map F_4 has a size of (H/32)×(W/32)×8C, and C=256.
5. The method for detecting a camouflage target based on edge feature fusion and higher-order spatial interaction according to claim 3, wherein the step B2 is specifically implemented as follows:
step B21, designing an edge perception module, wherein the inputs of the edge perception module are the first-stage feature map F_1 and the fourth-stage feature map F_4 extracted in step B1, and the outputs of the edge perception module are the edge feature map F_e and the edge mask M_e;
step B22, designing a feature fusion block in the edge perception module; the inputs are the feature maps F_1 and F_4 extracted in step B1; the input feature map F_1 is passed through a 1×1 convolution, a BN layer and a ReLU activation function in sequence to reduce the channel number, obtaining the feature map F'_1; the input feature map F_4 is passed through a 1×1 convolution, a BN layer and a ReLU activation function in sequence to reduce the channel number, obtaining the feature map F'_4; the width and height of the feature map F'_4 are adjusted by bilinear interpolation to the same width and height as F'_1, obtaining the feature map F''_4; F'_1 and F''_4 are spliced along the channel dimension and passed through a channel attention module to obtain the edge feature map F_e; the specific formulas are as follows:
F'_1 = ReLU(BN(Conv1(F_1)))
F'_4 = ReLU(BN(Conv1(F_4)))
F''_4 = Up(F'_4)
F_e = SE(Concat(F'_1, F''_4))
wherein Conv1(·) is a convolution layer with a convolution kernel size of 1×1, BN(·) is a batch normalization operation, ReLU(·) is a ReLU activation function, Up(·) is bilinear interpolation upsampling, Concat(·,·) is a splicing operation along the channel dimension, and SE(·) is a channel attention module;
step B23, designing a convolution block in the edge perception module; the edge feature map F_e obtained in step B22 is input and passed through a 3×3 convolution, a BN layer, a ReLU activation function, a 3×3 convolution, a BN layer, a ReLU activation function and a 1×1 convolution in sequence, finally generating the edge mask M_e; the specific formula is as follows:
M_e = Conv1(ReLU(BN(Conv3(ReLU(BN(Conv3(F_e)))))))
wherein Conv3(·) is a convolution layer with a convolution kernel size of 3×3, BN(·) is a batch normalization operation, ReLU(·) is an activation function, and Conv1(·) is a convolution with a convolution kernel size of 1×1.
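A compact PyTorch sketch of this edge perception module follows; the squeeze-and-excitation block stands in for the claimed channel attention, and the reduced width ch is an assumption.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SE(nn.Module):
    # Squeeze-and-excitation as a generic channel attention module.
    def __init__(self, ch, r=16):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(inplace=True),
                                nn.Linear(ch // r, ch), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))            # squeeze over H, W
        return x * w[:, :, None, None]             # re-weight channels

class EdgePerception(nn.Module):
    def __init__(self, c1=256, c4=2048, ch=64):
        super().__init__()
        self.reduce1 = nn.Sequential(nn.Conv2d(c1, ch, 1, bias=False),
                                     nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
        self.reduce4 = nn.Sequential(nn.Conv2d(c4, ch, 1, bias=False),
                                     nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
        self.se = SE(2 * ch)
        self.head = nn.Sequential(                 # convolution block (B23)
            nn.Conv2d(2 * ch, 2 * ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(2 * ch), nn.ReLU(inplace=True),
            nn.Conv2d(2 * ch, 2 * ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(2 * ch), nn.ReLU(inplace=True),
            nn.Conv2d(2 * ch, 1, 1))

    def forward(self, f1, f4):
        f1r = self.reduce1(f1)                                      # F'_1
        f4r = F.interpolate(self.reduce4(f4), size=f1r.shape[-2:],
                            mode="bilinear", align_corners=False)   # F''_4
        f_e = self.se(torch.cat([f1r, f4r], dim=1))                 # edge features
        m_e = self.head(f_e)                                        # edge mask
        return f_e, m_e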
6. The method for detecting a camouflage target based on edge feature fusion and higher-order spatial interaction according to claim 3, wherein the step B3 is specifically implemented as follows:
step B31, designing an edge enhancement module, first designing the edge guiding operation in the edge enhancement module; the inputs are the edge mask M_e obtained in step B2 and a feature map F_i obtained in step B1; the input edge mask M_e is downsampled by bilinear interpolation to the same width and height as the feature map F_i, obtaining the mask M'_e; the mask M'_e is multiplied by the feature map F_i, the result is added to F_i, and the sum is passed through a 3×3 convolution, a BN layer and a ReLU activation function in sequence to obtain the edge-guided feature map F_guide; the specific formulas are as follows:
M'_e = Down(M_e)
F_guide = ReLU(BN(Conv3((M'_e ⊗ F_i) ⊕ F_i)))
wherein Down(·) is a bilinear interpolation downsampling operation, ⊗ is a matrix multiplication, ⊕ is a matrix addition operation, Conv3(·) is a convolution layer with a convolution kernel size of 3×3, BN(·) is a batch normalization operation, and ReLU(·) is an activation function;
step B32, constructing a CBAM attention sub-module in the edge enhancement module, which consists of a serial channel attention SE and spatial attention SA; its input is the feature map F_guide obtained in step B31, and its output is the edge enhancement feature F_ee; the specific formula is as follows:
F_ee = SA(SE(F_guide))
wherein SE(·) is a channel attention module and SA(·) is a spatial attention module;
step B33, designing an edge feature fusion module, whose inputs are the first-stage feature map F_1 extracted in step B1, and the edge feature map F_e and the edge mask M_e obtained in step B2; the edge mask M_e is multiplied by the feature map F_1 and the result is added to F_1 to obtain the feature map F_M; the edge feature map F_e is passed through a 3×3 convolution, a BN layer and a ReLU activation function in sequence to obtain the channel-reduced feature map F'_e; F_M is spliced with F'_e along the channel dimension, passed through a 3×3 convolution, a Swish activation function, an SE module and a 3×3 convolution in sequence, and added to the feature map F'_e to obtain the feature map F''_e; the feature map F'_e is passed through an SE module, spliced with F''_e along the channel dimension, and then passed through a 3×3 convolution to obtain the feature map F'''_e; finally, the feature map F'''_e is added to the feature map F_1 to obtain the feature map of the final fused edge information F_ef; the specific formulas are as follows:
F_M = (M_e ⊗ F_1) ⊕ F_1
F'_e = ReLU(BN(Conv3(F_e)))
F''_e = Conv3(SE(Swish(Conv3(Concat(F_M, F'_e))))) ⊕ F'_e
F'''_e = Conv3(Concat(SE(F'_e), F''_e))
F_ef = F'''_e ⊕ F_1
wherein ⊗ is a matrix multiplication, ⊕ is a matrix addition operation, Conv3(·) is a convolution layer with a convolution kernel size of 3×3, BN(·) is a batch normalization operation, ReLU(·) is an activation function, Swish(·) is a Swish activation function, SE(·) is a channel attention module, and Concat(·,·) is a concatenation operation along the channel dimension.
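Below is a sketch of the edge enhancement module of steps B31-B32 (edge guidance followed by serial channel and spatial attention); the sigmoid applied to the mask, the 7×7 spatial-attention kernel and the channel width are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SE(nn.Module):
    # Squeeze-and-excitation channel attention (same role as in claim 5).
    def __init__(self, ch, r=16):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(inplace=True),
                                nn.Linear(ch // r, ch), nn.Sigmoid())

    def forward(self, x):
        return x * self.fc(x.mean(dim=(2, 3)))[:, :, None, None]

class SpatialAttention(nn.Module):
    def __init__(self, k=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, k, padding=k // 2)

    def forward(self, x):
        # Channel-wise average and max pooling, then a 7x7 conv -> sigmoid map.
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.max(dim=1, keepdim=True).values], dim=1)
        return x * torch.sigmoid(self.conv(s))

class EdgeEnhancement(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1, bias=False),
                                  nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
        self.se = SE(ch)
        self.sa = SpatialAttention()

    def forward(self, f_i, m_e):
        # Edge guidance: resize the edge mask, gate the feature, add residually.
        # (sigmoid is an assumption; the claim multiplies the mask directly)
        m = F.interpolate(torch.sigmoid(m_e), size=f_i.shape[-2:],
                          mode="bilinear", align_corners=False)
        f_guide = self.conv(f_i * m + f_i)
        return self.sa(self.se(f_guide))            # F_ee = SA(SE(F_guide))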
7. The method for detecting a camouflage target based on edge feature fusion and higher-order spatial interaction according to claim 3, wherein the step B4 is specifically implemented as follows:
step B41, first constructing a gated convolution module in the high-order space interaction module, wherein the feature map input to the module is denoted F_α; the input feature map F_α is layer-normalized (LN_1) to obtain a normalized feature map F'_α; F'_α is passed through a 1×1 convolution that enlarges the channel number to twice the original, obtaining the feature map F''_α; F''_α is split along the channel dimension into two feature maps p_0 and q; q is input into a depth separable convolution to obtain the feature map Q, which is split into n feature maps q_0, q_1, ..., q_{n-1}, where n is the order; the feature map p_0 is multiplied by the feature map q_0 and the channel number is expanded to twice by a 1×1 convolution, obtaining the first spatial interaction feature map p_1; the feature map p_1 is multiplied by the feature map q_1 and the channel number is expanded to twice by a 1×1 convolution, obtaining the second spatial interaction feature map p_2; the iteration proceeds in sequence until the feature map p_{n-1} is multiplied by the feature map q_{n-1} and passed through a 1×1 convolution layer whose output channel number equals its input channel number, obtaining the n-order spatial interaction feature map p_n; finally, the input feature map F_α is added to p_n to obtain the intermediate output feature map F_mid; the specific formulas are as follows:
F'_α = LN_1(F_α)
F''_α = Conv1(F'_α)
(p_0, q) = Split(F''_α)
Q = DWConv(q)
(q_0, q_1, ..., q_{n-1}) = Split(Q)
p_{k+1} = Conv1(p_k ⊗ q_k), k = 0, 1, ..., n-1
F_mid = F_α ⊕ p_n
wherein Split(·) is a split operation along the channel dimension, DWConv(·) is a depth separable convolution, Conv1(·) is a convolution layer with a convolution kernel size of 1×1, ⊗ is a matrix multiplication, and ⊕ is a matrix addition operation;
step B42, constructing a feedforward module in the high-order space interaction module, whose input is the feature map F_mid obtained in step B41; F_mid is layer-normalized (denoted LN_2) and then input into a two-layer fully connected block, denoted MLP; the output of the two fully connected layers is added to the feature map F_mid to obtain the high-order spatial interaction feature F_hsi; the specific formula is as follows:
F_hsi = MLP(LN_2(F_mid)) ⊕ F_mid
wherein ⊕ is a matrix addition operation;
step B43, constructing a channel reduction module in the high-order space interaction module, whose input is the F_hsi obtained in step B42; F_hsi is passed through a 1×1 convolution, a BN layer and a ReLU activation function in sequence to obtain the channel-reduced high-order spatial interaction feature map F'_hsi; the specific formula is as follows:
F'_hsi = ReLU(BN(Conv1(F_hsi)))
wherein Conv1(·) is a convolution layer with a convolution kernel size of 1×1, BN(·) is a batch normalization operation, and ReLU(·) is an activation function;
step B44, firstly constructing a convolution block in a context aggregation module, and recording that the context aggregation module inputs two feature graphs with different scales
Figure FDA0004163444390000078
And->
Figure FDA0004163444390000079
First, feature map F h igh Upsampling bilinear interpolation to adjust its width and height to be equal to F low The same width and height as F low Splicing along the channel dimension, and then sequentially carrying out 1×1 convolution, BN layer and ReLU activation function to obtain a feature map +. >
Figure FDA00041634443900000710
And then F is arranged cat Four feature maps are equally divided along the channel dimension>
Figure FDA00041634443900000711
And->
Figure FDA0004163444390000081
Will->
Figure FDA0004163444390000082
And->
Figure FDA0004163444390000083
After addition, the characteristic diagram ++is obtained by 3X 3 convolution, BN layer and ReLU activation function in sequence>
Figure FDA0004163444390000084
Will->
Figure FDA0004163444390000085
And->
Figure FDA0004163444390000086
After the three are added, the characteristic diagram +.f is obtained by 3 multiplied by 3 convolution with expansion rate of 2, BN layer and ReLU activation function>
Figure FDA0004163444390000087
Figure FDA0004163444390000088
Will->
Figure FDA0004163444390000089
And->
Figure FDA00041634443900000810
The three are added and then sequentially subjected to 3 multiplied by 3 convolution with the expansion rate of 3, BN layer and ReLU activation function to obtain a characteristic diagram +.>
Figure FDA00041634443900000811
Will->
Figure FDA00041634443900000812
And->
Figure FDA00041634443900000813
After addition, the characteristic diagram is obtained by 3X 3 convolution with expansion ratio of 4, BN layer and ReLU activation function
Figure FDA00041634443900000814
Then will->
Figure FDA00041634443900000815
And->
Figure FDA00041634443900000816
After being spliced along the channel dimension, the characteristic diagram +.f is obtained by sequentially carrying out 1 multiplied by 1 convolution, BN layer and ReLU activation function>
Figure FDA00041634443900000817
Finally F is arranged cat And F' cat After addition, the context feature map +.f. is obtained by 3×3 convolution, BN layer and ReLU activation function in sequence>
Figure FDA00041634443900000818
The specific formula is as follows:
F cat =ReLU(BN(Conv1(Concat(F low ,Up(F high )))))
Figure FDA00041634443900000819
Figure FDA00041634443900000820
Figure FDA00041634443900000821
Figure FDA00041634443900000822
Figure FDA00041634443900000823
Figure FDA00041634443900000824
Figure FDA00041634443900000825
wherein Up (-) is a bilinear interpolation upsampling operation, concat (-), and Concat (-), are concatenation operations along the channel dimension,
Figure FDA00041634443900000826
is a matrix addition operation, conv3 (&) is a convolution with a convolution kernel size of 3×3Layer, conv3 d=i (. Cndot.) is a 3X 3 convolution with a rate of expansion of i, conv1 (-) is a convolution layer with a convolution kernel size of 1X 1, EN (-) is a batch normalization operation, reLU (-) is an activation function, split (-) is a Split operation equally along the channel dimension.
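For steps B41-B43 of this claim, the following sketch illustrates a recursive gated ("high-order") spatial interaction block; the per-order channel-doubling split, the 7×7 depthwise kernel, the GELU in the feed-forward block and the widths are assumptions not fixed by the claim.

import torch
import torch.nn as nn

class LayerNorm2d(nn.Module):
    # LayerNorm applied over the channel dimension of an NCHW tensor.
    def __init__(self, ch):
        super().__init__()
        self.ln = nn.LayerNorm(ch)

    def forward(self, x):
        return self.ln(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)

class HighOrderSpatialInteraction(nn.Module):
    def __init__(self, ch=64, order=3, out_ch=64):
        super().__init__()
        # q_k has ch / 2^(order-1-k) channels, so each 1x1 conv doubles p_k.
        self.dims = [ch // 2 ** (order - 1 - k) for k in range(order)]
        self.norm1 = LayerNorm2d(ch)
        self.proj_in = nn.Conv2d(ch, 2 * ch, 1)
        self.dwconv = nn.Conv2d(sum(self.dims), sum(self.dims), 7, padding=3,
                                groups=sum(self.dims))      # depth separable
        self.pws = nn.ModuleList(
            [nn.Conv2d(self.dims[k], self.dims[k + 1], 1)
             for k in range(order - 1)] +
            [nn.Conv2d(self.dims[-1], self.dims[-1], 1)])   # last: in == out
        self.norm2 = LayerNorm2d(ch)
        self.mlp = nn.Sequential(nn.Conv2d(ch, 4 * ch, 1), nn.GELU(),
                                 nn.Conv2d(4 * ch, ch, 1))  # two-layer FFN
        self.reduce = nn.Sequential(nn.Conv2d(ch, out_ch, 1, bias=False),
                                    nn.BatchNorm2d(out_ch),
                                    nn.ReLU(inplace=True))  # step B43

    def forward(self, x):
        # Step B41: gated convolution with order-wise spatial interaction.
        p, q = torch.split(self.proj_in(self.norm1(x)),
                           [self.dims[0], sum(self.dims)], dim=1)
        qs = torch.split(self.dwconv(q), self.dims, dim=1)
        for k, conv in enumerate(self.pws):
            p = conv(p * qs[k])             # p_{k+1} = Conv1(p_k * q_k)
        f_mid = x + p
        # Step B42: feed-forward block with a residual connection.
        f_hsi = f_mid + self.mlp(self.norm2(f_mid))
        # Step B43: channel reduction.
        return self.reduce(f_hsi)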
8. The method for detecting a camouflage target based on edge feature fusion and higher-order spatial interaction according to claim 3, wherein the step B5 is specifically implemented as follows:
step B5, designing a camouflage target detection network based on edge feature fusion and high-order space interaction, wherein the camouflage target detection network comprises an edge perception module, an edge feature fusion module, an edge enhancement module, a high-order space interaction module and a context aggregation module; an original image is input and passed through the backbone network in step B1 to obtain four feature maps with different scales F_1, F_2, F_3 and F_4; F_1 and F_4 are input into the edge perception module in step B2 to obtain the edge feature map F_e and the edge mask M_e; then three edge enhancement modules in step B3 are constructed, respectively denoted EEM_1, EEM_2 and EEM_3, wherein the input of EEM_1 is the fourth-stage feature map F_4 extracted in step B1 and the edge mask M_e obtained in step B2, and its output is the edge enhancement feature F_ee^4; the input of EEM_2 is the third-stage feature map F_3 extracted in step B1 and the edge mask M_e obtained in step B2, and its output is the edge enhancement feature F_ee^3; the input of EEM_3 is the second-stage feature map F_2 extracted in step B1 and the edge mask M_e obtained in step B2, and its output is the edge enhancement feature F_ee^2; then the edge feature fusion module in step B3 is constructed, whose inputs are the first-stage feature map F_1 extracted in step B1 and the edge feature map F_e and edge mask M_e obtained in step B2, and whose output is the feature map of fused edge information F_ef; then four high-order space interaction modules in step B4 are constructed, respectively denoted HSIM_1, HSIM_2, HSIM_3 and HSIM_4, whose inputs are respectively the feature maps F_ee^4, F_ee^3, F_ee^2 and F_ef obtained in step B3 and whose outputs are respectively F_hsi^1, F_hsi^2, F_hsi^3 and F_hsi^4; immediately afterwards, three context aggregation modules in step B4 are constructed, respectively denoted CAM_1, CAM_2 and CAM_3, wherein the inputs of CAM_1 are the feature maps F_hsi^1 and F_hsi^2 and its output is the context feature map F_ctx^1, the inputs of CAM_2 are the output F_ctx^1 of CAM_1 and the feature map F_hsi^3 and its output is the context feature map F_ctx^2, and the inputs of CAM_3 are the output F_ctx^2 of CAM_2 and the feature map F_hsi^4 and its output is the context feature map F_ctx^3; the edge mask M_e is amplified 4 times by bilinear interpolation upsampling to obtain the final edge mask M_edge; the context feature map F_ctx^1 is compressed into 1 channel by a 1×1 convolution and then amplified 16 times by bilinear interpolation upsampling to obtain the first-stage camouflage target mask M_camo^1; the context feature map F_ctx^2 is compressed into 1 channel by a 1×1 convolution and then amplified 8 times by bilinear interpolation upsampling to obtain the second-stage camouflage target mask M_camo^2; the context feature map F_ctx^3 is compressed into 1 channel by a 1×1 convolution and then amplified 4 times by bilinear interpolation upsampling to obtain the final camouflage target mask M_camo^3; the specific formulas are as follows:
M_edge = Up_{scale=4}(M_e)
M_camo^1 = Up_{scale=16}(Conv1(F_ctx^1))
M_camo^2 = Up_{scale=8}(Conv1(F_ctx^2))
M_camo^3 = Up_{scale=4}(Conv1(F_ctx^3))
wherein Up_{scale=4}(·) is bilinear interpolation upsampling with a multiple of 4, Up_{scale=8}(·) is bilinear interpolation upsampling with a multiple of 8, Up_{scale=16}(·) is bilinear interpolation upsampling with a multiple of 16, and Conv1(·) is a convolution layer with a convolution kernel size of 1×1 and an output channel number of 1.
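For orientation only, a wiring sketch of the forward pass described in this claim is given below; it reuses the hypothetical module classes from the earlier sketches (EdgePerception, EdgeEnhancement, HighOrderSpatialInteraction, ContextAggregation, PredictionHeads), adds a simplified stand-in for the edge feature fusion module of step B33, and uses illustrative channel widths, so it is not the patent's implementation.

import torch
import torch.nn as nn

class EdgeFeatureFusion(nn.Module):
    # Minimal stand-in for step B33: gate the stage-1 feature with the edge
    # mask, fuse with the channel-reduced edge features, add residually.
    def __init__(self, ch, edge_ch=128):
        super().__init__()
        self.reduce_e = nn.Sequential(nn.Conv2d(edge_ch, ch, 3, padding=1, bias=False),
                                      nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
        self.fuse = nn.Sequential(nn.Conv2d(2 * ch, ch, 3, padding=1, bias=False),
                                  nn.BatchNorm2d(ch), nn.ReLU(inplace=True))

    def forward(self, f1, f_e, m_e):
        f_m = f1 * torch.sigmoid(m_e) + f1
        return self.fuse(torch.cat([f_m, self.reduce_e(f_e)], dim=1)) + f1

class CODNet(nn.Module):
    def __init__(self, backbone, ch=64):
        super().__init__()
        self.backbone = backbone                       # yields F_1 .. F_4
        self.epm = EdgePerception(c1=256, c4=2048, ch=ch)
        self.reduce = nn.ModuleList(                   # bring stages to `ch`
            [nn.Conv2d(c, ch, 1) for c in (256, 512, 1024, 2048)])
        self.eem = nn.ModuleList([EdgeEnhancement(ch) for _ in range(3)])
        self.effm = EdgeFeatureFusion(ch, edge_ch=2 * ch)
        self.hsim = nn.ModuleList(
            [HighOrderSpatialInteraction(ch, order=3, out_ch=ch)
             for _ in range(4)])
        self.cam = nn.ModuleList([ContextAggregation(ch) for _ in range(3)])
        self.heads = PredictionHeads(ch)

    def forward(self, x):
        f1, f2, f3, f4 = self.backbone(x)
        f_e, m_e = self.epm(f1, f4)                    # edge features + mask
        r1, r2, r3, r4 = [conv(f) for conv, f in
                          zip(self.reduce, (f1, f2, f3, f4))]
        e4 = self.eem[0](r4, m_e)                      # EEM_1 on stage 4
        e3 = self.eem[1](r3, m_e)                      # EEM_2 on stage 3
        e2 = self.eem[2](r2, m_e)                      # EEM_3 on stage 2
        ef = self.effm(r1, f_e, m_e)                   # fused edge feature
        h1, h2, h3, h4 = (self.hsim[i](f) for i, f in
                          enumerate((e4, e3, e2, ef)))
        c1 = self.cam[0](h2, h1)                       # CAM_1: 1/16 scale
        c2 = self.cam[1](h3, c1)                       # CAM_2: 1/8 scale
        c3 = self.cam[2](h4, c2)                       # CAM_3: 1/4 scale
        return self.heads(m_e, c1, c2, c3)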
9. The method for detecting the camouflage target based on the edge feature fusion and the higher-order spatial interaction according to claim 1, wherein the specific implementation step of the step C is as follows:
step C, designing a loss function as a constraint to optimize the camouflage target detection network based on edge feature fusion and high-order space interaction, the specific formula being as follows:
L = Σ_{k=1}^{3} ( L_wbce(M_camo^k, G_camo) + L_wiou(M_camo^k, G_camo) ) + λ·L_dice(M_edge, G_edge)
wherein G_camo represents the label image corresponding to the original image I, G_edge represents the edge label image corresponding to the original image I, L is the total loss function, L_wbce represents the weighted binary cross entropy loss, L_wiou represents the weighted intersection-over-union (IoU) loss, L_dice represents the Dice coefficient loss, and λ represents the weight of the loss.
10. The method for detecting the camouflage target based on the edge feature fusion and the higher-order spatial interaction according to claim 1, wherein the specific implementation step of the step D is as follows:
step D1, randomly dividing the training data set obtained in step A into a plurality of batches, each batch containing N pairs of images;
step D2, inputting an original image I; after the original image I passes through the camouflage target detection network based on edge feature fusion and high-order space interaction in step B, the edge mask M_edge and the camouflage target masks M_camo^1, M_camo^2 and M_camo^3 are obtained, and the loss L is calculated using the formula in step C;
step D3, calculating the gradients of the parameters in the network by the back propagation method according to the loss, and updating the network parameters using the Adam optimization method;
step D4, repeating steps D1 to D3 batch by batch until the target loss function value of the network converges to Nash equilibrium, and saving the network parameters to obtain the camouflage target detection model based on edge feature fusion and high-order space interaction; for a tested camouflage target image, the highest-resolution one of the three camouflage target masks predicted by the model, M_camo^3, is taken as the final camouflage target mask.
CN202310356445.2A 2023-04-06 2023-04-06 Camouflage target detection method based on edge feature fusion and high-order space interaction Pending CN116310693A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310356445.2A CN116310693A (en) 2023-04-06 2023-04-06 Camouflage target detection method based on edge feature fusion and high-order space interaction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310356445.2A CN116310693A (en) 2023-04-06 2023-04-06 Camouflage target detection method based on edge feature fusion and high-order space interaction

Publications (1)

Publication Number Publication Date
CN116310693A true CN116310693A (en) 2023-06-23

Family

ID=86824077

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310356445.2A Pending CN116310693A (en) 2023-04-06 2023-04-06 Camouflage target detection method based on edge feature fusion and high-order space interaction

Country Status (1)

Country Link
CN (1) CN116310693A (en)


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116563313A (en) * 2023-07-11 2023-08-08 安徽大学 Remote sensing image soybean planting region segmentation method based on gating and attention fusion
CN116563313B (en) * 2023-07-11 2023-09-19 安徽大学 Remote sensing image soybean planting region segmentation method based on gating and attention fusion
CN117095180A (en) * 2023-09-01 2023-11-21 武汉互创联合科技有限公司 Embryo development stage prediction and quality assessment method based on stage identification
CN117095180B (en) * 2023-09-01 2024-04-19 武汉互创联合科技有限公司 Embryo development stage prediction and quality assessment method based on stage identification
CN117593517A (en) * 2024-01-19 2024-02-23 南京信息工程大学 Camouflage target detection method based on complementary perception cross-view fusion network
CN117593517B (en) * 2024-01-19 2024-04-16 南京信息工程大学 Camouflage target detection method based on complementary perception cross-view fusion network


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination