CN114581752A - Camouflage target detection method based on context sensing and boundary refining - Google Patents

Camouflage target detection method based on context sensing and boundary refining Download PDF

Info

Publication number
CN114581752A
Authority
CN
China
Prior art keywords
module
target
disguised
features
global
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210495815.6A
Other languages
Chinese (zh)
Other versions
CN114581752B (en)
Inventor
史彩娟
任弼娟
陈厚儒
段昌钰
闫晓东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China University of Science and Technology
Original Assignee
North China University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China University of Science and Technology filed Critical North China University of Science and Technology
Priority to CN202210495815.6A priority Critical patent/CN114581752B/en
Publication of CN114581752A publication Critical patent/CN114581752A/en
Application granted granted Critical
Publication of CN114581752B publication Critical patent/CN114581752B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a camouflaged target detection method based on context awareness and boundary refinement. The detection model comprises a backbone network, a GCIE module, an AINF module and a BR module. The backbone network extracts multi-scale features from an image to be detected that contains a camouflaged target. The GCIE module enhances the third-, fourth- and fifth-layer features among the multi-scale features extracted by the backbone network so as to fully perceive global context information. The AINF module adopts a hierarchical structure and AFF components to fuse the features of adjacent layers, perceiving global and local information simultaneously to obtain the regional features and a coarse prediction map of the camouflaged target. The BR module refines the boundary using the spatial information in the low-level features and suppresses non-camouflage factors, so that camouflaged targets with rich boundaries can be detected better. Because the invention attends to context information and boundary information simultaneously, it can perceive camouflaged targets comprehensively, which helps improve detection performance and broadens the applicable scenarios of the invention.

Description

Camouflage target detection method based on context sensing and boundary refining
Technical Field
The invention relates to camouflaged target detection in images, belongs to the technical field of data identification, and particularly relates to a camouflaged target detection method based on context awareness and boundary refinement.
Background
In pattern recognition, camouflaged targets typically embed themselves into the surrounding environment, exploiting structural and physiological features or artificial techniques to achieve self-protection. Because a camouflaged target is highly similar to its background in visual features such as color and texture, ordinary object detection algorithms and salient object detection algorithms cannot detect it, and existing camouflaged target detection algorithms have the following shortcomings:
1. Global context information is perceived insufficiently, which limits detection performance, particularly for large camouflaged targets and occluded camouflaged targets. Existing methods rely on receptive field blocks as their only means of acquiring context information, but the receptive fields in existing camouflaged target detection methods can neither cover an entire high-resolution image nor adequately perceive interactions between different positions in the image, so global context information cannot be perceived fully and comprehensively.
2. Global and local context information are not perceived sufficiently at the same time, which limits detection performance, particularly for multiple small camouflaged targets. Existing camouflaged target detection methods mainly locate the target by perceiving rich global context information; little work fully perceives local context information at the same time.
3. Sufficient refinement of boundary information is neglected, which limits detection performance, particularly for camouflaged targets with rich boundaries.
In summary, the capability of existing camouflaged target detection algorithms remains to be improved.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provide a camouflaged target detection method based on context awareness and boundary refinement that improves camouflaged target detection capability.
The object of the invention is achieved by the following technical scheme:
A camouflaged target detection method based on context awareness and boundary refinement comprises: inputting an image to be detected that contains a camouflaged target into a constructed and trained camouflaged target detection model, and detecting the camouflaged target; the camouflaged target detection model comprises a backbone network, a GCIE module, an AINF module and a BR module.
the multi-scale features of the image to be detected containing the disguised target extracted by the backbone network comprise five layers of features
Figure 612306DEST_PATH_IMAGE001
Third-layer, fourth-layer and fifth-layer features in multi-scale features extracted by GCIE module for backbone network
Figure 904747DEST_PATH_IMAGE002
Enhancing and fully sensing global context information, and outputting enhanced features to an AINF module;
the AINF module adopts a hierarchical structure and an AFF component to fuse the characteristics of adjacent layers, and simultaneously senses global and local information to obtain a rough prediction graph of the regional characteristics and the camouflage target;
second-tier features of regional and backbone extraction
Figure 413044DEST_PATH_IMAGE003
And inputting the boundary into a BR module, refining the boundary by utilizing the spatial information in the low-layer characteristics and inhibiting non-disguising factors, refining the boundary of the disguised target and obtaining a fine prediction map of the disguised target.
Further, constructing and training the camouflaged target detection model comprises:
S11, dividing a pre-collected image dataset containing camouflaged targets into a training set and a test set;
S12, constructing the camouflaged target detection model;
S13, training the constructed camouflaged target detection model with the training set;
S14, testing the trained camouflaged target detection model with the test set.
Further, the GCIE module comprises a GC sub-module and a PMMC sub-module, and is used for enlarging the receptive field to fully perceive global context information and to enhance the third-, fourth- and fifth-layer features $\{f_i\}_{i=3}^{5}$ of the backbone network. The GC sub-module first obtains global context features from the backbone features, then obtains transformed features from the global context features, and finally adds the transformed features to the backbone features to obtain the enhanced features $f_i^{gc}$, i.e. the output of the GC sub-module. The PMMC sub-module first reduces the number of channels of the enhanced features $f_i^{gc}$, then inputs them into three parallel mixed convolution branches, splices and convolves the results of the three branches together with the channel-reduced features, and finally obtains the globally enhanced features $f_i^{g}$ through a skip connection and a ReLU operation, i.e. the GCIE module output.
Further, the AINF module first adopts the feature fusion component AFF to fuse the globally enhanced features $(f_3^{g}, f_4^{g})$ and $(f_4^{g}, f_5^{g})$ respectively, obtaining $f_{34}$ and $f_{45}$, and then uses the AFF component again to fuse $f_{34}$ and $f_{45}$, obtaining $f_{345}$. $f_{34}$ and $f_{45}$ are spliced and convolved, then spliced with $f_{345}$; the spliced features are convolved to obtain the regional features $f_r$ of the camouflaged target and the coarse prediction map $P_c$, i.e. the AINF module output. The coarse prediction map $P_c$ of the camouflaged target is used to construct a loss against the detection label, and the regional features $f_r$ are input to the BR module.
Further, the BR module refines the camouflaged target boundary using the spatial information in the second-layer feature $f_2$ of the backbone network. First, the second-layer feature $f_2$ is denoised to obtain $f_2'$, and $f_2'$ is added to the regional features $f_r$ to obtain the fused features $f_b$. Then $f_b$ is input to the MSCA component and the SA component in turn to compute attention coefficients, which are added back to $f_b$ through a skip connection to obtain the weighted features $f_w$. Next, $f_w$ is multiplied with $f_2'$ to enhance the camouflaged target boundary information contained in the denoised second-layer feature $f_2'$, and the multiplied features are added to the regional features $f_r$ to obtain the fine features $f_e$. Finally, $f_e$ is convolved to obtain the final fine prediction map $P_f$ of the camouflaged target, and $P_f$ is used to construct a loss against the detection label.
Further, the loss function used to train the constructed camouflaged target detection model on the training set adopts the pixel position-aware loss $L_{ppa}$, and the total loss function $L$ of the camouflaged target detection model is:

$$L = L_c + L_f$$
$$L_c = L_{ppa}\big(P_c^{\uparrow 8}, G\big), \qquad L_f = L_{ppa}\big(P_f^{\uparrow 8}, G\big)$$
$$L_{ppa} = L_{wbce} + L_{wiou}$$

where $L_c$ and $L_f$ respectively denote the supervision of the camouflaged target after the AINF module and the BR module, $L_{wbce}$ and $L_{wiou}$ respectively denote the weighted binary cross-entropy loss and the weighted intersection-over-union loss, $P_c^{\uparrow 8}$ and $P_f^{\uparrow 8}$ denote the camouflaged target prediction maps obtained by 8× upsampling the coarse prediction map $P_c$ predicted by the AINF module and the fine prediction map $P_f$ predicted by the BR module respectively, and $G$ denotes the binary label map of the camouflaged target.
Furthermore, the MSCA component comprises two branches: one branch acquires global information using a global average pooling layer and two point-wise convolution layers, while the other acquires local information using only the two point-wise convolution layers; finally, the global and local information are fused by addition and passed through a sigmoid activation function to obtain the multi-scale channel attention coefficients.
Further, for the channel-refined features processed by the MSCA component, the SA component applies global max pooling and global average pooling along the channel axis to obtain the Max feature and the Avg feature, splices them to generate a channel feature descriptor, and then generates the spatial attention coefficients using a 3×3 convolution and a sigmoid activation function.
Compared with the prior art, the invention has the following advantages:
the invention improves the detection performance of the disguised target by utilizing the deep learning technology. According to the invention, global context information is fully sensed through feature enhancement, and the detection performance of a large disguised target and a shielded disguised target is improved; global and local context information is sensed simultaneously through feature fusion, and the detection performance of a plurality of small disguised targets is improved; the detection performance of the disguised target with rich boundaries is improved by refining the boundaries of the disguised target by utilizing the spatial information in the bottom layer characteristics, the detection capability of the disguised target is improved by the characteristics, and the use scene of the method is expanded; the invention is a detection model obtained by training on a large-scale data set, and has better robustness and universality.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is an analysis diagram of the present invention showing how context awareness and boundary refinement improve camouflaged target detection accuracy;
FIG. 2 is a block diagram of a disguised object detection model of the present invention;
FIG. 3 is a block diagram of the GCIE module of the present invention;
FIG. 4 is a block diagram of an AINF module of the present invention;
FIG. 5 is a block diagram of the AFF component and the MSCA component of the present invention;
FIG. 6 is a block diagram of a BR module of the present invention;
FIG. 7 is a structural diagram of the SA component of the present invention.
Detailed Description
The invention is further illustrated by the following figures and examples.
Referring to FIGS. 2-6, a camouflaged target detection method based on context awareness and boundary refinement comprises: inputting an image to be detected that contains a camouflaged target into a constructed and trained camouflaged target detection model, and detecting the camouflaged target. The camouflaged target detection model comprises a backbone network, a GCIE module, an AINF module and a BR module. The multi-scale features that the backbone network extracts from the image to be detected comprise five layers of features $\{f_i\}_{i=1}^{5}$. The GCIE module enhances the third-, fourth- and fifth-layer features $\{f_i\}_{i=3}^{5}$ among the multi-scale features extracted by the backbone network to fully perceive global context information, and outputs the enhanced features to the AINF module. The AINF module adopts a hierarchical structure and AFF components to fuse the features of adjacent layers, perceiving global and local information simultaneously to obtain the regional features and a coarse prediction map of the camouflaged target. The regional features and the second-layer feature $f_2$ extracted by the backbone network are input to the BR module, which refines the boundary using the spatial information in the low-level feature and suppresses non-camouflage factors, refining the camouflaged target boundary, obtaining a fine prediction map of the camouflaged target, and completing detection. A sketch of how the four parts could be wired together follows.
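The following is a minimal PyTorch sketch of this overall flow. The class name `CamouflagedDetector`, the argument names, and the module interfaces are illustrative assumptions rather than the patent's reference implementation; the internals of each module are sketched later in this section.

```python
# Hypothetical wiring of the detection model; module internals are
# sketched further below. Interfaces are illustrative assumptions.
import torch
import torch.nn as nn

class CamouflagedDetector(nn.Module):
    def __init__(self, backbone, gcie3, gcie4, gcie5, ainf, br):
        super().__init__()
        self.backbone = backbone          # yields five feature levels f1..f5
        self.gcie3, self.gcie4, self.gcie5 = gcie3, gcie4, gcie5
        self.ainf = ainf                  # fuses f3g..f5g -> (f_r, P_c)
        self.br = br                      # refines with f2 -> P_f

    def forward(self, x):
        f1, f2, f3, f4, f5 = self.backbone(x)
        # GCIE: enhance the three deepest levels with global context
        f3g, f4g, f5g = self.gcie3(f3), self.gcie4(f4), self.gcie5(f5)
        # AINF: hierarchical AFF fusion -> regional features + coarse map
        f_r, p_coarse = self.ainf(f3g, f4g, f5g)
        # BR: boundary refinement with the low-level feature f2
        p_fine = self.br(f2, f_r)
        return p_coarse, p_fine           # both supervised during training
```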
In this embodiment, constructing and training the camouflaged target detection model comprises:
S11, dividing a pre-collected image dataset containing camouflaged targets into a training set and a test set;
S12, constructing the camouflaged target detection model;
S13, training the constructed camouflaged target detection model with the training set. In this embodiment, the loss function used for training adopts the pixel position-aware loss $L_{ppa}$, and the total loss function $L$ of the camouflaged target detection model is:

$$L = L_c + L_f$$
$$L_c = L_{ppa}\big(P_c^{\uparrow 8}, G\big), \qquad L_f = L_{ppa}\big(P_f^{\uparrow 8}, G\big)$$
$$L_{ppa} = L_{wbce} + L_{wiou}$$

where $L_c$ and $L_f$ respectively denote the supervision of the camouflaged target after the AINF module and the BR module, $L_{wbce}$ and $L_{wiou}$ respectively denote the weighted binary cross-entropy loss and the weighted intersection-over-union loss, $P_c^{\uparrow 8}$ and $P_f^{\uparrow 8}$ denote the camouflaged target prediction maps obtained by 8× upsampling the coarse prediction map $P_c$ predicted by the AINF module and the fine prediction map $P_f$ predicted by the BR module respectively, and $G$ denotes the binary label map of the camouflaged target;
S14, testing the trained camouflaged target detection model with the test set. A sketch of the loss computation appears after this list.
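Below is a minimal sketch of the pixel position-aware loss. The weighting scheme shown (a 31×31 local-average window and a boundary weight amplitude of 5) follows the formulation in which this loss is commonly implemented and is an assumption; the patent does not state these hyperparameters.

```python
# Pixel position-aware loss: weighted BCE + weighted IoU on logits.
# Window size 31 and weight factor 5 are assumed defaults.
import torch
import torch.nn.functional as F

def ppa_loss(pred_logits, mask):
    # Pixels whose mask value disagrees with the local mean (i.e. pixels
    # near the boundary) receive larger weights.
    weit = 1 + 5 * torch.abs(
        F.avg_pool2d(mask, kernel_size=31, stride=1, padding=15) - mask)

    wbce = F.binary_cross_entropy_with_logits(
        pred_logits, mask, reduction='none')
    wbce = (weit * wbce).sum(dim=(2, 3)) / weit.sum(dim=(2, 3))

    pred = torch.sigmoid(pred_logits)
    inter = (pred * mask * weit).sum(dim=(2, 3))
    union = ((pred + mask) * weit).sum(dim=(2, 3))
    wiou = 1 - (inter + 1) / (union - inter + 1)
    return (wbce + wiou).mean()

def total_loss(p_coarse, p_fine, gt):
    # Both prediction maps are upsampled 8x to the label resolution.
    def up(p):
        return F.interpolate(p, scale_factor=8, mode='bilinear',
                             align_corners=False)
    return ppa_loss(up(p_coarse), gt) + ppa_loss(up(p_fine), gt)
```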
In this embodiment, the multi-scale features extracted by the backbone network from the image to be detected containing the camouflaged target comprise five layers of features $\{f_i\}_{i=1}^{5}$. A backbone network trained on the ImageNet dataset generally has detection and segmentation capability; the most common networks such as VGG and ResNet may be adopted, and the backbone is not specifically limited here. The sketch below shows one possible five-level extraction.
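The following is a minimal sketch assuming a torchvision ResNet-50 with the five levels taken at the conventional stage boundaries; the class name `ResNetBackbone` and the level assignment are illustrative choices.

```python
# Five-level feature extraction from a ResNet-50 backbone (assumed choice;
# requires a torchvision version that accepts the `weights` argument).
import torch
import torch.nn as nn
from torchvision.models import resnet50

class ResNetBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        net = resnet50(weights='IMAGENET1K_V1')  # ImageNet-pretrained
        self.stem = nn.Sequential(net.conv1, net.bn1, net.relu)  # f1, 1/2
        self.pool = net.maxpool
        self.layer1 = net.layer1   # f2, 1/4  (low-level, used by BR)
        self.layer2 = net.layer2   # f3, 1/8
        self.layer3 = net.layer3   # f4, 1/16
        self.layer4 = net.layer4   # f5, 1/32

    def forward(self, x):
        f1 = self.stem(x)
        f2 = self.layer1(self.pool(f1))
        f3 = self.layer2(f2)
        f4 = self.layer3(f3)
        f5 = self.layer4(f4)
        return f1, f2, f3, f4, f5
```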
In this embodiment, the structure of the GCIE module is shown in FIG. 3. The GCIE module comprises a GC sub-module and a PMMC sub-module, enhances the third-, fourth- and fifth-layer features $\{f_i\}_{i=3}^{5}$ among the multi-scale features extracted by the backbone network to fully perceive global context information, and outputs the enhanced features to the AINF module. The GC sub-module first applies a 1×1 convolution and a softmax to the backbone features and multiplies the result with the backbone features to obtain the global context features; the global context features then pass through a 1×1 convolution, layer normalization, a ReLU operation and another 1×1 convolution to obtain the transformed features; finally, the transformed features are added to the backbone features to obtain the enhanced features $f_i^{gc}$, i.e. the output of the GC sub-module. The PMMC sub-module first reduces the number of channels of the enhanced features $f_i^{gc}$ using a 1×1 convolution and then inputs them into three parallel mixed convolution branches with dilation rates $rate = 3, 5, 7$; the results of the three branches are spliced with the channel-reduced features and passed through a 3×3 convolution; finally, a skip connection and a ReLU operation yield the globally enhanced features $f_i^{g}$, i.e. the GCIE module output. The third-, fourth- and fifth-layer features $\{f_i\}_{i=3}^{5}$ extracted by the backbone network serve as the input of the GCIE module; because the GCIE module uses a GC sub-module that captures long-range dependencies and a PMMC sub-module that simulates the receptive-field mechanism of human vision, the model can comprehensively perceive global context information and at the same time gains stronger robustness. A sketch of the two sub-modules follows.
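What follows is a minimal sketch of the two sub-modules under stated assumptions: the bottleneck ratio `r` of the GC transform, the channel counts, and the interpretation of each "mixed convolution" branch as a plain 3×3 convolution followed by a dilated 3×3 convolution are all illustrative choices the patent does not specify.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCBlock(nn.Module):
    """GC sub-module: softmax attention pooling + transform + residual."""
    def __init__(self, c, r=4):
        super().__init__()
        self.attn = nn.Conv2d(c, 1, kernel_size=1)
        self.transform = nn.Sequential(
            nn.Conv2d(c, c // r, kernel_size=1),
            nn.LayerNorm([c // r, 1, 1]),
            nn.ReLU(inplace=True),
            nn.Conv2d(c // r, c, kernel_size=1))

    def forward(self, x):
        b, c, h, w = x.shape
        # 1x1 conv + softmax over all positions -> attention weights
        w_attn = self.attn(x).view(b, 1, h * w).softmax(dim=-1)    # B,1,HW
        ctx = torch.bmm(x.view(b, c, h * w), w_attn.transpose(1, 2))
        ctx = ctx.view(b, c, 1, 1)                                 # B,C,1,1
        return x + self.transform(ctx)       # enhanced features f_gc

class PMMC(nn.Module):
    """PMMC sub-module: parallel mixed branches with dilations 3, 5, 7."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.reduce = nn.Conv2d(c_in, c_out, kernel_size=1)
        self.branches = nn.ModuleList([
            nn.Sequential(                   # assumed form of a mixed branch
                nn.Conv2d(c_out, c_out, 3, padding=1),
                nn.Conv2d(c_out, c_out, 3, padding=r, dilation=r))
            for r in (3, 5, 7)])
        self.fuse = nn.Conv2d(4 * c_out, c_out, kernel_size=3, padding=1)

    def forward(self, x):
        x = self.reduce(x)
        y = torch.cat([x] + [b(x) for b in self.branches], dim=1)
        return F.relu(self.fuse(y) + x)      # skip connection + ReLU -> f_g
```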
In this embodiment, the structure of the AINF module is shown in FIG. 4. The AINF module adopts a hierarchical structure and AFF components to fuse the features of adjacent layers, perceiving global and local information simultaneously to obtain the regional features and a coarse prediction map of the camouflaged target. It first adopts the feature fusion component AFF to fuse the globally enhanced features $(f_3^{g}, f_4^{g})$ and $(f_4^{g}, f_5^{g})$ respectively, obtaining $f_{34}$ and $f_{45}$, and then uses the AFF component again to fuse $f_{34}$ and $f_{45}$, obtaining $f_{345}$. $f_{34}$ and $f_{45}$ are spliced and passed through a 3×3 convolution, then spliced with $f_{345}$; the spliced features pass through a 3×3 convolution and a 1×1 convolution to obtain the regional features $f_r$ of the camouflaged target and the coarse prediction map $P_c$, i.e. the AINF module output. The coarse prediction map $P_c$ of the camouflaged target is used to construct a loss against the detection label, and the regional features $f_r$ are input to the BR module.
The AFF component fuses features according to:

$$f_{aff} = \delta\big(B_{3\times3}\big(M(f_l + f_h) \otimes f_l + (1 - M(f_l + f_h)) \otimes f_h\big)\big)$$

where $f_{aff}$ denotes the fused features obtained using the AFF component, $\delta$ denotes the ReLU function, $B_{3\times3}$ denotes a 3×3 convolution followed by batch normalization, $M(\cdot)$ denotes the multi-scale channel attention coefficients obtained using the MSCA component, $f_l$ and $f_h$ respectively denote the input low-level and high-level features, and $\otimes$ denotes element-wise multiplication.
The MSCA component comprises two branches: one branch acquires global information using a global average pooling layer and two point-wise convolution layers, while the other acquires local information using only the two point-wise convolution layers; finally, the global and local information are fused by addition and passed through a sigmoid activation function to obtain the multi-scale channel attention coefficients. A sketch of the MSCA and AFF components follows.
In this embodiment, the structure of the BR module is shown in FIG. 6. The regional features $f_r$ and the second-layer feature $f_2$ extracted by the backbone network are input to the BR module, which refines the boundary using the spatial information in the low-level feature and suppresses non-camouflage factors, refining the camouflaged target boundary and obtaining a fine prediction map of the camouflaged target. First, the second-layer feature $f_2$ of the backbone network is denoised to obtain $f_2'$, and $f_2'$ is added to the regional features $f_r$ to obtain the fused features $f_b$. Then $f_b$ is input to the MSCA component and the SA component in turn to compute attention coefficients, which are added back to $f_b$ through a skip connection to obtain the weighted features $f_w$. Next, $f_w$ is multiplied with $f_2'$ to enhance the camouflaged target boundary information contained in the denoised second-layer feature $f_2'$, and the multiplied features are added to the regional features $f_r$ to obtain the fine features $f_e$. Finally, $f_e$ passes through a 3×3 convolution and a 1×1 convolution to obtain the final fine prediction map $P_f$ of the camouflaged target, and $P_f$ is used to construct a loss against the detection label.
For the channel-refined features processed by the MSCA component, the SA component applies global max pooling and global average pooling along the channel axis to obtain the Max feature and the Avg feature, splices them to generate a channel feature descriptor, and then generates the spatial attention coefficients using a 3×3 convolution and a sigmoid activation function.
The BR module supplements and refines the regional features with the abundant spatial information contained in the low-level features, while the multi-scale channel attention and spatial attention it adopts effectively suppress interference from non-camouflage factors. A sketch of the SA component follows.
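Below is a minimal sketch of the SA component as described; note the 3×3 convolution, where CBAM-style spatial attention often uses 7×7.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """SA component: channel-axis max/avg pooling -> 3x3 conv -> sigmoid."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=3, padding=1)

    def forward(self, x):
        max_feat, _ = x.max(dim=1, keepdim=True)   # max pool along channels
        avg_feat = x.mean(dim=1, keepdim=True)     # avg pool along channels
        desc = torch.cat([max_feat, avg_feat], dim=1)  # channel descriptor
        return torch.sigmoid(self.conv(desc))      # spatial coefficients
```

In the BR flow, these coefficients (together with the MSCA coefficients) weight the fused features $f_b$, and the skip connection then adds $f_b$ back to form the weighted features $f_w$.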
FIG. 1(a) is an input image containing a camouflaged target, FIG. 1(b) is the initial prediction map output by the backbone network, FIG. 1(c) is the coarse prediction map obtained using context awareness, FIG. 1(d) is the fine prediction map obtained after boundary refinement, and FIG. 1(e) is the binary label map of the input image. In the image, the target (a sea dragon) is similar in shape and texture to the seaweed in the background, and its color is similar to the background color, so it can be regarded as a camouflaged target. As can be seen from the figure, the initial prediction map (FIG. 1(b)) obtained after the backbone network detects the input image is incomplete and has unclear edges, so the camouflaged target detection capability needs improvement; the accuracy of the position and contour of the camouflaged target in the coarse prediction map (FIG. 1(c)) obtained using context awareness is greatly improved, but the edge details are still not fine enough; and the boundary of the camouflaged target in the fine prediction map (FIG. 1(d)) obtained after boundary refinement is further clarified. The idea of context awareness and boundary refinement can not only accurately highlight the position and main contour of the camouflaged target, but also obtain a fine target boundary. Therefore, the camouflaged target detection method based on context awareness and boundary refinement provided by the invention improves the overall detection capability and accuracy for camouflaged targets, so that it can be applied to more practical data detection and recognition scenarios involving camouflage (such as the military, medical, biological, agricultural and traffic fields) and improve the working efficiency of the relevant personnel. The method mainly applies deep learning, adding a context awareness module and a boundary refinement module to the neural network and combining context information (global and local context information) with boundary information to effectively separate the camouflaged target from an environment with a complex background.
The above-described embodiments are preferred embodiments of the present invention, but the present invention is not limited thereto; any other modification or equivalent substitution that does not depart from the technical spirit of the present invention is included within the scope of the present invention.

Claims (8)

1. A camouflaged target detection method based on context awareness and boundary refinement, comprising: inputting an image to be detected that contains a camouflaged target into a constructed and trained camouflaged target detection model, and detecting the camouflaged target; characterized in that the camouflaged target detection model comprises a backbone network, a GCIE module, an AINF module and a BR module;
the multi-scale features that the backbone network extracts from the image to be detected containing the camouflaged target comprise five layers of features $\{f_i\}_{i=1}^{5}$;
the GCIE module enhances the third-, fourth- and fifth-layer features $\{f_i\}_{i=3}^{5}$ among the multi-scale features extracted by the backbone network to fully perceive global context information, and outputs the enhanced features to the AINF module;
the AINF module adopts a hierarchical structure and AFF components to fuse the features of adjacent layers, perceiving global and local information simultaneously to obtain the regional features and a coarse prediction map of the camouflaged target;
the regional features and the second-layer feature $f_2$ extracted by the backbone network are input to the BR module, which refines the boundary using the spatial information in the low-level feature and suppresses non-camouflage factors, refining the camouflaged target boundary and obtaining a fine prediction map of the camouflaged target.
2. The camouflaged target detection method based on context awareness and boundary refinement according to claim 1, wherein constructing and training the camouflaged target detection model comprises:
S11, dividing a pre-collected image dataset containing camouflaged targets into a training set and a test set;
S12, constructing the camouflaged target detection model;
S13, training the constructed camouflaged target detection model with the training set;
S14, testing the trained camouflaged target detection model with the test set.
3. The camouflaged target detection method based on context awareness and boundary refinement according to claim 1, wherein the GCIE module comprises a GC sub-module and a PMMC sub-module, and is used for enlarging the receptive field to fully perceive global context information and to enhance the third-, fourth- and fifth-layer features $\{f_i\}_{i=3}^{5}$ of the backbone network; the GC sub-module first obtains global context features from the backbone features, then obtains transformed features from the global context features, and finally adds the transformed features to the backbone features to obtain the enhanced features $f_i^{gc}$, i.e. the output of the GC sub-module; the PMMC sub-module first reduces the number of channels of the enhanced features $f_i^{gc}$, then inputs them into three parallel mixed convolution branches, splices and convolves the results of the three branches together with the channel-reduced features, and finally obtains the globally enhanced features $f_i^{g}$ through a skip connection and a ReLU operation, i.e. the GCIE module output.
4. The camouflaged target detection method based on context awareness and boundary refinement according to claim 1, wherein the AINF module first adopts the feature fusion component AFF to fuse the globally enhanced features $(f_3^{g}, f_4^{g})$ and $(f_4^{g}, f_5^{g})$ respectively, obtaining $f_{34}$ and $f_{45}$, and then uses the AFF component again to fuse $f_{34}$ and $f_{45}$, obtaining $f_{345}$; $f_{34}$ and $f_{45}$ are spliced and convolved, then spliced with $f_{345}$, and the spliced features are convolved to obtain the regional features $f_r$ of the camouflaged target and the coarse prediction map $P_c$, i.e. the AINF module output; the coarse prediction map $P_c$ of the camouflaged target is used to construct a loss against the detection label, and the regional features $f_r$ are input to the BR module.
5. The camouflaged target detection method based on context awareness and boundary refinement according to claim 1, wherein the BR module refines the camouflaged target boundary using the spatial information in the second-layer feature $f_2$ of the backbone network; first, the second-layer feature $f_2$ is denoised to obtain $f_2'$, and $f_2'$ is added to the regional features $f_r$ to obtain the fused features $f_b$; then $f_b$ is input to the MSCA component and the SA component in turn to compute attention coefficients, which are added back to $f_b$ through a skip connection to obtain the weighted features $f_w$; next, $f_w$ is multiplied with $f_2'$ to enhance the camouflaged target boundary information contained in the denoised second-layer feature $f_2'$, and the multiplied features are added to the regional features $f_r$ to obtain the fine features $f_e$; finally, $f_e$ is convolved to obtain the final fine prediction map $P_f$ of the camouflaged target, and $P_f$ is used to construct a loss against the detection label.
6. The camouflaged target detection method based on context awareness and boundary refinement according to claim 2, wherein the loss function used to train the constructed camouflaged target detection model on the training set adopts the pixel position-aware loss $L_{ppa}$, and the total loss function $L$ of the camouflaged target detection model is:

$$L = L_c + L_f$$
$$L_c = L_{ppa}\big(P_c^{\uparrow 8}, G\big), \qquad L_f = L_{ppa}\big(P_f^{\uparrow 8}, G\big)$$
$$L_{ppa} = L_{wbce} + L_{wiou}$$

where $L_c$ and $L_f$ respectively denote the supervision of the camouflaged target after the AINF module and the BR module, $L_{wbce}$ and $L_{wiou}$ respectively denote the weighted binary cross-entropy loss and the weighted intersection-over-union loss, $P_c^{\uparrow 8}$ and $P_f^{\uparrow 8}$ denote the camouflaged target prediction maps obtained by 8× upsampling the coarse prediction map $P_c$ predicted by the AINF module and the fine prediction map $P_f$ predicted by the BR module respectively, and $G$ denotes the binary label map of the camouflaged target.
7. The camouflaged target detection method based on context awareness and boundary refinement according to claim 5, wherein the MSCA component comprises two branches: one branch acquires global information using a global average pooling layer and two point-wise convolution layers, while the other acquires local information using only the two point-wise convolution layers; finally, the global and local information are fused by addition and passed through a sigmoid activation function to obtain the multi-scale channel attention coefficients.
8. The camouflaged target detection method based on context awareness and boundary refinement according to claim 5, wherein, for the channel-refined features processed by the MSCA component, the SA component applies global max pooling and global average pooling along the channel axis to obtain the Max feature and the Avg feature, splices them to generate a channel feature descriptor, and then generates the spatial attention coefficients using a 3×3 convolution and a sigmoid activation function.
CN202210495815.6A 2022-05-09 2022-05-09 Camouflage target detection method based on context awareness and boundary refinement Active CN114581752B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210495815.6A CN114581752B (en) 2022-05-09 2022-05-09 Camouflage target detection method based on context awareness and boundary refinement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210495815.6A CN114581752B (en) 2022-05-09 2022-05-09 Camouflage target detection method based on context awareness and boundary refinement

Publications (2)

Publication Number Publication Date
CN114581752A (en) 2022-06-03
CN114581752B CN114581752B (en) 2022-07-15

Family

ID=81769028

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210495815.6A Active CN114581752B (en) 2022-05-09 2022-05-09 Camouflage target detection method based on context awareness and boundary refinement

Country Status (1)

Country Link
CN (1) CN114581752B (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106778867A (en) * 2016-12-15 2017-05-31 北京旷视科技有限公司 Object detection method and device, neural network training method and device
CN112733744A (en) * 2021-01-14 2021-04-30 北京航空航天大学 Camouflage object detection model based on edge cooperative supervision and multi-level constraint
CN112750140A (en) * 2021-01-21 2021-05-04 大连理工大学 Disguised target image segmentation method based on information mining
CN113139450A (en) * 2021-04-16 2021-07-20 广州大学 Camouflage target detection method based on edge detection
CN113468996A (en) * 2021-06-22 2021-10-01 广州大学 Camouflage object detection method based on edge refinement
CN114220013A (en) * 2021-12-17 2022-03-22 扬州大学 Camouflaged object detection method based on boundary alternating guidance
CN114549567A (en) * 2022-02-23 2022-05-27 大连理工大学 Disguised target image segmentation method based on omnibearing sensing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XIUQI XU et al.: "Boundary guidance network for camouflage object detection", Image and Vision Computing *
YUJIA SUN et al.: "Context-aware Cross-level Fusion Network for Camouflaged Object Detection", arXiv *
HE Linyan et al.: "Research Progress on Camouflaged Object Detection and Segmentation", Software Guide (软件导刊) *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115063373A (en) * 2022-06-24 2022-09-16 山东省人工智能研究院 Social network image tampering positioning method based on multi-scale feature intelligent perception
CN115346094A (en) * 2022-08-25 2022-11-15 杭州电子科技大学 Camouflage target detection method based on main body area guidance
CN115346094B (en) * 2022-08-25 2023-08-22 杭州电子科技大学 Camouflage target detection method based on main body region guidance
CN115376094A (en) * 2022-10-27 2022-11-22 山东聚祥机械股份有限公司 Unmanned sweeper road surface identification method and system based on scale perception neural network
CN116894943A (en) * 2023-07-20 2023-10-17 深圳大学 Double-constraint camouflage target detection method and system
CN116703950A (en) * 2023-08-07 2023-09-05 中南大学 Camouflage target image segmentation method and system based on multi-level feature fusion
CN116703950B (en) * 2023-08-07 2023-10-20 中南大学 Camouflage target image segmentation method and system based on multi-level feature fusion
CN117237645A (en) * 2023-11-15 2023-12-15 中国农业科学院农业资源与农业区划研究所 Training method, device and equipment of semantic segmentation model based on boundary enhancement
CN117237645B (en) * 2023-11-15 2024-02-06 中国农业科学院农业资源与农业区划研究所 Training method, device and equipment of semantic segmentation model based on boundary enhancement

Also Published As

Publication number Publication date
CN114581752B (en) 2022-07-15

Similar Documents

Publication Publication Date Title
CN114581752B (en) Camouflage target detection method based on context awareness and boundary refinement
CN110956094B (en) RGB-D multi-mode fusion personnel detection method based on asymmetric double-flow network
Mou et al. A relation-augmented fully convolutional network for semantic segmentation in aerial scenes
Abdollahi et al. Building footprint extraction from high resolution aerial images using generative adversarial network (GAN) architecture
Peng et al. Detecting heads using feature refine net and cascaded multi-scale architecture
CN109460764B (en) Satellite video ship monitoring method combining brightness characteristics and improved interframe difference method
CN112288008B (en) Mosaic multispectral image disguised target detection method based on deep learning
CN104700381A (en) Infrared and visible light image fusion method based on salient objects
CN113222819B (en) Remote sensing image super-resolution reconstruction method based on deep convolution neural network
CN113408594A (en) Remote sensing scene classification method based on attention network scale feature fusion
CN111582232A (en) SLAM method based on pixel-level semantic information
Zhang et al. Self-attention guidance and multi-scale feature fusion based uav image object detection
Zhan et al. Vegetation land use/land cover extraction from high-resolution satellite images based on adaptive context inference
CN112446357A (en) SAR automatic target recognition method based on capsule network
Sun et al. IRDCLNet: Instance segmentation of ship images based on interference reduction and dynamic contour learning in foggy scenes
CN112149526A (en) Lane line detection method and system based on long-distance information fusion
CN115861756A (en) Earth background small target identification method based on cascade combination network
CN112926667B (en) Method and device for detecting saliency target of depth fusion edge and high-level feature
CN114463624A (en) Method and device for detecting illegal buildings applied to city management supervision
CN112598032A (en) Multi-task defense model construction method for anti-attack of infrared image
CN111008555B (en) Unmanned aerial vehicle image small and weak target enhancement extraction method
Marcu et al. Object contra context: Dual local-global semantic segmentation in aerial images
CN110852172B (en) Method for expanding crowd counting data set based on Cycle Gan picture collage and enhancement
CN113139450A (en) Camouflage target detection method based on edge detection
CN112633158A (en) Power transmission line corridor vehicle identification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant