CN115620049A - Method for detecting camouflaged targets based on polarized image cues and application thereof

Method for detecting camouflaged targets based on polarized image cues and application thereof

Info

Publication number: CN115620049A
Application number: CN202211210090.8A
Authority: CN (China)
Prior art keywords: image, level, layer, polarization, feature map
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 王昕, 丁甲甲, 张钊, 张勇, 周育民, 高隽
Current Assignee: Hefei University of Technology
Original Assignee: Hefei University of Technology
Application filed by Hefei University of Technology

Classifications

    • G06V 10/764: Image or video recognition or understanding using pattern recognition or machine learning; using classification, e.g. of video objects
    • G06N 3/08: Computing arrangements based on biological models; neural networks; learning methods
    • G06V 10/774: Processing image or video features in feature spaces; generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/806: Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level; of extracted features
    • G06V 10/82: Image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 2201/07: Indexing scheme relating to image or video recognition or understanding; target detection

Abstract

The invention discloses a camouflaged target detection method based on polarized image cues and an application thereof. The method comprises the following steps: 1. acquiring a polarized image data set with pixel-level labels; 2. computing an intensity image I, a degree-of-linear-polarization image DoLP and an angle-of-linear-polarization image AoLP from the four original polarization-angle images; 3. performing data enhancement on the intensity, DoLP and AoLP images; 4. constructing a deep convolutional network based on polarized image cues, taking the intensity, DoLP and AoLP images as input, and training the network to obtain a camouflaged target detection model; 5. performing camouflaged target detection on the intensity, DoLP and AoLP images to be detected with the trained model. The invention detects camouflaged targets from polarized image cues and thereby effectively improves the accuracy of camouflaged target detection in complex and changeable environments.

Description

Method for detecting camouflaged targets based on polarized image cues and application thereof
Technical Field
The invention belongs to the field of computer vision and image processing and analysis, and particularly relates to a method for detecting camouflaged targets based on polarized image cues.
Background
The term "camouflage" originally described the ability of animals such as insects to hide themselves in their surroundings in order to avoid pursuit by natural enemies; camouflage reduces the risk of being discovered and increases the chances of survival. For example, a chameleon can change its appearance according to the colors and patterns of its surroundings. Humans have adopted this mechanism and applied it widely: soldiers and military equipment, for instance, enhance camouflage or concealment through dress and coloring. Many civilian applications must also handle scenes in which the target is highly similar to its environment, such as polyp segmentation in medical images and pest identification in agriculture.
In recent years, camouflaged target detection has received increasing attention from researchers, and many well-performing camouflaged target detection models have been proposed. Existing frameworks fall broadly into two categories: traditional camouflaged target detection relying on visual features (color, texture, motion, gradient, etc.) and camouflaged target detection based on deep learning. Traditional models depend heavily on hand-designed visual features; although they have made some progress in fixed scenes, their generalization ability is limited. In other words, once the environment changes significantly, the visual features must be redesigned, the data relabeled and the model redesigned, so traditional camouflaged target detection methods are unsuitable for scenes with complicated backgrounds. Deep-learning-based methods train a camouflaged target detection model on a certain amount of training data and evaluate the trained model on test data. Relying on the strong learning ability of deep neural networks and integrating diverse features, learning-based methods greatly improve detection accuracy compared with traditional methods that rely on visual features.
However, these learning-based approaches still have deficiencies: 1. most deep-learning-based methods take a single RGB image as input for training and testing, learn little additional visual perception knowledge, and struggle to detect camouflaged targets or regions in complex scenes such as highlight, dim light, partial occlusion, low contrast or cluttered background; 2. other deep-learning-based camouflaged target detection methods use RGB-D as input, feeding an RGB image together with a depth map; the additional depth information has been shown to improve camouflaged target detection, but when the depth map quality is poor the detection result deteriorates.
Disclosure of Invention
The invention provides a method for detecting camouflaged targets based on polarized image cues, and an application thereof, to overcome the above deficiencies of the prior art, so that camouflaged targets can be detected effectively from polarized images in complex scenes and the accuracy and efficiency of camouflaged target detection can be improved.
In order to achieve this purpose, the invention adopts the following technical scheme:
The invention discloses a method for detecting camouflaged targets based on polarized image cues, which is characterized by comprising the following steps:
Step 1, acquiring a polarized image data set with pixel-level labels;
Step 1.1, using the division-of-focal-plane polarimeter in a polarization camera, acquiring for the n-th shooting scene a group of original polarization images I_θ^n with polarization directions θ of 0°, 45°, 90° and 135°, thereby obtaining N groups of original polarization images for N scenes, where I_θ^n represents the original polarization image of the n-th scene in polarization direction θ, n ∈ [1, N];
Step 1.2, labeling the N groups of original polarization images to obtain pixel-level label images, in which each pixel takes a value in {0, 1}; the N groups of original polarization images and the corresponding pixel-level label images form the training data set D_tr;
Step 2, computing an intensity image, a degree-of-linear-polarization image and an angle-of-linear-polarization image from a group of original polarization images of the n-th scene;
Step 2.1, computing the Stokes vector S^n = (S_0^n, S_1^n, S_2^n)^T of the original polarization images I_θ^n of the n-th scene using formula (1):

S_0^n = (I_0^n + I_45^n + I_90^n + I_135^n) / 2,   S_1^n = I_0^n - I_90^n,   S_2^n = I_45^n - I_135^n    (1)

In formula (1), S_0^n represents the total light intensity of the objects in the n-th scene, i.e. the intensity image I_n; S_1^n and S_2^n represent the linearly polarized components along the horizontal/vertical and ±45° directions of the n-th scene, respectively;
Step 2.2, computing and imaging the Stokes vector using formula (2) to obtain the degree-of-linear-polarization image DoLP_n of the n-th scene:

DoLP_n = sqrt((S_1^n)^2 + (S_2^n)^2) / S_0^n    (2)

Step 2.3, computing and imaging the Stokes vector using formula (3) to obtain the angle-of-linear-polarization image AoLP_n of the n-th scene:

AoLP_n = (1/2) arctan(S_2^n / S_1^n)    (3)

Step 3, performing data enhancement on the intensity image I_n, the degree-of-linear-polarization image DoLP_n and the angle-of-linear-polarization image AoLP_n of the n-th scene to obtain the enhanced intensity image I_n', degree-of-linear-polarization image DoLP_n' and angle-of-linear-polarization image AoLP_n' of the n-th scene, which together form the n-th polarized image set;
Taking the pixel-level label image of the n-th scene as the real camouflage map of the n-th scene, denoted G_n, and applying the same data enhancement to G_n to obtain the enhanced real camouflage map G_n';
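As a concrete illustration of the computations in step 2, the following NumPy sketch derives the Stokes components, DoLP and AoLP from the four original polarization images; the function and variable names are illustrative and not part of the patent.

```python
import numpy as np

def polarization_cues(i0, i45, i90, i135, eps=1e-8):
    """Derive intensity, DoLP and AoLP from the four polarization-angle images,
    following formulas (1)-(3). Inputs are float arrays of identical shape."""
    i0, i45, i90, i135 = (np.asarray(x, dtype=np.float64) for x in (i0, i45, i90, i135))

    # Formula (1): Stokes components of linear polarization.
    s0 = 0.5 * (i0 + i45 + i90 + i135)   # total intensity I_n
    s1 = i0 - i90                        # horizontal/vertical component
    s2 = i45 - i135                      # +/-45 degree component

    # Formula (2): degree of linear polarization, clipped to [0, 1].
    dolp = np.clip(np.sqrt(s1 ** 2 + s2 ** 2) / (s0 + eps), 0.0, 1.0)

    # Formula (3): angle of linear polarization, in (-pi/2, pi/2].
    aolp = 0.5 * np.arctan2(s2, s1)
    return s0, dolp, aolp
```

The DoLP and AoLP maps can then be rescaled and replicated to three channels to form the grayscale images described in the embodiment below.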
Step 4, constructing a camouflage target detection model based on the polarized image clues, comprising the following steps: the device comprises an encoder, a channel dimension reduction module, a fusion module, an up-sampling module and an output module;
Step 4.1, the encoder consists of three Res2Net50 backbone networks with the same structure; each Res2Net backbone consists of H down-sampling blocks, denoted DSampleBlock_1, ..., DSampleBlock_h, ..., DSampleBlock_H, where DSampleBlock_h denotes the h-th level down-sampling block, h = 1, 2, ..., H; the h-th level down-sampling block DSampleBlock_h is formed by connecting the h-th level X-layer two-dimensional convolution layers Dconv2d_h in series;
The x-th two-dimensional convolution layer Dconv2d_{h,x} of the h-th level consists, in order, of a convolution layer, a batch normalization layer and a ReLU activation layer, where the kernel size of the convolution layer in Dconv2d_{h,x} is k_x × k_x, x = 1, 2, ..., X;
When h = 1, the n-th polarized image set is fed into the encoder, and the three inputs pass through the h-th level down-sampling blocks DSampleBlock_h, i.e. the h-th level X-layer convolution layers Dconv2d_h, of the three Res2Net backbones, which respectively output the h-th level n-th intensity image feature map F_{I,h}^n, the h-th level n-th degree-of-polarization image feature map F_{D,h}^n and the h-th level n-th angle-of-polarization image feature map F_{A,h}^n; x_h, y_h, c_h denote the height, width and number of channels of the feature maps output by the h-th level down-sampling block DSampleBlock_h;
When h = 2, 3, ..., H, the (h-1)-th level n-th intensity image feature map F_{I,h-1}^n, the (h-1)-th level n-th degree-of-polarization image feature map F_{D,h-1}^n and the (h-1)-th level n-th angle-of-polarization image feature map F_{A,h-1}^n are fed into the h-th level down-sampling blocks DSampleBlock_h of the corresponding Res2Net backbones, which produce the corresponding intensity image feature map F_{I,h}^n, degree-of-polarization image feature map F_{D,h}^n and angle-of-polarization image feature map F_{A,h}^n; thus the H-th level two-dimensional convolution layers Dconv2d_H of the three Res2Net backbones finally output the intensity image feature map F_{I,H}^n, the degree-of-polarization image feature map F_{D,H}^n and the angle-of-polarization image feature map F_{A,H}^n; x_{h-1}, y_{h-1}, c_{h-1} denote the height, width and number of channels of the feature map output by the (h-1)-th level down-sampling block DSampleBlock_{h-1}, and x_H, y_H, c_H those of the H-th level down-sampling block DSampleBlock_H;
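A structural sketch of the three-stream encoder of step 4.1 is given below, using the Res2Net-50 implementation from the timm library as a stand-in for the backbone; the class name and the choice of pretrained weights are assumptions.

```python
import timm
import torch.nn as nn

class TriStreamEncoder(nn.Module):
    """Three structurally identical Res2Net50 backbones, one per input modality
    (intensity I, DoLP, AoLP); weights are not shared between the streams."""

    def __init__(self):
        super().__init__()
        def backbone():
            # features_only=True returns the intermediate feature maps of the
            # four residual stages (levels h = 1..4 in the text).
            return timm.create_model('res2net50_26w_4s', pretrained=True,
                                     features_only=True, out_indices=(1, 2, 3, 4))
        self.enc_i, self.enc_d, self.enc_a = backbone(), backbone(), backbone()

    def forward(self, img_i, img_d, img_a):
        # Each call yields a list of per-level feature maps [F_1, F_2, F_3, F_4].
        return self.enc_i(img_i), self.enc_d(img_d), self.enc_a(img_a)
```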
Step 4.2, the channel dimension-reduction module is formed by connecting H levels of two-dimensional convolution layers in series, each level consisting, in order, of a two-dimensional convolution layer with kernel size k'_h × k'_h, a batch normalization layer and a ReLU activation function;
The intensity image feature map F_{I,h}^n, the degree-of-polarization image feature map F_{D,h}^n and the angle-of-polarization image feature map F_{A,h}^n are processed by the channel dimension-reduction module, which outputs the reduced intensity image feature map R_{I,h}^n, degree-of-polarization image feature map R_{D,h}^n and angle-of-polarization image feature map R_{A,h}^n, h = 1, 2, ..., H, where c_N denotes the number of channels of the feature maps after the channel dimension-reduction module;
Step 4.3, the fusion module consists of H fusion blocks, denoted Fusion_1, ..., Fusion_h, ..., Fusion_H, where Fusion_h denotes the h-th level fusion block;
The 1st level fusion block Fusion_1 comprises the 1st level T-layer two-dimensional convolution layers Dconv2d'_1 and the 1st level O-layer convolution layers conv_1; each of the other H-1 fusion blocks comprises T two-dimensional convolution layers, an up-sampling layer and O convolution layers;
The t-th two-dimensional convolution layer Dconv2d'_{h,t} of the h-th level T-layer two-dimensional convolution layers Dconv2d'_h consists, in order, of a convolution layer, a batch normalization layer and a ReLU activation layer, where the kernel size of the t-th convolution layer is k''_t × k''_t; the o-th convolution layer conv_{h,o} of the h-th level O-layer convolution layers conv_h has kernel size k'''_o × k'''_o;
When h = 1, the reduced intensity image feature map R_{I,h}^n, degree-of-polarization image feature map R_{D,h}^n and angle-of-polarization image feature map R_{A,h}^n are jointly fed into the fusion module; after being processed by the h-th level T-layer two-dimensional convolution layers Dconv2d'_h of the h-th level fusion block Fusion_h, they yield the h-th level intensity image feature map F'_{I,h}^n, degree-of-polarization image feature map F'_{D,h}^n and angle-of-polarization image feature map F'_{A,h}^n, which are concatenated along the channel dimension to obtain the h-th level concatenated feature map C_h^n;
The h-th level concatenated feature map C_h^n then passes through the h-th level T-layer two-dimensional convolution layers Dconv2d'_h to give the processed concatenated feature map C'_h^n, and then through the h-th level O-layer convolution layers conv_h to give the dimension-reduced concatenated feature map W_h^n, where c_o is the number of channels output by the convolution layers;
By a splitting operation the dimension-reduced concatenated feature map W_h^n is split into three weight feature maps w_{I,h}^n, w_{D,h}^n and w_{A,h}^n; the intensity image feature map F'_{I,h}^n is multiplied by w_{I,h}^n, the degree-of-polarization image feature map F'_{D,h}^n by w_{D,h}^n and the angle-of-polarization image feature map F'_{A,h}^n by w_{A,h}^n; the three products are added, and the h-th level coarse feature map P_h^n is finally output;
When H =2,3.... H, the H-level intensity image feature map is used
, together with the h-th level degree-of-polarization image feature map and the h-th level angle-of-polarization image feature map, as the input to the h-th level up-sampling layer Usample_h; after their spatial resolution is enlarged by a factor of a, the h-th level up-sampled intensity feature map U_{I,h}^n, up-sampled degree-of-polarization feature map U_{D,h}^n and up-sampled angle-of-polarization feature map U_{A,h}^n are output; x_h, y_h denote the height and width of the feature maps output by the up-sampling layer;
U_{I,h}^n, U_{D,h}^n and U_{A,h}^n are each added to the corresponding intensity, degree-of-polarization and angle-of-polarization feature maps; the sums are fed into the h-th level T-layer two-dimensional convolution layers Dconv2d'_h, which output the h-th level intensity image feature map F'_{I,h}^n, degree-of-polarization image feature map F'_{D,h}^n and angle-of-polarization image feature map F'_{A,h}^n; these are concatenated along the channel dimension to obtain the h-th level concatenated feature map C_h^n;
The h-th level concatenated feature map C_h^n then passes through the h-th level T-layer two-dimensional convolution layers Dconv2d'_h to give the processed concatenated feature map C'_h^n, and then through the h-th level O-layer convolution layers conv_h to give the dimension-reduced concatenated feature map W_h^n, where c_o is the number of channels output by the convolution layers;
By a splitting operation the dimension-reduced concatenated feature map W_h^n is split into three weight feature maps w_{I,h}^n, w_{D,h}^n and w_{A,h}^n; the intensity image feature map F'_{I,h}^n is multiplied by w_{I,h}^n, the degree-of-polarization image feature map F'_{D,h}^n by w_{D,h}^n and the angle-of-polarization image feature map F'_{A,h}^n by w_{A,h}^n; the three products are added, and the h-th level coarse feature map P_h^n is finally output;
Thus the H-th level fusion block outputs the H-th level coarse feature map P_H^n;
Step 4.4, the up-sampling module is composed of H-2 aggregation convolution blocks and an up-sampling block, and each aggregation convolution block comprises: the system comprises an aggregation convolution layer, a batch normalization layer and a Relu activation function layer, wherein the convolution kernel size of the aggregation convolution layer is k x k;
mapping the H-th coarse feature
 map P_H^n into the up-sampling block for a-fold up-sampling, then concatenating it along the channel dimension with the (H-1)-th level coarse feature map P_{H-1}^n output by the (H-1)-th level fusion block Fusion_{H-1} to give the (H-1)-th level concatenated coarse feature map, the (H-1)-th level semi-coarse feature map M_{H-1}^n is obtained after processing by the H-2 aggregation convolution blocks;
The (H-1)-th level semi-coarse feature map M_{H-1}^n is fed into the up-sampling block for a-fold up-sampling and concatenated along the channel dimension with the coarse feature map P_{H-2}^n output by the (H-2)-th level fusion block Fusion_{H-2}, giving the (H-2)-th level concatenated coarse feature map; after processing by the H-2 aggregation convolution blocks, the (H-2)-th level semi-coarse feature map M_{H-2}^n is obtained;
Proceeding in this way, the 1st level semi-coarse feature map M_1^n is obtained after processing by the H-2 aggregation convolution blocks;
The output module consists of a single convolution layer, a two-dimensional convolution with kernel size b × b;
The 1st level semi-coarse feature map M_1^n of the n-th scene is processed by the output module to output the n-th camouflage prediction map pre_n;
Step 5, training the camouflaged target detection model based on polarized image cues;
Based on the data-enhanced intensity images, degree-of-linear-polarization images, angle-of-linear-polarization images and corresponding real camouflage maps of the N scenes, the camouflaged target detection model based on polarized image cues is trained by gradient descent; the weighted binary cross-entropy loss and the weighted IoU loss are jointly used as the loss function to compute the loss between the camouflage prediction map and the real camouflage map, and the model parameters are updated until the loss function converges, yielding the optimal camouflaged target detection model based on polarized image cues, which is then used to perform camouflaged target detection on any intensity image and polarization images to be predicted.
An electronic device of the invention comprises a memory and a processor, wherein the memory is used to store a program that supports the processor in executing the camouflaged target detection method, and the processor is configured to execute the program stored in the memory.
A computer-readable storage medium of the invention has a computer program stored thereon, wherein the computer program, when executed by a processor, performs the steps of the camouflaged target detection method.
Compared with the prior art, the invention has the following beneficial effects:
1. By constructing a neural network based on polarized image cues and supervising the deep network with label data, the invention obtains a robust camouflaged target detection model. It overcomes the problems of statistical models designed around cues such as color, depth and background priors, which ignore much feature information and yield low detection accuracy, and thus detects camouflaged targets more accurately.
2. The deep neural network based on polarized image cues constructed by the invention introduces polarization information. Because different objects reflect polarized light differently, a polarization difference exists across the image, and this characteristic is used to separate the target from the background. Meanwhile, the intensity image is complementary to the degree-of-polarization and angle-of-polarization images, which avoids the problem in RGB-D camouflaged target detection that a low-quality depth map degrades the detection result.
3. In the deep neural network based on polarized image cues of the invention, the intensity image I, the degree-of-linear-polarization image DoLP and the angle-of-linear-polarization image AoLP are fed into the network in parallel; the fusion module fuses the intensity, degree-of-polarization and angle-of-polarization features obtained by the encoder, enhancing the representation of the intensity image. This overcomes the weakness of single-image camouflaged target detection networks, which learn little additional visual perception knowledge, and thereby effectively improves the robustness of camouflaged target detection in low-contrast or complex scenes.
4. The up-sampling module of the invention aggregates the fused intensity, degree-of-polarization and angle-of-polarization features at each encoder level, producing a more refined prediction with smoother detection edges and improving the accuracy of camouflaged target detection.
Drawings
FIG. 1 is a flow chart of the method of the present invention for detecting camouflaged targets using polarized image cues;
FIG. 2 is a schematic diagram of the deep neural network structure based on polarized image cues according to the present invention;
FIG. 3 shows the detection results of the method of the present invention and of other camouflaged target detection methods on the polarization data set.
Detailed Description
In this embodiment, as shown in FIG. 1, a method for detecting camouflaged targets based on polarized image cues aims to solve the problems that existing networks learn little additional visual perception knowledge and perform poorly in low-contrast or cluttered-background scenes. By constructing a deep neural network based on polarized image cues, a multi-channel-input camouflaged target detection model is obtained that can effectively detect camouflaged targets in complex scenes, improving the accuracy and precision of camouflaged target detection in low-contrast or complex and changeable environments. Specifically, the method includes the following steps:
step 1, obtaining a polarized image data set with pixel level marks;
step 1.1, acquiring a group of original polarization images with polarization directions theta of 0 degrees, 45 degrees, 90 degrees and 135 degrees respectively in an nth shooting scene by using a focus-dividing plane polarimeter in a polarization camera
 I_θ^n, thereby obtaining N groups of original polarization images for the N scenes, where I_θ^n represents the original polarization image of the n-th scene in polarization direction θ, n ∈ [1, N];
In this embodiment, a division-of-focal-plane (DoFP) polarimeter in a LUCID Triton polarization camera is used to capture the polarization camouflaged target detection data set, which contains N = 639 scenes in total, each image being 352 pixels wide and 352 pixels high.
Step 1.2, labeling the N groups of original polarization images to obtain pixel-level label images, in which each pixel takes a value in {0, 1}; the N groups of original polarization images and the corresponding pixel-level label images form the polarized image data set D, which is divided into a training data set D_tr and a test data set D_te;
In this embodiment, labeling is performed with the Labelme software: each pixel of a polarized image is assigned a category label V ∈ {0, 1}, rendered as black or white, where black marks background pixels and white marks target pixels. The polarization camouflaged target detection data set is divided into a training set of 511 scenes and a test set of 128 scenes.
Step 2, calculating a group of original polarization images of the nth scene to obtain an intensity image, a linear polarization degree image and a linear polarization angle image;
step 2.1, calculating an original polarization image in the polarization direction theta in the nth scene by using the formula (1)
 I_θ^n, obtaining the Stokes vector S^n = (S_0^n, S_1^n, S_2^n)^T of the n-th scene:

S_0^n = (I_0^n + I_45^n + I_90^n + I_135^n) / 2,   S_1^n = I_0^n - I_90^n,   S_2^n = I_45^n - I_135^n    (1)

In formula (1), S_0^n represents the total light intensity of the objects in the n-th scene, i.e. the intensity image I_n; S_1^n and S_2^n represent the linearly polarized components along the horizontal/vertical and ±45° directions of the n-th scene, respectively;
In this embodiment, the differences among the four original polarization-angle images of the same scene determine the Stokes vector, which is an important polarization parameter; the resulting intensity image is a color image with 3 channels, and its height and width are consistent with those of the original polarization-angle images.
Step 2.2, computing and imaging the Stokes vector using formula (2) to obtain the degree-of-linear-polarization image DoLP_n of the n-th scene:

DoLP_n = sqrt((S_1^n)^2 + (S_2^n)^2) / S_0^n    (2)

In this embodiment, the degree-of-polarization image DoLP obtained from the Stokes vector is a grayscale image with 3 channels, and its height and width are consistent with those of the original polarization-angle images.
Step 2.3, computing and imaging the Stokes vector using formula (3) to obtain the angle-of-linear-polarization image AoLP_n of the n-th scene:

AoLP_n = (1/2) arctan(S_2^n / S_1^n)    (3)

In this embodiment, the angle-of-polarization image AoLP obtained from the Stokes vector is a grayscale image with 3 channels, and its height and width are consistent with those of the original polarization-angle images.
Step 3, intensity image I of nth scene n Linear polarization degree image DoLP n And linear polarization angle image AoLP n Carrying out data enhancement processing to obtain the enhanced intensity image of the nth scene
 I_n', the enhanced degree-of-linear-polarization image DoLP_n' and the enhanced angle-of-linear-polarization image AoLP_n', which together form the n-th polarized image set;
Taking the pixel-level label image of the n-th scene as the real camouflage map of the n-th scene, denoted G_n, the same data enhancement is applied to G_n to obtain the enhanced real camouflage map G_n';
In this embodiment, the intensity, DoLP and AoLP images and the corresponding real camouflage maps in the polarization camouflaged target detection data set are augmented by rotation and mirror flipping, expanding the intensity, DoLP and AoLP images and the corresponding real camouflage maps of each scene to 4 times their original number.
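A minimal sketch of this 4x augmentation is shown below; the particular set of variants (original, 180-degree rotation, horizontal mirror, rotation plus mirror) is an assumption, since the text only states rotation and mirror flipping with a four-fold expansion.

```python
import numpy as np

def augment_scene(intensity, dolp, aolp, gt):
    """Expand one scene (I, DoLP, AoLP and its real camouflage map G) to 4 variants
    by rotation and mirror flipping; the same geometric transform is applied to
    every modality and to the label."""
    def variants(img):
        img = np.asarray(img)
        rot = np.rot90(img, 2, axes=(0, 1))        # 180-degree rotation
        return [img, rot, img[:, ::-1], rot[:, ::-1]]

    return list(zip(variants(intensity), variants(dolp), variants(aolp), variants(gt)))
```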
Step 4, constructing a detection model of the disguised target based on the polarized image clues, comprising the following steps: the device comprises an encoder, a channel dimension reduction module, a fusion module, an up-sampling module and an output module;
step 4.1, constructing an encoder, wherein the encoder is composed of three Res2Net50 backbone networks with the same structure, each Res2Net backbone network is composed of H down-sampling blocks, and the H down-sampling blocks are respectively marked as DSMampleBlock 1 ,……,DSampleBlock h ,......,DSampleBlock H (ii) a Wherein, the DSampleBlock h Representing a block of down-sampling of the h-th levelH =1,2, ·.. H; and the h-th level down-sampling block DSampleBlock h From the h-th X layer two-dimensional convolution layer Dconv2d h Are connected in series;
in this embodiment, H =4,resnet backbone network corresponds to Res1, res2, res3, res4 in fig. 2, where Res1, res2, res3, res4 correspond to 4 downsampling blocks dsampletlock respectively 1 ,DSampleBlock 2 ,DSampleBlock 3 ,DSampleBlock 4
The x-th two-dimensional convolution layer Dconv2d of the h-th order h,x Sequentially comprises the following steps: a convolution layer, a batch normalization and a Relu activation function layer, wherein the x-th two-dimensional convolution layer Dconv2d h,x The convolution kernel size of the convolution layer in (1) is k x ;x=1,2,......,X;
In the present embodiment, X =3,k x Equal to 1,3,1, respectively; when h =1, DSMampleBlock 1 Contains 9 Dconv2d 1 When h =2, DSMampleBlock 2 Contains 12 Dconv2d 2 When h =3, DSMampleBlock 3 Contains 18 Dconv2d 3 When h =4, DSMampleBlock 4 Contains 9 Dconv2d 4
When h =1, inputting the nth polarized image set into the encoder, and respectively corresponding to h-level down-sampling blocks DSampleBlock passing through three Res2Net backbone networks h H-th order X-layer convolution layer Dconv2d in (1) h And respectively outputting the nth intensity image characteristic map of the h level
 F_{I,h}^n, the h-th level n-th degree-of-polarization image feature map F_{D,h}^n and the h-th level n-th angle-of-polarization image feature map F_{A,h}^n, respectively; x_h, y_h, c_h denote the height, width and number of channels of the feature maps output by the h-th level down-sampling block DSampleBlock_h; in this example, x_1 = 88, y_1 = 88, c_1 = 256.
When h = 2, 3, ..., H, the (h-1)-th level n-th intensity image feature map F_{I,h-1}^n, degree-of-polarization image feature map F_{D,h-1}^n and angle-of-polarization image feature map F_{A,h-1}^n are fed into the h-th level down-sampling blocks DSampleBlock_h of the corresponding Res2Net backbones, which produce the corresponding intensity image feature map F_{I,h}^n, degree-of-polarization image feature map F_{D,h}^n and angle-of-polarization image feature map F_{A,h}^n; thus the H-th level two-dimensional convolution layers Dconv2d_H of the three Res2Net backbones finally output the intensity image feature map F_{I,H}^n, the degree-of-polarization image feature map F_{D,H}^n and the angle-of-polarization image feature map F_{A,H}^n; x_{h-1}, y_{h-1}, c_{h-1} denote the height, width and number of channels of the feature map output by the (h-1)-th level down-sampling block DSampleBlock_{h-1}, and x_H, y_H, c_H those of the H-th level down-sampling block DSampleBlock_H;
In this example, the output feature maps of the 2nd level two-dimensional convolution layers, F_{I,2}^n, F_{D,2}^n and F_{A,2}^n, have height, width and channel number 44, 44 and 512; those of the 3rd level, F_{I,3}^n, F_{D,3}^n and F_{A,3}^n, have 22, 22 and 352; and those of the 4th level, F_{I,4}^n, F_{D,4}^n and F_{A,4}^n, have 11, 11 and 2048.
Step 4.2, the channel dimension reduction module is formed by sequentially connecting H layers of two-dimensional convolution layers in series, and each layer of two-dimensional convolution layer sequentially comprises: one convolution kernel is k' h ×k′ h The two-dimensional convolution layer, a batch normalization and a Relu activation function; in this embodiment, k' h 1,3, respectively, the channel dimension reduction module corresponds to the CJ module in FIG. 2.
Intensity image feature map
 F_{I,h}^n, the degree-of-polarization image feature map F_{D,h}^n and the angle-of-polarization image feature map F_{A,h}^n are processed by the channel dimension-reduction module, which outputs the reduced intensity image feature map R_{I,h}^n, degree-of-polarization image feature map R_{D,h}^n and angle-of-polarization image feature map R_{A,h}^n, h = 1, 2, ..., H, where c_N denotes the number of channels of the feature maps after the channel dimension-reduction module; in the present embodiment, c_N = 32 for h = 1, 2, 3, 4.
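One level of the channel dimension-reduction (CJ) module can be sketched as below; reading the two kernel sizes k'_h = 1 and 3 as two stacked convolutions is an interpretation of the text, and the helper name is illustrative.

```python
import torch.nn as nn

def cj_block(in_channels, out_channels=32):
    """One level of the channel dimension-reduction module (CJ in FIG. 2):
    a 1x1 convolution followed by a 3x3 convolution, each with BN + ReLU,
    mapping an encoder feature map to c_N = 32 channels."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
    )

# One CJ block per encoder level and per modality, e.g. for the intensity stream
# of a standard Res2Net-50:
# cj_i = nn.ModuleList([cj_block(c) for c in (256, 512, 1024, 2048)])
```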
Step 4.3, constructing the Fusion module is composed of H Fusion modules and respectively marked as Fusion 1 ,...,Fusion h ,...,Fusion H Wherein, fusion h Representing a h-th level fusion block; in the present embodiment, fusion h Corresponding to the Fuse module in fig. 2.
Wherein, the Fusion block Fusion of the level 1 1 Includes a 1 st-stage T-layer two-dimensional convolution layer Dconv2d' 1 And O layer convolution layer conv of level 1 1 The other H-1 level fusion blocks comprise T two-dimensional convolution layers, an upper sampling layer and O layers;
h-th-stage T-layer two-dimensional convolution layer Dconv2d' h T-th two-dimensional buildup layer of (1) Dconv2d' h,t Sequentially comprises the following steps: a convolution layer, a batch normalization and a Relu activation function layer, wherein the convolution kernel size of the t-th convolution layer is k ″) t ×k″ t (ii) a H-th O-layer convolution layer conv h Second layer convolution layer conv h,o Has a convolution kernel size of k' o ×k″′ o (ii) a In the present embodiment, T =2,k ″ t 1,3 respectively; o =1,k' o =1。
When h =1, the intensity image feature map is set
 R_{I,h}^n, together with the degree-of-polarization image feature map R_{D,h}^n and the angle-of-polarization image feature map R_{A,h}^n, as the joint input to the fusion module; after being processed by the h-th level T-layer two-dimensional convolution layers Dconv2d'_h of the h-th level fusion block Fusion_h, they yield the h-th level intensity image feature map F'_{I,h}^n, degree-of-polarization image feature map F'_{D,h}^n and angle-of-polarization image feature map F'_{A,h}^n, which are concatenated along the channel dimension to obtain the h-th level concatenated feature map C_h^n;
The h-th level concatenated feature map C_h^n then passes through the h-th level T-layer two-dimensional convolution layers Dconv2d'_h to give the processed concatenated feature map C'_h^n, and then through the h-th level O-layer convolution layers conv_h to give the dimension-reduced concatenated feature map W_h^n, where c_o is the number of channels output by the convolution layers; in this embodiment, c_o = 3.
By a splitting operation the dimension-reduced concatenated feature map W_h^n is split into the weight feature maps w_{I,h}^n, w_{D,h}^n and w_{A,h}^n; the intensity image feature map F'_{I,h}^n is multiplied by w_{I,h}^n, the degree-of-polarization image feature map F'_{D,h}^n by w_{D,h}^n and the angle-of-polarization image feature map F'_{A,h}^n by w_{A,h}^n; the three products are added, and the 1st level coarse feature map P_1^n is finally output; in this embodiment w_{I,h}^n, w_{D,h}^n and w_{A,h}^n each have 1 channel, and the height, width and channel number of P_1^n are 88, 88 and 32, respectively.
When H =2,3.... H, the H-level intensity image feature map is used
, together with the h-th level degree-of-polarization image feature map and the h-th level angle-of-polarization image feature map, as the input to the h-th level up-sampling layer Usample_h; after their spatial resolution is enlarged by a factor of a, the h-th level up-sampled intensity feature map U_{I,h}^n, up-sampled degree-of-polarization feature map U_{D,h}^n and up-sampled angle-of-polarization feature map U_{A,h}^n are output; x_h, y_h denote the height and width of the feature maps output by the up-sampling layer; in this embodiment, a = 2 and the up-sampling layer at each level uses a transposed convolution (ConvTranspose) to up-sample the signal; x_2, y_2 are 44, 44; x_3, y_3 are 22, 22; x_4, y_4 are 11, 11.
U_{I,h}^n, U_{D,h}^n and U_{A,h}^n are each added to the corresponding intensity, degree-of-polarization and angle-of-polarization feature maps; the sums are fed into the h-th level T-layer two-dimensional convolution layers Dconv2d'_h, which output the h-th level intensity image feature map F'_{I,h}^n, degree-of-polarization image feature map F'_{D,h}^n and angle-of-polarization image feature map F'_{A,h}^n; these are concatenated along the channel dimension to obtain the h-th level concatenated feature map C_h^n;
The h-th level concatenated feature map C_h^n then passes through the h-th level T-layer two-dimensional convolution layers Dconv2d'_h to give the processed concatenated feature map C'_h^n, and then through the h-th level O-layer convolution layers conv_h to give the dimension-reduced concatenated feature map W_h^n, where c_o is the number of channels output by the convolution layers;
By a splitting operation the dimension-reduced concatenated feature map W_h^n is split into the weight feature maps w_{I,h}^n, w_{D,h}^n and w_{A,h}^n; the intensity image feature map F'_{I,h}^n is multiplied by w_{I,h}^n, the degree-of-polarization image feature map F'_{D,h}^n by w_{D,h}^n and the angle-of-polarization image feature map F'_{A,h}^n by w_{A,h}^n; the three products are added, and the h-th level coarse feature map P_h^n is finally output, so that the H-th level fusion block outputs the H-th level coarse feature map P_H^n;
In this embodiment, c_o = 3; the height, width and channel number of P_2^n, P_3^n and P_4^n are 44, 44, 32; 22, 22, 32; and 11, 11, 32, respectively.
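The weighting scheme of the level-1 fusion block (Fuse in FIG. 2) can be sketched as follows: the three 32-channel maps are refined, concatenated, compressed to a 3-channel map, split into per-modality weights and summed. The element-wise additions with the up-sampled features of the deeper blocks are omitted because their operands are not fully specified above; class and variable names are illustrative.

```python
import torch
import torch.nn as nn

def conv_bn_relu(cin, cout, k):
    return nn.Sequential(
        nn.Conv2d(cin, cout, kernel_size=k, padding=k // 2, bias=False),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

class FusionBlock(nn.Module):
    """Level-1 fusion block: per-modality refinement (T = 2 layers, 1x1 then 3x3),
    concatenation, prediction of c_o = 3 weight channels, and weighted sum."""

    def __init__(self, channels=32):
        super().__init__()
        self.refine = nn.ModuleList(
            [nn.Sequential(conv_bn_relu(channels, channels, 1),
                           conv_bn_relu(channels, channels, 3)) for _ in range(3)])
        self.mix = nn.Sequential(conv_bn_relu(3 * channels, 3 * channels, 1),
                                 conv_bn_relu(3 * channels, 3 * channels, 3))
        self.to_weights = nn.Conv2d(3 * channels, 3, kernel_size=1)   # O = 1, k''' = 1

    def forward(self, f_i, f_d, f_a):
        f_i, f_d, f_a = (r(f) for r, f in zip(self.refine, (f_i, f_d, f_a)))
        cat = torch.cat([f_i, f_d, f_a], dim=1)        # concatenated map C_h
        w = self.to_weights(self.mix(cat))             # dimension-reduced map W_h
        w_i, w_d, w_a = torch.split(w, 1, dim=1)       # three weight maps
        return f_i * w_i + f_d * w_d + f_a * w_a       # coarse feature map P_h
```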
Step 4.4, the up-sampling module is composed of H-2 aggregation convolution blocks and an up-sampling block, and each aggregation convolution block comprises: the system comprises an aggregation convolution layer, a batch normalization layer and a Relu activation function layer, wherein the size of a convolution kernel of the aggregation convolution layer is k multiplied by k; in this embodiment, k =3, the convolution step is 1, and the padding is 0.
Drawing the H-th coarse feature
 map P_H^n into the up-sampling block for a-fold up-sampling, then concatenating it along the channel dimension with the (H-1)-th level coarse feature map P_{H-1}^n output by the (H-1)-th level fusion block Fusion_{H-1} to give the (H-1)-th level concatenated coarse feature map, the (H-1)-th level semi-coarse feature map M_{H-1}^n is obtained after processing by the H-2 aggregation convolution blocks;
The (H-1)-th level semi-coarse feature map M_{H-1}^n is fed into the up-sampling block for a-fold up-sampling and concatenated along the channel dimension with the coarse feature map P_{H-2}^n output by the (H-2)-th level fusion block Fusion_{H-2}, giving the (H-2)-th level concatenated coarse feature map; after processing by the H-2 aggregation convolution blocks, the (H-2)-th level semi-coarse feature map M_{H-2}^n is obtained;
Proceeding in this way, the 1st level semi-coarse feature map M_1^n is obtained after processing by the H-2 aggregation convolution blocks; in this embodiment, the n-th semi-coarse feature maps have 32 channels.
The output module consists of a single convolution layer, a two-dimensional convolution with kernel size b × b; in this embodiment, the kernel size of the two-dimensional convolution is 1 × 1, with 32 input channels and 1 output channel.
The n-th 1st level semi-coarse feature map M_1^n is processed by the output module to output the n-th camouflage prediction map pre_n.
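The up-sampling (decoder) module and output module of step 4.4 can be sketched as below: the deepest coarse map is repeatedly up-sampled by a factor of 2, concatenated with the next shallower coarse map, processed by a 3x3 aggregation convolution, and finally mapped to a 1-channel prediction by a 1x1 convolution. Bilinear up-sampling and padding 1, which keep the spatial sizes aligned, are simplifying assumptions; the embodiment states padding 0 for the aggregation convolutions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Decoder(nn.Module):
    """Aggregates the coarse feature maps P_1..P_4 (shallow to deep) and produces
    the camouflage prediction map pre_n."""

    def __init__(self, channels=32, levels=4):
        super().__init__()
        # One aggregation block (3x3 conv + BN + ReLU) per merge step.
        self.agg = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            ) for _ in range(levels - 1)])
        # Output module: 1x1 convolution, 32 -> 1 channel.
        self.head = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, coarse):                  # coarse = [P_1, P_2, P_3, P_4]
        x = coarse[-1]                          # deepest coarse map P_H
        for p, agg in zip(reversed(coarse[:-1]), self.agg):
            x = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)
            x = agg(torch.cat([x, p], dim=1))   # semi-coarse map M_h
        return self.head(x)                     # logit prediction map pre_n
```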
Step 5, training a camouflage target detection model based on a polarized image clue;
and 5.1, training a disguised target detection model based on a polarized image clue by using a gradient descent method based on the intensity images, the linear polarization degree images, the linear polarization angle images and the corresponding real disguised images of the N scenes after data enhancement, and using weighted binary cross entropy loss and weighted IoU loss as loss functions together for calculating the loss between the disguised prediction image and the real disguised image so as to update model parameters until the loss functions are converged, thereby obtaining the disguised target detection model of the optimal polarized image clue, and performing disguised target detection on any intensity image to be predicted and any polarization degree image.
In this embodiment, the intensity, DoLP and AoLP images of the 2044 scenes obtained after data enhancement of the polarization data set, together with the corresponding real camouflage maps, are used for training; the outputs of the channel dimension-reduction module, the fusion module, the up-sampling module and the output module are each compared with the real camouflage map by computing the weighted binary cross-entropy and weighted IoU losses, giving 5 training losses; the 5 training losses are added to obtain a total loss, and the total loss, combined with a gradient-descent algorithm, guides the network training, yielding the camouflaged target detection feature model that uses polarized image cues.
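The weighted binary cross-entropy and weighted IoU losses are not spelled out above; the sketch below shows one common PyTorch formulation of such a combined loss, in which pixels near object boundaries receive larger weights. The boundary-aware weighting is an assumption; the patent only names the two loss terms.

```python
import torch
import torch.nn.functional as F

def weighted_bce_iou_loss(pred, mask):
    """Weighted BCE + weighted IoU between a logit map `pred` and a binary mask.
    The total training loss is the sum of this quantity over the 5 supervised outputs."""
    # Pixels whose 31x31 neighbourhood disagrees with the mask (edges) get weight > 1.
    weit = 1 + 5 * torch.abs(
        F.avg_pool2d(mask, kernel_size=31, stride=1, padding=15) - mask)

    # Weighted binary cross-entropy.
    wbce = F.binary_cross_entropy_with_logits(pred, mask, reduction='none')
    wbce = (weit * wbce).sum(dim=(2, 3)) / weit.sum(dim=(2, 3))

    # Weighted IoU.
    prob = torch.sigmoid(pred)
    inter = ((prob * mask) * weit).sum(dim=(2, 3))
    union = ((prob + mask) * weit).sum(dim=(2, 3))
    wiou = 1 - (inter + 1) / (union - inter + 1)

    return (wbce + wiou).mean()
```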
In this embodiment, an electronic device comprises a memory and a processor, wherein the memory is used to store a program that supports the processor in executing the camouflaged target detection method, and the processor is configured to execute the program stored in the memory.
In this embodiment, a computer-readable storage medium has a computer program stored thereon, and the computer program, when executed by a processor, performs the steps of the camouflaged target detection method.
TABLE 1
Methods S-measure E-measure F-measure MAE
BASNet 0.830 0.868 0.722 0.020
SINet-V1 0.789 0.811 0.646 0.042
LSR 0.863 0.910 0.793 0.014
PFNet 0.849 0.910 0.769 0.017
C2FNet 0.860 0.913 0.774 0.018
SINet-V2 0.865 0.925 0.784 0.015
Ours 0.879 0.927 0.805 0.014
Table 1 compares the proposed camouflaged target detection method based on polarized image cues with other current camouflaged target detection methods on the test set of the polarization camouflaged target detection data set, using S-measure, E-measure, F-measure and MAE as evaluation indices. S-measure measures the region-oriented and object-oriented structural similarity between the predicted camouflage map and the real camouflage map; the closer its value is to 1, the better the camouflaged target detection. E-measure combines local pixel values with the image-level mean, jointly capturing image-level statistics and local pixel-matching information; the closer its value is to 1, the better the detection. F-measure is the weighted harmonic mean of precision and recall; the closer its value is to 1, the better the detection. MAE is the mean absolute error, measuring the difference between predicted and true values; the closer its value is to 0, the better the detection. The quantitative analysis in Table 1 shows that the method of the present invention achieves the best result on all four evaluation indices.
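For reference, the two simplest indices can be computed as below; MAE follows its standard definition, and the F-measure shown uses the beta^2 = 0.3 weighting that is conventional in this field but not stated in the table.

```python
import numpy as np

def mae(pred, gt):
    """Mean absolute error between a prediction map in [0, 1] and a binary mask."""
    return float(np.mean(np.abs(pred - gt)))

def f_measure(pred, gt, beta2=0.3, threshold=0.5):
    """Weighted harmonic mean of precision and recall at a fixed threshold."""
    binary = pred >= threshold
    positive = gt > 0.5
    tp = np.logical_and(binary, positive).sum()
    precision = tp / (binary.sum() + 1e-8)
    recall = tp / (positive.sum() + 1e-8)
    return float((1 + beta2) * precision * recall / (beta2 * precision + recall + 1e-8))
```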
FIG. 3 shows the detection results of the proposed camouflaged target detection method based on polarized image cues and of other current camouflaged target detection methods. Here, Ours denotes the proposed camouflaged target detection method based on polarized image cues; BASNet stacks two U-shaped networks in sequence, generates a saliency map in a predict-refine manner and further proposes a hybrid training loss to supervise training; SINet-V1, inspired by the behavior of hunters, designs a search module and an identification module to identify camouflaged objects; LSR introduces a multi-task learning framework into camouflaged target prediction with auxiliary tasks such as classification, providing a model that simultaneously localizes, segments and ranks camouflaged targets, whose Rank module orders targets by detection difficulty; PFNet learns from contextual features, proposes a new distraction-mining strategy and develops a framework for accurate camouflaged target detection that localizes potential targets by exploring long-range correlations and refines the segmentation by distraction discovery and removal; C2FNet designs a dual-branch global context module (DGCM) on top of contextual features to mine rich context information and further introduces an attention-induced cross-layer fusion module (ACFM) to aggregate multi-level features; SINet-V2 uses a detection mechanism similar to SINet-V1, also divided into searching and identification, and adopts reverse guidance to obtain the final camouflage prediction map in order to better fuse inter-layer information and prevent information loss or feature redundancy.

Claims (3)

1. A camouflaged target detection method based on polarized image clues is characterized by comprising the following steps:
step 1, obtaining a polarized image data set with pixel level marks;
step 1.1, acquiring a group of original polarization images with polarization directions theta of 0 degrees, 45 degrees, 90 degrees and 135 degrees respectively in an nth shooting scene by using a focus-dividing plane polarimeter in a polarization camera
Figure FDA0003874845170000011
Thereby obtaining N groups of original polarization images under N scenes; wherein the content of the first and second substances,
Figure FDA0003874845170000012
representing the original polarization image in the n-th scene in the polarization direction theta, n ∈ [1, N];
Step 1.2, respectively labeling the N groups of original polarization images to obtain pixel-level labeled images; and the value range of each pixel point in the pixel-level labeling image is between (0, 1); forming a training data set D by N groups of original polarization images and corresponding pixel level labeled images tr
Step 2, from the group of original polarization images of the $n$-th scene, compute an intensity image, a degree of linear polarization image, and an angle of linear polarization image;
Step 2.1, compute the Stokes vector $S_n = [S_0^n, S_1^n, S_2^n]^{T}$ of the original polarization images $I_n^{\theta}$ of the $n$-th scene using Eq. (1):

$$S_0^n = \tfrac{1}{2}\left(I_n^{0°} + I_n^{45°} + I_n^{90°} + I_n^{135°}\right),\qquad S_1^n = I_n^{0°} - I_n^{90°},\qquad S_2^n = I_n^{45°} - I_n^{135°} \tag{1}$$

In Eq. (1), $S_0^n$ represents the total light intensity of the objects in the $n$-th scene, i.e., the intensity image $I_n$; $S_1^n$ and $S_2^n$ respectively describe the linearly polarized light components along the vertical/horizontal and diagonal directions in the $n$-th scene;
Step 2.2, from the Stokes vector $S_n$, compute and image the degree of linear polarization image $DoLP_n$ of the $n$-th scene using Eq. (2):

$$DoLP_n = \frac{\sqrt{(S_1^n)^2 + (S_2^n)^2}}{S_0^n} \tag{2}$$

Step 2.3, from the Stokes vector $S_n$, compute and image the angle of linear polarization image $AoLP_n$ of the $n$-th scene using Eq. (3):

$$AoLP_n = \tfrac{1}{2}\arctan\!\left(\frac{S_2^n}{S_1^n}\right) \tag{3}$$
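A minimal NumPy sketch of Step 2 is shown below. The function name and the small epsilon guard are assumptions; the Stokes, DoLP and AoLP formulas follow the standard relations given in Eqs. (1)-(3).

```python
import numpy as np

def stokes_from_polarization(i0, i45, i90, i135, eps=1e-8):
    """Compute intensity, DoLP and AoLP images from four polarization
    images captured at 0/45/90/135 degrees (float arrays of equal shape)."""
    # Stokes components (Eq. 1): S0 is the total intensity,
    # S1/S2 are the linear polarization differences.
    s0 = 0.5 * (i0 + i45 + i90 + i135)
    s1 = i0 - i90
    s2 = i45 - i135

    # Degree of linear polarization (Eq. 2), clipped to [0, 1].
    dolp = np.clip(np.sqrt(s1**2 + s2**2) / (s0 + eps), 0.0, 1.0)

    # Angle of linear polarization (Eq. 3); may be rescaled to [0, 1]
    # for visualization or network input.
    aolp = 0.5 * np.arctan2(s2, s1 + eps)

    return s0, dolp, aolp
```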
Step 3, apply data enhancement to the intensity image $I_n$, the degree of linear polarization image $DoLP_n$, and the angle of linear polarization image $AoLP_n$ of the $n$-th scene, obtaining the enhanced intensity image $\tilde{I}_n$, enhanced degree of linear polarization image $\widetilde{DoLP}_n$, and enhanced angle of linear polarization image $\widetilde{AoLP}_n$, which together form the $n$-th polarized image set;
take the pixel-level annotation image of the $n$-th scene as the real camouflage image of the $n$-th scene, denoted $G_n$; apply data enhancement to the real camouflage image $G_n$ of the $n$-th scene to obtain the enhanced real camouflage image $\tilde{G}_n$;
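The claim does not specify which enhancement operations are used. The sketch below assumes, for illustration only, a random horizontal flip applied consistently to all three modalities and the ground-truth mask, which is the essential constraint when augmenting multi-modal pairs.

```python
import random
import numpy as np

def enhance_sample(intensity, dolp, aolp, gt, flip_p=0.5):
    """Apply the same augmentation to every modality and the ground truth.
    The concrete operation (horizontal flip) is an assumption; the claim
    only states that data enhancement is performed."""
    if random.random() < flip_p:
        intensity = np.flip(intensity, axis=1).copy()
        dolp = np.flip(dolp, axis=1).copy()
        aolp = np.flip(aolp, axis=1).copy()
        gt = np.flip(gt, axis=1).copy()
    return intensity, dolp, aolp, gt
```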
Step 4, construct the camouflaged target detection model based on polarized image cues, which comprises: an encoder, a channel dimension reduction module, a fusion module, an up-sampling module, and an output module;
Step 4.1, the encoder is composed of three Res2Net50 backbone networks with the same structure; each Res2Net backbone network is composed of $H$ down-sampling blocks, denoted $DSampleBlock_1, \ldots, DSampleBlock_h, \ldots, DSampleBlock_H$, where $DSampleBlock_h$ represents the $h$-th level down-sampling block, $h = 1, 2, \ldots, H$; the $h$-th level down-sampling block $DSampleBlock_h$ is formed by connecting the $h$-th level $X$-layer two-dimensional convolution layers $Dconv2d_h$ in series;
the $x$-th two-dimensional convolution layer $Dconv2d_{h,x}$ of the $h$-th level consists, in order, of: a convolution layer, a batch normalization layer, and a ReLU activation layer, where the convolution kernel size of the convolution layer in $Dconv2d_{h,x}$ is $k_x \times k_x$, $x = 1, 2, \ldots, X$;
when $h = 1$, the $n$-th polarized image set is input into the encoder, and its three images respectively pass through the $h$-th level down-sampling blocks $DSampleBlock_h$, i.e., the $h$-th level $X$-layer convolution layers $Dconv2d_h$, of the three Res2Net backbone networks, which respectively output the $h$-th level $n$-th intensity image feature map $F_{I,n}^{h}$, the $h$-th level $n$-th degree of polarization image feature map $F_{D,n}^{h}$, and the $h$-th level $n$-th polarization angle feature map $F_{A,n}^{h}$; $x_h, y_h, c_h$ respectively denote the height, width and number of channels of the feature map output by the $h$-th level down-sampling block $DSampleBlock_h$;
when $h = 2, 3, \ldots, H$, the $(h{-}1)$-th level $n$-th intensity image feature map $F_{I,n}^{h-1}$, the $(h{-}1)$-th level $n$-th degree of polarization image feature map $F_{D,n}^{h-1}$, and the $(h{-}1)$-th level $n$-th polarization angle image feature map $F_{A,n}^{h-1}$ are correspondingly input into the $h$-th level down-sampling blocks $DSampleBlock_h$ of the three Res2Net backbone networks, which output the corresponding intensity image feature map $F_{I,n}^{h}$, degree of polarization image feature map $F_{D,n}^{h}$, and polarization angle image feature map $F_{A,n}^{h}$; thus the $H$-th level two-dimensional convolution layers $Dconv2d_H$ of the three Res2Net backbone networks correspondingly output the final intensity image feature map $F_{I,n}^{H}$, degree of polarization image feature map $F_{D,n}^{H}$, and polarization angle image feature map $F_{A,n}^{H}$; $x_{h-1}, y_{h-1}, c_{h-1}$ respectively denote the height, width and number of channels of the feature map output by the $(h{-}1)$-th level down-sampling block $DSampleBlock_{h-1}$; $x_H, y_H, c_H$ respectively denote the height, width and number of channels of the feature map output by the $H$-th level down-sampling block $DSampleBlock_H$;
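A minimal sketch of the three-backbone encoder is given below. It assumes timm's 'res2net50_26w_4s' variant with features_only=True as a stand-in for the Res2Net50 backbones, that the three backbones do not share weights, and that each modality is fed as a 3-channel tensor; none of these details are stated in the claim.

```python
import timm
import torch.nn as nn

class TripleRes2NetEncoder(nn.Module):
    """Three structurally identical Res2Net50 backbones, one per modality
    (intensity, DoLP, AoLP). Each returns multi-level feature maps, one per
    down-sampling stage."""

    def __init__(self):
        super().__init__()
        # Exact Res2Net50 variant is an assumption (timm 'res2net50_26w_4s').
        make = lambda: timm.create_model('res2net50_26w_4s',
                                         pretrained=True, features_only=True)
        self.intensity_net = make()
        self.dolp_net = make()
        self.aolp_net = make()

    def forward(self, intensity, dolp, aolp):
        # Inputs are expected as 3-channel tensors (single-channel maps can
        # simply be replicated). Each call returns a list of feature maps.
        return (self.intensity_net(intensity),
                self.dolp_net(dolp),
                self.aolp_net(aolp))
```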
Step 4.2, the channel dimension reduction module is formed by connecting $H$ two-dimensional convolution layers in series, each of which consists, in order, of: a two-dimensional convolution layer with kernel size $k_h' \times k_h'$, a batch normalization layer, and a ReLU activation function;
the intensity image feature map $F_{I,n}^{h}$, degree of polarization image feature map $F_{D,n}^{h}$, and polarization angle image feature map $F_{A,n}^{h}$ are processed by the channel dimension reduction module, which outputs the intensity image feature map $\hat{F}_{I,n}^{h}$, degree of polarization image feature map $\hat{F}_{D,n}^{h}$, and polarization angle image feature map $\hat{F}_{A,n}^{h}$, $h = 1, 2, \ldots, H$, where $c_N$ denotes the number of channels of the feature maps after the channel dimension reduction module;
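A per-level channel reduction block is a straightforward Conv-BN-ReLU unit. In the sketch below the kernel size and the target channel count $c_N$ (1×1 and 64) are assumptions, since the claim leaves $k_h'$ and $c_N$ unspecified.

```python
import torch.nn as nn

class ChannelReduce(nn.Module):
    """Channel dimension reduction for one level: Conv2d -> BatchNorm -> ReLU."""

    def __init__(self, in_channels, out_channels=64, kernel_size=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size,
                      padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)
```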
Step 4.3, the Fusion module is composed of $H$ fusion blocks, denoted $Fusion_1, \ldots, Fusion_h, \ldots, Fusion_H$, where $Fusion_h$ represents the $h$-th level fusion block;
the 1st-level fusion block $Fusion_1$ includes the 1st-level $T$-layer two-dimensional convolution layers $Dconv2d_1'$ and the 1st-level $O$-layer convolution layers $conv_1$; each of the remaining $H{-}1$ fusion blocks includes $T$ two-dimensional convolution layers, an up-sampling layer, and $O$ convolution layers;
the $t$-th two-dimensional convolution layer $Dconv2d_{h,t}'$ of the $h$-th level $T$-layer two-dimensional convolution layers $Dconv2d_h'$ consists, in order, of: a convolution layer, a batch normalization layer, and a ReLU activation layer, where the convolution kernel size of the $t$-th convolution layer is $k_t'' \times k_t''$; the $o$-th convolution layer $conv_{h,o}$ of the $h$-th level $O$-layer convolution layers $conv_h$ has a convolution kernel of size $k_o''' \times k_o'''$;
when $h = 1$, the intensity image feature map $\hat{F}_{I,n}^{h}$, degree of polarization image feature map $\hat{F}_{D,n}^{h}$, and polarization angle image feature map $\hat{F}_{A,n}^{h}$ are jointly input into the Fusion module and processed by the $h$-th level $T$-layer two-dimensional convolution layers $Dconv2d_h'$ of the $h$-th level fusion block $Fusion_h$, which respectively output the $h$-th level intensity image feature map $P_{I,n}^{h}$, the $h$-th level degree of polarization image feature map $P_{D,n}^{h}$, and the $h$-th level polarization angle image feature map $P_{A,n}^{h}$; after channel concatenation, the $h$-th level concatenated feature map $C_n^{h}$ is obtained;
the $h$-th level concatenated feature map $C_n^{h}$ is then processed by the $h$-th level $T$-layer two-dimensional convolution layers $Dconv2d_h'$ to output the processed concatenated feature map $\hat{C}_n^{h}$, and then passes through the $h$-th level $O$-layer convolution layers $conv_h$ to output the dimension-reduced concatenated feature map $W_n^{h}$, where $c_o$ is the number of channels output after the convolution layers;
by a splitting operation, the dimension-reduced concatenated feature map $W_n^{h}$ is split into three weight feature maps $W_{I,n}^{h}$, $W_{D,n}^{h}$ and $W_{A,n}^{h}$; the intensity image feature map $P_{I,n}^{h}$ is multiplied by $W_{I,n}^{h}$, the degree of polarization image feature map $P_{D,n}^{h}$ is multiplied by $W_{D,n}^{h}$, and the polarization angle image feature map $P_{A,n}^{h}$ is multiplied by $W_{A,n}^{h}$; the three products are added, and the $h$-th level coarse feature map $R_n^{h}$ is finally output;
when $h = 2, 3, \ldots, H$, the $h$-th level intensity image feature map $\hat{F}_{I,n}^{h}$, degree of polarization image feature map $\hat{F}_{D,n}^{h}$, and polarization angle image feature map $\hat{F}_{A,n}^{h}$ are input into the $h$-th level up-sampling layer $Usample_h$, which changes their spatial resolution to $a$ times that of its input and outputs the $h$-th level up-sampled intensity feature map $\hat{F}_{I,n}^{h\uparrow}$, up-sampled degree of polarization feature map $\hat{F}_{D,n}^{h\uparrow}$, and up-sampled polarization angle feature map $\hat{F}_{A,n}^{h\uparrow}$; $x_h, y_h$ respectively denote the height and width of the feature map output by the up-sampling layer;
the up-sampled intensity feature map $\hat{F}_{I,n}^{h\uparrow}$, the up-sampled degree of polarization feature map $\hat{F}_{D,n}^{h\uparrow}$, and the up-sampled polarization angle feature map $\hat{F}_{A,n}^{h\uparrow}$ are each added to the corresponding feature map of the same modality; the three sums are input into the $h$-th level $T$-layer two-dimensional convolution layers $Dconv2d_h'$, which correspondingly output the $h$-th level intensity image feature map $P_{I,n}^{h}$, the $h$-th level degree of polarization image feature map $P_{D,n}^{h}$, and the $h$-th level polarization angle feature map $P_{A,n}^{h}$; after channel concatenation, the $h$-th level concatenated feature map $C_n^{h}$ is obtained;
the $h$-th level concatenated feature map $C_n^{h}$ is then processed by the $h$-th level $T$-layer two-dimensional convolution layers $Dconv2d_h'$ to output the processed concatenated feature map $\hat{C}_n^{h}$, and then passes through the $h$-th level $O$-layer convolution layers $conv_h$ to output the dimension-reduced concatenated feature map $W_n^{h}$, where $c_o$ is the number of channels output after the convolution layers;
by a splitting operation, the dimension-reduced concatenated feature map $W_n^{h}$ is split into three weight feature maps $W_{I,n}^{h}$, $W_{D,n}^{h}$ and $W_{A,n}^{h}$; the intensity image feature map $P_{I,n}^{h}$ is multiplied by $W_{I,n}^{h}$, the degree of polarization image feature map $P_{D,n}^{h}$ is multiplied by $W_{D,n}^{h}$, and the polarization angle image feature map $P_{A,n}^{h}$ is multiplied by $W_{A,n}^{h}$; the three products are added, and the $h$-th level coarse feature map $R_n^{h}$ is finally output;
in this way, the $H$-th level fusion block outputs the $H$-th level coarse feature map $R_n^{H}$;
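The sketch below illustrates the weighted-fusion idea of a first-level fusion block: per-modality convolutions, channel concatenation, a conv stack, a reduction to three weight maps, and a weighted sum of the three modality features. The layer counts ($T = 2$, $O = 1$), the kernel sizes, and the channel width are assumptions, not the claimed values.

```python
import torch
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch, k=3):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class FusionBlock(nn.Module):
    """Fuses intensity, DoLP and AoLP feature maps of one level into a
    single coarse feature map via learned per-modality weight maps."""

    def __init__(self, c=64):
        super().__init__()
        self.pre = nn.ModuleList([conv_bn_relu(c, c) for _ in range(3)])
        self.mix = nn.Sequential(conv_bn_relu(3 * c, 3 * c),
                                 conv_bn_relu(3 * c, 3 * c))
        self.reduce = nn.Conv2d(3 * c, 3 * c, 1)  # produces the weight maps

    def forward(self, f_i, f_d, f_a):
        p_i, p_d, p_a = self.pre[0](f_i), self.pre[1](f_d), self.pre[2](f_a)
        cat = torch.cat([p_i, p_d, p_a], dim=1)     # channel concatenation
        w = self.reduce(self.mix(cat))              # dimension-reduced map
        w_i, w_d, w_a = torch.chunk(w, 3, dim=1)    # split into three weights
        return p_i * w_i + p_d * w_d + p_a * w_a    # level coarse feature map
```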
Step 4.4, the up-sampling module is composed of H-2 aggregation convolution blocks and an up-sampling block, and each aggregation convolution block comprises: an aggregate convolution layer, a batch normalization and a Relu activation function layer, wherein the convolution kernel size of the aggregate convolution layer is k x k;
mapping the H-th coarse feature
Figure FDA0003874845170000051
Inputting the sampling data into the up-sampling block for a-time up-sampling, and fusing the sampling data with an H-1 level Fusion block H-1 Output H-1 level coarse feature map
Figure FDA0003874845170000052
Performing channel splicing and obtaining a rough characteristic diagram after H-1 level splicing
Figure FDA0003874845170000053
The H-1 level semi-rough characteristic diagram is obtained after the output treatment of H-2 aggregation volume blocks
Figure FDA0003874845170000054
The H-1 level semi-rough feature map
Figure FDA0003874845170000055
Inputting the sampling data into the upsampling block for a-time upsampling, and performing H-2 stage Fusion H-2 Output roughness profile
Figure FDA00038748451700000510
Performing channel splicing and obtaining a rough characteristic diagram after H-2 level splicing
Figure FDA0003874845170000056
The H-2 level semi-rough characteristic diagram is obtained after the output processing of H-2 aggregation volume blocks
Figure FDA0003874845170000057
So as to obtain the 1 st-stage semi-rough characteristic diagram after the output processing of the H-2 aggregation volume blocks
Figure FDA0003874845170000058
The output module is composed of a layer of convolution layer, and the convolution kernel size of the output module is b multiplied by b two-dimensional convolution;
level 1 semi-coarse feature map of the nth scene
Figure FDA0003874845170000059
The nth camouflage prediction image pre is output after being processed by the output module n
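A minimal sketch of such a progressive decoder is shown below: the current map is up-sampled, concatenated with the next shallower coarse feature map, aggregated with a Conv-BN-ReLU block, and the final map is projected to a single-channel camouflage prediction. The channel counts, the up-sampling factor $a = 2$, the use of one aggregation block per level, and the 3×3/1×1 kernels are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UpsampleDecoder(nn.Module):
    """Progressive upsample-concatenate-aggregate decoder over the coarse
    feature maps produced by the fusion blocks (shallowest first)."""

    def __init__(self, c=64, num_levels=4):
        super().__init__()
        self.agg = nn.ModuleList([
            nn.Sequential(nn.Conv2d(2 * c, c, 3, padding=1, bias=False),
                          nn.BatchNorm2d(c), nn.ReLU(inplace=True))
            for _ in range(num_levels - 1)])
        self.head = nn.Conv2d(c, 1, 1)   # output module: b x b convolution

    def forward(self, coarse_maps):
        # coarse_maps[h-1] is the h-th level coarse map; start from the deepest.
        x = coarse_maps[-1]
        for agg, skip in zip(self.agg, reversed(coarse_maps[:-1])):
            x = F.interpolate(x, scale_factor=2, mode='bilinear',
                              align_corners=False)      # a-times up-sampling
            x = agg(torch.cat([x, skip], dim=1))        # concat + aggregation
        return self.head(x)                             # camouflage prediction
```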
Step 5, training a camouflage target detection model based on a polarized image clue;
based on the data-enhanced intensity images, degree of linear polarization images, angle of linear polarization images, and corresponding real camouflage images of the N scenes, train the camouflaged target detection model based on polarized image cues with gradient descent; the weighted binary cross-entropy loss and the weighted IoU loss are jointly used as the loss function to compute the loss between the camouflage prediction image and the real camouflage image and to update the model parameters until the loss function converges, thereby obtaining the optimal camouflaged target detection model based on polarized image cues, which is used to perform camouflaged target detection on any intensity image, degree of polarization image, and angle of polarization image to be predicted.
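The claim does not define the pixel-weighting scheme of the two losses. The sketch below uses the common boundary-aware weighting popularized by F3Net-style camouflage/salient object detectors as an assumption; only the combination of a weighted BCE and a weighted IoU term is taken from the claim.

```python
import torch
import torch.nn.functional as F

def structure_loss(pred, mask):
    """Weighted BCE + weighted IoU between predicted logits and the ground
    truth mask. The boundary-aware weighting is an assumed formulation."""
    # Pixels whose local neighborhood disagrees with the mask (boundaries)
    # receive larger weights.
    weit = 1 + 5 * torch.abs(
        F.avg_pool2d(mask, kernel_size=31, stride=1, padding=15) - mask)

    wbce = F.binary_cross_entropy_with_logits(pred, mask, reduction='none')
    wbce = (weit * wbce).sum(dim=(2, 3)) / weit.sum(dim=(2, 3))

    prob = torch.sigmoid(pred)
    inter = ((prob * mask) * weit).sum(dim=(2, 3))
    union = ((prob + mask) * weit).sum(dim=(2, 3))
    wiou = 1 - (inter + 1) / (union - inter + 1)

    return (wbce + wiou).mean()
```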
2. An electronic device comprising a memory and a processor, wherein the memory is used for storing a program that enables the processor to execute the camouflaged target detection method of claim 1, and the processor is configured to execute the program stored in the memory.
3. A computer-readable storage medium, having a computer program stored thereon, wherein the computer program, when being executed by a processor, is adapted to perform the steps of the disguised object detection method as claimed in claim 1.
CN202211210090.8A 2022-09-30 2022-09-30 Method for detecting disguised target based on polarized image clues and application thereof Pending CN115620049A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211210090.8A CN115620049A (en) 2022-09-30 2022-09-30 Method for detecting disguised target based on polarized image clues and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211210090.8A CN115620049A (en) 2022-09-30 2022-09-30 Method for detecting disguised target based on polarized image clues and application thereof

Publications (1)

Publication Number Publication Date
CN115620049A true CN115620049A (en) 2023-01-17

Family

ID=84861039

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211210090.8A Pending CN115620049A (en) 2022-09-30 2022-09-30 Method for detecting disguised target based on polarized image clues and application thereof

Country Status (1)

Country Link
CN (1) CN115620049A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116935189B (en) * 2023-09-15 2023-12-05 北京理工导航控制科技股份有限公司 Camouflage target detection method and device based on neural network and storage medium

Similar Documents

Publication Publication Date Title
Li et al. Underwater image enhancement via medium transmission-guided multi-color space embedding
CN113065558B (en) Lightweight small target detection method combined with attention mechanism
Snell et al. Learning to generate images with perceptual similarity metrics
CN106897673B (en) Retinex algorithm and convolutional neural network-based pedestrian re-identification method
CN109376611A (en) A kind of saliency detection method based on 3D convolutional neural networks
CN111563418A (en) Asymmetric multi-mode fusion significance detection method based on attention mechanism
CN109034184B (en) Grading ring detection and identification method based on deep learning
CN109215053B (en) Method for detecting moving vehicle with pause state in aerial video shot by unmanned aerial vehicle
CN113449727A (en) Camouflage target detection and identification method based on deep neural network
CN113591968A (en) Infrared weak and small target detection method based on asymmetric attention feature fusion
CN108154133B (en) Face portrait-photo recognition method based on asymmetric joint learning
CN107766864B (en) Method and device for extracting features and method and device for object recognition
CN109977834B (en) Method and device for segmenting human hand and interactive object from depth image
CN114881871A (en) Attention-fused single image rain removing method
CN114549567A (en) Disguised target image segmentation method based on omnibearing sensing
CN114429457A (en) Intelligent fan blade defect detection method based on bimodal fusion
CN113505634A (en) Double-flow decoding cross-task interaction network optical remote sensing image salient target detection method
CN107045630B (en) RGBD-based pedestrian detection and identity recognition method and system
CN115620049A (en) Method for detecting disguised target based on polarized image clues and application thereof
CN109376719B (en) Camera light response non-uniformity fingerprint extraction and comparison method based on combined feature representation
Song et al. Multistage curvature-guided network for progressive single image reflection removal
CN112508863B (en) Target detection method based on RGB image and MSR image double channels
Babu et al. An efficient image dahazing using Googlenet based convolution neural networks
CN111815529B (en) Low-quality image classification enhancement method based on model fusion and data enhancement
CN111539434B (en) Infrared weak and small target detection method based on similarity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination