CN114842235A - Infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation - Google Patents
- Publication number
- CN114842235A (application number CN202210284099.7A)
- Authority
- CN
- China
- Prior art keywords
- target
- infrared
- feature
- image
- shape prior
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G06V10/765—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/30—Noise filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/42—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention discloses an infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation, which comprises the following steps: performing a Gaussian filtering operation on the input infrared original image to enhance dim weak and small targets; performing shape prior-based segmentation on the Gaussian-filtered infrared image to obtain target candidate regions; cropping the target candidate regions and inputting them into a multi-scale feature extraction module to obtain a feature representation of the small target; inputting the feature representation of the small target into a feature aggregation network to obtain a tensor-spliced image; and performing batch normalization and nonlinear transformation on the tensor-spliced image, and outputting the target classification result through Softmax. The shape prior-based segmentation module makes full use of the prior information of the weak and small targets to obtain suspicious target regions and reduces the overall parameter count to improve algorithm efficiency, while the multi-scale feature extraction and aggregation module provides a sufficient number of feature channels for the weak and small targets to ensure their detectability.
Description
Technical Field
The invention belongs to the technical field of infrared target detection, and particularly relates to an infrared small and weak target identification method based on shape prior segmentation and multi-scale feature aggregation.
Background
Infrared imaging detection technology can be used to detect and track unmanned aerial vehicles and is an effective technical means for monitoring them. In an actual scene, however, owing to long-distance imaging and atmospheric radiation interference, the target has a low signal-to-noise ratio, occupies few pixels, and lacks shape, texture and structural information, and it is easily disturbed by complex background noise and random noise, so conventional target detection and identification algorithms cannot balance detection accuracy against detection efficiency.
To solve the problem of infrared weak and small target detection, two classes of methods are mainly used at present: single-frame based and multi-frame based. Because multi-frame detection algorithms generally perform segmentation based on prior information such as the shape of the small target and the continuity of its gray-level motion trajectory, they consume more time than single-frame detection algorithms and are therefore unsuitable for real-time applications. The present invention mainly concerns single-frame detection algorithms.
Existing single-frame detection methods fall broadly into traditional algorithms and neural network-based algorithms. Most traditional detection methods rely on target prior knowledge: the target contrast is improved by suppressing the background and enhancing the target, and the target is then extracted by adaptive threshold segmentation; however, because of noise and the lack of robust features, traditional algorithms suffer from a high false alarm rate. Most deep learning detection methods obtain candidate regions through an anchor mechanism, share parameters, and unify classification and regression to obtain the detection result, but they are mainly aimed at generic target detection and remain unsatisfactory for weak and small targets with a low signal-to-noise ratio and extremely few pixels.
Fan et al., in "Fan Zunlin, Bi Duyan, et al. Dim infrared image enhancement based on convolutional neural network [J]. Neurocomputing, 2018, 272: 396-404.", addressed the target blur and background complexity caused by the long shooting distances common in current infrared imaging systems, and proposed a convolutional neural network enhancement method that suppresses background clutter while enhancing the small target: handwritten characters from the MNIST dataset are used to simulate the difference between the foreground and background of an infrared image, the small target and the background are predicted, and the contrast of weak infrared images in which background clutter surrounds the small target is improved. However, this method does not consider the influence of noise in the imaging system itself on detection. Deng et al. of the National University of Defense Technology therefore proposed, in "Deng Q, Lu H, Tao H, et al. Multi-scale convolutional neural networks for space infrared point objects discrimination [J]. IEEE Access, 2019: 1-1", a multi-scale convolutional neural network, considering that noise in the infrared detection system and the lack of robust features make space objects observed at long distances difficult to detect. The proposed network structure consists of three parts (transformation, partial convolution and full convolution), small-target training data are generated by an infrared radiation model, and the inherent properties of small targets in different scenes are taken into account. The method improves system performance and is more robust to detection-system noise.
However, most current deep learning detection methods treat infrared weak and small target detection as a binary classification or saliency detection problem, and their detection accuracy for weak and small targets remains unsatisfactory.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an infrared dim target identification method based on shape prior segmentation and multi-scale feature aggregation. The technical problem to be solved by the invention is realized by the following technical scheme:
an infrared dim target identification method based on shape prior segmentation and multi-scale feature aggregation is characterized by comprising the following steps:
performing Gaussian filtering operation on the input infrared original image to enhance dim weak and small targets;
carrying out shape prior-based segmentation on the infrared image subjected to Gaussian filtering to obtain a target candidate region;
cutting the target candidate region and inputting the target candidate region into a multi-scale feature extraction module to obtain feature representation of a small target;
inputting the feature representation of the small target into a feature aggregation network to obtain a tensor spliced image;
and carrying out batch normalization processing and nonlinear transformation on the image after tensor splicing, and outputting a target classification result through Softmax.
In an embodiment of the present invention, the S1 includes:
s11: preprocessing the infrared original image by adopting a 3 x 3 Gaussian kernel template;
s12: gaussian filtering operation is performed using the imfilter () function.
In one embodiment of the present invention, the enhanced image after performing the Gaussian filtering operation is represented as:
F(x, y) = (G * I)(x, y);
wherein (x, y) is the coordinate of a pixel in the infrared original image, I represents the infrared original image, G represents the Gaussian kernel, and * denotes convolution.
In an embodiment of the present invention, the S2 includes:
s21: processing the infrared image after Gaussian filtering by using a segmentation algorithm based on shape prior to eliminate a large-size continuous background area and high-energy noise with a preset pixel number;
s22: selecting a point region with the aspect ratio not more than 2 for the image excluding the large-size continuous background region and the high-energy noise with the preset number of pixels to fit the suspicious target, and segmenting the region around the fitted boundary to exclude the strip-shaped edge region to obtain the target candidate region.
In an embodiment of the present invention, the S21 includes:
s211: and the prior shape information of the weak and small target is fused into an energy function so as to eliminate large-size continuous background areas and high-energy noise with preset pixel quantity.
In an embodiment of the present invention, the S211 includes:
s2111: combining the energies of the shape priors into an energy function;
s2112: and eliminating high-energy noise of a large-size continuous background area and a preset pixel number through the energy function, the energy of the area item and the energy of the boundary item.
In one embodiment of the present invention, the expression of the energy function is:
E(L) = R(L) + B(L) + E_shape;
where R(L) is the region term, B(L) is the boundary term, and E_shape is the shape prior term;
the combined energy of the region and boundary terms, weighted by a relative importance factor α, is expressed as:
E(L) = αR(L) + B(L);
the energy expression of the boundary term is:
B(L) = Σ_{(p,q)∈N} B_{p,q}·δ(l_p ≠ l_q);
wherein α is the relative importance factor between the region term and the boundary term, R_p(l_p) is the weight of assigning label l_p to pixel p, and the region term is R(L) = Σ_p R_p(l_p).
In an embodiment of the present invention, the S3 includes:
s31: the gaussian filtered image is processed using five-size convolution kernels to obtain a feature representation of the small target.
In an embodiment of the present invention, the S31 includes:
s311: convolution kernels of five sizes, 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11, are used, and the number of the convolution kernels is 1, 2, 3, 4 and 5 respectively;
s312: the five groups of convolution kernels correspond to targets with the sizes of 1 × 1, 3 × 3, 5 × 5, 7 × 7 and 9 × 9 respectively, the numbers of the five groups of convolution kernels are 5, 4, 3, 2 and 1 respectively, and the numbers of corresponding feature mappings are 5, 4, 3, 2 and 1, so that the 15 feature mappings and the infrared original image form feature mappings of 16 channels;
s313: and serially connecting the image formed by the five groups of convolution kernels and the feature map, and inputting the image and the feature map into an intermediate maximum pooling layer, wherein the kernel size and the step size of the maximum pooling layer are set to be 2 so as to obtain the mapping features of 32 channels, and the mapping features of the 32 channels are the feature representation of the small target.
In an embodiment of the present invention, the S4 includes:
s41: performing convolution and pooling on the feature representation of the small target to obtain a feature map of the middle layer;
s42: downsampling the feature representation of the small target, and decomposing pixels at the same position in downsampling into 4 sub-images;
s43: and carrying out tensor splicing on the feature map of the middle layer and the 4 sub-maps.
The invention has the beneficial effects that:
the segmentation module based on the shape prior fully utilizes the prior information of the weak and small targets to obtain suspicious target areas, reduces the global parameter number to improve the algorithm efficiency, and the multi-scale feature extraction and aggregation module realizes a sufficient number of feature channels for the weak and small targets, thereby ensuring the detectability of the weak and small targets and obviously reducing the false alarm rate under higher recall rate.
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Drawings
Fig. 1 is a schematic flow chart of an infrared small and weak target identification method based on shape prior segmentation and multi-scale feature aggregation according to an embodiment of the present invention;
fig. 2 is a structural diagram of an infrared small and weak target identification method based on shape prior segmentation and multi-scale feature aggregation according to an embodiment of the present invention.
Detailed Description
In order to further explain the technical means and effects adopted by the present invention to achieve its intended purpose, the infrared small and weak target identification method based on shape prior segmentation and multi-scale feature aggregation according to the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
The foregoing and other technical matters, features and effects of the present invention will be apparent from the following detailed description of the embodiments, which is to be read in connection with the accompanying drawings. The technical means and effects of the present invention adopted to achieve the predetermined purpose can be more deeply and specifically understood through the description of the specific embodiments, however, the attached drawings are provided for reference and description only and are not used for limiting the technical scheme of the present invention.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that an article or device that comprises a list of elements does not include only those elements but may include other elements not expressly listed. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of additional like elements in the article or device comprising the element.
The present invention will be described in further detail with reference to specific examples, but the embodiments of the present invention are not limited thereto.
Example one
Referring to fig. 1 and fig. 2, fig. 1 is a schematic flow chart and fig. 2 is a structural diagram of the infrared dim target identification method based on shape prior segmentation and multi-scale feature aggregation according to an embodiment of the present invention. The method includes:
s1: and performing a Gaussian filtering operation on the input infrared original image to enhance dim weak and small targets.
Specifically, step S1 includes:
s11: preprocessing the infrared original image by adopting a 3 x 3 Gaussian kernel template;
s12: gaussian filtering operation is performed using the imfilter () function.
First, the input infrared original image is preprocessed with a 3 x 3 Gaussian kernel template, and the Gaussian filtering operation is performed using the imfilter() function; that is, each pixel value of the whole image is replaced by a weighted average of that pixel and its neighborhood. This effectively suppresses the background region and enhances the weak target to be detected, facilitating subsequent detection and identification. The Gaussian-filtered enhanced image is represented as:
F(x, y) = (G * I)(x, y);
wherein (x, y) is the coordinate of a pixel in the infrared original image, I represents the infrared original image, G represents the Gaussian kernel, and * denotes convolution.
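As an illustrative sketch (not part of the patent), the Gaussian pre-filtering step above could look like the following; the σ value is an assumption, since the description specifies only the 3 x 3 template size, and replicate border padding mirrors MATLAB's imfilter(I, G, 'replicate'):

```python
import numpy as np

def gaussian_kernel3x3(sigma=0.85):
    # Build a normalized 3x3 Gaussian kernel template G.
    # sigma is an assumed value; the patent specifies only the 3x3 size.
    ax = np.array([-1.0, 0.0, 1.0])
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def gaussian_filter(I, G):
    # Weighted average of each pixel with its 3x3 neighborhood,
    # i.e. F(x, y) = (G * I)(x, y), with replicate padding at the borders.
    pad = np.pad(np.asarray(I, dtype=float), 1, mode="edge")
    H, W = I.shape
    out = np.zeros((H, W))
    for dy in range(3):
        for dx in range(3):
            out += G[dy, dx] * pad[dy:dy + H, dx:dx + W]
    return out
```

Because the kernel is normalized to sum to 1, a constant image passes through unchanged, while point-like bright targets are spread slightly and background noise is smoothed.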
S2: carrying out shape prior-based segmentation on the infrared image subjected to Gaussian filtering to obtain a target candidate region;
specifically, step S2 includes:
s21: processing the infrared image after Gaussian filtering by using a segmentation algorithm based on shape prior to eliminate a large-size continuous background area and high-energy noise with a preset pixel number;
specifically, step S21 includes:
s211: and the prior shape information of the weak and small target is fused into an energy function so as to eliminate large-size continuous background areas and high-energy noise with preset pixel quantity.
Specifically, step S211 includes:
s2111: combining the energies of the shape priors into an energy function;
s2112: and eliminating high-energy noise of large-size continuous background areas and preset pixel quantity through the energy function, the energy of the area item and the energy of the boundary item.
S22: selecting a point region with the aspect ratio not more than 2 for the image excluding the large-size continuous background region and the high-energy noise with the preset number of pixels to fit the suspicious target, and segmenting the region around the fitted boundary to exclude the strip-shaped edge region to obtain a target candidate region.
Firstly, processing the infrared image after Gaussian filtering by using a segmentation algorithm based on shape prior to eliminate a large-size continuous background area and high-energy noise with a preset number of pixels.
Specifically, characteristic analysis of the target shows that three types of regions exist in the infrared image: large continuous background regions (such as cloud layers), strip-shaped edge regions (long strips), and small dot regions, in which the weak targets reside. The prior shape information of the weak and small target is fused into an energy function to generate candidate regions with a higher recall rate, that is, to exclude large-size continuous background regions and high-energy noise of a preset number of pixels.
Fusing the prior shape information of the weak and small target into the energy function to exclude large-size continuous background regions and high-energy noise of a preset number of pixels specifically includes the following: based on the gray-scale information of the infrared image, an edge weight is defined from the Euclidean distance between the gray values of two pixels. A distance function φ is therefore introduced to express the shape prior, where φ(x, y) denotes the minimum Euclidean distance between a point (x, y) and an edge. Since the drone target may be approximately represented by a rectangle with an aspect ratio of no greater than 2, a rectangular shape template is constructed. The energy term of the shape template is expressed by E_shape, and the energy of the shape prior is combined into the energy function:
E(L) = R(L) + B(L) + E_shape;
where R(L) is the region term, B(L) is the boundary term, and E_shape is the shape prior term. The combined energy of the region and boundary terms, weighted by a relative importance factor α, is expressed as follows:
E(L) = αR(L) + B(L);
where α is the relative importance factor between the region term and the boundary term, and R_p(l_p) is the weight of assigning label l_p to pixel p. By comparing the intensity of pixel p with the intensity models (given histograms) of the target and the background, the weight R_p(l_p) can be obtained: when a pixel is more likely to belong to the target, the weight of assigning that pixel to the target should be smaller, which reduces the energy in the equation; hence, when all pixels are correctly assigned to target and background, the region term is minimized. The shape prior energy is represented by selecting the approximate center pixel of the target as a reference point and, according to the distance between each pixel and the center pixel, setting the pixel values around the center point to 255 and all other values to 0. The new image is then regarded as a shape template of the target and incorporated into the energy function. In this way, large-size continuous background areas (clouds and buildings) and high-energy noise of about a few pixels (e.g., hot pixels) can be excluded, since the value of the central part of the target is relatively larger than elsewhere in the template.
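The energy described above can be sketched numerically. The following is an illustrative evaluation of E(L) = αR(L) + B(L) + E_shape for a given binary labeling, not the patent's optimizer; the β boundary penalty and the 0/255 template form are assumptions for the sketch:

```python
import numpy as np

def segmentation_energy(labels, Rp, alpha=1.0, beta=1.0, shape_template=None):
    # labels: HxW array of {0: background, 1: target}
    # Rp: HxWx2 array, Rp[y, x, l] = cost of assigning label l to pixel (y, x)
    # shape_template: HxW array, 255 around the assumed target center, 0 elsewhere
    ys, xs = np.indices(labels.shape)
    # Region term R(L): sum of per-pixel assignment weights R_p(l_p)
    R = Rp[ys, xs, labels].sum()
    # Boundary term B(L): penalize 4-connected neighbors with different labels
    B = beta * (np.count_nonzero(labels[:, 1:] != labels[:, :-1])
                + np.count_nonzero(labels[1:, :] != labels[:-1, :]))
    # Shape prior term E_shape: penalize target pixels outside the template
    E_shape = 0.0
    if shape_template is not None:
        E_shape = float(np.count_nonzero((labels == 1) & (shape_template == 0)))
    return alpha * R + B + E_shape
```

A graph-cut solver would search for the labeling L that minimizes this quantity; the sketch only scores a candidate labeling, which is enough to see how each term contributes.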
Then, for the image from which the large-size continuous background regions and the high-energy noise of a preset number of pixels have been excluded, point regions with an aspect ratio of no more than 2 are selected to fit the suspicious targets, and the regions around the fitted boundaries are segmented to exclude strip-shaped edge regions, yielding the target candidate regions.
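The point-region selection step might be sketched with connected-component analysis as below. The aspect-ratio threshold of 2 is from the description; the max_area cap is an assumed stand-in for the "preset number of pixels", and scipy.ndimage is used for labeling:

```python
import numpy as np
from scipy import ndimage

def candidate_regions(binary_mask, max_aspect=2.0, max_area=81):
    # Label 4-connected components of the segmented foreground mask.
    labeled, num = ndimage.label(binary_mask)
    boxes = []
    for sl in ndimage.find_objects(labeled):
        h = sl[0].stop - sl[0].start
        w = sl[1].stop - sl[1].start
        # Strip-shaped edge regions fail the aspect-ratio test;
        # large continuous background regions fail the area cap.
        if max(h, w) / min(h, w) <= max_aspect and h * w <= max_area:
            boxes.append((sl[0].start, sl[0].stop, sl[1].start, sl[1].stop))
    return boxes
```

The returned bounding boxes are the target candidate regions that would be cropped and passed to the multi-scale feature extraction module in S3.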
S3: cutting the target candidate area and inputting the target candidate area into a multi-scale feature extraction module to obtain feature representation of the small target;
specifically, step S3 includes:
s31: the gaussian filtered image is processed using five-size convolution kernels to obtain a feature representation of the small target.
Specifically, step S31 includes:
s311: convolution kernels of five sizes, 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11, are used, and the number of the convolution kernels is 1, 2, 3, 4 and 5 respectively;
s312: the five groups of convolution kernels correspond to targets with the sizes of 1 × 1, 3 × 3, 5 × 5, 7 × 7 and 9 × 9 respectively, the number of the five groups of convolution kernels is 5, 4, 3, 2 and 1 respectively, and the number of corresponding feature mappings is 5, 4, 3, 2 and 1, so that 15 feature mappings and the infrared original image form 16-channel feature mappings;
s313: and connecting the images formed by the five groups of convolution kernels and the feature maps in series and inputting the images and the feature maps into an intermediate maximum pooling layer, wherein the kernel size and the step size of the maximum pooling layer are set to be 2 so as to obtain the mapping features of 32 channels, and the mapping features of the 32 channels are the feature representation of the small target.
The gaussian filtered image is processed using five-size convolution kernels to obtain a feature representation of the small target, i.e., multi-scale information of the small target is extracted using a multi-scale convolution kernel.
The method comprises the following steps of extracting multi-scale information of a small target by utilizing a multi-scale convolution kernel, wherein the method specifically comprises the following steps:
convolution kernels of five sizes, 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11, are used, and the number of the convolution kernels is 1, 2, 3, 4 and 5 respectively;
the 15 convolution kernels mentioned above correspond to objects with sizes of 1 × 1, 3 × 3, 5 × 5, 7 × 7 and 9 × 9 respectively, the numbers of the convolution kernels are 5, 4, 3, 2 and 1 respectively, and the numbers of the corresponding feature maps are 5, 4, 3, 2 and 1, namely, the 15 feature maps and the original image form a 16-channel feature map, and a sufficient number of feature channels are obtained for weak and small objects to ensure detectability;
the images formed by the five groups of convolution kernels and the feature maps are concatenated and input into an intermediate maximum pooling layer whose kernel size and stride are both set to 2; the number of feature mappings is doubled to obtain 32-channel mapping features while keeping the memory consumed during feature extraction as low as possible. The 32-channel mapping features are the feature representation of the small target.
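A numpy sketch of this multi-scale extraction follows. Random kernels stand in for learned convolution weights (an assumption); the five kernel sizes produce 15 feature maps, which are stacked with the input patch into a 16-channel mapping, and a max pooling with kernel size 2 and stride 2 is applied. The doubling to 32 channels described above would come from an additional learned layer and is omitted here:

```python
import numpy as np
from scipy import ndimage

def multiscale_features(patch, rng=None):
    # 15 'same'-size convolutions: kernel sizes 3,5,7,9,11 with per-group
    # feature-map counts 5,4,3,2,1 (one map per kernel), stacked with the
    # input patch to form a 16-channel feature mapping.
    if rng is None:
        rng = np.random.default_rng(0)
    p = np.asarray(patch, dtype=float)
    maps = [p]
    for size, count in zip((3, 5, 7, 9, 11), (5, 4, 3, 2, 1)):
        for _ in range(count):
            kernel = rng.standard_normal((size, size))  # stand-in for learned weights
            maps.append(ndimage.convolve(p, kernel, mode="nearest"))
    return np.stack(maps)  # shape: (16, H, W)

def max_pool2(x):
    # Max pooling with kernel size 2 and stride 2 over a (C, H, W) tensor.
    c, h, w = x.shape
    x = x[:, :h - h % 2, :w - w % 2]
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))
```

Each kernel contributes one feature map, so a cropped candidate patch yields a 16-channel tensor whose spatial size is then halved by the pooling layer.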
S4: inputting the feature representation of the small target into a feature aggregation network to obtain a tensor spliced image;
specifically, step S4 includes:
s41: performing convolution and pooling treatment on the feature representation of the small target to obtain a feature map of the middle layer;
s42: carrying out down-sampling on the feature representation of the small target, and decomposing pixels at the same position into 4 sub-images during the down-sampling;
s43: and carrying out tensor splicing on the feature map of the middle layer and the 4 sub-maps.
After the conversion, the number of channels becomes 4 times the original while the spatial resolution is down-sampled by a factor of 2, so feature information at the higher resolution is effectively fused.
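The decomposition in S42 resembles a space-to-depth rearrangement: pixels at the same position within each 2 x 2 block are gathered into 4 sub-maps, which are then tensor-spliced (channel-concatenated) with the intermediate feature map. A minimal numpy sketch under that interpretation:

```python
import numpy as np

def space_to_depth2(x):
    # Decompose each 2x2 spatial block of a (C, H, W) tensor into 4 sub-maps:
    # channel count x4, spatial resolution /2.
    c, h, w = x.shape
    x = x.reshape(c, h // 2, 2, w // 2, 2)
    return x.transpose(0, 2, 4, 1, 3).reshape(4 * c, h // 2, w // 2)

def tensor_splice(mid_features, sub_maps):
    # Tensor splicing: concatenate feature maps along the channel axis.
    return np.concatenate([mid_features, sub_maps], axis=0)
```

No information is discarded in the rearrangement, which is why the higher-resolution features survive the 2x down-sampling and can be fused with the mid-layer feature map.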
S5: and carrying out batch normalization processing and nonlinear transformation on the image after tensor splicing, and outputting a target classification result through Softmax.
The tensor-spliced image obtained in S4 is subjected to batch normalization before the nonlinear transformation. Finally, Softmax outputs the unmanned aerial vehicle target classification result.
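The final stage might be sketched as below; ReLU is assumed as the nonlinear transformation and the two output classes are taken to be {target, background}, neither of which the description pins down:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Batch normalization over an (N, C) feature batch
    # (no learned scale/shift parameters in this sketch).
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def softmax(z):
    # Subtract the row-wise max for numerical stability before exponentiating.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def classify(features, W, b):
    # Batch normalization -> nonlinear transformation (ReLU assumed)
    # -> linear layer -> Softmax class probabilities.
    h = np.maximum(batch_norm(features), 0.0)
    return softmax(h @ W + b)
```

Each row of the output is a probability distribution over the classes, so the predicted class is simply the argmax per candidate region.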
In the embodiment, the shape prior-based segmentation module fully utilizes prior information of the small and weak targets to obtain suspicious target regions, reduces the global parameter to improve the algorithm efficiency, and the multi-scale feature extraction and aggregation module realizes a sufficient number of feature channels for the small and weak targets, so that the detectability of the small and weak targets is ensured, and the false alarm rate can be obviously reduced under a higher recall rate.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments, and the invention is not to be construed as limited to these details. Those skilled in the art may make several simple deductions or substitutions without departing from the spirit of the invention, all of which shall fall within the protection scope of the invention.
Claims (10)
1. An infrared dim target identification method based on shape prior segmentation and multi-scale feature aggregation is characterized by comprising the following steps:
S1: performing a Gaussian filtering operation on the input infrared original image to enhance the dim and small targets;
S2: carrying out shape prior-based segmentation on the infrared image subjected to Gaussian filtering to obtain a target candidate region;
S3: cropping the target candidate region and inputting it into a multi-scale feature extraction module to obtain the feature representation of a small target;
s4: inputting the feature representation of the small target into a feature aggregation network to obtain a tensor spliced image;
s5: and carrying out batch normalization processing and nonlinear transformation on the image after tensor splicing, and outputting a target classification result through Softmax.
2. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 1, wherein the S1 includes:
S11: preprocessing the infrared original image with a 3 × 3 Gaussian kernel template;
S12: performing the Gaussian filtering operation using the imfilter() function.
3. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 2, wherein the enhanced image after the Gaussian filtering operation is represented as:
E(x, y) = G(x, y) * I(x, y);
wherein (x, y) are the coordinates of a pixel in the infrared original image, I represents the infrared original image, G represents the Gaussian kernel, and * denotes convolution.
4. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 1, wherein the S2 includes:
S21: processing the infrared image after Gaussian filtering with a shape prior-based segmentation algorithm to eliminate large-size continuous background regions and high-energy noise of a preset number of pixels;
S22: for the image from which the large-size continuous background regions and the high-energy noise of the preset number of pixels have been excluded, selecting point regions with an aspect ratio of no more than 2 to fit the suspicious targets, and segmenting the regions around the fitted boundaries to exclude strip-shaped edge regions, thereby obtaining the target candidate region.
5. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 4, wherein the S21 includes:
S211: fusing the prior shape information of the weak and small targets into an energy function so as to eliminate the large-size continuous background regions and the high-energy noise of the preset number of pixels.
6. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 5, wherein the S211 includes:
S2111: combining the energy of the shape prior into the energy function;
S2112: eliminating the large-size continuous background regions and the high-energy noise of the preset number of pixels through the energy function together with the energy of the region term and the energy of the boundary term.
7. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 6, wherein the expression of the energy function is as follows:
E(L) = R(L) + B(L) + E_shape;
wherein R(L) is the region term, B(L) is the boundary term, and E_shape is the shape prior term; the region term and the boundary term are weighted and combined as:
E(L) = αR(L) + B(L);
the energy expression of the region term is as follows:
R(L) = Σ_{p∈P} R_p(l_p);
the energy expression of the boundary term is as follows:
B(L) = Σ_{(p,q)∈N} B_{p,q} · δ(l_p, l_q);
wherein α is a relative importance factor between the region term and the boundary term; R_p(l_p) is the weight assigned to pixel p under the label l_p; B_{p,q} is the penalty for assigning different labels to the neighboring pixels p and q; P is the set of pixels and N is the set of neighboring pixel pairs.
8. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 1, wherein the S3 includes:
S31: processing the Gaussian-filtered image using convolution kernels of five sizes to obtain the feature representation of the small target.
9. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 8, wherein the S31 includes:
s311: convolution kernels of five sizes, 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11, are used, and the number of the convolution kernels is 1, 2, 3, 4 and 5 respectively;
s312: the five groups of convolution kernels correspond to targets with the sizes of 1 × 1, 3 × 3, 5 × 5, 7 × 7 and 9 × 9 respectively, the numbers of the five groups of convolution kernels are 5, 4, 3, 2 and 1 respectively, and the numbers of corresponding feature mappings are 5, 4, 3, 2 and 1, so that the 15 feature mappings and the infrared original image form feature mappings of 16 channels;
s313: and serially connecting the image formed by the five groups of convolution kernels and the feature map, and inputting the image and the feature map into an intermediate maximum pooling layer, wherein the kernel size and the step size of the maximum pooling layer are set to be 2 so as to obtain the mapping features of 32 channels, and the mapping features of the 32 channels are the feature representation of the small target.
10. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 1, wherein the S4 includes:
S41: performing convolution and pooling on the feature representation of the small target to obtain the feature map of the intermediate layer;
S42: down-sampling the feature representation of the small target, decomposing the pixels at each position into 4 sub-images during the down-sampling;
S43: performing tensor splicing on the feature map of the intermediate layer and the 4 sub-images.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210284099.7A CN114842235B (en) | 2022-03-22 | 2022-03-22 | Infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114842235A true CN114842235A (en) | 2022-08-02 |
CN114842235B CN114842235B (en) | 2024-07-16 |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108257133A (en) * | 2016-12-28 | 2018-07-06 | 南宁市浩发科技有限公司 | A kind of image object dividing method |
CN110110675A (en) * | 2019-05-13 | 2019-08-09 | 电子科技大学 | A kind of wavelet field of combination of edge information divides shape infrared cirrus detection method |
Non-Patent Citations (3)
Title |
---|
HONG ZHANG et al.: "Infrared small target detection based on local intensity and gradient properties", Infrared Physics & Technology, vol. 89, 30 December 2017, pages 88-96 |
ZUJING YAN et al.: "Multi-Scale Infrared Small Target Detection Method via Precise Feature Matching and Scale Selection Strategy", IEEE Access, vol. 8, 28 February 2020, pages 48660-48672, XP011778751, DOI: 10.1109/ACCESS.2020.2976805 |
XIE QINLAN et al.: "Application of an adaptive shape-constrained graph cuts algorithm in abdominal CT image segmentation", Journal of South-Central University for Nationalities (Natural Science Edition), vol. 38, no. 1, 15 March 2019, pages 119-125 |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117808685A (en) * | 2024-02-29 | 2024-04-02 | 广东琴智科技研究院有限公司 | Method and device for enhancing infrared image data |
CN117808685B (en) * | 2024-02-29 | 2024-05-07 | 广东琴智科技研究院有限公司 | Method and device for enhancing infrared image data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||