CN114842235A - Infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation

Infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation

Info

Publication number
CN114842235A
CN114842235A
Authority
CN
China
Prior art keywords
target
infrared
feature
image
shape prior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210284099.7A
Other languages
Chinese (zh)
Inventor
秦翰林
欧洪璇
延翔
罗国慧
张昱赓
孙鹏
陈嘉欣
冯冬竹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202210284099.7A priority Critical patent/CN114842235A/en
Publication of CN114842235A publication Critical patent/CN114842235A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/42 Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation, which comprises the following steps: performing a Gaussian filtering operation on the input infrared original image to enhance dim and small targets; performing shape prior-based segmentation on the Gaussian-filtered infrared image to obtain target candidate regions; cropping the target candidate regions and inputting them into a multi-scale feature extraction module to obtain a feature representation of the small target; inputting the feature representation of the small target into a feature aggregation network to obtain a tensor-spliced image; and performing batch normalization and a nonlinear transformation on the tensor-spliced image, then outputting the target classification result through Softmax. The shape prior-based segmentation module makes full use of prior information about small targets to obtain suspicious target regions and reduces the overall number of parameters to improve algorithm efficiency, while the multi-scale feature extraction and aggregation module provides a sufficient number of feature channels for small targets to ensure their detectability.

Description

Infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation
Technical Field
The invention belongs to the technical field of infrared target detection, and particularly relates to an infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation.
Background
Infrared imaging detection technology can be used to detect and track unmanned aerial vehicles and is an effective technical means for UAV surveillance. In practical scenes, however, because of long-distance imaging and atmospheric radiation interference, the target has a low signal-to-noise ratio, occupies only a few pixels, lacks shape, texture and structural information, and is easily disturbed by complex background clutter and random noise, so conventional target detection and recognition algorithms cannot balance detection accuracy and detection efficiency.
To solve the infrared dim and small target detection problem, two classes of methods are mainly used at present: single-frame methods and multi-frame methods. Because multi-frame detection algorithms generally perform segmentation based on prior information such as the shape of the small target and the continuity of its gray-level motion trajectory, they consume more time than single-frame detection algorithms and are therefore not suitable for real-time applications. The present invention mainly concerns single-frame detection algorithms.
Existing single-frame detection methods are broadly divided into traditional detection algorithms and neural network-based detection algorithms. Most traditional detection methods rely on prior knowledge of the target: they improve target contrast by suppressing the background and enhancing the target, and then extract the target by adaptive threshold segmentation; owing to noise and the lack of robust features, traditional algorithms suffer from a high false alarm rate. Most deep learning detection methods obtain candidate regions through an anchor mechanism, then share parameters and unify classification and regression to obtain the detection result; however, such methods are mainly designed for general object detection and still perform unsatisfactorily on dim and small targets with a low signal-to-noise ratio and extremely few pixels.
Fan et al., in the literature "Fan Zunlin, Bi Duyan, et al. Dim infrared image enhancement based on convolutional neural network [J]. Neurocomputing, 2018, 272: 396-404", address the target blur and background complexity caused by the long shooting distances common in current infrared imaging systems and propose a convolutional neural network enhancement method that suppresses background clutter while enhancing the small target; handwritten characters from the MNIST dataset are used to simulate the difference between the foreground and the background of infrared images, the small target and the background are predicted, and the contrast of weak infrared images in which background clutter submerges the small target is improved. However, this method does not consider the influence of the noise present in the imaging system itself on detection, so Deng et al. of the National University of Defense Technology, in the literature "Deng Q, Lu H, Tao H, et al. Multi-scale convolutional neural networks for space infrared point objects discrimination [J]. IEEE Access, 2019: 1-1", propose a multi-scale convolutional neural network, considering that the noise of the infrared detection system and the lack of robust features make the detection of space objects at long observation distances difficult. The network structure proposed by this method consists of three parts: transformation, partial convolution and full convolution; small-target training data are generated with an infrared radiation model, and the inherent properties of small targets in different scenes are taken into account. The method can improve system performance and is more robust to the noise of the detection system. Nevertheless, most current deep learning detection methods treat infrared dim and small target detection as a binary classification or saliency detection problem, and their detection accuracy on dim and small targets is still not ideal.
Disclosure of Invention
In order to solve the above problems in the prior art, the invention provides an infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation. The technical problem to be solved by the invention is achieved through the following technical solution:
An infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation comprises the following steps:
performing a Gaussian filtering operation on the input infrared original image to enhance dim and small targets;
performing shape prior-based segmentation on the Gaussian-filtered infrared image to obtain a target candidate region;
cropping the target candidate region and inputting it into a multi-scale feature extraction module to obtain a feature representation of the small target;
inputting the feature representation of the small target into a feature aggregation network to obtain a tensor-spliced image;
and performing batch normalization and a nonlinear transformation on the tensor-spliced image, and outputting the target classification result through Softmax.
In an embodiment of the present invention, S1 includes:
S11: preprocessing the infrared original image with a 3 x 3 Gaussian kernel template;
S12: performing the Gaussian filtering operation with the imfilter() function.
In one embodiment of the present invention, the enhanced image after the Gaussian filtering operation is expressed as:
F(x, y) = I(x, y) ⊗ G(x, y)
wherein (x, y) are the coordinates of a pixel in the infrared original image, I denotes the infrared original image, G denotes the Gaussian kernel, ⊗ denotes convolution, and F denotes the enhanced image.
In an embodiment of the present invention, S2 includes:
S21: processing the Gaussian-filtered infrared image with a shape prior-based segmentation algorithm to exclude large-size continuous background regions and high-energy noise of a preset number of pixels;
S22: in the image from which the large-size continuous background regions and the high-energy noise of a preset number of pixels have been excluded, selecting point regions with an aspect ratio of no more than 2 to fit the suspicious target, and segmenting the region around the fitted boundary to exclude strip-shaped edge regions, thereby obtaining the target candidate region.
In an embodiment of the present invention, S21 includes:
S211: fusing the prior shape information of the dim and small target into an energy function so as to exclude large-size continuous background regions and high-energy noise of a preset number of pixels.
In an embodiment of the present invention, S211 includes:
S2111: combining the energy of the shape prior into an energy function;
S2112: excluding large-size continuous background regions and high-energy noise of a preset number of pixels through the energy function, the energy of the region term and the energy of the boundary term.
In one embodiment of the present invention, the expression of the energy function is:
E(L) = R(L) + B(L) + E_shape
where R(L) is the region term, B(L) is the boundary term, and E_shape is the shape prior term;
the combined energy expression of the region term and the boundary term is as follows:
E(L) = αR(L) + B(L);
the energy expression of the boundary term is as follows:
B(L) = Σ_{(p,q)∈N} B_{p,q}·δ(l_p ≠ l_q), where N is the set of neighboring pixel pairs, B_{p,q} is the boundary penalty between neighboring pixels p and q, and δ(l_p ≠ l_q) equals 1 when the labels differ and 0 otherwise
wherein α is a relative importance factor between the region term and the boundary term, and R_p(l_p) is the weight of assigning label l_p to pixel p.
In an embodiment of the present invention, S3 includes:
S31: processing the Gaussian-filtered image with convolution kernels of five sizes to obtain the feature representation of the small target.
In an embodiment of the present invention, S31 includes:
S311: using convolution kernels of five sizes, 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11, the numbers of which are 1, 2, 3, 4 and 5 respectively;
S312: the five groups of convolution kernels correspond to targets with sizes of 1 × 1, 3 × 3, 5 × 5, 7 × 7 and 9 × 9 respectively, the numbers of the five groups of convolution kernels are 5, 4, 3, 2 and 1 respectively, and the numbers of the corresponding feature maps are 5, 4, 3, 2 and 1, so that the 15 feature maps together with the infrared original image form a 16-channel feature map;
S313: concatenating the image and the feature maps produced by the five groups of convolution kernels and inputting them into an intermediate max pooling layer whose kernel size and stride are both set to 2, so as to obtain 32-channel mapped features, which constitute the feature representation of the small target.
In an embodiment of the present invention, S4 includes:
S41: performing convolution and pooling on the feature representation of the small target to obtain a feature map of the intermediate layer;
S42: downsampling the feature representation of the small target, decomposing pixels at the same positions into 4 sub-images during the downsampling;
S43: performing tensor splicing on the feature map of the intermediate layer and the 4 sub-images.
The invention has the following beneficial effects:
The shape prior-based segmentation module makes full use of the prior information of dim and small targets to obtain suspicious target regions and reduces the global number of parameters to improve algorithm efficiency, while the multi-scale feature extraction and aggregation module provides a sufficient number of feature channels for dim and small targets, which ensures their detectability and significantly reduces the false alarm rate at a high recall rate.
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Drawings
Fig. 1 is a schematic flow chart of an infrared small and weak target identification method based on shape prior segmentation and multi-scale feature aggregation according to an embodiment of the present invention;
fig. 2 is a structural diagram of an infrared small and weak target identification method based on shape prior segmentation and multi-scale feature aggregation according to an embodiment of the present invention.
Detailed Description
To further explain the technical means adopted by the present invention to achieve the intended purpose and their effects, the infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
The foregoing and other technical matters, features and effects of the present invention will be apparent from the following detailed description of the embodiments, which is to be read in connection with the accompanying drawings. The technical means and effects of the present invention adopted to achieve the predetermined purpose can be more deeply and specifically understood through the description of the specific embodiments, however, the attached drawings are provided for reference and description only and are not used for limiting the technical scheme of the present invention.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that an article or device that comprises a list of elements does not include only those elements but may include other elements not expressly listed. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in the article or device comprising the element.
The present invention will be described in further detail with reference to specific examples, but the embodiments of the present invention are not limited thereto.
Example one
Referring to fig. 1 and fig. 2, fig. 1 is a schematic flow chart of an infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to an embodiment of the present invention, and fig. 2 is a structural diagram of the method. With reference to fig. 1 and fig. 2, the method includes:
S1: performing a Gaussian filtering operation on the input infrared original image to enhance dim and small targets.
Specifically, step S1 includes:
S11: preprocessing the infrared original image with a 3 x 3 Gaussian kernel template;
S12: performing the Gaussian filtering operation with the imfilter() function.
First, the input infrared original image is preprocessed with a 3 x 3 Gaussian kernel template, and the Gaussian filtering operation is performed with the imfilter() function, i.e. the value of each pixel of the whole image is replaced by a weighted average of itself and the pixels in its neighborhood. This effectively suppresses the background region, enhances the weak target to be detected, and facilitates subsequent detection and identification. The Gaussian-filtered enhanced image is expressed as:
F(x, y) = I(x, y) ⊗ G(x, y)
wherein (x, y) are the coordinates of a pixel in the infrared original image, I denotes the infrared original image, G denotes the Gaussian kernel, ⊗ denotes convolution, and F denotes the enhanced image.
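The imfilter() function referenced above is MATLAB's spatial filtering routine; for illustration, a minimal Python sketch of the same 3 x 3 Gaussian smoothing step is given below. The standard deviation of 0.85 and the replicate-style border handling are assumptions, since the patent only specifies the 3 x 3 template size.

```python
import numpy as np
from scipy.ndimage import convolve

def gaussian_kernel_3x3(sigma=0.85):
    # Normalized 3 x 3 Gaussian kernel; sigma is an assumed value.
    ax = np.array([-1.0, 0.0, 1.0])
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def enhance(infrared_image):
    # Replace each pixel by the weighted average of its 3 x 3 neighborhood,
    # which suppresses background clutter and enhances the dim small target.
    kernel = gaussian_kernel_3x3()
    return convolve(infrared_image.astype(np.float32), kernel, mode="nearest")
```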
S2: performing shape prior-based segmentation on the Gaussian-filtered infrared image to obtain a target candidate region.
Specifically, step S2 includes:
S21: processing the Gaussian-filtered infrared image with a shape prior-based segmentation algorithm to exclude large-size continuous background regions and high-energy noise of a preset number of pixels.
Specifically, step S21 includes:
S211: fusing the prior shape information of the dim and small target into an energy function so as to exclude large-size continuous background regions and high-energy noise of a preset number of pixels.
Specifically, step S211 includes:
S2111: combining the energy of the shape prior into an energy function;
S2112: excluding large-size continuous background regions and high-energy noise of a preset number of pixels through the energy function, the energy of the region term and the energy of the boundary term.
S22: in the image from which the large-size continuous background regions and the high-energy noise of a preset number of pixels have been excluded, selecting point regions with an aspect ratio of no more than 2 to fit the suspicious target, and segmenting the region around the fitted boundary to exclude strip-shaped edge regions, thereby obtaining the target candidate region.
First, the Gaussian-filtered infrared image is processed with a shape prior-based segmentation algorithm to exclude large-size continuous background regions and high-energy noise of a preset number of pixels.
Specifically, according to the characteristic analysis of the target, there are three types of regions in the infrared image: large continuous background regions (such as cloud layers), strip-shaped edge regions (long strips), and small dot regions, the latter being where a dim target may exist. The prior shape information of the dim and small target is integrated into an energy function to generate candidate regions with a higher recall rate, i.e. to exclude large-size continuous background regions and high-energy noise of a preset number of pixels.
Fusing the prior shape information of the dim and small target into the energy function to exclude large-size continuous background regions and high-energy noise of a preset number of pixels specifically includes the following steps. Based on the gray-level information of the infrared image, an edge is defined by the Euclidean distance between the gray values of two pixels. A distance function φ is therefore introduced to express the shape prior, where φ(x, y) represents the minimum Euclidean distance between a point (x, y) and an edge. Since the drone target can be approximately represented by a rectangle with an aspect ratio of no more than 2, a rectangular shape template is constructed. The energy term of the shape template is denoted by E_shape, and the energy of the shape prior is combined into the energy function:
E(L) = R(L) + B(L) + E_shape
where R(L) is the region term, B(L) is the boundary term, and E_shape is the shape prior term. The combined energy formula for the region term and the boundary term is expressed as follows:
E(L) = αR(L) + B(L);
B(L) = Σ_{(p,q)∈N} B_{p,q}·δ(l_p ≠ l_q), where N is the set of neighboring pixel pairs, B_{p,q} is the boundary penalty between neighboring pixels p and q, and δ(l_p ≠ l_q) equals 1 when the labels differ and 0 otherwise.
where α is the relative importance factor between the region term and the boundary term, and R_p(l_p) is the weight of assigning label l_p to pixel p. The weight R_p(l_p) is obtained by comparing the intensity of pixel p with the intensity models (given histograms) of the target and the background: when a pixel is more likely to belong to the target, the weight of assigning that pixel to the target should be smaller, which reduces the energy in the equation, so the region term is minimized when all pixels are correctly assigned to the target and the background. The energy of the shape prior is represented by selecting the approximate center pixel of the target as a reference point and, according to the distance of each pixel from the center pixel, setting the pixel values around the center point to 255 and the other values to 0. The new image is then regarded as the shape template of the target and incorporated into the energy function. In this way, large-size continuous background regions (clouds and buildings) and high-energy noise of about a few pixels (e.g. hot pixels) can be excluded, because the value of the central part of the target is relatively larger than that elsewhere in the template.
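The patent gives the energy only in the abstract form E(L) = R(L) + B(L) + E_shape, so the sketch below merely illustrates how such an energy could be evaluated for one candidate binary labeling. The histogram-based region weights, the Gaussian-weighted boundary penalties and the template-difference shape term are illustrative assumptions rather than the patent's exact formulation.

```python
import numpy as np

def shape_template(shape, center, half_size=2):
    # Rectangular shape prior: pixels near the assumed target center are set
    # to 255 and everything else to 0 (half_size is an assumed template radius).
    tpl = np.zeros(shape, dtype=np.float32)
    cy, cx = center
    tpl[max(cy - half_size, 0):cy + half_size + 1,
        max(cx - half_size, 0):cx + half_size + 1] = 255.0
    return tpl

def segmentation_energy(labels, image, target_hist, background_hist,
                        template, alpha=1.0, sigma=10.0):
    # Toy evaluation of E(L) = alpha*R(L) + B(L) + E_shape for a binary
    # labeling (0 = background, 1 = target); image is an integer gray image.
    gray = image.astype(np.float32)

    # Region term: negative log-likelihood of each pixel under the intensity
    # histogram of the class it is assigned to.
    p_t = target_hist[image]
    p_b = background_hist[image]
    region = np.where(labels == 1, -np.log(p_t + 1e-6), -np.log(p_b + 1e-6)).sum()

    # Boundary term: penalize label changes between neighboring pixels,
    # weighted by their gray-level similarity.
    def pair_cost(a, b, la, lb):
        return (np.exp(-((a - b) ** 2) / (2 * sigma ** 2)) * (la != lb)).sum()

    boundary = (pair_cost(gray[:, 1:], gray[:, :-1], labels[:, 1:], labels[:, :-1]) +
                pair_cost(gray[1:, :], gray[:-1, :], labels[1:, :], labels[:-1, :]))

    # Shape prior term: disagreement between the labeling and the template.
    e_shape = np.abs(labels * 255.0 - template).sum() / 255.0

    return alpha * region + boundary + e_shape
```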
Then, in the image from which the large-size continuous background regions and the high-energy noise of a preset number of pixels have been excluded, point regions with an aspect ratio of no more than 2 are selected to fit the suspicious target, and the region around the fitted boundary is segmented to exclude strip-shaped edge regions, yielding the target candidate regions.
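As a rough illustration of this candidate-region step, the sketch below keeps only point-like connected regions whose bounding-box aspect ratio is no greater than 2 and crops a small window around each of them. The connected-component analysis and the 4-pixel crop margin are assumptions; the patent does not specify how the point regions are fitted.

```python
import numpy as np
from scipy import ndimage

def candidate_regions(binary_mask, image, max_aspect_ratio=2.0, pad=4):
    # Keep point regions (aspect ratio <= 2) and discard strip-shaped regions,
    # then crop a padded window around each surviving region.
    labeled, _ = ndimage.label(binary_mask)
    crops = []
    for sl in ndimage.find_objects(labeled):
        h = sl[0].stop - sl[0].start
        w = sl[1].stop - sl[1].start
        ratio = max(h, w) / max(min(h, w), 1)
        if ratio <= max_aspect_ratio:             # point region, not a strip
            y0 = max(sl[0].start - pad, 0)
            y1 = min(sl[0].stop + pad, image.shape[0])
            x0 = max(sl[1].start - pad, 0)
            x1 = min(sl[1].stop + pad, image.shape[1])
            crops.append(image[y0:y1, x0:x1])
    return crops
```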
S3: cropping the target candidate region and inputting it into the multi-scale feature extraction module to obtain the feature representation of the small target.
Specifically, step S3 includes:
S31: processing the Gaussian-filtered image with convolution kernels of five sizes to obtain the feature representation of the small target.
Specifically, step S31 includes:
S311: using convolution kernels of five sizes, 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11, the numbers of which are 1, 2, 3, 4 and 5 respectively;
S312: the five groups of convolution kernels correspond to targets with sizes of 1 × 1, 3 × 3, 5 × 5, 7 × 7 and 9 × 9 respectively, the numbers of the five groups of convolution kernels are 5, 4, 3, 2 and 1 respectively, and the numbers of the corresponding feature maps are 5, 4, 3, 2 and 1, so that the 15 feature maps together with the infrared original image form a 16-channel feature map;
S313: concatenating the image and the feature maps produced by the five groups of convolution kernels and inputting them into an intermediate max pooling layer whose kernel size and stride are both set to 2, so as to obtain 32-channel mapped features, which constitute the feature representation of the small target.
The Gaussian-filtered image is processed with convolution kernels of five sizes to obtain the feature representation of the small target, i.e. multi-scale information of the small target is extracted with multi-scale convolution kernels.
Extracting the multi-scale information of the small target with multi-scale convolution kernels specifically includes the following steps:
convolution kernels of five sizes, 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11, are used, and their numbers are 1, 2, 3, 4 and 5 respectively;
the above 15 convolution kernels correspond to targets with sizes of 1 × 1, 3 × 3, 5 × 5, 7 × 7 and 9 × 9 respectively, the numbers of the convolution kernels are 5, 4, 3, 2 and 1 respectively, and the numbers of the corresponding feature maps are 5, 4, 3, 2 and 1; that is, the 15 feature maps together with the original image form a 16-channel feature map, providing a sufficient number of feature channels for the dim and small target to ensure its detectability;
the image and the feature maps produced by the five groups of convolution kernels are concatenated and input into an intermediate max pooling layer whose kernel size and stride are both set to 2; the number of feature maps is doubled to obtain 32-channel mapped features while keeping the memory consumed during feature extraction as low as possible; these 32-channel mapped features constitute the feature representation of the small target.
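For illustration, a minimal PyTorch sketch of such a multi-scale extraction module follows. The 'same' padding, the split of the 15 kernels across the five sizes (the text lists the per-size counts in two different orders), and the extra 16-to-32-channel convolution placed before the pooling layer are assumptions, since max pooling alone would not change the channel count.

```python
import torch
import torch.nn as nn

class MultiScaleExtractor(nn.Module):
    # Five groups of convolution kernels (3x3 ... 11x11) whose outputs are
    # concatenated with the input image to form a 16-channel feature map,
    # followed by a stride-2 max pooling stage producing 32 channels.
    def __init__(self):
        super().__init__()
        sizes = [3, 5, 7, 9, 11]
        counts = [5, 4, 3, 2, 1]          # 15 kernels in total (assumed split)
        self.branches = nn.ModuleList(
            nn.Conv2d(1, c, kernel_size=k, padding=k // 2)
            for k, c in zip(sizes, counts)
        )
        # Assumed channel doubling; the patent only states that 32-channel
        # mapped features are obtained after the pooling stage.
        self.expand = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

    def forward(self, x):
        # x: (N, 1, H, W) cropped candidate region
        feats = [branch(x) for branch in self.branches]
        fused = torch.cat([x] + feats, dim=1)   # 1 + 15 = 16 channels
        return self.pool(self.expand(fused))    # 32-channel feature representation
```

Stacking all five kernel sizes on the same crop lets targets from roughly 1 × 1 to 9 × 9 pixels each find a matching receptive field.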
S4: inputting the feature representation of the small target into a feature aggregation network to obtain a tensor-spliced image.
Specifically, step S4 includes:
S41: performing convolution and pooling on the feature representation of the small target to obtain a feature map of the intermediate layer;
S42: downsampling the feature representation of the small target, decomposing pixels at the same positions into 4 sub-images during the downsampling;
S43: performing tensor splicing on the feature map of the intermediate layer and the 4 sub-images.
After the decomposition, the number of channels becomes 4 times larger while the spatial resolution is downsampled by a factor of 2, so that feature information at the higher resolution is effectively fused.
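A minimal PyTorch sketch of this aggregation step is given below. Pixel unshuffle is used here as one standard way of decomposing same-position pixels into 4 sub-images (2x spatial downsampling with 4x more channels), and the channel width of the convolution branch is an assumed value.

```python
import torch
import torch.nn as nn

class FeatureAggregation(nn.Module):
    # One branch applies convolution and pooling to obtain the intermediate
    # feature map; the other decomposes each 2x2 pixel group into 4 sub-images
    # (space-to-depth); the two results are tensor-spliced along the channels.
    def __init__(self, in_channels=32):
        super().__init__()
        self.conv_branch = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),  # assumed width
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.space_to_depth = nn.PixelUnshuffle(downscale_factor=2)

    def forward(self, x):
        # x: (N, 32, H, W) feature representation of the small target
        mid = self.conv_branch(x)             # (N, 64, H/2, W/2) intermediate map
        subs = self.space_to_depth(x)         # (N, 128, H/2, W/2): 4 sub-images per channel
        return torch.cat([mid, subs], dim=1)  # tensor splicing along channels
```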
S5: performing batch normalization and a nonlinear transformation on the tensor-spliced image, and outputting the target classification result through Softmax.
The tensor-spliced image obtained in S4 is batch-normalized before the nonlinear transformation is applied. Finally, the unmanned aerial vehicle target classification result is output through Softmax.
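The closing stage could look like the sketch below. The ReLU nonlinearity, the global average pooling, the 192 input channels (matching the aggregation sketch above) and the two-class target/background output are assumptions beyond what the patent states.

```python
import torch
import torch.nn as nn

class ClassificationHead(nn.Module):
    # Batch normalization, a nonlinear transformation and a Softmax over the
    # class scores, applied to the tensor-spliced feature map.
    def __init__(self, in_channels=192, num_classes=2):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.act = nn.ReLU(inplace=True)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, x):
        x = self.act(self.bn(x))                  # batch normalization + nonlinearity
        x = self.pool(x).flatten(1)               # collapse spatial dimensions
        return torch.softmax(self.fc(x), dim=1)   # class probabilities via Softmax
```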
In this embodiment, the shape prior-based segmentation module makes full use of the prior information of dim and small targets to obtain suspicious target regions and reduces the global number of parameters to improve algorithm efficiency, while the multi-scale feature extraction and aggregation module provides a sufficient number of feature channels for dim and small targets, which ensures their detectability and significantly reduces the false alarm rate at a high recall rate.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (10)

1. An infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation, characterized by comprising the following steps:
S1: performing a Gaussian filtering operation on the input infrared original image to enhance dim and small targets;
S2: performing shape prior-based segmentation on the Gaussian-filtered infrared image to obtain a target candidate region;
S3: cropping the target candidate region and inputting it into a multi-scale feature extraction module to obtain a feature representation of the small target;
S4: inputting the feature representation of the small target into a feature aggregation network to obtain a tensor-spliced image;
S5: performing batch normalization and a nonlinear transformation on the tensor-spliced image, and outputting the target classification result through Softmax.
2. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 1, wherein S1 includes:
S11: preprocessing the infrared original image with a 3 x 3 Gaussian kernel template;
S12: performing the Gaussian filtering operation with the imfilter() function.
3. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 2, wherein the enhanced image after the Gaussian filtering operation is expressed as:
F(x, y) = I(x, y) ⊗ G(x, y)
wherein (x, y) are the coordinates of a pixel in the infrared original image, I denotes the infrared original image, G denotes the Gaussian kernel, ⊗ denotes convolution, and F denotes the enhanced image.
4. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 1, wherein S2 includes:
S21: processing the Gaussian-filtered infrared image with a shape prior-based segmentation algorithm to exclude large-size continuous background regions and high-energy noise of a preset number of pixels;
S22: in the image from which the large-size continuous background regions and the high-energy noise of a preset number of pixels have been excluded, selecting point regions with an aspect ratio of no more than 2 to fit the suspicious target, and segmenting the region around the fitted boundary to exclude strip-shaped edge regions, thereby obtaining the target candidate region.
5. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 4, wherein S21 includes:
S211: fusing the prior shape information of the dim and small target into an energy function so as to exclude large-size continuous background regions and high-energy noise of a preset number of pixels.
6. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 5, wherein S211 includes:
S2111: combining the energy of the shape prior into an energy function;
S2112: excluding large-size continuous background regions and high-energy noise of a preset number of pixels through the energy function, the energy of the region term and the energy of the boundary term.
7. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 6, wherein the expression of the energy function is:
E(L) = R(L) + B(L) + E_shape
wherein R(L) is the region term, B(L) is the boundary term, and E_shape is the shape prior term;
the combined energy expression of the region term and the boundary term is as follows:
E(L) = αR(L) + B(L);
the energy expression of the boundary term is as follows:
B(L) = Σ_{(p,q)∈N} B_{p,q}·δ(l_p ≠ l_q), where N is the set of neighboring pixel pairs, B_{p,q} is the boundary penalty between neighboring pixels p and q, and δ(l_p ≠ l_q) equals 1 when the labels differ and 0 otherwise
wherein α is a relative importance factor between the region term and the boundary term, and R_p(l_p) is the weight of assigning label l_p to pixel p.
8. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 1, wherein S3 includes:
S31: processing the Gaussian-filtered image with convolution kernels of five sizes to obtain the feature representation of the small target.
9. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 8, wherein S31 includes:
S311: using convolution kernels of five sizes, 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11, the numbers of which are 1, 2, 3, 4 and 5 respectively;
S312: the five groups of convolution kernels correspond to targets with sizes of 1 × 1, 3 × 3, 5 × 5, 7 × 7 and 9 × 9 respectively, the numbers of the five groups of convolution kernels are 5, 4, 3, 2 and 1 respectively, and the numbers of the corresponding feature maps are 5, 4, 3, 2 and 1, so that the 15 feature maps together with the infrared original image form a 16-channel feature map;
S313: concatenating the image and the feature maps produced by the five groups of convolution kernels and inputting them into an intermediate max pooling layer whose kernel size and stride are both set to 2, so as to obtain 32-channel mapped features, which constitute the feature representation of the small target.
10. The infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation according to claim 1, wherein S4 includes:
S41: performing convolution and pooling on the feature representation of the small target to obtain a feature map of the intermediate layer;
S42: downsampling the feature representation of the small target, decomposing pixels at the same positions into 4 sub-images during the downsampling;
S43: performing tensor splicing on the feature map of the intermediate layer and the 4 sub-images.
CN202210284099.7A 2022-03-22 2022-03-22 Infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation Pending CN114842235A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210284099.7A CN114842235A (en) 2022-03-22 2022-03-22 Infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210284099.7A CN114842235A (en) 2022-03-22 2022-03-22 Infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation

Publications (1)

Publication Number Publication Date
CN114842235A true CN114842235A (en) 2022-08-02

Family

ID=82561648

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210284099.7A Pending CN114842235A (en) 2022-03-22 2022-03-22 Infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation

Country Status (1)

Country Link
CN (1) CN114842235A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117808685A (en) * 2024-02-29 2024-04-02 广东琴智科技研究院有限公司 Method and device for enhancing infrared image data
CN117808685B (en) * 2024-02-29 2024-05-07 广东琴智科技研究院有限公司 Method and device for enhancing infrared image data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination