CN115410012A - Method and system for detecting infrared small target in night airport clear airspace and application - Google Patents
- Publication number
- CN115410012A, CN202211359429.0A
- Authority
- CN
- China
- Prior art keywords
- target
- infrared small
- small
- feature
- infrared
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention belongs to the technical field of image recognition and discloses a method and a system for detecting infrared small targets in the clear airspace of an airport at night, and an application thereof. The method comprises: setting initial parameters of a heterogeneous parallel network model and inputting training-set images into the model for training to obtain a deep-learning-based infrared small target detection model; inputting an infrared small target image to be detected into the detection model, extracting features from the image through a heterogeneous parallel backbone network, and fusing the feature maps through a pixel aggregation network to obtain multi-layer feature maps containing target information; and processing the feature maps containing target information through a prediction structure to obtain the category and position of each target and to generate a target box for each target, so that the category and position of the predicted targets are displayed directly in the image. The invention is easy to deploy on hardware devices, and the model is simple to operate and easy to train.
Description
Technical Field
The invention belongs to the technical field of image recognition, and particularly relates to a method and a system for detecting infrared small targets in the clear airspace of an airport at night, and an application thereof.
Background
Infrared small target detection plays an important role in target early warning, ground monitoring and flight guidance. For visible-light images, existing target detection methods already achieve good performance. However, infrared images captured at night differ greatly from visible-light images: the long imaging distance, low contrast, weak texture, and small proportion of the image occupied by the target all make infrared small target detection more difficult. To provide more comprehensive protection and achieve 24-hour supervision of important sites, research on small target detection in night-time infrared images is particularly important.
Infrared small target detection can generally be divided into continuous-frame video target detection and single-frame image target detection. Continuous-frame video detection mainly exploits the correspondence of a target between consecutive frames of a video, whereas single-frame detection detects and identifies the target directly from the information about the target contained in one image. Video detection relies on prior information such as the shape of the target, the continuity of its trajectory, and spatio-temporal information. A fast-moving unmanned aerial vehicle, however, changes rapidly relative to the background in an infrared image, so trajectory continuity is difficult to guarantee and continuous-frame methods are difficult to apply. In contrast, single-frame detection only needs to compute target information within a single image, has significantly lower computational complexity than video detection, is easy to implement in hardware, and is widely used in infrared target detection. Single-frame infrared target detection methods can be roughly divided into model-driven methods and deep-learning-based methods.
Model-driven single-frame infrared target detection methods generally model the target point: a small target in the infrared image is regarded as an outlier relative to the highly correlated background pixels and is marked as a target. A common disadvantage of model-driven methods is that their detection performance is limited when the background is a mixture of buildings, trees, vehicles and the like, and they have difficulty meeting real-time detection requirements. With the development of computer vision, more and more infrared image target detection methods are based on deep learning. Such methods meet real-time detection requirements, and their detection performance keeps improving as machine vision technology advances.
Through the above analysis, the problems and defects of the prior art are as follows:
(1) In the prior art, both the detection speed and the accuracy for infrared small targets in the clear airspace of an airport at night are low.
(2) Prior-art methods are poorly suited to application and deployment in different environments.
(3) The prior art cannot effectively increase the proportion of small-target information in an image and therefore cannot provide effective support for subsequent feature extraction and target detection.
(4) The prior art cannot effectively fuse the feature maps obtained from three different channels, so the feature maps carry little target information and cannot provide effective support for subsequent target detection.
Disclosure of Invention
In order to overcome the problems in the related art, the disclosed embodiments of the invention provide a method and a system for detecting infrared small targets in the clear airspace of an airport at night, and an application thereof.
The technical scheme is as follows: the method for detecting infrared small targets in the clear airspace of an airport at night comprises the following steps:
S1: setting initial parameters of a heterogeneous parallel network model, inputting the training-set images of an infrared small target database into the heterogeneous parallel network model with the set parameters for training, and obtaining a deep-learning infrared small target detection model;
S2: inputting the infrared small target image to be detected into the infrared small target detection model, extracting features from the image through the heterogeneous parallel backbone network, and concatenating the feature maps of three different channels, the three channels being: a Similarity Object Enhancement (SOE) module channel, a general feature extraction structure channel, and a backbone network structure channel based on a simple, parameter-free attention mechanism (Simple Attention Module, SimAM);
S3: concatenating part of the obtained multi-layer feature maps through a pixel aggregation network to obtain multi-layer feature maps from which targets of different sizes can be predicted;
S4: determining the category and position of each target through the prediction structure, obtaining the confidence of each infrared small target, and generating a target prediction box for each target, thereby predicting the category and position information of the targets;
S5: using non-maximum suppression, deleting from the obtained target prediction boxes those with low scores as calculated by the IoU (intersection-over-union) formula to obtain the final target boxes, and storing and displaying the category and position information of the targets.
In one embodiment, in step S1, the heterogeneous parallel network model builds a VGG-style heterogeneous parallel backbone network from basic Focus, C3, and SPP modules, generates multi-layer fused feature maps in combination with a pixel aggregation network, and predicts the category and position of each target with a prediction network. The position information mapped back to the original image is corrected according to an L1 loss function to obtain the final classification and accurate position of the target. The target prediction box loss function is calculated as shown in formula (1):
[Formula (1), not reproduced in this text]
The terms of formula (1) include the probability that a target position returns to the correct position after being mapped to the original image, and an influence range factor. Because the loss is computed with the L1 loss function, the loss value is insensitive to outliers and abnormal values and the gradient changes relatively little, so the model is less likely to drift away from the optimum during training, and a plurality of target bounding boxes with smaller deviation are obtained.
The initial parameters of the heterogeneous parallel network model include the number of network layers and the weights and bias values of the neurons in each layer.
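The outlier-insensitivity claimed for the L1 loss can be illustrated with a small numerical comparison. The sketch below is not formula (1), whose terms are not available here; it is a plain mean-absolute-error box loss compared against a squared-error loss. The function names and the box coordinates are hypothetical.

```python
def l1_box_loss(pred, target):
    """Mean absolute error over box coordinates (a generic L1 loss, not formula (1))."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def l2_box_loss(pred, target):
    """Mean squared error over box coordinates, shown only for comparison."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Hypothetical ground-truth box and two predictions, as (x1, y1, x2, y2).
target = [10.0, 10.0, 14.0, 14.0]
close = [10.5, 9.5, 14.5, 13.5]     # every coordinate off by 0.5
outlier = [10.5, 9.5, 14.5, 33.5]   # one coordinate off by 19.5

print(l1_box_loss(close, target), l1_box_loss(outlier, target))
print(l2_box_loss(close, target), l2_box_loss(outlier, target))
```

With one badly wrong coordinate, the L1 value grows linearly with the error while the squared-error value grows quadratically, which is the sense in which the L1 loss is less dominated by outliers.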
In one embodiment, in step S2, the feature maps obtained by the similarity target enhancement module channel, the general feature extraction structure, and the backbone network structure based on the parameter-free simple attention mechanism (SimAM) of the heterogeneous parallel network model are concatenated, so as to increase the proportion and importance of target information in the feature maps during convolution.
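The text does not spell out how the SimAM channel weights a feature map. As a rough illustration, the sketch below follows the commonly published SimAM formulation (an energy term built from each value's squared deviation from the channel mean, passed through a sigmoid) and then stacks three channel outputs. The function names, the regularizer `lam`, the toy input, and the placeholder used for the SOE channel are all assumptions.

```python
import math


def simam_weights(channel, lam=1e-4):
    """Per-position SimAM attention weights for one 2-D channel (list of rows)."""
    values = [v for row in channel for v in row]
    n = len(values) - 1                      # needs at least two positions
    mean = sum(values) / len(values)
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    var = sum(s for row in sq_dev for s in row) / n
    return [
        [1.0 / (1.0 + math.exp(-(s / (4.0 * (var + lam)) + 0.5))) for s in row]
        for row in sq_dev
    ]


def simam(channel, lam=1e-4):
    """Multiply each value by its SimAM weight."""
    weights = simam_weights(channel, lam)
    return [[v * w for v, w in zip(row, w_row)]
            for row, w_row in zip(channel, weights)]


# Toy 3 x 3 channel: uniform background with one brighter "target" value.
x = [[0.2, 0.2, 0.2],
     [0.2, 1.0, 0.2],
     [0.2, 0.2, 0.2]]

plain = x                  # stand-in for the general convolution channel
attended = simam(x)        # SimAM-style channel
soe_placeholder = x        # placeholder only; the SOE channel is described separately

# "Splicing" here means stacking the three outputs along the channel axis.
fused = [plain, attended, soe_placeholder]
w = simam_weights(x)
print(len(fused), w[1][1], w[0][0])
```

Positions that deviate strongly from the channel mean receive a larger weight, so the bright value is attenuated less than the background.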
In one embodiment, in step S5, the score of a target box is calculated by the IoU formula, defined as the ratio between the intersection and the union of two boxes. The IoU is calculated as shown in formula (2):
$$\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|} \tag{2}$$
where $A$ and $B$ denote the two boxes, $A \cup B$ denotes their union region, and $A \cap B$ denotes their intersection region.
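Formula (2) and the non-maximum suppression of step S5 can be sketched in a few lines. This is a minimal greedy version, assuming boxes are given as corner coordinates `(x1, y1, x2, y2)`; the function names, the threshold value, and the example boxes are illustrative and not taken from the patent.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical predictions: two overlapping boxes on one target, one separate box.
boxes = [(0, 0, 4, 4), (1, 1, 5, 5), (10, 10, 12, 12)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.3)
print(kept)
```

The returned list holds the indices of the surviving boxes in descending score order; the threshold controls how much overlap is tolerated before a lower-scoring box is discarded.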
In one embodiment, the similarity target enhancement module first partitions the H × W feature map into 2 × 2 small blocks and, taking each small block as the center, extends it to a 4 × 4 large block. The Wasserstein distance between each small block and its corresponding large block is then calculated. The Wasserstein distance is calculated as shown in formula (3):
[Formula (3), not reproduced in this text]
The distance between two rectangular blocks is defined approximately as:
[Formula (4), not reproduced in this text]
and simplified as follows:
[Formula (5), not reproduced in this text]
where the parameters of formula (5) are the means and variances of the feature points within the small-block region and the large-block region, respectively. The Wasserstein distance measures the distance between two distributions and is used here as their similarity: the distance between the central block and its surrounding region is taken as the similarity between the two. The larger the value, the higher the similarity between the small block and the large block and the more likely the small block is background; the smaller the value, the lower the similarity and the more likely the small block is a target. Computing block by block yields the similarity matrix W_soe (Wasserstein-SOE).
The feature map input to the similarity target enhancement module is divided into 2 × 2 blocks, and for each 2 × 2 block the corresponding 4 × 4 block radiating outward from it is obtained. The 2 × 2 block is slid with a stride of 1, and formula (5) is evaluated at each position to obtain W_soe.
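The block-wise computation can be sketched as follows. Because formulas (3) to (5) are not available in this text, the sketch assumes that each block is summarized by the mean and standard deviation of its values and uses the closed-form 2-Wasserstein distance between two one-dimensional Gaussians, sqrt((mu1 - mu2)^2 + (sigma1 - sigma2)^2). The edge padding, the output size, and all function names are also assumptions. The sketch only produces the raw distance map; how the module turns such values into weights is left to the description itself.

```python
import math


def mean_std(values):
    """Mean and (population) standard deviation of a flat list of values."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, math.sqrt(var)


def pad_edge(fmap, before, after):
    """Replicate border values so that every 4 x 4 window is defined."""
    rows = [[r[0]] * before + list(r) + [r[-1]] * after for r in fmap]
    top = [rows[0][:] for _ in range(before)]
    bottom = [rows[-1][:] for _ in range(after)]
    return top + rows + bottom


def window(grid, top, left, size):
    """Flatten the size x size window whose top-left corner is (top, left)."""
    return [grid[r][c] for r in range(top, top + size)
            for c in range(left, left + size)]


def wasserstein_map(fmap):
    """Distance between each 2 x 2 block and the 4 x 4 block centered on it (stride 1)."""
    h, w = len(fmap), len(fmap[0])
    padded = pad_edge(fmap, 1, 2)
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            m_s, s_s = mean_std(window(padded, i + 1, j + 1, 2))
            m_l, s_l = mean_std(window(padded, i, j, 4))
            row.append(math.sqrt((m_s - m_l) ** 2 + (s_s - s_l) ** 2))
        out.append(row)
    return out


# Toy 6 x 6 single-channel feature map with a bright 2 x 2 patch at rows/cols 2-3.
fmap = [[0.1] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (2, 3):
        fmap[r][c] = 1.0

dist = wasserstein_map(fmap)
flat = wasserstein_map([[0.1] * 6 for _ in range(6)])
print(round(dist[2][2], 3), round(dist[0][0], 3))
```

On a uniform map every distance is numerically zero, and on the toy map the largest value sits on the block that exactly covers the bright patch.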
In one embodiment, increasing the proportion and importance of target information in the feature maps during convolution further comprises:
inputting a feature map F of size H × W × C and forming the similarity matrix W_soe from the set small blocks. Then W_soe' = 1/W_soe, called the Wasserstein similarity, is taken, giving a Wasserstein similarity matrix W_soe' that is positively correlated with the target. The matrix W_soe' is normalized and activated with a Sigmoid function to obtain a matrix W, which is combined with the original input feature map so that the feature values of the target remain basically unchanged. This yields a new feature map F' that carries the target-information weights and in which the target information is enhanced.
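The weighting chain described above (reciprocal, Sigmoid, combination with the input) can be sketched for a single channel. The text does not say how W is combined with F, so element-wise multiplication is assumed here; the small constant `eps` that guards against division by zero, the function name, and the toy values are likewise assumptions.

```python
import math


def enhance(feature, w_soe, eps=1e-6):
    """Return F' = F * sigmoid(1 / W_soe), element-wise, for one 2-D channel."""
    out = []
    for f_row, w_row in zip(feature, w_soe):
        new_row = []
        for f, w in zip(f_row, w_row):
            w_inv = 1.0 / (w + eps)                  # W_soe' = 1 / W_soe
            weight = 1.0 / (1.0 + math.exp(-w_inv))  # Sigmoid activation
            new_row.append(f * weight)
        out.append(new_row)
    return out


# Hypothetical values: a small W_soe entry (target-like) and a large one (background-like).
feature = [[1.0, 1.0]]
w_soe = [[0.01, 100.0]]
enhanced = enhance(feature, w_soe)
print(enhanced)
```

Under these assumptions a small W_soe entry yields a weight close to 1, leaving the feature value almost unchanged, while a large entry yields a weight close to 0.5, which attenuates the background relative to the target. For an H × W × C feature map the same weight matrix would be applied to each channel.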
In one embodiment, in the feature fusion stage of the heterogeneous parallel network model, a pixel aggregation network structure performs top-down and bottom-up feature fusion, fusing multi-layer feature maps from different depths to obtain multi-layer feature maps of different sizes. Feature maps of different depths correspond to targets of different sizes, and each feature map responds to targets of a particular size.
In one embodiment, the heterogeneous parallel network model performs pixel-by-pixel prediction on the feature maps in the prediction structure to obtain potential targets, and classifies them to obtain the category information of each target. The target position on the feature map is mapped back to the original image through a regression strategy to obtain the position of the target in the original image, giving the category and approximate position of the target. Finally, the non-optimal target boxes are deleted by the non-maximum suppression of step S5 to obtain the final accurate target box positions, and the position and classification information are stored and displayed.
Another object of the present invention is to provide a system for implementing the method for detecting infrared small targets in the clear airspace of an airport at night, the system comprising:
a deep-learning infrared small target detection model acquisition module, used for setting initial parameters of the heterogeneous parallel network model, inputting the training-set images of the infrared small target database into the heterogeneous parallel network model with the set parameters for training, and obtaining a deep-learning infrared small target detection model;
a different-channel feature map splicing module, used for inputting the infrared small target image to be detected into the infrared small target detection model, extracting features from the image through the heterogeneous parallel backbone network, and concatenating the feature maps from three different channels;
a predictable-target feature map acquisition module, used for concatenating part of the obtained multi-layer feature maps through a pixel aggregation network to obtain multi-layer feature maps, containing target information of different sizes, from which targets can be predicted;
a target category and position information acquisition module, used for determining the category and position of each target through the prediction structure, obtaining the confidence of each infrared small target, generating a target box for each target, and predicting the category and position information of the targets;
a target box acquisition module, used for deleting the non-optimal target boxes among the obtained target prediction boxes by non-maximum suppression to obtain the target boxes, and for storing and displaying the category and position information of the targets.
Another object of the present invention is to provide a computer device comprising a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to execute the method for detecting infrared small targets in the clear airspace of an airport at night.
Another object of the present invention is to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to execute the method for detecting infrared small targets in the clear airspace of an airport at night.
Taking all of the above technical schemes together, the advantages and positive effects of the invention are as follows:
First, with respect to the technical problems of the prior art and the difficulty of solving them, the technical scheme of the invention is closely tied to the results and data obtained during research and development, and solving these problems brings inventive technical effects. Specifically:
The method first sets the initial parameters of the heterogeneous parallel network model, inputs the training-set images of an infrared small target database into the model with the set parameters for training, and obtains a deep-learning-based infrared small target detection model. An infrared small target image to be detected is then input into the trained detection model, features are extracted from the image through the heterogeneous parallel backbone network, and the feature maps are fused through a pixel aggregation network to obtain multi-layer feature maps containing target information. The feature maps containing target information are then processed through the prediction structure to obtain the category and position of each target and to generate a target box for each target, so that the category and position of the predicted targets are displayed directly in the image. The invention adapts well to infrared small targets and detects them well, with a detection accuracy of 80.0%; detection is fast, at 31.2 frames per second; the model is small, at only 30.5M, and is easy to deploy on hardware devices; and the model is simple to operate and easy to train.
Second, viewing the technical scheme as a whole or from the product perspective, the technical effects and advantages of the scheme to be protected are as follows:
The method combines a model-driven approach with a deep-learning approach. First, an SOE module is proposed on the model-driven principle to increase the difference between target and background, so that more potential target information is deliberately captured during convolution; the module is self-contained and can be used flexibly within a neural network. In addition, a heterogeneous parallel backbone network structure is built on the convolutional neural network principle: the feature maps obtained by the similarity target enhancement module, by the general feature extraction structure, and by the SimAM-based backbone network structure are concatenated, increasing the proportion of target information in the feature maps during convolution. The effectiveness of the method has been verified in a large number of experiments on an infrared small target data set.
The advantages of the present invention over the prior art further include the following. The overall model performs well for infrared target detection. The proposed method for detecting infrared small targets in the clear airspace of an airport at night is fast and accurate: for input images of 256 × 256 pixels it reaches a detection speed of 31.2 frames per second and an accuracy of 80.0%, which is sufficient for real-time detection at important sites, so that activity at these sites can be monitored in real time and their safety ensured.
The overall model is suitable for application and deployment in a variety of environments. The data set used by the method contains infrared small targets captured from different viewing angles, in different scenes, and at different distances, so the method adapts better to infrared targets in different environments and its range of application is broadened.
The SOE module provided by the invention effectively increases the proportion of small-target information in the image and provides effective support for subsequent feature extraction and target detection. Using the relationship between the means and variances of local feature blocks, the SOE module highlights small-target information in every channel of the feature map, so that subsequent feature extraction and feature fusion are deliberately biased toward learning more small-target information, which provides effective technical support for the subsequent detection of infrared small targets. The module is highly flexible and can be applied at any stage of the network as required.
The heterogeneous parallel backbone network structure effectively fuses the feature maps obtained from three different channels and increases the amount of target information. The structure comprises three different convolution channels, namely a general convolution network channel, a SimAM-based convolution network channel, and a convolution network channel based on the SOE module. Concatenating the feature maps obtained from the three channels fuses them, effectively increases the proportion of target information in the feature maps, and provides effective support for subsequent target detection.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flow chart of the method for detecting infrared small targets in the clear airspace of an airport at night provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of the method for detecting infrared small targets in the clear airspace of an airport at night provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the system for detecting infrared small targets in the clear airspace of an airport at night provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of the framework of the similarity target enhancement module provided by an embodiment of the present invention;
FIG. 5(a) is the original image used in the before-and-after comparison of SOE module processing provided by an embodiment of the present invention;
FIG. 5(b) is the input feature map in the before-and-after comparison of SOE module processing provided by an embodiment of the present invention;
FIG. 5(c) is the feature map after processing by the similarity target enhancement module in the before-and-after comparison of SOE module processing provided by an embodiment of the present invention;
FIG. 6 is a diagram of the heterogeneous parallel backbone structure provided by an embodiment of the present invention, in which the upper layer is a general convolution channel that performs conventional convolution on the input feature map, the middle layer is a SimAM channel, that is, a parameter-free attention convolution channel, and the lower layer is a convolution channel containing the SOE module of the present invention;
FIG. 7 is a diagram of the PAN feature fusion structure provided by an embodiment of the present invention;
FIG. 8 is a diagram of examples from the data set provided by an embodiment of the present invention;
FIG. 9 is a comparison of Receiver Operating Characteristic (ROC) curves provided by an embodiment of the present invention, where A, B, C, and D represent different backbone network structures, E is the backbone network structure of the present invention, and all other parameters are the same;
FIG. 10(a) shows the 8 original infrared small target images to be detected, used in comparing infrared small target detection results before and after the improvement, provided by an embodiment of the present invention;
FIG. 10(b) shows the detection results of the method for detecting infrared small targets in the clear airspace of an airport at night, from the comparison of detection results before and after the improvement, provided by an embodiment of the present invention;
In the figures: 1. deep-learning infrared small target detection model acquisition module; 2. different-channel feature map splicing module; 3. predictable-target feature map acquisition module; 4. target category and position information acquisition module; 5. target box acquisition module.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, embodiments are described in detail below with reference to the accompanying figures. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, the invention is capable of modification in various respects, all without departing from its spirit and scope.
The invention relates to the technical field of infrared small target detection in video and image processing at important places, and in particular to a method, a system and an application for detecting infrared small targets in a night airport clear airspace, which are used for detecting and locating small targets in infrared images.
1. Illustrative embodiments:
Example 1
As shown in fig. 1, the method for detecting infrared small targets in a night airport clear airspace according to the embodiment of the present invention includes the following steps:
S101: setting the initial parameters of the heterogeneous parallel network model of the method, and inputting the training set images of an infrared small target database into the heterogeneous parallel network model with the set parameters for training, to obtain a deep-learning infrared small target detection model;
S102: inputting the infrared small target image to be detected into the infrared small target detection model trained in step S101, performing feature extraction on the image with the heterogeneous parallel backbone network, and splicing the feature maps from three different channels, namely the Similarity Object Enhancement (SOE) module channel, the general feature extraction structure channel, and the backbone network structure channel based on the parameter-free Simple Attention Module (SimAM);
S103: performing feature splicing on part of the multilayer feature maps obtained in step S102 through a Pixel Aggregation Network (PAN), thereby obtaining multilayer feature maps from which targets of different sizes can be predicted;
S104: judging the category and position of each target through the prediction structure, obtaining the confidence of each infrared small target, and generating a target prediction frame for each target, so that the category and position information of the target is predicted directly;
S105: deleting the non-optimal target frames from the plurality of target prediction frames obtained in step S104 by using the Non-Maximum Suppression (NMS) method to obtain the optimal target frame, and storing and displaying the category and position information of the target.
In step S105, the non-optimal target frames are deleted as follows: the score of each target frame is calculated by the Intersection over Union (IOU) formula, and the low-scoring frames among the obtained target prediction frames are deleted to obtain the optimal target frame.
Example 2
Based on the method for detecting infrared small targets in a night airport clear airspace provided by embodiment 1, further, in step S101, the heterogeneous parallel network model builds a heterogeneous parallel backbone network in the VGG form from basic structures such as Focus, C3 and SPP, combines it with a pixel aggregation network to generate multilayer fused feature maps, predicts the category and position of each target with a prediction network, and corrects the position information mapped back to the original image according to an L1 loss function, so as to obtain the final classification and accurate position information of the target. The initial parameters of the heterogeneous parallel network model include the number of network layers and the weights and bias values of the neurons in each layer. The initial learning rate of the model is 0.001, and a warmup strategy is adopted in the initial iterations, so that oscillation of the model is avoided, the model converges faster, and the model performs better. The learning rate decays by a factor of 0.1 between the 800th and 1000th epochs, and the maximum number of training epochs is 1200.
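As a non-limiting illustration, the training schedule described above might be sketched as follows. The source states only the initial learning rate (0.001), the use of warmup, a decay factor of 0.1 between the 800th and 1000th epochs, and a maximum of 1200 epochs. The linear form of the warmup, its length (`warmup_epochs=3`), and the reading of the decay as two step reductions at epochs 800 and 1000 are assumptions made here for illustration, and the function name `learning_rate` is hypothetical.

```python
def learning_rate(epoch, base_lr=1e-3, warmup_epochs=3,
                  milestones=(800, 1000), gamma=0.1):
    """Return the learning rate for a 0-based epoch index.

    Assumptions (not stated in the source): linear warmup over
    `warmup_epochs` epochs, then one multiplication by `gamma`
    for each milestone epoch that has been reached.
    """
    if epoch < warmup_epochs:
        # Linear warmup: ramp up to base_lr to avoid early oscillation.
        return base_lr * (epoch + 1) / warmup_epochs
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


MAX_EPOCHS = 1200  # maximum number of training epochs given in the source
schedule = [learning_rate(e) for e in range(MAX_EPOCHS)]
```

Under these assumptions the rate ramps up during the first three epochs, stays at 0.001 until epoch 800, and is reduced by a factor of 0.1 at epoch 800 and again at epoch 1000.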
In the embodiment of the invention, the position information of the obtained target prediction frame, after being mapped back to the original image, is corrected through the L1 loss function; the loss function of the target prediction frame is calculated according to formula (1).
In formula (1), one term indicates the probability that a target position returns to the correct position after being mapped to the original image, and the other is an influence range factor. Calculating the loss with the L1 loss function makes the loss value insensitive to outliers and abnormal values and keeps the gradient change relatively small, so that the model does not easily deviate from the optimal model in the training stage, and a plurality of target bounding boxes with small deviation are obtained.
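The insensitivity to outliers mentioned above can be illustrated with a small numerical comparison. The sketch below uses the standard mean absolute error as the L1 loss and the mean squared error as an L2 loss; it is not formula (1) of the invention, whose additional terms are not reproduced here. The box coordinates are made-up illustrative values in an assumed (x1, y1, x2, y2) layout.

```python
import numpy as np


def l1_loss(pred, target):
    """Standard L1 loss: mean absolute coordinate error."""
    return np.abs(pred - target).mean()


def l2_loss(pred, target):
    """Standard L2 loss: mean squared coordinate error."""
    return ((pred - target) ** 2).mean()


# Two boxes as (x1, y1, x2, y2); values are illustrative only.
target = np.array([[10.0, 10.0, 20.0, 20.0],
                   [30.0, 30.0, 42.0, 40.0]])
# First box is off by one pixel per coordinate; the last coordinate
# of the second box is an outlier that is off by 40 pixels.
pred = target + np.array([[1.0, -1.0, 1.0, -1.0],
                          [0.0, 0.0, 0.0, 40.0]])

l1_value = l1_loss(pred, target)
l2_value = l2_loss(pred, target)
```

With a single outlying coordinate, the L1 value grows linearly with the error while the L2 value grows quadratically, so the outlier dominates the L2 loss far more strongly. The gradient of the L1 loss with respect to each coordinate also has constant magnitude, which is consistent with the small gradient change described above.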
Example 3
Based on the method for detecting infrared small targets in a night airport clear airspace provided by embodiment 1, further, in step S102, the heterogeneous parallel backbone network of the heterogeneous parallel network model splices the feature map obtained by the Similarity Object Enhancement (SOE) module channel provided by the present invention, the feature map obtained by the general feature extraction structure, and the feature map obtained by the backbone network structure based on the parameter-free Simple Attention Module (SimAM), so as to increase the proportion and importance of target information in the feature map during the convolution operations.
Example 4
Based on the method for detecting infrared small targets in a night airport clear airspace provided by embodiment 1, further, in step S102, the similarity target enhancement module channel in the heterogeneous parallel backbone network is constructed by combining the model-driven similarity target enhancement module provided by the present invention with operations such as convolution and pooling, so that the network tends to obtain more target information during feature extraction and provides feature maps containing more target information for subsequent target prediction.
Example 5
Based on the method for detecting infrared small targets in a night airport clear airspace provided by embodiment 4, further, the similarity target enhancement module enhances the contrast between target and background by means of similarity comparison, and is inserted into the network as a plug-in module. According to the invention, a model-driven SOE module is constructed from the relation between the mean and the variance of local feature blocks in a feature map, in combination with the Wasserstein distance principle. A Wasserstein value is obtained from the mean and the variance of each local feature block; the whole feature map is divided into a plurality of feature blocks, a Wasserstein distance matrix in one-to-one correspondence with them is obtained and converted into a Wasserstein similarity matrix positively correlated with the infrared target, and this matrix is activated through a Sigmoid function and combined with the input feature map to obtain a new feature map, so that the target information in the feature map is highlighted by the similarity target enhancement module. The module can be added at any position in a convolutional neural network to enhance target feature information.
Example 6
Based on the method for detecting the infrared small target in the clean airspace of the airport at night provided by the embodiment 1, further, in the step S103, in the feature fusion stage in the heterogeneous parallel network model, PAN structures are adopted to complete feature fusion from top to bottom and from bottom to top, multi-layer feature maps from different depths are fused, and multi-layer feature maps with different sizes are further obtained. The feature maps of different depths correspond to objects of different sizes, respectively, each feature map responding to an object of a specific size, and in addition, an object may be detected simultaneously by a plurality of feature maps.
Example 7
Based on the method for detecting infrared small targets in a night airport clear airspace provided by embodiment 1, further, in step S104, in the prediction stage of the heterogeneous parallel network model, the prediction structure first predicts the feature map pixel by pixel to obtain potential targets and classifies them to obtain the category information of each target. In addition, the position of each target on the feature map is mapped back to the original image through a regression strategy, giving the position of the target in the original image, so that the category and approximate position information of the target are obtained. Finally, the non-optimal target frames are deleted through the NMS algorithm in step S105 to obtain the final accurate target frame position, and the position information and classification information are stored and displayed.
In the embodiment of the present invention, NMS compares the scores of the plurality of frames and deletes the low-scoring frames; the score of a target frame is calculated by the IOU formula, as shown in formula (2):

$$\mathrm{IOU}(A, B) = \frac{|A \cap B|}{|A \cup B|} \tag{2}$$

where A and B respectively represent two frames, $A \cup B$ represents the union region of the two frames, and $A \cap B$ represents the intersection region of the two frames.
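By way of example only, the IOU of formula (2) might be computed for two axis-aligned frames as in the following sketch. The (x1, y1, x2, y2) corner representation and the function name `iou` are assumptions made for illustration; the source does not specify how frames are encoded.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For instance, two 2 × 2 frames offset by one pixel in each direction share an intersection of area 1 and a union of area 7, so their IOU is 1/7; identical frames give 1, and disjoint frames give 0.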
The infrared small target database is derived from video sequences of flying unmanned aerial vehicles in a plurality of different scenes, and is made, through manual annotation, into a data set that can be used by the airport small target recognition model.
Example 8
As shown in fig. 2, in the method for detecting infrared small targets in a night airport clear airspace according to the embodiment of the present invention, infrared small target images are input into the similarity-target-enhanced heterogeneous parallel network architecture for training, and an improved similarity-target-enhanced heterogeneous parallel network neural network model for infrared small target detection is obtained. In this model, the heterogeneous parallel network structure directly performs multi-stage feature extraction on the input image and splices the fused features; the PAN structure fuses the information of the multilayer feature maps to obtain multilayer fused feature maps; the prediction structure directly classifies targets based on the multilayer feature maps and simultaneously regresses the target frames to generate a plurality of predicted target frames that may be targets; the Non-Maximum Suppression (NMS) method then generates the final detection result, and the target category, the corresponding target probability and the target frame are displayed. Specifically, the steps are as follows:
Step one: setting the initial parameters of the heterogeneous parallel network model of the method, and inputting the infrared small target images of the infrared small target database into the heterogeneous parallel network model with the set parameters for training, to obtain the detection model of the method for detecting infrared small targets in a night airport clear airspace;
The heterogeneous parallel network model builds a heterogeneous parallel backbone network in the VGG form from basic structures such as Focus, C3 and SPP, generates multilayer fused feature maps in combination with the PAN structure, generates a plurality of predicted target frames that may be targets according to the prediction network, and then generates the final detection result with the NMS method, so as to obtain the final classification and accurate position information of the targets. The initial parameters of the heterogeneous parallel network model include the number of network layers and the weights and bias values of the neurons in each layer. The initial learning rate of the model is 0.001, and a warmup strategy is adopted in the initial iterations, so that oscillation of the model is avoided, the model converges faster, and the model performs better. The learning rate decays by a factor of 0.1 between the 800th and 1000th epochs, and the maximum number of training epochs is 1200.
Step two: inputting the infrared small target image to be detected into the infrared small target detection model trained in step one. The whole network structure is composed of a heterogeneous parallel backbone network structure, a PAN feature fusion structure and a detection structure. After the original image is input, feature maps are extracted by the heterogeneous parallel network, multilayer feature maps are obtained through fusion by the PAN feature fusion structure, and classification and regression of the targets in the feature maps are then carried out by the detection structure to obtain the final result image.
The heterogeneous parallel backbone network of the heterogeneous parallel network model is composed of three different channels. The feature map obtained by the SOE channel provided by the invention, the feature map obtained by the general feature extraction structure, and the feature map obtained by the backbone network structure based on the SimAM channel are spliced, so that the proportion and importance of target information in the feature map are increased during the convolution operations.
The SOE channel is a hybrid network structure, based on both model driving and deep learning, that is constructed by combining the similarity target enhancement module SOE with a deep learning strategy. The SOE module is obtained by modeling the relation between the mean and the variance of local feature blocks in a feature map, in combination with the Wasserstein distance principle. A Wasserstein value is obtained from the mean and the variance of each local feature block; the whole feature map is divided into a plurality of feature blocks, a Wasserstein distance matrix in one-to-one correspondence with the input feature map is obtained and converted into a similarity matrix positively correlated with the infrared target, and this matrix is then activated through a Sigmoid function and combined with the input feature map to obtain a new feature map, so that the target information in the feature map is highlighted by the similarity target enhancement module. The module can be added at any position in a convolutional neural network to enhance target feature information.
The similarity target enhancement module SOE first divides the H × W feature map into 2 × 2 small blocks, extends each small block outward, with the small block as the center, into a 4 × 4 large block, and calculates the Wasserstein distance between each small block and the corresponding large block. The Wasserstein distance is calculated according to formula (3).
The distance between two rectangular blocks is approximately defined according to formula (4),
and is simplified to formula (5).
In these formulas, the mean and the variance of each block are used to calculate the distance between a central block and its surrounding neighborhood, that is, the similarity between the two. The larger the Wa value, the higher the similarity between the small block and the large block and the higher the probability that the small block is background; the smaller the Wa value, the lower the similarity between the small block and the large block and the higher the probability that the small block is a target. On this theoretical basis, the calculation is carried out block by block to finally obtain a similarity matrix W_soe; the calculation process is shown in fig. 4.
In fig. 4, the feature map input to the similarity target enhancement module has size H × W and is divided into 2 × 2 small blocks; for each 2 × 2 small block, the corresponding 4 × 4 large block radiating outward from it as the center is obtained. Sliding in units of 2 × 2 small blocks with a step size of 1, W_soe is calculated according to formula (5). The values of the similarity matrix indicate the similarity between the current point and its surrounding points; the smaller the value, the greater the probability that the region is a target.
In the embodiment of the present invention, the implementation flow of the similarity target enhancement module includes the following steps:
1) Let c = 1, 2, 3, ..., C;
2) Dividing the feature map of channel c into 2 × 2 small blocks to obtain I = (W × H)/(2 × 2) small blocks;
3) Let i = 1, 2, 3, ..., I;
5) Obtaining a matrix of size (W/2) × (H/2);
6) Obtaining the matrix W_soe of size (W/2) × (H/2) × C;
7) Obtaining W_soe' = 1/W_soe according to the correlation with the target;
8) Obtaining the similarity matrix W from W_soe' through the activation function;
9) Combining the input feature map with the similarity matrix W to obtain a new feature map F';
Output: a new feature map F' of size H × W × C containing enhanced target information.
Illustratively, the method for implementing the similarity target enhancement module specifically includes the following steps:
First, the set small blocks are taken from the input matrix F and the similarity matrix W_soe is obtained; then W_soe' = 1/W_soe is taken, which is called the Wasserstein similarity, so that a Wasserstein similarity matrix W_soe' positively correlated with the target is obtained. Next, the similarity matrix W_soe' is normalized and activated with a Sigmoid function to obtain the matrix W, which is combined with the original input feature map, so that the feature values of targets in the feature map remain basically unchanged while the feature values of the background are reduced, enlarging the difference between background and target. A new feature map F' carrying the target information weights is thereby obtained, realizing the function of enhancing target information in the feature map. In the embodiment of the invention, this process is modularized as the similarity target enhancement module SOE, which can be added at any step of the neural network as required; adding the SOE module to the convolution process constructs the W_soe channel.
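As a non-limiting illustration, the flow of the similarity target enhancement module described above might be sketched in NumPy as follows. Several points are assumptions made here and are not confirmed by the source: the block distance is taken as the closed-form 2-Wasserstein distance between one-dimensional Gaussians, sqrt((μ1 − μ2)² + (σ1 − σ2)²), computed from the mean and standard deviation of each block, which may differ from formulas (3) to (5); the 2 × 2 small blocks are taken without overlap, matching the (W/2) × (H/2) matrix size of the steps; the feature map is edge-padded by one pixel so that border blocks have a 4 × 4 neighborhood; a small constant `eps` avoids division by zero; and "combining" is taken to mean element-wise multiplication after expanding W back to H × W. The names `soe_weights` and `soe` are hypothetical.

```python
import numpy as np


def soe_weights(feature, eps=1e-6):
    """Return the activated similarity matrix W, shape (C, H/2, W/2).

    feature: float array of shape (C, H, W) with even H and W.
    """
    c, h, w = feature.shape
    # Edge-pad by one pixel so every 2x2 block has a centred 4x4 block.
    padded = np.pad(feature, ((0, 0), (1, 1), (1, 1)), mode="edge")
    w_soe = np.empty((c, h // 2, w // 2))
    for i in range(h // 2):
        for j in range(w // 2):
            small = feature[:, 2 * i:2 * i + 2, 2 * j:2 * j + 2].reshape(c, -1)
            large = padded[:, 2 * i:2 * i + 4, 2 * j:2 * j + 4].reshape(c, -1)
            d_mean = small.mean(axis=1) - large.mean(axis=1)
            d_std = small.std(axis=1) - large.std(axis=1)
            # Assumed distance: 2-Wasserstein between 1-D Gaussians.
            w_soe[:, i, j] = np.sqrt(d_mean ** 2 + d_std ** 2)
    w_sim = 1.0 / (w_soe + eps)            # W_soe' = 1 / W_soe
    return 1.0 / (1.0 + np.exp(-w_sim))    # Sigmoid activation gives W


def soe(feature):
    """Return the enhanced feature map F' with the same shape as the input."""
    weights = soe_weights(feature)
    # Expand each block weight back to its 2x2 footprint (assumed).
    full = weights.repeat(2, axis=1).repeat(2, axis=2)
    return feature * full                   # assumed element-wise combination
```

Because W_soe' is positive, the activated weights lie between 0.5 and 1, so each feature value is either kept almost unchanged or attenuated, in line with the behavior described above in which target values remain basically unchanged and background values are reduced.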
In order to show intuitively the change before and after the similarity target enhancement module, the embodiment of the present invention provides visualized heat maps corresponding to the feature maps: fig. 5 (a) is the original image, fig. 5 (b) is the input feature map, and fig. 5 (c) is the feature map processed by the similarity enhancement module.
The heterogeneous parallel network is described as follows: because the shooting distance is long and the resolution of night images is low, a small target in an infrared image occupies a small proportion of the image and has an unclear outline. Therefore, the method constructs a heterogeneous parallel backbone network and adds the similarity target enhancement module SOE into the backbone network for feature extraction, which guides feature extraction in a set direction, so that the feature map contains more potential target information, the weight of the target is increased, the importance of potential targets in the feature map is purposefully increased during convolution, and the target is highlighted. In addition, the resulting feature map is spliced with the feature maps obtained from the conventional convolution channel and the SimAM channel, so that the feature map contains more target information while the correlation between the original background and the target is retained, which further assists in improving the detection performance of the model. Finally, by fusing the feature maps of the three different channels, the proportion of potential targets is increased while the original basic information is retained, and the target detection performance is ultimately improved. The schematic structure is shown in fig. 6.
As shown in fig. 6, features are extracted from the input feature map by three convolution methods, namely the general convolution channel at the upper layer, the SimAM channel at the middle layer, and the convolution channel containing the SOE module at the lower layer; the outputs of the three channels are then spliced to fuse the feature information from the different operations and passed on to the feature fusion structure. The general convolution channel at the upper layer performs conventional convolution on the input feature map and extracts features adaptively; the SimAM channel at the middle layer increases attention to potential targets during convolution through a parameter-free attention model; and the convolution channel at the lower layer contains the similarity target enhancement module provided by the method, so that the weight of the target is increased, the importance of potential targets in the feature map is purposefully increased during convolution, and the target is highlighted. Finally, by fusing the feature maps of the three different channels, the proportion of potential targets is increased while the original basic information is retained, and the target detection performance is ultimately improved.
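By way of example only, the splicing of the three channels might be sketched as below. The SimAM weighting follows the commonly published parameter-free formulation as understood here (an energy term computed from each channel's spatial mean and variance, with a regularizer `lam`), which is an assumption about how the middle channel is realized. The learned convolutions of the upper channel and the SOE-containing lower channel are replaced by identity placeholders (`conv_branch`, `soe_branch`), and all function names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def simam(x, lam=1e-4):
    """Parameter-free attention over a (C, H, W) feature map (assumed form)."""
    n = x.shape[1] * x.shape[2] - 1
    d = (x - x.mean(axis=(1, 2), keepdims=True)) ** 2
    var = d.sum(axis=(1, 2), keepdims=True) / n
    e_inv = d / (4.0 * (var + lam)) + 0.5   # larger for more distinctive pixels
    return x * sigmoid(e_inv)


def conv_branch(x):
    """Placeholder for the learned general convolution channel."""
    return x


def soe_branch(x):
    """Placeholder for the convolution channel containing the SOE module."""
    return x


def heterogeneous_block(x):
    """Splice the outputs of the three channels along the channel axis."""
    branches = [conv_branch(x), simam(x), soe_branch(x)]
    return np.concatenate(branches, axis=0)
```

With an input of C channels, the spliced output has 3C channels at the same spatial size; in a trained network the two placeholders would be replaced by the corresponding learned layers.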
Step three: the PAN structure is shown in fig. 7; feature splicing is performed on part of the multilayer feature maps obtained in step two, so as to obtain multilayer feature maps that contain depth information and from which targets can be predicted. As shown in fig. 6, deep features contain more semantic information and shallow features contain more detailed information. In the invention, not only are the feature maps of different depths within the PAN structure fused, but the shallow feature maps from the backbone network of step two are also fused and spliced. Therefore, by fusing feature maps of different depths, the feature map finally input to the detection structure includes more feature information.
In fig. 7, the feature map output from the backbone network is first convolved, two upsampling operations are performed to enlarge the feature map, and downsampling then follows to further integrate the feature information. In addition, after every upsampling or downsampling operation, a feature splicing operation is performed with the shallow feature map of the same size, which compensates for the loss of target information as the number of convolution operations increases. Finally, the three feature layers P3, P4 and P5 on the right side of fig. 7 (the feature layers of layers 3, 4 and 5 of the PAN) are input to the detection structure; the prediction frame size settings for each pixel in each feature layer are shown in table 1.
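As a non-limiting illustration, the order of upsampling, downsampling and splicing described above might be traced with the following shape-level sketch. Everything numeric here is an assumption: the three backbone outputs are called `c3`, `c4` and `c5`, with assumed spatial sizes 32, 16 and 8 (strides 8, 16 and 32 for a 256 × 256 input) and made-up channel counts; nearest-neighbor repetition stands in for upsampling, 2 × 2 max pooling stands in for strided-convolution downsampling, and all learned convolutions are omitted, so channel counts simply add up.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def downsample2x(x):
    """2x2 max pooling, standing in for a strided convolution."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def splice(a, b):
    """Feature splicing: concatenate two same-size maps along channels."""
    return np.concatenate([a, b], axis=0)


def pan(c3, c4, c5):
    # Top-down path: two upsampling steps, each spliced with a shallower map.
    t4 = splice(upsample2x(c5), c4)
    p3 = splice(upsample2x(t4), c3)
    # Bottom-up path: downsampling, each spliced with the same-size map.
    p4 = splice(downsample2x(p3), t4)
    p5 = splice(downsample2x(p4), c5)
    return p3, p4, p5


# Hypothetical backbone outputs for a 256 x 256 input image.
c3 = np.zeros((4, 32, 32))
c4 = np.zeros((8, 16, 16))
c5 = np.zeros((16, 8, 8))
p3, p4, p5 = pan(c3, c4, c5)
```

The three returned maps keep the spatial sizes of the three backbone outputs, while each now carries channels originating from every depth.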
Each pixel in a feature map corresponds to a target region in the original image, and the prediction frame sizes are the sizes of the targets that can be predicted at each pixel. For example, [10,13,16,30,33,23] means that targets of sizes 10 × 13, 16 × 30 and 33 × 23 can be predicted, and the size of the target frame can be automatically corrected within a certain range according to the confidence. Targets in the feature map are predicted according to the target weights obtained during training, and the predictions are mapped back to the image to be detected to mark the position and category of each target.
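The flat size list in the example above can be read as consecutive (width, height) pairs, as in the following minimal sketch. The helper name `anchor_pairs` is hypothetical, and interpreting the first number of each pair as the width is an assumption.

```python
def anchor_pairs(flat_sizes):
    """Group a flat list [w1, h1, w2, h2, ...] into (w, h) pairs."""
    if len(flat_sizes) % 2 != 0:
        raise ValueError("expected an even number of values")
    return [(flat_sizes[i], flat_sizes[i + 1])
            for i in range(0, len(flat_sizes), 2)]


pairs = anchor_pairs([10, 13, 16, 30, 33, 23])
```

For the list given in the text, this yields the three prediction frame sizes 10 × 13, 16 × 30 and 33 × 23.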
Step four: the category and position of each target are judged through the prediction structure, the confidence of each infrared small target is obtained, and a target frame is generated for each target, so that the category and position information of the target is predicted directly. The multiple feature maps of different depths obtained in step three are input into the detection structure, which processes each layer of feature maps separately and predicts the targets in the feature maps feature point by feature point; within the category of each predicted target, the corresponding target probabilities are sorted from large to small, and the targets with high target probability are selected as the final targets. Then, the feature points of each target are mapped back to the original image through the feedback network, giving the position of the target in the original image. The same target may be predicted on different feature maps, so multiple predicted target frames may be obtained for it.
Step five: the non-optimal target frames among the target prediction frames obtained in step four are deleted with the Non-Maximum Suppression (NMS) method to obtain the optimal target frames, and the category and position information of the targets is stored and displayed. Because multiple predicted target frames may be obtained for the same target in step four, the invention uses the NMS method to process them in order to obtain the target frame that best matches the target. When the overlap rate of two prediction frames is high, they are regarded as the same target, and the one with the higher target probability is selected as the final target frame. When the overlap rate of two prediction frames is low, they are regarded as two targets; both target frames are drawn on the original image and marked with their final target categories and positions.
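As a non-limiting illustration, the suppression procedure of step five might be sketched as a greedy NMS in NumPy. The (x1, y1, x2, y2) frame layout, the overlap threshold of 0.5, and the function name `nms` are assumptions for illustration; the source does not state the threshold actually used. Frames are assumed to have positive area.

```python
import numpy as np


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    boxes: (N, 4) array-like of (x1, y1, x2, y2); scores: (N,) array-like.
    Returns the indices of the kept frames, highest score first.
    """
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    scores = np.asarray(scores, dtype=float)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = scores.argsort()[::-1]          # descending target probability
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Overlap of the best frame with every remaining frame.
        ix1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        iy1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        ix2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        iy2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
        overlap = inter / (areas[best] + areas[rest] - inter)
        # High overlap: same target, drop the lower-scoring frame.
        # Low overlap: a different target, keep it for later rounds.
        order = rest[overlap <= iou_threshold]
    return keep


kept = nms([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]],
           [0.9, 0.8, 0.7])
```

In this usage example the first two frames overlap strongly, so only the higher-scoring one is kept, while the third frame lies elsewhere in the image and is retained as a separate target.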
The database used in the present invention comes from a public infrared data set; the images have a uniform specification of 256 pixels × 256 pixels. Fig. 8 shows sample images from the data set used to verify the effectiveness of the present invention; several representative infrared images are displayed to facilitate understanding of the targets addressed by the present invention.
Fig. 9 is a comparison graph of Receiver Operating Characteristic (ROC) curves, in which A, B, C, D and E represent different backbone network structures. A is a backbone network with a single general convolution channel, B is a backbone network with a SimAM channel, C is a backbone network with a W_soe channel, D is a backbone network combining the general convolution channel with the SimAM channel, and E is the backbone network structure combining the general convolution channel, the SimAM channel and the W_soe channel; all other parameters are the same. Fig. 10 (a) shows 8 original infrared small target images to be detected, used to compare the infrared small target detection results before and after the improvement provided by the embodiment of the present invention; fig. 10 (b) shows the detection results of the method for detecting infrared small targets in a night airport clear airspace, from the same comparison.
Example 9
As shown in fig. 3, the detection system for infrared small targets in clean airspace of airport at night provided by the embodiment of the present invention includes:
the deep-learning infrared small target detection model acquisition module 1, configured to set the initial parameters of the heterogeneous parallel network model of the method for detecting infrared small targets in a night airport clear airspace, and to input the training set images of an infrared small target database into the heterogeneous parallel network model with the set parameters for training, so as to obtain a deep-learning infrared small target detection model;
the different-channel feature map splicing module 2, configured to input the infrared small target image to be detected into the trained infrared small target detection model, perform feature extraction on the image with the heterogeneous parallel backbone network, and splice the feature maps from three different channels;
the predictable-target feature map acquisition module 3, configured to perform feature splicing on part of the obtained multilayer feature maps through a Pixel Aggregation Network (PAN), so as to obtain multilayer feature maps that contain target information of different sizes and from which targets can be predicted;
the target category and position information acquisition module 4, configured to judge the category and position of each target through the prediction structure, obtain the confidence of each infrared small target, and generate a target frame for each target, so that the category and position information of the target is predicted directly;
and the target frame acquisition module 5, configured to delete the non-optimal target frames from the obtained target prediction frames by using the Non-Maximum Suppression (NMS) method, obtain the optimal target frame, and store and display the category and position information of the target.
In the above embodiments, the description of each embodiment has its own emphasis, and reference may be made to the related description of other embodiments for parts that are not described or recited in any embodiment.
For information interaction, execution processes and other details between the above devices/units, since they are based on the same concept as the method embodiments of the present invention, their specific functions and technical effects can be found in the method embodiments and are not described here again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
2. Application examples:
An embodiment of the present invention provides a computer device, including: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor, the processor implementing the steps of any of the various method embodiments described above when executing the computer program.
Embodiments of the present invention further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the steps in the above method embodiments may be implemented.
The embodiment of the present invention further provides an information data processing terminal, where the information data processing terminal is configured to provide a user input interface to implement the steps in the above method embodiments when implemented on an electronic device, and the information data processing terminal is not limited to a mobile phone, a computer, or a switch.
The embodiment of the present invention further provides a server, where the server is configured to provide a user input interface to implement the steps in the above method embodiments when implemented on an electronic device.
Embodiments of the present invention further provide a computer program product which, when run on an electronic device, causes the electronic device to implement the steps in the above method embodiments.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. On this understanding, all or part of the flow of the method embodiments of the present invention may be implemented by a computer program that instructs the related hardware to carry out the steps of the method embodiments; the computer program may be stored in a computer-readable storage medium. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include at least: any entity or device capable of carrying the computer program code to a photographing apparatus/terminal apparatus, a recording medium, computer memory, read-only memory (ROM), random-access memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk.
3. Evidence of the relevant effects of the embodiments:
Experiments show the following.
The method for detecting infrared small targets in the clear airspace of an airport at night first sets the initial parameters of the heterogeneous parallel network model and inputs the training set images of an infrared small target database into the configured network model for training, obtaining a deep-learning-based infrared small target detection model. The infrared small target image to be detected is then input into the trained detection model; features are extracted from the image by the heterogeneous parallel backbone network, and the feature maps are fused by the pixel aggregation network to obtain multi-layer feature maps containing target information. The prediction structure then processes the feature maps containing target information to obtain the category and position information of each target and generates a target frame for each target, so that the category and position information of the predicted targets is displayed directly in the image. The invention has good adaptivity and detection performance for infrared small targets; the detection speed is high and the model size is small, so the model is easy to deploy on hardware devices; and the model is simple to operate and easy to train. The results of the comparison with other existing methods are shown in Table 2; the present invention achieves the best results in terms of model size, detection speed, missed detection rate, and precision.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention, and the scope of the present invention is not limited thereto, and any modification, equivalent replacement, and improvement made by those skilled in the art within the technical scope of the present invention disclosed herein, which is within the spirit and principle of the present invention, should be covered by the present invention.
Claims (10)
1. A method for detecting infrared small targets in the clear airspace of an airport at night, characterized by comprising the following steps:
S1, setting initial parameters of a heterogeneous parallel network model, and inputting the training set images of an infrared small target database into the heterogeneous parallel network model with the set parameters for training, to obtain a deep-learning infrared small target detection model;
S2, inputting the infrared small target image to be detected into the infrared small target detection model, extracting features of the infrared small target image through the heterogeneous parallel backbone network, and splicing the feature maps of three different channels, wherein the three different channels comprise: a similarity target enhancement module channel, a general feature extraction structure channel, and a backbone network structure channel based on a simple attention-free mechanism;
S3, performing feature splicing on part of the obtained multi-layer feature maps through a pixel aggregation network, to obtain multi-layer feature maps from which targets of different sizes can be predicted;
S4, judging the category and position of each target through the prediction structure, obtaining the confidence of each infrared small target, generating a target prediction frame corresponding to each target, and predicting the category and position information of the target;
S5, using non-maximum suppression to delete, from the obtained target prediction frames, the target frames with low scores as calculated by the IOU formula, to obtain the target frames, and storing and displaying the category and position information of the targets.
2. The method for detecting infrared small targets in the clear airspace of an airport at night according to claim 1, wherein in step S1 the heterogeneous parallel network model builds a VGG-style heterogeneous parallel backbone network from a basic Focus module, a C3 module and an SPP module, generates multi-layer fused feature maps in combination with a pixel aggregation network, and predicts the category and position of the target with a prediction network; the position information regressed to the original image is corrected according to the L1 loss function to obtain the final classification and accurate position information of the target;
in the formula, [symbol] indicates the probability that the regression is correct after a given target position is mapped to the original image, and [symbol] is an influence range factor;
the initial parameters of the heterogeneous parallel network model comprise: the number of network layers, and the weights and bias values of the neurons in each layer.
3. The method for detecting infrared small targets in the clear airspace of an airport at night according to claim 2, wherein the heterogeneous parallel backbone network of the heterogeneous parallel network model, on the basis of the backbone network structure based on the simple attention-free mechanism, splices in the feature map obtained by the similarity target enhancement module channel and the feature map obtained by the general feature extraction structure, and enhances the proportion and importance of the target information of the feature map during the convolution operations.
4. The method for detecting infrared small targets in the clear airspace of an airport at night according to claim 1, wherein in step S5 the target frame score calculated by the IOU formula is defined as the ratio between the intersection and the union of two frames, and the IOU is calculated as shown in the following formula (2):
$$\mathrm{IOU} = \frac{|A \cap B|}{|A \cup B|} \tag{2}$$
where A and B denote the two frames.
5. The method for detecting infrared small targets in the clear airspace of an airport at night according to claim 3, wherein the similarity target enhancement module first divides the H × W feature map into 2 × 2 small blocks and, taking each small block as the center, extends it to a 4 × 4 large block, and calculates the Wasserstein distance between each small block and its corresponding large block; the Wasserstein distance is calculated as shown in formula (3):
the distance between two rectangular blocks is defined approximately as:
and simplified as follows:
where the quantities in the formula are respectively the means and variances of the feature points in the two block regions; the Wasserstein distance is used to measure the distance between two distributions, this distance is taken as the similarity of the two distributions, and the distance between the central block and its surrounding neighborhood is calculated as the similarity of the central block and the surrounding neighborhood; the larger the value, the higher the similarity between the small block and the large block and the greater the probability that the small block is background; the smaller the value, the lower the similarity between the small block and the large block and the greater the probability that the small block is a target; calculating block by block yields the final similarity matrix W_soe;
the feature map input to the similarity target enhancement module has size H × W; it is divided into 2 × 2 small blocks, and for each 2 × 2 small block the corresponding 4 × 4 large block extending outward from it as the center is obtained; the 2 × 2 small blocks are slid with a step length of 1, and the calculation is carried out according to formula (5) to obtain W_soe.
6. The method for detecting infrared small targets in the clear airspace of an airport at night according to claim 3, wherein enhancing the proportion and importance of the target information of the feature map during the convolution operations comprises:
inputting a feature map F of size H × W × C and, taking the set small blocks, forming the similarity matrix W_soe; then taking W_soe' = 1/W_soe, called the Wasserstein similarity, to obtain a Wasserstein similarity matrix W_soe' that is positively correlated with the target;
on the basis of the similarity matrix W_soe', using a Sigmoid function for normalization and activation to obtain a matrix W, and combining W with the original input feature map so that the feature values of targets in the feature map remain essentially unchanged, thereby obtaining a new feature map F' weighted by target information and enhancing the target information in the feature map.
7. The method for detecting infrared small targets in the clear airspace of an airport at night according to claim 2, wherein in the feature fusion stage of the heterogeneous parallel network model, a pixel aggregation network structure performs top-down and bottom-up feature fusion, fusing multi-layer feature maps from different depths to obtain multi-layer feature maps of different sizes; the feature maps of different depths correspond to targets of different sizes, each feature map responding to targets of a particular size.
8. The method for detecting infrared small targets in the clear airspace of an airport at night according to claim 7, wherein the heterogeneous parallel network model predicts on the feature maps pixel by pixel in the prediction structure to obtain potential targets, and classifies the targets to obtain their category information; the corresponding target position information on the feature map is mapped back to the original image through a regression strategy to obtain the position of the target in the original image, giving the category and approximate position information of the target; finally, non-optimal target frames are deleted by the non-maximum suppression method of step S5 to obtain the final accurate target frame positions, and the position information and classification information are stored and displayed.
9. A system for implementing the method for detecting infrared small targets in the clear airspace of an airport at night according to any one of claims 1-8, wherein the system comprises:
a deep-learning infrared small target detection model acquisition module (1), configured to set initial parameters of the heterogeneous parallel network model and input the training set images of an infrared small target database into the heterogeneous parallel network model with the set parameters for training, to obtain a deep-learning infrared small target detection model;
a different-channel feature map splicing module (2), configured to input the infrared small target image to be detected into the infrared small target detection model, extract features of the infrared small target image through the heterogeneous parallel backbone network, and splice the feature maps from three different channels;
a predictable-target feature map acquisition module (3), configured to perform feature splicing on part of the obtained multi-layer feature maps through a pixel aggregation network, to obtain multi-layer feature maps that contain target information of different sizes and from which targets can be predicted;
a target category and position information acquisition module (4), configured to judge the category and position of each target through the prediction structure, obtain the confidence of each infrared small target, generate a target frame corresponding to each target, and predict the category and position information of the target;
and a target frame acquisition module (5), configured to delete non-optimal target frames from the obtained target prediction frames using non-maximum suppression to obtain the target frames, and to store and display the category and position information of the targets.
10. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the method for detecting infrared small targets in the clear airspace of an airport at night according to any one of claims 1-8.
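The similarity target enhancement described in claims 5 and 6 can be sketched as follows. This is an illustrative sketch under stated assumptions and not a reproduction of formulas (3) to (5), which are not legible in this text. It assumes that each block is summarized by the mean and standard deviation of its values, that the distance is the closed-form 2-Wasserstein distance between two one-dimensional Gaussians, that the distance d is mapped to a similarity in (0, 1] by exp(-d), and that large blocks are clipped at the border of the feature map. The function names and the `eps` guard are hypothetical.

```python
import math
from statistics import fmean, pstdev


def block_stats(fmap, r0, r1, c0, c1):
    """Mean and population standard deviation of rows r0..r1-1, cols c0..c1-1."""
    vals = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return fmean(vals), pstdev(vals)


def similarity_matrix(fmap):
    """W_soe: one similarity per 2x2 small block, slid with a step of 1."""
    h, w = len(fmap), len(fmap[0])
    w_soe = []
    for r in range(h - 1):
        row = []
        for c in range(w - 1):
            m_s, s_s = block_stats(fmap, r, r + 2, c, c + 2)
            # 4x4 large block centred on the small block, clipped at borders
            # (the border handling is an assumption).
            m_l, s_l = block_stats(fmap, max(r - 1, 0), min(r + 3, h),
                                   max(c - 1, 0), min(c + 3, w))
            # Squared 2-Wasserstein distance between two 1-D Gaussians.
            w2 = (m_s - m_l) ** 2 + (s_s - s_l) ** 2
            # Assumed mapping from distance to similarity: 1 means identical.
            row.append(math.exp(-math.sqrt(w2)))
        w_soe.append(row)
    return w_soe


def target_weights(w_soe, eps=1e-12):
    """W = Sigmoid(W_soe') with W_soe' = 1 / W_soe, as ordered in claim 6."""
    return [[1.0 / (1.0 + math.exp(-1.0 / max(s, eps))) for s in row]
            for row in w_soe]


# Flat 8x8 background with one bright 2x2 region standing in for a target.
fmap = [[1.0] * 8 for _ in range(8)]
for r in (4, 5):
    for c in (4, 5):
        fmap[r][c] = 9.0

w_soe = similarity_matrix(fmap)
w = target_weights(w_soe)
```

On this example the block covering the bright region receives a weight close to 1, while flat background blocks receive Sigmoid(1), about 0.73, which is consistent with target values being left essentially unchanged. How W is aligned with the H × W × C feature map before being combined with F is not specified in the text, so that step is omitted here.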
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211359429.0A CN115410012B (en) | 2022-11-02 | 2022-11-02 | Method and system for detecting infrared small target in night airport clear airspace and application |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115410012A (en) | 2022-11-29 |
CN115410012B (en) | 2023-02-28 |
Family
ID=84169334
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211359429.0A Active CN115410012B (en) | 2022-11-02 | 2022-11-02 | Method and system for detecting infrared small target in night airport clear airspace and application |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115410012B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract | ||
Application publication date: 20221129. Assignee: TIANDY TECHNOLOGIES Co.,Ltd. Assignor: CIVIL AVIATION University OF CHINA. Contract record no.: X2024980002702. Denomination of invention: A Detection Method, System, and Application of Infrared Small Targets in Night Airport Clearance Area. Granted publication date: 20230228. License type: Common License. Record date: 20240312.