CN117726807A - Infrared small target detection method and system based on scale and position sensitivity - Google Patents


Info

Publication number
CN117726807A
Authority
CN
China
Prior art date
Legal status
Granted
Application number
CN202410176677.4A
Other languages
Chinese (zh)
Other versions
CN117726807B (en)
Inventor
付莹 (Fu Ying)
刘睿 (Liu Rui)
刘乾坤 (Liu Qiankun)
Current Assignee
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN202410176677.4A
Publication of CN117726807A
Application granted
Publication of CN117726807B
Legal status: Active


Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The application provides an infrared small target detection method and system based on scale and position sensitivity, wherein the method comprises the following steps: preprocessing a preset infrared small target data set, and performing data augmentation on it through a plurality of data augmentation techniques; inputting the augmented training data of different scales into a deep convolutional neural network for prediction at different scales, obtaining a multi-scale prediction result; training the deep convolutional neural network to obtain a trained deep convolution detection model, wherein the multi-scale prediction result is constrained by a scale and position sensitive loss function during training; and inputting a target infrared image to be detected into the trained deep convolution detection model to obtain a prediction result for the detection target. With this method, the scale and position of the detection target can be accurately distinguished by a lighter detection model, improving the accuracy of infrared small target detection.

Description

Infrared small target detection method and system based on scale and position sensitivity
Technical Field
The application relates to the technical field of computer vision, in particular to an infrared small target detection method and system based on scale and position sensitivity.
Background
At present, infrared small target detection is widely applied to a plurality of fields such as offshore monitoring and traffic management, is an important computer vision task, and aims to accurately identify and locate relatively small and dim detection targets from infrared images. In recent years, with the development of deep learning technology, infrared small target detection has been changed from a traditional detection mode of manually designed features to a detection mode based on a deep learning model.
In the related art, when detection of infrared small targets is performed based on a deep learning model, a complex network model structure is generally required to meet the detection-accuracy requirement. Moreover, the loss function adopted in training such deep learning network models is generally an intersection-over-union (IoU) loss function.
However, in the detection schemes of the related art, constructing a complex model consumes more computing resources, increases training cost, and easily overfits the training data. Moreover, the adopted IoU loss function lacks sensitivity to the scale and position of the detection target, making it difficult to distinguish targets of different scales and positions.
Therefore, how to solve the problem that the loss function is insensitive to the scale and the position and reduce the complexity of the deep learning network model becomes a technical problem to be solved at present.
Disclosure of Invention
The present application aims to solve, at least to some extent, one of the technical problems in the related art.
Therefore, a first object of the present application is to provide a method for detecting an infrared small target based on scale and position sensitivity, which starts from the requirement of detecting the infrared small target, and can accurately distinguish the scale and position of the detected target through a lighter detection model, so as to solve the problems of high complexity of a detection network model, unreasonable design of a loss function and the like, and improve the accuracy of detecting the infrared small target.
A second object of the present application is to propose an infrared small target detection system based on scale and position sensitivity.
a third object of the present application is to propose an electronic device;
a fourth object of the present application is to propose a computer readable storage medium.
To achieve the above object, a first aspect of the present application proposes a method for detecting an infrared small target based on scale and position sensitivity, the method comprising the steps of:
preprocessing a preset infrared small target data set, and performing data augmentation operation on the infrared small target data set through a plurality of data augmentation technologies;
inputting the training data with different scales obtained by the augmentation into a deep convolutional neural network to predict with different scales, and obtaining a multi-scale prediction result;
training the deep convolution neural network to obtain a trained deep convolution detection model, wherein the multi-scale prediction result is constrained by a scale and position sensitive loss function in the training process;
and inputting the target infrared image to be detected into the trained deep convolution detection model to obtain a prediction result of the detection target.
Optionally, according to an embodiment of the present application, the performing the data augmentation operation on the infrared small target data set by using a plurality of data augmentation techniques includes: for each original image in the infrared small target data set, cutting different areas of the original image by a random cutting technology to generate training samples with different sizes and positions; and carrying out random Gaussian blur processing on each training sample so as to simulate the blur effect of the original infrared image.
Optionally, according to an embodiment of the present application, the inputting the training data with different scales obtained by augmentation into the deep convolutional neural network performs prediction with different scales to obtain a multi-scale prediction result, including: respectively inputting the training data with different scales to corresponding detection head modules in the deep convolutional neural network, and obtaining a prediction result output by each detection head module according to input characteristics; and splicing all the prediction results into the multi-scale prediction results through a characteristic splicing algorithm.
Optionally, according to one embodiment of the present application, when there are training data of four scales, the multi-scale prediction result is calculated by the following formula:
P = σ(Conv(Concat(Up(p_1, s_1), Up(p_2, s_2), Up(p_3, s_3), Up(p_4, s_4))))
wherein P represents the multi-scale prediction result, σ(·) represents an activation function, Conv(·) represents the convolution function, Up(a, b) represents up-sampling of the first parameter a by a factor given by the second parameter b (s_i being the up-sampling factor for the i-th scale), Concat(·) represents the splicing operation, and p_i represents the prediction result output by the i-th detection head, i = 1, 2, 3, 4.
Optionally, according to an embodiment of the present application, the scale and position sensitive loss function is obtained by adding a scale-sensitive loss function and a position-sensitive loss function. The scale-sensitive loss function calculates a loss weight according to the predicted scale and the true scale of the detection target, its weight term being constructed from min(|P|, |G|), max(|P|, |G|) and a variance term Var(·), wherein P represents the set of pixels of the detection target predicted by the model, G represents the set of pixels of the actual detection target, min(·) represents the minimum-value function, max(·) represents the maximum-value function, and Var(·) represents a variance calculation function.
Optionally, according to an embodiment of the present application, the position-sensitive loss function calculates a position penalty value according to the predicted center point and the true center point of the detection target, wherein c_p represents the center point of the detection target predicted by the model, c_g represents the center point of the actual detection target, ρ_p represents the distance between the predicted center point and the origin of a preset coordinate system, ρ_g represents the distance between the actual center point and the origin of the preset coordinate system, θ_p represents the angle between the line from the predicted center point to the origin and the x-axis of the preset coordinate system, and θ_g represents the angle between the line from the actual center point to the origin and the x-axis of the preset coordinate system.
To achieve the above object, a second aspect of the present application further proposes an infrared small target detection system based on scale and position sensitivity, comprising the following modules:
the data enhancement module is used for preprocessing a preset infrared small target data set and performing data enhancement operation on the infrared small target data set through various data enhancement technologies;
the computing module is used for inputting the training data with different scales obtained by the augmentation into the deep convolutional neural network to conduct prediction with different scales, and obtaining a multi-scale prediction result;
the training module is used for training the deep convolutional neural network to obtain a trained deep convolutional detection model, wherein the multi-scale prediction result is constrained by a scale and position sensitive loss function in the training process;
the detection module is used for inputting the target infrared image to be detected into the trained depth convolution detection model to obtain a prediction result of the detection target.
To achieve the above object, a third aspect of the present application further proposes an electronic device, including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to enable scale and position sensitivity based infrared small target detection as in any one of the first aspects above.
To achieve the above object, a fourth aspect of the present application further proposes a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the detection of infrared small targets based on scale and position sensitivity of any one of the above first aspects.
The technical scheme provided by the embodiment of the application at least brings the following beneficial effects: according to the method and the device, in the training process of the deep learning neural network, the loss function of fusion of scale sensitivity and position sensitivity is adopted, model parameters are adaptively learned according to diversified training data, and the infrared small target detection model after training is enabled to distinguish detection targets with different scales and different positions based on the loss function with sensitivity to the scale and the position of the detection target, so that infrared small target detection with higher precision is realized. In addition, the method and the device also realize multi-scale prediction, loss constraint is carried out on prediction results of different scales, and the accuracy of infrared small target detection is further improved. In addition, the method and the device can fully utilize the characteristic information of different scales, reduce the complexity of the model, realize lighter infrared small target detection, and reduce the model training cost and the resources consumed by detection. Therefore, the scale and the position of the detection target can be accurately distinguished through the lighter detection model, and the accuracy and the applicability of infrared small target detection are improved.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart of an infrared small target detection method based on scale and position sensitivity according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a multi-scale feature prediction according to an embodiment of the present application;
fig. 3 is a schematic diagram of detection effect of an infrared small target according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the working principle of an infrared small target detection system based on scale and position sensitivity according to the embodiment of the present application;
FIG. 5 is a schematic diagram of a small infrared target detection system based on scale and position sensitivity according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
It should be noted that, as a possible implementation manner, the method for detecting an infrared small target based on scale and position sensitivity according to the present application may be performed by the system for detecting an infrared small target based on scale and position sensitivity according to the present application, where the system for detecting an infrared small target based on scale and position sensitivity may be applied to any electronic device, so that the electronic device may perform an infrared small target detection function.
The electronic device may be any device with computing capability, for example, a detection device with an infrared image acquisition function, or a personal computer PC and a mobile terminal that perform data processing after receiving detection data.
It should be further noted that, in the related embodiment, the first type of solution is to use a conventional manual design feature for detecting the small infrared target. Because the manual design features in this approach are typically based on experience and a priori knowledge, they may exhibit poor performance in the face of diverse, dynamic, practical application scenarios. Secondly, such schemes may not work well when dealing with complex situations such as noise, illumination changes, and target occlusion. This is because these methods often have difficulty capturing abstract and high-level features of objects in complex scenes, resulting in an affected detection accuracy.
And the second type of scheme adopts a detection method based on deep learning. This approach relies on designing a complex model structure, however, increasing model complexity may result in more data and computational resources being required during the training phase, increasing training costs and time. Secondly, complex models are prone to overfitting training data, degrading generalization performance in real scenes, especially in the face of noisy and varying infrared scenes. In addition, the steps of the complex model in the deployment process are more complex, larger computing resources are needed, and real-time application on the infrared equipment with limited resources is not facilitated.
In addition, in the deep learning process, the choice of loss function directly affects the optimization of the model. A suitable loss function can significantly improve model performance without increasing model complexity. However, the infrared small target detection methods of the related embodiments have studied the loss function relatively little. The widely used intersection-over-union (IoU) loss lacks sensitivity to the scale and position of targets, and targets of different scales and different positions may have the same IoU loss. This scale and position insensitivity makes it difficult for the detection model to distinguish between targets of different scales and positions, ultimately limiting detection performance. The few other loss functions are mostly designed for specific networks, which limits their applicability.
Therefore, the infrared small target detection method based on the scale and position sensitivity can accurately distinguish the scale and the position of the detection target through a lighter detection model, and improves the accuracy of infrared small target detection.
The following describes an infrared small target detection method, an infrared small target detection system and an electronic device based on scale and position sensitivity, which are provided by the embodiment of the invention, with reference to the accompanying drawings.
Fig. 1 is a flowchart of an infrared small target detection method based on scale and position sensitivity according to an embodiment of the present application, as shown in fig. 1, the method includes the following steps:
step S101: preprocessing a preset infrared small target data set, and performing data augmentation operation on the infrared small target data set through various data augmentation technologies.
Specifically, an existing infrared small target data set is firstly obtained as training data of a deep convolutional neural network, then the infrared small target data set is preprocessed, and the diversity of data in the data set can be increased by utilizing a data enhancement technology in the preprocessing process.
It should be noted that, because the disclosed infrared small target data set obtained by various modes is relatively limited, the quantity and types of image data in the data set are small, and the limited original data is directly used for training, which may cause the overfitting of the deep neural network, reduce the generalization capability of the model, and make it difficult to obtain the ideal effect in practical application. Therefore, the method and the device adopt various data enhancement technologies to carry out the enhancement operation on the original data, expand the scale of training data and promote the diversity of the data, thereby enhancing the generalization performance and performance expression of the model. The specific data enhancement technique employed can be determined according to actual needs.
In one embodiment of the present application, the data augmentation operation is performed on the infrared small target data set by a plurality of data augmentation techniques, including: for each original image in the infrared small target data set, cutting different areas of the original image by a random cutting technology to generate training samples with different sizes and positions; and carrying out random Gaussian blurring processing on each training sample so as to simulate the blurring effect of the original infrared image.
Specifically, in this embodiment, different areas of the original image are cut out by a random cropping technique, so that each original image can generate training samples of different sizes and positions, thereby simulating real infrared images in which the small target under test appears at different scales and different positions. Then, random Gaussian blur is applied to each generated image, i.e., a Gaussian blur algorithm randomly softens each generated training sample (optionally together with random noise points or textures), so that the generated training samples simulate the natural blurring effect of real infrared images.
Therefore, through the processing, the image diversity of the training data can be increased, the robustness and generalization of the trained model can be enhanced, and the accuracy and performance of the model in detection can be improved.
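The two augmentation steps above (random cropping, then random Gaussian blur) can be sketched in plain Python as follows. The crop bounds (`min_frac`) and blur strength (`sigma`) are assumptions for illustration; the text does not fix concrete parameters, and a real pipeline would operate on NumPy arrays or tensors rather than nested lists.

```python
import math
import random

def random_crop(img, min_frac=0.5):
    """Crop a random sub-region of a 2D image (list of pixel rows).

    min_frac is an assumed lower bound on the crop size relative to the
    original, producing training samples of varying sizes and positions.
    """
    h, w = len(img), len(img[0])
    ch = random.randint(int(h * min_frac), h)
    cw = random.randint(int(w * min_frac), w)
    top = random.randint(0, h - ch)
    left = random.randint(0, w - cw)
    return [row[left:left + cw] for row in img[top:top + ch]]

def gaussian_kernel_1d(sigma):
    """Normalized 1-D Gaussian kernel with radius about 3*sigma."""
    radius = max(1, int(3 * sigma))
    k = [math.exp(-x * x / (2 * sigma * sigma))
         for x in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def _blur_rows(rows, kernel):
    """Convolve each row with the kernel, replicating border pixels."""
    r = len(kernel) // 2
    out = []
    for row in rows:
        n = len(row)
        out.append([sum(row[min(max(i + j - r, 0), n - 1)] * kv
                        for j, kv in enumerate(kernel)) for i in range(n)])
    return out

def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian blur (rows, then columns), simulating the
    natural blur of real infrared imagery."""
    kernel = gaussian_kernel_1d(sigma)
    rows = _blur_rows(img, kernel)
    cols = _blur_rows([list(c) for c in zip(*rows)], kernel)
    return [list(r) for r in zip(*cols)]
```

Applying `gaussian_blur(random_crop(img))` to each original image yields one augmented sample; repeating with fresh randomness expands the data set.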
Step S102: and inputting the training data with different scales obtained by the augmentation into a deep convolutional neural network to predict with different scales, and obtaining a multi-scale prediction result.
Specifically, the training data after the augmentation processing is input into a pre-established deep convolutional neural network, and the deep convolutional neural network outputs a multi-scale prediction result about a detection target in the image data after operation processing, so that the deep convolutional neural network is trained according to the prediction result output by the neural network.
In order to more clearly illustrate the implementation process of multi-scale feature prediction and model training according to the prediction result in the present application, the following is an exemplary description of the implementation principle of multi-scale feature prediction and training deep convolutional neural network shown in fig. 2.
In this embodiment, a U-Net network commonly used in practical applications may be selected as the backbone network. Since, in the previous step, images of different scales were obtained as training data through the data augmentation operation (for example, as shown in fig. 2, four training sample images of different sizes can be generated from one original image), the decoder part of the deep convolutional neural network yields feature maps of different scales (i.e., x_1 to x_4 in fig. 2). These feature maps are used as input features to different detection heads to obtain predictions at different scales. That is, as shown in fig. 2, a corresponding detection head module is preset for each of the feature maps x_1 to x_4 to perform the prediction operation, and each feature map corresponds to one prediction result. All predictions from the different feature maps are then stitched together to obtain the final prediction.
In specific implementation, training data with different scales obtained by augmentation is input into a deep convolutional neural network to conduct prediction with different scales, and a multi-scale prediction result is obtained, wherein the method comprises the following steps: respectively inputting training data with different scales to corresponding detection head modules in the deep convolutional neural network, and obtaining a prediction result output by each detection head module according to the input characteristics; and splicing all the prediction results into the multi-scale prediction result through a characteristic splicing algorithm.
For example, referring to the example shown in fig. 2, the process of obtaining the prediction result of each detection head may be expressed by the following formula:
p_i = σ(Conv(x_i))
wherein p_i and x_i represent the prediction result and the feature input at the i-th scale, respectively (in this example i ranges from 1 to 4), and σ(·) and Conv(·) represent the Sigmoid activation function and the convolution operation, corresponding to the Sigmoid module and the conv3×3 convolution module in fig. 2.
Further, when there are training data (i.e., feature maps) of four scales as in this example, the final multi-scale prediction result is calculated by the following formula:
P = σ(Conv(Concat(Up(p_1, s_1), Up(p_2, s_2), Up(p_3, s_3), Up(p_4, s_4))))
wherein P represents the multi-scale prediction result, σ(·) represents the activation function, Conv(·) represents the convolution function, Up(a, b) represents up-sampling of the first parameter a by a factor given by the second parameter b (s_i being the factor that maps the i-th scale back to the common output resolution), Concat(·) represents the splicing operation, and p_i represents the prediction result output by the i-th detection head, i = 1, 2, 3, 4.
therefore, the multi-scale prediction result of the detection target in the image can be obtained for a group of input training data with different scales, and the result comprises the positioning result of the infrared small target to be detected in the feature images with different scales. In practical application, the scale number in a group of feature maps with different scales can be determined according to factors such as actual detection requirements and data enhancement processes, and the application is not limited to the scale number.
Step S103: training the deep convolution neural network to obtain a trained deep convolution detection model, wherein the multi-scale prediction result is constrained by the scale and position sensitive loss function in the training process.
Specifically, training is performed on a pre-constructed deep convolutional neural network by using training data after data enhancement until a prediction result of the deep convolutional neural network can meet the detection requirement of an actual infrared small target, so that a deep convolutional detection model for detecting the infrared small target is obtained.
In the model training process, for a group of training data, a multi-scale prediction result is obtained in the manner shown in step S102 above; then, the scale and position sensitive loss provided by the application is calculated for each prediction result, so that the prediction results of different scales are constrained, the network pays different degrees of attention to the detection target, and better overall detection performance is obtained.
In one embodiment of the present application, the scale and position sensitive loss function is obtained by summing the scale-sensitive loss function and the position-sensitive loss function, i.e., it can be expressed by the following formula:
L = L_scale + L_pos
wherein L_scale represents the scale-sensitive loss function and L_pos represents the position-sensitive loss function.
The scale-sensitive loss function calculates a loss weight according to the predicted scale and the true scale of the detection target. Its weight term is constructed from min(|P|, |G|), max(|P|, |G|) and a variance term Var(·), wherein P represents the set of pixels of the detection target predicted by the model, G represents the set of pixels of the actual detection target, min(·) represents the minimum-value function, max(·) represents the maximum-value function, and Var(·) represents a variance calculation function.
It can thus be seen that the scale-sensitive loss function of the present application calculates a loss weight according to the predicted scale and the true scale of the detection target. This weight removes the scale insensitivity of the traditional intersection-over-union loss: the larger the difference between the predicted scale and the true scale, the more the detection network focuses on the target.
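The behaviour described above, a weight that vanishes when the predicted and true scales agree and grows with their mismatch, can be sketched as follows. This is an assumed illustrative form built only from min(|P|, |G|) and max(|P|, |G|); the patent's exact expression (including its variance term) is not reproduced in the source text.

```python
def scale_weight(pred_area, true_area):
    """Illustrative scale-mismatch weight from the predicted and true
    target areas (an assumed form, not the patent's exact formula).
    Returns 0 when the scales match and approaches 1 as they diverge,
    so badly scaled predictions contribute more to the loss."""
    if max(pred_area, true_area) == 0:
        return 0.0
    return 1.0 - min(pred_area, true_area) / max(pred_area, true_area)
```

For example, a prediction covering half the true target area receives weight 0.5, while a perfectly scaled prediction receives weight 0.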
Further, in this embodiment, the position-sensitive loss function calculates a position penalty value according to the predicted center point and the true center point of the detection target. The penalty is constructed from the following quantities: c_p, the center point of the detection target predicted by the deep convolutional neural network model; c_g, the center point of the actual detection target; ρ_p, the distance between the predicted center point and the origin of a preset coordinate system; ρ_g, the distance between the actual center point and the origin of the preset coordinate system; θ_p, the angle between the line from the predicted center point to the origin and the x-axis of the preset coordinate system; and θ_g, the angle between the line from the actual center point to the origin and the x-axis of the preset coordinate system.
In this embodiment, for each infrared image, the above-mentioned preset coordinate system is established by taking the upper left corner of the infrared image as the origin of the coordinate system, taking the upper boundary of the image as the x-axis and the left boundary as the y-axis, and then performing subsequent operations.
It can thus be seen that the position-sensitive loss function of the present application designs a position penalty according to the predicted center point and the true center point of the detection target, and the penalty value changes with the type of localization error, so that the network can locate the target more accurately.
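The polar quantities (ρ, θ) used by the position penalty can be computed directly from a center point under the coordinate system described above (origin at the image's top-left corner, x along the top border, y along the left border). The way the radial and angular discrepancies are combined below is an assumed illustration; the patent's exact penalty expression is not reproduced in the source text.

```python
import math

def center_polar(center):
    """Convert a target center (x, y) to (distance to the origin, angle
    with the x-axis), with the origin at the image's top-left corner."""
    x, y = center
    return math.hypot(x, y), math.atan2(y, x)

def position_penalty(pred_center, true_center):
    """Illustrative position penalty combining the radial and angular
    discrepancies of the predicted and true centers (assumed form)."""
    rho_p, theta_p = center_polar(pred_center)
    rho_g, theta_g = center_polar(true_center)
    return abs(rho_p - rho_g) + abs(theta_p - theta_g)
```

Note that the two terms distinguish different kinds of localization error: a prediction at the right distance but wrong direction is penalized through the angle, and vice versa, matching the text's claim that the penalty varies with the type of error.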
Furthermore, the scale sensitivity loss and the position sensitivity loss are added to obtain a final scale and position sensitivity loss function, and the detection model is trained through the loss function, so that the method has a more accurate detection effect.
For example, in experiments, the present application actually detects several infrared small targets having different scales and different positions, and collects the loss values of the scale and position sensitive loss function of the present method for these detection targets, together with the results of the intersection-over-union (IoU) loss function of the related embodiment. As shown in fig. 3, the scale and position sensitive loss function yields an accurate and distinct loss value L for each detection target of different scale and position, whereas the IoU loss function takes the same value under different conditions: for example, the IoU loss is 0.4 for each detection target in the first row of fig. 3, and 0.3 for each detection target in the second row of fig. 3. Therefore, the scale and position sensitive loss function can reflect, in the loss value itself, the differences in position and scale of the detection target.
Based on the above training mode, the deep convolutional neural network can be trained for multiple rounds on a large amount of training data, with model parameters adjusted according to the calculation result of each round. During training, the network continuously adjusts itself according to the principle of gradient descent to obtain the optimal parameters of the detection model, and these optimal parameters are loaded to obtain the optimal detection model.
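The training loop described above can be sketched in a few lines. A one-dimensional quadratic loss stands in for the deep convolutional network purely to show the pattern: repeated rounds of gradient descent, with the best parameters seen so far retained as the "optimal" ones. The learning rate and round count are illustrative assumptions:

```python
# Toy stand-in for the network's loss surface: minimum at w = 3.
def loss_fn(w):
    return (w - 3.0) ** 2

def grad_fn(w):
    return 2.0 * (w - 3.0)   # analytic gradient of the toy loss

w, lr = 0.0, 0.1
best_w, best_loss = w, loss_fn(w)
for epoch in range(100):      # multiple training rounds
    w -= lr * grad_fn(w)      # gradient-descent parameter update
    cur = loss_fn(w)
    if cur < best_loss:       # retain the best ("optimal") parameters
        best_w, best_loss = w, cur
print(best_w, best_loss)
```

In the real system, `loss_fn` would be the scale and position sensitive loss evaluated on a batch, and the retained `best_w` corresponds to the optimal model parameters that are loaded for detection.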
Step S104: and inputting the target infrared image to be detected into a trained deep convolution detection model to obtain a prediction result of the detection target.
Specifically, in the process of actually detecting infrared small targets, the trained deep convolution detection model is called, the current target infrared image to be detected is input into it, and the detection results output by the model are obtained, such as the detection targets identified from the target infrared image and their positioning information.
In summary, in the method for detecting infrared small targets based on scale and position sensitivity according to the embodiments of the present application, a loss function integrating scale sensitivity and position sensitivity is adopted during the training of the deep neural network, and model parameters are adaptively adjusted by learning from diversified training data. Because this loss function is sensitive to the scale and position of the detection target, the trained infrared small target detection model can distinguish detection targets of different scales and positions, realizing higher-precision infrared small target detection. In addition, the method realizes multi-scale prediction and applies loss constraints to prediction results of different scales, further improving the precision of infrared small target detection. The method can also make full use of feature information at different scales, reduce model complexity, realize lighter-weight infrared small target detection, and reduce both the training cost of the model and the resources consumed by detection. The method can therefore accurately distinguish the scale and position of the detection target with a lighter detection model, improving the accuracy and applicability of infrared small target detection.
In order to implement the above embodiments, the present application further proposes an infrared small target detection system based on scale and position sensitivity, which includes a data enhancement module 10 and a multi-scale prediction module 20. The working principle of the system is shown in fig. 4, which comprises a training stage (shown by dotted lines in the figure) and an actual detection stage (shown by solid arrows in the figure).
The data enhancement module 10 acts in the training stage of the system and introduces image diversity through random cropping and Gaussian blur to enhance the robustness and generalization of the system.
The multi-scale prediction module 20 is used for obtaining prediction results of the target at different scales in the training stage, and for updating module parameters according to the gradient descent rule by calculating the scale and position sensitivity loss of those results. In the actual detection stage, the target image is input to the multi-scale prediction module 20, which outputs the prediction result.
In particular, as one implementation, the multi-scale prediction module 20 includes a calculation module 21, a training module 22, and a detection module 23 to implement its functions. The following describes the model structure shown in fig. 5.
Fig. 5 is a schematic structural diagram of an infrared small target detection system based on scale and position sensitivity according to an embodiment of the present application, and as shown in fig. 5, the system includes a data enhancement module 10, a calculation module 21, a training module 22, and a detection module 23.
The data enhancement module 10 is configured to preprocess a preset infrared small target data set and perform data augmentation operations on the infrared small target data set through multiple data augmentation techniques.
The computing module 21 is configured to input the training data of different scales obtained by augmentation into the deep convolutional neural network to perform prediction at different scales, obtaining a multi-scale prediction result.
The training module 22 is configured to train the deep convolutional neural network to obtain a trained deep convolutional detection model, where the multi-scale prediction result is constrained by the scale and position sensitive loss function in the training process.
The detection module 23 is configured to input the target infrared image to be detected into a trained deep convolution detection model, and obtain a prediction result of the detection target.
In one embodiment of the present application, the data enhancement module 10 is specifically configured to: for each original image in the infrared small target data set, crop different areas of the original image by a random cropping technique to generate training samples of different sizes and positions; and perform random Gaussian blur processing on each training sample so as to simulate the blur effect of the original infrared image.
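A minimal numpy sketch of the two augmentation steps — random cropping to vary sample size and position, followed by a random Gaussian blur — might look as follows; the crop size, blur radius, and sigma range are illustrative assumptions rather than values from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop(img, crop_h, crop_w):
    """Cut a random region of the image, yielding samples at varying positions."""
    h, w = img.shape
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return img[top:top + crop_h, left:left + crop_w]

def gaussian_blur(img, sigma=1.0, radius=2):
    """Separable Gaussian blur simulating the blur of real infrared imagery."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()  # normalize the 1-D kernel
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)
    return blurred

img = rng.random((64, 64))                         # stand-in for an infrared frame
sample = random_crop(img, 32, 32)                  # different position per call
augmented = gaussian_blur(sample, sigma=rng.uniform(0.5, 1.5))  # random blur strength
print(sample.shape, augmented.shape)
```

In practice the crop size would itself be randomized to produce training samples of different sizes, as the embodiment describes.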
In one embodiment of the present application, the calculating module 21 is specifically configured to: respectively inputting training data with different scales to corresponding detection head modules in the deep convolutional neural network, and obtaining a prediction result output by each detection head module according to the input characteristics; and splicing all the prediction results into a multi-scale prediction result through a characteristic splicing algorithm.
In one embodiment of the present application, when there are four scales of training data, the calculation module 21 may calculate the multi-scale prediction result by the following formula:
In the above formula, the symbols denote, respectively: the multi-scale prediction result; the activation function; the convolution function; an up-sampling operation that up-samples its first argument by the factor given by its second argument; and the splicing (concatenation) operation; while p_i denotes the prediction result output by the i-th detection head, i = 1, 2, 3, 4.
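The fusion of the four detection heads can be sketched as below. The nearest-neighbour upsampling, the factors (1, 2, 4, 8), the uniform stand-in weights for the convolution, and the sigmoid activation are all assumptions for illustration; the exact operators are given by the patent's formula:

```python
import numpy as np

def upsample(x, factor):
    """Nearest-neighbour upsampling of a (H, W) map by an integer factor."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fuse(preds):
    """Sketch of multi-scale fusion: upsample each head's output to the
    finest resolution, stack along a channel axis, mix the channels with
    1x1-convolution-style weights, then apply the activation."""
    target = preds[0].shape[0]                  # finest spatial resolution
    stacked = np.stack([upsample(p, target // p.shape[0]) for p in preds])
    weights = np.full(len(preds), 1.0 / len(preds))  # stand-in conv weights
    mixed = np.tensordot(weights, stacked, axes=1)   # weighted channel mix
    return sigmoid(mixed)

# Four detection heads, from the finest 64x64 down to the coarsest 8x8:
rng = np.random.default_rng(1)
p = [rng.random((64 // 2 ** i, 64 // 2 ** i)) for i in range(4)]
out = fuse(p)
print(out.shape)
```

The result is a single prediction map at the finest scale that aggregates evidence from all four heads, which is then constrained by the scale and position sensitive loss during training.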
It should be noted that the above description of the embodiments of the infrared small target detection method based on scale and position sensitivity is also applicable to the infrared small target detection system of the present application; the implementation principle is the same and is not repeated here.
In summary, in the infrared small target detection system based on scale and position sensitivity according to the embodiments of the present application, a loss function integrating scale sensitivity and position sensitivity is adopted during the training of the deep neural network, and model parameters are adaptively adjusted by learning from diversified training data. Because this loss function is sensitive to the scale and position of the detection target, the trained infrared small target detection model can distinguish detection targets of different scales and positions, realizing higher-precision infrared small target detection. In addition, the system realizes multi-scale prediction and applies loss constraints to prediction results of different scales, further improving the precision of infrared small target detection. The system can also make full use of feature information at different scales, reduce model complexity, realize lighter-weight infrared small target detection, and reduce both the training cost of the model and the resources consumed by detection. The system can therefore accurately distinguish the scale and position of the detection target with a lighter detection model, improving the accuracy and applicability of infrared small target detection.
In order to implement the above embodiments, the present application further proposes an electronic device. As shown in fig. 6, the electronic device 600 includes: a processor 610; and a memory 620 for storing instructions executable by the processor 610; wherein the processor 610 is configured to execute the instructions to implement the method of infrared small target detection based on scale and position sensitivity as described in any of the embodiments of the first aspect above.
To achieve the above embodiments, the present application further proposes a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method for detecting infrared small targets based on scale and position sensitivity as described in any of the embodiments of the first aspect above.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" is at least two, such as two, three, etc., unless explicitly defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and additional implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like. Although embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, alternatives, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims (10)

1. The method for detecting the infrared small target based on the scale and the position sensitivity is characterized by comprising the following steps of:
preprocessing a preset infrared small target data set, and performing data augmentation operation on the infrared small target data set through a plurality of data augmentation technologies;
inputting the training data with different scales obtained by the augmentation into a deep convolutional neural network to predict with different scales, and obtaining a multi-scale prediction result;
training the deep convolution neural network to obtain a trained deep convolution detection model, wherein the multi-scale prediction result is constrained by a scale and position sensitive loss function in the training process;
and inputting the target infrared image to be detected into the trained deep convolution detection model to obtain a prediction result of the detection target.
2. The method of claim 1, wherein said performing data augmentation operations on said infrared small target data set by a plurality of data augmentation techniques comprises:
for each original image in the infrared small target data set, cutting different areas of the original image by a random cutting technology to generate training samples with different sizes and positions;
and carrying out random Gaussian blur processing on each training sample so as to simulate the blur effect of the original infrared image.
3. The method according to claim 1, wherein the inputting the training data with different scales obtained by augmentation into the deep convolutional neural network to perform prediction with different scales, and obtaining a multi-scale prediction result includes:
respectively inputting the training data with different scales to corresponding detection head modules in the deep convolutional neural network, and obtaining a prediction result output by each detection head module according to input characteristics;
and splicing all the prediction results into the multi-scale prediction results through a characteristic splicing algorithm.
4. A method according to claim 3, wherein the multi-scale prediction result is calculated by the following formula when there are four scales of training data:
wherein the symbols denote, respectively: the multi-scale prediction result; the activation function; the convolution function; an up-sampling operation that up-samples its first argument by the factor given by its second argument; and the splicing (concatenation) operation; while p_i denotes the prediction result output by the i-th detection head, i = 1, 2, 3, 4.
5. the method according to claim 1, wherein the scale and position sensitive loss function is obtained by adding a scale sensitive loss function and a position sensitive loss function, the scale sensitive loss function calculating a loss weight according to a predicted scale and a true scale of the detection target, the scale sensitive loss function being expressed by the following formula:
wherein,
wherein the symbols denote, respectively: the set of pixels of the detection target predicted by the model; the set of pixels of the actual detection target; the minimum function; the maximum function; and the variance calculation function.
6. The method of claim 5, wherein the position sensitive loss function calculates a position penalty value based on a predicted center point and a true center point of the detection target, the position sensitive loss function being expressed by the following formula:
wherein the symbols denote, respectively: the center point of the detection target predicted by the model; the center point of the actual detection target; the distance between the predicted center point and the origin of a preset coordinate system; the distance between the actual center point and the origin of the preset coordinate system; the angle between the line from the predicted center point to the origin and the x-axis of the preset coordinate system; and the angle between the line from the actual center point to the origin and the x-axis of the preset coordinate system.
7. An infrared small target detection system based on scale and position sensitivity is characterized by comprising the following modules:
the data enhancement module is used for preprocessing a preset infrared small target data set and performing data enhancement operation on the infrared small target data set through various data enhancement technologies;
the computing module is used for inputting the training data with different scales obtained by the augmentation into the deep convolutional neural network to conduct prediction with different scales, and obtaining a multi-scale prediction result;
the training module is used for training the deep convolutional neural network to obtain a trained deep convolutional detection model, wherein the multi-scale prediction result is constrained by a scale and position sensitive loss function in the training process;
the detection module is used for inputting the target infrared image to be detected into the trained depth convolution detection model to obtain a prediction result of the detection target.
8. The system according to claim 7, wherein the data enhancement module is specifically configured to:
for each original image in the infrared small target data set, cutting different areas of the original image by a random cutting technology to generate training samples with different sizes and positions;
and carrying out random Gaussian blur processing on each training sample so as to simulate the blur effect of the original infrared image.
9. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the scale and position sensitivity based infrared small target detection method of any one of claims 1-6.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the scale and position sensitivity based infrared small object detection method according to any of claims 1-6.
CN202410176677.4A 2024-02-08 2024-02-08 Infrared small target detection method and system based on scale and position sensitivity Active CN117726807B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410176677.4A CN117726807B (en) 2024-02-08 2024-02-08 Infrared small target detection method and system based on scale and position sensitivity

Publications (2)

Publication Number Publication Date
CN117726807A true CN117726807A (en) 2024-03-19
CN117726807B CN117726807B (en) 2024-06-21

Family

ID=90211076

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410176677.4A Active CN117726807B (en) 2024-02-08 2024-02-08 Infrared small target detection method and system based on scale and position sensitivity

Country Status (1)

Country Link
CN (1) CN117726807B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102819740A (en) * 2012-07-18 2012-12-12 西北工业大学 Method for detecting and positioning dim targets of single-frame infrared image
CN108510012A (en) * 2018-05-04 2018-09-07 四川大学 A kind of target rapid detection method based on Analysis On Multi-scale Features figure
CN110991359A (en) * 2019-12-06 2020-04-10 重庆市地理信息和遥感应用中心(重庆市测绘产品质量检验测试中心) Satellite image target detection method based on multi-scale depth convolution neural network
CN112949508A (en) * 2021-03-08 2021-06-11 咪咕文化科技有限公司 Model training method, pedestrian detection method, electronic device and readable storage medium
CN114511780A (en) * 2022-01-21 2022-05-17 南京航空航天大学 Multi-mode small target detection method and system based on remote sensing image
US20220309674A1 (en) * 2021-03-26 2022-09-29 Nanjing University Of Posts And Telecommunications Medical image segmentation method based on u-net
CN115546502A (en) * 2022-10-14 2022-12-30 西安镭映光电科技有限公司 Infrared small target detection method based on YOLOv4 multi-scale feature fusion
CN115690536A (en) * 2022-10-26 2023-02-03 清华大学深圳国际研究生院 Single-frame infrared small target detection method and device

Also Published As

Publication number Publication date
CN117726807B (en) 2024-06-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant