CN116883640A - Method and device for detecting infrared target in complex scene - Google Patents


Info

Publication number
CN116883640A
CN116883640A
Authority
CN
China
Prior art keywords
target
detection
infrared
positioning
detection result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310876776.9A
Other languages
Chinese (zh)
Inventor
王舒伟
杨金宝
仝晓杰
杨晨
侯凯文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Environmental Features
Original Assignee
Beijing Institute of Environmental Features
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Environmental Features filed Critical Beijing Institute of Environmental Features
Priority to CN202310876776.9A priority Critical patent/CN116883640A/en
Publication of CN116883640A publication Critical patent/CN116883640A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/778 Active pattern-learning, e.g. online learning of image or video features
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection


Abstract

The invention provides a method and a device for detecting an infrared target in a complex scene. The method comprises: acquiring an infrared image detected in real time, the infrared image containing a moving target; performing target detection on the infrared image with a classical infrared target detection algorithm and with a deep learning algorithm, obtaining a first target detection result and a second target detection result in one-to-one correspondence; and performing target fusion on the first and second target detection results to output a positioning target for the infrared image. The scheme combines the advantages of the two algorithms, so that both small infrared targets and targets with obvious shape and texture can be detected accurately, and the false alarm rate of target detection is reduced.

Description

Method and device for detecting infrared target in complex scene
Technical Field
The embodiment of the invention relates to the technical field of image processing, in particular to an infrared target detection method and device under a complex scene.
Background
Infrared imaging technology is widely applied in the military and civil fields. Accurately detecting and tracking small infrared targets in complex scenes, such as sky, sea-surface, and sea-sky backgrounds, remains a bottleneck problem to be solved urgently.
At present, target detection, identification, and tracking in infrared images are generally realized either with a classical filtering algorithm or with a deep-learning convolutional neural network. However, the classical filtering algorithms generalize poorly, the deep learning methods lose accuracy on targets whose texture features are not obvious, and both approaches suffer from high false-alarm rates.
Disclosure of Invention
The embodiment of the invention provides an infrared target detection method and device in a complex scene, which can reduce the false alarm rate of target detection.
In a first aspect, an embodiment of the present invention provides a method for detecting an infrared target in a complex scene, including:
acquiring an infrared image detected in real time; the infrared image comprises a moving target;
respectively carrying out target detection on the infrared images by using a classical infrared target detection algorithm and a deep learning algorithm, and obtaining a first target detection result and a second target detection result in one-to-one correspondence;
and carrying out target fusion on the first target detection result and the second target detection result so as to output a positioning target aiming at the infrared image.
Preferably, the classical infrared target detection algorithm comprises at least one of a gray-gradient algorithm, an optical-flow-field algorithm, and a Gaussian filtering algorithm; and/or,
the deep learning algorithm is a kernel correlation filtering (KCF) algorithm.
Preferably, performing object fusion on the first object detection result and the second object detection result to output a positioning object for the infrared image, including:
determining the number of detected targets according to the first target detection result; if the number is not smaller than a first set number, selecting the first set number of targets with the highest confidence from the detected targets as first detection targets; otherwise, taking all detected targets as first detection targets;
determining, according to the second target detection result, whether a second detection target with a size exceeding a set value is detected; if so, determining the positioning target to be output according to the coincidence between the second detection target and the first detection targets, and outputting the positioning target; if not, outputting the first detection targets as positioning targets.
Preferably, the determining the positioning target of the required output according to the coincidence condition of the second detection target and the first detection target includes:
for each first detection target, performing: determining whether the first detection target is neither covered by nor coincident with the second detection target; if so, determining the first detection target as a positioning target; otherwise, determining the first detection target as a non-positioning target;
and determining the second detection target as a positioning target.
Preferably, if it is determined according to the second target detection result that multiple detected targets have a size exceeding the set value, the target of the maximum size is determined as the second detection target.
Preferably, the first set number is 2.
Preferably, the method further comprises:
and calculating the angular offset of each positioning target, and assigning the angular offset as the off-target amount (miss distance) of the corresponding positioning target for output.
In a second aspect, an embodiment of the present invention further provides an infrared target detection apparatus in a complex scene, including:
the acquisition unit is used for acquiring the infrared image detected in real time; the infrared image comprises a moving target;
the detection unit is used for respectively carrying out target detection on the infrared images by utilizing a classical infrared target detection algorithm and a deep learning algorithm, and obtaining a first target detection result and a second target detection result in one-to-one correspondence;
and the fusion unit is used for carrying out target fusion on the first target detection result and the second target detection result so as to output a positioning target aiming at the infrared image.
In a third aspect, an embodiment of the present invention further provides an electronic device, including a memory and a processor, where the memory stores a computer program, and when the processor executes the computer program, the method described in any embodiment of the present specification is implemented.
In a fourth aspect, embodiments of the present invention also provide a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform a method according to any of the embodiments of the present specification.
The embodiment of the invention provides an infrared target detection method and device for complex scenes. A classical infrared target detection algorithm detects small infrared targets accurately and in real time, while a deep learning algorithm detects targets with obvious shape and texture features with high accuracy. The infrared image is therefore detected with both algorithms, and the two detection results are fused to obtain the positioning targets in the infrared image. In this way the respective advantages of the two algorithms are combined: both small infrared targets and targets with obvious shape and texture are detected accurately, and the false alarm rate of target detection is reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of an infrared target detection method in a complex scene according to an embodiment of the present invention;
FIG. 2 is a hardware architecture diagram of an electronic device according to an embodiment of the present invention;
fig. 3 is a block diagram of an infrared target detection device in a complex scene according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments, and all other embodiments obtained by those skilled in the art without making any inventive effort based on the embodiments of the present invention are within the scope of protection of the present invention.
Referring to fig. 1, an embodiment of the present invention provides a method for detecting an infrared target in a complex scene, where the method includes:
step 100, acquiring an infrared image detected in real time; a target in the infrared image has at least one characteristic of high-speed motion, scale change and affine change;
102, respectively carrying out target detection on the infrared image by using a classical infrared target detection algorithm and a deep learning algorithm, and obtaining a first target detection result and a second target detection result in one-to-one correspondence;
and 104, performing target fusion on the first target detection result and the second target detection result to output a positioning target aiming at the infrared image.
In the embodiment of the invention, the classical infrared target detection algorithm detects small infrared targets accurately and in real time, while the deep learning algorithm detects targets with obvious shape and texture features with high accuracy. The infrared image is detected with both algorithms, and the two detection results are fused to obtain the positioning targets in the infrared image. The scheme thus combines the respective advantages of the two algorithms, so that both small infrared targets and targets with obvious shape and texture are detected accurately, and the false alarm rate of target detection is reduced.
The manner in which the individual steps shown in fig. 1 are performed is described below.
Firstly, aiming at step 100, acquiring an infrared image detected in real time; the object in the infrared image has at least one of a high-speed motion, a scale change, and an affine change.
In the embodiment of the invention, an infrared imaging system may acquire video data of the moving target in real time to obtain the infrared image detected in real time. The moving target has the characteristics of high-speed motion, scale change, and affine change. For moving targets with these characteristics, detection with a single algorithm yields a high false-alarm rate. This embodiment therefore performs target detection with two different algorithms, combining their advantages to ensure both the speed and the accuracy of the detection process.
Then, for step 102, target detection is performed on the infrared image by using a classical infrared target detection algorithm and a deep learning algorithm, so as to obtain a first target detection result and a second target detection result in a one-to-one correspondence manner.
The classical infrared target detection algorithm has strong real-time performance and has certain advantages when processing weak, small infrared targets whose texture feature information is not obvious; the deep learning algorithm, given rich training samples, easily identifies real infrared targets whose shapes and texture features are relatively obvious. On this basis, the two algorithms can each be used to perform target detection on the infrared image.
In the embodiment of the invention, to meet the real-time requirement of target detection, two physical boards may execute the respective detection algorithms so as to guarantee processing speed: a classical board that runs the classical infrared target detection algorithm, and a smart board that runs the deep learning algorithm.
Further, to raise the detection speed on each infrared image after the video data is collected, the whole system may run on a TMS320C6678 platform. The infrared imaging system transmits the collected video data over the platform's SRIO hardware interface in two paths, at a frame rate of 10 Hz, to the classical board and the smart board, so that the DSP chip on each board quickly receives the infrared image and performs target detection on it. The two DSPs compute in parallel, which further improves detection real-time performance.
In order to further improve the accuracy of target detection, the infrared imaging system may acquire at least two resolution infrared images, so as to respectively perform target detection on the at least two resolution infrared images. For example, one resolution is 640×512 and the other resolution is 320×256.
In one embodiment of the present invention, the complex scene may include: a moving target against a complex motion background, and an obvious target in a complex scene.
A complex motion background refers to a scene with cloud-layer interference, ground clutter, and the like, in which the target moves relative to the background. The classical infrared target detection algorithm used in this scene may include at least one of a gray-gradient algorithm, an optical-flow-field algorithm, and a Gaussian filtering algorithm.
The obvious-target scene is a simplified form of the previous scene and is mainly processed for the infrared characteristics of aircraft. The deep learning algorithm used in this scene may be the kernel correlation filtering (KCF) algorithm.
After target detection is performed on the infrared image by using a classical infrared target detection algorithm, a first target detection result can be obtained. The first target detection result may include coordinate position and size information of the target detection frame. When the number of the target detection frames is a plurality, the detection of a plurality of targets is indicated.
After the target detection is performed on the infrared image by using a deep learning algorithm, a second target detection result can be obtained. The second target detection result may include coordinate position and size information of the target detection frames, and when the number of the target detection frames is plural, it indicates that plural targets are detected.
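For concreteness, the two detection results can each be represented as a list of detection boxes. The dictionary layout below is an illustrative assumption; the patent does not define a concrete data format.

```python
# Hypothetical layout of one detection-box entry (all values in pixels).
# Both the classical board's result and the smart board's result can be
# represented as lists of such entries.
def make_detection(x, y, w, h):
    """Top-left corner (x, y) plus width/height of the detection box."""
    return {"x": x, "y": y, "w": w, "h": h}

# Example: two small targets from the classical board, one large target
# from the smart board.
first_result = [make_detection(100, 80, 6, 5), make_detection(300, 200, 8, 7)]
second_result = [make_detection(95, 70, 60, 48)]
```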
Finally, for step 104, performing target fusion on the first target detection result and the second target detection result to output a positioning target for the infrared image.
When both detection algorithms detect multiple targets, in order to improve detection speed and shorten the multi-target comparison process, the target fusion of the two detection results specifically comprises the following steps:
s1: determining the number of detected targets according to the first target detection result, if the number is not smaller than the first set number, selecting the target with the highest confidence of the first set number from the detected targets as a first detection target, otherwise, taking the detected target as the first detection target;
preferably, the first set number is 2.
For example, if the number of detected targets is 5 according to the first target detection result, the 5 targets are ranked according to the order of the confidence from high to low, and the two targets with the highest confidence are selected as the first detection targets. If the number of the detected targets is 1 or 2 according to the first target detection result, the 1 or 2 targets are directly used as the first detection targets.
The confidence may be calculated from, for example, the target's pixel count, pixel gray values, and degree of match with the previously detected target; the higher the score, the higher the confidence. Selecting the targets with the highest confidence as the first detection targets ensures that the detected targets are more likely to be real.
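The selection step S1 can be sketched as follows. The confidence weighting is an illustrative assumption (the patent names the inputs but fixes no formula), and the field names are hypothetical.

```python
FIRST_SET_NUMBER = 2  # the preferred value given in the description

def confidence(det):
    """Toy confidence score: a weighted sum of normalised pixel count,
    mean gray value, and match degree with the previous frame's target.
    The weights 0.4/0.3/0.3 are assumptions, not values from the patent."""
    return 0.4 * det["pixel_count"] + 0.3 * det["mean_gray"] + 0.3 * det["match_prev"]

def select_first_targets(detections, k=FIRST_SET_NUMBER):
    """Keep at most k highest-confidence targets from the classical result;
    if fewer than k were detected, keep them all."""
    if len(detections) < k:
        return list(detections)
    return sorted(detections, key=confidence, reverse=True)[:k]
```

With five detections, only the two highest-scoring ones survive, which caps the later coincidence matching at two comparisons.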
S2: determining whether a second detection target with a size exceeding a set value is detected according to the second target detection result, if so, determining a positioning target to be output according to the superposition condition of the second detection target and the first detection target so as to output the positioning target; and if not, outputting the first detection target as a positioning target.
Since the smart board performs better on targets with a large surface area and more obvious texture features, a set value may be chosen, for example 40 pixels: when the length or width of a detected target in the second target detection result exceeds the set value, that target is more likely to be a real target. If no detected target exceeds the set value, the smart board is deemed to have detected no target, and the first detection targets detected by the classical board are output directly as positioning targets.
In one embodiment of the present invention, because the target area detected by the deep learning algorithm is larger, a second detection target is likely to cover a first detection target when their coincidence is matched. Therefore, to reduce the number of coincidence checks, if the second target detection result indicates multiple detected targets whose size exceeds the set value, the target with the maximum size is determined as the second detection target. That is, there is exactly one second detection target, and its size exceeds the set value.
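Step S2's gating of the smart-board result can be sketched as below; the box-dictionary fields are hypothetical, and 40 pixels is the example threshold from the text.

```python
SIZE_THRESHOLD = 40  # pixels; the example set value given in the description

def pick_second_target(detections):
    """Return the largest detection whose length or width exceeds the
    set value, or None when the smart board found no qualifying target."""
    large = [d for d in detections
             if d["w"] > SIZE_THRESHOLD or d["h"] > SIZE_THRESHOLD]
    if not large:
        return None
    # Multiple qualifying targets: keep only the maximum-size one,
    # so at most one coincidence check per first detection target.
    return max(large, key=lambda d: d["w"] * d["h"])
```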
After determining the first detection target and the second detection target, determining the positioning target to be output according to the coincidence condition of the second detection target and the first detection target may include:
for each first detection target, performing: determining whether the first detection target is neither covered by nor coincident with the second detection target; if so, determining the first detection target as a positioning target; otherwise, determining the first detection target as a non-positioning target;
and determining the second detection target as a positioning target.
Since the area of the second detection target is larger than that of a first detection target, a first detection target may be covered by the second detection target, or their outlines may coincide; either case indicates that both target detection algorithms detected the same target. For a first detection target covered by or coincident with the second detection target, the target range detected by the smart board is larger and more reliable, so the second detection target is output as the positioning target and the first detection target is not, which keeps subsequent target tracking more accurate.
The second detection target is determined as a positioning target regardless of whether it covers or coincides with any first detection target.
In determining whether the first detection target is covered by the second detection target or coincides with the second detection target, it may be determined by calculating whether the center position of the first detection target is located within the second detection target.
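The centre-in-box coincidence test and the resulting fusion can be sketched as follows, under the assumed box-dictionary layout (top-left corner plus width/height):

```python
def center_inside(first, second):
    """A first detection target counts as covered by / coincident with the
    second detection target when its box centre lies inside the second box."""
    cx = first["x"] + first["w"] / 2.0
    cy = first["y"] + first["h"] / 2.0
    return (second["x"] <= cx <= second["x"] + second["w"]
            and second["y"] <= cy <= second["y"] + second["h"])

def fuse(first_targets, second_target):
    """Fused positioning targets: the second target is always output;
    first targets survive only when not covered by / coincident with it."""
    if second_target is None:
        return list(first_targets)
    survivors = [t for t in first_targets
                 if not center_inside(t, second_target)]
    return survivors + [second_target]
```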
In the embodiment of the invention, at most two first detection targets are output through the classical plate, at most one second detection target is output through the intelligent plate, and when the coincidence condition of the second detection target and the first detection target is matched, the positioning target can be output by carrying out matching twice at most, so that the target detection can be ensured to meet the requirement of real-time performance.
Further, in order to enable the upper computer to know which target detection algorithm is used for detecting each positioning target, the output positioning targets can be identified. For example, the two first detection targets output by the classical board use the identifier 1 and the identifier 2 respectively, the second detection target output by the intelligent board uses the identifier 3, and after determining the positioning target, the corresponding target detection algorithm can be determined by the identifier bit of the positioning target.
Further, after the positioning targets are detected, when the positioning targets are multiple, one target can be selected from the multiple positioning targets for target tracking, and the upper computer selects the positioning target to be tracked by issuing a target selection instruction. After the host computer issues the target selection instruction, in order to quickly achieve tracking of the positioning target, in one embodiment of the present invention, the method may further include: and calculating the angular offset of each positioning target, and assigning the angular offset to the off-target quantity of the corresponding positioning target for output.
The angular offset is calculated by the following formula (reproduced in the original publication as an image):
where Y_h is the horizontal angular offset, Y_v is the vertical angular offset, fov_h is the horizontal field of view, fov_v is the vertical field of view, w is the width of the positioning target box, h is the height of the positioning target box, and f is the value carried in the 44th and 45th bytes of the image overlay information.
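The patent's formula itself is not reproduced in this text, so the sketch below uses a common linear mapping from the pixel offset of the box centre to horizontal and vertical angles. The mapping, the default image size, and the field names are assumptions for illustration, not the patent's formula.

```python
def angular_offset(target, fov_h, fov_v, img_w=640, img_h=512):
    """Map the target-box centre's offset from the image centre to a
    (horizontal, vertical) angle pair under a linear field-of-view model.
    fov_h / fov_v are the horizontal / vertical fields of view in degrees."""
    cx = target["x"] + target["w"] / 2.0
    cy = target["y"] + target["h"] / 2.0
    y_h = (cx - img_w / 2.0) * fov_h / img_w
    y_v = (cy - img_h / 2.0) * fov_v / img_h
    return y_h, y_v
```

A target centred in the image maps to a zero offset, and the offset grows linearly toward the image edges.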
As shown in fig. 2 and 3, the embodiment of the invention provides an infrared target detection device for complex scenes. The apparatus embodiments may be implemented by software, by hardware, or by a combination of the two. In hardware terms, fig. 2 is a hardware architecture diagram of the electronic device on which the infrared target detection device runs. Besides the processor, memory, network interface, and nonvolatile memory shown in fig. 2, the electronic device may also include other hardware, such as a forwarding chip responsible for processing packets. Taking a software implementation as an example, as shown in fig. 3, the device in the logical sense is formed by the CPU of the electronic device reading the corresponding computer program from nonvolatile memory into memory and running it. The embodiment provides an infrared target detection device for complex scenes, comprising:
an acquisition unit 301, configured to acquire an infrared image detected in real time; the infrared image comprises a moving target;
the detection unit 302 is configured to perform target detection on the infrared image by using a classical infrared target detection algorithm and a deep learning algorithm, so as to obtain a first target detection result and a second target detection result in a one-to-one correspondence manner;
and a fusion unit 303, configured to perform target fusion on the first target detection result and the second target detection result, so as to output a positioning target for the infrared image.
In one embodiment of the present invention, the classical infrared target detection algorithm comprises at least one of a gray-gradient algorithm, an optical-flow-field algorithm, and a Gaussian filtering algorithm; and/or,
the deep learning algorithm is a kernel correlation filtering (KCF) algorithm.
In one embodiment of the present invention, the fusion unit is specifically configured to: determine the number of detected targets according to the first target detection result; if the number is not smaller than a first set number, select the first set number of targets with the highest confidence from the detected targets as first detection targets; otherwise, take all detected targets as first detection targets; determine, according to the second target detection result, whether a second detection target with a size exceeding a set value is detected; if so, determine the positioning target to be output according to the coincidence between the second detection target and the first detection targets, and output the positioning target; if not, output the first detection targets as positioning targets.
In one embodiment of the present invention, when the fusion unit executes the positioning target that determines the required output according to the coincidence condition of the second detection target and the first detection target, the fusion unit specifically includes:
for each first detection target, performing: determining whether the first detection target is neither covered by nor coincident with the second detection target; if so, determining the first detection target as a positioning target; otherwise, determining the first detection target as a non-positioning target;
and determining the second detection target as a positioning target.
In one embodiment of the present invention, if it is determined according to the second target detection result that multiple detected targets have a size exceeding the set value, the target of the maximum size is determined as the second detection target.
In one embodiment of the present invention, the first set number is 2.
In one embodiment of the present invention, the fusion unit is further configured to calculate an angular offset of each positioning target, and assign the angular offset to the off-target value of the corresponding positioning target for output.
It will be appreciated that the structure illustrated in the embodiments of the present invention is not intended to be limiting in any particular manner for an infrared target detection device in a complex scenario. In other embodiments of the invention, an infrared target detection device in a complex scenario may include more or fewer components than shown, or certain components may be combined, certain components may be split, or different component arrangements. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The information interaction and execution processes between the modules in the above device are based on the same conception as the method embodiments of the present invention; for details, reference may be made to the description of the method embodiments, which is not repeated here.
An embodiment of the present invention further provides an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor, when executing the computer program, implements the infrared target detection method in a complex scene according to any embodiment of the present invention.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the infrared target detection method in a complex scene according to any embodiment of the present invention.
Specifically, a system or apparatus may be provided with a storage medium on which software program code implementing the functions of any of the above embodiments is stored, and a computer (or a CPU or an MPU) of the system or apparatus may read out and execute the program code stored in the storage medium.
In this case, the program code read from the storage medium itself realizes the functions of any of the above embodiments, so that the program code and the storage medium storing the program code form part of the present invention.
Examples of the storage medium for providing the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer by a communication network.
Further, it should be apparent that the functions of any of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform part or all of the actual operations based on the instructions of the program code.
Further, it is understood that the program code read from the storage medium may be written into a memory provided on an expansion board inserted into the computer or in an expansion module connected to the computer, and a CPU or the like mounted on the expansion board or expansion module may then perform part or all of the actual operations based on the instructions of the program code, thereby realizing the functions of any of the above embodiments.
It is noted that relational terms such as first and second, and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one …" does not exclude the presence of additional identical elements in a process, method, article or apparatus that comprises the element.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be completed by program instructions executed by relevant hardware; the foregoing program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are only intended to illustrate, not to limit, the technical solution of the present invention. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for detecting an infrared target in a complex scene, characterized by comprising:
acquiring an infrared image detected in real time, the infrared image comprising a moving target;
performing target detection on the infrared image using a classical infrared target detection algorithm and a deep learning algorithm respectively, to obtain a first target detection result and a second target detection result in one-to-one correspondence; and
performing target fusion on the first target detection result and the second target detection result to output a positioning target for the infrared image.
2. The method of claim 1, wherein the classical infrared target detection algorithm comprises at least one of a gray gradient algorithm, an optical flow field algorithm, and a Gaussian filtering algorithm; and/or
the deep learning algorithm is a kernelized correlation filtering algorithm.
3. The method of claim 1, wherein performing target fusion on the first target detection result and the second target detection result to output a positioning target for the infrared image comprises:
determining the number of detected targets according to the first target detection result; if the number is not less than a first set number, selecting the first set number of targets with the highest confidence from the detected targets as first detection targets, and otherwise taking the detected targets as the first detection targets;
determining, according to the second target detection result, whether a second detection target whose size exceeds a set value is detected; if so, determining the positioning targets to be output according to the coincidence between the second detection target and the first detection targets, and outputting the positioning targets; if not, outputting the first detection targets as positioning targets.
4. The method of claim 3, wherein determining the positioning targets to be output according to the coincidence between the second detection target and the first detection targets comprises:
for each first detection target: determining whether the first detection target is neither covered by nor overlapping the second detection target; if so, determining the first detection target as a positioning target; otherwise, determining the first detection target as a non-positioning target;
and determining the second detection target as a positioning target.
5. The method of claim 4, wherein if a plurality of targets whose size exceeds the set value are detected according to the second target detection result, the target with the largest size is determined as the second detection target.
6. The method of claim 3, wherein the first set number is 2.
7. The method of any one of claims 1-6, further comprising:
calculating an angular offset of each positioning target, and assigning the angular offset to the off-target amount of the corresponding positioning target for output.
8. An infrared target detection device in a complex scene, characterized by comprising:
an acquisition unit configured to acquire an infrared image detected in real time, the infrared image comprising a moving target;
a detection unit configured to perform target detection on the infrared image using a classical infrared target detection algorithm and a deep learning algorithm respectively, to obtain a first target detection result and a second target detection result in one-to-one correspondence; and
a fusion unit configured to perform target fusion on the first target detection result and the second target detection result to output a positioning target for the infrared image.
9. An electronic device, comprising a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program, implements the method of any one of claims 1-7.
10. A computer-readable storage medium storing a computer program which, when executed in a computer, causes the computer to perform the method of any one of claims 1-7.
CN202310876776.9A 2023-07-17 2023-07-17 Method and device for detecting infrared target in complex scene Pending CN116883640A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310876776.9A CN116883640A (en) 2023-07-17 2023-07-17 Method and device for detecting infrared target in complex scene

Publications (1)

Publication Number Publication Date
CN116883640A true CN116883640A (en) 2023-10-13

Family

ID=88269472

Country Status (1)

Country Link
CN (1) CN116883640A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination