CN117636019A - Camouflage target detection method and device and electronic equipment - Google Patents

Camouflage target detection method and device and electronic equipment

Info

Publication number
CN117636019A
Authority
CN
China
Prior art keywords
visible light
fusion
image feature
light image
sar image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311593929.5A
Other languages
Chinese (zh)
Inventor
李雪
孙开敏
何涛
王子昂
李王斌
陈学煜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 10 Research Institute
Original Assignee
CETC 10 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 10 Research Institute filed Critical CETC 10 Research Institute
Priority to CN202311593929.5A priority Critical patent/CN117636019A/en
Publication of CN117636019A publication Critical patent/CN117636019A/en
Pending legal-status Critical Current

Abstract

The embodiments of the present application provide a camouflage target detection method and device and an electronic apparatus, relating to the technical field of image processing. The method includes the following steps: obtaining a visible light image and an SAR image of a target image; extracting visible light image feature maps of multiple scales based on the visible light image, and extracting SAR image feature maps of multiple scales based on the SAR image; performing feature fusion on the visible light image feature map and the SAR image feature map belonging to the same scale to obtain a plurality of fusion results; performing classification regression processing on each fusion result to obtain a plurality of detection results corresponding one-to-one to the multiple scales; and visualizing a camouflage target in the target image based on the plurality of detection results. According to this technical solution, the detection results are obtained by fusing visible light image feature maps and SAR image feature maps of multiple scales, the camouflage target in the target image is identified using the detection results, and the detection accuracy for camouflage targets is improved.

Description

Camouflage target detection method and device and electronic equipment
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a camouflage target detection method, a camouflage target detection device, and an electronic apparatus.
Background
In real situations, a camouflage target may adopt different camouflage means, such as appearance camouflage and radar camouflage, which places higher demands on its identification; a more accurate method is therefore needed to identify camouflage targets.
Disclosure of Invention
The embodiment of the application provides a camouflage target detection method, a camouflage target detection device and electronic equipment.
Other features and advantages of the present application will be apparent from the following detailed description, or may be learned in part by the practice of the application.
According to a first aspect of an embodiment of the present application, there is provided a camouflage target detection method, including:
obtaining a visible light image and an SAR image of a target image;
extracting visible light image feature maps with various scales based on the visible light image, and extracting SAR image feature maps with various scales based on the SAR image;
performing feature fusion on the visible light image feature map and the SAR image feature map which belong to the same scale to obtain a plurality of fusion results;
performing classification regression processing on each fusion result to obtain a plurality of detection results corresponding one-to-one to a plurality of scales;
and visualizing a camouflage target in the target image based on the plurality of detection results.
In some embodiments of the present application, based on the foregoing solution, the extracting a visible light image feature map of multiple scales based on the visible light image includes:
convolving and pooling the visible light image to obtain multiple scale features of the visible light image;
performing fusion processing on the various scale features of the visible light image through an FPN structure to obtain an initial visible light image feature map with various scales;
and performing fusion processing on the initial visible light image feature maps of multiple scales through the PAN structure to obtain the visible light image feature maps of multiple scales.
In some embodiments of the present application, based on the foregoing solution, the extracting SAR image feature maps of multiple scales based on the SAR image includes:
extracting various scale features of the SAR image by utilizing an SAR image feature extraction network;
and processing the multiple scale features of the SAR image through a weighted bidirectional feature pyramid structure to obtain SAR image feature maps of multiple scales.
In some embodiments of the present application, based on the foregoing solution, the performing feature fusion on the visible light image feature map and the SAR image feature map belonging to the same scale to obtain a plurality of fusion results includes:
performing fusion preprocessing on the visible light image feature map and the SAR image feature map belonging to the same scale to obtain a plurality of fusion preprocessing results;
and fusing each fusion preprocessing result using a Transformer model to obtain a plurality of fusion results.
In some embodiments of the present application, based on the foregoing solution, the performing fusion preprocessing on the visible light image feature map and the SAR image feature map that belong to the same scale includes:
obtaining a visible light image feature map and an SAR image feature map of the same scale, downsampling each, and processing the two downsampled feature maps into a data sequence;
a first position code is generated for each feature point in the visible light image feature map and a second position code is generated for each feature point in the SAR image feature map.
In some embodiments of the present application, based on the foregoing solution, the fusing each fusion preprocessing result using a Transformer model includes:
inputting the data sequence, the first position code and the second position code into a Transformer model, and fusing the features of different modalities through self-attention and multi-head attention mechanisms.
In some embodiments of the present application, based on the foregoing solution, the performing classification regression processing on each fusion result to obtain a plurality of detection results corresponding one-to-one to a plurality of scales includes:
classifying and regressing each fusion result one by one to obtain visible light image fusion feature maps of multiple scales and SAR image fusion feature maps of multiple scales;
adding the visible light image fusion feature map and the visible light image feature map of the same scale, then convolving to obtain a first detection result;
adding the SAR image fusion feature map and the SAR image feature map of the same scale, then convolving to obtain a second detection result;
and combining, for each scale, the first detection result and the second detection result of that scale to obtain a plurality of detection results corresponding one-to-one to the multiple scales.
In some embodiments of the present application, based on the foregoing solution, the visualizing the camouflage target in the target image based on the plurality of detection results includes:
decoding the detection results of the multiple scales;
and visualizing the detected camouflage target in the target image according to the decoding result.
According to a second aspect of embodiments of the present application, there is provided a camouflage target detection device, including:
the first acquisition unit is used for acquiring a visible light image and an SAR image of the target image;
the first extraction unit is used for extracting visible light image feature maps of multiple scales based on the visible light image;
the second extraction unit is used for extracting SAR image feature maps of multiple scales based on the SAR image;
the fusion unit is used for performing feature fusion on the visible light image feature map and the SAR image feature map belonging to the same scale to obtain a plurality of fusion results;
the classification regression unit is used for performing classification regression processing on each fusion result to obtain a plurality of detection results corresponding one-to-one to a plurality of scales;
and a visualization unit for visualizing a camouflage target in the target image based on the plurality of detection results.
According to a third aspect of embodiments of the present application, there is provided a computer readable storage medium having stored thereon computer instructions which, when run on a computer, cause the computer to perform the method according to the first aspect described above.
According to a fourth aspect of embodiments of the present application, there is provided an electronic device comprising a memory and a processor;
the memory is used for storing computer instructions;
the processor is configured to invoke the computer instructions in the memory, so that the electronic device performs the method according to the first aspect.
According to this technical solution, the detection results are obtained by fusing visible light image feature maps and SAR image feature maps of multiple scales, the camouflage target in the target image is identified using the detection results, and the detection accuracy for camouflage targets is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application. It is apparent that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art. In the drawings:
FIG. 1 illustrates a flow diagram of a camouflage target detection method according to one embodiment of the present application;
FIG. 2 shows a schematic diagram of a camouflage target detection device according to one embodiment of the present application;
FIG. 3 illustrates a schematic diagram of an electronic device structure according to one embodiment of the present application;
fig. 4 shows a schematic diagram of a computer system suitable for use in implementing the electronic device of the embodiments of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the present application. One skilled in the relevant art will recognize, however, that the aspects of the application can be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the application.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The following embodiments and features of the embodiments may be combined with each other without conflict.
Referring to fig. 1, a flow diagram of a camouflage target detection method according to one embodiment of the present application is shown.
As shown in fig. 1, a camouflage target detection method is shown, which specifically includes steps S100 to S500.
Step S100, obtaining a visible light image and an SAR image of a target image.
It should be noted that a visible light image is formed mainly from reflected light, and camouflage targets usually take appearance camouflage measures, such as coating or film pasting, so that the appearance of the target resembles the surrounding environment; the reflected-light characteristics become blurred, making the target difficult to identify accurately from the visible light image alone. An SAR image is formed by measuring the reflection and scattering of radar waves by a target, and has unique resolution and penetrating power. Because a camouflage target and its surrounding environment have different electromagnetic scattering characteristics, the camouflage target in an SAR image often exhibits high brightness, that is, bright pixel values that contrast clearly with the background. The complementary characteristics of the visible light image and the SAR image make it possible to identify camouflage targets.
With continued reference to fig. 1, step S200 extracts a plurality of scale visible light image feature maps based on the visible light image, and extracts a plurality of scale SAR image feature maps based on the SAR image.
In some possible embodiments, based on the foregoing solution, the extracting a visible light image feature map of multiple scales based on the visible light image includes:
convolving and pooling the visible light image to obtain multiple scale features of the visible light image;
performing fusion processing on the multiple scale features of the visible light image through an FPN structure to obtain initial visible light image feature maps of multiple scales;
and performing fusion processing on the initial visible light image feature maps of multiple scales through the PAN structure to obtain the visible light image feature maps of multiple scales.
Note that the FPN structure in this embodiment refers to a feature pyramid network structure, and the PAN structure refers to a path aggregation network structure.
Specifically, in the FPN structure, starting from the smallest-scale feature among the multiple scale features of the visible light image, a series of upsampling and convolution operations is performed to obtain features matching the scale of the previous level, and these are fused with the corresponding-scale features obtained in the preceding step.
Specifically, in the PAN structure, starting from the largest-scale feature among the multiple scale features of the visible light image, a series of convolution operations is performed to obtain features matching the scale of the previous level, and these are fused with the corresponding-scale features obtained in the preceding step. The PAN structure processes the features in the reverse order of the FPN: the FPN proceeds from small to large feature scales, while the PAN proceeds from large to small.
For example, a visible light image (image size 640×640, 3 channels) is input, and a series of convolution and pooling operations is performed to obtain features of three scales;
through the FPN structure, three feature maps of different scales are obtained from bottom to top, with sizes 1/8, 1/16 and 1/32 of the original image and channel numbers 256, 256 and 512, respectively;
through the PAN structure, three visible light image feature maps of different scales are obtained from top to bottom, with sizes 1/8, 1/16 and 1/32 of the original image and channel numbers 256, 512 and 1024, respectively.
It can be appreciated that this example uses the FPN structure and the PAN structure to extract the visible light image feature maps by means of upsampling and downsampling, ensuring the accuracy and effectiveness of the extracted feature maps.
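By way of a hedged illustration only, the following minimal PyTorch sketch shows an FPN top-down pass followed by a PAN bottom-up pass of the kind described above; the class name FpnPanNeck, the fixed 256-channel lateral width and the module layout are assumptions introduced here, not the patent's exact network definition.

```python
# Minimal sketch of the FPN + PAN neck (channel widths and layout are assumed).
import torch
import torch.nn as nn
import torch.nn.functional as F

class FpnPanNeck(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024), out_channels=(256, 512, 1024)):
        super().__init__()
        # 1x1 convs project backbone features to a common width for fusion
        self.lateral = nn.ModuleList([nn.Conv2d(c, 256, 1) for c in in_channels])
        # FPN: smooth each top-down fused level
        self.fpn_smooth = nn.ModuleList([nn.Conv2d(256, 256, 3, padding=1) for _ in in_channels])
        # PAN: stride-2 convs carry larger-scale features back down
        self.down = nn.ModuleList([nn.Conv2d(256, 256, 3, stride=2, padding=1) for _ in in_channels[:-1]])
        self.pan_out = nn.ModuleList([nn.Conv2d(256, c, 3, padding=1) for c in out_channels])

    def forward(self, feats):  # feats: maps at 1/8, 1/16, 1/32 of the input
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        # FPN pass: upsample the smaller-scale feature and fuse upward
        for i in range(len(laterals) - 1, 0, -1):
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], scale_factor=2, mode="nearest")
        fpn = [s(x) for s, x in zip(self.fpn_smooth, laterals)]
        # PAN pass: downsample the larger-scale feature and fuse downward
        pan = [fpn[0]]
        for i in range(len(fpn) - 1):
            pan.append(fpn[i + 1] + self.down[i](pan[-1]))
        return [o(x) for o, x in zip(self.pan_out, pan)]

feats = [torch.randn(1, c, s, s) for c, s in ((256, 80), (512, 40), (1024, 20))]
print([t.shape for t in FpnPanNeck()(feats)])
# -> 1/8, 1/16, 1/32 maps of a 640x640 input with 256, 512, 1024 channels
```

Run on three backbone features of a 640×640 input, the sketch reproduces the 1/8, 1/16 and 1/32 output sizes with channel numbers 256, 512 and 1024 from the example above.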
In some possible embodiments, based on the foregoing solution, the extracting the SAR image feature map of multiple scales based on the SAR image includes:
extracting various scale features of the SAR image by utilizing an SAR image feature extraction network;
and processing the multiple scale features of the SAR image through a weighted bidirectional feature pyramid structure to obtain SAR image feature maps of multiple scales.
It can be understood that the SAR image feature extraction network can be obtained by adding a coordinate attention mechanism module to a feature extraction network.
Illustratively, an SAR image (image size 640×640, 3 channels) is input and processed with the SAR image feature extraction network to obtain SAR image features of three scales;
through the weighted bidirectional feature pyramid structure, three SAR image feature maps of different scales are generated, with sizes 1/8, 1/16 and 1/32 of the original image and channel numbers 256, 512 and 1024, respectively.
It can be understood that, in this embodiment, the SAR image features are extracted using the specially configured SAR image feature extraction network, ensuring the accuracy and effectiveness of the extracted SAR image features.
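As a hedged sketch of the two components named above, the following shows a coordinate attention block of the kind that may be added to the feature extraction network, together with a fast-normalized weighted fusion node of the kind used in a weighted bidirectional feature pyramid; the reduction ratio, activation choice and class names are illustrative assumptions.

```python
# Sketch only: coordinate attention + weighted bidirectional fusion node.
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    def __init__(self, channels, reduction=32):  # reduction ratio is an assumption
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        # Pool along each spatial axis so positional information is retained
        x_h = x.mean(dim=3, keepdim=True)                      # (n, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (n, c, w, 1)
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = self.conv_h(y_h).sigmoid()                       # height attention
        a_w = self.conv_w(y_w.permute(0, 1, 3, 2)).sigmoid()   # width attention
        return x * a_h * a_w

class WeightedFuse(nn.Module):
    """Fast normalized fusion node, as used in weighted bidirectional FPNs."""
    def __init__(self, n_inputs):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))

    def forward(self, inputs):  # inputs: same-shape feature maps
        w = self.w.relu()
        w = w / (w.sum() + 1e-4)  # learnable, normalized per-input weights
        return sum(wi * x for wi, x in zip(w, inputs))
```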
With continued reference to fig. 1, in step S300, feature fusion is performed on the visible light image feature map and the SAR image feature map that belong to the same scale, so as to obtain a plurality of fusion results.
It can be understood that the scales of the multi-scale visible light image feature maps and the multi-scale SAR image feature maps obtained in step S200 correspond one to one; for example, the visible light image feature maps have scales 1, 2 and 3, and the SAR image feature maps likewise have scales 1, 2 and 3. During feature fusion, the visible light image feature map and the SAR image feature map with scale 1 are fused, the pair with scale 2 is fused, and the pair with scale 3 is fused, finally yielding 3 fusion results.
In some possible embodiments, based on the foregoing schemes, fusion preprocessing is performed on the visible light image feature map and the SAR image feature map belonging to the same scale to obtain a plurality of fusion preprocessing results;
and each fusion preprocessing result is fused using a Transformer model to obtain a plurality of fusion results.
It can be understood that the fusion preprocessing is performed on the two image feature maps to ensure that they conform to the format required for fusion.
In some possible embodiments, based on the foregoing solution, the performing fusion preprocessing on the visible light image feature map and the SAR image feature map that belong to the same scale includes:
obtaining a visible light image feature map and an SAR image feature map of the same scale, downsampling each, and processing the two downsampled feature maps into a data sequence;
a first position code is generated for each feature point in the visible light image feature map and a second position code is generated for each feature point in the SAR image feature map.
It can be understood that this step performs fusion preprocessing on two feature maps of the same scale; since both kinds of feature maps in the present application have multiple scales, this step needs to be applied multiple times, once for the feature maps of each scale.
Illustratively, the visible light image feature map and the SAR image feature map with scale 1 are obtained and each is downsampled;
keeping the channel dimensions of the two feature maps unchanged, the remaining dimensions of the two feature maps are flattened into a data sequence;
a position encoding matrix is generated by nested loops over the width and height of the visible light image feature map, yielding a first position code for each feature point;
a position encoding matrix is generated by nested loops over the width and height of the SAR image feature map, yielding a second position code for each feature point.
The fusion preprocessing of the visible light image feature maps and SAR image feature maps with scales 2 and 3 follows the same steps as in this example.
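The preprocessing of one scale pair might be sketched as follows; the pooling operator, the sinusoidal form of the position code and the helper name preprocess are assumptions, since the embodiment only specifies downsampling, flattening with the channel dimension kept, and nested loops over width and height.

```python
# Sketch only: downsample, flatten to a sequence, and build per-point codes.
import torch
import torch.nn.functional as F

def preprocess(feat, stride=2, d_model=256):
    # feat: (n, c, h, w); c is assumed equal to d_model so the position
    # code can later be added to the flattened sequence.
    feat = F.max_pool2d(feat, kernel_size=stride)      # downsampling
    n, c, h, w = feat.shape
    seq = feat.flatten(2).transpose(1, 2)              # (n, h*w, c): channels kept
    # Nested loops over height and width build one code per feature point
    pos = torch.zeros(h * w, d_model)
    for i in range(h):
        for j in range(w):
            idx = i * w + j
            pos[idx, 0::2] = torch.sin(torch.arange(0, d_model, 2) * (i + 1) / h)
            pos[idx, 1::2] = torch.cos(torch.arange(1, d_model, 2) * (j + 1) / w)
    return seq, pos

vis_seq, first_pos = preprocess(torch.randn(1, 256, 80, 80))   # visible light map
sar_seq, second_pos = preprocess(torch.randn(1, 256, 80, 80))  # SAR map, same scale
```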
In some possible embodiments, based on the foregoing scheme, the fusing each fusion preprocessing result using a Transformer model includes:
inputting the data sequence, the first position code and the second position code into a Transformer model, and fusing the features of different modalities through self-attention and multi-head attention mechanisms.
Illustratively, the flattened sequence, the first position code and the second position code are input into the Transformer model, and the features of the different modalities are fused through the self-attention and multi-head attention mechanisms to obtain an output sequence; this output sequence is the fusion result.
It can be understood that the number of fusion preprocessing results corresponds to the number of feature-map pairs, so in the present application this step needs to be performed multiple times to obtain the plurality of fusion results.
For example, following the preceding example, the present application obtains three visible light image feature maps and three SAR image feature maps in total, so fusion preprocessing is performed three times and fusion is likewise performed three times.
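A minimal sketch of this fusion step might look as follows, assuming the two preprocessed sequences share one embedding width and that a standard Transformer encoder stands in for the model; the layer and head counts are illustrative.

```python
# Sketch only: cross-modal fusion with a standard Transformer encoder.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, d_model=256, n_heads=8, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, vis_seq, vis_pos, sar_seq, sar_pos):
        # Concatenate both modalities so multi-head self-attention mixes them
        tokens = torch.cat([vis_seq + vis_pos, sar_seq + sar_pos], dim=1)
        fused = self.encoder(tokens)
        # Split back so each modality's fused sequence can be restored to a map
        return fused.split([vis_seq.size(1), sar_seq.size(1)], dim=1)
```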
With continued reference to fig. 1, in step S400, classification regression processing is performed on each fusion result to obtain a plurality of detection results corresponding one-to-one to the multiple scales.
It will be appreciated that the fusion result is, in essence, an output sequence, so it needs to be restored to a usable feature-map form.
In some possible embodiments, based on the foregoing solution, the performing classification regression processing on each fusion result to obtain a plurality of detection results corresponding one-to-one to a plurality of scales includes:
classifying and regressing each fusion result one by one to obtain visible light image fusion feature maps of multiple scales and SAR image fusion feature maps of multiple scales;
adding the visible light image fusion feature map and the visible light image feature map of the same scale, then convolving to obtain a first detection result;
adding the SAR image fusion feature map and the SAR image feature map of the same scale, then convolving to obtain a second detection result;
and combining, for each scale, the first detection result and the second detection result of that scale to obtain a plurality of detection results corresponding one-to-one to the multiple scales.
It can be understood that, since there are a plurality of fusion results, classification regression needs to be performed correspondingly a plurality of times to obtain the visible light image fusion feature maps and SAR image fusion feature maps corresponding to the multiple scales.
Illustratively, take the restoration of the fused visible light image feature map and SAR image feature map with scale 1 as an example.
The fused output sequence is divided into two parts, which are classified and regressed into a visible light image fusion feature map and an SAR image fusion feature map with scale 1;
the visible light image fusion feature map with scale 1 is added to the visible light image feature map with scale 1, and the first detection data is then obtained through a 1×1 convolution;
the SAR image fusion feature map with scale 1 is added to the SAR image feature map with scale 1, and the second detection data is then obtained through a 1×1 convolution;
in the same manner, the first and second detection data corresponding to scale 2 and the first and second detection data corresponding to scale 3 are obtained;
the first detection data with scale 1 is added to the second detection data with scale 1, the first detection data with scale 2 to the second detection data with scale 2, and the first detection data with scale 3 to the second detection data with scale 3, yielding detection data of three different scales.
It should be noted that the visible light image fusion feature map in the embodiments of the present application refers to the map obtained by fusing the visible light image feature map, and the SAR image fusion feature map refers to the map obtained by fusing the SAR image feature map.
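For one scale, the restoration described above might be sketched as follows; the helper name restore_scale is hypothetical, and the sketch assumes the fused sequences have already been brought back to the spatial size of the pre-fusion feature maps.

```python
# Sketch only: reshape fused sequences to maps, residual add, 1x1 conv, combine.
import torch
import torch.nn as nn

def restore_scale(fused_vis, fused_sar, vis_map, sar_map,
                  conv_vis: nn.Conv2d, conv_sar: nn.Conv2d):
    n, c, h, w = vis_map.shape
    vis_fused_map = fused_vis.transpose(1, 2).reshape(n, c, h, w)
    sar_fused_map = fused_sar.transpose(1, 2).reshape(n, c, h, w)
    first = conv_vis(vis_fused_map + vis_map)   # first detection data
    second = conv_sar(sar_fused_map + sar_map)  # second detection data
    return first + second                       # detection data for this scale

# Example for scale 1 (1/8 of a 640x640 input, 256 channels assumed):
conv_v = nn.Conv2d(256, 256, kernel_size=1)     # the 1x1 convolutions
conv_s = nn.Conv2d(256, 256, kernel_size=1)
out = restore_scale(torch.randn(1, 6400, 256), torch.randn(1, 6400, 256),
                    torch.randn(1, 256, 80, 80), torch.randn(1, 256, 80, 80),
                    conv_v, conv_s)
```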
With continued reference to fig. 1, step S500 visualizes a camouflage target in the target image based on the plurality of detection results.
In some possible embodiments, the step S500 specifically includes:
decoding the detection results of the multiple scales;
and visualizing the detected camouflage target in the target image according to the decoding result.
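The embodiment does not fix the box encoding, so the following sketch assumes a YOLO-style per-cell output of (cx, cy, w, h, objectness, class scores) purely for illustration; decoding maps grid offsets to pixel coordinates, and the detected camouflage targets are then drawn with OpenCV.

```python
# Sketch only: decode multi-scale detections and draw them on the image.
import cv2
import torch

def decode_and_draw(preds, image, strides=(8, 16, 32), conf_thresh=0.5):
    """preds: one (1, h, w, 5 + num_classes) tensor per scale (assumed layout)."""
    boxes = []
    for p, s in zip(preds, strides):
        p = p[0].sigmoid()
        ys, xs = torch.where(p[..., 4] > conf_thresh)    # confident grid cells
        for y, x in zip(ys.tolist(), xs.tolist()):
            cx, cy, w, h = p[y, x, :4].tolist()
            cx, cy = (x + cx) * s, (y + cy) * s          # grid offsets -> pixels
            w, h = w * s, h * s
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    for x1, y1, x2, y2 in boxes:                         # visualize the targets
        cv2.rectangle(image, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 2)
    return image
```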
The following describes embodiments of the apparatus of the present application, which may be used to perform a camouflage target detection method in the above embodiments of the present application. For details not disclosed in the embodiments of the apparatus of the present application, please refer to the embodiments of the method described in the present application.
Referring to fig. 2, a camouflage target detection device according to an embodiment of the present application includes: a first acquisition unit 201, a first extraction unit 202, a second extraction unit 203, a fusion unit 204, a classification regression unit 205, and a visualization unit 206.
The first acquiring unit 201 is configured to acquire a visible light image and an SAR image of a target image; the first extraction unit 202 is configured to extract visible light image feature maps of multiple scales based on the visible light image; the second extraction unit 203 is configured to extract SAR image feature maps of multiple scales based on the SAR image; the fusion unit 204 is configured to perform feature fusion on the visible light image feature map and the SAR image feature map belonging to the same scale to obtain a plurality of fusion results; the classification regression unit 205 is configured to perform classification regression processing on each fusion result to obtain a plurality of detection results corresponding one-to-one to multiple scales; and the visualization unit 206 is configured to visualize the camouflage target in the target image based on the plurality of detection results.
In some possible embodiments, based on the foregoing scheme, the first extraction unit 202 is configured to:
convolving and pooling the visible light image to obtain multiple scale features of the visible light image;
performing fusion processing on the multiple scale features of the visible light image through an FPN structure to obtain initial visible light image feature maps of multiple scales;
and performing fusion processing on the initial visible light image feature maps of multiple scales through the PAN structure to obtain the visible light image feature maps of multiple scales.
In some possible embodiments, based on the foregoing scheme, the second extraction unit 203 is configured to:
extracting various scale features of the SAR image by utilizing an SAR image feature extraction network;
and processing the multiple scale features of the SAR image through a weighted bidirectional feature pyramid structure to obtain SAR image feature maps of multiple scales.
In some possible embodiments, based on the foregoing scheme, the fusion unit 204 includes: a fusion preprocessing unit, used for performing fusion preprocessing on the visible light image feature map and the SAR image feature map belonging to the same scale to obtain a plurality of fusion preprocessing results; and a fusion subunit, used for fusing each fusion preprocessing result using a Transformer model to obtain a plurality of fusion results.
In some possible embodiments, based on the foregoing scheme, the fusion preprocessing unit is configured to:
obtaining a visible light image feature map and an SAR image feature map of the same scale, downsampling each, and processing the two downsampled feature maps into a data sequence;
a first position code is generated for each feature point in the visible light image feature map and a second position code is generated for each feature point in the SAR image feature map.
In some possible embodiments, based on the foregoing scheme, the fusion subunit is configured to:
and inputting the data sequence, the first position code and the second position code into a Transformer model, and fusing the features of different modalities through self-attention and multi-head attention mechanisms.
In some possible embodiments, based on the foregoing scheme, the classification regression unit 205 is configured to:
classifying and regressing each fusion result one by one to obtain visible light image fusion feature maps of multiple scales and SAR image fusion feature maps of multiple scales;
adding the visible light image fusion feature map and the visible light image feature map of the same scale, then convolving to obtain a first detection result;
adding the SAR image fusion feature map and the SAR image feature map of the same scale, then convolving to obtain a second detection result;
and combining, for each scale, the first detection result and the second detection result of that scale to obtain a plurality of detection results corresponding one-to-one to the multiple scales.
In some possible embodiments, based on the foregoing scheme, the visualization unit 206 is configured to:
decoding the detection results of the multiple scales;
and visualizing the detected camouflage target in the target image according to the decoding result.
As shown in fig. 3, the embodiment of the present application further provides an electronic device 300, including a memory 310, a processor 320, and a computer program 311 stored in the memory 310 and capable of running on the processor, where the processor 320 implements the above-mentioned method for detecting a camouflage target when executing the computer program 311.
Since the electronic device described in this embodiment is a device for implementing the method of the embodiments of the present application, those skilled in the art can, based on the method described in the embodiments, understand the specific implementation of the electronic device and its various modifications; therefore, how the electronic device implements the method of the embodiments is not described in detail here. Any device used by those skilled in the art to implement the method of the embodiments of the present application falls within the scope of protection intended by the present application.
Fig. 4 shows a schematic diagram of a computer system suitable for use in implementing the electronic device of the embodiments of the present application.
It should be noted that, the computer system 400 of the electronic device shown in fig. 4 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present application.
As shown in fig. 4, the computer system 400 includes a central processing unit (Central Processing Unit, CPU) 401 that can perform various appropriate actions and processes, such as performing the methods described in the above embodiments, according to a program stored in a Read-Only Memory (ROM) 402 or a program loaded from a storage section 408 into a random access Memory (Random Access Memory, RAM) 403. In the RAM 403, various programs and data required for the system operation are also stored. The CPU 401, ROM 402, and RAM 403 are connected to each other by a bus 404. An Input/Output (I/O) interface 405 is also connected to bus 404.
The following components are connected to the I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output section 407 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 408 including a hard disk or the like; and a communication section 409 including a network interface card such as a LAN (Local Area Network) card or a modem. The communication section 409 performs communication processing via a network such as the Internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 410 as needed, so that a computer program read therefrom is installed into the storage section 408 as needed.
In particular, according to embodiments of the present application, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 409 and/or installed from the removable medium 411. When the computer program is executed by the central processing unit (CPU) 401, the various functions defined in the system of the present application are performed.
It should be noted that the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, a computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium other than a computer readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wired, or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. Each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or by combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by means of software or by means of hardware, and the described units may also be provided in a processor. In some cases, the names of the units do not constitute a limitation on the units themselves.
As another aspect, the present application also provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs a camouflage target detection method as described in the above embodiments.
As another aspect, the present application also provides a computer-readable medium that may be contained in the electronic device described in the above embodiment; or may exist alone without being incorporated into the electronic device. The computer-readable medium carries one or more programs that, when executed by the electronic device, cause the electronic device to implement a camouflage target detection method as described in the above embodiments.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit, in accordance with embodiments of the present application. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes several instructions to cause a computing device (such as a personal computer, a server, a touch terminal, or a network device) to perform the method according to the embodiments of the present application.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (10)

1. A camouflage target detection method, comprising:
obtaining a visible light image and an SAR image of a target image;
extracting visible light image feature maps with various scales based on the visible light image, and extracting SAR image feature maps with various scales based on the SAR image;
performing feature fusion on the visible light image feature map and the SAR image feature map which belong to the same scale to obtain a plurality of fusion results;
performing classification regression processing on each fusion result to obtain a plurality of detection results corresponding one-to-one to a plurality of scales;
and visualizing a camouflage target in the target image based on the plurality of detection results.
2. The method of claim 1, wherein the extracting the plurality of scale visible light image feature maps based on the visible light image comprises:
convolving and pooling the visible light image to obtain multiple scale features of the visible light image;
performing fusion processing on the various scale features of the visible light image through an FPN structure to obtain an initial visible light image feature map with various scales;
and performing fusion processing on the initial visible light image feature maps of multiple scales through the PAN structure to obtain the visible light image feature maps of multiple scales.
3. The method of claim 1, wherein the extracting the SAR image feature map of multiple scales based on the SAR image comprises:
extracting various scale features of the SAR image by utilizing an SAR image feature extraction network;
and processing the multiple scale features of the SAR image through a weighted bidirectional feature pyramid structure to obtain SAR image feature maps of multiple scales.
4. A method according to any one of claims 1 to 3, wherein the performing feature fusion on the visible light image feature map and the SAR image feature map belonging to the same scale to obtain a plurality of fusion results includes:
performing fusion preprocessing on the visible light image feature map and the SAR image feature map belonging to the same scale to obtain a plurality of fusion preprocessing results;
and fusing each fusion preprocessing result using a Transformer model to obtain a plurality of fusion results.
5. The method according to claim 4, wherein the fusion preprocessing of the visible light image feature map and the SAR image feature map belonging to the same scale comprises:
obtaining a visible light image feature map and an SAR image feature map of the same scale, downsampling each, and processing the two downsampled feature maps into a data sequence;
a first position code is generated for each feature point in the visible light image feature map and a second position code is generated for each feature point in the SAR image feature map.
6. The method of claim 5, wherein the fusing each fusion preprocessing result using a Transformer model comprises:
inputting the data sequence, the first position code and the second position code into a Transformer model, and fusing the features of different modalities through self-attention and multi-head attention mechanisms.
7. The method of claim 6, wherein the performing classification regression processing on each of the fusion results to obtain a plurality of detection results corresponding one-to-one to a plurality of scales comprises:
classifying and regressing each fusion result one by one to obtain visible light image fusion feature maps of multiple scales and SAR image fusion feature maps of multiple scales;
adding the visible light image fusion feature map and the visible light image feature map of the same scale, then convolving to obtain a first detection result;
adding the SAR image fusion feature map and the SAR image feature map of the same scale, then convolving to obtain a second detection result;
and combining, for each scale, the first detection result and the second detection result of that scale to obtain a plurality of detection results corresponding one-to-one to the multiple scales.
8. The method of claim 6, wherein the visualizing the camouflaged object in the object image based on the plurality of detection results comprises:
decoding the detection results of the multiple scales;
and visualizing the detected camouflage target in the target image according to the decoding result.
9. A camouflage target detection device, comprising:
the first acquisition unit is used for acquiring a visible light image and an SAR image of the target image;
the first extraction unit is used for extracting visible light image feature maps of multiple scales based on the visible light image;
the second extraction unit is used for extracting SAR image feature maps with various scales based on the SAR image;
the fusion unit is used for carrying out feature fusion on the visible light image feature images and the SAR image feature images which belong to the same scale to obtain a plurality of fusion results;
the classification regression unit is used for performing classification regression processing on each fusion result to obtain a plurality of detection results corresponding one-to-one to a plurality of scales;
and a visualization unit for visualizing a camouflage target in the target image based on the plurality of detection results.
10. An electronic device comprising a memory and a processor;
the memory is used for storing computer instructions;
the processor for invoking computer instructions in the memory to cause the electronic device to perform the method of any of claims 1-8.
CN202311593929.5A 2023-11-27 2023-11-27 Camouflage target detection method and device and electronic equipment Pending CN117636019A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311593929.5A CN117636019A (en) 2023-11-27 2023-11-27 Camouflage target detection method and device and electronic equipment


Publications (1)

Publication Number Publication Date
CN117636019A true CN117636019A (en) 2024-03-01

Family

ID=90021130

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311593929.5A Pending CN117636019A (en) 2023-11-27 2023-11-27 Camouflage target detection method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN117636019A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination