CN112508787A - Target detection method based on image super-resolution - Google Patents

Info

Publication number
CN112508787A
CN112508787A CN202011470434.XA CN202011470434A CN112508787A CN 112508787 A CN112508787 A CN 112508787A CN 202011470434 A CN202011470434 A CN 202011470434A CN 112508787 A CN112508787 A CN 112508787A
Authority
CN
China
Prior art keywords
resolution
target detection
super
network
image
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011470434.XA
Other languages
Chinese (zh)
Inventor
华尧
梁涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panji Technology Co ltd
Original Assignee
Panji Technology Co ltd
Application filed by Panji Technology Co ltd
Priority to CN202011470434.XA
Publication of CN112508787A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4053 Super resolution, i.e. output image resolution higher than sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4046 Scaling the whole image or part thereof using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Abstract

A target detection method based on image super-resolution comprises the following steps: step 1, sending an original image into a target detection network to obtain the feature map to be super-resolved; step 2, performing super-resolution on the output feature map with a super-resolution network to obtain a larger feature map; step 3, cutting the feature map from step 2 into a plurality of small regions and then searching for target frames in each small region; step 4, comparing a feature map of the same size, obtained from the corresponding high-resolution image, with the feature map obtained in step 3 to obtain the loss function of the whole network. By applying the super-resolution network to the feature map or to part of the original image, the invention improves the target detection efficiency of the whole network, reduces both the complexity of super-resolution and the computational cost of processing a super-resolved original image, and improves the real-time performance of the whole super-resolution-based target detection task.

Description

Target detection method based on image super-resolution
Technical Field
The invention belongs to the technical field of image target detection, and particularly relates to a target detection method based on image super-resolution.
Background
Super-resolution (SR) raises a low-resolution (LR) image to high resolution (HR) by means of an algorithm. A high-resolution image has higher pixel density, more detail, and finer image quality. The most direct way to obtain a high-resolution image is to use a high-resolution camera; in practice, however, manufacturing and engineering-cost constraints mean that high-resolution or ultra-high-resolution cameras are not used to acquire the image signal in many settings. There is therefore real demand for obtaining HR images through super-resolution techniques.
As the definition suggests, super-resolution mainly serves to recover information missing from an image so as to form a higher-resolution image, and it is commonly used in photo restoration, film restoration, compression of transmitted images, and similar fields.
Super-resolution can be implemented with traditional methods such as interpolation, or with a deep neural network (DNN).
Another important application area in image processing is target detection, whose task is to identify the categories of the objects in a given picture and to mark the regions where they are located. The most widely used target detection techniques today are based on deep neural networks (DNNs): the input image is processed by a DNN to generate several feature maps, and the targets are then found in the feature maps by searching for bounding boxes. Note that a feature map is usually much smaller than the original image. Taking YOLOv3 as an example, the input image is 416 × 416 pixels, while the feature maps are only 52 × 52, 26 × 26, and 13 × 13.
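The relationship between input size and feature-map size in the YOLOv3 example can be illustrated with a short Python sketch. It assumes the three detection scales use downsampling strides of 8, 16, and 32, which reproduces the sizes quoted in the text; the function name is illustrative only.

```python
def feature_map_sizes(input_size, strides=(8, 16, 32)):
    """Return the side length of the feature map at each stride.

    The strides are an assumption chosen to match the 416 -> 52/26/13 example.
    """
    return [input_size // stride for stride in strides]


sizes = feature_map_sizes(416)
print(sizes)  # [52, 26, 13]
```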
One of the harder problems in target detection is the following: small targets have few distinguishing features and are difficult for a DNN to extract, so DNN-based detectors generally have a low success rate on small targets. An intuitive idea therefore is to enlarge the original picture with an image super-resolution technique and run target detection on the enlarged picture; because the small targets are enlarged as well, the probability of recognizing them can improve.
One of the better existing algorithms for super-resolution-based target detection is TDSR, a super-resolution algorithm driven by target detection. The algorithm first applies a super-resolution network to the original image to be detected, producing an enlarged image. A target detection DNN is then run on the enlarged image, so that smaller targets in the original image can finally be detected.
An important concept in neural network systems is training, which means using a loss function to determine the parameter values of the network. The loss function measures how far the output produced with the current parameters deviates from the desired output. First, the loss of the output produced with the current parameters is computed; the parameters are then adjusted by a small increment according to the loss value, and the output is computed again. These operations are repeated many times until the network parameters converge to stable values. This process is called training.
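The training loop described here can be sketched on a toy problem. The example below is not the patent's network: it fits a single parameter w in the model y = w * x by gradient descent on a mean-squared loss, and the learning rate, step count, and data are all illustrative assumptions.

```python
def mean_squared_loss(w, samples):
    """Loss of the one-parameter model y = w * x on (x, y) samples."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)


def train(samples, learning_rate=0.05, steps=200):
    """Repeatedly adjust w by a small step against the loss gradient."""
    w = 0.0
    for _ in range(steps):
        # Gradient of the mean-squared loss with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= learning_rate * gradient
    return w


samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by y = 2 * x
w = train(samples)
final_loss = mean_squared_loss(w, samples)
```

After enough iterations w settles close to 2, the value that generated the data, which is the convergence to a stable value that the text calls training.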
When training the TDSR network, the introduction of the super-resolution network means that the loss function based on the detection result must be modified so that it also updates the parameters of the super-resolution network; in other words, the super-resolution task is designed to serve the accuracy of target detection.
The prior art has two disadvantages.
First, the computation speed of a target detection network depends on the size of the input image: if the side length of the input image is doubled, the computation of the whole downstream detection DNN may grow by several times, and super-resolution itself is also computationally expensive. Taking YOLOv3 as an example, when the input image grows from 320 × 320 to 608 × 608, the side length is not even doubled, yet the DNN computation grows to about 3.6 times its former value (38.97 vs. 140.69 billion floating-point operations). Raising the resolution of the original image therefore carries a very large computational cost, which seriously affects the real-time performance of the system.
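The YOLOv3 numbers quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below only reuses the input sizes and operation counts given in the text; the variable names are illustrative.

```python
side_ratio = 608 / 320          # growth in side length
pixel_ratio = side_ratio ** 2   # growth in pixel count
cost_ratio = 140.69 / 38.97     # growth in the quoted operation counts

print(f"side x{side_ratio:.2f}, pixels x{pixel_ratio:.2f}, cost x{cost_ratio:.2f}")
# side x1.90, pixels x3.61, cost x3.61
```

The cost ratio tracks the pixel-count ratio, which is consistent with the claim that detection cost scales roughly with the area of the input image rather than with its side length.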
Second, the DNN performs target recognition on feature maps produced by its own computation, and these feature maps differ from the original image; raising the resolution of the original image therefore does not noticeably improve the features the system uses to recognize targets. Super-resolving the original image thus has only a limited ability to improve the DNN's extraction of target features.
Disclosure of Invention
The invention aims to provide a target detection method based on image super-resolution to solve the problems.
In order to achieve the purpose, the invention adopts the following technical scheme:
a target detection method based on image super-resolution comprises the following steps:
step 1, sending a low-resolution original image into a target detection network to obtain the feature map to be super-resolved;
step 2, performing super-resolution on the output feature map with a super-resolution network to obtain a larger feature map;
step 3, cutting the feature map from step 2 into a plurality of small regions, each matching the input size of the target frame extraction network of the original network, and then searching for target frames in each small region;
step 4, sending the high-resolution original image corresponding to the low-resolution image into the original target detection network, without the super-resolution stage, to obtain a feature map of the same size as the one produced by steps 1-3; and comparing this feature map with the feature map obtained in step 3 to obtain the loss function of the whole network.
Further, step 1 includes case 1: the high-definition original image is down-sampled to the input image size and sent into the target detection network, which outputs the feature map.
Furthermore, the super-resolved feature map is obtained by using a gradient map or a residual network.
Further, step 1 includes case 2: the original image is sent into a two-stage target detection network, and the candidate frame regions of the image are obtained after the first-stage network; the image region in the original image corresponding to each candidate frame region is then calculated and used as the image to be super-resolved.
Further, if a plurality of candidate frames are found, the image regions corresponding to the candidate frames are extracted and their union is taken; threshold conditions, such as the size and position of the candidate frames and the overlap between them, may be used to filter for valid candidate frames.
Further, in step 4, the first-order or second-order loss between this feature map and the feature map obtained in step 3 is used as the super-resolution loss, and the target detection loss and the super-resolution loss are combined as the loss function of the whole network.
Further, in step 4, the resolution ratio between the low-resolution image and the high-resolution original image equals the magnification of the super-resolution network in step 2.
Compared with the prior art, the invention has the following technical effects:
by applying the super-resolution network to the feature map, or to part of the original image, rather than to the whole image, the invention improves the target detection efficiency of the whole network, reduces both the complexity of super-resolution and the computational cost of processing a super-resolved original image, and improves the real-time performance of the whole super-resolution-based target detection task.
Because super-resolution is performed on the feature map, the features whose resolution is raised are closer to the features used for target detection, so the method improves detection capability rather than merely raising image resolution.
Drawings
FIG. 1 is a schematic diagram of a conventional target detection network;
FIG. 2 is a schematic diagram of the super-resolution part of the improved target detection network.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
referring to FIG. 1 and FIG. 2, the scheme uses a super-resolution network to raise the resolution of the feature map instead of using the feature map at its original resolution, and sends the resolution-enhanced feature map to the subsequent target frame extraction algorithm. Because super-resolution is applied to the feature map, the target detection network must be adapted accordingly so that it can perform target frame extraction on the larger, super-resolved feature map.
Step 1: The high-definition original image (HR) is down-sampled to the input image size and sent into the target detection network, which outputs a feature map.
Taking the YOLOv3 network as an example, an image with 1664 × 1664 resolution is down-sampled by a factor of 4 to obtain a 416 × 416 input image, which is sent into the target detection network; the feature map output by the network is 13 × 13.
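The down-sampling in step 1 can be sketched as follows. The text does not say which down-sampling method is used, so block averaging is an assumption here, and the function name is hypothetical; images are plain nested lists to keep the example self-contained.

```python
def downsample_mean(image, factor):
    """Shrink a 2-D image by averaging each factor x factor block of pixels."""
    out_h = len(image) // factor
    out_w = len(image[0]) // factor
    return [
        [
            sum(
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ) / factor ** 2
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


high_res = [
    [1, 1, 3, 3],
    [1, 1, 3, 3],
    [5, 5, 7, 7],
    [5, 5, 7, 7],
]
low_res = downsample_mean(high_res, 2)  # 4 x 4 -> 2 x 2
```

With a factor of 4, the same function would turn an image whose side is four times the network input size into an input-sized image, as in the YOLOv3 example.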
Step 2: A super-resolution network performs super-resolution on the output feature map to obtain a larger feature map. The super-resolved feature map needs to be obtained by means such as gradient maps.
Step 2a: Optionally, step 2 may obtain the super-resolved feature map by using a residual network.
Taking the YOLOv3 network as an example, the final output feature map is 13 × 13; applying 4× super-resolution to it yields a 52 × 52 feature map.
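To make the shapes in step 2 concrete, the sketch below enlarges a single-channel 13 × 13 feature map to 52 × 52. The patent leaves the super-resolution network architecture open, so a fixed bilinear interpolation is used purely as a stand-in for the learned network; the function name and the single-channel simplification are assumptions.

```python
def upscale_bilinear(fmap, factor):
    """Enlarge a 2-D feature map by an integer factor using bilinear interpolation.

    Stand-in for a learned super-resolution network; it has no trainable parameters.
    """
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(h * factor):
        # Map the centre of output row i back to input coordinates, clamped to the map.
        y = min(max((i + 0.5) / factor - 0.5, 0.0), h - 1.0)
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        fy = y - y0
        row = []
        for j in range(w * factor):
            x = min(max((j + 0.5) / factor - 0.5, 0.0), w - 1.0)
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            fx = x - x0
            top = fmap[y0][x0] * (1 - fx) + fmap[y0][x1] * fx
            bottom = fmap[y1][x0] * (1 - fx) + fmap[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


feature_map = [[1.0] * 13 for _ in range(13)]
enlarged = upscale_bilinear(feature_map, 4)
print(len(enlarged), len(enlarged[0]))  # 52 52
```

A trained network would replace the fixed interpolation weights with learned ones, but the input and output sizes would be the same as in this sketch.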
Step 3: The feature map is cut into a plurality of small regions, each matching the input size of the target frame extraction network of the original network, and target frames are then searched for in each small region, so that the subsequent target detection network does not need to be modified.
Taking the YOLOv3 network as an example, after the 52 × 52 feature map is obtained by super-resolution, it is cut into 16 regions of 13 × 13, and each region is sent to the original 13 × 13 target frame extraction network for identification.
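The cutting in step 3 amounts to tiling the enlarged feature map. A minimal Python sketch, assuming a single-channel map stored as nested lists and non-overlapping tiles in row-major order (the text does not state the ordering), could look like this:

```python
def split_into_tiles(fmap, tile):
    """Cut a 2-D feature map into non-overlapping tile x tile regions (row-major)."""
    h, w = len(fmap), len(fmap[0])
    if h % tile or w % tile:
        raise ValueError("feature map size must be a multiple of the tile size")
    tiles = []
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            tiles.append([row[left:left + tile] for row in fmap[top:top + tile]])
    return tiles


# A 52 x 52 map whose entries encode their own position, for easy checking.
fmap = [[r * 52 + c for c in range(52)] for r in range(52)]
tiles = split_into_tiles(fmap, 13)
print(len(tiles), len(tiles[0]), len(tiles[0][0]))  # 16 13 13
```

Each of the 16 tiles then has the 13 × 13 size that the unmodified target frame extraction network expects.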
By comparison, if super-resolution were performed directly on the original image, the whole target detection network would have to run on a 1664 × 1664 picture, which requires far more computation than the present detection on a 52 × 52 feature map.
Step 4: The high-definition original (HR) image is sent into the same network to obtain a feature map of the same size as the one produced by steps 1-3. The difference between this feature map and the feature map obtained in step 3 is used as the super-resolution loss, and the target detection losses (such as the usual classification loss and bbox loss) are combined with the super-resolution loss (for example by weighted average) to form the loss function of the whole network.
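The loss described in step 4 can be sketched as follows. Interpreting the feature-map difference as a mean absolute (L1) or mean squared (L2) error is an assumption, as are the function names and the weight of 0.5; the detection loss is passed in as a plain number because its computation is outside the scope of this sketch.

```python
def l1_loss(sr_map, hr_map):
    """Mean absolute difference between two equally sized 2-D feature maps."""
    diffs = [abs(a - b) for ra, rb in zip(sr_map, hr_map) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def l2_loss(sr_map, hr_map):
    """Mean squared difference between two equally sized 2-D feature maps."""
    diffs = [(a - b) ** 2 for ra, rb in zip(sr_map, hr_map) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def total_loss(detection_loss, sr_loss, sr_weight=0.5):
    """Weighted average of the detection loss and the super-resolution loss."""
    return (1 - sr_weight) * detection_loss + sr_weight * sr_loss


sr_map = [[1.0, 2.0], [3.0, 4.0]]   # feature map from the super-resolution branch
hr_map = [[1.0, 2.0], [3.0, 6.0]]   # reference feature map from the HR image
loss = total_loss(detection_loss=2.0, sr_loss=l1_loss(sr_map, hr_map))
```

The weight controls how strongly the super-resolution branch is pulled toward the reference feature map relative to the detection objective; a suitable value would have to be chosen experimentally.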
In a two-stage target detection network (typically Faster R-CNN), features are extracted in two stages: the first stage extracts candidate frames that may contain targets, and the second stage performs target detection only on those candidate frame regions. The super-resolution network can therefore be applied to the candidate frame regions alone, so that the complete feature map does not need to be super-resolved and the whole network runs faster.
Step 1: The original image is sent into a two-stage target detection network, and the candidate frame regions of the image are obtained after the first-stage network.
Taking the Faster R-CNN network as an example, after the original image passes through the backbone network and the RPN, a plurality of candidate frame regions are output.
Step 2: For each candidate frame obtained, the corresponding position in the original image is calculated and the image there is cropped out as the image to be super-resolved.
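Step 2 of this embodiment can be sketched as a coordinate conversion followed by a crop. The sketch assumes candidate frames are given as (x0, y0, x1, y1) in feature-map coordinates and that one feature-map cell covers `scale` image pixels; if the first stage already reports image coordinates, `scale` would simply be 1. All names are hypothetical.

```python
import math


def box_to_image_region(box, scale, img_w, img_h):
    """Convert a candidate frame to an integer pixel region clamped to the image."""
    x0, y0, x1, y1 = box
    left = max(0, math.floor(x0 * scale))
    top = max(0, math.floor(y0 * scale))
    right = min(img_w, math.ceil(x1 * scale))
    bottom = min(img_h, math.ceil(y1 * scale))
    return left, top, right, bottom


def crop(image, region):
    """Cut the given (left, top, right, bottom) region out of a 2-D image."""
    left, top, right, bottom = region
    return [row[left:right] for row in image[top:bottom]]


image = [[0] * 64 for _ in range(64)]            # placeholder 64 x 64 image
region = box_to_image_region((1.5, 2.0, 3.5, 4.0), scale=16, img_w=64, img_h=64)
patch = crop(image, region)                      # image region to be super-resolved
print(region)  # (24, 32, 56, 64)
```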
Step 2a: If multiple candidate frames are found, the image regions of the multiple candidate frames are extracted.
Step 2b: Threshold conditions, such as the size and position of the candidate frames and the overlap between them, may be used to filter for valid candidate frames.
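One possible form of the filtering in step 2b is sketched below. The text names size, position, and overlap as possible criteria but gives no thresholds, so the minimum side length, the intersection-over-union (IoU) limit, and the greedy keep-first order are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_candidates(boxes, min_side=8, max_iou=0.5):
    """Keep boxes that are large enough and do not heavily overlap a kept box."""
    kept = []
    for box in boxes:
        x0, y0, x1, y1 = box
        if x1 - x0 < min_side or y1 - y0 < min_side:
            continue  # too small to be a useful candidate
        if any(iou(box, other) > max_iou for other in kept):
            continue  # mostly covered by a candidate that was already kept
        kept.append(box)
    return kept


candidates = [(0, 0, 20, 20), (1, 1, 21, 21), (50, 50, 54, 54), (40, 0, 60, 20)]
valid = filter_candidates(candidates)
print(valid)  # [(0, 0, 20, 20), (40, 0, 60, 20)]
```

Fewer surviving candidates means fewer regions passed to the super-resolution network, which is the point of filtering before the more expensive stage.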
Steps 3-5: These proceed in the same manner as steps 2-4 of embodiment 1, with the cropped original image obtained in step 2 serving as the feature map for subsequent input. The target detection DNN used here may be the two-stage target detection network used in step 1 or any other target detection network.
In most cases, step 2 finds no candidate frame and the super-resolution network is not executed at all, so the system as a whole is not overloaded.
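The control flow implied by this embodiment, where the super-resolution stage is skipped when no candidate is found, can be sketched with placeholder callables. The three function parameters are hypothetical stand-ins for the first-stage network, the super-resolution network, and the downstream detector; none of them is defined by the source.

```python
def run_detection(image, propose, super_resolve, detect):
    """Run super-resolution and detection only on regions proposed by the first stage."""
    regions = propose(image)
    if not regions:
        return []  # no candidates: the super-resolution network is never called
    detections = []
    for region in regions:
        detections.extend(detect(super_resolve(region)))
    return detections


# Stub components that record how often super-resolution is invoked.
sr_calls = []


def fake_super_resolve(region):
    sr_calls.append(region)
    return region


empty = run_detection("frame", lambda img: [], fake_super_resolve, lambda r: [r])
found = run_detection("frame", lambda img: ["a", "b"], fake_super_resolve,
                      lambda r: [r.upper()])
```

In the first call no region is proposed, so the stub super-resolution function is never invoked; in the second it runs once per proposed region.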

Claims (8)

1. A target detection method based on image super-resolution is characterized by comprising the following steps:
step 1, sending a low-resolution image into a target detection network for processing to obtain a first feature map;
step 2, performing super-resolution on the first feature map with a super-resolution network to obtain a second feature map of larger size;
step 3, cutting the second feature map from step 2 into two or more small regions, each matching the size of the output feature map of the target frame extraction network of the target detection network, and then performing subsequent target processing on each small region;
step 4, sending the high-definition image into the target detection network of step 1 to obtain a third feature map of the same size as the second feature map; the third feature map is then compared with the second feature map to form a loss function of the target detection network, which is used for training the target detection network.
2. The target detection method based on image super-resolution according to claim 1, wherein step 1 comprises: down-sampling the high-definition original image to the input image size, sending it into the target detection network, and obtaining the first feature map through the target detection network.
3. The target detection method based on image super-resolution according to claim 1, wherein the super-resolved feature map is obtained by using a gradient map or a residual network.
4. The method of claim 1, wherein in step 4, the difference between the third feature map and the second feature map obtained in step 3 is used as the super-resolution loss, and the target detection loss and the super-resolution loss are combined as the loss function of the whole network.
5. The image super-resolution-based target detection method according to claim 1, wherein in step 1, the target detection network is a two-stage target detection network, and the target detection network obtains one or more candidate frame areas after passing through a first-stage network; and calculating a corresponding image area in the original image according to the candidate frame area, and taking the image area as a feature map needing super-resolution in the step 2.
6. The method according to claim 5, wherein if more than one candidate frame region is found, a union of image regions respectively calculated by a plurality of candidate frames is used as the feature map.
7. The method of claim 5, wherein a threshold condition is used to screen valid candidate frames, such as size, position, and overlapping portion between candidate frames.
8. The method according to claim 5, wherein after obtaining the image region corresponding to the clipped candidate frame, the original target detection network or any other target detection network is used for subsequent target detection.
CN202011470434.XA 2020-12-14 2020-12-14 Target detection method based on image super-resolution Pending CN112508787A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011470434.XA CN112508787A (en) 2020-12-14 2020-12-14 Target detection method based on image super-resolution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011470434.XA CN112508787A (en) 2020-12-14 2020-12-14 Target detection method based on image super-resolution

Publications (1)

Publication Number Publication Date
CN112508787A (en) 2021-03-16

Family

ID=74973172

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011470434.XA Pending CN112508787A (en) 2020-12-14 2020-12-14 Target detection method based on image super-resolution

Country Status (1)

Country Link
CN (1) CN112508787A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115376022A (en) * 2022-06-30 2022-11-22 广东工业大学 Application of small target detection algorithm based on neural network in unmanned aerial vehicle aerial photography

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115376022A (en) * 2022-06-30 2022-11-22 广东工业大学 Application of small target detection algorithm based on neural network in unmanned aerial vehicle aerial photography
CN115376022B (en) * 2022-06-30 2024-04-05 广东工业大学 Application of small target detection algorithm in unmanned aerial vehicle aerial photography based on neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination