WO2021083151A1 - Target detection method and apparatus, storage medium and unmanned aerial vehicle - Google Patents

Target detection method and apparatus, storage medium and unmanned aerial vehicle Download PDF

Info

Publication number
WO2021083151A1
WO2021083151A1 (PCT/CN2020/124055)
Authority
WO
WIPO (PCT)
Prior art keywords
target
image
sub
rectangular frame
boundary
Prior art date
Application number
PCT/CN2020/124055
Other languages
French (fr)
Chinese (zh)
Inventor
李亚学
Original Assignee
深圳市道通智能航空技术股份有限公司
Priority date: 2019-11-01 (Chinese application 201911060737.1)
Filing date
Publication date
Application filed by 深圳市道通智能航空技术股份有限公司 filed Critical 深圳市道通智能航空技术股份有限公司
Publication of WO2021083151A1 publication Critical patent/WO2021083151A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06V 20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/32 Normalisation of the pattern dimensions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Definitions

  • the embodiments of the present invention relate to the technical field of drones, and in particular to target detection methods, devices, storage media, and drones.
  • Unmanned Aerial Vehicles refer to unmanned aircraft operated by radio remote control equipment and independent program control equipment, or fully or intermittently autonomously operated by onboard computers. Compared with manned aircraft, unmanned aerial vehicles have the advantages of small size, low cost, low environmental requirements, and strong survivability, and are often more suitable for tasks in dangerous or harsh environments. With the rapid development of the UAV manufacturing industry, UAV systems are widely used in areas such as smart city management and intelligent traffic monitoring. Among them, target detection is a basic but challenging functional requirement in UAV systems, which is closely related to applications such as infrastructure inspection, city perception, map reconstruction, and traffic control. These applications have promoted the development of online monitoring systems based on drones. These online systems can perform various tasks, such as on-site facility inspections and detection of violations, identification of unhealthy crops, and acquisition of map data.
  • the UAV generally integrates an image processing chip, and the image analysis unit in the image processing chip analyzes and processes the images taken by the PTZ camera on the UAV to achieve target detection.
  • at present, to guarantee real-time detection, the resolution of the input image of the image processing chip is generally low, while the resolution of the pan/tilt camera is very high, so the image taken by the pan/tilt camera must go through a resize operation before being handed over to the image processing chip for processing. Part of the image information is therefore lost, making some small targets very difficult or even impossible to detect; the existing UAV target detection scheme therefore needs to be improved.
  • the embodiments of the present invention provide a target detection method, device, storage medium, and equipment, which can optimize the existing target detection scheme.
  • an embodiment of the present invention provides a target detection method, which is applied to an unmanned aerial vehicle, in which at least two image analysis units are integrated, and the method includes:
  • obtaining a first image taken by a pan-tilt camera; performing segmentation processing on the first image to obtain at least two segmented images; performing a size change operation on the at least two segmented images to obtain at least two sub-images, the resolutions of the at least two sub-images matching the resolutions corresponding to the at least two image analysis units; and inputting the at least two sub-images into the at least two image analysis units, and determining the target detection result according to the analysis results of the at least two image analysis units.
  • an embodiment of the present invention provides a target detection device applied to an unmanned aerial vehicle. At least two image analysis units are integrated in the unmanned aerial vehicle, and the device includes:
  • the image acquisition module is used to acquire the first image taken by the pan-tilt camera
  • An image segmentation module configured to perform segmentation processing on the first image to obtain at least two segmented images
  • a size changing module configured to perform a size changing operation on the at least two divided images to obtain at least two sub-images, and the resolutions of the at least two sub-images match the resolutions corresponding to the at least two image analysis units;
  • the target detection module is configured to input the at least two sub-images into the at least two image analysis units, and determine the target detection result according to the analysis results of the at least two image analysis units.
  • an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the target detection method as provided in the embodiment of the present invention is implemented.
  • an embodiment of the present invention provides an unmanned aerial vehicle, including a memory, at least two image analysis units, a processor, and a computer program stored in the memory and running on the processor, wherein the When the processor executes the computer program, the target detection method as provided in the embodiment of the present invention is implemented.
  • the target detection solution provided in the embodiments of the present invention is applied to an unmanned aerial vehicle in which at least two image analysis units are integrated.
  • a first image taken by the pan/tilt camera is acquired and segmented, and a size change operation is performed on the at least two segmented images so that the resolutions of the obtained at least two sub-images match the resolutions corresponding to the at least two image analysis units.
  • the at least two sub-images are input into the at least two image analysis units, and the target detection result is determined according to the analysis results of the at least two image analysis units.
  • by segmenting the original image taken by the pan/tilt camera before the size change operation, less image information is lost; the sub-images are then analyzed by the at least two image analysis units, which improves the accuracy and success rate of target detection.
  • FIG. 1 is a schematic flowchart of a target detection method according to Embodiment 1 of the present invention
  • FIG. 2 is a schematic diagram of an image taken by a pan-tilt camera according to Embodiment 1 of the present invention
  • FIG. 3 is a schematic diagram of an image that has undergone size modification processing according to Embodiment 1 of the present invention.
  • FIG. 4 is a schematic diagram of a left side image after segmentation according to Embodiment 1 of the present invention.
  • FIG. 5 is a schematic diagram of a right side image after segmentation according to Embodiment 1 of the present invention.
  • FIG. 6 is a schematic diagram of a left side image after a size change according to Embodiment 1 of the present invention.
  • FIG. 7 is a schematic diagram of a right side image after a size change according to Embodiment 1 of the present invention.
  • FIG. 8 is a schematic flowchart of a target detection method according to Embodiment 2 of the present invention.
  • FIG. 9 is a schematic flowchart of a target detection method according to Embodiment 3 of the present invention.
  • FIG. 10 is a schematic diagram of a left side image including position information according to Embodiment 3 of the present invention.
  • FIG. 11 is a schematic diagram of a right image including position information according to Embodiment 3 of the present invention.
  • FIG. 12 is a schematic diagram of fusion of analysis results according to Embodiment 3 of the present invention.
  • FIG. 13 is a structural block diagram of a target detection device provided by Embodiment 4 of the present invention.
  • FIG. 14 is a structural block diagram of an unmanned aerial vehicle according to Embodiment 6 of the present invention.
  • Fig. 1 is a schematic flow chart of a target detection method provided by Embodiment 1 of the present invention.
  • the method can be executed by a target detection device, where the device can be implemented by software and/or hardware, and generally can be integrated in a drone.
  • the method includes:
  • Step 101 Obtain a first image taken by a pan-tilt camera.
  • the pan/tilt camera can be integrated inside the drone or mounted externally on it, connected to the drone by wired or wireless means.
  • the gimbal camera can collect images in real time during the flight of the drone, and the images it takes can be obtained in real time or at a preset frequency.
  • the first image may be an image taken at any time, which is not limited in the embodiment of the present invention.
  • At least two image analysis units are integrated in the drone, and the specific type of the image analysis unit is not limited.
  • for example, it may be a forward inference engine (Neural Network Inference Engine, NNIE) accelerated on the basis of a neural network.
  • the image analysis unit can be integrated in an image processing chip, such as the HI3559C chip, which carries two forward inference engines dedicated to neural-network acceleration; the two engines can independently handle tasks such as detection and classification.
  • Step 102 Perform segmentation processing on the first image to obtain at least two segmented images.
  • with the rapid development of camera technology, current pan/tilt cameras can reach very high resolutions, such as 8K, and the aspect ratio of the captured image is generally 4:3 or 16:9.
  • to guarantee real-time detection, the resolution of the input image of the image analysis unit is generally low, such as 512*512; resizing the image taken by the pan/tilt camera therefore loses a large amount of image information, which reduces the accuracy of target detection and makes small targets especially difficult to detect.
  • when the drone flies at a high altitude, targets on the ground are even smaller in the image and more difficult to detect.
  • FIG. 2 is a schematic diagram of an image taken by a pan-tilt camera according to Embodiment 1 of the present invention.
  • the image ratio is 16:9 and the resolution is 1920*1080.
  • Fig. 3 is a schematic diagram of an image that has undergone a resizing process according to the first embodiment of the present invention. After the resize, the image in Fig. 2 becomes the 1:1 image in Fig. 3 with a resolution of 512*512, so most of the information in Figure 2 is lost, resulting in unsatisfactory detection of small targets.
  • the first image may be segmented according to a preset rule to obtain at least two segmented images.
  • the preset rules can include the specific number of divisions and the division method, such as equal division or proportional division, and such as a left/right division, a top/bottom division, a 2x2 (田-shaped) division, or a 3x3 (nine-grid) division.
  • the specific number of segmentation and the segmentation method can be determined according to the number of image analysis units and other reference factors, which are not limited in the embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a divided left image according to Embodiment 1 of the present invention
  • FIG. 5 is a schematic diagram of a divided right image according to Embodiment 1 of the present invention. Referring to FIG. 4 and FIG. 5, the image in FIG. 2 is divided equally into left and right halves, obtaining two divided images with an aspect ratio of 8:9.
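  • as a concrete illustration of the left/right split described above, the sketch below (Python with NumPy; the function name, variable names, and the synthetic frame are illustrative assumptions, since the patent does not prescribe an implementation) divides a 1920*1080 frame into two equal 960*1080 halves, each with an 8:9 aspect ratio:

```python
import numpy as np

def split_left_right(first_image):
    """Split a frame into equal left and right halves, e.g. 1920*1080 -> two 960*1080 images."""
    width = first_image.shape[1]
    mid = width // 2
    left_image = first_image[:, :mid]    # columns [0, mid)
    right_image = first_image[:, mid:]   # columns [mid, width)
    return left_image, right_image

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)   # stand-in for the first image from the camera
left_img, right_img = split_left_right(frame)
print(left_img.shape, right_img.shape)              # (1080, 960, 3) (1080, 960, 3)
```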
  • Step 103 Perform a size change operation on the at least two divided images to obtain at least two sub-images, and the resolutions of the at least two sub-images match the resolutions corresponding to the at least two image analysis units.
  • the specific manner of performing the resizing operation on the segmented image is not limited, and the purpose of the resizing operation is to match the resolutions of at least two sub-images with the resolutions corresponding to at least two image analysis units.
  • generally, the at least two image analysis units support the same input resolution, so the at least two sub-images are all resized to that resolution.
  • the size change operation may include a size change operation performed by an interpolation algorithm.
  • the interpolation algorithm may be, for example, the nearest-neighbor method, the bilinear method, the bicubic method, an algorithm based on pixel-area relations, or Lanczos interpolation.
  • FIG. 6 is a schematic diagram of a left side image after a size change provided in the first embodiment of the present invention
  • FIG. 7 is a schematic diagram of a right side image after a size change provided in the first embodiment of the present invention.
  • the size change operation is performed on the images in FIG. 4 and FIG. 5 respectively, and two images with an aspect ratio of 1:1 and a resolution of 512*512 are obtained.
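  • a minimal sketch of this size-change step, assuming OpenCV's cv2.resize is used (the patent only requires that some interpolation algorithm be applied; the interpolation flags and the 512*512 constant below are illustrative choices):

```python
import cv2
import numpy as np

ANALYSIS_INPUT_SIZE = (512, 512)  # assumed resolution supported by each image analysis unit

def resize_for_analysis(segmented_image, interpolation=cv2.INTER_LINEAR):
    """Resize a segmented image (e.g. 960*1080) to the analysis unit's input resolution."""
    return cv2.resize(segmented_image, ANALYSIS_INPUT_SIZE, interpolation=interpolation)

left_half = np.zeros((1080, 960, 3), dtype=np.uint8)            # stand-in for the segmented left image
left_sub = resize_for_analysis(left_half)                       # bilinear interpolation
left_sub_area = resize_for_analysis(left_half, cv2.INTER_AREA)  # pixel-area-relation variant
print(left_sub.shape)                                           # (512, 512, 3)
```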
  • Step 104 Input the at least two sub-images into the at least two image analysis units, and determine a target detection result according to the analysis results of the at least two image analysis units.
  • with this scheme, the total area of the images input to the image analysis units becomes larger; referring to the example above, this is equivalent to doubling the size of the target in the image, and if the image is divided into more pieces the magnification is even larger, which reduces the difficulty of target detection and can effectively improve its accuracy and success rate.
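  • a back-of-the-envelope check of the "doubled target" claim for the 1920*1080 / 512*512 example (the exact ratio depends on the split direction and aspect ratios):

```python
whole_scale = 512 / 1920   # ~0.27: horizontal scale when the full 1920-pixel-wide frame is resized
half_scale = 512 / 960     # ~0.53: horizontal scale when each 960-pixel-wide half is resized
print(round(half_scale / whole_scale, 2))   # 2.0 -> a ground target appears about twice as wide per sub-image
```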
  • the method further includes: controlling the at least two image analysis units to analyze and process the received sub-images in parallel.
  • the advantage of this setting is that it improves the accuracy and success rate of target detection while ensuring real-time detection.
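  • one way to realize the parallel analysis on a host processor is sketched below with Python's concurrent.futures; the analyze callables stand in for the two on-chip forward inference engines, whose actual invocation API is not specified in the patent:

```python
from concurrent.futures import ThreadPoolExecutor

def run_analysis_units_in_parallel(analysis_units, sub_images):
    """Dispatch each sub-image to its analysis unit and collect the per-unit results."""
    assert len(analysis_units) == len(sub_images)
    with ThreadPoolExecutor(max_workers=len(analysis_units)) as pool:
        futures = [pool.submit(unit, img) for unit, img in zip(analysis_units, sub_images)]
        return [f.result() for f in futures]

# Example with dummy units; each returns position (BBox) and class information for its sub-image.
dummy_unit = lambda img: {"bboxes": [], "classes": []}
results = run_analysis_units_in_parallel([dummy_unit, dummy_unit], [object(), object()])
print(results)
```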
  • the content contained in the analysis result of the image analysis unit is related to the specific type, model, and function of the image analysis unit, which is not limited in the embodiment of the present invention.
  • the analysis result may include the location information and type information of the analyzed target.
  • the target can be a target object or a target person, etc. It can be a pre-designated target or an automatically recognized target, which can be set according to actual needs.
  • the analysis results of at least two image analysis units can be integrated to determine the final target detection result.
  • the target detection method provided in the embodiments of the present invention is applied to an unmanned aerial vehicle in which at least two image analysis units are integrated.
  • a first image taken by the pan/tilt camera is acquired and segmented, and a size change operation is performed on the at least two segmented images so that the resolutions of the obtained at least two sub-images match the resolutions corresponding to the at least two image analysis units.
  • the at least two sub-images are input into the at least two image analysis units, and the target detection result is determined according to the analysis results of the at least two image analysis units.
  • by segmenting the original image taken by the pan/tilt camera before the size change operation, less image information is lost; the sub-images are then analyzed by the at least two image analysis units, which improves the accuracy and success rate of target detection.
  • FIG. 8 is a schematic flowchart of a target detection method according to the second embodiment of the present invention. The method is optimized on the basis of the above embodiment and refines the specific process of determining the target detection result according to the analysis results of the at least two image analysis units.
  • the determining the target detection result according to the analysis results of the at least two image analysis units includes: acquiring the analysis results of the at least two image analysis units, and performing fusion processing on the at least two analysis results to obtain the target detection result.
  • the advantage of this arrangement is that, during segmentation, a target may lie at the segmentation position and therefore be detected separately in two adjacent sub-images.
  • by fusing the analysis results, such segmented targets can be identified and their analysis results merged.
  • the analysis result includes the type information and position information of the analyzed target.
  • the fusion processing of at least two analysis results includes: sequentially recording every two adjacent sub-images as a current sub-image pair,
  • where the current sub-image pair includes a first sub-image and a second sub-image, and performing the following operations on the current sub-image pair: determining a first target in the first sub-image and a second target in the second sub-image; determining, according to the first position information and first type information corresponding to the first target and the second position information and second type information corresponding to the second target, whether the first target and the second target correspond to the same target; and, if so, merging the first target and the second target into the same target.
  • the advantage of this setting is that it can determine whether there is an image corresponding to the segmented target in every two adjacent sub-images one by one according to the position information and the type information, and can accurately identify the segmented target in the image.
  • the method includes the following steps:
  • Step 201 Obtain a first image taken by a pan-tilt camera.
  • Step 202 Perform segmentation processing on the first image to obtain at least two segmented images.
  • Step 203 Perform a size change operation on the at least two divided images to obtain at least two sub-images, and the resolutions of the at least two sub-images match the resolutions corresponding to the at least two image analysis units.
  • Step 204 Input at least two sub-images into the at least two image analysis units, and control the at least two image analysis units to analyze and process the received sub-images in parallel.
  • Step 205 Obtain analysis results of at least two image analysis units.
  • Step 206: Record every two adjacent sub-images as a current sub-image pair in turn, and perform the following operations for the current sub-image pair: determine the first target in the first sub-image and the second target in the second sub-image; according to the first position information and the first type information corresponding to the first target, and the second position information and the second type information corresponding to the second target, determine whether the first target and the second target correspond to the same target; if so, merge the first target and the second target into the same target.
  • the analysis result includes the type information and position information of the analyzed target, and the current sub-image pair includes the first sub-image and the second sub-image.
  • the first target may be any target analyzed in the first sub-image
  • the second target may be any target analyzed in the second sub-image.
  • a preliminary screening may be performed, and the targets closer to the segmentation boundary are determined as the first target or the second target.
  • taking the first target as an example, the boundary of the first sub-image that overlaps the second sub-image is recorded as the segmentation boundary, the candidate targets analyzed in the first sub-image are obtained, and for each candidate target the fifth distance between the current candidate target and the segmentation boundary is determined; if the fifth distance is less than a third preset threshold, the current candidate target is determined as the first target.
  • the third preset threshold may be set according to the size of the sub-image, for example, it may be a preset ratio of the side length of the side perpendicular to the segmentation boundary, and the preset ratio may be, for example, 10%.
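  • a sketch of this preliminary screening, assuming axis-aligned boxes in pixel coordinates and a left/right split; the 10% ratio is the example value given above for the third preset threshold, and the helper name is hypothetical:

```python
def select_boundary_candidates(candidates, split_x, sub_image_width, ratio=0.10):
    """Keep only candidate boxes lying close to the vertical segmentation boundary.

    candidates: list of (x_left, y_top, x_right, y_bottom) boxes in the stitched coordinate frame.
    split_x:    x coordinate of the segmentation boundary.
    """
    threshold = ratio * sub_image_width        # third preset threshold: 10% of the side perpendicular to the boundary
    selected = []
    for box in candidates:
        x_left, _, x_right, _ = box
        # fifth distance: gap between the box and the segmentation boundary
        distance = min(abs(split_x - x_right), abs(x_left - split_x))
        if distance < threshold:
            selected.append(box)
    return selected

print(select_boundary_candidates([(400, 100, 470, 180), (10, 20, 60, 80)], split_x=512, sub_image_width=512))
```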
  • the position information in the analysis result may include a coordinate range
  • the coordinate range may form a certain shape, such as a circle, an ellipse, or a rectangle, or a shape that matches the shape of the target.
  • the position information includes the coordinates of a rectangular frame, and the rectangular frame contains an image corresponding to the analyzed target.
  • the rectangular frame corresponding to the first target is recorded as a first rectangular frame
  • the rectangular frame corresponding to the second target is recorded as a second rectangular frame.
  • the type information that can be analyzed is determined by the specific capabilities of the image analysis unit: for example, it may distinguish a moving object from a stationary one, or an animal from a person; it may also give specific categories, such as vehicles or houses, or more detailed categories, such as cars, buses, fire trucks, and ambulances.
  • the same rule is adopted for numbering the four sides of the first rectangle and the second rectangle, and then it is determined whether the two rectangles correspond to the same target based on the distance between the sides and the type information corresponding to the two targets.
  • determining whether the first target and the second target correspond to the same target according to the first position information and first type information corresponding to the first target, and the second position information and second type information corresponding to the second target, can include:
  • calculating the first distance between the first boundary of the first rectangular frame and the third boundary of the second rectangular frame;
  • if the first sub-image and the second sub-image are left-right adjacent, the first boundary in each rectangular frame is the left boundary; if the first sub-image and the second sub-image are up-down adjacent, the first boundary in each rectangular frame is the upper boundary;
  • if the first ratio is less than a first preset threshold, the second ratio is greater than a second preset threshold, and the first type information and the second type information are the same, it is determined that the first target and the second target correspond to the same target, wherein the first preset threshold is smaller than the second preset threshold.
  • the first preset threshold and the second preset threshold can be set according to actual needs, for example, the first preset threshold is 0.1 and the second preset threshold is 0.6.
  • fusing the first target and the second target into the same target includes: determining a target rectangular frame according to the coordinates of the first rectangular frame and the coordinates of the second rectangular frame, the target rectangular frame containing both the first rectangular frame and the second rectangular frame; and determining the target rectangular frame and the first type information as the analysis result corresponding to the fused target.
  • the advantage of this arrangement is that when the first target and the second target are determined to be the same target, the target rectangular frame containing both the first rectangular frame and the second rectangular frame can be used as the final position information of the target, avoiding an incorrect target count in the final detection result.
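  • a minimal sketch of the merge step itself, using the (top, bottom, left, right) layout of Embodiment 3 and assuming image coordinates with y increasing downward; the helper name and the example coordinates are illustrative:

```python
def merge_boxes(first_box, second_box):
    """Return the smallest rectangle containing both boxes, as (top, bottom, left, right)."""
    t1, b1, l1, r1 = first_box
    t2, b2, l2, r2 = second_box
    return (min(t1, t2), max(b1, b2), min(l1, l2), max(r1, r2))

# e.g. a car split across the boundary: its left part and right part fuse into one detection
fused = merge_boxes((100, 180, 400, 512), (102, 178, 512, 590))
print(fused)  # (100, 180, 400, 590)
```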
  • Step 207 Determine the result after the fusion processing as the target detection result.
  • in the target detection method provided by the embodiment of the present invention, after image segmentation and size change, the sub-images are handed over to at least two image analysis units for parallel processing; after the preliminary analysis results of each image analysis unit are obtained, the case in which the same target is segmented into two sub-images is fully considered, the analysis results of the identified segmented targets are merged, and an accurate target detection result is finally obtained.
  • the method may further include performing a splicing operation on the at least two sub-images containing the target detection result, and outputting the spliced result to the user equipment.
  • the user equipment may be a device such as a mobile terminal or a computer for the user to view the target detection result.
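  • the splicing step can be as simple as concatenating the annotated sub-images before sending them to the user equipment; a NumPy sketch for the left/right case (transport to the mobile terminal or computer is outside the scope of this snippet):

```python
import numpy as np

def splice_left_right(left_sub_image, right_sub_image):
    """Horizontally concatenate two annotated sub-images of equal height, e.g. two 512*512 frames."""
    return np.hstack([left_sub_image, right_sub_image])   # result: 512*1024 for two 512*512 inputs

stitched = splice_left_right(np.zeros((512, 512, 3), np.uint8), np.zeros((512, 512, 3), np.uint8))
print(stitched.shape)  # (512, 1024, 3)
```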
  • Fig. 9 is a schematic flowchart of a target detection method provided in the third embodiment of the present invention. The method is described using the example of a drone integrating two image analysis units and dividing the image equally into left and right halves. Specifically, the method includes the following steps:
  • Step 301 Obtain a first image taken by a pan-tilt camera.
  • the following description is made by taking the image in FIG. 2 as the first image as an example.
  • the image ratio is 16:9 and the resolution is 1920*1080.
  • Step 302 Perform left and right average division processing on the first image to obtain two divided images.
  • Step 303 Perform a size change operation on the two divided images to obtain two sub-images, and the resolutions of the two sub-images match the resolutions corresponding to the two image analysis units.
  • the two image analysis units are the two forward inference engines dedicated to neural-network acceleration carried on the HI3559C chip, and the resolution supported by each engine is 512*512.
  • a resizing operation is performed on the images in Figure 4 and Figure 5 respectively to obtain two images with an aspect ratio of 1:1 and a resolution of 512*512.
  • Step 304 Input two sub-images into two image analysis units, and control the two image analysis units to analyze and process the received sub-images in parallel.
  • Step 305 Obtain the analysis results of the two image analysis units.
  • the two reasoners will respectively provide location information (BBox) and category (Class) information of the analyzed target in the sub-image to which it belongs.
  • the position information is represented by the coordinates of the rectangular frame.
  • for a target such as the car located at the segmentation position, the two inference engines will also each give its position information in the sub-image for which they are responsible.
  • FIG. 10 is a schematic diagram of a left side image including location information provided by Embodiment 3 of the present invention
  • FIG. 11 is a schematic diagram of a right side image including location information provided by Embodiment 3 of the present invention; as shown in FIGS. 10 and 11, rectangular boxes are used to mark the positions of the analyzed targets in each image.
  • Step 306 Determine the first target in the first sub-image and the second target in the second sub-image, according to the first location information and the first type information corresponding to the first target, and the second location information corresponding to the second target With the second type of information, it is determined whether the first target and the second target correspond to the same target, and if so, the first target and the second target are merged into the same target.
  • Fig. 12 is a schematic diagram of a fusion of analysis results provided by Embodiment 3 of the present invention.
  • the left rectangular box, Left BBox, represents the first rectangular box of the first target,
  • and the right rectangular box, Right BBox, represents the second rectangular box of the second target.
  • the rectangular frame in the left picture and the rectangular frame in the right picture are in the same coordinate system, and the coordinate system can be determined from the spliced image of the first sub-image and the second sub-image; for example, the lower left corner of the image is the coordinate origin, the lower boundary is the horizontal axis, and the left boundary is the vertical axis.
  • the rectangular box on the left can be represented by the coordinates (Ltop, Lbottom, Lleft, Lright), and the rectangular box on the right by the coordinates (Rtop, Rbottom, Rleft, Rright).
  • W, w, H, and h are defined by the formulas shown in FIG. 12.
  • if the ratios computed from W, w, H, and h satisfy the threshold conditions and the class labels are the same, the left and right BBoxes are considered to correspond to the same target.
  • in the example shown, the merged BBox is (Ltop, Rbottom, Lleft, Rright).
  • 0.1 is the first preset threshold
  • 0.6 is the second preset threshold.
  • the merged BBox is (min(Ltop, Rtop), max(Lbottom, Rbottom), Lleft, Rright).
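  • since the formulas for W, w, H, and h appear only in FIG. 12 and are not reproduced in the text, the sketch below uses assumed definitions that are consistent with the thresholds and with the merged-box formula above: W is the total width spanned by both boxes, w the horizontal gap between them, H the total height spanned, and h their vertical overlap. It is a plausible reading of the criterion, not the patent's literal formulas:

```python
def same_target(left_box, right_box, left_class, right_class,
                first_threshold=0.1, second_threshold=0.6):
    """Decide whether a left BBox and a right BBox correspond to the same split target.

    Boxes are (top, bottom, left, right) in the stitched image frame (y increasing downward).
    Assumed definitions (not given explicitly in the text): w/W measures horizontal closeness,
    h/H measures vertical overlap.
    """
    lt, lb, ll, lr = left_box
    rt, rb, rl, rr = right_box
    W = rr - ll                              # overall width covered by both boxes
    w = max(rl - lr, 0)                      # horizontal gap between the boxes
    H = max(lb, rb) - min(lt, rt)            # overall height covered by both boxes
    h = max(min(lb, rb) - max(lt, rt), 0)    # vertical overlap of the boxes
    if W <= 0 or H <= 0:
        return None
    if w / W < first_threshold and h / H > second_threshold and left_class == right_class:
        # merged BBox: (min(Ltop, Rtop), max(Lbottom, Rbottom), Lleft, Rright)
        return (min(lt, rt), max(lb, rb), ll, rr)
    return None

print(same_target((100, 180, 400, 512), (102, 178, 512, 590), "car", "car"))  # (100, 180, 400, 590)
```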
  • Step 307 Determine the result after the fusion processing as the target detection result.
  • in the target detection method provided by this embodiment, the original image collected by the pan/tilt camera is divided equally into left and right halves and resized, the two inference engines then process the sub-images in parallel, and after the preliminary analysis results of the two engines are obtained, the case in which the same target is segmented into two sub-images is fully considered and the analysis results of the identified segmented targets are merged, finally yielding an accurate target detection result.
  • the detection resolution is expanded from 512*512 to 1024*512, which can improve the detection success rate of small targets.
  • the time consumed by the two inference engines operating in parallel is the same as that of a single engine processing a 512*512 image.
  • compared with resizing a 16:9 image to 1:1, resizing an 8:9 image to 1:1 retains more of the image information, which further improves the accuracy and success rate of target detection.
  • Fig. 13 is a structural block diagram of a target detection device provided by the fourth embodiment of the present invention.
  • the device can be implemented by software and/or hardware, and can generally be integrated in an unmanned aerial vehicle.
  • Target detection can be performed by executing a target detection method.
  • At least two image analysis units are integrated in the drone.
  • the device includes:
  • the image acquisition module 401 is used to acquire the first image taken by the pan-tilt camera;
  • the image segmentation module 402 is configured to perform segmentation processing on the first image to obtain at least two segmented images
  • the size changing module 403 is configured to perform a size changing operation on the at least two divided images to obtain at least two sub-images, and the resolutions of the at least two sub-images match the resolutions corresponding to the at least two image analysis units ;
  • the target detection module 404 is configured to input the at least two sub-images into the at least two image analysis units, and determine the target detection result according to the analysis results of the at least two image analysis units.
  • the target detection device provided in the embodiment of the present invention is applied to an unmanned aerial vehicle in which at least two image analysis units are integrated.
  • a first image taken by the pan/tilt camera is acquired and segmented, and a size change operation is performed on the at least two segmented images so that the resolutions of the obtained at least two sub-images match the resolutions corresponding to the at least two image analysis units.
  • the at least two sub-images are input into the at least two image analysis units, and the target detection result is determined according to the analysis results of the at least two image analysis units.
  • by segmenting the original image taken by the pan/tilt camera before the size change operation, less image information is lost; the sub-images are then analyzed by the at least two image analysis units, which improves the accuracy and success rate of target detection.
  • the determining the target detection result according to the analysis results of the at least two image analysis units includes: acquiring the analysis results of the at least two image analysis units, and performing fusion processing on the at least two analysis results to obtain the target detection result.
  • the analysis result includes type information and location information of the analyzed target
  • the fusion processing of at least two analysis results includes:
  • each two adjacent sub-images are recorded as a current sub-image pair in turn, the current sub-image pair includes a first sub-image and a second sub-image, and the following operations are performed on the current sub-image pair:
  • the position information includes the coordinates of a rectangular frame, which contains an image corresponding to the analyzed target; the rectangular frame corresponding to the first target is marked as the first rectangular frame, and the second target corresponds to The rectangular frame of is marked as the second rectangular frame;
  • determining whether the first target and the second target correspond to the same target according to the first location information and first type information corresponding to the first target, and the second location information and second type information corresponding to the second target, includes:
  • calculating the first distance between the first boundary of the first rectangular frame and the third boundary of the second rectangular frame;
  • if the first sub-image and the second sub-image are left-right adjacent, the first boundary in each rectangular frame is the left boundary; if the first sub-image and the second sub-image are up-down adjacent, the first boundary in each rectangular frame is the upper boundary;
  • if the first ratio is less than a first preset threshold, the second ratio is greater than a second preset threshold, and the first type information and the second type information are the same, it is determined that the first target and the second target correspond to the same target, wherein the first preset threshold is smaller than the second preset threshold.
  • the fusing the first target and the second target into the same target includes:
  • determining a target rectangular frame according to the coordinates of the first rectangular frame and the coordinates of the second rectangular frame, the target rectangular frame containing both the first rectangular frame and the second rectangular frame;
  • the target rectangular frame and the first type information are determined as the analysis result corresponding to the fused target.
  • determining the first target in the first sub-image includes:
  • recording the boundary of the first sub-image that overlaps the second sub-image as the segmentation boundary, acquiring the candidate targets analyzed in the first sub-image, and, for each candidate target, determining the fifth distance between the current candidate target and the segmentation boundary; if the fifth distance is less than the third preset threshold, determining the current candidate target as the first target.
  • the at least two image analysis units include at least two forward reasoners NNIE that are accelerated based on a neural network.
  • An embodiment of the present invention also provides a storage medium containing computer-executable instructions, which are used to execute a target detection method when executed by a computer processor, and the method includes:
  • obtaining a first image taken by a pan-tilt camera; performing segmentation processing on the first image to obtain at least two segmented images; performing a size change operation on the at least two segmented images to obtain at least two sub-images, the resolutions of the at least two sub-images matching the resolutions corresponding to the at least two image analysis units; and inputting the at least two sub-images into the at least two image analysis units, and determining the target detection result according to the analysis results of the at least two image analysis units.
  • a storage medium may be any of various types of memory devices or storage devices.
  • the term "storage medium" is intended to include: installation media, such as CD-ROM, floppy disk, or tape devices; computer system memory or random access memory, such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory, such as flash memory or magnetic media (e.g., hard disks or optical storage); and registers or other similar types of memory elements.
  • the storage medium may further include other types of memory or a combination thereof.
  • the storage medium may be located in the first computer system in which the program is executed, or may be located in a different second computer system connected to the first computer system through a network (such as the Internet).
  • the second computer system can provide the program instructions to the first computer for execution.
  • storage media may include two or more storage media that may reside in different locations (for example, in different computer systems connected through a network).
  • the storage medium may store program instructions (for example, embodied as a computer program) executable by one or more processors.
  • a storage medium containing computer-executable instructions provided by an embodiment of the present invention is not limited to the target detection operations described above, and can also execute related operations in the target detection method provided in any embodiment of the present invention.
  • FIG. 14 is a structural block diagram of an unmanned aerial vehicle according to Embodiment 6 of the present invention.
  • the drone 500 may include: a memory 501, a processor 502, and at least two image analysis units 503 (only one is shown in the figure).
  • a computer program is stored on the memory 501 and can run on the processor; when the processor 502 executes the computer program, the target detection method described in the embodiments of the present invention is implemented.
  • the method may include:
  • obtaining a first image taken by a pan-tilt camera; performing segmentation processing on the first image to obtain at least two segmented images; and performing a size change operation on the at least two segmented images to obtain at least two sub-images, the resolutions of the at least two sub-images matching the resolutions corresponding to the at least two image analysis units;
  • the at least two sub-images are input into the at least two image analysis units, and the target detection result is determined according to the analysis results of the at least two image analysis units.
  • after the original image taken by the pan/tilt camera is segmented, the unmanned aerial vehicle provided by the embodiment of the present invention loses less image information during the size change operation; the sub-images are then analyzed by at least two image analysis units, which can improve the accuracy and success rate of target detection.
  • the target detection device, storage medium, and computer equipment provided in the foregoing embodiments can execute the target detection method provided in any embodiment of the present invention, and have the corresponding functional modules and beneficial effects for executing the method.
  • for technical details not described in detail in the foregoing embodiments, reference may be made to the target detection method provided in any embodiment of the present invention.

Abstract

A target detection method and apparatus, a storage medium, and an unmanned aerial vehicle. The method comprises: acquiring a first image captured by a gimbal camera (101); performing segmentation processing on the first image to obtain at least two segmented images (102); performing a size change operation on the at least two segmented images to obtain at least two sub-images, the resolutions of the at least two sub-images matching the resolutions corresponding to at least two image analysis units (103); and inputting the at least two sub-images into the at least two image analysis units, and determining a target detection result according to the analysis results of the at least two image analysis units (104). The use of the technical solution in the present method can reduce the loss of image information and improve the accuracy and success rate of target detection.

Description

Target detection method, device, storage medium and unmanned aerial vehicle
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on November 1, 2019, with application number 201911060737.1 and titled "Target Detection Method, Device, Storage Medium, and UAV", the entire content of which is incorporated into this application by reference.
Technical field
The embodiments of the present invention relate to the technical field of drones, and in particular to target detection methods, devices, storage media, and drones.
Background
Unmanned Aerial Vehicles (UAVs) are unmanned aircraft operated by radio remote control equipment and independent program control equipment, or operated fully or intermittently by onboard computers. Compared with manned aircraft, UAVs have the advantages of small size, low cost, low environmental requirements, and strong survivability, and are often better suited to tasks in dangerous or harsh environments. With the rapid development of the UAV manufacturing industry, UAV systems are widely used in areas such as smart city management and intelligent traffic monitoring. Target detection is a basic but challenging functional requirement of UAV systems and is closely related to applications such as infrastructure inspection, city perception, map reconstruction, and traffic control. These applications have promoted the development of UAV-based online monitoring systems, which can perform various tasks such as on-site facility inspection and violation detection, identification of unhealthy crops, and acquisition of map data.
A UAV generally integrates an image processing chip, and the image analysis unit in the chip analyzes and processes the images taken by the gimbal camera on the UAV to achieve target detection. At present, to guarantee real-time detection, the input resolution of the image processing chip is generally low, while the resolution of the gimbal camera is very high, so the image taken by the gimbal camera must go through a resize operation before it is handed over to the image processing chip for processing. Part of the image information is therefore lost, making some small targets very difficult or even impossible to detect, so the existing UAV target detection scheme needs to be improved.
Summary of the invention
The embodiments of the present invention provide a target detection method, device, storage medium, and equipment, which can optimize the existing target detection scheme.
In a first aspect, an embodiment of the present invention provides a target detection method applied to an unmanned aerial vehicle in which at least two image analysis units are integrated. The method includes:
obtaining a first image taken by a gimbal camera;
performing segmentation processing on the first image to obtain at least two segmented images;
performing a size change operation on the at least two segmented images to obtain at least two sub-images, the resolutions of the at least two sub-images matching the resolutions corresponding to the at least two image analysis units; and
inputting the at least two sub-images into the at least two image analysis units, and determining a target detection result according to the analysis results of the at least two image analysis units.
In a second aspect, an embodiment of the present invention provides a target detection device applied to an unmanned aerial vehicle in which at least two image analysis units are integrated. The device includes:
an image acquisition module, configured to acquire a first image taken by a gimbal camera;
an image segmentation module, configured to perform segmentation processing on the first image to obtain at least two segmented images;
a size change module, configured to perform a size change operation on the at least two segmented images to obtain at least two sub-images, the resolutions of the at least two sub-images matching the resolutions corresponding to the at least two image analysis units; and
a target detection module, configured to input the at least two sub-images into the at least two image analysis units and determine a target detection result according to the analysis results of the at least two image analysis units.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the target detection method provided in the embodiments of the present invention is implemented.
In a fourth aspect, an embodiment of the present invention provides an unmanned aerial vehicle including a memory, at least two image analysis units, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the target detection method provided in the embodiments of the present invention is implemented.
The target detection solution provided in the embodiments of the present invention is applied to an unmanned aerial vehicle in which at least two image analysis units are integrated: a first image taken by the gimbal camera is acquired and segmented, a size change operation is performed on the at least two segmented images so that the resolutions of the resulting at least two sub-images match the resolutions corresponding to the at least two image analysis units, the at least two sub-images are input into the at least two image analysis units, and the target detection result is determined according to the analysis results of the at least two image analysis units. By segmenting the original image taken by the gimbal camera before the size change operation, the loss of image information is reduced; the sub-images are then analyzed by the at least two image analysis units, which improves the accuracy and success rate of target detection.
Description of the drawings
FIG. 1 is a schematic flowchart of a target detection method according to Embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of an image taken by a gimbal camera according to Embodiment 1 of the present invention;
FIG. 3 is a schematic diagram of an image after size-change processing according to Embodiment 1 of the present invention;
FIG. 4 is a schematic diagram of the left image after segmentation according to Embodiment 1 of the present invention;
FIG. 5 is a schematic diagram of the right image after segmentation according to Embodiment 1 of the present invention;
FIG. 6 is a schematic diagram of the left image after the size change according to Embodiment 1 of the present invention;
FIG. 7 is a schematic diagram of the right image after the size change according to Embodiment 1 of the present invention;
FIG. 8 is a schematic flowchart of a target detection method according to Embodiment 2 of the present invention;
FIG. 9 is a schematic flowchart of a target detection method according to Embodiment 3 of the present invention;
FIG. 10 is a schematic diagram of the left image including position information according to Embodiment 3 of the present invention;
FIG. 11 is a schematic diagram of the right image including position information according to Embodiment 3 of the present invention;
FIG. 12 is a schematic diagram of the fusion of analysis results according to Embodiment 3 of the present invention;
FIG. 13 is a structural block diagram of a target detection device according to Embodiment 4 of the present invention;
FIG. 14 is a structural block diagram of an unmanned aerial vehicle according to Embodiment 6 of the present invention.
Detailed description
The technical solutions of the present invention are further described below with reference to the drawings and specific embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Before the exemplary embodiments are discussed in more detail, it should be mentioned that some of them are described as processes or methods depicted as flowcharts. Although a flowchart describes the steps as sequential processing, many of the steps can be implemented in parallel, concurrently, or simultaneously. In addition, the order of the steps can be rearranged. The processing may be terminated when its operations are completed, but it may also have additional steps not included in the drawings. The processing may correspond to a method, a function, a procedure, a subroutine, a sub-program, and so on.
实施例一Example one
图1为本发明实施例一提供的一种目标检测方法的流程示意图,该方法可 以由目标检测装置执行,其中该装置可由软件和/或硬件实现,一般可集成在无人机中。如图1所示,该方法包括:Fig. 1 is a schematic flow chart of a target detection method provided by Embodiment 1 of the present invention. The method can be executed by a target detection device, where the device can be implemented by software and/or hardware, and generally can be integrated in a drone. As shown in Figure 1, the method includes:
步骤101、获取云台相机拍摄的第一图像。Step 101: Obtain a first image taken by a pan-tilt camera.
本发明实施例中,云台相机可以集成在无人机内部,也可外置于无人机上,通过有线或无线等方式与无人机建立连接。云台相机能够在无人机的飞行过程中实时进行图像采集,可以实时或以预设频率获取云台相机拍摄的图像。第一图像可以是任意时刻拍摄的图像,本发明实施例不做限定。In the embodiment of the present invention, the pan/tilt camera can be integrated inside the drone, or externally placed on the drone, to establish a connection with the drone through wired or wireless methods. The gimbal camera can collect images in real time during the flight of the drone, and can obtain the images taken by the gimbal camera in real time or at a preset frequency. The first image may be an image taken at any time, which is not limited in the embodiment of the present invention.
本发明实施例中,无人机中集成有至少两个图像分析单元,对图像分析单元的具体类型不做限定,例如可以是基于神经网络进行加速的正向推理器(Neural Network Inference Engine,NNIE)。图像分析单元可以集成在图像处理芯片内,如HI3559C芯片,该芯片携带两个专门针对神经网络进行加速的正向推理器,两个正向推理器可以单独地处理检测及分类等任务。In the embodiment of the present invention, at least two image analysis units are integrated in the drone, and the specific type of the image analysis unit is not limited. For example, it may be a forward inference engine (Neural Network Inference Engine, NNIE) that is accelerated based on a neural network. ). The image analysis unit can be integrated in an image processing chip, such as the HI3559C chip, which carries two forward reasoners specifically aimed at the acceleration of neural networks. The two forward reasoners can handle tasks such as detection and classification independently.
步骤102、对所述第一图像进行分割处理,得到至少两个分割图像。Step 102: Perform segmentation processing on the first image to obtain at least two segmented images.
随着相机技术的快速发展,目前云台相机的分辨率可以达到很高,如8K,所拍摄的图像的比例一般为4:3或16:9。为了保证检测的实时性,图像分析单元的输入图像的分辨率一般较低,如512*512,这样,在对云台相机拍摄的图像进行尺寸变更(resize)操作时,就会损失较多的图像信息,使得目标检测准确率下降,尤其难以检测小目标,当无人机飞行高度较高时,地面上的目标在图像中更小,更加难以检出。With the rapid development of camera technology, the resolution of the current pan/tilt camera can reach very high, such as 8K, and the ratio of the captured image is generally 4:3 or 16:9. In order to ensure the real-time detection, the resolution of the input image of the image analysis unit is generally low, such as 512*512. In this way, when the image taken by the pan/tilt camera is resized, it will lose more Image information reduces the accuracy of target detection, and it is especially difficult to detect small targets. When the drone is flying at a high altitude, targets on the ground are smaller in the image and more difficult to detect.
图2为本发明实施例一提供的云台相机拍摄的图像示意图,该图像比例为16:9,分辨率为1920*1080。图3为本发明实施例一提供的一种经过尺寸变更处理的图像示意图,经过resize之后,图2中的图像变成了图3中的1:1的图像,分辨率为512*512,这样,图2中的大部分信息被丢失掉,导致小目标的检测不 理想。FIG. 2 is a schematic diagram of an image taken by a pan-tilt camera according to Embodiment 1 of the present invention. The image ratio is 16:9 and the resolution is 1920*1080. Fig. 3 is a schematic diagram of an image that has undergone a resizing process according to the first embodiment of the present invention. After resize, the image in Fig. 2 becomes the 1:1 image in Fig. 3, with a resolution of 512*512, so , Most of the information in Figure 2 is lost, resulting in unsatisfactory detection of small targets.
本步骤中,可以按照预设规则对第一图像进行分割处理,得到至少两个分割图像。其中,预设规则可以包括具体的分割数量以及分割方式等,如平均分割或按照比例分割等,又如左右结构的分割、上下结构的分割、田字形分割以及九宫格式分割等等。具体的分割数量以及分割方式可根据图像分析单元的数量以及其他参考因素确定,本发明实施例不做限定。In this step, the first image may be segmented according to a preset rule to obtain at least two segmented images. Among them, the preset rules can include specific number of divisions and division methods, such as equal division or proportional division, etc., such as the division of left and right structures, the division of upper and lower structures, the division of Tian shape, and the division of Jiugong format. The specific number of segmentation and the segmentation method can be determined according to the number of image analysis units and other reference factors, which are not limited in the embodiment of the present invention.
本发明实施例中,为了加强与现有技术的对比效果,仍以图2中的图像为例。图4为本发明实施例一提供的一种分割后的左侧图像示意图,图5为本发明实施例一提供的一种分割后的右侧图像示意图,参照图4和图5,可以对图2中的图像进行左右结构的平均分割,得到两个图像比例为8:9的分割图像。In the embodiment of the present invention, in order to enhance the contrast effect with the prior art, the image in FIG. 2 is still taken as an example. FIG. 4 is a schematic diagram of a divided left image according to Embodiment 1 of the present invention, and FIG. 5 is a schematic diagram of a divided right image according to Embodiment 1 of the present invention. Referring to FIG. 4 and FIG. The image in 2 is divided equally between the left and right structure, and two divided images with an image ratio of 8:9 are obtained.
Step 103: Perform a resizing operation on the at least two segmented images to obtain at least two sub-images, where the resolutions of the at least two sub-images match the resolutions corresponding to the at least two image analysis units.
In the embodiments of the present invention, the specific manner of performing the resizing operation on the segmented images is not limited; the purpose of the resizing operation is to make the resolutions of the at least two sub-images match the resolutions corresponding to the at least two image analysis units. Exemplarily, and generally, the resolutions corresponding to the at least two image analysis units are equal, so the resolutions of the at least two sub-images are the same as that resolution.
Exemplarily, the resizing operation may be performed using an interpolation algorithm, such as nearest-neighbor interpolation, bilinear interpolation, bicubic interpolation, pixel-area-relation resampling, or Lanczos interpolation.
FIG. 6 is a schematic diagram of the resized left image according to Embodiment 1 of the present invention, and FIG. 7 is a schematic diagram of the resized right image according to Embodiment 1 of the present invention. Referring to FIG. 6 and FIG. 7, the resizing operation is performed on the images in FIG. 4 and FIG. 5 respectively, yielding two images with an image ratio of 1:1 and a resolution of 512*512.
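A hedged sketch of this resizing step, assuming OpenCV and bilinear interpolation (any of the interpolation methods listed above could be substituted) and taking the 512*512 engine input size from the example:

```python
import cv2

ENGINE_INPUT_SIZE = (512, 512)  # assumed input resolution of each image analysis unit

def resize_for_engine(sub_image, size=ENGINE_INPUT_SIZE):
    """Resize a segmented image (e.g. a 960*1080 half) to the analysis unit's input size.

    INTER_LINEAR (bilinear) is used as an example; nearest-neighbor, bicubic,
    area-based, or Lanczos interpolation could be substituted.
    """
    return cv2.resize(sub_image, size, interpolation=cv2.INTER_LINEAR)

# left_img / right_img: halves produced by the split shown earlier (or any two arrays)
left_512 = resize_for_engine(left_img)
right_512 = resize_for_engine(right_img)
```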
Step 104: Input the at least two sub-images into the at least two image analysis units, and determine a target detection result according to the analysis results of the at least two image analysis units.
By adopting the solution of the embodiments of the present invention, the total area of the images input to the image analysis units becomes larger. Referring to the above example, this is equivalent to doubling the size of the targets in the image; if the image is divided into more segments, the magnification factor is even larger. This reduces the difficulty of target detection and can effectively improve the accuracy and success rate of target detection.
The working sequence of the at least two image analysis units is not specifically limited in the embodiments of the present invention. Optionally, after the at least two sub-images are input into the at least two image analysis units, the method further includes: controlling the at least two image analysis units to analyze and process the received sub-images in parallel. The benefit of this arrangement is that the accuracy and success rate of target detection are improved while the real-time performance of detection is maintained.
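A minimal sketch of such parallel dispatch; the `detect` call on each engine is a hypothetical placeholder, since the disclosure does not name a concrete API for the inference engines:

```python
from concurrent.futures import ThreadPoolExecutor

def run_inference(engine, sub_image):
    """Submit one sub-image to one analysis unit and return its detections.

    `engine.detect` is a hypothetical method; the real call depends on the engine's SDK.
    """
    return engine.detect(sub_image)

def analyze_in_parallel(engines, sub_images):
    """Dispatch one sub-image to each analysis unit and collect all results."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = [pool.submit(run_inference, e, img)
                   for e, img in zip(engines, sub_images)]
        return [f.result() for f in futures]
```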
Exemplarily, the content contained in the analysis result of an image analysis unit is related to the specific type, model, and functions of the image analysis unit, which are not limited in the embodiments of the present invention. Generally, the analysis result may include the position information and type information of the detected targets. A target may be a target object or a target person; it may be a pre-designated target or an automatically recognized target, and may be set according to actual needs. The analysis results of the at least two image analysis units may be integrated to determine the final target detection result.
The target detection method provided in the embodiments of the present invention is applied to a UAV in which at least two image analysis units are integrated. A first image captured by the pan-tilt camera is acquired, the first image is segmented, and a resizing operation is performed on the at least two segmented images so that the resolutions of the obtained at least two sub-images match the resolutions corresponding to the at least two image analysis units. The at least two sub-images are input into the at least two image analysis units, and the target detection result is determined according to the analysis results of the at least two image analysis units. With this technical solution, segmenting the original image captured by the pan-tilt camera before the resizing operation reduces the amount of image information that is lost; the segments are then analyzed by the at least two image analysis units, which improves the accuracy and success rate of target detection.
Embodiment 2
FIG. 8 is a schematic flowchart of a target detection method according to Embodiment 2 of the present invention. The method is optimized on the basis of the above embodiment and refines the specific process of determining the target detection result according to the analysis results of the at least two image analysis units.
Exemplarily, determining the target detection result according to the analysis results of the at least two image analysis units includes: acquiring the analysis results of the at least two image analysis units; and performing fusion processing on the at least two analysis results to obtain the target detection result. The benefit of this arrangement is that, during segmentation, a target may lie on the dividing line, so that it is detected in both of the adjacent images; in this case the analysis results can be fused so that the analysis results of the divided target are merged.
Further, the analysis results include the type information and position information of the detected targets, and performing fusion processing on the at least two analysis results includes: successively recording every two adjacent sub-images as a current sub-image pair, the current sub-image pair including a first sub-image and a second sub-image, and performing the following operations on the current sub-image pair: determining a first target in the first sub-image and a second target in the second sub-image; determining, according to first position information and first type information corresponding to the first target and second position information and second type information corresponding to the second target, whether the first target and the second target correspond to the same target; and if so, fusing the first target and the second target into the same target. The benefit of this arrangement is that, based on the position information and type information, it can be determined pair by pair whether two adjacent sub-images contain images of a divided target, so that targets divided across sub-images can be accurately identified.
Specifically, the method includes the following steps:
Step 201: Acquire a first image captured by the pan-tilt camera.
Step 202: Perform segmentation processing on the first image to obtain at least two segmented images.
Step 203: Perform a resizing operation on the at least two segmented images to obtain at least two sub-images, where the resolutions of the at least two sub-images match the resolutions corresponding to the at least two image analysis units.
Step 204: Input the at least two sub-images into the at least two image analysis units, and control the at least two image analysis units to analyze and process the received sub-images in parallel.
Step 205: Acquire the analysis results of the at least two image analysis units.
Step 206: Successively record every two adjacent sub-images as a current sub-image pair, and perform the following operations on the current sub-image pair: determine the first target in the first sub-image and the second target in the second sub-image; determine, according to the first position information and first type information corresponding to the first target and the second position information and second type information corresponding to the second target, whether the first target and the second target correspond to the same target; and if so, fuse the first target and the second target into the same target.
The analysis results include the type information and position information of the detected targets, and the current sub-image pair includes the first sub-image and the second sub-image. The first target may be any target detected in the first sub-image, and the second target may be any target detected in the second sub-image.
Optionally, in order to reduce the amount of computation, the first target and the second target may be pre-screened, and only targets close to the segmentation boundary are taken as the first target or the second target. Exemplarily, taking the first target as an example, the boundary of the first sub-image that coincides with the second sub-image is recorded as the segmentation boundary; the candidate targets detected in the first sub-image are obtained, and for each candidate target, a fifth distance between the current candidate target and the segmentation boundary is determined. If the fifth distance is smaller than a third preset threshold, the current candidate target is determined as a first target. The second target may be determined in the same way. The third preset threshold may be set according to the size of the sub-image; for example, it may be a preset proportion of the side length of the side perpendicular to the segmentation boundary, and the preset proportion may be, for example, 10%.
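One possible sketch of this pre-screening for the left-right split case, assuming detections are given as (left, top, right, bottom) boxes in each sub-image's own pixel coordinates and using the 10% proportion mentioned above as the third preset threshold; the detection lists are hypothetical:

```python
def near_split_boundary(box, sub_image_width, boundary_side="right", ratio=0.10):
    """Return True if a detection box lies close to the segmentation boundary.

    box: (left, top, right, bottom) in sub-image pixel coordinates.
    boundary_side: 'right' for the left sub-image, 'left' for the right sub-image.
    ratio: third preset threshold as a fraction of the width (the side
           perpendicular to the segmentation boundary).
    """
    threshold = ratio * sub_image_width
    left, _, right, _ = box
    if boundary_side == "right":       # left sub-image: boundary is its right edge
        distance = sub_image_width - right
    else:                              # right sub-image: boundary is its left edge
        distance = left
    return distance < threshold

# left_detections / right_detections: hypothetical lists of boxes from each engine
first_targets = [b for b in left_detections if near_split_boundary(b, 512, "right")]
second_targets = [b for b in right_detections if near_split_boundary(b, 512, "left")]
```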
Exemplarily, the position information in the analysis result may include a coordinate range, and the coordinate range may form a certain shape, for example a circle, an ellipse, or a rectangle, or a shape matching the outline of the target.
Optionally, the position information includes the coordinates of a rectangular frame that contains the image corresponding to the detected target. The rectangular frame corresponding to the first target is recorded as a first rectangular frame, and the rectangular frame corresponding to the second target is recorded as a second rectangular frame. The type information is determined by the specific capabilities of the image analysis unit; for example, it may distinguish a moving object from a stationary one, or an animal from a person, and it may further give specific categories such as vehicle or building, or even finer categories such as car, bus, fire truck, or ambulance.
Exemplarily, the four sides of the first rectangular frame and the second rectangular frame are numbered according to the same rule, and whether the two rectangular frames correspond to the same target is then judged based on the distances between the sides and the type information of the two targets. For the left-right adjacent case, it is judged whether the two rectangular frames are close enough in the horizontal direction and whether their deviation from the same horizontal line in the vertical direction is small enough; for the top-bottom adjacent case, it is judged whether the two rectangular frames are close enough in the vertical direction and whether their deviation from the same vertical line in the horizontal direction is small enough. If the above conditions are satisfied and the first target and the second target have the same type information, the first target and the second target may be considered to be the same target.
Specifically, determining, according to the first position information and first type information corresponding to the first target and the second position information and second type information corresponding to the second target, whether the first target and the second target correspond to the same target may include:
calculating, according to the coordinates of the first rectangular frame and the coordinates of the second rectangular frame, a first distance between the first boundary of the first rectangular frame and the third boundary of the second rectangular frame, a second distance between the third boundary of the first rectangular frame and the first boundary of the second rectangular frame, a third distance between the second boundary of the first rectangular frame and the fourth boundary of the second rectangular frame, and a fourth distance between the fourth boundary of the first rectangular frame and the second boundary of the second rectangular frame, where in each rectangular frame the first boundary is parallel to the third boundary and the second boundary is parallel to the fourth boundary; when the first sub-image and the second sub-image are left-right adjacent, the first boundary of each rectangular frame is its left boundary, and when the first sub-image and the second sub-image are top-bottom adjacent, the first boundary of each rectangular frame is its upper boundary;
calculating a first ratio of the smaller to the larger of the first distance and the second distance, and calculating a second ratio of the smaller to the larger of the third distance and the fourth distance; and
when the first ratio is smaller than a first preset threshold, the second ratio is larger than a second preset threshold, and the first type information is the same as the second type information, determining that the first target and the second target correspond to the same target, where the first preset threshold is smaller than the second preset threshold.
The first preset threshold and the second preset threshold may be set according to actual needs; for example, the first preset threshold is 0.1 and the second preset threshold is 0.6.
Optionally, fusing the first target and the second target into the same target includes: determining a target rectangular frame according to the coordinates of the first rectangular frame and the coordinates of the second rectangular frame, the target rectangular frame containing both the first rectangular frame and the second rectangular frame; and determining the target rectangular frame and the first type information as the analysis result corresponding to the fused target. The benefit of this arrangement is that, when the first target and the second target are determined to be the same target, the target rectangular frame containing both the first rectangular frame and the second rectangular frame can be taken as the final position information of that target, preventing the number of targets in the final target detection result from being wrong.
Step 207: Determine the result of the fusion processing as the target detection result.
In the target detection method provided by the embodiments of the present invention, after image segmentation and resizing, the sub-images are processed in parallel by at least two image analysis units. After the preliminary analysis results of each image analysis unit are obtained, the case in which the same target is divided between two sub-images is fully considered, and the analysis results of the targets judged to have been divided are fused, so that an accurate target detection result is finally obtained.
On the basis of the foregoing embodiments, after the target detection result is determined, the method may further include: performing a stitching operation on the at least two sub-images containing the target detection result, and outputting the stitched image to a user device. The user device may be a device, such as a mobile terminal or a computer, on which the user views the target detection result.
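As a small illustration (not specified in the disclosure), stitching the two annotated sub-images back together for display could be a simple horizontal concatenation, assuming both halves share the same height:

```python
import numpy as np

def stitch_left_right(left_annotated: np.ndarray, right_annotated: np.ndarray) -> np.ndarray:
    """Concatenate two annotated sub-images of equal height side by side
    before sending the result to the user device."""
    return np.hstack([left_annotated, right_annotated])
```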
Embodiment 3
FIG. 9 is a schematic flowchart of a target detection method according to Embodiment 3 of the present invention. The method is described by taking as an example a UAV in which two image analysis units are integrated and in which the image is divided equally into left and right halves. Specifically, the method includes the following steps:
Step 301: Acquire a first image captured by the pan-tilt camera.
For ease of description, the image in FIG. 2 is taken as the first image in the following description; the image ratio is 16:9 and the resolution is 1920*1080.
Step 302: Perform a left-right equal division on the first image to obtain two segmented images.
The image in FIG. 2 is divided equally into left and right halves, yielding two segmented images with an image ratio of 8:9, as shown in FIG. 4 and FIG. 5.
Step 303: Perform a resizing operation on the two segmented images to obtain two sub-images, where the resolutions of the two sub-images match the resolutions corresponding to the two image analysis units.
Exemplarily, the two image analysis units are the two forward inference engines carried on the HI3559C chip that are dedicated to neural-network acceleration, and the resolution supported by each inference engine is 512*512. Referring to FIG. 6 and FIG. 7, the resizing operation is performed on the images in FIG. 4 and FIG. 5 respectively, yielding two images with an image ratio of 1:1 and a resolution of 512*512.
Step 304: Input the two sub-images into the two image analysis units, and control the two image analysis units to analyze and process the received sub-images in parallel.
Step 305: Acquire the analysis results of the two image analysis units.
Exemplarily, the two inference engines respectively output the position information (BBox) and category (Class) information of the detected targets in their own sub-images, where the position information is expressed as the coordinates of a rectangular frame. When a target spans both the left and right sub-images, such as the car in the figures, which appears in both FIG. 6 and FIG. 7, each inference engine gives the position information of the car in the sub-image it is responsible for. FIG. 10 is a schematic diagram of the left image containing position information according to Embodiment 3 of the present invention, and FIG. 11 is a schematic diagram of the right image containing position information according to Embodiment 3 of the present invention. As shown in FIG. 10 and FIG. 11, the positions of the detected targets are marked with rectangular frames.
Step 306: Determine the first target in the first sub-image and the second target in the second sub-image; determine, according to the first position information and first type information corresponding to the first target and the second position information and second type information corresponding to the second target, whether the first target and the second target correspond to the same target; and if so, fuse the first target and the second target into the same target.
FIG. 12 is a schematic diagram of the fusion of analysis results according to Embodiment 3 of the present invention. As shown in the figure, the left rectangular frame (Left BBox) is the first rectangular frame of the first target, and the right rectangular frame (Right BBox) is the second rectangular frame of the second target.
The rectangular frame of the left image and the rectangular frame of the right image are in the same coordinate system, which may be determined from the image formed by stitching the first sub-image and the second sub-image; for example, the lower-left corner of that image is taken as the coordinate origin, the lower boundary as the horizontal axis, and the left boundary as the vertical axis. The rectangular frame of the left image can be expressed by the coordinates (Ltop, Lbottom, Lleft, Lright), and the rectangular frame of the right image by the coordinates (Rtop, Rbottom, Rleft, Rright). W, w, H, and h in the figure are given by the following formulas:
W = Rright - Lleft    w = Rleft - Lright
H = Rbottom - Ltop    h = Lbottom - Rtop
When w/W < 0.1, h/H > 0.6, and the category information is the same, the left and right BBoxes can be considered to be the same target; in this case the fused BBox is (Ltop, Rbottom, Lleft, Rright). Here 0.1 is the first preset threshold and 0.6 is the second preset threshold.
The above is only one case; since the sizes and relative positions of the rectangular frames may differ, the general formulas can be expressed as:
W = Rright - Lleft    w = Rleft - Lright
H = max(Rbottom, Lbottom) - min(Ltop, Rtop)
h = min(Lbottom, Rbottom) - max(Rtop, Ltop)
When w/W < 0.1, h/H > 0.6, and the category information is the same, the left and right BBoxes can be considered to be the same target; in this case the fused BBox is (min(Ltop, Rtop), max(Lbottom, Rbottom), Lleft, Rright).
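A compact sketch of this fusion rule, assuming standard pixel coordinates in the stitched image (top values numerically smaller than bottom values), boxes given as (top, bottom, left, right), and the 0.1/0.6 thresholds from the example:

```python
def fuse_left_right(left_box, right_box, left_cls, right_cls,
                    ratio_w_max=0.1, ratio_h_min=0.6):
    """Decide whether a left-image box and a right-image box are the same target
    and, if so, return the fused box; boxes are (top, bottom, left, right) in the
    coordinate system of the stitched full image."""
    l_top, l_bottom, l_left, l_right = left_box
    r_top, r_bottom, r_left, r_right = right_box

    W = r_right - l_left                               # overall horizontal span
    w = r_left - l_right                               # horizontal gap across the split
    H = max(r_bottom, l_bottom) - min(l_top, r_top)    # union height
    h = min(l_bottom, r_bottom) - max(r_top, l_top)    # overlap height

    same = (left_cls == right_cls
            and W > 0 and H > 0
            and w / W < ratio_w_max
            and h / H > ratio_h_min)
    if not same:
        return None
    return (min(l_top, r_top), max(l_bottom, r_bottom), l_left, r_right)
```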
Step 307: Determine the result of the fusion processing as the target detection result.
In the target detection method provided by the embodiments of the present invention, the original image captured by the pan-tilt camera is divided equally into left and right halves, resized, and processed in parallel by the two inference engines. After the preliminary analysis results of the two inference engines are obtained, the case in which the same target is divided between the two sub-images is fully considered, and the analysis results of the targets judged to have been divided are fused, so that an accurate target detection result is finally obtained. As can be seen from the above example, the detection resolution is expanded from 512*512 to 1024*512, which improves the detection success rate for small targets; with the two inference engines operating in parallel, the time consumed is the same as that of a single inference engine processing a 512*512 image; and compared with resizing a 16:9 image to 1:1, resizing an 8:9 image to 1:1 retains more image information, further improving the accuracy and success rate of target detection.
Embodiment 4
FIG. 13 is a structural block diagram of a target detection apparatus according to Embodiment 4 of the present invention. The apparatus may be implemented by software and/or hardware, may generally be integrated in a UAV in which at least two image analysis units are integrated, and may perform target detection by executing the target detection method. As shown in FIG. 13, the apparatus includes:
an image acquisition module 401, configured to acquire a first image captured by the pan-tilt camera;
an image segmentation module 402, configured to perform segmentation processing on the first image to obtain at least two segmented images;
a resizing module 403, configured to perform a resizing operation on the at least two segmented images to obtain at least two sub-images, where the resolutions of the at least two sub-images match the resolutions corresponding to the at least two image analysis units; and
a target detection module 404, configured to input the at least two sub-images into the at least two image analysis units and determine a target detection result according to the analysis results of the at least two image analysis units.
The target detection apparatus provided in the embodiments of the present invention is applied to a UAV in which at least two image analysis units are integrated. A first image captured by the pan-tilt camera is acquired, the first image is segmented, and a resizing operation is performed on the at least two segmented images so that the resolutions of the obtained at least two sub-images match the resolutions corresponding to the at least two image analysis units. The at least two sub-images are input into the at least two image analysis units, and the target detection result is determined according to the analysis results of the at least two image analysis units. With this technical solution, segmenting the original image captured by the pan-tilt camera before the resizing operation reduces the amount of image information that is lost; the segments are then analyzed by the at least two image analysis units, which improves the accuracy and success rate of target detection.
Optionally, determining the target detection result according to the analysis results of the at least two image analysis units includes:
acquiring the analysis results of the at least two image analysis units; and
performing fusion processing on the at least two analysis results to obtain the target detection result.
Optionally, the analysis results include the type information and position information of the detected targets, and performing fusion processing on the at least two analysis results includes:
successively recording every two adjacent sub-images as a current sub-image pair, the current sub-image pair including a first sub-image and a second sub-image, and performing the following operations on the current sub-image pair:
determining a first target in the first sub-image and a second target in the second sub-image; and
determining, according to first position information and first type information corresponding to the first target and second position information and second type information corresponding to the second target, whether the first target and the second target correspond to the same target, and if so, fusing the first target and the second target into the same target.
Optionally, the position information includes the coordinates of a rectangular frame that contains the image corresponding to the detected target; the rectangular frame corresponding to the first target is recorded as a first rectangular frame, and the rectangular frame corresponding to the second target is recorded as a second rectangular frame.
Determining, according to the first position information and first type information corresponding to the first target and the second position information and second type information corresponding to the second target, whether the first target and the second target correspond to the same target includes:
calculating, according to the coordinates of the first rectangular frame and the coordinates of the second rectangular frame, a first distance between the first boundary of the first rectangular frame and the third boundary of the second rectangular frame, a second distance between the third boundary of the first rectangular frame and the first boundary of the second rectangular frame, a third distance between the second boundary of the first rectangular frame and the fourth boundary of the second rectangular frame, and a fourth distance between the fourth boundary of the first rectangular frame and the second boundary of the second rectangular frame, where in each rectangular frame the first boundary is parallel to the third boundary and the second boundary is parallel to the fourth boundary; when the first sub-image and the second sub-image are left-right adjacent, the first boundary of each rectangular frame is its left boundary, and when the first sub-image and the second sub-image are top-bottom adjacent, the first boundary of each rectangular frame is its upper boundary;
calculating a first ratio of the smaller to the larger of the first distance and the second distance, and calculating a second ratio of the smaller to the larger of the third distance and the fourth distance; and
when the first ratio is smaller than a first preset threshold, the second ratio is larger than a second preset threshold, and the first type information is the same as the second type information, determining that the first target and the second target correspond to the same target, where the first preset threshold is smaller than the second preset threshold.
Optionally, fusing the first target and the second target into the same target includes:
determining a target rectangular frame according to the coordinates of the first rectangular frame and the coordinates of the second rectangular frame, the target rectangular frame containing both the first rectangular frame and the second rectangular frame; and
determining the target rectangular frame and the first type information as the analysis result corresponding to the fused target.
Optionally, the boundary of the first sub-image that coincides with the second sub-image is recorded as the segmentation boundary, and determining the first target in the first sub-image includes:
acquiring the candidate targets detected in the first sub-image; and
for each candidate target, determining a fifth distance between the current candidate target and the segmentation boundary, and if the fifth distance is smaller than a third preset threshold, determining the current candidate target as a first target.
Optionally, the at least two image analysis units include at least two Neural Network Inference Engines (NNIE) that perform accelerated forward inference for neural networks.
Embodiment 5
An embodiment of the present invention further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform a target detection method, the method including:
acquiring a first image captured by the pan-tilt camera;
performing segmentation processing on the first image to obtain at least two segmented images;
performing a resizing operation on the at least two segmented images to obtain at least two sub-images, where the resolutions of the at least two sub-images match the resolutions corresponding to the at least two image analysis units; and
inputting the at least two sub-images into the at least two image analysis units, and determining a target detection result according to the analysis results of the at least two image analysis units.
Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media, such as CD-ROMs, floppy disks, or tape devices; computer system memory or random access memory, such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, and the like; non-volatile memory, such as flash memory or magnetic media (for example, a hard disk or optical storage); registers or other similar types of memory elements, and the like. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in the first computer system in which the program is executed, or may be located in a different, second computer system connected to the first computer system through a network (such as the Internet). The second computer system may provide program instructions to the first computer for execution. The term "storage medium" may include two or more storage media that may reside in different locations (for example, in different computer systems connected through a network). The storage medium may store program instructions (for example, embodied as a computer program) executable by one or more processors.
Of course, in the storage medium containing computer-executable instructions provided by the embodiments of the present invention, the computer-executable instructions are not limited to the target detection operations described above and may also perform related operations in the target detection method provided by any embodiment of the present invention.
Embodiment 6
An embodiment of the present invention provides a UAV in which the target detection apparatus provided by the embodiments of the present invention may be integrated. FIG. 14 is a structural block diagram of a UAV according to Embodiment 6 of the present invention. The UAV 500 may include: a memory 501, a processor 502, at least two image analysis units 503 (only one is shown in the figure), and a computer program stored on the memory 501 and executable on the processor, where the processor 502 implements the target detection method described in the embodiments of the present invention when executing the computer program. The method may include:
acquiring a first image captured by the pan-tilt camera;
performing segmentation processing on the first image to obtain at least two segmented images;
performing a resizing operation on the at least two segmented images to obtain at least two sub-images, where the resolutions of the at least two sub-images match the resolutions corresponding to the at least two image analysis units; and
inputting the at least two sub-images into the at least two image analysis units, and determining a target detection result according to the analysis results of the at least two image analysis units.
In the UAV provided by the embodiments of the present invention, segmenting the original image captured by the pan-tilt camera before the resizing operation reduces the amount of image information that is lost; the segments are then analyzed by at least two image analysis units, which improves the accuracy and success rate of target detection.
The target detection apparatus, storage medium, and unmanned aerial vehicle provided in the above embodiments can execute the target detection method provided by any embodiment of the present invention and have the corresponding functional modules and beneficial effects for executing the method. For technical details not described in detail in the above embodiments, reference may be made to the target detection method provided by any embodiment of the present invention.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, the present invention is not limited to the above embodiments and may include more equivalent embodiments without departing from the concept of the present invention; the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

  1. A target detection method, applied to an unmanned aerial vehicle in which at least two image analysis units are integrated, the method comprising:
    acquiring a first image captured by a pan-tilt camera;
    performing segmentation processing on the first image to obtain at least two segmented images;
    performing a resizing operation on the at least two segmented images to obtain at least two sub-images, wherein the resolutions of the at least two sub-images match the resolutions corresponding to the at least two image analysis units; and
    inputting the at least two sub-images into the at least two image analysis units, and determining a target detection result according to analysis results of the at least two image analysis units.
  2. The method according to claim 1, wherein determining the target detection result according to the analysis results of the at least two image analysis units comprises:
    acquiring the analysis results of the at least two image analysis units; and
    performing fusion processing on the at least two analysis results to obtain the target detection result.
  3. The method according to claim 2, wherein the analysis results comprise type information and position information of detected targets, and performing fusion processing on the at least two analysis results comprises:
    successively recording every two adjacent sub-images as a current sub-image pair, the current sub-image pair comprising a first sub-image and a second sub-image, and performing the following operations on the current sub-image pair:
    determining a first target in the first sub-image and a second target in the second sub-image; and
    determining, according to first position information and first type information corresponding to the first target and second position information and second type information corresponding to the second target, whether the first target and the second target correspond to the same target, and if so, fusing the first target and the second target into the same target.
  4. The method according to claim 3, wherein the position information comprises coordinates of a rectangular frame that contains an image corresponding to the detected target, the rectangular frame corresponding to the first target is recorded as a first rectangular frame, and the rectangular frame corresponding to the second target is recorded as a second rectangular frame; and
    determining, according to the first position information and the first type information corresponding to the first target and the second position information and the second type information corresponding to the second target, whether the first target and the second target correspond to the same target comprises:
    calculating, according to the coordinates of the first rectangular frame and the coordinates of the second rectangular frame, a first distance between the first boundary of the first rectangular frame and the third boundary of the second rectangular frame, a second distance between the third boundary of the first rectangular frame and the first boundary of the second rectangular frame, a third distance between the second boundary of the first rectangular frame and the fourth boundary of the second rectangular frame, and a fourth distance between the fourth boundary of the first rectangular frame and the second boundary of the second rectangular frame, wherein in each rectangular frame the first boundary is parallel to the third boundary and the second boundary is parallel to the fourth boundary; when the first sub-image and the second sub-image are left-right adjacent, the first boundary of each rectangular frame is its left boundary, and when the first sub-image and the second sub-image are top-bottom adjacent, the first boundary of each rectangular frame is its upper boundary;
    calculating a first ratio of the smaller to the larger of the first distance and the second distance, and calculating a second ratio of the smaller to the larger of the third distance and the fourth distance; and
    when the first ratio is smaller than a first preset threshold, the second ratio is larger than a second preset threshold, and the first type information is the same as the second type information, determining that the first target and the second target correspond to the same target, wherein the first preset threshold is smaller than the second preset threshold.
  5. The method according to claim 4, wherein fusing the first target and the second target into the same target comprises:
    determining a target rectangular frame according to the coordinates of the first rectangular frame and the coordinates of the second rectangular frame, the target rectangular frame containing both the first rectangular frame and the second rectangular frame; and
    determining the target rectangular frame and the first type information as an analysis result corresponding to the fused target.
  6. The method according to claim 3, wherein a boundary of the first sub-image that coincides with the second sub-image is recorded as a segmentation boundary, and determining the first target in the first sub-image comprises:
    acquiring candidate targets detected in the first sub-image; and
    for each candidate target, determining a fifth distance between the current candidate target and the segmentation boundary, and if the fifth distance is smaller than a third preset threshold, determining the current candidate target as the first target.
  7. The method according to any one of claims 1 to 6, wherein the at least two image analysis units comprise at least two Neural Network Inference Engines (NNIE) that perform accelerated forward inference for neural networks.
  8. A target detection apparatus, applied to an unmanned aerial vehicle in which at least two image analysis units are integrated, the apparatus comprising:
    an image acquisition module, configured to acquire a first image captured by a pan-tilt camera;
    an image segmentation module, configured to perform segmentation processing on the first image to obtain at least two segmented images;
    a resizing module, configured to perform a resizing operation on the at least two segmented images to obtain at least two sub-images, wherein the resolutions of the at least two sub-images match the resolutions corresponding to the at least two image analysis units; and
    a target detection module, configured to input the at least two sub-images into the at least two image analysis units and determine a target detection result according to analysis results of the at least two image analysis units.
  9. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 7.
  10. An unmanned aerial vehicle, comprising a memory, at least two image analysis units, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method according to any one of claims 1 to 7 when executing the computer program.
PCT/CN2020/124055 2019-11-01 2020-10-27 Target detection method and apparatus, storage medium and unmanned aerial vehicle WO2021083151A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911060737.1A CN110796104A (en) 2019-11-01 2019-11-01 Target detection method and device, storage medium and unmanned aerial vehicle
CN201911060737.1 2019-11-01

Publications (1)

Publication Number Publication Date
WO2021083151A1 true WO2021083151A1 (en) 2021-05-06

Family

ID=69442491

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124055 WO2021083151A1 (en) 2019-11-01 2020-10-27 Target detection method and apparatus, storage medium and unmanned aerial vehicle

Country Status (2)

Country Link
CN (1) CN110796104A (en)
WO (1) WO2021083151A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110796104A (en) * 2019-11-01 2020-02-14 深圳市道通智能航空技术有限公司 Target detection method and device, storage medium and unmanned aerial vehicle
CN111951328A (en) * 2020-08-05 2020-11-17 清华大学苏州汽车研究院(吴江) Object position detection method, device, equipment and storage medium
CN112101134B (en) * 2020-08-24 2024-01-02 深圳市商汤科技有限公司 Object detection method and device, electronic equipment and storage medium
CN112348835B (en) * 2020-11-30 2024-04-16 广联达科技股份有限公司 Material quantity detection method and device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103149939A (en) * 2013-02-26 2013-06-12 北京航空航天大学 Dynamic target tracking and positioning method of unmanned plane based on vision
US20140336848A1 (en) * 2013-05-10 2014-11-13 Palo Alto Research Center Incorporated System and method for detecting, tracking and estimating the speed of vehicles from a mobile platform
CN104966092A (en) * 2015-06-16 2015-10-07 中国联合网络通信集团有限公司 Image processing method and device
CN107909600A (en) * 2017-11-04 2018-04-13 南京奇蛙智能科技有限公司 The unmanned plane real time kinematics target classification and detection method of a kind of view-based access control model
CN109765939A (en) * 2018-12-21 2019-05-17 中国科学院自动化研究所南京人工智能芯片创新研究院 Cloud platform control method, device and the storage medium of unmanned plane
CN110796104A (en) * 2019-11-01 2020-02-14 深圳市道通智能航空技术有限公司 Target detection method and device, storage medium and unmanned aerial vehicle

Also Published As

Publication number Publication date
CN110796104A (en) 2020-02-14

Similar Documents

Publication Publication Date Title
WO2021083151A1 (en) Target detection method and apparatus, storage medium and unmanned aerial vehicle
CN110163904B (en) Object labeling method, movement control method, device, equipment and storage medium
Zhou et al. Robust real-time UAV based power line detection and tracking
CN110825101B (en) Unmanned aerial vehicle autonomous landing method based on deep convolutional neural network
CN111080693A (en) Robot autonomous classification grabbing method based on YOLOv3
US11120280B2 (en) Geometry-aware instance segmentation in stereo image capture processes
WO2021253245A1 (en) Method and device for identifying vehicle lane changing tendency
CN112740268A (en) Target detection method and device
CN112598922B (en) Parking space detection method, device, equipment and storage medium
CN115376109B (en) Obstacle detection method, obstacle detection device, and storage medium
CN111738033B (en) Vehicle driving information determination method and device based on plane segmentation and vehicle-mounted terminal
CN110568861A (en) Man-machine movement obstacle monitoring method, readable storage medium and unmanned machine
CN112697134A (en) Environment sensing method, system and equipment of indoor inspection robot and computer readable storage medium thereof
WO2022198175A1 (en) Systems and methods for generating object detection labels using foveated image magnification for autonomous driving
CN114841910A (en) Vehicle-mounted lens shielding identification method and device
CN114967731A (en) Unmanned aerial vehicle-based automatic field personnel searching method
CN114627395B (en) Multi-rotor unmanned aerial vehicle angle analysis method, system and terminal based on nested targets
CN114596239A (en) Loading and unloading event detection method and device, computer equipment and storage medium
CN112286230A (en) Unmanned aerial vehicle visual image algorithm, obstacle avoidance step and information fusion processing system thereof
CN112706159A (en) Robot control method and device and robot
CN114396911B (en) Obstacle ranging method, device, equipment and storage medium
CN113139984B (en) Long-time unmanned aerial vehicle target tracking method and system integrating detection and tracking
CN114549976A (en) Multi-camera-based track measurement method and system for mobile robot
Perevoshchikova et al. Algorithms for Recognizing Railway Infrastructure Objects in Images
WO2020202636A1 (en) Information processing method and information processing system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20882331

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20882331

Country of ref document: EP

Kind code of ref document: A1