CN114170535A - Target detection positioning method, device, controller, storage medium and unmanned aerial vehicle - Google Patents

Target detection positioning method, device, controller, storage medium and unmanned aerial vehicle

Info

Publication number
CN114170535A
Authority
CN
China
Prior art keywords
target
coordinate system
point
image
binocular
Prior art date
Legal status
Withdrawn
Application number
CN202210128714.5A
Other languages
Chinese (zh)
Inventor
罗巍
任雪峰
Current Assignee
Beijing Zhuoyi Intelligent Technology Co Ltd
Original Assignee
Beijing Zhuoyi Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Zhuoyi Intelligent Technology Co Ltd
Priority to CN202210128714.5A
Publication of CN114170535A
Legal status: Withdrawn (current)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G06F 18/25: Fusion techniques
    • G06F 18/253: Fusion techniques of extracted features

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target detection and positioning method, a target detection and positioning device, a controller, a storage medium and an unmanned aerial vehicle. The method comprises the following steps: acquiring binocular images acquired by a binocular vision perception module and constructing a depth map of the binocular images; identifying a target object in the binocular images based on the depth map; determining target world coordinates of the target object in a world coordinate system; and determining spatial position parameters of the target object according to the flight data of the unmanned aerial vehicle and the target world coordinates. The invention detects and positions the target object by combining unmanned aerial vehicle remote sensing with binocular stereo vision, and can accurately detect and position a target object against a complex background.

Description

Target detection positioning method, device, controller, storage medium and unmanned aerial vehicle
Technical Field
The invention relates to the technical field of unmanned aerial vehicles, in particular to an unmanned aerial vehicle-based target detection positioning method, an unmanned aerial vehicle-based target detection positioning device, a controller, a computer-readable storage medium and an unmanned aerial vehicle.
Background
Overhead transmission lines connect power plants, substations and consumers to form the power transmission and distribution network. In transmission lines, insulators are basic equipment in wide use that perform the dual functions of electrical insulation and mechanical support. The failure of an insulator directly threatens the stability and safety of the transmission line. According to statistics, accidents caused by insulator faults account for the highest proportion of power system failures. Therefore, monitoring the state of insulators is of great significance to the safety and stability of the power system. Traditional manual inspection has high labor cost and low efficiency. In addition, adverse factors such as climate and geographical environment may limit manual inspection, so that many potential hazards are not discovered in time.
At present, the detection and positioning of insulators are mainly carried out manually, which is obviously inefficient and costly. To overcome the limitations of manual inspection, it is necessary to develop automatic detection techniques to assist or replace manual decision-making. However, images taken by drones often contain cluttered backgrounds including vegetation, rivers, roads, houses and the like. Furthermore, the appearance of insulators in such images varies because of the diversity of insulator types and the differences in illumination conditions and shooting angles in actual inspection scenes. These factors make it difficult to detect insulators in aerial images.
Existing methods are based on two-dimensional images of transmission line scenes and generally rely on simplifying assumptions, especially about the size, shape and background of the insulators. In practical applications, however, insulators vary widely in shape, appearance and size due to differences in shooting angle and lighting conditions, and their backgrounds are complex and variable, which poses great challenges for image recognition.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a target detection and positioning method based on an unmanned aerial vehicle, a target detection and positioning device based on an unmanned aerial vehicle, a controller, a computer-readable storage medium and an unmanned aerial vehicle, which realize automatic identification and positioning of a target object by fusing unmanned aerial vehicle remote sensing with binocular stereoscopic vision technology.
In order to solve the above technical problem, according to an aspect of the present invention, there is provided a target detection and positioning method based on an unmanned aerial vehicle, including: acquiring binocular images acquired by a binocular vision perception module, and constructing a depth map of the binocular images;
identifying a target object in the binocular image based on the depth map;
determining target world coordinates of the target object in a world coordinate system;
and determining the spatial position parameters of the target object according to the flight data of the unmanned aerial vehicle and the target world coordinates.
In some embodiments, the step of acquiring binocular images acquired by the binocular vision perception module and constructing the depth map of the binocular images comprises:
performing region segmentation on both the left image and the right image of the binocular image to obtain a left region segmentation image and a right region segmentation image;
extracting first feature points of the left region segmentation image and second feature points of the right region segmentation image;
matching the first feature points and the second feature points according to the Euclidean distance between them to obtain a plurality of groups of original feature point pairs;
selecting sparse disparity point pairs from the original feature point pairs, and calculating depth values of the sparse disparity point pairs;
counting the projection distribution of the sparse disparity point pairs, and taking the average disparity of the sparse disparity point pairs of each segmented region as the disparity value of the corresponding region;
and creating a sparse disparity map based on the disparity values to obtain the depth map of the binocular images.
In some embodiments, the selecting a sparse disparity point pair from the original feature point pair and calculating a depth value of the sparse disparity point pair includes:
connecting each group of the original feature point pairs, and calculating the slope of the connecting line of each group of the original feature point pairs;
taking the slope with the highest frequency of occurrence among the plurality of slopes as the main slope;
retaining, as the sparse disparity point pairs, the original feature point pairs corresponding to the slopes that are the same as the main slope;
and calculating the depth value of the sparse disparity point pair.
In some embodiments, the step of identifying the target object in the binocular image based on the depth map comprises:
fusing the image features of the binocular images to obtain a two-dimensional saliency map;
refining the two-dimensional saliency of the two-dimensional saliency map using the depth map to obtain a depth saliency map;
and binarizing and skeletonizing the depth saliency map, and identifying the target object in the depth saliency map according to preset features of the target object.
In some embodiments, the step of determining target world coordinates of the target object in a world coordinate system comprises:
setting the optical center of one camera in the binocular vision perception module as an origin, and setting the optical axis of the camera as a Z axis so as to establish a pixel coordinate system;
determining target pixel coordinates of a target point of the target object in the pixel coordinate system;
and converting the target pixel coordinate into the target world coordinate according to the conversion relation between the pixel coordinate system and the world coordinate system.
In some embodiments, the conversion relationship between the pixel coordinate system and the world coordinate system is:
$$
\begin{bmatrix} X \\ Y \\ Z \\ W \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & -u_0 \\
0 & 1 & 0 & -v_0 \\
0 & 0 & 0 & f \\
0 & 0 & 1/b & 0
\end{bmatrix}
\begin{bmatrix} u \\ v \\ d \\ 1 \end{bmatrix},
\qquad
(X_w,\ Y_w,\ Z_w) = \left(\frac{X}{W},\ \frac{Y}{W},\ \frac{Z}{W}\right)
$$
wherein the target world coordinates of the target point are (X_w, Y_w, Z_w); the coordinates of the target point in the pixel coordinate system are (u, v) and the coordinates of the origin in the pixel coordinate system are (u_0, v_0); the distance between the two cameras of the binocular vision perception module is defined as the baseline distance b; f is the focal length obtained by calibration; the disparity value of the target point is d = u_l − u_r, where (u_l, v_l) and (u_r, v_r) are the coordinates of the target point in the pixel coordinate systems of the two cameras; and X, Y, Z, W are the four-dimensional homogeneous coordinates of the target point in the world coordinate system.
In some embodiments, the flight data of the drone includes: longitude and latitude, altitude, pitch angle, azimuth angle, roll angle of the unmanned aerial vehicle and pitch angle of the camera;
the spatial position parameters of the target object comprise: the latitude and longitude of the target object and its height above sea level.
According to another aspect of the present invention, there is provided an unmanned aerial vehicle-based target detection positioning apparatus, including:
the construction module is configured to acquire binocular images acquired by the binocular vision perception module and construct a depth map of the binocular images;
the identification module is configured to identify a target object in the binocular image based on the depth map;
a determination module configured to determine target world coordinates of the target object in a world coordinate system;
and the positioning module is configured to determine the spatial position parameters of the target object according to the flight data of the unmanned aerial vehicle and the target world coordinates.
In some embodiments, the building module comprises:
the segmentation submodule is configured to perform region segmentation on both the left image and the right image of the binocular image to obtain a left region segmentation image and a right region segmentation image;
an extraction submodule configured to extract a first feature point of the left region segmentation image and a second feature point in the right region segmentation image;
the matching submodule is configured to match the first feature points and the second feature points according to the Euclidean distance between them to obtain a plurality of groups of original feature point pairs;
the calculation submodule is configured to select sparse disparity point pairs from the original feature point pairs and calculate depth values of the sparse disparity point pairs;
the statistics submodule is configured to count the projection distribution of the sparse disparity point pairs, and to take the average disparity of the sparse disparity point pairs of each segmented region as the disparity value of the corresponding region;
and the creating sub-module is configured to create a sparse disparity map based on the disparity values so as to obtain a depth map of the binocular image.
In some embodiments, the computation submodule comprises:
a slope calculation unit configured to connect each group of the original feature point pairs and calculate a slope of a connection line of each group of the original feature point pairs;
a main slope determination unit configured to take a slope having the highest frequency of occurrence among the plurality of slopes as a main slope;
a sparse disparity point pair determining unit configured to retain, as the sparse disparity point pair, the original feature point pair corresponding to a slope that is the same as the main slope among the plurality of slopes;
a depth value calculation unit configured to calculate depth values of the sparse disparity point pairs.
In some embodiments, the identification module comprises:
the fusion sub-module is configured to fuse the image features of the binocular images to obtain a two-dimensional saliency map;
the refinement submodule is configured to refine the two-dimensional saliency of the two-dimensional saliency map using the depth map to obtain a depth saliency map;
and the recognition submodule is configured to binarize and skeletonize the depth saliency map, and to identify the target object in the depth saliency map according to preset features of the target object.
In some embodiments, the determining module comprises:
the establishing submodule is configured to set an optical center of one camera in the binocular vision perception module as an origin and set an optical axis of the camera as a Z axis so as to establish a pixel coordinate system;
a pixel coordinate determination submodule configured to determine target pixel coordinates of a target point of the target object within the pixel coordinate system;
and the conversion sub-module is configured to convert the target pixel coordinate into the target world coordinate according to the conversion relation between the pixel coordinate system and the world coordinate system.
In some embodiments, the conversion relationship between the pixel coordinate system and the world coordinate system is:
$$
\begin{bmatrix} X \\ Y \\ Z \\ W \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & -u_0 \\
0 & 1 & 0 & -v_0 \\
0 & 0 & 0 & f \\
0 & 0 & 1/b & 0
\end{bmatrix}
\begin{bmatrix} u \\ v \\ d \\ 1 \end{bmatrix},
\qquad
(X_w,\ Y_w,\ Z_w) = \left(\frac{X}{W},\ \frac{Y}{W},\ \frac{Z}{W}\right)
$$
wherein the target world coordinates of the target point are (X_w, Y_w, Z_w); the coordinates of the target point in the pixel coordinate system are (u, v) and the coordinates of the origin in the pixel coordinate system are (u_0, v_0); the distance between the two cameras of the binocular vision perception module is defined as the baseline distance b; f is the focal length obtained by calibration; the disparity value of the target point is d = u_l − u_r, where (u_l, v_l) and (u_r, v_r) are the coordinates of the target point in the pixel coordinate systems of the two cameras; and X, Y, Z, W are the four-dimensional homogeneous coordinates of the target point in the world coordinate system.
In some embodiments, the flight data of the drone includes: longitude and latitude, altitude, pitch angle, azimuth angle, roll angle of the unmanned aerial vehicle and pitch angle of the camera;
the spatial position parameters of the target object comprise: the latitude and longitude of the target object and its height above sea level.
According to another aspect of the present invention, there is provided a controller comprising a memory and a processor, the memory storing a computer program, which when executed by the processor, is capable of implementing the steps of any of the above-mentioned drone-based target detection and positioning methods.
According to another aspect of the present invention, there is provided a computer-readable storage medium for storing a computer program, which when executed by a computer or a processor implements the steps of the drone-based target detection and positioning method according to any one of the above.
According to another aspect of the present invention, there is provided an unmanned aerial vehicle comprising a binocular vision perception module, the unmanned aerial vehicle being configured to perform the unmanned aerial vehicle-based target detection and positioning method according to any one of the above.
Compared with the prior art, the invention has obvious advantages and beneficial effects. By means of the above technical scheme, the unmanned aerial vehicle-based target detection and positioning method, the unmanned aerial vehicle-based target detection and positioning device, the controller, the computer-readable storage medium and the unmanned aerial vehicle of the invention achieve considerable technical progress and practicability, have wide industrial utilization value, and have at least the following advantages:
Firstly, the invention detects and positions the target object by combining unmanned aerial vehicle remote sensing with binocular stereo vision, and can accurately detect and position a target object against a complex background.
Secondly, the target object detection algorithm based on depth-map saliency and skeleton structure features can accurately detect the target object in aerial images with complex backgrounds.
Thirdly, through real-time spatial positioning of the object based on binocular stereo vision, GPS positioning and coordinate conversion, the invention can accurately acquire the longitude and latitude of the target object.
The foregoing description is only an overview of the technical solutions of the present invention. In order to make the technical means of the present invention more clearly understood and implementable in accordance with the content of the description, and in order to make the above and other objects, features and advantages of the present invention more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
Fig. 1 is a schematic flow chart of a target detection and positioning method based on an unmanned aerial vehicle according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating an imaging geometry model of a camera according to an embodiment of the present invention;
FIG. 3 is a schematic view of the binocular stereo vision combined with unmanned aerial vehicle GPS for spatial positioning of a target object according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the transformation of the geographic coordinate system and the world coordinate system according to an embodiment of the present invention;
fig. 5 is a schematic block diagram of an unmanned aerial vehicle-based target detection positioning apparatus according to an embodiment of the present invention;
FIG. 6 is a block diagram of the structure of the construction module shown in FIG. 5;
FIG. 7 is a block diagram of the structure of the calculation submodule shown in FIG. 6;
FIG. 8 is a block diagram of the structure of the identification module shown in FIG. 5;
fig. 9 is a block diagram showing the structure of the determination module shown in fig. 5.
Detailed Description
To further illustrate the technical means adopted by the present invention to achieve the intended objects and their effects, specific embodiments of the unmanned aerial vehicle-based target detection and positioning method, the unmanned aerial vehicle-based target detection and positioning device, the controller, the computer-readable storage medium and the unmanned aerial vehicle according to the present invention, and their effects, are described in detail below with reference to the accompanying drawings and preferred embodiments.
The invention provides a target detection and spatial positioning method for unmanned aerial vehicle aerial images with complex backgrounds. Firstly, the left image of the binocular vision system is subjected to region segmentation, and sparse disparity points are calculated through stereo matching. The region segmentation result is combined with the sparse disparity points to generate a depth map, which reflects the influence of spatial position on visual saliency. Then, a two-stage strategy is proposed to achieve accurate detection of the target. First, candidate regions of the target object are determined by an RGB-D saliency detection method that fuses color contrast, texture contrast and depth features. Next, a feature descriptor of the skeleton structure of the target object is defined, a structure search is performed on the candidate target object regions, and false targets are filtered out, realizing accurate detection of the target object. Finally, the spatial position of the target object is acquired using binocular stereo vision and the GPS coordinates of the unmanned aerial vehicle.
Based on this, the invention provides a target detection and positioning method based on an unmanned aerial vehicle. As shown in fig. 1, the method comprises:
and step S10, acquiring binocular images acquired by the binocular vision perception module, and constructing a depth map of the binocular images.
Specifically, step S10 includes:
step S101, performing region segmentation on both the left image and the right image of the binocular image to obtain a left region segmentation image and a right region segmentation image.
A good region segmentation result improves the accuracy of the disparity boundaries. The invention adopts image segmentation based on Mask R-CNN. Mask R-CNN is a convolutional neural network proposed on the basis of the earlier Faster R-CNN framework; the network effectively detects targets while producing high-quality semantic segmentation. The main idea of Mask R-CNN is to extend the original Faster R-CNN by adding a branch that predicts segmentation masks in parallel with the existing detection branch. At the same time, the network structure is easy to implement and train, and it runs fast.
In the invention, region segmentation is performed, based on Mask R-CNN, on both the left image and the right image of the binocular images acquired by the binocular vision perception module, yielding the left region segmentation image and the right region segmentation image.
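As an illustration only (not the filed embodiment), the region segmentation step can be sketched with an off-the-shelf Mask R-CNN from torchvision; the pretrained COCO weights and the score threshold used below are assumptions, and in practice the network would be fine-tuned on insulator imagery.

```python
# Illustrative sketch of the Mask R-CNN region segmentation step (assumed tooling:
# torchvision with pretrained COCO weights; a production system would fine-tune
# the model on insulator data before use).
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def segment_regions(image_bgr, score_thresh=0.5):
    """Return binary region masks for one image (H x W x 3, BGR, uint8)."""
    img = to_tensor(image_bgr[:, :, ::-1].copy())   # BGR -> RGB, scaled to [0, 1]
    with torch.no_grad():
        out = model([img])[0]
    keep = out["scores"] > score_thresh
    # Each predicted mask is a soft (1, H, W) map; threshold it to a binary region.
    return [(m[0] > 0.5).cpu().numpy() for m in out["masks"][keep]]
```

Applying this to the left and right images yields the left and right region segmentation images used in the following steps.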
Step S102, extracting first feature points of the left region segmentation image and second feature points of the right region segmentation image.
Specifically, Speeded Up Robust Features (SURF) is a fast and stable algorithm for detecting feature points. In the present invention, SURF is used to extract the feature points of the left and right region segmentation images and to calculate their 64-dimensional descriptors.
The descriptors of the first and second feature points are built from the distributions of first-order Haar wavelet responses in the x and y directions of the image coordinates rather than from gradients; speed is increased by using integral images, and only the 64-dimensional descriptor of each feature point is used. Adopting the 64-dimensional descriptor effectively reduces the feature matching time and improves the robustness of feature point extraction.
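A minimal sketch of the SURF extraction, assuming OpenCV's contrib build (opencv-contrib-python with the non-free modules enabled); the Hessian threshold is an assumed value, and extended=False keeps the 64-dimensional descriptor described above.

```python
# Sketch only: SURF keypoints with 64-D descriptors via OpenCV's contrib module.
import cv2

def surf_features(gray_image, hessian_thresh=400):
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=hessian_thresh,
                                       extended=False)   # 64-D descriptors
    keypoints, descriptors = surf.detectAndCompute(gray_image, None)
    return keypoints, descriptors
```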
Step S103, matching the first feature points and the second feature points according to the Euclidean distance between them to obtain a plurality of groups of original feature point pairs.
Step S104, selecting the sparse disparity point pairs from the original feature point pairs and calculating the depth values of the sparse disparity point pairs.
In step S104, each group of original feature point pairs is connected, the slope of the connecting line of each group of original feature point pairs is calculated, the slope with the highest frequency of occurrence among the slopes is taken as the main slope, the original feature point pairs whose slope is the same as the main slope are retained as the sparse disparity point pairs, and the depth values of the sparse disparity point pairs are calculated.
Some mismatched feature point pairs exist among the multiple groups of original feature point pairs obtained in step S103, and the invention uses slope consistency to eliminate the mismatched original feature point pairs. First, each original matching point pair is connected by a line and its slope in the image coordinate system is calculated. Then, the frequency of occurrence of each slope is counted, and the slope with the highest frequency is taken as the main slope. Matching point pairs whose slope equals the main slope are retained and defined as the sparse disparity point pairs. Finally, the depth value Z_spp of each sparse disparity point pair is calculated.
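The matching and slope-consistency filtering of steps S103 and S104 can be sketched as follows; this is a hedged illustration under the stated assumptions (Euclidean-distance matching, retention of the most frequent connecting-line slope, depth Z = f·b/d), and the slope rounding precision and variable names are illustrative choices, not taken from the filing.

```python
# Sketch of steps S103-S104: Euclidean matching, slope-consistency filtering,
# and depth recovery Z_spp = f * b / d for the surviving sparse disparity pairs.
import numpy as np
import cv2

def sparse_depth_pairs(kp_l, des_l, kp_r, des_r, focal_px, baseline_m):
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)   # Euclidean distance
    matches = matcher.match(des_l, des_r)

    pts_l = np.float32([kp_l[m.queryIdx].pt for m in matches])
    pts_r = np.float32([kp_r[m.trainIdx].pt for m in matches])

    # Slope of the segment joining each matched pair when both images are
    # overlaid in one coordinate frame; the most frequent slope is the main slope.
    slopes = (pts_r[:, 1] - pts_l[:, 1]) / (pts_r[:, 0] - pts_l[:, 0] + 1e-9)
    rounded = np.round(slopes, 2)
    values, counts = np.unique(rounded, return_counts=True)
    keep = rounded == values[np.argmax(counts)]

    disparity = pts_l[keep, 0] - pts_r[keep, 0]       # d = u_l - u_r
    valid = disparity > 0
    depth = focal_px * baseline_m / disparity[valid]  # Z_spp = f * b / d
    return pts_l[keep][valid], depth
```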
Step S105, counting the projection distribution of the sparse disparity point pairs, and taking the average disparity of the sparse disparity point pairs falling in each segmented region as the disparity value of the corresponding region.
It is known that disparity information can be used to estimate the general depth information of an image. Based on this, the depth map of the binocular images is constructed from the disparity values of the segmented regions.
Step S106, creating a sparse disparity map based on the region disparity values counted in step S105 to obtain the depth map of the binocular images, denoted D_spp.
In the perspective projection imaging model, the mapping of a 3D scene to a 2D image is a process in which depth information is lost. Therefore, it is necessary to estimate the depth information of the scene from binocular disparity cues. However, the correspondence point matching problem is difficult, and the speed and precision of existing depth calculation methods may be unstable in practical applications, so a new depth estimation method is proposed. The method constructs the depth map by combining the region segmentation result with the sparse disparity points. It is assumed that the disparity values on the same object are the same, and in order to ensure the accuracy of the depth boundaries, the obtained region segmentation result is used to assist the construction of the depth map.
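Under the stated assumption that all pixels of one segmented region share that region's average disparity, steps S105 and S106 can be sketched as below; the region masks come from the segmentation stage and the sparse points and depths from the matching stage above, and the names and the zero fallback for unsupported regions are illustrative.

```python
# Sketch of steps S105-S106: assign each segmented region the average disparity of
# the sparse disparity points that fall inside it, then convert to depth (D_spp).
import numpy as np

def region_depth_map(shape_hw, region_masks, pts, depths, focal_px, baseline_m):
    depth_map = np.zeros(shape_hw, dtype=np.float32)
    disparities = focal_px * baseline_m / depths          # back to d = f * b / Z
    cols = pts[:, 0].astype(int)
    rows = pts[:, 1].astype(int)
    for mask in region_masks:
        inside = mask[rows, cols]                         # sparse points inside this region
        if not inside.any():
            continue                                      # unsupported region stays at 0
        mean_d = disparities[inside].mean()
        depth_map[mask] = focal_px * baseline_m / mean_d  # one depth value per region
    return depth_map
```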
In step S20, the target object in the binocular image is identified based on the depth map.
Specifically, the image features of the binocular images are fused to obtain a two-dimensional saliency map; the depth map is used to refine the two-dimensional saliency of the two-dimensional saliency map to obtain a depth saliency map; the depth saliency map is binarized and skeletonized, and the target object in the depth saliency map is identified according to preset features of the target object.
It is known that the importance of a region in an image depends on its difference from the surrounding regions, which is usually reflected in features such as color, shape and texture. In the invention, the color contrast and texture contrast features are fused to obtain a two-dimensional saliency map, and the depth information obtained in the preceding process is then used to refine the two-dimensional saliency detection result. Although most of the salient regions are highlighted in the RGB saliency map computed by fusing the color and texture features, some superpixels belonging to the background are also highlighted; background superpixels may have high saliency if they have high contrast with their surrounding superpixels. In order to suppress the background in the saliency map, the depth map obtained in the preceding process is used to refine the two-dimensional saliency detection result.
The idea of this refinement is based on the following facts: if regions have the same (or similar) depth values, their saliency values should also be the same (or similar); in addition, objects closest to the viewer attract more attention and should be assigned higher saliency values. Under these assumptions, the image is layered into G layers according to a set of thresholds defined over the depth values (the threshold set and the layer-wise refinement formula appear in the filing as equation images and are not reproduced here). The optimized saliency map is computed layer by layer, where depth_g is the depth value of the g-th image layer in the depth map, num_g is the number of superpixels in image layer I_g, and δ, which controls the sensitivity of the weights to spatial distance, is empirically set to 0.2.
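Since the exact layer-wise formula is not reproduced above, the following sketch only illustrates a refinement that is consistent with the two stated assumptions (equal depth implies equal saliency, nearer objects are more salient); the layer count G and the closeness weighting are illustrative assumptions and not the formula of the filing.

```python
# Hedged sketch of a depth-layer-based saliency refinement consistent with the
# assumptions in the text; NOT the exact optimized-saliency formula of the filing.
import numpy as np

def refine_saliency_with_depth(saliency_2d, depth_map, G=5):
    refined = np.zeros_like(saliency_2d, dtype=np.float32)
    finite = depth_map > 0
    edges = np.quantile(depth_map[finite], np.linspace(0.0, 1.0, G + 1))
    for g in range(G):
        layer = finite & (depth_map >= edges[g]) & (depth_map <= edges[g + 1])
        if not layer.any():
            continue
        closeness = 1.0 - g / max(G - 1, 1)   # nearer layers get higher weight
        refined[layer] = saliency_2d[layer].mean() * closeness
    return refined
```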
The invention binarizes and skeletonizes the obtained depth saliency map. The resulting skeletons still retain the important information about the shape and structure of the object.
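A minimal sketch of the binarization and skeletonization step, assuming scikit-image; the fixed threshold of 0.5 on the normalized saliency map is an assumption, since the filing does not state how the binarization threshold is chosen.

```python
# Sketch: binarize the depth saliency map and extract a 1-pixel-wide skeleton.
import numpy as np
from skimage.morphology import skeletonize

def saliency_skeleton(depth_saliency, thresh=0.5):
    rng = float(depth_saliency.max() - depth_saliency.min())
    sal = (depth_saliency - depth_saliency.min()) / (rng + 1e-9)
    return skeletonize(sal > thresh)   # skeleton keeps the shape/structure cues
```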
In this embodiment, an insulator in a power transmission line is taken as an example. Although there are many types of insulators in power transmission lines, an insulator string has unique skeleton structure characteristics compared with other false targets, which can be summarized in the following three points:
In the skeleton structure diagram, the central axis of an insulator string corresponds to one long straight line and the insulator caps correspond to a number of short straight lines; all the short straight lines are crossed by the long straight line; and the short straight lines are of approximately equal length and are arranged in parallel along the long straight line at equal intervals.
Feature descriptors are established for these three characteristics, and an insulator structure search is carried out in the saliency result on the basis of the three feature descriptors, so as to realize accurate detection of the insulator. The structure search procedure is as follows:
(1) Central axis search. The Hough algorithm is used to detect straight lines in the skeleton image. According to the following criterion, a straight line whose length L is greater than 1/3 of the long side of the bounding rectangle of the connected domain is regarded as a candidate central axis of the insulator:
L ≥ (1/3) · max(length_cr, width_cr)
(2) Insulator cap search. Lines that are vertically bisected by the candidate central axis are searched for, and their lengths and positions are recorded. Then the number of such lines, num_l, is counted. A threshold T_N is set; if num_l ≥ T_N, the candidate target is retained for the third filtering step. T_N was set to 6 in the experiments.
(3) Uniform-arrangement judgement. The variance of the lengths of the short lines is calculated to represent their length consistency, and the variance of the distances between the short lines is calculated to represent their spacing consistency. If these two variances satisfy the threshold condition with respect to a threshold T_S, empirically set to 5 (the explicit inequality appears in the filing as an equation image), the short straight lines are determined to be the skeleton of the insulator caps and are retained; otherwise the candidate is judged to be a false target and eliminated, so that insulators can be accurately detected.
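A hedged sketch of the three-step structure search is given below. It follows the description (Hough line detection, a long central axis, at least T_N short lines roughly perpendicular to it, and low variance of cap length and spacing), but the HoughLinesP parameters, the perpendicularity test used in place of the exact bisection test, and the form of the variance condition are assumptions rather than the filed implementation.

```python
# Hedged sketch of the insulator structure search on a binary skeleton image.
import numpy as np
import cv2

T_N, T_S = 6, 5.0    # thresholds stated in the description

def looks_like_insulator(skeleton_u8, bbox_w, bbox_h):
    lines = cv2.HoughLinesP(skeleton_u8, 1, np.pi / 180, threshold=20,
                            minLineLength=5, maxLineGap=2)
    if lines is None:
        return False
    lines = lines[:, 0, :]                                # (N, 4): x1, y1, x2, y2
    lengths = np.hypot(lines[:, 2] - lines[:, 0], lines[:, 3] - lines[:, 1])

    # (1) Central axis: longest line must exceed 1/3 of the long side of the box.
    axis_idx = int(np.argmax(lengths))
    if lengths[axis_idx] < max(bbox_w, bbox_h) / 3.0:
        return False

    # (2) Insulator caps: short lines roughly perpendicular to the axis.
    ax = lines[axis_idx]
    axis_angle = np.arctan2(ax[3] - ax[1], ax[2] - ax[0])
    angles = np.arctan2(lines[:, 3] - lines[:, 1], lines[:, 2] - lines[:, 0])
    perp = np.abs(np.sin(angles - axis_angle)) > 0.9
    caps = np.where(perp & (np.arange(len(lines)) != axis_idx))[0]
    if len(caps) < T_N:
        return False

    # (3) Uniform arrangement: low variance of cap length and of cap spacing.
    direction = np.array([np.cos(axis_angle), np.sin(axis_angle)])
    centres = (lines[caps, 0:2] + lines[caps, 2:4]) / 2.0
    proj = np.sort(centres @ direction)                   # cap positions along the axis
    spacing = np.diff(proj)
    return lengths[caps].var() <= T_S and spacing.var() <= T_S
```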
In this embodiment, candidate target object regions are obtained by RGB-D (depth-map) saliency detection. Then, the skeleton structures of the candidate target object regions are extracted. Feature descriptors of the skeleton structure are defined, and a structure search is performed according to these descriptors to realize the final accurate identification of the target object.
In step S30, the target world coordinates of the target object in the world coordinate system are determined.
Specifically, an optical center of one camera in the binocular vision perception module is set as an origin, an optical axis of the camera is set as a Z axis to establish a pixel coordinate system, target pixel coordinates of a target point of a target object in the pixel coordinate system are determined, and the target pixel coordinates are converted into target world coordinates according to a conversion relation between the pixel coordinate system and the world coordinate system.
The first step of the spatial positioning of the target object is to obtain the three-dimensional coordinates of the target point in the world coordinate system through binocular vision model analysis using geometric relationships.
The internal and external parameters of the left and right cameras of the binocular vision perception module are obtained through a calibration algorithm. The optical center of the left camera is set as the origin and its optical axis as the Z_w axis, establishing the world coordinate system O_w-X_wY_wZ_w; this means that the world coordinate system coincides with the left camera coordinate system. Let the coordinates of the target point P in the world coordinate system be (X_w, Y_w, Z_w); its coordinates in the pixel coordinate systems of the left and right images are (u_l, v_l) and (u_r, v_r). The distance between the two cameras is defined as the baseline distance b, and d = u_l − u_r is the disparity value.
Since the world coordinate system coincides with the left camera coordinate system, the depth Z_w is equal to the perpendicular distance Z_c between the object and the baseline of the two cameras. As shown in Fig. 2, according to the binocular vision principle and the triangle similarity theorem, the conversion relationship between the pixel coordinate system and the world coordinate system is obtained as:
$$
\begin{bmatrix} X \\ Y \\ Z \\ W \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & -u_0 \\
0 & 1 & 0 & -v_0 \\
0 & 0 & 0 & f \\
0 & 0 & 1/b & 0
\end{bmatrix}
\begin{bmatrix} u \\ v \\ d \\ 1 \end{bmatrix},
\qquad
(X_w,\ Y_w,\ Z_w) = \left(\frac{X}{W},\ \frac{Y}{W},\ \frac{Z}{W}\right)
$$
wherein the target world coordinates of the target point are (X_w, Y_w, Z_w); the coordinates of the target point in the pixel coordinate system are (u, v) and the coordinates of the origin in the pixel coordinate system are (u_0, v_0); the baseline distance between the two cameras of the binocular vision perception module is b; f is the focal length obtained by calibration; the disparity value of the target point is d = u_l − u_r, with (u_l, v_l) and (u_r, v_r) the coordinates of the target point in the pixel coordinate systems of the two cameras; and X, Y, Z, W are the four-dimensional homogeneous coordinates of the target point in the world coordinate system.
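The conversion above reduces to the familiar relations Z_w = f·b/d, X_w = (u − u_0)·b/d and Y_w = (v − v_0)·b/d, which the following minimal sketch implements; the calibration values f, b, u_0 and v_0 are assumed to come from the calibration step.

```python
# Sketch of the pixel-to-world conversion for a matched target point.
def pixel_to_world(u_l, v_l, u_r, f, b, u0, v0):
    d = u_l - u_r                 # disparity of the target point
    if d <= 0:
        raise ValueError("non-positive disparity: bad match or point at infinity")
    Zw = f * b / d                # depth along the optical axis
    Xw = (u_l - u0) * b / d
    Yw = (v_l - v0) * b / d
    return Xw, Yw, Zw
```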
Step S40: determining the spatial position parameters of the target object according to the flight data of the unmanned aerial vehicle and the target world coordinates.
The flight data of the unmanned aerial vehicle includes the longitude and latitude, altitude, pitch angle, azimuth angle and roll angle of the unmanned aerial vehicle and the pitch angle of the camera; the spatial position parameters of the target object include the latitude and longitude of the target object and its height above sea level.
Specifically, the obtained spatial information of the target object is combined with the flight data of the unmanned aerial vehicle to calculate the longitude, latitude and height of the target object. An industrial personal computer receives the flight data of the unmanned aerial vehicle in real time and extracts the information required for target positioning, including the longitude and latitude (long_1, lat_1) of the unmanned aerial vehicle, its altitude h_UAV, pitch angle α, azimuth angle β, roll angle γ, and the pitch angle θ of the camera.
As shown in Fig. 3, O_w-X_wY_wZ_w is the world coordinate system, and the second coordinate system shown there is the left camera coordinate system when the camera pitch angle is zero; the two coordinate systems are related through the pitch angle θ of the camera (the transformation matrix appears in the filing as an equation image and is not reproduced here). The coordinates of the unmanned aerial vehicle in the zero-pitch camera coordinate system can be obtained by manual measurement according to the installation position of the camera, and the coordinates of the unmanned aerial vehicle in the world coordinate system are then calculated from them through this transformation.
As shown in Fig. 3, the local geographic coordinate system has one axis pointing along the vertical direction while its other two axes lie in the horizontal plane. The world coordinate system is rotated about the Z_w axis by the roll angle γ and then about a second axis by a further attitude angle so as to coincide with the local geographic coordinate system; the conversion relationship between the two coordinate systems is given in the filing as an equation image.
The coordinates of the target object in the world coordinate system are transformed into the local geographic coordinate system together with the coordinates of the unmanned aerial vehicle. The height difference between the unmanned aerial vehicle and the target is then calculated as the difference of their vertical coordinates in the local geographic coordinate system, and the height of the target above sea level follows from the altitude h_UAV of the unmanned aerial vehicle and this height difference.
As shown in Fig. 4, the longitude and latitude (long_2, lat_2) of the target object are calculated from the longitude and latitude of the unmanned aerial vehicle and the relative positional relationship between the target object and the unmanned aerial vehicle, using the horizontal distance between the unmanned aerial vehicle and the target, the azimuth of the connecting line between the unmanned aerial vehicle and the target, and the radius R of the Earth (the explicit formulas appear in the filing as equation images).
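A hedged sketch of the final geolocation step is given below: it takes the drone's latitude and longitude, the height difference to the target, the horizontal drone-target distance and the azimuth of the connecting line, and applies a local spherical-Earth, small-offset approximation. The explicit formulas in the filing are not reproduced, so this approximation and the subtraction used for the target height are assumptions.

```python
# Sketch of target geolocation from drone GPS, horizontal distance and azimuth.
import math

EARTH_RADIUS_M = 6_371_000.0

def locate_target(lat1_deg, long1_deg, h_uav_m, delta_h_m, dist_m, azimuth_deg):
    north = dist_m * math.cos(math.radians(azimuth_deg))   # displacement toward north
    east = dist_m * math.sin(math.radians(azimuth_deg))    # displacement toward east
    lat2 = lat1_deg + math.degrees(north / EARTH_RADIUS_M)
    long2 = long1_deg + math.degrees(east / (EARTH_RADIUS_M * math.cos(math.radians(lat1_deg))))
    h_target = h_uav_m - delta_h_m      # assumed: drone altitude minus height difference
    return lat2, long2, h_target
```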
The invention adopts a novel insulator spatial positioning method combining binocular stereo vision with GPS. The main goal of target positioning in unmanned aerial vehicle inspection is to match the pixel coordinates of a target in the two-dimensional image with its coordinates in the real scene, such as GPS coordinates. The conversion matrices among the image coordinate system, the world coordinate system and the geographic coordinate system are calculated according to the real-time flight data and equipment parameters of the unmanned aerial vehicle, and the longitude, latitude and height of the object are then obtained through coordinate conversion.
In another embodiment of the present invention, as shown in fig. 5, an unmanned aerial vehicle-based target detection and positioning device includes: a construction module 10, an identification module 20, a determination module 30 and a positioning module 40.
The construction module 10 is configured to acquire binocular images acquired by the binocular vision perception module and construct a depth map of the binocular images.
Specifically, as shown in fig. 6, the construction module 10 includes: a segmentation sub-module 101, an extraction sub-module 102, a matching sub-module 103, a calculation sub-module 104, a statistics sub-module 105 and a creation sub-module 106.
The segmentation sub-module 101 is configured to perform region segmentation on both the left image and the right image of the binocular image to obtain a left region segmentation image and a right region segmentation image.
A good region segmentation result improves the accuracy of the disparity boundaries. The segmentation submodule 101 of the invention adopts image segmentation based on Mask R-CNN. Mask R-CNN is a convolutional neural network proposed on the basis of the earlier Faster R-CNN framework; the network effectively detects targets while producing high-quality semantic segmentation. The main idea of Mask R-CNN is to extend the original Faster R-CNN by adding a branch that predicts segmentation masks in parallel with the existing detection branch. At the same time, the network structure is easy to implement and train, and it runs fast.
The segmentation submodule 101 performs region segmentation, based on Mask R-CNN, on both the left image and the right image of the binocular images acquired by the binocular vision perception module, yielding the left region segmentation image and the right region segmentation image.
The extraction sub-module 102 is configured to extract a first feature point of the left region-segmented image and a second feature point of the right region-segmented image.
In particular, Speeded Up Robust Features (SURF) is a fast and stable algorithm for detecting feature points. In the present invention, the extraction sub-module 102 extracts feature points of the left and right region-segmented images using SURF and calculates 64-dimensional descriptors thereof.
The descriptors of the first and second feature points are built from the distributions of first-order Haar wavelet responses in the x and y directions of the image coordinates rather than from gradients; speed is increased by using integral images, and only the 64-dimensional descriptor of each feature point is used. Adopting the 64-dimensional descriptor effectively reduces the feature matching time and improves the robustness of feature point extraction.
The matching sub-module 103 is configured to match the first feature point and the second feature point according to an euclidean distance between the first feature point and the second feature point, so as to obtain a plurality of groups of original feature point pairs.
The computation submodule 104 is configured to select a sparse disparity point pair from the pair of original feature points, and compute a depth value of the sparse disparity point pair.
Specifically, as shown in fig. 7, the calculation submodule 104 includes: a slope calculation unit 1041, a main slope determination unit 1042, a sparse disparity point pair determination unit 1043, and a depth value calculation unit 1044,
the slope calculating unit 1041 is configured to connect each group of the original feature point pairs, and calculate a slope of a connection line of each group of the original feature point pairs. A main slope determining unit 1042 configured to take a slope having the highest frequency of occurrence among the plurality of slopes as a main slope. A sparse disparity point pair determining unit 1043 configured to retain, as the sparse disparity point pair, the original feature point pair corresponding to a slope that is the same as the main slope among the plurality of slopes. A depth value calculation unit 1044 configured to calculate depth values of the sparse disparity point pairs.
Some mismatched feature point pairs exist among the multiple groups of original feature point pairs obtained by the matching sub-module 103, and the mismatched original feature point pairs are eliminated using slope consistency. First, each original matching point pair is connected by a line and its slope in the image coordinate system is calculated. Then, the frequency of occurrence of each slope is counted, and the slope with the highest frequency is taken as the main slope. Matching point pairs whose slope equals the main slope are retained and defined as the sparse disparity point pairs. Finally, the depth value Z_spp of each sparse disparity point pair is calculated.
The statistics submodule 105 is configured to count the projection distribution of the sparse disparity point pairs, and to take the average disparity of the sparse disparity point pairs falling in each segmented region as the disparity value of the corresponding region.
It is known that disparity information can be used to estimate the general depth information of an image. Based on this, the depth map of the binocular images is constructed from the disparity values of the segmented regions.
The creating sub-module 106 is configured to create a sparse disparity map based on the disparity values to obtain the depth map of the binocular images.
In the perspective projection imaging model, the mapping of a 3D scene to a 2D image is a process in which depth information is lost. Therefore, it is necessary to estimate the depth information of the scene from binocular disparity cues. However, the correspondence point matching problem is difficult, and the speed and precision of existing depth calculation methods may be unstable in practical applications, so a new depth estimation method is proposed. The method constructs the depth map by combining the region segmentation result with the sparse disparity points. It is assumed that the disparity values on the same object are the same, and in order to ensure the accuracy of the depth boundaries, the obtained region segmentation result is used to assist the construction of the depth map.
The recognition module 20 is configured to recognize a target object in the binocular image based on the depth map.
Specifically, as shown in fig. 8, the identification module 20 includes: a fusion sub-module 201, a refinement submodule 202 and a recognition submodule 203.
Wherein, the fusion sub-module 201 is configured to perform fusion according to the image characteristics of the binocular images to obtain a two-dimensional saliency map. The refinement submodule 202 is configured to refine the two-dimensional saliency of the two-dimensional saliency map using the depth map to obtain a depth saliency map. The recognition submodule 203 is configured to binarize and skeletonize the depth significance map, and recognize the target object in the depth significance map according to preset features of the target object.
It is known that the importance of a region in an image depends on its difference from the surrounding regions, which is usually reflected in features such as color, shape and texture. The color contrast and texture contrast features are fused to obtain a two-dimensional saliency map, and the depth information obtained in the preceding process is then used to refine the two-dimensional saliency detection result. Although most of the salient regions are highlighted in the RGB saliency map computed by fusing the color and texture features, some superpixels belonging to the background are also highlighted; background superpixels may have high saliency if they have high contrast with their surrounding superpixels. In order to suppress the background in the saliency map, the depth map obtained in the preceding process is used to refine the two-dimensional saliency detection result.
The idea of this refinement is based on the following facts: if regions have the same (or similar) depth values, their saliency values should also be the same (or similar); in addition, objects closest to the viewer attract more attention and should be assigned higher saliency values. Under these assumptions, a set of thresholds defined over the depth values is used to layer the image into G layers (the threshold set and the layer-wise refinement formula appear in the filing as equation images and are not reproduced here). The optimized saliency map is computed layer by layer, where depth_g is the depth value of the g-th image layer in the depth map, num_g is the number of superpixels in image layer I_g, and δ, which controls the sensitivity of the weights to spatial distance, is empirically set to 0.2.
The invention binarizes and skeletonizes the obtained depth saliency map. The resulting skeletons still retain the important information about the shape and structure of the object.
In this embodiment, an insulator in a power transmission line is taken as an example. Although there are many types of insulators in power transmission lines, an insulator string has unique skeleton structure characteristics compared with other false targets, which can be summarized in the following three points:
In the skeleton structure diagram, the central axis of an insulator string corresponds to one long straight line and the insulator caps correspond to a number of short straight lines; all the short straight lines are crossed by the long straight line; and the short straight lines are of approximately equal length and are arranged in parallel along the long straight line at equal intervals.
Feature descriptors are established for these three characteristics, and an insulator structure search is carried out in the saliency result on the basis of the three feature descriptors, so as to realize accurate detection of the insulator. The structure search procedure is as follows:
(1) Central axis search. The Hough algorithm is used to detect straight lines in the skeleton image. According to the following criterion, a straight line whose length L is greater than 1/3 of the long side of the bounding rectangle of the connected domain is regarded as a candidate central axis of the insulator:
L ≥ (1/3) · max(length_cr, width_cr)
(2) Insulator cap search. Lines that are vertically bisected by the candidate central axis are searched for, and their lengths and positions are recorded. Then the number of such lines, num_l, is counted. A threshold T_N is set; if num_l ≥ T_N, the candidate target is retained for the third filtering step. T_N was set to 6 in the experiments.
(3) Uniform-arrangement judgement. The variance of the lengths of the short lines is calculated to represent their length consistency, and the variance of the distances between the short lines is calculated to represent their spacing consistency. If these two variances satisfy the threshold condition with respect to a threshold T_S, empirically set to 5 (the explicit inequality appears in the filing as an equation image), the short straight lines are determined to be the skeleton of the insulator caps and are retained; otherwise the candidate is judged to be a false target and eliminated, so that insulators can be accurately detected.
In this embodiment, candidate target object regions are obtained by RGB-D (depth-map) saliency detection. Then, the skeleton structures of the candidate target object regions are extracted. Feature descriptors of the skeleton structure are defined, and a structure search is performed according to these descriptors to realize the final accurate identification of the target object.
The determination module 30 is configured to determine target world coordinates of the target object in a world coordinate system.
Specifically, as shown in fig. 9, the determination module 30 includes: an establishing sub-module 301, a pixel coordinate determination sub-module 302 and a conversion sub-module 303.
Wherein the establishing sub-module 301 is configured to set an optical center of one camera in the binocular vision sensing module as an origin and an optical axis of the camera as a Z-axis to establish a pixel coordinate system. The pixel coordinate determination submodule 302 is configured to determine target pixel coordinates of a target point of the target object within the pixel coordinate system. The conversion sub-module 303 is configured to convert the target pixel coordinate into the target world coordinate according to a conversion relationship between a pixel coordinate system and a world coordinate system.
The first step of the spatial positioning of the target object is to obtain the three-dimensional coordinates of the target point in the world coordinate system through binocular vision model analysis using geometric relationships.
The internal and external parameters of the left and right cameras of the binocular vision perception module are obtained through a calibration algorithm. The optical center of the left camera is set as the origin and its optical axis as the Z_w axis, establishing the world coordinate system O_w-X_wY_wZ_w; this means that the world coordinate system coincides with the left camera coordinate system. Let the coordinates of the target point P in the world coordinate system be (X_w, Y_w, Z_w); its coordinates in the pixel coordinate systems of the left and right images are (u_l, v_l) and (u_r, v_r). The distance between the two cameras is defined as the baseline distance b, and d = u_l − u_r is the disparity value.
Since the world coordinate system coincides with the left camera coordinate system, the depth Z_w is equal to the perpendicular distance Z_c between the object and the baseline of the two cameras. As shown in Fig. 2, according to the binocular vision principle and the triangle similarity theorem, the conversion relationship between the pixel coordinate system and the world coordinate system is obtained as:
$$
\begin{bmatrix} X \\ Y \\ Z \\ W \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & -u_0 \\
0 & 1 & 0 & -v_0 \\
0 & 0 & 0 & f \\
0 & 0 & 1/b & 0
\end{bmatrix}
\begin{bmatrix} u \\ v \\ d \\ 1 \end{bmatrix},
\qquad
(X_w,\ Y_w,\ Z_w) = \left(\frac{X}{W},\ \frac{Y}{W},\ \frac{Z}{W}\right)
$$
wherein the target world coordinates of the target point are (X_w, Y_w, Z_w); the coordinates of the target point in the pixel coordinate system are (u, v) and the coordinates of the origin in the pixel coordinate system are (u_0, v_0); the baseline distance between the two cameras of the binocular vision perception module is b; f is the focal length obtained by calibration; the disparity value of the target point is d = u_l − u_r, with (u_l, v_l) and (u_r, v_r) the coordinates of the target point in the pixel coordinate systems of the two cameras; and X, Y, Z, W are the four-dimensional homogeneous coordinates of the target point in the world coordinate system.
The positioning module 40 is configured to determine the spatial position parameters of the target object according to the flight data of the unmanned aerial vehicle and the target world coordinates.
The flight data of the unmanned aerial vehicle includes the longitude and latitude, altitude, pitch angle, azimuth angle and roll angle of the unmanned aerial vehicle and the pitch angle of the camera; the spatial position parameters of the target object include the latitude and longitude of the target object and its height above sea level.
Specifically, the obtained spatial information of the target object is combined with the flight data of the drone to calculate the longitude, latitude and height of the target object. An industrial personal computer receives the flight data of the drone in real time and extracts the information required for target positioning, including the drone's longitude and latitude $(long_1, lat_1)$, altitude $h_{UAV}$, pitch angle $\alpha$, azimuth angle $\beta$, roll angle $\gamma$, and the pitch angle $\theta$ of the camera. As shown in fig. 3, $O_w\text{-}X_wY_wZ_w$ is the world coordinate system and $O_{c0}\text{-}X_{c0}Y_{c0}Z_{c0}$ is the left camera coordinate system when the camera pitch angle is zero. The two coordinate systems are related by a rotation through the camera pitch angle $\theta$ about the horizontal axis:

$$\begin{bmatrix} X_{c0} \\ Y_{c0} \\ Z_{c0} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \end{bmatrix}$$

The coordinates of the drone in the coordinate system $O_{c0}\text{-}X_{c0}Y_{c0}Z_{c0}$ can be obtained by manual measurement according to the installation position of the camera, and the coordinates of the drone in the world coordinate system are then calculated with the above expression.
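A brief sketch of this step is given below; the choice of the rotation axis (the camera's horizontal X axis) and the sign of the angle are assumptions consistent with the relation above, and the function and parameter names are illustrative.

```python
import numpy as np

def rot_x(angle_rad):
    """Rotation matrix about the X axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0,   c,  -s],
                     [0.0,   s,   c]])

def drone_position_in_world(p_drone_c0, cam_pitch_theta):
    """Express the manually measured drone position, given in the zero-pitch
    left-camera frame Oc0, in the world frame Ow that is tied to the camera
    at its current pitch angle theta."""
    # p_c0 = Rx(theta) @ p_w  =>  p_w = Rx(theta).T @ p_c0 = Rx(-theta) @ p_c0
    return rot_x(-cam_pitch_theta) @ np.asarray(p_drone_c0, dtype=float)
```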
As shown in fig. 3, the $Y_g$ axis of the coordinate system $O_g\text{-}X_gY_gZ_g$ points in the vertical direction, while its $X_g$ axis and $Z_g$ axis lie in the horizontal plane. If the coordinate system $O_w\text{-}X_wY_wZ_w$ is rotated about the $Z_w$ axis through the roll angle $\gamma$ and then about the $X_w$ axis through the combined pitch angle $\alpha + \theta$, it coincides with the coordinate system $O_g\text{-}X_gY_gZ_g$. The conversion relationship between the coordinate system $O_w\text{-}X_wY_wZ_w$ and the coordinate system $O_g\text{-}X_gY_gZ_g$ is calculated as follows:
$$\begin{bmatrix} X_g \\ Y_g \\ Z_g \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(\alpha+\theta) & -\sin(\alpha+\theta) \\ 0 & \sin(\alpha+\theta) & \cos(\alpha+\theta) \end{bmatrix} \begin{bmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \end{bmatrix}$$
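The rotation composition can be written as a short helper; the rotation order and signs follow the relation above and are assumptions rather than values taken from the patent figures, and the names are illustrative.

```python
import numpy as np

def rot_z(a):
    """Rotation matrix about the Z axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[  c,  -s, 0.0],
                     [  s,   c, 0.0],
                     [0.0, 0.0, 1.0]])

def rot_x(a):
    """Rotation matrix about the X axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0,   c,  -s],
                     [0.0,   s,   c]])

def world_to_level(p_world, roll_gamma, pitch_alpha, cam_pitch_theta):
    """Rotate a point from the camera-attached world frame Ow into the level
    frame Og whose Yg axis is vertical: first the roll about Zw, then the
    combined drone-plus-camera pitch about Xw."""
    R = rot_x(pitch_alpha + cam_pitch_theta) @ rot_z(roll_gamma)
    return R @ np.asarray(p_world, dtype=float)
```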
Let $(X_w, Y_w, Z_w)$ denote the coordinates of the insulator (the target object) in the world coordinate system, and let $(X_{g,t}, Y_{g,t}, Z_{g,t})$ and $(X_{g,u}, Y_{g,u}, Z_{g,u})$ denote the coordinates of the target and of the drone, respectively, in the coordinate system $O_g\text{-}X_gY_gZ_g$. Since the $Y_g$ axis points in the vertical direction, the height difference between the drone and the target is

$$\Delta h = Y_{g,u} - Y_{g,t}$$

and the height of the insulator above sea level is therefore

$$h_{target} = h_{UAV} - \Delta h$$
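In code the altitude step reduces to a difference of vertical components; the sign convention below ($\Delta h > 0$ when the drone is above the target) is an assumption consistent with the formulas above, and the names are illustrative.

```python
def target_altitude(y_g_target, y_g_drone, h_uav):
    """Altitude of the target above sea level, from the vertical (Yg)
    coordinates of target and drone in the level frame Og and the drone's
    reported altitude h_uav."""
    delta_h = y_g_drone - y_g_target  # how far the drone sits above the target
    return h_uav - delta_h
```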
As shown in fig. 4, the longitude and latitude $(long_2, lat_2)$ of the target object are calculated from the longitude and latitude of the drone and the relative positional relationship between the target object and the drone:

$$lat_2 = lat_1 + \frac{D\cos\psi}{R}\cdot\frac{180}{\pi}$$

$$long_2 = long_1 + \frac{D\sin\psi}{R\cos(lat_1)}\cdot\frac{180}{\pi}$$

wherein $D$ represents the horizontal distance between the drone and the target, $\psi$ represents the azimuth of the line connecting the drone and the target, and $R$ is the radius of the earth.
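A minimal sketch of this latitude/longitude offset is given below, using the standard small-displacement approximation; the Earth-radius constant and the function and parameter names are illustrative choices rather than values specified here.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres

def offset_lat_long(lat1_deg, long1_deg, horiz_dist_m, azimuth_rad):
    """Shift the drone's latitude/longitude (degrees) towards the target:
    horiz_dist_m is the horizontal distance D between drone and target, and
    azimuth_rad is the azimuth of the drone-to-target line measured from north."""
    dlat_rad = horiz_dist_m * math.cos(azimuth_rad) / EARTH_RADIUS_M
    dlong_rad = horiz_dist_m * math.sin(azimuth_rad) / (
        EARTH_RADIUS_M * math.cos(math.radians(lat1_deg)))
    return lat1_deg + math.degrees(dlat_rad), long1_deg + math.degrees(dlong_rad)
```

For example, a target 100 m due east of the drone (azimuth 90°) at latitude 40° shifts the longitude by roughly 0.00117° and leaves the latitude unchanged.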
The invention adopts a novel insulator spatial positioning method that combines binocular stereo vision with GPS. The main goal of target positioning in drone inspection is to match the pixel coordinates of a target in the two-dimensional image with its coordinates in the real scene, such as GPS coordinates. The conversion matrices among the image coordinate system, the world coordinate system and the geographic coordinate system are calculated from the real-time flight data and equipment parameters of the drone, and the longitude, latitude and height of the target object are then obtained through coordinate conversion.
A controller according to another embodiment of the present invention includes a memory and a processor. The memory stores a computer program which, when executed by the processor, implements the steps of the unmanned aerial vehicle-based target detection and positioning method of any of the above embodiments.

A computer-readable storage medium according to another embodiment of the present invention stores a computer program which, when executed by a computer or a processor, implements the steps of the unmanned aerial vehicle-based target detection and positioning method of any of the above embodiments.

An unmanned aerial vehicle according to another embodiment of the invention comprises a binocular vision perception module and the unmanned aerial vehicle-based target detection and positioning device of any of the above embodiments.
Although the present invention has been described with reference to a preferred embodiment, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (17)

1. An unmanned aerial vehicle-based target detection and positioning method is characterized by comprising the following steps:
acquiring binocular images acquired by a binocular vision perception module, and constructing a depth map of the binocular images;
identifying a target object in the binocular image based on the depth map;
determining target world coordinates of the target object in a world coordinate system;
and determining the space position parameters of the target object according to the flight data of the unmanned aerial vehicle and the target world coordinates.
2. The unmanned aerial vehicle-based target detection and positioning method of claim 1, wherein the step of acquiring binocular images acquired by a binocular vision perception module and constructing a depth map of the binocular images comprises:
performing region segmentation on both the left image and the right image of the binocular image to obtain a left region segmentation image and a right region segmentation image;
extracting a first characteristic point of the left region segmentation image and a second characteristic point of the right region segmentation image;
matching the first characteristic point and the second characteristic point according to the Euclidean distance between the first characteristic point and the second characteristic point to obtain a plurality of groups of original characteristic point pairs;
selecting a sparse disparity point pair in the original characteristic point pair, and calculating a depth value of the sparse disparity point pair;
calculating the projection distribution of the sparse disparity point pairs, and taking the average disparity of the sparse disparity point pairs of each region segmentation image as the disparity value of the corresponding region segmentation image;
and creating a sparse disparity map based on the disparity values to obtain a depth map of the binocular image.
3. The unmanned aerial vehicle-based target detection and positioning method of claim 2, wherein the step of selecting sparse disparity point pairs from the original feature point pairs and calculating depth values of the sparse disparity point pairs comprises:
connecting each group of the original characteristic point pairs, and calculating the slope of a connecting line of each group of the original characteristic point pairs;
taking the slope with the highest occurrence frequency in the plurality of slopes as a main slope;
reserving the original characteristic point pairs corresponding to the slopes which are the same as the main slopes in the plurality of slopes as the sparse disparity point pairs;
and calculating the depth value of the sparse disparity point pair.
4. The unmanned aerial vehicle-based target detection and positioning method of claim 1, wherein the step of identifying the target object in the binocular image based on the depth map comprises:
fusing the image features of the binocular images to obtain a two-dimensional saliency map;
using the depth map to improve two-dimensional saliency of the two-dimensional saliency map to obtain a depth saliency map;
and carrying out binarization and skeletonization on the depth saliency map, and identifying the target object in the depth saliency map according to preset features of the target object.
5. The drone-based target detection and positioning method of claim 1, wherein the step of determining target world coordinates of the target object in a world coordinate system comprises:
setting the optical center of one camera in the binocular vision perception module as an origin, and setting the optical axis of the camera as a Z axis so as to establish a pixel coordinate system;
determining target pixel coordinates of a target point of the target object in the pixel coordinate system;
and converting the target pixel coordinate into the target world coordinate according to the conversion relation between the pixel coordinate system and the world coordinate system.
6. The unmanned aerial vehicle-based target detection and positioning method of claim 5, wherein the conversion relationship between the pixel coordinate system and the world coordinate system is as follows:
$$\begin{bmatrix} X \\ Y \\ Z \\ W \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & -u_0 \\ 0 & 1 & 0 & -v_0 \\ 0 & 0 & 0 & f \\ 0 & 0 & 1/b & 0 \end{bmatrix} \begin{bmatrix} u \\ v \\ d \\ 1 \end{bmatrix}, \qquad X_w = \frac{X}{W}, \quad Y_w = \frac{Y}{W}, \quad Z_w = \frac{Z}{W}$$

wherein the target world coordinates of the target point are $(X_w, Y_w, Z_w)$; the coordinates of the target point in the pixel coordinate system are $(u, v)$; the coordinates of the origin in the pixel coordinate system are $(u_0, v_0)$; $f$ is the focal length; the distance between the two cameras of the binocular vision perception module is defined as the baseline distance $b$; the disparity value of the target point is $d = u_i - u_r$, where $(u_i, v_i)$ and $(u_r, v_r)$ are the coordinates of the target point in the pixel coordinate systems of the two cameras; and $X$, $Y$, $Z$, $W$ are the four-dimensional coordinate values of the target point in the world coordinate system.
7. The drone-based target detection and positioning method of claim 1, wherein the flight data of the drone includes: longitude and latitude, altitude, pitch angle, azimuth angle, roll angle of the unmanned aerial vehicle and pitch angle of the camera;
the spatial position parameters of the target object comprise: the latitude and longitude of the target object and the height from the sea level.
8. An unmanned aerial vehicle-based target detection and positioning device, characterized by comprising:
the construction module is configured to acquire binocular images acquired by the binocular vision perception module and construct a depth map of the binocular images;
the identification module is configured to identify a target object in the binocular image based on the depth map;
a determination module configured to determine target world coordinates of the target object in a world coordinate system;
and the positioning module is configured to determine the spatial position parameters of the target object according to the flight data of the unmanned aerial vehicle and the target world coordinates.
9. The drone-based object detection and positioning device of claim 8, wherein the building module comprises:
the segmentation submodule is configured to perform region segmentation on both the left image and the right image of the binocular image to obtain a left region segmentation image and a right region segmentation image;
an extraction submodule configured to extract a first feature point of the left region segmentation image and a second feature point in the right region segmentation image;
the matching submodule is configured to match the first characteristic point and the second characteristic point according to the Euclidean distance between the first characteristic point and the second characteristic point to obtain a plurality of groups of original characteristic point pairs;
the calculation submodule is configured to select a sparse disparity point pair in the original characteristic point pair and calculate a depth value of the sparse disparity point pair;
the statistic submodule is configured to count the projection distribution of the sparse disparity point pairs, and the average disparity of the sparse disparity point pairs of each region segmentation image is used as the disparity value of the corresponding region segmentation image;
and the creating sub-module is configured to create a sparse disparity map based on the disparity values so as to obtain a depth map of the binocular image.
10. The drone-based object detection and positioning device of claim 9, wherein the computation submodule includes:
a slope calculation unit configured to connect each group of the original feature point pairs and calculate a slope of a connection line of each group of the original feature point pairs;
a main slope determination unit configured to take a slope having the highest frequency of occurrence among the plurality of slopes as a main slope;
a sparse disparity point pair determining unit configured to retain, as the sparse disparity point pair, the original feature point pair corresponding to a slope that is the same as the main slope among the plurality of slopes;
a depth value calculation unit configured to calculate depth values of the sparse disparity point pairs.
11. The drone-based object detection and positioning device of claim 8, wherein the identification module comprises:
the fusion sub-module is configured to fuse the image features of the binocular images to obtain a two-dimensional saliency map;
a refinement submodule configured to refine the two-dimensional saliency of the two-dimensional saliency map using the depth map to obtain a depth saliency map;
and the recognition submodule is configured to binarize and skeletonize the depth saliency map, and to recognize the target object in the depth saliency map according to preset features of the target object.
12. The drone-based object detection and positioning device of claim 8, wherein the determination module comprises:
the establishing submodule is configured to set an optical center of one camera in the binocular vision perception module as an origin and set an optical axis of the camera as a Z axis so as to establish a pixel coordinate system;
a pixel coordinate determination submodule configured to determine target pixel coordinates of a target point of the target object within the pixel coordinate system;
and the conversion sub-module is configured to convert the target pixel coordinate into the target world coordinate according to the conversion relation between the pixel coordinate system and the world coordinate system.
13. The drone-based target detection and positioning device of claim 12, wherein the transformation relationship between the pixel coordinate system and the world coordinate system is:
$$\begin{bmatrix} X \\ Y \\ Z \\ W \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & -u_0 \\ 0 & 1 & 0 & -v_0 \\ 0 & 0 & 0 & f \\ 0 & 0 & 1/b & 0 \end{bmatrix} \begin{bmatrix} u \\ v \\ d \\ 1 \end{bmatrix}, \qquad X_w = \frac{X}{W}, \quad Y_w = \frac{Y}{W}, \quad Z_w = \frac{Z}{W}$$

wherein the target world coordinates of the target point are $(X_w, Y_w, Z_w)$; the coordinates of the target point in the pixel coordinate system are $(u, v)$; the coordinates of the origin in the pixel coordinate system are $(u_0, v_0)$; $f$ is the focal length; the distance between the two cameras of the binocular vision perception module is defined as the baseline distance $b$; the disparity value of the target point is $d = u_i - u_r$, where $(u_i, v_i)$ and $(u_r, v_r)$ are the coordinates of the target point in the pixel coordinate systems of the two cameras; and $X$, $Y$, $Z$, $W$ are the four-dimensional coordinate values of the target point in the world coordinate system.
14. The drone-based object detection and positioning device of claim 8, wherein the flight data of the drone includes: longitude and latitude, altitude, pitch angle, azimuth angle, roll angle of the unmanned aerial vehicle and pitch angle of the camera;
the spatial position parameters of the target object comprise: the latitude and longitude of the target object and the height from the sea level.
15. A controller comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, is capable of carrying out the steps of the method of any one of claims 1 to 7.
16. A computer-readable storage medium for storing a computer program which, when executed by a computer or processor, implements the steps of the method of any one of claims 1 to 7.
17. An unmanned aerial vehicle comprising a binocular vision perception module and the unmanned aerial vehicle-based target detection and positioning apparatus of any one of claims 8-14.
CN202210128714.5A 2022-02-11 2022-02-11 Target detection positioning method, device, controller, storage medium and unmanned aerial vehicle Withdrawn CN114170535A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210128714.5A CN114170535A (en) 2022-02-11 2022-02-11 Target detection positioning method, device, controller, storage medium and unmanned aerial vehicle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210128714.5A CN114170535A (en) 2022-02-11 2022-02-11 Target detection positioning method, device, controller, storage medium and unmanned aerial vehicle

Publications (1)

Publication Number Publication Date
CN114170535A true CN114170535A (en) 2022-03-11

Family

ID=80489754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210128714.5A Withdrawn CN114170535A (en) 2022-02-11 2022-02-11 Target detection positioning method, device, controller, storage medium and unmanned aerial vehicle

Country Status (1)

Country Link
CN (1) CN114170535A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108731587A (en) * 2017-04-14 2018-11-02 中交遥感载荷(北京)科技有限公司 A kind of the unmanned plane dynamic target tracking and localization method of view-based access control model
CN109472776A (en) * 2018-10-16 2019-03-15 河海大学常州校区 A kind of isolator detecting and self-destruction recognition methods based on depth conspicuousness
CN110599522A (en) * 2019-09-18 2019-12-20 成都信息工程大学 Method for detecting and removing dynamic target in video sequence
CN110548699A (en) * 2019-09-30 2019-12-10 华南农业大学 Automatic pineapple grading and sorting method and device based on binocular vision and multispectral detection technology
WO2021197341A1 (en) * 2020-04-03 2021-10-07 速度时空信息科技股份有限公司 Monocular image-based method for updating road signs and markings

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YUNPENG MA,ET AL.: "Real-Time Detection and Spatial Localization of Insulators for UAV Inspection Based on Binocular Stereo Vision", 《REMOTE SENSING》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115019216A (en) * 2022-08-09 2022-09-06 江西师范大学 Real-time ground object detection and positioning counting method, system and computer
CN115841487A (en) * 2023-02-20 2023-03-24 深圳金三立视频科技股份有限公司 Hidden danger positioning method and terminal along power transmission line
CN117523431A (en) * 2023-11-17 2024-02-06 中国科学技术大学 Firework detection method and device, electronic equipment and storage medium
CN117437563A (en) * 2023-12-13 2024-01-23 黑龙江惠达科技股份有限公司 Plant protection unmanned aerial vehicle dotting method, device and equipment based on binocular vision
CN117437563B (en) * 2023-12-13 2024-03-15 黑龙江惠达科技股份有限公司 Plant protection unmanned aerial vehicle dotting method, device and equipment based on binocular vision


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (Application publication date: 20220311)