Disclosure of Invention
The embodiments of the present application provide a method and an apparatus for determining depth information of an unmanned aerial vehicle, an unmanned aerial vehicle, and a storage medium, which can improve the accuracy of depth information determination. The technical solution is as follows:
in one aspect, a method for determining depth information of a drone is provided, where the method includes:
acquiring an image frame through a target camera, and determining coordinate information of a plurality of feature points from the image frame;
acquiring first depth information of a target point in the image frame acquired by a single-point ranging sensor, wherein the target camera and the single-point ranging sensor are both installed on an unmanned aerial vehicle;
acquiring second depth information of the plurality of feature points in a key image frame and first pose information of the unmanned aerial vehicle in the image frame;
determining third depth information of the plurality of feature points in the image frame based on the second depth information of the plurality of feature points in the key image frame, the first pose information, the coordinate information of the plurality of feature points, and the first depth information of the target point.
In one possible implementation, the determining third depth information of the plurality of feature points in the image frame based on the second depth information of the plurality of feature points in the key image frame, the first pose information, the coordinate information of the plurality of feature points, and the first depth information of the target point includes:
for any feature point, if the feature point is a feature point observed for the first time in the image frame, using the first depth information of the target point as the third depth information of the feature point;
if the feature point is not a feature point observed for the first time in the image frame, determining fourth depth information of the feature point in the image frame based on the first depth information of the target point and the second depth information;
and updating the fourth depth information of the feature point in the image frame based on the coordinate information of the feature point and the first pose information to obtain the third depth information of the feature point.
In another possible implementation manner, the determining fourth depth information of the feature point in the image frame based on the first depth information of the target point and the second depth information includes:
if the position of the feature point in the image frame is in a preset area around the target point, taking the first depth information of the target point as the fourth depth information;
and if the position of the feature point in the image frame is not within the preset area around the target point, taking the second depth information as the fourth depth information.
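The selection rules above (first-observation rule plus preset-area rule) can be sketched as a single helper; the function name, the circular interpretation of the preset area (`radius`), and the parameter names are illustrative assumptions rather than the patent's notation:

```python
import math

def initial_feature_depth(first_observed, feat_xy, target_xy, radius,
                          first_depth, second_depth):
    """Choose a starting depth for one feature point.

    - A feature observed for the first time takes the single-point
      ranging depth (first_depth) directly as its third depth.
    - A tracked feature takes a fourth depth: the sensor depth if it
      lies within the preset area (here, a circle of the given radius)
      around the target point, otherwise its known key-frame depth
      (second_depth).
    """
    if first_observed:
        return first_depth
    dist = math.hypot(feat_xy[0] - target_xy[0], feat_xy[1] - target_xy[1])
    return first_depth if dist <= radius else second_depth
```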
In another possible implementation manner, the updating fourth depth information of the feature point in the image frame based on the coordinate information of the feature point and the first pose information to obtain third depth information of the feature point includes:
determining normalized coordinates of the feature point in the image frame based on the coordinate information of the feature point, determining a partial derivative of the normalized coordinates with respect to the fourth depth information, and taking the partial derivative as an observation matrix;
and correcting the first pose information based on the observation matrix to obtain the third depth information.
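One way to read this step: the feature's normalized coordinates in the current frame, predicted from its depth and the relative pose, are differentiated with respect to the depth to form a per-feature observation matrix. A minimal numpy sketch under pinhole assumptions, with an assumed relative rotation `R` and translation `t` from the key frame to the current frame (all names are illustrative):

```python
import numpy as np

def predict_and_jacobian(u1, v1, depth, R, t):
    """Predict a feature's normalized coordinates in the current frame
    from its key-frame normalized coordinates (u1, v1) and depth, and
    return the 2x1 partial derivative of that prediction with respect
    to the depth (the per-feature observation matrix in this sketch)."""
    m = np.array([u1, v1, 1.0])          # key-frame bearing vector
    P = depth * (R @ m) + t              # 3-D point in current camera frame
    u2, v2 = P[0] / P[2], P[1] / P[2]    # predicted normalized coordinates
    dP = R @ m                           # dP / d(depth)
    H = np.array([
        [(dP[0] * P[2] - P[0] * dP[2]) / P[2] ** 2],   # du2/d(depth)
        [(dP[1] * P[2] - P[1] * dP[2]) / P[2] ** 2],   # dv2/d(depth)
    ])
    return (u2, v2), H
```

The returned `H` can then serve as the observation matrix in a filter-style correction of the pose and depth.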
In this embodiment of the application, the observation matrix of the system is determined by incorporating the depths of the feature points in a loosely coupled manner, so that the first pose information corresponding to the key image frame is corrected and the updated depth information of the feature points is obtained, which reduces the amount of computation.
In another possible implementation, after the determining fourth depth information of the feature point in the image frame based on the first depth information of the target point and the second depth information, the method further includes:
if the position of the feature point in the image frame is not within the preset area around the target point, executing the step of updating the fourth depth information of the feature point in the image frame based on the coordinate information of the feature point and the first pose information to obtain the third depth information of the feature point;
if the position of the feature point in the image frame is within the preset area around the target point, acquiring a first depth variance of the target point in a plurality of image frames and a second depth variance of the plurality of feature points in the key image frame, wherein the plurality of image frames include the currently acquired image frame;
and updating the fourth depth information based on the first depth variance, the second depth variance and the second depth information to obtain the third depth information.
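One plausible reading of this update is a variance-weighted (Kalman-style) blend of the fourth depth toward the key-frame depth; the specific gain below is an illustrative assumption, not a formula stated in this application:

```python
def fuse_depth(fourth_depth, second_depth, first_var, second_var):
    """Blend the fourth depth toward the key-frame (second) depth,
    weighting by the two variances: the larger the sensor-side
    variance (first_var) relative to the key-frame variance
    (second_var), the more weight the key-frame depth receives.
    Illustrative sketch only."""
    gain = first_var / (first_var + second_var)
    return fourth_depth + gain * (second_depth - fourth_depth)
```

With equal variances the result is the midpoint of the two depths; with a zero sensor variance the fourth depth is kept unchanged.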
In another possible implementation manner, the method further includes:
determining a third depth variance of a plurality of feature points in the image frame based on the third depth information for each feature point;
and if the third depth variance is smaller than a preset variance, determining target pose information of the unmanned aerial vehicle based on the third depth information of each feature point in the image frame.
In another possible implementation manner, the determining a third depth variance of a plurality of feature points in the image frame based on the third depth information of each feature point includes:
if the plurality of feature points are all feature points observed for the first time in the image frame, determining that the initial depth variance is the third depth variance;
and if there is, among the plurality of feature points, a feature point that is not observed for the first time in the image frame, acquiring a first depth variance of the target point in a plurality of image frames and a second depth variance of the plurality of feature points in the key image frame, and determining the third depth variance based on the first depth variance and the second depth variance, wherein the plurality of image frames include the currently acquired image frame.
In another possible implementation manner, the process of determining the initial depth variance includes:
determining an average reprojection error of the plurality of feature points in the image frame and a first depth variance of the target point in a plurality of image frames;
and taking the product of the average reprojection error, the first depth variance, and a preset parameter as the initial depth variance.
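The variance decision in the two implementations above can be sketched as follows; the sum used to combine the first and second depth variances in the non-first-observation branch is an illustrative assumption, since the combination rule is not spelled out here:

```python
def third_depth_variance(all_first_observed, avg_reproj_error,
                         first_var, second_var, preset_param=1.0):
    """If every feature point is newly observed in the image frame,
    return the initial depth variance (the product of the average
    reprojection error, the first depth variance, and a preset
    parameter); otherwise combine the first and second depth
    variances (a simple sum here, as an illustrative combination)."""
    if all_first_observed:
        return avg_reproj_error * first_var * preset_param
    return first_var + second_var
```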
In another possible implementation manner, the process of determining the first pose information of the drone in the image frame includes:
acquiring second pose information of the unmanned aerial vehicle, wherein the second pose information is pose information of the unmanned aerial vehicle corresponding to the key image frame;
and predicting the pose information of the unmanned aerial vehicle in the image frame based on the second pose information to obtain the first pose information.
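The prediction step can be sketched with a constant-velocity motion model, which is one common choice (an assumption here; the application does not fix the model, and rotation prediction is omitted for brevity):

```python
import numpy as np

def predict_pose(t_key, v, dt):
    """Predict the drone's translation at the current image frame from
    the key-frame translation t_key, assuming a constant velocity v
    over the elapsed time dt."""
    return np.asarray(t_key, dtype=float) + np.asarray(v, dtype=float) * dt
```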
In another possible implementation manner, the determining coordinate information of a plurality of feature points from the image frame includes:
extracting feature points of the image frame to obtain a plurality of feature points to be processed;
removing the feature points that do not conform to the coplanarity assumption from the plurality of feature points to be processed to obtain the plurality of feature points;
coordinate information of the plurality of feature points in the image frame is acquired.
In another aspect, an apparatus for determining depth information of a drone is provided, the apparatus including:
a first determining module, configured to acquire an image frame through a target camera and determine coordinate information of a plurality of feature points from the image frame;
a first acquiring module, configured to acquire first depth information of a target point in the image frame acquired by a single-point ranging sensor, where the target camera and the single-point ranging sensor are both mounted on the unmanned aerial vehicle;
a second acquiring module, configured to acquire second depth information of the plurality of feature points in a key image frame and first pose information of the unmanned aerial vehicle in the image frame;
a second determining module, configured to determine third depth information of the plurality of feature points in the image frame based on second depth information of the plurality of feature points in a key image frame, the first pose information, coordinate information of the plurality of feature points, and the first depth information of the target point.
In one possible implementation manner, the second determining module includes:
a first determining unit, configured to, for any feature point, if the feature point is a feature point observed for the first time in the image frame, use the first depth information of the target point as the third depth information of the feature point; and if the feature point is not a feature point observed for the first time in the image frame, determine fourth depth information of the feature point in the image frame based on the first depth information of the target point and the second depth information;
and an updating unit, configured to update the fourth depth information of the feature point in the image frame based on the coordinate information of the feature point and the first pose information to obtain the third depth information of the feature point.
In another possible implementation manner, the first determining unit is configured to use the first depth information of the target point as the fourth depth information if the position of the feature point in the image frame is within a preset area around the target point; and if the position of the feature point in the image frame is not in a preset area around the target point, taking the second depth information as the fourth depth information.
In another possible implementation manner, the updating unit is configured to determine normalized coordinates of the feature point in the image frame based on the coordinate information of the feature point, determine a partial derivative of the normalized coordinates with respect to the fourth depth information, and use the partial derivative as an observation matrix; and correct the first pose information based on the observation matrix to obtain the third depth information.
In another possible implementation manner, the apparatus further includes:
the updating unit is configured to update fourth depth information of the feature point in the image frame based on the coordinate information of the feature point and the first pose information to obtain third depth information of the feature point if the position of the feature point in the image frame is not located in a preset area around the target point;
the updating unit is further configured to obtain a first depth variance of the target point in a plurality of image frames and a second depth variance of a plurality of feature points in the key image frame if the position of the feature point in the image frame is within a preset area around the target point, where the plurality of image frames include the currently acquired image frame; and updating the fourth depth information based on the first depth variance, the second depth variance and the second depth information to obtain the third depth information.
In another possible implementation manner, the apparatus further includes:
a third determining module for determining a third depth variance of a plurality of feature points in the image frame based on the third depth information of each feature point;
a fourth determining module, configured to determine, if the third depth variance is smaller than a preset variance, target pose information of the unmanned aerial vehicle based on the third depth information of each feature point in the image frame.
In another possible implementation manner, the third determining module is configured to: if the plurality of feature points are all feature points observed for the first time in the image frame, take an initial depth variance as the third depth variance; and if there is, among the plurality of feature points, a feature point that is not observed for the first time in the image frame, acquire a first depth variance of the target point in a plurality of image frames and a second depth variance of the plurality of feature points in the key image frame, and determine the third depth variance based on the first depth variance and the second depth variance, where the plurality of image frames include the currently acquired image frame.
In another possible implementation manner, the apparatus further includes:
a fifth determining module, configured to determine an average reprojection error of the plurality of feature points in the image frame and a first depth variance of the target point in a plurality of image frames; and take the product of the average reprojection error, the first depth variance, and a preset parameter as the initial depth variance.
In another possible implementation manner, the apparatus further includes: a sixth determining module, configured to acquire second pose information of the unmanned aerial vehicle, where the second pose information is pose information of the unmanned aerial vehicle corresponding to the key image frame; and predict the pose information of the unmanned aerial vehicle in the image frame based on the second pose information to obtain the first pose information.
In another possible implementation manner, the first determining module is configured to perform feature point extraction on the image frame to obtain a plurality of feature points to be processed; remove the feature points that do not conform to the coplanarity assumption from the plurality of feature points to be processed to obtain the plurality of feature points; and acquire coordinate information of the plurality of feature points in the image frame.
In another aspect, a drone is provided, which includes a processor and a memory, where the memory stores at least one program code, and the at least one program code is loaded and executed by the processor to implement the method for determining depth information of a drone.
In another aspect, a computer-readable storage medium is provided, in which at least one program code is stored, and the at least one program code is loaded and executed by a processor to implement the method for determining depth information of a drone described above.
In another aspect, a computer program product is provided, which stores program code that, when executed by a processor of a drone, enables the drone to perform the method for determining depth information of a drone described above.
In the embodiments of the present application, during visual positioning of the unmanned aerial vehicle, the depth information acquired by the single-point ranging sensor is highly accurate. Therefore, the third depth information of the plurality of feature points in the image frame is determined by combining the first depth information acquired by the single-point ranging sensor with the known second depth information of the feature points in the key image frame, so that accurate depth information can be obtained even while the unmanned aerial vehicle is airborne, which further improves the accuracy of depth information determination during visual positioning.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Fig. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application. Referring to fig. 1, the implementation environment includes a target camera 101, a single-point ranging sensor 102, and an unmanned aerial vehicle 103, where the target camera 101 and the single-point ranging sensor 102 are each connected to the unmanned aerial vehicle 103 through a wireless or wired network.
The target camera 101 and the single point ranging sensor 102 are both mounted on the drone to be positioned. For example, the target camera 101 and the single point ranging sensor 102 are both mounted below the drone.
The target camera 101 may be a monocular camera. The single-point distance measuring sensor 102 may be any one of an ultrasonic distance measuring sensor, a laser distance measuring sensor, an infrared distance measuring sensor, and the like.
In the embodiment of the application, during the traveling process of the unmanned aerial vehicle, the target camera 101 is used for shooting an image of a reference surface and sending the image to the unmanned aerial vehicle 103; the single-point distance measuring sensor 102 is configured to measure a distance from the reference surface, and send the distance to the drone 103, where the distance is a depth of a center point of the reference surface; and the unmanned aerial vehicle 103 is used for determining the depth information of the plurality of feature points in the image based on the depth, and then carrying out visual positioning on the unmanned aerial vehicle according to the depth information.
The method for determining depth information can be applied to a flight scenario of the unmanned aerial vehicle. The drone in the following scenario is the drone 103 in the implementation environment, the target camera is the target camera 101, and the single-point ranging sensor is the single-point ranging sensor 102. For example, the target camera is mounted below the unmanned aerial vehicle; during flight, the unmanned aerial vehicle determines the depth information of the feature points in the image from the image captured by the target camera and the depth acquired by the single-point ranging sensor, using the depth information determination method provided in this application, and the unmanned aerial vehicle is then visually positioned based on the depth information.
Fig. 2 is a flowchart of a method for determining depth information of an unmanned aerial vehicle according to an embodiment of the present application. Referring to fig. 2, the method includes:
step 201: acquiring an image frame through a target camera, and determining coordinate information of a plurality of feature points from the image frame;
step 202: acquiring first depth information of a target point in the image frame acquired by a single-point ranging sensor, wherein the target camera and the single-point ranging sensor are both installed on an unmanned aerial vehicle;
step 203: acquiring second depth information of the plurality of feature points in a key image frame and first pose information of the unmanned aerial vehicle in the image frame;
step 204: and determining third depth information of the plurality of feature points in the image frame based on the second depth information of the plurality of feature points in the key image frame, the first pose information, the coordinate information of the plurality of feature points and the first depth information of the target point.
In one possible implementation, the determining third depth information of the plurality of feature points in the image frame based on the second depth information of the plurality of feature points in the key image frame, the first pose information, the coordinate information of the plurality of feature points, and the first depth information of the target point includes:
for any feature point, if the feature point is the feature point observed for the first time in the image frame, using the first depth information of the target point as the third depth information of the feature point;
if the feature point is not a feature point observed for the first time in the image frame, determining fourth depth information of the feature point in the image frame based on the first depth information of the target point and the second depth information;
and updating the fourth depth information of the feature point in the image frame based on the coordinate information of the feature point and the first pose information to obtain the third depth information of the feature point.
In another possible implementation manner, the determining fourth depth information of the feature point in the image frame based on the first depth information of the target point and the second depth information includes:
if the position of the feature point in the image frame is in a preset area around the target point, taking the first depth information of the target point as the fourth depth information;
and if the position of the feature point in the image frame is not within the preset area around the target point, taking the second depth information as the fourth depth information.
In another possible implementation manner, the updating fourth depth information of the feature point in the image frame based on the coordinate information of the feature point and the first pose information to obtain third depth information of the feature point includes:
determining normalized coordinates of the feature point in the image frame based on the coordinate information of the feature point, determining a partial derivative of the normalized coordinates with respect to the fourth depth information, and taking the partial derivative as an observation matrix;
and correcting the first pose information based on the observation matrix to obtain the third depth information.
In another possible implementation, after the determining fourth depth information of the feature point in the image frame based on the first depth information of the target point and the second depth information, the method further includes:
if the position of the feature point in the image frame is not within the preset area around the target point, executing the step of updating the fourth depth information of the feature point in the image frame based on the coordinate information of the feature point and the first pose information to obtain the third depth information of the feature point;
if the position of the feature point in the image frame is within the preset area around the target point, acquiring a first depth variance of the target point in a plurality of image frames and a second depth variance of the plurality of feature points in the key image frame, wherein the plurality of image frames include the currently acquired image frame;
and updating the fourth depth information based on the first depth variance, the second depth variance and the second depth information to obtain the third depth information.
In another possible implementation manner, the method further includes:
determining a third depth variance of a plurality of feature points in the image frame based on the third depth information of each feature point;
and if the third depth variance is smaller than the preset variance, determining the target pose information of the unmanned aerial vehicle based on the third depth information of each feature point in the image frame.
In another possible implementation, the determining a third depth variance of a plurality of feature points in the image frame based on the third depth information of each feature point includes:
if the plurality of feature points are all feature points observed for the first time in the image frame, determining the initial depth variance as the third depth variance;
and if there is, among the plurality of feature points, a feature point that is not observed for the first time in the image frame, acquiring a first depth variance of the target point in a plurality of image frames and a second depth variance of the plurality of feature points in the key image frame, and determining the third depth variance based on the first depth variance and the second depth variance, wherein the plurality of image frames include the currently acquired image frame.
In another possible implementation manner, the determining process of the initial depth variance includes:
determining an average reprojection error of the plurality of feature points in the image frame and a first depth variance of the target point in the plurality of image frames;
and taking the product of the average reprojection error, the first depth variance and a preset parameter as the initial depth variance.
In another possible implementation manner, the process of determining the first pose information of the drone in the image frame includes:
acquiring second pose information of the unmanned aerial vehicle, wherein the second pose information is pose information of the unmanned aerial vehicle corresponding to the key image frame;
and predicting the pose information of the unmanned aerial vehicle in the image frame based on the second pose information to obtain the first pose information.
In another possible implementation manner, the determining coordinate information of a plurality of feature points from the image frame includes:
extracting the feature points of the image frame to obtain a plurality of feature points to be processed;
removing the feature points that do not conform to the coplanarity assumption from the plurality of feature points to be processed to obtain the plurality of feature points;
coordinate information of the plurality of feature points in the image frame is acquired.
In the embodiments of the present application, during visual positioning of the unmanned aerial vehicle, the depth information acquired by the single-point ranging sensor is highly accurate. Therefore, the third depth information of the plurality of feature points in the image frame is determined by combining the first depth information acquired by the single-point ranging sensor with the known second depth information of the feature points in the key image frame, so that accurate depth information can be obtained even while the unmanned aerial vehicle is airborne, which further improves the accuracy of depth information determination during visual positioning.
Fig. 3 is a flowchart of a method for determining depth information of a drone, where the method is performed by the drone, and referring to fig. 3, the method includes:
step 301: the unmanned aerial vehicle acquires an image frame through the target camera, and coordinate information of a plurality of feature points is determined from the image frame.
The target camera is mounted on the drone to be positioned. In some embodiments, the target camera sends each acquired image frame to the drone; the drone receives the image frames acquired by the target camera and, for each image frame, determines coordinate information of a plurality of feature points from that image frame.
In some embodiments, an implementation of the drone to determine coordinate information for a plurality of feature points from the image frame includes the following steps (1) - (3):
(1) The unmanned aerial vehicle performs feature point extraction on the image frame to obtain a plurality of feature points to be processed.
In one possible implementation manner, step (1) may be implemented as follows: the unmanned aerial vehicle tracks the feature points of the previous image frame into the current image frame; if the number of feature points obtained by tracking in the image frame is smaller than a preset number, new feature points are extracted from the image frame, and the tracked feature points and the new feature points are combined into the plurality of feature points to be processed, so that the number of feature points to be processed in the image frame is not smaller than the preset number.
By tracking feature points across adjacent image frames, a feature matching relationship between the image frames is established. The manner in which the unmanned aerial vehicle tracks the feature points of the previous image frame into the current image frame is not particularly limited here; for example, the drone may employ an optical flow method for feature point tracking. Likewise, the manner in which the unmanned aerial vehicle extracts new feature points is not particularly limited; for example, the unmanned aerial vehicle may extract feature points of the image frame by using the FAST (Features from Accelerated Segment Test) algorithm, the SIFT (Scale-Invariant Feature Transform) algorithm, or the like, or the unmanned aerial vehicle may extract feature points of the current image frame by using an algorithm other than those mentioned above.
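The track-then-replenish loop of step (1) can be sketched as follows; `track` and `detect_new` are injected callables standing in for an optical-flow tracker and a FAST/SIFT detector, both of which are assumptions of this sketch:

```python
def maintain_features(prev_feats, track, detect_new, preset_number):
    """Track prev_feats into the current frame; if fewer than
    preset_number survive tracking, extract enough new features to
    bring the total back up to preset_number, and return the
    combined feature list."""
    tracked = track(prev_feats)            # features surviving tracking
    if len(tracked) >= preset_number:
        return tracked
    needed = preset_number - len(tracked)
    return tracked + detect_new(needed)    # replenish with new detections
```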
(2) The unmanned aerial vehicle removes the feature points that do not conform to the coplanarity assumption from the plurality of feature points to be processed to obtain the plurality of feature points.
When the unmanned aerial vehicle captures the image frame through the target camera, the image frame corresponds to the reference surface located below the unmanned aerial vehicle, and objects at different altitudes may exist on the reference surface. If the feature points to be processed include feature points at different altitudes, the depth information determined from those feature points may contain errors; therefore, the unmanned aerial vehicle removes the feature points that do not conform to the coplanarity assumption from the feature points to be processed.
In one possible implementation, the drone may remove the feature points that do not conform to the coplanarity assumption from the plurality of feature points to be processed through RANSAC (Random Sample Consensus) processing; for example, the RANSAC processing is performed based on a homography matrix transform, an essential matrix transform, or the like.
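A minimal RANSAC loop in the spirit of step (2) is sketched below. For brevity it fits a pure-translation model between frames instead of a homography or essential matrix, so it is a simplified illustration rather than the exact procedure:

```python
import random

def ransac_inlier_mask(prev_pts, cur_pts, thresh=2.0, iters=100, seed=0):
    """Return a boolean inlier mask over the correspondences: a point
    is an inlier when its prev->cur motion agrees (within thresh
    pixels) with the motion hypothesised from one randomly sampled
    correspondence; the hypothesis with the most inliers wins."""
    rng = random.Random(seed)
    best_mask, best_count = [False] * len(prev_pts), -1
    for _ in range(iters):
        i = rng.randrange(len(prev_pts))
        dx = cur_pts[i][0] - prev_pts[i][0]   # sampled translation model
        dy = cur_pts[i][1] - prev_pts[i][1]
        mask = [
            ((cx - px - dx) ** 2 + (cy - py - dy) ** 2) ** 0.5 < thresh
            for (px, py), (cx, cy) in zip(prev_pts, cur_pts)
        ]
        if sum(mask) > best_count:
            best_mask, best_count = mask, sum(mask)
    return best_mask
```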
(3) The unmanned aerial vehicle acquires coordinate information of the plurality of feature points in the image frame.
The coordinate information may include pixel coordinates of the feature point in the image frame; alternatively, the coordinate information may include image coordinates of the feature point in the image frame. Accordingly, after determining the pixel coordinates of the feature points, the drone may convert the pixel coordinates into image coordinates.
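The pixel-to-image conversion mentioned here is the standard inversion of the pinhole intrinsics; the intrinsic parameters `fx, fy, cx, cy` are assumed to be known from camera calibration:

```python
def pixel_to_image_coords(u, v, fx, fy, cx, cy):
    """Convert pixel coordinates (u, v) to normalized image
    coordinates by removing the principal point (cx, cy) and
    dividing by the focal lengths (fx, fy)."""
    return (u - cx) / fx, (v - cy) / fy
```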
In the embodiment of the application, the feature points which do not accord with the coplanarity assumption are removed from the extracted feature points to be processed, so that the remaining feature points are ensured to accord with the coplanarity assumption, and data support is provided for improving the accuracy of depth information determination of the feature points.
Step 302: the unmanned aerial vehicle acquires the first depth information of the target point in the image frame collected by the single-point ranging sensor, where the target camera and the single-point ranging sensor are both installed on the unmanned aerial vehicle.
The first depth information is the distance between the single-point ranging sensor and the reference surface, namely the depth of the center point of the reference surface; correspondingly, the image frame is an image of the reference surface acquired by the target camera, and the target point is a central point of the image frame.
Step 303: the unmanned aerial vehicle acquires second depth information of the plurality of feature points in the key image frame and first pose information of the unmanned aerial vehicle in the image frame.
The key image frame is a reference image frame that the unmanned aerial vehicle determines from a plurality of adjacent image frames collected by the target camera. In the embodiment of the application, because the sampling frequency at which the target camera acquires image frames is high, the pose change of the unmanned aerial vehicle between two adjacent image frames is generally small. Within a certain pose change range, each newly obtained image frame is aligned only with one specific frame to estimate the current pose; when the pose change exceeds that range, a new specific frame is adopted for alignment in the next stage. These specific frames used for image alignment are the key image frames.
Optionally, an implementation of determining the key image frame by the drone includes: if the number of feature points newly extracted from the currently acquired image frame exceeds a preset number, determining that image frame as a key image frame; or, if the average pixel moving distance of the old feature points exceeds a preset distance, determining the currently acquired image frame as the key image frame, where the average pixel moving distance is the average distance, in pixels, that the old feature points have moved between the last key image frame and the currently acquired image frame.
It should be noted that the process of determining the key image frame by the drone may also include other implementations, and the application is not limited herein.
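The two optional key-frame criteria above might be sketched as follows; the threshold values are illustrative assumptions only:

```python
def is_key_frame(num_new_features, avg_pixel_motion,
                 max_new_features=50, max_motion_px=40.0):
    """Hypothetical key-frame test mirroring the two criteria above:
    promote the current frame to a key frame if many feature points
    were newly extracted, or if the old feature points drifted too
    far on average since the last key image frame."""
    return (num_new_features > max_new_features
            or avg_pixel_motion > max_motion_px)
```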
Wherein the first pose information is information representing the relative position and relative attitude of the drone at the image frame with respect to the key image frame. In some embodiments, the process of the drone determining the first pose information of the drone in the image frame includes: the drone acquires second pose information of the drone, where the second pose information is the pose information of the drone at the key image frame; and the drone predicts the pose information of the drone in the image frame based on the second pose information to obtain the first pose information.
In this embodiment, the drone predicts the first pose information from the second pose information, since the pose information at the key image frame is known. For example, the drone predicts the pose information of the drone in the image frame based on the second pose information through an EKF (Extended Kalman Filter).
In the embodiment of the application, because the pose information of the unmanned aerial vehicle at the key image frame is known in the traveling process of the unmanned aerial vehicle, the relative pose information of the unmanned aerial vehicle at the image frame relative to the key image frame can be predicted by combining the pose information, so that data support can be provided for the determination of the subsequent depth information.
In this embodiment, the drone may determine third depth information of the plurality of feature points in the image frame based on the second depth information of the plurality of feature points in the key image frame, the first pose information, the coordinate information of the plurality of feature points, and the first depth information of the target point. It should be noted that, since there are a plurality of feature points in the image frame, and any feature point in the plurality of feature points may or may not be a feature point observed for the first time in the image frame, after step 303, the drone performs the operation of step 304 or performs the operations of steps 305 to 306 for any feature point.
Step 304: for any feature point, if the feature point is a feature point observed for the first time in the image frame, the unmanned aerial vehicle uses the first depth information of the target point as the third depth information of the feature point.
If the target camera in the embodiment of the application is a monocular camera, only one image frame is shot by the monocular camera at any given time; however, from a single image frame, the drone cannot determine the depth information of the feature points in that image frame. Therefore, the drone can take the first depth information of the target point collected by the single-point ranging sensor as the third depth information of the feature point, thereby determining the initial depth value of the feature point.
It should be noted that if the feature point is observed in other image frames subsequently, it indicates that the feature point is no longer the first observed feature point, and the drone may update the depth information of the feature point through the operations of steps 305 to 306.
In the embodiment of the application, the sampling frequency of the single-point ranging sensor and the sampling frequency of the target camera may be the same or different; accordingly, the first case: if the sampling frequency of the single-point ranging sensor is the same as that of the target camera, the first depth information acquired by the single-point ranging sensor is the depth information of the central point of the image frame.
In this case, the drone may use the depth information of the central point as third depth information of the feature point observed for the first time in the image frame.
In the second case: if the sampling frequency of the single-point ranging sensor is different from the sampling frequency of the target camera, the first depth information acquired by the single-point ranging sensor is not the depth information of the central point of the image frame.
In this case, for the image frame, the drone may select, from the first depth information of the target point collected by the single-point ranging sensor, the sample whose sampling time is closest to the time at which the image frame was collected, and use that first depth information as the third depth information of the feature point observed for the first time in the image frame.
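For the second case, the nearest-timestamp selection can be sketched as follows (a minimal illustration; the tuple-based sample format is an assumption):

```python
def nearest_depth_sample(frame_time, depth_samples):
    """Pick the single-point ranging reading whose sampling time is
    closest to the image frame's capture time.
    depth_samples: list of (timestamp, depth) tuples."""
    return min(depth_samples, key=lambda s: abs(s[0] - frame_time))[1]
```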
Step 305: if the feature point is not the feature point observed for the first time in the image frame, the unmanned aerial vehicle determines fourth depth information of the feature point in the image frame based on the first depth information and the second depth information of the target point.
In a possible implementation, the process of determining, by the drone, fourth depth information of the feature point in the image frame based on the first depth information and the second depth information of the target point, that is, step 305 may include at least the following two implementations:
the first implementation mode comprises the following steps: if the position of the feature point in the image frame is in a preset area around the target point, the unmanned aerial vehicle takes the first depth information of the target point as the fourth depth information.
The preset area can be set and changed according to needs, and the preset area is not particularly limited in the application; for example, the preset region may be a region with a preset number of pixels from the position where the target point is located, and the preset number of pixels may be 10 pixels, 15 pixels, 30 pixels, or the like.
In this implementation manner, since the precision of the first depth information of the target point acquired by the single-point ranging sensor is high, the unmanned aerial vehicle may use the first depth information as the fourth depth information of a feature point near the target point in the image frame, so that the fourth depth information is determined with high precision.
The second implementation mode comprises the following steps: if the position of the feature point in the image frame is not located in a preset area around the target point, the unmanned aerial vehicle takes the second depth information as the fourth depth information.
In this case, since the depth information of the feature point in the key image frame is known, the drone may use the second depth information that has been determined in the key image frame as the fourth depth information of the same feature point in the image frame.
In the embodiment of the application, the accuracy of the single-point ranging sensor is high, so that the depth determination accuracy can be improved by determining the initial depth value of the feature point near the target point by combining the depth information obtained by the single-point ranging sensor; on the other hand, for other feature points which are repeatedly observed and matched with the key image frame, the initial depth value can be determined by combining the depth information determined when the key image frame is processed, so that the efficiency of depth determination is improved on the basis of ensuring higher accuracy of depth determination.
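The two implementations of step 305 can be sketched together; the pixel radius of the preset area is an assumed value:

```python
def initial_depth(feature_px, target_px, tof_depth, keyframe_depth,
                  radius_px=15):
    """Sketch of step 305: a feature point falling inside a preset
    pixel radius around the ranged target point inherits the accurate
    single-point-sensor depth (first depth information); any other
    repeatedly observed feature point reuses the depth already
    estimated at the key image frame (second depth information)."""
    du = feature_px[0] - target_px[0]
    dv = feature_px[1] - target_px[1]
    inside = (du * du + dv * dv) ** 0.5 <= radius_px
    return tof_depth if inside else keyframe_depth
```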
In some embodiments, after step 305, the method for determining depth information of a drone provided by the present application further includes the following two implementation manners:
in one possible implementation, if the position of the feature point in the image frame is not within a preset area around the target point, the drone performs the operation of step 306.
For the feature points not located in the preset area around the target point, the fourth depth information of the feature points is determined based on the second depth information of the feature points in the key image frame, and then the unmanned aerial vehicle can update the fourth depth information through the first pose information corresponding to the key image frame.
In another possible implementation manner, if the position of the feature point in the image frame is within a preset area around the target point, the drone obtains a first depth variance of the target point in a plurality of image frames and a second depth variance of a plurality of feature points in the key image frame, where the plurality of image frames include the currently acquired image frame; and updating the fourth depth information based on the first depth variance, the second depth variance and the second depth information to obtain the third depth information.
The plurality of image frames are a plurality of continuous image frames acquired by the target camera, and they include the key image frame and the currently acquired image frame. Since the depth information of the target point of each of the plurality of image frames may be determined by the first depth information acquired by the single-point ranging sensor, the first depth variance is the depth variance of the center points of the plurality of image frames. The second depth variance is the depth variance of a feature point in the key image frame.
In this implementation, the drone may update the fourth depth information based on the first depth variance, the second depth variance, and the second depth information through the EKF, that is, through the following formula one, to obtain the third depth information.
The formula I is as follows: μ_fuse = (σ_obs · μ + σ · μ_obs) / (σ + σ_obs)
wherein μ_fuse is the third depth information; μ is the fourth depth information; μ_obs is the second depth information; σ_obs is the second depth variance; σ is the first depth variance.
In the embodiment of the application, because the depth information acquired by the single-point ranging sensor is accurate, when the depth information of the feature point is updated, the depth of the feature point of the image frame can be updated together with the first depth variance of the depth information acquired by the single-point ranging sensor and the known second depth variance in the key image frame, so that the calculation amount of the EKF update is saved.
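A minimal sketch of this variance-weighted update, assuming the standard scalar Gaussian fusion form for formula one (the lower-variance estimate receives the larger weight):

```python
def fuse_depth(mu, var, mu_obs, var_obs):
    """Variance-weighted scalar Gaussian fusion (assumed form of
    formula one):
      mu      -- fourth depth information (current depth estimate)
      var     -- first depth variance (variance of mu)
      mu_obs  -- second depth information (key-frame observation)
      var_obs -- second depth variance (variance of mu_obs)
    """
    return (var_obs * mu + var * mu_obs) / (var + var_obs)
```

With equal variances the fused depth is simply the midpoint of the two estimates; with unequal variances it is pulled toward the more certain one.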
Step 306: and the unmanned aerial vehicle updates the fourth depth information of the feature point in the image frame based on the coordinate information of the feature point and the first pose information to obtain the third depth information of the feature point.
In a possible implementation manner, the drone may update the fourth depth information through the EKF, and accordingly, an implementation manner in which the drone performs step 306 may include the following steps (1) - (2):
(1) the unmanned aerial vehicle determines the normalized coordinates of the feature point in the image frame based on the coordinate information of the feature point, determines the partial derivative of the normalized coordinates with respect to the fourth depth information, and takes the partial derivative as the observation matrix.
Wherein, the unmanned aerial vehicle can realize the operation of step (1) through the following formula two:
The formula II is as follows: H = ∂m / ∂z_a
wherein H is the observation matrix; m is the normalized coordinate of the feature point in the image frame, obtained from the coordinate information of the feature point in the image frame; z_a is the second depth information of the feature point in the key image frame, i.e. the fourth depth information in the image frame.
For any feature point which is not observed for the first time, the conversion relationship between the coordinate information in the image frame and the coordinate information in the key image frame is shown in the following formula three:
The formula III is as follows: z_c · m = R_c_a · (z_a · m_a) + t_c_a
wherein R_c_a and t_c_a represent the first pose information, i.e. the relative attitude and position of the image frame with respect to the key image frame; m_a represents the normalized coordinates of the feature point in the key image frame; z_c is the depth of the feature point in the image frame.
Accordingly, writing r = R_c_a · m_a and p = z_a · r + t_c_a = (p_x, p_y, p_z), the observation matrix can also be expressed in the form of equation four:
The formula four is as follows: H = ∂m / ∂z_a = ( (r_x · p_z − p_x · r_z) / p_z² , (r_y · p_z − p_y · r_z) / p_z² )
(2) and the unmanned aerial vehicle corrects the first pose information based on the observation matrix to obtain the third depth information.
In this step, the unmanned aerial vehicle updates the fourth depth information of the feature point through the EKF to obtain the third depth information.
In the embodiment of the application, the observation matrix of the system is determined by combining the depth of the feature points in a loose coupling mode, so that the first pose information corresponding to the key image frame is corrected, the updated depth information of the feature points is obtained, and the calculated amount of depth information determination is reduced.
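The observation-matrix derivative described in step (1) can be checked numerically with a small sketch, assuming the standard pinhole relation p = R_c_a · (z_a · m_a) + t_c_a and a quotient-rule derivative of the normalized projection with respect to z_a (helper name and plain-list types are assumptions). With an identity rotation and zero translation the derivative is zero, since changing the depth then moves the point along its own viewing ray without moving its projection:

```python
def observation_matrix(R_c_a, t_c_a, m_a, z_a):
    """Derivative of the current-frame normalized coordinate with
    respect to the key-frame depth z_a.
    R_c_a: 3x3 rotation (list of rows), t_c_a: translation (len 3),
    m_a: normalized key-frame coordinate (x, y, 1)."""
    # r = R_c_a * m_a : direction the 3-D point moves per unit depth
    r = [sum(R_c_a[i][j] * m_a[j] for j in range(3)) for i in range(3)]
    # p = z_a * r + t_c_a : feature point in the current camera frame
    p = [z_a * r[i] + t_c_a[i] for i in range(3)]
    # quotient rule on the normalized projection (p_x/p_z, p_y/p_z)
    h_u = (r[0] * p[2] - p[0] * r[2]) / (p[2] ** 2)
    h_v = (r[1] * p[2] - p[1] * r[2]) / (p[2] ** 2)
    return [h_u, h_v]
```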
In the embodiment of the application, on one hand, the first depth information acquired by the single-point distance measuring sensor is used as the third depth information of the feature point observed for the first time, so that the problem that the depth of the feature point cannot be determined by a single image frame is solved; on the other hand, for the feature point observed repeatedly, the third depth information of the feature point is determined by combining the first depth information, the first pose information of the key frame and the second depth information of the feature point, so that the accuracy of depth information determination is high.
In the embodiment of the application, after the depth information of a plurality of feature points of the image frame is determined, whether the pose of the unmanned aerial vehicle can be determined according to the depth variance of the plurality of feature points can be determined according to the depth information of the image frame; correspondingly, fig. 4 is a flowchart of a method for determining depth information of a drone, which is provided in an embodiment of the present application, and is executed by the drone, referring to fig. 4, where the method includes:
step 401: the drone determines a third depth variance for the plurality of feature points in the image frame based on the third depth information for each feature point.
In some embodiments, the feature points may all be feature points observed for the first time, or there may be feature points not observed for the first time, and accordingly, the implementation manner of step 401 includes the following two cases:
in the first case, if the plurality of feature points are all feature points observed for the first time in the image frame, the initial depth variance is determined to be the third depth variance.
The determination of the initial depth variance (meas_var) may take two factors into consideration, namely a first depth variance (tof_variance) of the target point acquired by the single-point ranging sensor in a plurality of image frames, and a mean reprojection error (reprojection_error) of the feature points; accordingly, the determination of the initial depth variance may include the following steps:
determining, by the drone, an average reprojection error of the plurality of feature points in the image frame, and a first depth variance of the target point in a plurality of image frames; and taking the product of the average reprojection error, the first depth variance and a preset parameter as the initial depth variance.
In this case, the drone may implement the determination process of the initial depth variance by the following equation five:
the formula five is as follows: meas _ var ═ lambda × tof _ variance × projection _ error
Wherein lambda is a preset parameter. The preset parameter may be a fixed constant; accordingly, the preset parameter may be set and changed as needed, which is not specifically limited in the present application.
In the embodiment of the application, because the feature point may not be a feature point located around the target point, an error caused by using the first depth information acquired by the single-point ranging sensor as the initial depth value of the feature point is large, and an initial depth variance is set for the initial depth value, so that whether the depth is accurate or not can be determined by combining the depth variance in the depth updating process.
In a second case, if there is a feature point among the plurality of feature points that is not observed for the first time in the image frame, a first depth variance of the target point in a plurality of image frames and a second depth variance of the plurality of feature points in the key image frame are obtained, and the third depth variance is determined based on the first depth variance and the second depth variance, where the plurality of image frames include the currently acquired image frame.
In this case, the drone may implement the process of determining the third depth variance based on the first depth variance and the second depth variance by the following equation six:
The formula six is as follows: σ_fuse = (σ · σ_obs) / (σ + σ_obs)
wherein σ_fuse is the third depth variance; σ is the first depth variance; σ_obs is the second depth variance.
In the embodiment of the application, EKF (extended Kalman Filter) updating of the depth variance of the image frame is realized by combining the depth variance of the depth information acquired by the single-point ranging sensor and the depth variance of the feature point in the key image frame, so that the calculated amount in the updating process is saved, and the processing resource of the unmanned aerial vehicle is saved.
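Both cases of step 401 can be sketched in one hypothetical helper, assuming the product-over-sum fusion form for the repeated-observation case (lam stands for the preset parameter lambda; its value is an assumption):

```python
def third_depth_variance(first_observed, tof_variance, reproj_error,
                         kf_variance, lam=1.0):
    """Step 401 sketch covering both cases.
    First case  (formula five): meas_var = lam * tof_variance * reproj_error
    Second case (formula six) : fused variance of the sensor-derived
    first depth variance and the key-frame second depth variance."""
    if first_observed:
        return lam * tof_variance * reproj_error
    return (tof_variance * kf_variance) / (tof_variance + kf_variance)
```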
Step 402: and if the third depth variance is smaller than the preset variance, the unmanned aerial vehicle determines the target pose information of the unmanned aerial vehicle based on the third depth information of each feature point in the image frame.
If the third depth variance is smaller than the preset variance, the state has converged, and the unmanned aerial vehicle can realize the process of determining the target pose information of the unmanned aerial vehicle based on the third depth information of each feature point in the image frame in a PnP (Perspective-n-Point) manner.
In this embodiment of the application, if the depths of the plurality of feature points in the image frame converge, it indicates that the depths of the plurality of feature points are relatively accurate, and at this time, the accuracy of the pose of the unmanned aerial vehicle determined by the third depth information is relatively high.
In the embodiment of the application, the EKF is used for depth estimation, and the depth information of the feature points is added into the state variable of the EKF; therefore, if the state converges, that is, the currently determined third depth information is relatively accurate, the unmanned aerial vehicle can further determine its target pose information based on the third depth information of the feature points, so that a stable scale estimate, and thus a stable pose estimation result, can be obtained even when the unmanned aerial vehicle is at an altitude of a hundred meters. In addition, the depth and the pose are fused in a loosely coupled manner, which reduces the calculation amount and the calculation complexity, so that a small unmanned aerial vehicle can determine accurate depth information even with limited computing resources.
All the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present application, and are not described herein again.
Fig. 5 is a block diagram of a device 500 for determining depth information of an unmanned aerial vehicle according to an embodiment of the present application. Referring to fig. 5, the apparatus 500 includes:
a first determining module 501, configured to acquire an image frame by a target camera, and determine coordinate information of a plurality of feature points from the image frame;
a first obtaining module 502, configured to obtain first depth information of a target point in the image frame, where the first depth information is collected by a single-point ranging sensor, and the target camera and the single-point ranging sensor are both installed on the unmanned aerial vehicle;
a second obtaining module 503, configured to obtain second depth information of the plurality of feature points in the key image frame and first pose information of the drone in the image frame;
a second determining module 504, configured to determine third depth information of the plurality of feature points in the image frame based on the second depth information of the plurality of feature points in the key image frame, the first pose information, the coordinate information of the plurality of feature points, and the first depth information of the target point.
In a possible implementation manner, the second determining module 504 includes:
a first determining unit, configured to, for any feature point, if the feature point is a feature point observed for the first time in the image frame, use first depth information of the target point as third depth information of the feature point; if the feature point is not the feature point observed for the first time in the image frame, determining fourth depth information of the feature point in the image frame based on the first depth information and the second depth information of the target point;
and the updating unit is used for updating the fourth depth information of the feature point in the image frame based on the coordinate information of the feature point and the first position information to obtain the third depth information of the feature point.
In another possible implementation manner, the first determining unit is configured to use the first depth information of the target point as the fourth depth information if the position of the feature point in the image frame is within a preset area around the target point; and if the position of the feature point in the image frame is not in a preset area around the target point, taking the second depth information as the fourth depth information.
In another possible implementation manner, the updating unit is configured to determine a normalized coordinate of the feature point in the image frame based on the coordinate information of the feature point, determine a partial derivative of the normalized coordinate with respect to the fourth depth information, and use the partial derivative as the observation matrix; and correct the first pose information based on the observation matrix to obtain the third depth information.
In another possible implementation manner, the apparatus further includes:
the updating unit is used for updating the fourth depth information of the feature point in the image frame based on the coordinate information of the feature point and the first pose information to obtain the third depth information of the feature point if the position of the feature point in the image frame is not located in a preset area around the target point;
the updating unit is further configured to, if the position of the feature point in the image frame is within a preset area around the target point, obtain a first depth variance of the target point in a plurality of image frames and a second depth variance of a plurality of feature points in the key image frame, where the plurality of image frames include the currently acquired image frame; and updating the fourth depth information based on the first depth variance, the second depth variance and the second depth information to obtain the third depth information.
In another possible implementation manner, the apparatus further includes:
a third determining module for determining a third depth variance of the plurality of feature points in the image frame based on the third depth information of each feature point;
and the fourth determining module is used for determining the target pose information of the unmanned aerial vehicle based on the third depth information of each feature point in the image frame if the third depth variance is smaller than the preset variance.
In another possible implementation manner, the third determining module is configured to determine the initial depth variance as the third depth variance if the plurality of feature points are all feature points observed for the first time in the image frame; and if there is a feature point among the plurality of feature points that is not observed for the first time in the image frame, obtain a first depth variance of the target point in a plurality of image frames and a second depth variance of the plurality of feature points in the key image frame, and determine the third depth variance based on the first depth variance and the second depth variance, where the plurality of image frames include the currently acquired image frame.
In another possible implementation manner, the apparatus further includes:
a fifth determining module for determining an average reprojection error of the plurality of feature points in the image frame and a first depth variance of the target point in the plurality of image frames; and taking the product of the average reprojection error, the first depth variance and a preset parameter as the initial depth variance.
In another possible implementation manner, the apparatus further includes: a sixth determining module, configured to obtain second pose information of the drone, where the second pose information is the pose information of the drone at the key image frame; and predict the pose information of the drone in the image frame based on the second pose information to obtain the first pose information.
In another possible implementation manner, the first determining module 501 is configured to perform feature point extraction on the image frame to obtain a plurality of feature points to be processed; removing the characteristic points which do not accord with the coplanarity hypothesis from the plurality of characteristic points to be processed to obtain a plurality of characteristic points; coordinate information of the plurality of feature points in the image frame is acquired.
In the embodiment of the application, in the process of performing visual positioning on the unmanned aerial vehicle, the accuracy of the depth information acquired by the single-point distance measuring sensor is high, so that the third depth information of a plurality of feature points in the image frame is determined by combining the first depth information acquired by the single-point distance measuring sensor and the second depth information of the feature points in the known key image frame, so that accurate depth information can be obtained even if the unmanned aerial vehicle is in the air, and the accuracy of the depth information determination in the visual positioning process is further improved.
It should be noted that: the determining apparatus for depth information of an unmanned aerial vehicle provided in the above-mentioned embodiment is only illustrated by dividing each of the above-mentioned functional modules when determining the depth information, and in practical application, the above-mentioned function distribution may be completed by different functional modules as required, that is, the internal structure of the terminal is divided into different functional modules, so as to complete all or part of the above-described functions. In addition, the determining apparatus for depth information of an unmanned aerial vehicle provided by the above embodiment and the determining method embodiment of depth information of an unmanned aerial vehicle belong to the same concept, and specific implementation processes thereof are detailed in the method embodiment and are not described herein again.
The embodiment of the application also provides an unmanned aerial vehicle. Fig. 6 shows a block diagram of a structure of a drone 600 provided in an embodiment of the present application. Generally, the drone 600 may have a relatively large difference due to different configurations or performances, and may include one or more processors (CPUs) 601 and one or more memories 602, where the memories 602 are used for storing executable instructions, and the processors 601 are configured to execute the executable instructions to implement the method for determining the depth information of the drone provided by the method embodiments. Of course, this unmanned aerial vehicle 600 can also have parts such as wired or wireless network interface, keyboard and input/output interface to carry out input/output, this unmanned aerial vehicle 600 can also include other parts that are used for realizing the equipment function, do not describe herein any more.
The embodiment of the application also provides a computer-readable storage medium, where at least one program code is stored in the computer-readable storage medium, and the at least one program code is loaded and executed by a processor, so as to implement the method for determining depth information of an unmanned aerial vehicle described above.
The present application further provides a computer program product, and when a processor of the drone executes program codes in the computer program product, the drone is enabled to execute the method for determining depth information of the drone.
In some embodiments, the program code related to embodiments of the present application may be executed by a drone, or by a plurality of drones located at a site, or by a plurality of drones distributed at a plurality of sites and interconnected by a communication network, which may form a blockchain system.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.