CN112150529A - Method and device for determining depth information of image feature points

Info

Publication number
CN112150529A
Authority
CN
China
Prior art keywords
frame image
camera pose
information
depth information
gray
Prior art date
Legal status
Granted
Application number
CN201910570786.3A
Other languages
Chinese (zh)
Other versions
CN112150529B (en)
Inventor
杨帅
Current Assignee
Beijing Horizon Robotics Technology Research and Development Co Ltd
Original Assignee
Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority to CN201910570786.3A
Publication of CN112150529A
Application granted
Publication of CN112150529B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/50: Depth or shape recovery
    • G06T7/55: Depth or shape recovery from multiple images
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/46: Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462: Salient features, e.g. scale invariant feature transforms [SIFT]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10016: Video; Image sequence
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20084: Artificial neural networks [ANN]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30244: Camera pose
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are a method, an apparatus, a computer-readable storage medium, and an electronic device for determining depth information of image feature points. The method includes: judging whether the current frame image meets a first preset condition, and if so, determining first depth information corresponding to a first feature point in the current frame image according to a depth prediction model; acquiring first grayscale information and a first camera pose corresponding to the current frame image; judging whether a subsequent frame image of the current frame image meets the first preset condition, and if it does not, acquiring second grayscale information and a second camera pose corresponding to the subsequent frame image; and acquiring optimized first depth information according to the first grayscale information, the second grayscale information, the first camera pose, the second camera pose, and the first depth information. Because the depth information of the image feature points is first obtained through the depth prediction model and then further optimized, the resulting depth information is highly accurate.

Description

Method and device for determining depth information of image feature points
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for determining depth information of image feature points.
Background
When performing three-dimensional reconstruction of a scene structure in space, acquiring information about that scene structure is essential, and various sensors are widely used for this purpose. Among them, the camera has attracted increasing attention because it is inexpensive and the images it captures carry rich information about the scene structure.
When a camera captures an image, a spatial point carrying three-dimensional information is converted into a pixel carrying two-dimensional information, so one dimension of information about the spatial point, namely its depth, is lost. When the depth corresponding to a pixel in a camera image is estimated by current methods, the result often suffers from scale uncertainty, so the determined depth information of image feature points is not very accurate.
Disclosure of Invention
The present application is proposed to solve the above technical problems. Embodiments of the present application provide a method and an apparatus for determining depth information of image feature points, a computer-readable storage medium, and an electronic device. The depth information of image feature points is obtained through a depth prediction model and then optimized based on the grayscale information of the images, and the resulting optimized depth information is highly accurate.
According to a first aspect of the present application, there is provided a method for determining depth information of an image feature point, including:
judging whether the current frame image meets a first preset condition, and if so, determining first depth information corresponding to a first feature point in the current frame image according to a pre-acquired depth prediction model;
acquiring first grayscale information and a first camera pose corresponding to the current frame image;
judging whether a subsequent frame image of the current frame image meets the first preset condition, and if the subsequent frame image does not meet the first preset condition, acquiring second grayscale information and a second camera pose corresponding to the subsequent frame image;
and acquiring optimized first depth information according to the first grayscale information, the second grayscale information, the first camera pose, the second camera pose, and the first depth information.
According to a second aspect of the present application, there is provided an image feature point depth information determination apparatus including:
the depth information determining module is used for judging whether the current frame image meets a first preset condition, and if so, determining first depth information corresponding to a first feature point in the current frame image according to a pre-acquired depth prediction model;
the first acquisition module is used for acquiring first grayscale information and a first camera pose corresponding to the current frame image;
the second acquisition module is used for judging whether a subsequent frame image of the current frame image meets the first preset condition, and if it does not, acquiring second grayscale information and a second camera pose corresponding to the subsequent frame image;
and the optimization module is used for acquiring optimized first depth information according to the first grayscale information, the second grayscale information, the first camera pose, the second camera pose, and the first depth information.
According to a third aspect of the present application, there is provided a computer-readable storage medium storing a computer program for executing the above-described depth information determination method for image feature points.
According to a fourth aspect of the present application, there is provided an electronic apparatus comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is used for reading the executable instruction from the memory and executing the instruction to realize the depth information determination method of the image feature point.
Compared with the prior art, the method, apparatus, computer-readable storage medium, and electronic device for determining depth information of image feature points provided by the present application have at least the following beneficial effects:
On one hand, the depth information of the image feature points is obtained using the depth prediction model. This depth information has an absolute scale, that is, it reflects the real physical scale of the scene structure in space. The depth information is then optimized based on the grayscale information and camera poses corresponding to the images, and the resulting optimized depth information is highly accurate.
On the other hand, the captured images are screened, and the depth prediction model is used to determine the depth information of image feature points only when a captured image meets the preset condition. This avoids running depth prediction on every captured image, effectively reduces the amount of computation, and improves the efficiency of determining the depth information of image feature points.
Drawings
The above and other objects, features and advantages of the present application will become more apparent by describing in more detail embodiments of the present application with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the principles of the application. In the drawings, like reference numbers generally represent like parts or steps.
Fig. 1 is a schematic flowchart of a depth information determining method for image feature points according to an exemplary embodiment of the present application;
fig. 2 is a flowchart further included before step 20 of a method for determining depth information of an image feature point according to an exemplary embodiment of the present application;
fig. 3 is a flowchart illustrating step 20 in a depth information determining method for image feature points according to an exemplary embodiment of the present application;
fig. 4 is a flowchart illustrating a step 80 in a depth information determining method for image feature points according to an exemplary embodiment of the present application;
fig. 5 is a flowchart illustrating a step 801 in a depth information determining method for image feature points according to an exemplary embodiment of the present application;
fig. 6 is a flowchart further included after step 802 in the method for determining depth information of image feature points according to an exemplary embodiment of the present application;
fig. 7 is a schematic structural diagram of an apparatus for determining depth information of image feature points according to a first exemplary embodiment of the present application;
fig. 8 is a schematic structural diagram of an apparatus for determining depth information of image feature points according to a second exemplary embodiment of the present application;
fig. 9 is a schematic structural diagram of an apparatus for determining depth information of image feature points according to a third exemplary embodiment of the present application;
fig. 10 is a schematic structural diagram of an apparatus for determining depth information of image feature points according to a fourth exemplary embodiment of the present application;
fig. 11 is a schematic structural diagram of an optimization unit 741 in a depth information determination apparatus for image feature points according to a fourth exemplary embodiment of the present application;
fig. 12 is a schematic structural diagram of an apparatus for determining depth information of image feature points according to a fifth exemplary embodiment of the present application;
fig. 13 is a schematic structural diagram of the map construction unit 743 in the depth information determining apparatus for image feature points according to the fifth exemplary embodiment of the present application;
fig. 14 is a block diagram of an electronic device provided in an exemplary embodiment of the present application.
Detailed Description
Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be understood that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and that the present application is not limited by the example embodiments described herein.
Summary of the application
Acquiring information about a scene structure in space is crucial for three-dimensional reconstruction. The camera has attracted increasing attention because it is inexpensive and the images it captures carry rich information about the scene structure; however, an image captured by a camera provides only image-space information, not the corresponding depth information. At present, when the depth information corresponding to image-space information is estimated, the obtained depth information suffers from scale uncertainty, so the determined depth information of image feature points is not very accurate.
In the method for determining depth information of image feature points provided herein, the depth information of image feature points is obtained using a depth prediction model. This depth information has an absolute scale and reflects the real physical scale of the scene structure in space. It is taken as an initial value and optimized based on the grayscale information and camera poses corresponding to the images, so the optimized depth information retains the absolute scale and is highly accurate. Moreover, in this embodiment the captured images are screened, which avoids running depth prediction on every captured image, effectively reduces the amount of computation, and improves the efficiency of determining the depth information of image feature points.
Having described the basic concepts of the present application, various non-limiting embodiments of the present solution are described in detail below with reference to the accompanying drawings.
Exemplary method
Fig. 1 is a flowchart illustrating a method for determining depth information of an image feature point according to an exemplary embodiment of the present application.
This embodiment can be applied to an electronic device, in particular a server or a general-purpose computer. As shown in fig. 1, a method for determining depth information of image feature points according to an exemplary embodiment of the present application includes at least the following steps:
step 20: and judging whether the current frame image meets a first preset condition, and if so, determining first depth information corresponding to a first characteristic point in the current frame image according to a depth prediction model acquired in advance.
In this embodiment, the current frame image is judged according to the first preset condition, and only when the current frame image meets the first preset condition, the first depth information corresponding to the first feature point in the current frame image can be determined according to the depth prediction model, so that depth prediction of all images acquired by the camera according to the depth prediction model is avoided, the calculated amount can be effectively reduced, and the depth information determination efficiency of the image feature point is improved.
In a possible implementation, the depth prediction model is a convolutional neural network trained on labeled samples; given an input image, it outputs the depth information corresponding to each pixel. This depth information has an absolute scale and reflects the real physical scale of the scene structure in space.
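As a concrete illustration, the following is a minimal sketch of running such a model on one frame. The network interface, the input normalization, and the output layout are assumptions made for illustration; they are not the model disclosed in this application.

```python
# Hypothetical sketch of CNN depth inference; the model interface is assumed.
import numpy as np
import torch

def predict_depth(model: torch.nn.Module, image: np.ndarray) -> np.ndarray:
    """Return a dense metric depth map (HxW) for an HxWx3 image, assuming the
    network outputs absolute-scale depth as a 1x1xHxW tensor."""
    tensor = torch.from_numpy(image).float().permute(2, 0, 1) / 255.0
    with torch.no_grad():
        depth = model(tensor.unsqueeze(0))
    return depth.squeeze().cpu().numpy()
```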
Specifically, the first preset condition corresponds to the generation condition of a key frame image; that is, if the current frame image meets the first preset condition, it is taken as a key frame image. For example, a number of interval frames may be preset, and the current frame image meets the first preset condition whenever the number of frames between it and the previous key frame image reaches that preset number. Alternatively, the Euclidean distance between the camera pose of the current frame image and that of the previous key frame image may be computed and compared with a first preset threshold; when the distance exceeds the threshold, the current frame image meets the first preset condition. As a further alternative, semantic segmentation may be performed on the current frame image, and the photometric difference between the semantic information of the current frame image and that of the previous key frame image may be evaluated; when this difference exceeds a second preset threshold, the current frame image meets the first preset condition. This embodiment does not limit the content of the first preset condition, as long as it can decide whether the current frame image becomes a key frame image.
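A minimal sketch of such a key frame check is given below. The three branches mirror the examples just described, except that the photometric branch compares raw intensities rather than semantic segmentation results, a simplification; all threshold values are assumed.

```python
import numpy as np

FRAME_GAP = 10        # preset number of interval frames (assumed value)
DIST_THRESH = 0.5     # first preset threshold on pose distance, meters (assumed)
PHOTO_THRESH = 20.0   # second preset threshold on photometric difference (assumed)

def is_keyframe(frame_idx, last_kf_idx, t_cur, t_last_kf, gray_cur, gray_last_kf):
    """Return True when the current frame meets any of the example criteria
    for the first preset condition."""
    if frame_idx - last_kf_idx >= FRAME_GAP:                # interval-frame criterion
        return True
    if np.linalg.norm(t_cur - t_last_kf) > DIST_THRESH:     # Euclidean pose distance
        return True
    diff = np.abs(gray_cur.astype(np.float32) - gray_last_kf.astype(np.float32)).mean()
    return diff > PHOTO_THRESH                              # photometric-difference proxy
```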
Step 40: acquiring first grayscale information and a first camera pose corresponding to the current frame image.
The images captured by a camera are usually color images, which are inconvenient for computer processing and subsequent calculation. The current frame image therefore usually needs to be preprocessed to obtain its first grayscale information, from which information about the corresponding scene structure in space can be determined.
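This preprocessing is a single conversion step; a sketch using OpenCV, which is an implementation choice rather than something mandated by the application:

```python
import cv2

def to_gray(frame_bgr):
    """Convert a color camera frame to the single-channel grayscale image used
    as the frame's grayscale information."""
    return cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
```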
Because the accuracy of the first depth information determined by the depth prediction model is limited, the first depth information needs to be optimized. In this embodiment it is optimized based on the camera poses corresponding to multiple frames, so the first camera pose corresponding to the current frame image must be acquired.
Step 60: judging whether a subsequent frame image of the current frame image meets the first preset condition, and if it does not, acquiring second grayscale information and a second camera pose corresponding to the subsequent frame image.
At this point the current frame image meets the first preset condition and the first depth information corresponding to its first feature point has been determined. When a subsequent frame image does not meet the first preset condition, that subsequent frame is used to optimize the first depth information, so its second grayscale information and second camera pose need to be acquired.
Step 80: acquiring optimized first depth information according to the first grayscale information, the second grayscale information, the first camera pose, the second camera pose, and the first depth information.
After the first depth information corresponding to the first feature point in the current frame image is obtained, it serves as the initial value of that feature point's depth. The depth distribution of the first feature point is then determined using the first camera pose of the current frame image, which meets the first preset condition, and the second camera pose of the subsequent frame image, which does not, yielding the optimized first depth information. Note that if more than one subsequent frame image fails the first preset condition, the first depth information of the first feature point is optimized against each such subsequent frame in turn, so the optimization proceeds continuously.
Note also that any image captured by the camera may become the current frame image of this embodiment. For example, if the first frame image is the current frame image and meets the first preset condition, the first depth information of its first feature point is determined; if the second and third frame images, as its subsequent frames, do not meet the first preset condition, each of them is used in turn to optimize that first depth information.
If the current frame image is instead found not to meet the first preset condition, its first grayscale information and first camera pose are still acquired. If a previous key frame image exists and the depth information of its feature points has been determined by the depth prediction model, the first camera pose and first grayscale information of the current frame image are used to optimize that depth information; in other words, the current frame image now acts as a subsequent frame of the previous key frame image. If a later frame is then found to meet the first preset condition, the depth information of its feature points is determined by the pre-acquired depth prediction model and optimized using the images that follow it.
In summary, whenever a frame image is acquired, it is checked against the first preset condition. If it meets the condition, the depth information of its feature points is determined by the pre-acquired depth prediction model and then optimized using subsequent images that do not meet the condition; if it does not meet the condition, its grayscale information and camera pose are used to optimize the depth information of the feature points in the previous key frame image. Each acquired image is thus used either to determine new depth information or to optimize that of the previous key frame, and the procedure iterates continuously.
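The loop just summarized can be expressed as the following driver sketch, which glues together the helper functions sketched with the individual steps (to_gray, is_keyframe, select_features, predict_depth, track_pose); the Keyframe container and the frame-gap variant of the first preset condition are illustrative assumptions.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Keyframe:
    gray: np.ndarray     # first grayscale information
    points: np.ndarray   # Nx2 (u, v) pixel coordinates of the first feature points
    depths: np.ndarray   # N initial depth values from the depth prediction model

def process_stream(frames, depth_model, K, frame_gap=10):
    """Each frame either becomes a key frame (depths predicted by the CNN) or
    is used to refine the current key frame, mirroring the loop above."""
    kf, kf_idx = None, None
    for idx, frame in enumerate(frames):
        gray = to_gray(frame)
        if kf is None or idx - kf_idx >= frame_gap:   # first preset condition (frame-gap variant)
            pts = select_features(gray)
            depth_map = predict_depth(depth_model, frame)
            kf = Keyframe(gray, pts, depth_map[pts[:, 1], pts[:, 0]])
            kf_idx = idx
        else:
            xi = track_pose(kf, gray, K)  # second camera pose by direct tracking
            # Steps 801-802 would refine the camera poses and then the key frame
            # depths here, e.g. with refine_poses(...) and refine_depth(...) below.
    return kf
```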
The method for determining depth information of image feature points provided by this embodiment has at least the following beneficial effects:
On one hand, the depth information of the image feature points is obtained using the depth prediction model. This depth information has an absolute scale, that is, it reflects the real physical scale of the scene structure in space. The depth information is then optimized based on the grayscale information and camera poses corresponding to the images, and the resulting optimized depth information is highly accurate.
On the other hand, the captured images are screened, and the depth prediction model is used to determine the depth information of image feature points only when a captured image meets the preset condition. This avoids running depth prediction on every captured image, effectively reduces the amount of computation, and improves the efficiency of determining the depth information of image feature points.
Fig. 2 is a schematic flow chart of the steps that, in the embodiment shown in fig. 1, precede the judgment of whether the current frame image meets the first preset condition.
As shown in fig. 2, on the basis of the embodiment shown in fig. 1, in an exemplary embodiment of the present application, before the step of determining whether the current frame image satisfies the first preset condition shown in step 20, the method may further include the following steps:
step 101: and acquiring a third camera pose of a previous frame image, wherein the previous frame image meets a first preset condition.
When the camera pose corresponding to different frame images is used for optimizing the first depth information, the accurate camera pose can enable the accuracy of the acquired optimized first depth information to be higher, so that when the current frame image is acquired, the accurate first camera pose needs to be determined.
Step 102: and determining a second feature point of the previous frame image, wherein the second feature point is at least one feature point of the previous frame image, and the gray gradient of the feature point meets a second preset condition.
In this embodiment, the first camera pose corresponding to the current frame image is determined based on the assumption that the luminance is unchanged, so that a second feature point, that is, a key point, in the previous frame image needs to be determined, the feature point can be generally selected according to the gray gradient, a third preset threshold is set, and when the gray gradient is greater than the third preset threshold, a pixel point corresponding to the gray gradient is determined as the feature point. It should be noted that, because the previous frame image satisfies the first preset condition, the depth information of the feature point in the previous frame image is determined, and the second feature point here may be the feature point for which the depth information is determined, so that the feature point may be prevented from being selected multiple times in the same frame image.
Step 103: acquiring third grayscale information of the second feature point, and acquiring the first grayscale information corresponding to a first projection point of the second feature point on the current frame image.
Under the photometric-invariance assumption, pixels corresponding to the same spatial point have the same photometric value in consecutive frames. Therefore, to determine the first camera pose of the current frame image, the third grayscale information of the second feature point is determined, the second feature point is projected from the previous frame image onto the current frame image, and the first grayscale information of the resulting first projection point is determined.
Step 104: determining the first camera pose corresponding to the current frame image according to the third camera pose corresponding to the previous frame image and a first grayscale error function between the third grayscale information and the first grayscale information corresponding to the first projection point.
The second feature point and the first projection point correspond to the same spatial point, so under the photometric-invariance assumption their grayscale values should in theory be identical. In practice, however, the third grayscale information of the second feature point and the first grayscale information of the first projection point often differ considerably, for example because the camera pose used in computing the first projection point is inaccurate. A first grayscale error function between the two is therefore constructed, and the first camera pose of the current frame image is determined by minimizing this function. The first camera pose obtained in this way is comparatively accurate, so the optimized first depth information later acquired with it is also highly accurate.
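A sketch of this direct pose estimation follows. It parameterizes the relative pose as a 6-vector (axis-angle rotation plus translation) and minimizes the photometric residual with SciPy; sub-pixel interpolation and robust weighting are omitted, the Keyframe container comes from the driver sketch above, and all conventions are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def photometric_residual(xi, pts, depths, gray_ref, gray_cur, K):
    """r_i = I_ref(p_i) - I_cur(project(p_i)); xi = [rotvec, t] is the relative
    pose taking reference-camera coordinates to current-camera coordinates."""
    R = Rotation.from_rotvec(xi[:3]).as_matrix()
    t = xi[3:]
    K_inv = np.linalg.inv(K)
    res = []
    for (u, v), d in zip(pts, depths):
        X_ref = d * K_inv @ np.array([u, v, 1.0])        # back-project with depth
        X_cur = R @ X_ref + t
        if X_cur[2] <= 0:
            res.append(0.0)                              # behind the camera
            continue
        uv = K @ (X_cur / X_cur[2])                      # re-project into current frame
        ui, vi = int(round(uv[0])), int(round(uv[1]))
        if 0 <= vi < gray_cur.shape[0] and 0 <= ui < gray_cur.shape[1]:
            res.append(float(gray_ref[v, u]) - float(gray_cur[vi, ui]))
        else:
            res.append(0.0)                              # point left the image
    return np.asarray(res)

def track_pose(kf, gray_cur, K, xi0=None):
    """Estimate the current camera pose by minimizing the grayscale error
    function, starting from a coarse initial pose (e.g., from an IMU)."""
    if xi0 is None:
        xi0 = np.zeros(6)
    sol = least_squares(photometric_residual, xi0,
                        args=(kf.points, kf.depths, kf.gray, gray_cur, K))
    return sol.x
```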
In one possible implementation, a coarse camera pose for the current frame image is first obtained from a positioning device such as an inertial measurement unit or a satellite positioning device. This coarse pose serves as the initial value and is refined, using the third camera pose of the previous frame image and grayscale-error minimization, into the first camera pose of the current frame image.
Note that before the second camera pose of the subsequent frame image is acquired, the method further includes: acquiring second grayscale information corresponding to a third projection point, on the subsequent frame image, of the first feature point of the current frame image; and determining the second camera pose of the subsequent frame image according to the first camera pose of the current frame image and the grayscale error between the first grayscale information of the first feature point and the second grayscale information of the third projection point. That is, whenever a frame image is acquired, its camera pose can be determined from the previous key frame image.
In this embodiment, before it is judged whether the current frame image meets the first preset condition, the first camera pose of the current frame image is determined, under the photometric-invariance assumption, by starting from the third camera pose of the previous frame image and minimizing the first grayscale error function between the third grayscale information of the second feature point and the first grayscale information of the first projection point. The first camera pose obtained in this way is comparatively accurate, and so is the optimized first depth information later acquired with it.
Fig. 3 is a schematic flow chart illustrating a process of determining first depth information corresponding to a first feature point in a current frame image according to a depth prediction model obtained in advance in the embodiment shown in fig. 1.
As shown in fig. 3, on the basis of the embodiment shown in fig. 1, in an exemplary embodiment of the present application, the step of determining the first depth information corresponding to the first feature point in the current frame image shown in step 20 may specifically include the following steps:
step 201: and determining first depth information of pixel points in the current frame image according to a depth prediction model acquired in advance.
The depth prediction model is based on depth prediction realized by a convolutional neural network, training a training sample by the convolutional neural network, and outputting depth information corresponding to pixel points in an image according to an input image, so that first depth information of all pixel points in a current frame image can be determined by using the depth prediction model, namely a depth map corresponding to the current frame image can be obtained.
Step 202: and selecting a first characteristic point from the pixel points according to the gray gradient of the pixel points in the current frame image.
Although the depth prediction model is used for determining the first depth information of all the pixel points in the current frame image, the first depth information of all the pixel points is not required to be optimized, so that the pixel points need to be selected, the first feature points are selected, the first feature points can be selected according to the gray scale gradient of the pixel points in the current frame image, the pixel points with the gray scale gradient larger than the third preset threshold value are determined as the first feature points, and the selection of the first feature points can effectively improve the determination efficiency of the optimized first depth information.
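A sketch of this gradient-threshold selection, with the threshold value assumed for illustration:

```python
import cv2
import numpy as np

GRAD_THRESH = 30.0   # third preset threshold on the gray gradient (assumed value)

def select_features(gray):
    """Return the Nx2 (u, v) pixel coordinates whose gray-gradient magnitude
    exceeds the third preset threshold."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    mag = np.sqrt(gx * gx + gy * gy)
    vs, us = np.nonzero(mag > GRAD_THRESH)
    return np.stack([us, vs], axis=1)
```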
Step 203: determining the first depth information corresponding to the first feature point.
After the first feature point is selected, the first depth information corresponding to it is determined.
In this embodiment, after the depth prediction model determines the first depth information of the pixels in the current frame image, the first feature points are selected because not every pixel carries useful information. This ensures that only the useful information in the current frame image enters the subsequent processing, avoids optimizing the first depth information of every pixel, and helps improve the efficiency of determining the first depth information.
Fig. 4 is a schematic flow chart illustrating a process of acquiring optimized first depth information according to the first gray scale information, the second gray scale information, the first camera pose, the second camera pose, and the first depth information in the embodiment shown in fig. 1.
As shown in fig. 4, on the basis of the embodiment shown in fig. 1, in an exemplary embodiment of the present application, the step of obtaining optimized first depth information shown in step 80 may specifically include the following steps:
step 801: and acquiring an optimized first camera pose and an optimized second camera pose according to the first gray scale information, the second gray scale information, the first camera pose and the second camera pose.
When the optimized first depth information is determined, the accuracy of the utilized camera pose directly affects the accuracy of the determined optimized first depth information, so that before the optimized first depth information is determined, the first camera pose and the second camera pose are optimized to obtain the optimized first camera pose and the optimized second camera pose.
Step 802: and acquiring the optimized first depth information according to the optimized first camera pose, the optimized second camera pose and the first depth information.
After the optimized first camera pose and the optimized second camera pose are obtained, the first depth information is optimized through the geometric relation among different camera poses, and the optimized first depth information is obtained.
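As an illustration of this refinement for a single feature point, the sketch below holds the optimized relative pose fixed and searches for the depth that minimizes the photometric error at the projection. The bounded one-dimensional search is a simplification of the multi-frame depth filtering a production system would use, and the pose convention is assumed.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def refine_depth(uv, d0, gray_kf, gray_next, K, T_rel):
    """Refine one feature's depth (initial value d0 from the CNN), given the
    optimized 4x4 relative pose T_rel from key frame camera to next camera."""
    u, v = uv
    ref_intensity = float(gray_kf[v, u])
    K_inv = np.linalg.inv(K)

    def err(d):
        X = d * K_inv @ np.array([u, v, 1.0])
        Xc = T_rel[:3, :3] @ X + T_rel[:3, 3]
        if Xc[2] <= 0:
            return 1e9                      # point behind the camera
        uvn = K @ (Xc / Xc[2])
        ui, vi = int(round(uvn[0])), int(round(uvn[1]))
        if not (0 <= vi < gray_next.shape[0] and 0 <= ui < gray_next.shape[1]):
            return 1e9                      # projection left the image
        return (ref_intensity - float(gray_next[vi, ui])) ** 2

    sol = minimize_scalar(err, bounds=(0.5 * d0, 1.5 * d0), method='bounded')
    return sol.x
```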
In this embodiment, the first camera pose and the second camera pose are optimized first, which ensures that the optimized first depth information determined with them is more accurate.
Fig. 5 shows a schematic flow chart of acquiring the optimized first camera pose and the optimized second camera pose according to the first gray scale information, the second gray scale information, the first camera pose and the second camera pose in the embodiment shown in fig. 4.
As shown in fig. 5, based on the embodiment shown in fig. 4, in an exemplary embodiment of the present application, the step of obtaining the optimized first camera pose and the optimized second camera pose shown in step 801 may specifically include the following steps:
step 8011: and acquiring second gray information corresponding to a second projection point projected by the first characteristic point on a subsequent frame image.
The embodiment optimizes the first camera pose and the second camera pose based on the assumption that the luminosity is unchanged, and completes the process of projecting the first feature point in the current frame image to the subsequent frame image according to the first camera pose of the current frame image and the second camera pose of the subsequent frame image. Specifically, the pixel coordinates of the first feature point in the current frame image are determined, the conversion from the pixel coordinates in the current frame image to the camera coordinate system corresponding to the first camera pose is completed according to the internal parameters of the camera, then the conversion from the camera coordinate system corresponding to the first camera pose to the camera coordinate system corresponding to the second camera pose is completed, the process of projecting from the camera coordinate system corresponding to the second camera pose to the pixel coordinates in the subsequent frame image is completed according to the internal parameters of the camera, and therefore the second gray scale information corresponding to the second projection point is obtained.
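The projection chain just described can be written out directly. In this sketch the poses are taken as 4x4 camera-to-world transforms and K is the 3x3 intrinsic matrix; both conventions are assumptions for illustration.

```python
import numpy as np

def project_between_frames(uv, depth, K, T_w_cur, T_w_next):
    """Project a feature (pixel uv with depth in the current key frame) into
    the subsequent frame: pixel -> current camera -> world -> next camera -> pixel."""
    u, v = uv
    X_cam = depth * np.linalg.inv(K) @ np.array([u, v, 1.0])       # intrinsics: back-project
    X_w = (T_w_cur @ np.append(X_cam, 1.0))[:3]                    # current camera -> world
    X_next = (np.linalg.inv(T_w_next) @ np.append(X_w, 1.0))[:3]   # world -> next camera
    uv_next = K @ (X_next / X_next[2])                             # intrinsics: project
    return uv_next[:2]
```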
Step 8012: establishing a second grayscale error function according to the first grayscale information of the first feature point and the second grayscale information of the second projection point.
Because the first feature point and the second projection point correspond to the same spatial point, a second grayscale error function between the first grayscale information of the first feature point and the second grayscale information of the second projection point is established.
Step 8013: determining, according to the first camera pose and the second camera pose, the optimized first camera pose and the optimized second camera pose for which the second grayscale error function meets a third preset condition.
The first and second camera poses are adjusted until the value of the second grayscale error function is minimal; the corresponding poses are the optimized first camera pose and the optimized second camera pose.
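A sketch of this joint adjustment follows, reusing photometric_residual from the pose-tracking sketch at step 104; the 6-vector world-to-camera pose parameterization is an assumption made for illustration.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def compose_relative(xi_a, xi_b):
    """Relative pose b<-a as a 6-vector, given two world-to-camera 6-vectors."""
    Ra, Rb = Rotation.from_rotvec(xi_a[:3]), Rotation.from_rotvec(xi_b[:3])
    R_rel = Rb * Ra.inv()
    t_rel = xi_b[3:] - R_rel.as_matrix() @ xi_a[3:]
    return np.concatenate([R_rel.as_rotvec(), t_rel])

def refine_poses(xi_cur, xi_next, pts, depths, gray_kf, gray_next, K):
    """Jointly adjust both camera poses so the second grayscale error function
    is minimized; returns the optimized first and second camera poses."""
    def residual(x):
        xi_rel = compose_relative(x[:6], x[6:])
        return photometric_residual(xi_rel, pts, depths, gray_kf, gray_next, K)
    sol = least_squares(residual, np.concatenate([xi_cur, xi_next]))
    return sol.x[:6], sol.x[6:]
```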
In this embodiment, the optimization of the first and second camera poses is completed under the photometric-invariance assumption: the optimized poses are those that minimize the second grayscale error function between the first grayscale information of the first feature point and the second grayscale information of the second projection point. The optimized poses are therefore more accurate, which in turn improves the accuracy of the optimized first depth information determined with them.
Fig. 6 shows a flowchart further included after acquiring the optimized first depth information in the embodiment shown in fig. 4.
As shown in fig. 6, on the basis of the embodiment shown in fig. 4, in an exemplary embodiment of the present application, after the step of obtaining the optimized first depth information shown in step 802, the method may further include the following steps:
step 8031: and determining a fourth camera pose corresponding to each of at least one frame of previous frame image, wherein the previous frame image meets a first preset condition.
The embodiment is to implement the construction of the high-precision map by using the key frame image, and when the camera collects the relevant information of the scene structure in the space, a large number of images can be obtained, wherein repeated frame images exist, for example, red light or congestion occurs during the driving of a vehicle, so that the construction of the high-precision map by using all the frame images is not required, and therefore, the previous frame image, i.e., the key frame image, can be selected from all the frame images according to a first preset condition, and the scene structure in the space corresponding to each frame of the previous frame image has more differences by setting the first preset condition, thereby improving the construction efficiency of the high-precision map.
In a possible implementation manner, after the fourth camera poses are obtained, the fourth gray scale information corresponding to the third feature points corresponding to the previous frame images is used, and further optimization updating of the fourth camera poses corresponding to the previous frame images is completed based on the luminosity invariance assumption, so that when a high-precision map is constructed by using the optimized and updated fourth camera poses, the accuracy of the obtained high-precision map is higher.
Step 8032: determining a third feature point corresponding to each of the at least one previous frame image.
Even though each previous frame image is a key frame image, not all of its pixels carry useful information, so the third feature points of each previous frame image need to be determined.
Step 8033: acquiring fourth grayscale information and second depth information of the third feature point.
Every previous frame image meets the first preset condition, that is, the second depth information of its third feature points has been determined according to the depth prediction model, so the second depth information can be read once the third feature points are determined. The fourth grayscale information of a third feature point is obtained by directly reading the gray value at its pixel coordinates. In one possible implementation, after the fourth camera pose of each previous frame image has been optimized and updated under the photometric-invariance assumption, the second depth information of each third feature point is further optimized using the updated fourth camera pose, which ensures the accuracy of the determined second depth information.
Step 8034: constructing a high-precision map according to the fourth grayscale information, the second depth information, and the fourth camera pose of the third feature point, together with the first grayscale information, the optimized first depth information, and the optimized first camera pose of the first feature point.
The third feature points and the first feature points are the key points of the previous frame images and the current frame image, respectively; once the camera pose, grayscale information, and depth information of every key point have been acquired, the high-precision map can be constructed.
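A sketch of assembling such map points by back-projecting every key point into the world frame; the record layout and the camera-to-world pose convention are illustrative assumptions.

```python
import numpy as np

def build_map_points(keyframes, K):
    """keyframes: iterable of (T_w_c, gray, points, depths), where T_w_c is the
    optimized 4x4 camera-to-world pose of a key frame. Each key point becomes
    one world-frame 3D point tagged with its grayscale value."""
    K_inv = np.linalg.inv(K)
    cloud = []
    for T_w_c, gray, points, depths in keyframes:
        for (u, v), d in zip(points, depths):
            X_cam = d * K_inv @ np.array([u, v, 1.0])
            X_w = (T_w_c @ np.append(X_cam, 1.0))[:3]
            cloud.append((X_w, float(gray[v, u])))
    return cloud
```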
In this embodiment, constructing the high-precision map from the key points of each key frame effectively improves construction efficiency. Because the second depth information and the first depth information used to build the map have an absolute scale, the constructed map also has an absolute scale and reflects the real physical scale of the scene structure in space. Meanwhile, the fourth camera poses and the optimized first camera poses used to build the map are themselves products of camera pose optimization, so the constructed high-precision map is highly accurate.
Exemplary devices
Based on the same concept as the method embodiments of the present application, an embodiment of the present application further provides an apparatus for determining depth information of image feature points.
Fig. 7 is a schematic structural diagram illustrating an apparatus for determining depth information of image feature points according to an exemplary embodiment of the present application.
As shown in fig. 7, an apparatus for determining depth information of an image feature point according to an exemplary embodiment of the present application includes:
the depth information determining module 71 is configured to determine whether the current frame image meets a first preset condition, and if the current frame image meets the first preset condition, determine, according to a depth prediction model acquired in advance, first depth information corresponding to a first feature point in the current frame image;
a first obtaining module 72, configured to obtain first grayscale information and a first camera pose corresponding to the current frame image;
a second obtaining module 73, configured to determine whether a subsequent frame image of the current frame image meets the first preset condition, and if the subsequent frame image of the current frame image does not meet the first preset condition, obtain second grayscale information and a second camera pose corresponding to the subsequent frame image;
and an optimizing module 74, configured to obtain optimized first depth information according to the first grayscale information, the second grayscale information, the first camera pose, the second camera pose, and the first depth information.
As shown in fig. 8, in an exemplary embodiment, the depth information determining apparatus of the image feature point further includes a camera pose determining module 70, and the camera pose determining module 70 includes:
a first obtaining unit 701, configured to obtain a third camera pose of a previous frame image, where the previous frame image meets a first preset condition;
a feature point determining unit 702, configured to determine a second feature point of the previous frame image, where the second feature point is at least one feature point in the previous frame image whose gray scale gradient meets a second preset condition;
a second obtaining unit 703, configured to obtain third grayscale information of the second feature point, and obtain first grayscale information corresponding to a first projection point of the second feature point on the current frame image;
a camera pose determining unit 704, configured to determine the first camera pose corresponding to the current frame image according to the third camera pose corresponding to the previous frame image and the first grayscale error function between the third grayscale information and the first grayscale information corresponding to the first projection point.
As shown in fig. 9, in an exemplary embodiment, the depth information determining module 71 includes:
a third obtaining unit 711, configured to determine, according to a depth prediction model obtained in advance, first depth information of a pixel point in a current frame image;
a feature point selecting unit 712, configured to select a first feature point from the pixel points according to a gray scale gradient of the pixel points in the current frame image;
a depth information determining unit 713, configured to determine first depth information corresponding to the first feature point.
As shown in FIG. 10, in an exemplary embodiment, the optimization module 74 includes:
an optimization unit 741, configured to obtain an optimized first camera pose and an optimized second camera pose according to the first gray scale information, the second gray scale information, the first camera pose, and the second camera pose;
a fourth obtaining unit 742 is configured to obtain the optimized first depth information according to the optimized first camera pose, the optimized second camera pose, and the first depth information.
As shown in fig. 11, in an exemplary embodiment, the optimization unit 741 includes:
a projection point obtaining subunit 7411, configured to obtain second gray scale information corresponding to a second projection point of the first feature point projected on the subsequent frame image;
a function establishing subunit 7412, configured to establish a second gray error function according to the first gray information of the first feature point and the second gray information of the second projection point;
an optimization subunit 7413, configured to determine, according to the first camera pose and the second camera pose, the optimized first camera pose and the optimized second camera pose of which the second gray-scale error function satisfies the third preset condition.
As shown in fig. 12 and 13, in an exemplary embodiment, the optimization module 74 further includes a map building unit 743, where the map building unit 743 includes:
a first determining subunit 7431, configured to determine a fourth camera pose corresponding to each of at least one previous frame image, where the previous frame image meets a first preset condition;
a second determining subunit 7432, configured to determine a third feature point corresponding to each of the at least one previous frame image;
an obtaining subunit 7433, configured to obtain fourth grayscale information and second depth information of the third feature point;
the map constructing subunit 7434 is configured to construct a high-precision map according to the fourth grayscale information, the second depth information, the fourth camera pose of the third feature point, the first grayscale information, the optimized first depth information, and the optimized first camera pose of the first feature point.
Exemplary electronic device
FIG. 14 illustrates a block diagram of an electronic device in accordance with an embodiment of the present application.
As shown in fig. 14, the electronic device 100 includes one or more processors 101 and memory 102.
The processor 101 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 100 to perform desired functions.
Memory 102 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 101 to implement the depth information determining method for image feature points of the various embodiments of the present application described above and/or other desired functions.
In one example, the electronic device 100 may further include: an input device 103 and an output device 104, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
Of course, for the sake of simplicity, only some of the components related to the present application in the electronic apparatus 100 are shown in fig. 14, and components such as a bus, an input/output interface, and the like are omitted. In addition, electronic device 100 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage Medium
In addition to the above-described methods and apparatuses, embodiments of the present application may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the steps in the method for depth information determination of image feature points according to various embodiments of the present application described in the above-mentioned "exemplary methods" section of this specification.
The computer program product may be written with program code for performing the operations of embodiments of the present application in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present application may also take the form of a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform the steps in the method for determining depth information of image feature points according to various embodiments of the present application described in the "exemplary methods" section above in this specification.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present application in conjunction with specific embodiments. However, it is noted that the advantages and effects mentioned in the present application are merely examples, not limitations, and should not be considered essential to the various embodiments of the present application. Furthermore, the foregoing disclosure of specific details is for the purposes of illustration and description only; it is not intended to be limiting or exhaustive, nor to confine the application to the precise details disclosed.
The block diagrams of devices, apparatuses, and systems referred to in this application are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, or configurations must be made in the manner shown in the block diagrams. As will be appreciated by those skilled in the art, these devices, apparatuses, and systems may be connected, arranged, or configured in any manner. Words such as "including," "comprising," and "having" are open-ended terms that mean "including, but not limited to," and may be used interchangeably therewith. The words "or" and "and" as used herein mean, and may be used interchangeably with, the word "and/or," unless the context clearly dictates otherwise. The word "such as" is used herein to mean, and may be used interchangeably with, the phrase "such as but not limited to".
It should also be noted that in the devices, apparatuses, and methods of the present application, the components or steps may be decomposed and/or recombined. These decompositions and/or recombinations are to be considered as equivalents of the present application.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the application to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (10)

1. A method for determining depth information of image feature points, comprising:
judging whether a current frame image meets a first preset condition, and if the current frame image meets the first preset condition, determining, according to a pre-acquired depth prediction model, first depth information corresponding to a first feature point in the current frame image;
acquiring first gray information and a first camera pose corresponding to the current frame image;
judging whether a subsequent frame image of the current frame image meets the first preset condition, and if the subsequent frame image does not meet the first preset condition, acquiring second gray information and a second camera pose corresponding to the subsequent frame image;
and acquiring optimized first depth information according to the first gray information, the second gray information, the first camera pose, the second camera pose and the first depth information.
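Read as an algorithm, claim 1 gates depth prediction on a keyframe test (the "first preset condition") and then refines the predicted depth using photometric information from a later frame. The Python skeleton below illustrates only this control flow; is_keyframe, predict_depth, and the frame representation are illustrative stand-ins, not the patent's implementation, and the actual refinement is sketched under claims 2 to 4.

```python
import numpy as np

def is_keyframe(idx, interval=5):
    # Illustrative stand-in for the "first preset condition": treat every
    # Nth frame as a keyframe; the claim leaves the condition unspecified.
    return idx % interval == 0

def predict_depth(image):
    # Stand-in for the pre-acquired depth prediction model (e.g. a CNN);
    # a constant depth map keeps the sketch self-contained and runnable.
    return np.ones_like(image, dtype=np.float32)

def process_sequence(frames):
    """frames: iterable of (gray_image, camera_pose) pairs."""
    keyframe = None
    for idx, (img, pose) in enumerate(frames):
        if is_keyframe(idx):
            # Keyframe: first depth information from the prediction model,
            # plus first gray information and first camera pose.
            keyframe = {"img": img, "pose": pose, "depth": predict_depth(img)}
        elif keyframe is not None:
            # Non-keyframe: its second gray information and second camera
            # pose feed the optimization of the keyframe depth (see the
            # sketches under claims 2 to 4 below).
            pass
    return keyframe
```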
2. The method according to claim 1, further comprising, before the judging whether the current frame image meets the first preset condition:
acquiring a third camera pose of a previous frame image, wherein the previous frame image meets the first preset condition;
determining a second feature point of the previous frame image, wherein the second feature point is at least one feature point, of the previous frame image, whose gray gradient meets a second preset condition;
acquiring third gray information of the second feature point, and acquiring first gray information corresponding to a first projection point of the second feature point projected on the current frame image;
and determining the first camera pose corresponding to the current frame image according to the third camera pose corresponding to the previous frame image and a first gray error function between the third gray information and the first gray information corresponding to the first projection point.
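Claim 2 describes a direct (photometric) pose estimate: the gray values of high-gradient points in the previous keyframe are compared with the gray values at their projections into the current frame, and the pose is chosen to minimize that error. The sketch below implements one plausible version with scipy's least-squares solver; the 6-dof pose is parameterized as a rotation vector plus translation relative to the previous frame, so composing the result with the third camera pose would yield the first camera pose. Bounds checks and robust weighting are omitted, and none of the names come from the patent.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def project(p_ref, pose6, K):
    """Project a 3-D point given in the reference camera frame through a
    6-dof pose (rotation vector + translation) into pixel coordinates."""
    R = Rotation.from_rotvec(pose6[:3]).as_matrix()
    x, y, z = R @ p_ref + pose6[3:]
    return K[0, 0] * x / z + K[0, 2], K[1, 1] * y / z + K[1, 2]

def bilinear(img, u, v):
    """Bilinearly interpolated gray value (bounds checks omitted)."""
    x0, y0 = int(u), int(v)
    a, b = u - x0, v - y0
    return ((1 - a) * (1 - b) * img[y0, x0] + a * (1 - b) * img[y0, x0 + 1]
            + (1 - a) * b * img[y0 + 1, x0] + a * b * img[y0 + 1, x0 + 1])

def estimate_pose(pts_prev, gray_prev, img_cur, K, pose0):
    """First gray error function: gray values at the first projection
    points in the current frame minus the third gray information of the
    second feature points (pts_prev, in the previous camera frame)."""
    def residuals(pose6):
        return np.array([bilinear(img_cur, *project(X, pose6, K)) - g
                         for X, g in zip(pts_prev, gray_prev)])
    return least_squares(residuals, pose0).x
```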
3. The method of claim 1, wherein the acquiring optimized first depth information according to the first gray information, the second gray information, the first camera pose, the second camera pose, and the first depth information comprises:
acquiring an optimized first camera pose and an optimized second camera pose according to the first gray information, the second gray information, the first camera pose, and the second camera pose;
and acquiring the optimized first depth information according to the optimized first camera pose, the optimized second camera pose, and the first depth information.
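For the second step of claim 3, one plausible realization is a per-point depth search: with the optimized poses held fixed, each feature point's depth is perturbed around the predicted value, and the depth whose reprojection best matches the subsequent frame's gray value is kept. The sketch below reuses project and bilinear from the claim 2 sketch; the brute-force line search and the search window are assumptions standing in for whatever solver the patent intends.

```python
import numpy as np

def refine_depth(u, v, g1, d0, img2, K, rel_pose6, search=0.3, steps=61):
    """Refine one feature point's depth around the predicted value d0.
    rel_pose6: optimized relative pose, keyframe -> subsequent frame,
    in the same 6-dof convention as project() above."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    best = (np.inf, d0)
    for d in np.linspace(d0 * (1 - search), d0 * (1 + search), steps):
        # Back-project the keyframe pixel at candidate depth d, ...
        X = np.array([(u - cx) * d / fx, (v - cy) * d / fy, d])
        # ... reproject into the subsequent frame, compare gray values.
        u2, v2 = project(X, rel_pose6, K)
        best = min(best, (abs(bilinear(img2, u2, v2) - g1), d))
    return best[1]
```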
4. The method of claim 3, wherein the acquiring an optimized first camera pose and an optimized second camera pose according to the first gray information, the second gray information, the first camera pose, and the second camera pose comprises:
acquiring second gray information corresponding to a second projection point of the first feature point projected on the subsequent frame image;
establishing a second gray error function according to the first gray information of the first feature point and the second gray information of the second projection point;
and determining, according to the first camera pose and the second camera pose, the optimized first camera pose and the optimized second camera pose for which the second gray error function meets a third preset condition.
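The second gray error function of claim 4 can be written down directly: the residuals are the differences between the first feature points' gray values and the gray values at their second projection points in the subsequent frame. Since only the relative pose between two frames is observable from such residuals, the sketch below (again reusing project and bilinear from the claim 2 sketch) optimizes the relative pose; splitting it back into an optimized first and second camera pose is a gauge choice, and reading the "third preset condition" as a convergence tolerance is an assumption.

```python
import numpy as np
from scipy.optimize import least_squares

def optimize_poses(pts_kf, gray1, img2, K, rel0):
    """Second gray error function: first gray information of the first
    feature points (pts_kf, in the keyframe camera frame) vs. the gray
    values at their second projection points in the subsequent frame."""
    def residuals(rel6):
        return np.array([bilinear(img2, *project(X, rel6, K)) - g
                         for X, g in zip(pts_kf, gray1)])
    # The "third preset condition" is read here as the solver's
    # convergence tolerance.
    return least_squares(residuals, rel0, xtol=1e-10).x
```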
5. The method of claim 3, further comprising, after the acquiring of the optimized first depth information:
determining a fourth camera pose corresponding to each of at least one previous frame image, wherein the previous frame image meets the first preset condition;
determining a third feature point corresponding to each of the at least one previous frame image;
acquiring fourth gray information and second depth information of the third feature point;
and constructing a high-precision map according to the fourth gray information, the second depth information, and the fourth camera pose of the third feature point, as well as the first gray information, the optimized first depth information, and the optimized first camera pose of the first feature point.
6. The method according to any one of claims 1 to 5, wherein the determining, according to the pre-acquired depth prediction model, first depth information corresponding to a first feature point in the current frame image comprises:
determining first depth information of pixel points in the current frame image according to the pre-acquired depth prediction model;
selecting the first feature point from the pixel points according to the gray gradient of the pixel points in the current frame image;
and determining the first depth information corresponding to the first feature point.
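A plausible reading of the selection rule in claim 6 is to keep the pixels whose gray-gradient magnitude exceeds a threshold and read off their predicted depth; the fixed cutoff below is an assumption, since the claim only says the selection is made according to the gray gradient.

```python
import numpy as np

def select_feature_points(gray_img, depth_map, grad_thresh=30.0):
    """Keep high-gradient pixels as first feature points and attach their
    predicted first depth information. grad_thresh is an assumed cutoff."""
    gy, gx = np.gradient(gray_img.astype(np.float32))
    mag = np.hypot(gx, gy)
    vs, us = np.nonzero(mag > grad_thresh)      # high-gradient pixels
    return [((u, v), gray_img[v, u], depth_map[v, u])
            for u, v in zip(us, vs)]
```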
7. An apparatus for determining depth information of an image feature point, comprising:
a depth information determining module, used for judging whether a current frame image meets a first preset condition, and if the current frame image meets the first preset condition, determining, according to a pre-acquired depth prediction model, first depth information corresponding to a first feature point in the current frame image;
a first acquisition module, used for acquiring first gray information and a first camera pose corresponding to the current frame image;
a second acquisition module, used for judging whether a subsequent frame image of the current frame image meets the first preset condition, and if the subsequent frame image does not meet the first preset condition, acquiring second gray information and a second camera pose corresponding to the subsequent frame image;
and an optimization module, used for acquiring optimized first depth information according to the first gray information, the second gray information, the first camera pose, the second camera pose, and the first depth information.
8. The apparatus of claim 7, further comprising a camera pose determination module, wherein the camera pose determination module comprises:
a first acquisition unit, used for acquiring a third camera pose of a previous frame image, wherein the previous frame image meets the first preset condition;
a feature point determining unit, used for determining a second feature point of the previous frame image, wherein the second feature point is at least one feature point, of the previous frame image, whose gray gradient meets a second preset condition;
a second acquisition unit, used for acquiring third gray information of the second feature point and acquiring first gray information corresponding to a first projection point of the second feature point projected on the current frame image;
and a camera pose determining unit, used for determining the first camera pose corresponding to the current frame image according to the third camera pose corresponding to the previous frame image and a first gray error function between the third gray information and the first gray information corresponding to the first projection point.
9. A computer-readable storage medium storing a computer program for executing the method for determining depth information of image feature points according to any one of claims 1 to 6.
10. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method for determining depth information of image feature points according to any one of claims 1 to 6.
CN201910570786.3A 2019-06-28 2019-06-28 Depth information determination method and device for image feature points Active CN112150529B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910570786.3A CN112150529B (en) 2019-06-28 2019-06-28 Depth information determination method and device for image feature points

Publications (2)

Publication Number Publication Date
CN112150529A true CN112150529A (en) 2020-12-29
CN112150529B CN112150529B (en) 2023-09-01

Family

ID=73868972

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2680224A1 (en) * 2012-06-27 2014-01-01 Vestel Elektronik Sanayi ve Ticaret A.S. Method and device for determining a depth image
CN107833270A (en) * 2017-09-28 2018-03-23 浙江大学 Real-time object dimensional method for reconstructing based on depth camera
CN107945265A (en) * 2017-11-29 2018-04-20 华中科技大学 Real-time dense monocular SLAM method and systems based on on-line study depth prediction network
CN108428238A (en) * 2018-03-02 2018-08-21 南开大学 A kind of detection method general based on the polymorphic type task of depth network
CN109398731A (en) * 2017-08-18 2019-03-01 深圳市道通智能航空技术有限公司 A kind of method, apparatus and unmanned plane promoting 3D rendering depth information

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220067961A1 (en) * 2020-08-28 2022-03-03 Kabushiki Kaisha Toshiba Position and attitude estimation device, position and attitude estimation method, and storage medium
US11710253B2 (en) * 2020-08-28 2023-07-25 Kabushiki Kaisha Toshiba Position and attitude estimation device, position and attitude estimation method, and storage medium
CN112907620A (en) * 2021-01-25 2021-06-04 北京地平线机器人技术研发有限公司 Camera pose estimation method and device, readable storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant