CN109461208B - Three-dimensional map processing method, device, medium and computing equipment - Google Patents

Info

Publication number
CN109461208B
Authority
CN
China
Prior art keywords
data
dimensional
pose
images
dimensional map
Prior art date
Legal status
Active
Application number
CN201811363016.3A
Other languages
Chinese (zh)
Other versions
CN109461208A (en)
Inventor
王成
赵宇
刘海伟
丛林
Current Assignee
Hangzhou Yixian Advanced Technology Co ltd
Original Assignee
Hangzhou Yixian Advanced Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Yixian Advanced Technology Co ltd filed Critical Hangzhou Yixian Advanced Technology Co ltd
Priority to CN201811363016.3A
Publication of CN109461208A
Application granted
Publication of CN109461208B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05Geographic models

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Remote Sensing (AREA)
  • Computer Graphics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

An embodiment of the invention provides a three-dimensional map processing method comprising: obtaining initial map data of a target scene, the initial map data comprising at least original image data and the estimated state data recorded when the original images were acquired; constructing a three-dimensional map with a real scale corresponding to the target scene from the original image data and the estimated state data contained in the initial map data; and separately optimizing the position data of the three-dimensional points in the real-scale three-dimensional map and the pose data of the images corresponding to those three-dimensional points to obtain a reconstructed three-dimensional map with a real scale. Optimizing the positions of the three-dimensional points and the poses of the corresponding images separately improves the precision of the three-dimensional map. Embodiments of the invention also provide a three-dimensional map processing apparatus, medium, and computing device.

Description

Three-dimensional map processing method, device, medium and computing equipment
Technical Field
Embodiments of the present invention relate to the field of data processing, and more particularly, to a three-dimensional map processing method, apparatus, medium, and computing device.
Background
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
Simultaneous Localization and Mapping (SLAM) refers to the technique of constructing a map of the environment around a mobile terminal from data acquired by sensors on the terminal, such as a camera and an Inertial Measurement Unit (IMU), while spatially localizing the terminal: a mobile terminal equipped with sensors is placed in an unknown environment, and the terminal attempts to incrementally build a continuous map and use that map for localization. The technique can be applied to scenarios such as virtual reality devices, augmented reality devices, and terminal navigation.
In the related art, processing algorithms for constructing maps have appeared that take images as the main data source. For example, for unordered images (such as assorted pictures from the web), Structure from Motion (SFM) is mainly used for three-dimensional reconstruction without real scale; for ordered images (such as a continuous piece of video), VSLAM (visual SLAM) is mainly used for three-dimensional reconstruction without real scale, or VISLAM (visual-inertial SLAM) is used for three-dimensional reconstruction with real scale.
However, when the hardware resources of the terminal device are limited, for example when computing, storage, and memory resources are constrained, a map constructed by the terminal device using the related art has low precision, resulting in a poor user experience.
Disclosure of Invention
Therefore, in the prior art, when the hardware resources of the terminal device are limited, a map constructed using the related art has low precision, which degrades the user experience.
For this reason, an improved three-dimensional map processing method is highly needed, so that a finer and more reasonable processing flow is provided to ensure the accuracy of the three-dimensional map.
In this context, embodiments of the present invention are intended to provide a three-dimensional map processing method, apparatus, medium, and computing device.
In a first aspect of embodiments of the present invention, there is provided a three-dimensional map processing method including: acquiring initial map data of a target scene, wherein the initial map data at least comprises original image data and estimated state data recorded when the original images were acquired, the estimated state data comprising: pose data at the time the original images were acquired and motion parameter data of the sensor at that time;
constructing a three-dimensional map having a real scale corresponding to the target scene from the original image data and the estimated state data included in the initial map data, including: determining at least two frames of similar original images according to the similarity of the feature points in the multiple frames of original images and performing tracking association; and performing linear triangulation processing on the at least two similar original images after tracking association by using the estimated state data as a constraint parameter, wherein the linear triangulation processing comprises: using the estimated state data as a constraint parameter, adding the constraint of the motion parameter data between every two frames of images by adopting a bundle inertial adjustment (BIA) method, and respectively optimizing the poses of the N images and the M three-dimensional points to obtain a result with a real scale; and
respectively optimizing the position data of the three-dimensional point in the three-dimensional map with the real scale and the pose data of the image corresponding to the three-dimensional point, wherein the optimizing comprises the following steps: optimizing pose data of an image corresponding to the three-dimensional point to obtain a three-dimensional map with optimized pose data, and optimizing position data of the three-dimensional point in the three-dimensional map with optimized pose data to obtain a reconstructed three-dimensional map with a real scale;
wherein optimizing the pose data comprises: determining at least two frames of images which meet the requirements of different time domains and the same space domain from the original image data, wherein the at least two frames of images which are different in time domain and the same in space domain are determined through loop detection; calculating the transformation relation data of the corresponding poses of the at least two frames of images which meet the requirements of different time domains and the same space domain; and optimizing the pose data of the image corresponding to the three-dimensional point based on the transformation relation data and a preset pose graph model, wherein the optimization comprises the following steps: adding the transformation relation data as an edge into the pose graph model to serve as a constraint of the model, wherein the pose graph model optimizes the pose data by taking the constraint as prior knowledge;
optimizing the position data includes: merging the corresponding feature points of the images corresponding to the three-dimensional points, and calculating the position data of the three-dimensional points corresponding to the feature points according to the pixel coordinates of the merged feature points;
optimizing the pose data of the images corresponding to the three-dimensional points corrects long-term error accumulation to a certain degree, so that the poses of images originally shot at the same place become closer in the spatial domain; therefore, when optimizing the position data of the three-dimensional points in the three-dimensional map whose pose data has been optimized, the corresponding feature points of the images corresponding to the three-dimensional points are merged, and the positions of the three-dimensional points are determined from the merged feature points, thereby optimizing the position data of the three-dimensional points;
and obtaining the reconstructed three-dimensional map with the real scale.
In an embodiment of the present invention, after optimizing the position data of the three-dimensional point in the three-dimensional map with the real scale and the pose data of the image corresponding to the three-dimensional point, respectively, the method further includes:
coupling the optimized three-dimensional point position data with pose data of an image corresponding to the three-dimensional point to obtain a coupled three-dimensional map;
and optimizing the coupled three-dimensional map again by adopting a bundle inertial adjustment (BIA) method to obtain the optimized three-dimensional map with the real scale.
In another embodiment of the present invention, the estimated state data at the time of acquiring the raw image includes at least pose data at the time of acquiring the raw image and motion parameter data of the sensor at the time of acquiring the raw image.
In a second aspect of embodiments of the present invention, there is provided a three-dimensional map processing apparatus comprising: an obtaining module, configured to obtain initial map data of a target scene, where the initial map data at least includes original image data and estimated state data obtained when an original image is acquired, and the estimated state data includes: acquiring pose data of an original image and motion parameter data of a sensor during acquisition of the original image;
a construction module configured to construct a three-dimensional map having a real scale corresponding to the target scene according to the original image data and the estimated state data included in the initial map data, wherein the construction module includes:
the correlation unit is used for determining at least two frames of similar original images according to the similarity of the feature points in the original images of multiple frames and performing tracking correlation; and
the processing unit is used for performing linear triangulation processing on the at least two similar original images after tracking and association by taking the estimated state data as a constraint parameter, including: using the estimated state data as a constraint parameter, adding the constraint of the motion parameter data between every two frames of images by adopting a bundle inertial adjustment (BIA) method, and respectively optimizing the poses of the N images and the M three-dimensional points to obtain a result with a real scale;
and
the optimization module is used for respectively optimizing the position data of the three-dimensional point in the three-dimensional map with the real scale and the pose data of the image corresponding to the three-dimensional point, and comprises the following steps: optimizing the pose data of the image corresponding to the three-dimensional point to obtain a three-dimensional map with optimized pose data, optimizing the position data of the three-dimensional point in the three-dimensional map with optimized pose data to obtain a reconstructed three-dimensional map with real scale,
wherein the optimization module comprises: the determining unit is used for determining at least two frames of images which meet the requirements of different time domains and the same space domain from the original image data, wherein the at least two frames of images which are different in time domain and the same space domain are determined through loop detection;
the first calculation unit is used for calculating the transformation relation data of the poses corresponding to the at least two frames of images which are different in time domain and same in space domain; and
the optimization unit is used for optimizing the pose data of the image corresponding to the three-dimensional point based on the transformation relation data and a preset pose graph model, and comprises: adding the transformation relation data as an edge into the pose graph model to serve as a constraint of the model, wherein the pose graph model optimizes the pose data by taking the constraint as prior knowledge;
wherein optimizing the position data comprises: merging the corresponding feature points of the images corresponding to the three-dimensional points, and calculating the position data of the three-dimensional points corresponding to the feature points according to the pixel coordinates of the merged feature points;
the method comprises the steps of optimizing pose data of an image corresponding to the three-dimensional point to enable long-time error accumulation to be corrected to a certain degree, enabling the poses of images originally shot in the same place to be relatively close to each other when viewed in a space domain, enabling corresponding feature points of the images corresponding to the three-dimensional points to be combined when optimizing position data of the three-dimensional points in a three-dimensional map after optimizing the pose data, and determining the positions of the three-dimensional points according to the combined feature points to achieve optimization of the position data of the three-dimensional points.
In one embodiment of the invention, the apparatus further comprises:
the coupling module is used for coupling the optimized three-dimensional point position data with the pose data of the image corresponding to the three-dimensional point to obtain a coupled three-dimensional map after respectively optimizing the position data of the three-dimensional point in the three-dimensional map with the real scale and the pose data of the image corresponding to the three-dimensional point;
and the optimization module is used for optimizing the coupled three-dimensional map again by adopting a bundle inertial adjustment (BIA) method to obtain the optimized three-dimensional map with the real scale.
In another embodiment of the invention, the estimated state data at the time of acquiring the raw image includes at least pose data at the time of acquiring the raw image and motion parameter data of the sensor at the time of acquiring the raw image.
In a third aspect of embodiments of the present invention, there is provided a medium storing computer-executable instructions that, when executed by a processing unit, are configured to implement a three-dimensional map processing method as described above.
In a fourth aspect of embodiments of the present invention there is provided a computing device comprising: a processing unit; and a storage unit storing computer-executable instructions for implementing the three-dimensional map processing method as described above when executed by the processing unit.
According to the three-dimensional map processing method, apparatus, medium, and computing device of the embodiments of the present invention, after the three-dimensional map with the real scale is constructed, the position data of the three-dimensional points in that map and the pose data of the images corresponding to those points are optimized separately. This provides a finer and more reasonable processing flow that guarantees the precision of the three-dimensional map, and is particularly suitable for constructing maps of large-scale real scenes. Compared with related-art schemes that guarantee map precision by adding hardware resources, this new processing flow can more reasonably guarantee the precision of the three-dimensional map when the hardware resources of the terminal device are limited, bringing a better experience to users.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
fig. 1 schematically shows a pose diagram during movement of a mobile terminal according to an embodiment of the present invention;
FIG. 2 schematically illustrates a schematic diagram of feature points on multiple images having a common field of view corresponding to three-dimensional points on a space, according to an embodiment of the invention;
FIG. 3 schematically illustrates a schematic diagram of pose graph model optimization according to an embodiment of the invention;
FIG. 4 schematically illustrates an application scenario according to an embodiment of the present invention;
FIG. 5 schematically shows a flow chart of a three-dimensional map processing method according to an embodiment of the invention;
fig. 6 schematically shows a flowchart of constructing a three-dimensional map having real dimensions corresponding to a target scene from original image data and estimated state data contained in initial map data according to an embodiment of the present invention;
FIG. 7 schematically illustrates a flow chart for optimizing pose data of an image corresponding to a three-dimensional point, in accordance with an embodiment of the present invention;
FIG. 8 schematically illustrates a flow chart for optimizing position data of a three-dimensional point according to an embodiment of the invention;
fig. 9 schematically shows a flowchart of optimizing position data of a three-dimensional point in a three-dimensional map having a real scale and pose data of an image corresponding to the three-dimensional point, respectively, according to an embodiment of the present invention;
FIG. 10 schematically shows a flow chart of a three-dimensional map processing method according to another embodiment of the invention;
fig. 11 schematically shows a block diagram of a three-dimensional map processing apparatus according to an embodiment of the present invention;
FIG. 12 schematically shows a block diagram of a build module according to an embodiment of the invention;
FIG. 13 schematically shows a block diagram of an optimization module according to an embodiment of the invention;
FIG. 14 schematically shows a program product for implementing a three-dimensional map processing method according to an embodiment of the present invention; and
fig. 15 schematically shows a block diagram of a computing device for implementing a three-dimensional map processing method according to an embodiment of the present invention.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present invention will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the invention, and are not intended to limit the scope of the invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present invention may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to an embodiment of the invention, a three-dimensional map processing method, a three-dimensional map processing device, a three-dimensional map processing medium and a computing device are provided.
In this context, it should be understood that the terms involved may be terms used to implement portions of the invention or otherwise related to it. For example, the terms may include:
landmark (landmark): objects with distinctive shapes and features in space, such as geometries that can be easily distinguished and detected by associated instruments on a vehicle, can be either natural or man-made. Some landmarks may also contain additional information (e.g., bar codes, two-dimensional codes, etc.).
Position (position): given a three-dimensional coordinate system (cartesian coordinate system), the position of an object in the coordinate system is generally denoted by (x, y, z).
Pose (pose): position plus attitude (orientation). For example: typically (x, y, yaw) in two dimensions and (x, y, z, yaw, pitch, roll) in three dimensions, where the last three elements describe the attitude of the object. Fig. 1 schematically shows a position and attitude diagram of a mobile terminal during movement according to an embodiment of the invention. As shown in FIG. 1, yaw is the heading angle (rotation about the Z-axis), pitch is the pitch angle (rotation about the Y-axis), and roll is the roll angle (rotation about the X-axis).
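As an illustrative aside (not part of the patent's claimed method), the yaw/pitch/roll convention above can be made concrete in code. The helper below, with assumed names, composes a 3x3 rotation matrix in the Z-Y-X order described for Fig. 1, taking angles in radians.

```python
import numpy as np

def rotation_from_euler(yaw, pitch, roll):
    """Compose a 3x3 rotation matrix from yaw (Z), pitch (Y), roll (X) angles in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])   # rotation about Z (yaw)
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])   # rotation about Y (pitch)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])   # rotation about X (roll)
    return Rz @ Ry @ Rx  # Z-Y-X (yaw-pitch-roll) composition

# A 3D pose is then the pair (t, R): a position t in R^3 plus an orientation R.
R = rotation_from_euler(np.pi / 2, 0.0, 0.0)  # a pure 90-degree heading change
```

A full 3D pose pairs such a rotation with a position vector, matching the (x, y, z, yaw, pitch, roll) parameterization above.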
Augmented Reality (augmented reality or AR): a technique that calculates the position and angle of the camera image in real time and adds image analysis, so that the virtual world on the screen can be combined and interact with the real-world scene.
Inertial measurement unit (inertial measurement unit or IMU): a device for measuring the three-axis attitude angles (or angular rates) and acceleration of an object. It generally contains an inertial sensing unit with a built-in three-axis gyroscope and three-axis accelerometer, which measure the angular velocity and acceleration of the object in three-dimensional space, from which the attitude of the object is solved.
Odometer (odometer): the use of a motion sensor (e.g., an inertial measurement unit) to estimate pose changes over the recent past.
Simultaneous Localization And Mapping (Simultaneous Localization And Mapping or SLAM): a process in which a robot (or any carrier) with sensors is placed in an unknown environment and the robot attempts to incrementally build a continuous map and use that map for localization can be referred to as "simultaneous localization and mapping".
Triangulation (triangulation): using the geometric constraints between two frames of images, solve, for each pair of corresponding pixel points, the position of the three-dimensional point they observe and the pose of image 2 relative to image 1.
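To make the definition concrete, here is a minimal linear (DLT) triangulation sketch. It is an illustration with assumed names and identity camera intrinsics, not the patent's implementation, and it assumes the two 3x4 projection matrices are already known rather than also solving for the relative pose.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: recover the 3D point seen at pixel x1 in
    camera 1 (projection matrix P1, 3x4) and at pixel x2 in camera 2 (P2)."""
    # Each observation contributes two rows of the homogeneous system A X = 0.
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)  # least-squares null vector of A
    X = Vt[-1]
    return X[:3] / X[3]          # de-homogenize

# Two cameras 1 unit apart along X, both with identity intrinsics.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
x1 = X_true[:2] / X_true[2]                          # pixel in camera 1
x2 = (X_true[:2] + np.array([-1.0, 0.0])) / X_true[2]  # pixel in camera 2
X_est = triangulate(P1, P2, x1, x2)
```

With noise-free observations the recovered point matches the ground truth exactly; in the method above, the estimated state data constrains the camera poses, and the triangulated points are then refined by adjustment.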
Bundle adjustment (bundle adjustment or BA): given a plurality of images with a common field of view, take the pose of a certain frame as the reference coordinate system, extract landmark features from all the images, and simultaneously optimize the three-dimensional positions of the features in the reference coordinate system and the three-dimensional poses (pose) of the images in that coordinate system. Fig. 2 schematically shows feature points on a plurality of images with a common field of view corresponding to a three-dimensional point in space, according to an embodiment of the present invention. As shown in fig. 2, under the light-ray (bundle) constraint, the binocular (two-view) formulation of the optimization function is extended to N images acquired from N views, with the optimization function built in the same way by minimizing the average error.
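The quantity that BA minimizes can be sketched as follows. This is a hedged illustration with assumed names, unit intrinsics, and no actual optimizer: it only stacks the pixel errors between observed feature points and the projections of the current 3D point and pose estimates.

```python
import numpy as np

def reprojection_residuals(points3d, poses, observations):
    """Stack the pixel errors that bundle adjustment minimizes.

    points3d:     (M, 3) landmark positions in the reference frame
    poses:        list of (R, t) world-to-camera transforms, one per image
    observations: list of (cam_idx, pt_idx, pixel) tuples
    """
    res = []
    for cam_idx, pt_idx, pixel in observations:
        R, t = poses[cam_idx]
        p_cam = R @ points3d[pt_idx] + t    # point in camera coordinates
        proj = p_cam[:2] / p_cam[2]         # pinhole projection (unit intrinsics)
        res.append(proj - pixel)
    return np.concatenate(res)

# One camera at the origin observing one point exactly: zero residual.
poses = [(np.eye(3), np.zeros(3))]
points3d = np.array([[0.5, 0.2, 4.0]])
obs = [(0, 0, np.array([0.125, 0.05]))]
r = reprojection_residuals(points3d, poses, obs)
```

A nonlinear least-squares solver (e.g., Levenberg-Marquardt, mentioned below) would then jointly adjust points3d and poses to minimize the squared norm of this residual vector.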
Structure from motion (structure from motion or SFM): a photogrammetric range-imaging technique that estimates three-dimensional structure from a sequence of two-dimensional images, possibly combined with local motion signals. SFM is typically solved using a bundle adjustment algorithm.
Bundle inertial adjustment (BIA): the three-dimensional positions of landmark features and the three-dimensional image poses computed by BA lack a real scale; by adding constraints from Inertial Measurement Unit (IMU) data between every two frames of images and respectively optimizing the poses of the N images and the M three-dimensional points, a result with a real scale can be obtained (the BIA-optimized result is approximately regarded as having the real scale).
Pose graph model optimization (pose graph optimization or PGO): build a graph model over a series of poses and optimize it with algorithms such as gradient descent (GD), least squares, and Levenberg-Marquardt (LM). Fig. 3 schematically shows a schematic diagram of pose graph model optimization according to an embodiment of the present invention. As shown in fig. 3, the starting point and end point of the original trajectory do not coincide because of noise; by building a graph model over the poses of a series of images and optimizing with such algorithms, an optimized trajectory is obtained whose starting point and end point almost completely coincide.
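A minimal numeric sketch of the idea (positions only, in 2D, with made-up measurements, not the patent's formulation): odometry edges around a square accumulate a small drift, a loop-closure edge ties the last pose back to the first, node 0 is anchored, and the remaining positions are solved by linear least squares.

```python
import numpy as np

# Pose-graph optimization over 2D positions: four poses around a unit square.
# Each edge (i, j, d) measures the displacement x_j - x_i. Node 0 is fixed at
# the origin, so the unknowns are x1, x2, x3, stacked as a 6-vector.
edges = [
    (0, 1, np.array([1.0, 0.0])),
    (1, 2, np.array([0.0, 1.0])),
    (2, 3, np.array([-1.0, 0.04])),   # drifting odometry (noise on y)
    (3, 0, np.array([0.0, -1.0])),    # loop-closure edge back to the start
]
A = np.zeros((2 * len(edges), 6))     # one row per edge per axis
b = np.zeros(2 * len(edges))
for k, (i, j, d) in enumerate(edges):
    for axis in range(2):
        row = 2 * k + axis
        if j != 0:
            A[row, 2 * (j - 1) + axis] = 1.0   # +x_j
        if i != 0:
            A[row, 2 * (i - 1) + axis] = -1.0  # -x_i
        b[row] = d[axis]
x_opt, *_ = np.linalg.lstsq(A, b, rcond=None)
x3 = x_opt[4:6]   # optimized position of the last pose
```

Raw odometry would place the last pose at y = 1.04; the loop-closure constraint pulls it back toward the consistent value, which is how the start and end of the trajectory in Fig. 3 are made to coincide.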
Closed loop (loop closure): judging whether the current data is the data of the place collected once or not through the related calculation of the sensor data (such as images); if yes, adding a constraint edge (edge) in the BA or BIA or PGO model for optimization.
Localization (localization): judging whether the current data is the data of the place collected once or not through the related calculation of the sensor data (such as images); and if so, providing the pose of the sensor in the existing three-dimensional map.
Node (node): the cells in the graphical model refer to poses in the PGO.
Edge (edge): the cells in the graphical model indicate the transformation relationship between the two poses in the PGO.
Image retrieval (image retrieval): given a target image and a database of images, the first k frames of images are retrieved from the database that are sufficiently similar (similarity greater than some threshold) to the target image, where k is a natural number.
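A toy version of this retrieval step, with assumed descriptor vectors and cosine similarity as the similarity measure (real systems use structures such as BOW or an inverted multi-index):

```python
import numpy as np

def retrieve_top_k(query, database, k, threshold=0.0):
    """Return indices of the top-k database descriptors whose cosine
    similarity to the query exceeds the threshold, best match first."""
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    sims = db @ q                    # cosine similarity to every database entry
    order = np.argsort(-sims)        # descending similarity
    return [int(i) for i in order[:k] if sims[i] > threshold]

# Three database descriptors; the query is closest to the first.
db = np.array([[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]])
hits = retrieve_top_k(np.array([1.0, 0.1]), db, k=2, threshold=0.5)
```

Here k = 2 and the threshold filters out frames that are not sufficiently similar, mirroring the definition above.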
Image database (image database): a storage space holding data in a certain format that can support image retrieval, for example via BOW (bag of words) or IMI (inverted multi-index), and provide retrieval services.
The principles and spirit of the present invention are explained in detail below with reference to several representative embodiments of the invention.
Summary of The Invention
The inventor finds that a map of the environment around a mobile terminal can be constructed from data collected by sensors on the terminal, such as a camera and an inertial measurement unit, while the terminal is spatially localized: a mobile terminal equipped with sensors is placed in an unknown environment, and the terminal attempts to incrementally build a continuous map and use that map for localization. The technique can be applied to scenarios such as virtual reality devices, augmented reality devices, and terminal navigation. In the related art, processing algorithms for constructing maps with images as the main data source have appeared. For example, for unordered images (such as assorted pictures from the web), Structure from Motion is mainly used for three-dimensional reconstruction without real scale; for ordered images (such as a continuous piece of video), VSLAM (visual SLAM) is mainly used for three-dimensional reconstruction without real scale, or VISLAM (visual-inertial SLAM) is used for three-dimensional reconstruction with real scale.
However, when the hardware resources of the terminal device are limited, for example when computing, storage, and memory resources are constrained, a map constructed by the terminal device using the related art has low precision, resulting in a poor user experience.
Based on the above analysis, the inventor realized that after a three-dimensional map with a real scale is constructed, separately optimizing the position data of the three-dimensional points in the map and the pose data of the images corresponding to those points provides a finer and more reasonable processing flow that ensures the precision of the three-dimensional map, and is particularly suitable for constructing maps of large-scale real scenes.
Having described the basic principles of the invention, various non-limiting embodiments of the invention are described in detail below.
Application scene overview
First, an application scenario of the three-dimensional map processing method and the device thereof according to the embodiment of the present invention is described in detail with reference to fig. 4.
Fig. 4 schematically shows an application scenario according to an embodiment of the present invention.
As shown in FIG. 4, the edge of the building where user C stands is selected as the origin of coordinates (0, 0, 0); user A is in front of a building, and the coordinates of user A relative to the origin may be (13, 9, 0). Because a map of the environment around a mobile terminal can be constructed from data collected by sensors such as the camera and inertial measurement unit on the terminal, and the terminal can be spatially localized, user A can build a map of the building's surroundings with a mobile terminal, which incrementally establishes a continuous map and localizes within it. For example, as user A walks one lap around the building, the mobile terminal captures the building and its surroundings; after the terminal has captured the data used to construct the three-dimensional map, a coordinate system is established and the three-dimensional map of the current environment is constructed. In the related art, the starting point and end point of user A's original loop trajectory around the building do not coincide because of noise, as shown in fig. 3. With the three-dimensional map processing method of the present disclosure, the position data of the three-dimensional points in the real-scale three-dimensional map and the pose data of the images corresponding to those points are optimized separately, for example by building a graph model over the poses of a series of images and optimizing with algorithms such as gradient descent (GD), least squares, and Levenberg-Marquardt (LM). The starting point and end point of the optimized trajectory almost completely coincide; the trajectories before and after optimization shown in fig. 3 can be compared.
According to the embodiment of the invention, after the three-dimensional map with the real scale is constructed, the position data of the three-dimensional point in the three-dimensional map with the real scale and the pose data of the image corresponding to the three-dimensional point are respectively optimized, so that a more precise and reasonable processing flow can be provided to ensure the precision of the three-dimensional map, and the method is particularly suitable for constructing the map of a large-scale real scene.
Exemplary method
A three-dimensional map processing method according to an exemplary embodiment of the present invention is described below with reference to fig. 5 in conjunction with the application scenario of fig. 4. It should be noted that the above application scenarios are merely illustrated for the convenience of understanding the spirit and principles of the present invention, and the embodiments of the present invention are not limited in this respect. Rather, embodiments of the present invention may be applied to any scenario where applicable.
Fig. 5 schematically shows a flowchart of a three-dimensional map processing method according to an embodiment of the present invention.
As shown in fig. 5, the three-dimensional map processing method according to the embodiment of the present invention includes operations S210 to S230.
In operation S210, initial map data of a target scene is acquired, wherein the initial map data includes at least raw image data and estimated state data at the time of acquiring the raw image.
According to an embodiment of the present invention, the target scene may be a museum, an art exhibition hall, a game exhibition hall, a multi-user AR experience in a large outdoor scene, and the like. The raw image data may be image data acquired by a camera while the simultaneous localization and mapping (SLAM) algorithm is running.
According to the embodiment of the invention, the estimated state data at the time of acquiring the original image includes at least the pose data at the time of acquisition and the motion parameter data of the sensor at that time, for example, the position and velocity of the camera when each frame of image is taken, the acceleration bias and angular velocity bias of the inertial measurement unit (IMU) between two consecutive frames of images, and other IMU parameter data.
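As an illustrative sketch only (not part of the claimed method), the per-frame estimated state described above might be organized as follows; the field names and array shapes are assumptions introduced for illustration:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class FrameState:
    """Estimated state recorded when one original image frame is captured."""
    rotation: np.ndarray    # 3x3 camera orientation at capture time
    position: np.ndarray    # camera position at capture time
    velocity: np.ndarray    # camera velocity at capture time
    accel_bias: np.ndarray  # IMU accelerometer bias between consecutive frames
    gyro_bias: np.ndarray   # IMU gyroscope (angular velocity) bias

# Example state for one frame (values are placeholders)
state = FrameState(rotation=np.eye(3), position=np.zeros(3),
                   velocity=np.zeros(3), accel_bias=np.full(3, 0.02),
                   gyro_bias=np.full(3, 0.001))
```

Keeping all of these quantities together per frame is what later allows the IMU terms to be added as constraints between consecutive frames.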
In operation S220, a three-dimensional map having a real scale corresponding to a target scene is constructed from original image data included in initial map data and estimated state data when the original image is acquired.
According to an embodiment of the present invention, when constructing the three-dimensional map, a suitable three-dimensional reconstruction algorithm may be selected, including but not limited to ORB-SLAM, LSD-SLAM, and LOAM; the sensors associated with these algorithms include, but are not limited to, monocular cameras, binocular (stereo) cameras, depth (RGB-D) cameras, and radar. Generally, for a device configuration of a monocular camera plus an IMU, visual-inertial odometry is performed first to calculate a series of poses while triangulating feature points; a pose graph model is then established for the calculated poses, and if a closed loop exists, the graph model is optimized.
Fig. 6 schematically shows a flowchart of constructing a three-dimensional map having a real scale corresponding to a target scene from original image data and estimated state data contained in initial map data according to an embodiment of the present invention.
As shown in fig. 6, constructing a three-dimensional map having a real scale corresponding to a target scene from original image data included in initial map data and estimated state data at the time of capturing the original image includes operations S221 to S222.
In operation S221, at least two frames of similar original images are determined according to the similarity of feature points in the multiple frames of original images, and tracking association is performed.
According to the embodiment of the invention, feature extraction may be performed on the multiple frames of original images to determine a plurality of feature points in each frame; the similarity between the determined feature points is then calculated, and the feature points whose similarity is greater than a similarity threshold d are found. Based on these feature points, it can be determined whether different original images are similar images, so that at least two frames of similar original images are determined and tracking association is performed on them. The pose of the camera and the three-dimensional coordinates of the feature points are then recovered from the feature points whose similarity is greater than d and the image differences caused by the motion of the camera, and three-dimensional reconstruction is carried out.
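A minimal sketch of the similarity test described above, assuming cosine similarity between descriptor vectors and a hypothetical threshold d = 0.9 (the patent does not fix a particular descriptor type or similarity metric):

```python
import numpy as np

def match_features(desc_a, desc_b, sim_threshold=0.9):
    """Return index pairs (i, j) whose cosine similarity exceeds the threshold d."""
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    sim = a @ b.T                       # pairwise cosine similarity matrix
    matches = []
    for i in range(sim.shape[0]):
        j = int(np.argmax(sim[i]))      # best candidate in the other frame
        if sim[i, j] > sim_threshold:   # keep only matches above threshold d
            matches.append((i, j))
    return matches

rng = np.random.default_rng(0)
da = rng.normal(size=(5, 32))           # descriptors of frame 1
# frame 2 contains a slightly perturbed copy of da[2] plus unrelated features
db = np.vstack([da[2] + 0.01 * rng.normal(size=32), rng.normal(size=(3, 32))])
print(match_features(da, db))           # the perturbed copy matches: (2, 0)
```

Frames sharing enough matched feature points would then be treated as similar images and tracking-associated.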
In operation S222, linear triangulation is performed on at least two similar original images after tracking and association, using the estimated state data when the original image is acquired as a constraint parameter.
According to the embodiment of the invention, in order to obtain a three-dimensional map with a real scale, the triangulation of two frames of images can be generalized to multiple consecutive frames. With the estimated state data as constraint parameters, a beam inertial navigation adjustment method (BIA, i.e., bundle adjustment with inertial constraints) adds a constraint from inertial measurement unit (IMU) data between every two frames of images and optimizes the poses of the N images and the M three-dimensional points respectively, so that a result with a real scale can be obtained. For example, the IMU parameter data are added to the result obtained by linearly triangulating the at least two similar original images, yielding more accurate three-dimensional points and poses with a real scale.
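The linear triangulation step can be sketched with the standard direct linear transform (DLT) over two views; the camera intrinsics, poses, and point below are invented for illustration and are not taken from the patent:

```python
import numpy as np

def triangulate_linear(P1, P2, x1, x2):
    """DLT triangulation of one point seen in two views.
    P1, P2: 3x4 projection matrices; x1, x2: pixel coordinates (u, v)."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)   # null vector of A is the homogeneous point
    X = Vt[-1]
    return X[:3] / X[3]

K = np.diag([500.0, 500.0, 1.0])                                  # toy intrinsics
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])                 # first camera
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])]) # 1 m baseline
X_true = np.array([0.5, 0.2, 4.0])
x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]
print(triangulate_linear(P1, P2, x1, x2))  # recovers X_true = [0.5, 0.2, 4.0]
```

The IMU constraints described above are what fix the otherwise unobservable metric scale of the monocular baseline assumed here.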
In operation S230, the position data of the three-dimensional point in the three-dimensional map with the real scale and the pose data of the image corresponding to the three-dimensional point are optimized, so as to obtain a reconstructed three-dimensional map with the real scale.
According to the embodiment of the invention, for example, when the position data of a three-dimensional point in the three-dimensional map is optimized, a loop closure algorithm is used for loop detection to judge whether the current data were collected at a previously visited place; if so, the detection result is added as a constraint edge to the pose graph model, which is then optimized with algorithms such as least squares, Levenberg-Marquardt (LM), or gradient descent (GD). When the pose data of the images corresponding to the three-dimensional points are optimized after the position data, the beam inertial navigation adjustment (BIA) can be used to average out the errors: an odometry/SLAM algorithm performs pose tracking while a positioning algorithm runs in the background for error correction. Before the correction, the error of the pose information corresponding to an image can be compared with a threshold, and the correction is performed when the error exceeds the threshold; the threshold can be determined by comparing and analyzing a large amount of measured pose data against real pose data.
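A toy sketch of loop-closure correction on a pose graph, assuming a purely translational 2-D trajectory and solving the edge constraints by linear least squares (a simplified stand-in for the LM/GD pose graph optimization described above; all measurements are invented):

```python
import numpy as np

# Noisy odometry edges: four steps around a square, with drift in the last step.
odom = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.1, -1.0]])
# Loop-closure edge: the detector recognizes pose 4 as the starting place,
# so the relative displacement p4 - p0 should be (0, 0).

# Linear system A x = b over poses p1..p4 (p0 fixed at the origin).
A = np.zeros((10, 8)); b = np.zeros(10)
for i in range(4):                                 # odometry edges p_i -> p_{i+1}
    if i > 0:
        A[2*i:2*i+2, 2*i-2:2*i] = -np.eye(2)       # -p_i
    A[2*i:2*i+2, 2*i:2*i+2] = np.eye(2)            # +p_{i+1}
    b[2*i:2*i+2] = odom[i]
A[8:10, 6:8] = np.eye(2)                           # loop edge: p4 - p0 = 0
b[8:10] = 0.0

x, *_ = np.linalg.lstsq(A, b, rcond=None)
poses = x.reshape(4, 2)
print(poses[-1])   # end point pulled from (0.1, 0) back toward the origin
```

With the loop edge, the 0.1 m inconsistency is distributed over all five edges instead of accumulating at the end of the trajectory, which is exactly the effect illustrated by the before/after trajectories of FIG. 3.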
According to the embodiment of the invention, the various states output by the simultaneous localization and mapping (SLAM) algorithm can be taken as input, and the camera and IMU data are combined to perform three-dimensional reconstruction at a real scale, in particular large-scale real-scale reconstruction, for example reconstructing a three-dimensional map of the scene around a museum, thereby obtaining a high-precision three-dimensional map with a real scale.
According to the three-dimensional map processing method provided by the embodiment of the invention, after the three-dimensional map with a real scale is constructed, the position data of the three-dimensional points in the map and the pose data of the images corresponding to those points are optimized respectively. This provides a finer and more reasonable processing flow that guarantees the precision of the three-dimensional map, is particularly suitable for constructing maps of large-scale real scenes, can more reasonably ensure map precision even when the hardware resources of the terminal device are limited, and brings a better experience to users.
Referring now to fig. 7-10, the method of fig. 5 will be further described in conjunction with specific embodiments.
Fig. 7 schematically shows a flowchart for optimizing pose data of an image corresponding to a three-dimensional point according to an embodiment of the present invention.
As shown in fig. 7, optimizing the pose data of the image corresponding to the three-dimensional point includes operations S231 to S233.
In operation S231, at least two frame images satisfying a preset condition are determined from the original image data.
According to the embodiment of the invention, the original image data acquired by the sensor can comprise multiple frames of original images, and the poses corresponding to original images acquired at different times but at the same position may differ; this difference affects the accuracy of the three-dimensional map, so the pose data of the images corresponding to the same three-dimensional points can be optimized. The preset condition may be that the time domains differ but the spatial domain is the same, so that at least two frames of images captured at different times but at the same place can be determined from the original image data. For example, a photograph of Tiananmen Square is taken at three o'clock in the afternoon, and two hours later, at five o'clock, one or more photographs of the square are taken from the same position; the photographs taken at different times are at least two frames of images with different time domains but the same spatial domain. According to the embodiment of the invention, whether the current data were collected at a previously visited place can be judged through closed-loop detection on the sensor data (such as the images), so that at least two frames of images satisfying the preset condition are determined from the original image data.
In operation S232, transformation relationship data of poses corresponding to at least two frames of images satisfying a preset condition is calculated.
According to the embodiment of the invention, the transformation relation data of the poses corresponding to the at least two frames of images may be a transformation matrix between the poses, or a functional transformation relation. For example, as shown in FIG. 4, user A and user B are in front of the same building. For user A, the origin of user A's own coordinate system is (0, 0, 0); the origin of user C's coordinate system seen from user A's local coordinates is (-13, 9, 0), and user A seen from user C's coordinate system is (13, 9, 0). For user B, the origin of user B's own coordinate system is (0, 0, 0); the origin of user C's coordinate system seen from user B's local coordinates is (-4, 9, 0), and user B seen from user C's coordinate system is (4, 9, 0).
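The translation-only case in the example above can be written as a change of rigid frames; the rotation R_ca and translation t_ca below are hypothetical values chosen to match user A of FIG. 4:

```python
import numpy as np

def change_frame(p_a, R_ca, t_ca):
    """Express a point given in frame A in frame C: p_c = R_ca @ p_a + t_ca."""
    return R_ca @ p_a + t_ca

# Frames sharing the same orientation differ only by a translation; here
# t_ca is user A's position expressed in user C's coordinate system.
t_ca = np.array([13.0, 9.0, 0.0])
p_a_in_c = change_frame(np.zeros(3), np.eye(3), t_ca)  # user A's own origin
print(p_a_in_c)  # [13.  9.  0.]
```

In the general case R_ca is a full rotation matrix, and the pair (R_ca, t_ca) is exactly the kind of transformation relation data that operation S232 computes between poses.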
In operation S233, pose data of an image corresponding to the three-dimensional point is optimized based on the transformation relation data and a preset pose graph model.
According to the embodiment of the invention, the transformation relation can be added as an edge to the pose graph model, and the whole three-dimensional map is optimized through the edge constraints thus provided. The pose graph model may be one established in advance for the poses of a series of images, and the pose data of the images corresponding to the three-dimensional points can be optimized with algorithms such as least squares, Levenberg-Marquardt (LM), or gradient descent (GD).
According to the embodiment of the invention, using the transformation relation data as a prior reduces the optimization error as much as possible and improves the accuracy of the three-dimensional map.
Fig. 8 schematically shows a flow chart for optimizing position data of a three-dimensional point according to an embodiment of the invention.
As shown in fig. 8, optimizing the position data of the three-dimensional point includes operations S234 to S235.
In operation S234, the corresponding feature points of the images corresponding to the three-dimensional points are merged.
In operation S235, the position data of the three-dimensional points corresponding to the feature points is calculated according to the pixel coordinates of the merged feature points.
According to the embodiment of the invention, long-term error accumulation has been corrected to a certain degree, and the poses of images originally taken at the same place are relatively close in the spatial domain; therefore, the corresponding feature points of the images corresponding to the three-dimensional points can be merged in order to optimize the position data of the three-dimensional points.
According to the embodiment of the invention, since the feature points corresponding to the same three-dimensional point in the images are merged, each three-dimensional point in space obtains the pixel coordinates of a unique corresponding feature point, from which the pose of the camera and the three-dimensional coordinates of the point can be recovered. Merging the feature points of the spatial three-dimensional points on the images thus optimizes the position data of the three-dimensional points, thereby achieving the effect of optimizing the map.
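Merging duplicated feature observations into a single track per three-dimensional point can be sketched with a union-find structure; the match-pair input format is an assumption introduced for illustration:

```python
def merge_tracks(matches, n_features):
    """Union-find merge of feature observations judged to correspond to the
    same 3-D point, so each point keeps a single merged track."""
    parent = list(range(n_features))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in matches:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    groups = {}
    for i in range(n_features):
        groups.setdefault(find(i), []).append(i)
    # keep only tracks where observations were actually merged
    return [g for g in groups.values() if len(g) > 1]

print(merge_tracks([(0, 1), (1, 2), (3, 4)], 6))  # → [[0, 1, 2], [3, 4]]
```

Each merged group then contributes one set of pixel coordinates from which the position of the corresponding three-dimensional point is recomputed, as in operation S235.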
Fig. 9 schematically shows a flowchart for optimizing position data of a three-dimensional point in a three-dimensional map having a real scale and pose data of an image corresponding to the three-dimensional point, respectively, according to an embodiment of the present invention.
As shown in fig. 9, optimizing the position data of the three-dimensional point in the three-dimensional map having the real scale and the pose data of the image corresponding to the three-dimensional point respectively includes operations S236 to S237.
In operation S236, the pose data of the image corresponding to the three-dimensional point is optimized to obtain a three-dimensional map with optimized pose data.
In operation S237, the position data of the three-dimensional points in the three-dimensional map with optimized pose data is optimized, so as to obtain a reconstructed three-dimensional map with a real scale.
According to the embodiment of the invention, the pose data of the images corresponding to the three-dimensional points is optimized first, so that long-term error accumulation is corrected to a certain degree and the poses of images originally taken at the same place become relatively close in the spatial domain. Then, when the position data of the three-dimensional points in the pose-optimized three-dimensional map is optimized, the corresponding feature points of the images corresponding to the three-dimensional points can be merged and the positions of the three-dimensional points determined from the merged feature points, so that the position data of the three-dimensional points is optimized, the map is further optimized, and the accuracy of the constructed three-dimensional map is further improved.
Fig. 10 schematically shows a flowchart of a three-dimensional map processing method according to another embodiment of the present invention.
As shown in fig. 10, after optimizing the position data of the three-dimensional point in the three-dimensional map having the real scale and the pose data of the image corresponding to the three-dimensional point, respectively, the three-dimensional map processing method according to the embodiment of the present invention further includes operations S240 to S250.
In operation S240, the optimized three-dimensional point position data is coupled with pose data of an image corresponding to the three-dimensional point, so as to obtain a coupled three-dimensional map.
In operation S250, the coupled three-dimensional map is optimized again by using the beam inertial navigation adjustment method, so as to obtain a re-optimized three-dimensional map with a real scale.
According to the embodiment of the invention, for the coupled three-dimensional map, a constraint from inertial measurement unit (IMU) data can be added between every two frames of images through the beam inertial navigation adjustment method, and the poses of the N images and the M three-dimensional points are optimized respectively, so that the re-optimized three-dimensional map with a real scale is obtained. The embodiment of the invention thus provides a finer and more reasonable computation, further optimizes the map, and further improves the accuracy of the reconstructed three-dimensional map.
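A simplified sketch of the cost that a beam-inertial-navigation-adjustment-style joint optimization would minimize: visual reprojection residuals for the three-dimensional points plus an inertial motion residual between consecutive frames. The constant-acceleration term here stands in for a full IMU preintegration factor, and all numbers are illustrative only:

```python
import numpy as np

def reprojection_residual(K, R, t, X, uv):
    """Pixel reprojection error of 3-D point X in a camera with pose (R, t)."""
    x = K @ (R @ X + t)
    return x[:2] / x[2] - uv

def imu_residual(p_i, p_j, v_i, dt, accel_world):
    """Constant-acceleration motion constraint between consecutive frames
    (a simplified stand-in for an IMU preintegration term)."""
    predicted = p_i + v_i * dt + 0.5 * accel_world * dt**2
    return p_j - predicted

def joint_cost(visual_residuals, imu_residuals, w_imu=1.0):
    """Sum of squared visual and (weighted) inertial residuals."""
    return (sum(np.dot(r, r) for r in visual_residuals)
            + w_imu * sum(np.dot(r, r) for r in imu_residuals))

# A perfectly consistent toy state yields zero cost.
K = np.diag([500.0, 500.0, 1.0])
r_vis = reprojection_residual(K, np.eye(3), np.zeros(3),
                              np.array([0.0, 0.0, 2.0]), np.zeros(2))
r_imu = imu_residual(np.zeros(3), np.array([1.0, 0.0, 0.0]),
                     np.array([1.0, 0.0, 0.0]), 1.0, np.zeros(3))
print(joint_cost([r_vis], [r_imu]))  # 0.0
```

Minimizing such a joint cost over the N image poses and M three-dimensional points simultaneously is what ties the reconstruction to the real metric scale supplied by the IMU.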
Exemplary devices
Having introduced the method of an exemplary embodiment of the present invention, a three-dimensional map processing apparatus of an exemplary embodiment of the present invention is described next with reference to fig. 11.
Fig. 11 schematically shows a block diagram of a three-dimensional map processing apparatus according to an embodiment of the present invention.
As shown in fig. 11, the three-dimensional map processing apparatus 400 includes an acquisition module 410, a construction module 420, and an optimization module 430.
The obtaining module 410 is configured to obtain initial map data of a target scene, where the initial map data at least includes raw image data and estimated state data when the raw image is acquired.
The construction module 420 is configured to construct a three-dimensional map having a real scale corresponding to the target scene from the original image data and the estimated state data included in the initial map data.
The optimization module 430 is configured to optimize position data of a three-dimensional point in a three-dimensional map with a real scale and pose data of an image corresponding to the three-dimensional point, respectively, to obtain a reconstructed three-dimensional map with a real scale.
According to the three-dimensional map processing apparatus of the embodiment of the present invention, after the three-dimensional map with a real scale is constructed, the position data of the three-dimensional points in the map and the pose data of the images corresponding to those points are optimized respectively, providing a finer and more reasonable processing flow to guarantee the precision of the three-dimensional map; the apparatus is particularly suitable for constructing maps of large-scale real scenes.
FIG. 12 schematically shows a block diagram of a build module according to an embodiment of the invention.
As shown in fig. 12, the building module 420 comprises an association unit 421 and a processing unit 422.
The associating unit 421 is configured to determine at least two frames of similar original images according to the feature point similarity in the multiple frames of original images and perform tracking association.
The processing unit 422 is configured to perform linear triangulation on at least two similar original images after tracking and associating, using the estimated state data as a constraint parameter.
FIG. 13 schematically shows a block diagram of an optimization module according to an embodiment of the invention.
As shown in fig. 13, the optimization module 430 includes a determination unit 431, a first calculation unit 432, and an optimization unit 433.
The determining unit 431 is configured to determine at least two frames of images satisfying a preset condition from the original image data.
The first calculating unit 432 is configured to calculate transformation relation data of poses corresponding to at least two frames of images meeting a preset condition.
The optimizing unit 433 is configured to optimize pose data of an image corresponding to the three-dimensional point based on the transformation relation data and a preset pose graph model.
According to the embodiment of the present invention, the determining unit 431 is configured to determine, from the original image data, at least two frames of images that differ in the time domain but share the same spatial domain.
According to the embodiment of the invention, the data based on the transformation relation is used as a priori, so that the optimization error can be reduced as much as possible, and the accuracy of the three-dimensional map is improved.
According to an embodiment of the present invention, as shown in fig. 13, the optimization module 430 further includes a merging unit 434 and a second calculation unit 435.
The merging unit 434 is configured to merge corresponding feature points of the image corresponding to the three-dimensional points.
The second calculating unit 435 is configured to calculate, according to the pixel coordinates of the feature points after the merging processing, position data of the three-dimensional points corresponding to the feature points.
According to the embodiment of the invention, since the feature points corresponding to the same three-dimensional point in the images are merged, each three-dimensional point in space obtains the pixel coordinates of a unique corresponding feature point, from which the pose of the camera and the three-dimensional coordinates of the point can be recovered. Merging the feature points of the spatial three-dimensional points on the images thus optimizes the position data of the three-dimensional points, thereby achieving the effect of optimizing the map.
According to the embodiment of the present invention, the optimization module 430 is configured to optimize pose data of an image corresponding to the three-dimensional point to obtain a three-dimensional map after the pose data is optimized; and optimizing the position data of the three-dimensional points in the three-dimensional map after the pose data is optimized to obtain the reconstructed three-dimensional map with the real scale.
According to the embodiment of the invention, the pose data of the images corresponding to the three-dimensional points is optimized, long-term error accumulation is corrected to a certain degree, and the poses of images taken at the same place become relatively close in the spatial domain. Then, when the position data of the three-dimensional points in the pose-optimized three-dimensional map is optimized, the corresponding feature points of the images corresponding to the three-dimensional points can be merged and the positions of the three-dimensional points determined from the merged feature points, so that the position data of the three-dimensional points is optimized, the map is further optimized, and the accuracy of the constructed three-dimensional map is further improved.
According to an embodiment of the present invention, as shown in fig. 11, the three-dimensional map processing apparatus 400 further includes a coupling module 440.
The coupling module 440 is configured to couple the optimized position data of the three-dimensional point and the pose data of the image corresponding to the three-dimensional point after optimizing the position data of the three-dimensional point in the three-dimensional map with the real scale and the pose data of the image corresponding to the three-dimensional point, respectively, to obtain a coupled three-dimensional map.
The optimization module 430 is configured to optimize the coupled three-dimensional map again by using the beam inertial navigation adjustment method, so as to obtain a re-optimized three-dimensional map with a real scale.
By the embodiment of the invention, a finer and more reasonable calculation is provided, the map can be further optimized, and the accuracy of the reconstructed three-dimensional map can be further improved.
According to the embodiment of the invention, the estimation state data when the original image is acquired at least comprises the pose data when the original image is acquired and the motion parameter data of the sensor when the original image is acquired.
Exemplary Medium
Having described the apparatus of the exemplary embodiment of the present invention, next, a medium of the exemplary embodiment of the present invention for storing computer-executable instructions, which when executed by a processing unit, implement the three-dimensional map processing method of fig. 5 to 10 will be described with reference to fig. 14.
In some possible embodiments, aspects of the present invention may also be implemented in the form of a program product including program code for causing a computing device to perform operations in the three-dimensional map processing method according to various exemplary embodiments of the present invention described in the "exemplary method" section above of this specification when the program product is run on the computing device, for example, the computing device may perform operation S210 as shown in fig. 5 to acquire initial map data of a target scene, where the initial map data includes at least original image data and estimated state data when the original image was acquired. In operation S220, a three-dimensional map having a real scale corresponding to the target scene is constructed according to the original image data and the estimated state data included in the initial map data. Operation S230 is performed to optimize the position data of the three-dimensional point in the three-dimensional map with the real scale and the pose data of the image corresponding to the three-dimensional point, so as to obtain a reconstructed three-dimensional map with the real scale.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Fig. 14 schematically shows a program product for implementing a three-dimensional map processing method according to an embodiment of the present invention.
As shown in fig. 14, a program product 50 of a three-dimensional map processing method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a computing device, such as a personal computer. However, the program product of the present invention is not limited in this respect, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java or C++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device over any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., over the internet using an internet service provider).
Exemplary computing device
Having described the methods, media, and apparatus of exemplary embodiments of the present invention, a computing device of exemplary embodiments of the present invention is described next with reference to fig. 15, and includes a processing unit and a storage unit, the storage unit storing computer-executable instructions that, when executed by the processing unit, implement the three-dimensional map processing methods of fig. 5-10.
The embodiment of the invention also provides the computing equipment. As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," module "or" system.
In some possible embodiments, a computing device according to the present invention may include at least one processing unit, and at least one memory unit. Wherein the storage unit stores program code that, when executed by the processing unit, causes the processing unit to perform the steps in the three-dimensional map processing method according to various exemplary embodiments of the present invention described in the above section "exemplary method" of the present specification. For example, the processing unit may perform operation S210 as shown in fig. 5, acquiring initial map data of a target scene, wherein the initial map data includes at least raw image data and estimated state data at the time of acquiring the raw image. In operation S220, a three-dimensional map having a real scale corresponding to a target scene is constructed from original image data and estimated state data included in initial map data. Operation S230 is performed to optimize the position data of the three-dimensional point in the three-dimensional map with the real scale and the pose data of the image corresponding to the three-dimensional point, so as to obtain a reconstructed three-dimensional map with the real scale.
Fig. 15 schematically shows a block diagram of a computing device for implementing a three-dimensional map processing method according to an embodiment of the present invention.
A computing device 60 implementing the three-dimensional map processing method according to this embodiment of the present invention is described below with reference to fig. 15. The computing device 60 shown in fig. 15 is only one example and should not limit the scope of use or functionality of embodiments of the present invention.
As shown in fig. 15, computing device 60 takes the form of a general purpose computing device. Components of computing device 60 may include, but are not limited to: the at least one processing unit 601, the at least one storage unit 602, and a bus 603 that couples various system components (including the storage unit 602 and the processing unit 601).
Bus 603 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures.
The storage unit 602 may include readable media in the form of volatile memory, such as Random Access Memory (RAM) 6021 and/or cache memory 6022, and may further include read-only memory (ROM) 6023.
The storage unit 602 may also include a program/utility 6025 having a set (at least one) of program modules 6024. Such program modules 6024 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment.
Computing device 60 may also communicate with one or more external devices 604 (e.g., a keyboard, a pointing device, a Bluetooth device, etc.), with one or more devices that enable a user to interact with computing device 60, and/or with any device (e.g., a router, a modem, etc.) that enables computing device 60 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 605. Also, computing device 60 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through network adapter 606. As shown, network adapter 606 communicates with the other modules of computing device 60 over bus 603. It should be appreciated that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with computing device 60, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
It should be noted that although several units/modules or sub-units/modules of the apparatus are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, according to embodiments of the invention, the features and functions of two or more of the units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided so as to be embodied by a plurality of units/modules.
Further, while operations of the methods of the invention are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one, and/or one step may be broken down into multiple steps.
While the spirit and principles of the invention have been described with reference to several particular embodiments, it is to be understood that the invention is not limited to the disclosed embodiments. Nor does the division into aspects mean that features in these aspects cannot be combined to advantage; that division is for convenience of presentation only. The invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (8)

1. A three-dimensional map processing method, comprising:
acquiring initial map data of a target scene, wherein the initial map data comprises at least original image data and estimated state data at the time the original images were acquired, and the estimated state data comprises: pose data at the time an original image was acquired and motion parameter data of a sensor at the time the original image was acquired;
constructing a three-dimensional map having a real scale corresponding to the target scene from the original image data and the estimated state data included in the initial map data, including: determining at least two frames of similar original images according to the similarity of feature points in multiple frames of original images, and performing tracking association; and performing linear triangulation on the at least two similar original images after tracking association, with the estimated state data as a constraint parameter, wherein the linear triangulation comprises: using the estimated state data as a constraint parameter, adding the constraint of the motion parameter data between every two frames of images by means of a bundle adjustment method with inertial navigation constraints, and optimizing the poses of the N images and the M three-dimensional points respectively, so as to obtain a result with a real scale; and
respectively optimizing the position data of the three-dimensional points in the three-dimensional map with the real scale and the pose data of the images corresponding to the three-dimensional points, wherein the optimizing comprises: optimizing the pose data of the images corresponding to the three-dimensional points to obtain a three-dimensional map with optimized pose data, and then optimizing the position data of the three-dimensional points in that map to obtain a reconstructed three-dimensional map with a real scale;
wherein optimizing the pose data comprises: determining, from the original image data, at least two frames of images that are different in time domain but identical in space domain, wherein the at least two frames of images are determined through loop closure detection; calculating transformation relation data between the poses corresponding to the at least two frames of images that are different in time domain but identical in space domain; and optimizing the pose data of the images corresponding to the three-dimensional points based on the transformation relation data and a preset pose graph model, including: adding the transformation relation data as an edge to the pose graph model to serve as a constraint of the model, wherein the pose graph model optimizes the pose data by taking the constraint as prior knowledge;
optimizing the location data comprises: merging the corresponding feature points of the images corresponding to the three-dimensional points, and calculating the position data of the three-dimensional points corresponding to the feature points according to the pixel coordinates of the merged feature points;
optimizing the pose data of the images corresponding to the three-dimensional points corrects long-term error accumulation to a certain degree, so that, viewed in the space domain, the poses of images originally captured at the same place become relatively close; when the position data of the three-dimensional points in the three-dimensional map is then optimized after the pose data has been optimized, the corresponding feature points of the images corresponding to the three-dimensional points are merged, and the positions of the three-dimensional points are determined from the merged feature points so as to optimize the position data of the three-dimensional points;
thereby obtaining the reconstructed three-dimensional map with a real scale.
2. The method according to claim 1, wherein after optimizing the position data of the three-dimensional point in the three-dimensional map having the real scale and the pose data of the image corresponding to the three-dimensional point, respectively, the method further comprises:
coupling the optimized three-dimensional point position data with pose data of an image corresponding to the three-dimensional point to obtain a coupled three-dimensional map;
and optimizing the coupled three-dimensional map again by means of a bundle adjustment method with inertial navigation constraints, to obtain an optimized three-dimensional map with a real scale.
3. The method of any of claims 1 to 2, wherein the estimated state data at the time of acquiring a raw image comprises at least pose data at the time of acquiring the raw image and motion parameter data of a sensor at the time of acquiring the raw image.
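The linear triangulation that the method claims above build on can be illustrated by classical two-view DLT triangulation. This is a textbook technique shown without the inertial constraints the claims add on top of it; the code is our illustrative sketch, not the patent's implementation:

```python
import numpy as np

def linear_triangulate(P1, P2, x1, x2):
    """Classical DLT: recover a 3-D point from its projections in two views.
    P1, P2 are 3x4 projection matrices; x1, x2 are (u, v) pixel coordinates.
    Each observation contributes two linear equations in the homogeneous
    point X; the solution is the last right singular vector of the stack."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

# Two views separated by a pure translation of 1 unit along x
# (identity intrinsics for simplicity).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
x1 = (P1 @ np.append(X_true, 1.0))[:2] / (P1 @ np.append(X_true, 1.0))[2]
x2 = (P2 @ np.append(X_true, 1.0))[:2] / (P2 @ np.append(X_true, 1.0))[2]
X_rec = linear_triangulate(P1, P2, x1, x2)
```

Plain two-view triangulation like this recovers geometry only up to the scale of the baseline; it is precisely the claimed constraint from the sensor's motion parameter data that pins that baseline, and hence the map, to a real scale.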
4. A three-dimensional map processing apparatus comprising:
an obtaining module, configured to obtain initial map data of a target scene, wherein the initial map data comprises at least original image data and estimated state data at the time the original images were acquired, and the estimated state data comprises: pose data at the time an original image was acquired and motion parameter data of a sensor at the time the original image was acquired;
a construction module configured to construct a three-dimensional map having a real scale corresponding to the target scene according to the original image data and the estimated state data included in the initial map data, wherein the construction module includes:
a correlation unit, configured to determine at least two frames of similar original images according to the similarity of feature points in multiple frames of original images, and to perform tracking association; and
a processing unit, configured to perform linear triangulation on the at least two similar original images after tracking association, with the estimated state data as a constraint parameter, including: using the estimated state data as a constraint parameter, adding the constraint of the motion parameter data between every two frames of images by means of a bundle adjustment method with inertial navigation constraints, and optimizing the poses of the N images and the M three-dimensional points respectively, so as to obtain a result with a real scale;
and
an optimization module, configured to respectively optimize the position data of the three-dimensional points in the three-dimensional map with the real scale and the pose data of the images corresponding to the three-dimensional points, including: optimizing the pose data of the images corresponding to the three-dimensional points to obtain a three-dimensional map with optimized pose data, and optimizing the position data of the three-dimensional points in that map to obtain a reconstructed three-dimensional map with a real scale,
wherein the optimization module comprises: a determining unit, configured to determine, from the original image data, at least two frames of images that are different in time domain but identical in space domain, wherein the at least two frames of images are determined through loop closure detection;
a first calculation unit, configured to calculate transformation relation data between the poses corresponding to the at least two frames of images that are different in time domain but identical in space domain; and
an optimization unit, configured to optimize the pose data of the images corresponding to the three-dimensional points based on the transformation relation data and a preset pose graph model, including: adding the transformation relation data as an edge to the pose graph model to serve as a constraint of the model, wherein the pose graph model optimizes the pose data by taking the constraint as prior knowledge;
wherein optimizing the location data comprises: merging the corresponding feature points of the images corresponding to the three-dimensional points, and calculating the position data of the three-dimensional points corresponding to the feature points according to the pixel coordinates of the merged feature points;
the method comprises the steps of optimizing pose data of an image corresponding to the three-dimensional point to enable long-time error accumulation to be corrected to a certain degree, enabling the poses of images shot in the same place to be close to each other in a spatial domain, combining corresponding feature points of the images corresponding to the three-dimensional point when optimizing position data of the three-dimensional point in a three-dimensional map after optimizing the pose data, and determining the position of the three-dimensional point according to the combined feature points to optimize the position data of the three-dimensional point.
5. The apparatus of claim 4, wherein the apparatus further comprises:
a coupling module, configured to couple the optimized three-dimensional point position data with the pose data of the images corresponding to the three-dimensional points, after respectively optimizing the position data of the three-dimensional points in the three-dimensional map with the real scale and the pose data of the images corresponding to the three-dimensional points, to obtain a coupled three-dimensional map;
and the optimization module is further configured to optimize the coupled three-dimensional map again by means of a bundle adjustment method with inertial navigation constraints, to obtain an optimized three-dimensional map with a real scale.
6. The apparatus of any of claims 4 to 5, wherein the estimated state data at the time of acquiring a raw image comprises at least pose data at the time of acquiring the raw image and motion parameter data of a sensor at the time of acquiring the raw image.
7. A computer-readable storage medium storing computer-executable instructions for implementing the three-dimensional map processing method of any one of claims 1 to 3 when executed by a processing unit.
8. A computing device, comprising:
a processing unit; and
a storage unit storing computer-executable instructions for implementing the three-dimensional map processing method of any one of claims 1 to 3 when executed by the processing unit.
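The pose graph optimization recited in claims 1 and 4 — adding the loop-closure transformation relation as an extra edge that constrains the model — can be illustrated with a deliberately minimal least-squares pose graph over 1-D poses. The 1-D simplification and all names here are ours, not the patent's; real systems optimize full 6-DoF poses with nonlinear solvers:

```python
import numpy as np

def optimize_pose_graph(n_poses, edges, anchor_weight=1e6):
    """Least-squares pose-graph optimization over 1-D scalar poses.

    edges: list of (i, j, measured_delta) meaning x[j] - x[i] ~ measured_delta.
    Odometry edges and loop-closure edges are treated uniformly: a loop
    closure is simply one more edge added to the graph, mirroring the
    claim's "adding the transformation relation data as an edge"."""
    rows, rhs = [], []
    # Anchor the first pose at the origin (removes the gauge freedom).
    a = np.zeros(n_poses)
    a[0] = anchor_weight
    rows.append(a)
    rhs.append(0.0)
    for i, j, delta in edges:
        r = np.zeros(n_poses)
        r[i], r[j] = -1.0, 1.0
        rows.append(r)
        rhs.append(delta)
    A, b = np.vstack(rows), np.array(rhs)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# Four odometry edges (each step measured as +1.0) and one loop-closure
# edge relating pose 4 directly back to pose 0; the closure redistributes
# the accumulated drift over all poses, which is the "long-time error
# accumulation corrected to a certain degree" effect described above.
odometry = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]
loop = [(0, 4, 3.8)]
x = optimize_pose_graph(5, odometry + loop)
```

With equal weights the 0.2 units of drift are spread evenly, so the refined poses come out near 0, 0.96, 1.92, 2.88, 3.84 rather than the drifted 0, 1, 2, 3, 4.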
CN201811363016.3A 2018-11-15 2018-11-15 Three-dimensional map processing method, device, medium and computing equipment Active CN109461208B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811363016.3A CN109461208B (en) 2018-11-15 2018-11-15 Three-dimensional map processing method, device, medium and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811363016.3A CN109461208B (en) 2018-11-15 2018-11-15 Three-dimensional map processing method, device, medium and computing equipment

Publications (2)

Publication Number Publication Date
CN109461208A CN109461208A (en) 2019-03-12
CN109461208B true CN109461208B (en) 2022-12-16

Family

ID=65610611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811363016.3A Active CN109461208B (en) 2018-11-15 2018-11-15 Three-dimensional map processing method, device, medium and computing equipment

Country Status (1)

Country Link
CN (1) CN109461208B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110060343B (en) * 2019-04-24 2023-06-20 阿波罗智能技术(北京)有限公司 Map construction method and system, server and computer readable medium
CN110298921B (en) * 2019-07-05 2023-07-07 青岛中科智保科技有限公司 Method for constructing three-dimensional map with character semantic information and processing equipment
CN110749308B (en) * 2019-09-30 2021-10-29 浙江工业大学 SLAM-oriented outdoor positioning method using consumer-grade GPS and 2.5D building models
US11521329B2 (en) * 2019-11-22 2022-12-06 Baidu Usa Llc Updated point cloud registration pipeline based on ADMM algorithm for autonomous vehicles
CN111080784B (en) * 2019-11-27 2024-04-19 贵州宽凳智云科技有限公司北京分公司 Ground three-dimensional reconstruction method and device based on ground image texture
CN110956697B (en) * 2019-12-20 2023-05-05 上海有个机器人有限公司 Memory optimization method, medium, terminal and device based on laser slam
CN111738087B (en) * 2020-05-25 2023-07-25 完美世界(北京)软件科技发展有限公司 Method and device for generating face model of game character
CN111698422B (en) * 2020-06-10 2022-01-25 百度在线网络技术(北京)有限公司 Panoramic image acquisition method and device, electronic equipment and storage medium
CN112184768B (en) * 2020-09-24 2023-10-31 杭州易现先进科技有限公司 SFM reconstruction method and device based on laser radar and computer equipment
CN113178000B (en) * 2021-03-26 2022-06-24 杭州易现先进科技有限公司 Three-dimensional reconstruction method and device, electronic equipment and computer storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8761439B1 (en) * 2011-08-24 2014-06-24 Sri International Method and apparatus for generating three-dimensional pose using monocular visual sensor and inertial measurement unit
CN105869136A (en) * 2015-01-22 2016-08-17 北京雷动云合智能技术有限公司 Collaborative visual SLAM method based on multiple cameras
US10643079B2 (en) * 2017-03-31 2020-05-05 Alarm.Com Incorporated Supervised delivery techniques
CN107193279A (en) * 2017-05-09 2017-09-22 复旦大学 Robot localization and map structuring system based on monocular vision and IMU information
CN108489482B (en) * 2018-02-13 2019-02-26 视辰信息科技(上海)有限公司 The realization method and system of vision inertia odometer
CN108665540A (en) * 2018-03-16 2018-10-16 浙江工业大学 Robot localization based on binocular vision feature and IMU information and map structuring system
CN108520543B (en) * 2018-04-09 2022-08-09 杭州易现先进科技有限公司 Method, equipment and storage medium for optimizing relative precision map

Also Published As

Publication number Publication date
CN109461208A (en) 2019-03-12

Similar Documents

Publication Publication Date Title
CN109461208B (en) Three-dimensional map processing method, device, medium and computing equipment
CN109087359B (en) Pose determination method, pose determination apparatus, medium, and computing device
CN107888828B (en) Space positioning method and device, electronic device, and storage medium
US11313684B2 (en) Collaborative navigation and mapping
CN109084746B (en) Monocular mode for autonomous platform guidance system with auxiliary sensor
US11668571B2 (en) Simultaneous localization and mapping (SLAM) using dual event cameras
CN108700947B (en) System and method for concurrent ranging and mapping
CN108805917B (en) Method, medium, apparatus and computing device for spatial localization
US20200092473A1 (en) Connecting And Using Building Data Acquired From Mobile Devices
CN110084832B (en) Method, device, system, equipment and storage medium for correcting camera pose
US9888215B2 (en) Indoor scene capture system
US9270891B2 (en) Estimation of panoramic camera orientation relative to a vehicle coordinate frame
CN107478220B (en) Unmanned aerial vehicle indoor navigation method and device, unmanned aerial vehicle and storage medium
US9342927B2 (en) Augmented reality system for position identification
Chen et al. Rise of the indoor crowd: Reconstruction of building interior view via mobile crowdsourcing
US10937214B2 (en) System and method for merging maps
CN107748569B (en) Motion control method and device for unmanned aerial vehicle and unmanned aerial vehicle system
CN111445526B (en) Method, device and storage medium for estimating pose of image frame
US11557083B2 (en) Photography-based 3D modeling system and method, and automatic 3D modeling apparatus and method
CN112219087A (en) Pose prediction method, map construction method, movable platform and storage medium
JP2020067439A (en) System and method for estimating position of moving body
CN109300143B (en) Method, device and equipment for determining motion vector field, storage medium and vehicle
WO2014011346A1 (en) Sensor-aided wide-area localization on mobile devices
CN111127524A (en) Method, system and device for tracking trajectory and reconstructing three-dimensional image
CN112731503B (en) Pose estimation method and system based on front end tight coupling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20190619

Address after: 311200 Room 102, 6 Blocks, C District, Qianjiang Century Park, Xiaoshan District, Hangzhou City, Zhejiang Province

Applicant after: Hangzhou Yixian Advanced Technology Co.,Ltd.

Address before: 310052 Building No. 599, Changhe Street Network Business Road, Binjiang District, Hangzhou City, Zhejiang Province, 4, 7 stories

Applicant before: NETEASE (HANGZHOU) NETWORK Co.,Ltd.

GR01 Patent grant