CN113744308B - Pose optimization method, pose optimization device, electronic equipment, medium and program product - Google Patents

Info

Publication number
CN113744308B
CN113744308B (application CN202110903076.5A)
Authority
CN
China
Prior art keywords
data
pose
key frame
pass
acquired data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110903076.5A
Other languages
Chinese (zh)
Other versions
CN113744308A (en)
Inventor
罗中飞
边威
宋海涛
孟祥冰
谢诗超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Autonavi Software Co Ltd
Original Assignee
Autonavi Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Autonavi Software Co Ltd filed Critical Autonavi Software Co Ltd
Priority to CN202110903076.5A priority Critical patent/CN113744308B/en
Publication of CN113744308A publication Critical patent/CN113744308A/en
Application granted granted Critical
Publication of CN113744308B publication Critical patent/CN113744308B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05Geographic models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30241Trajectory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30244Camera pose
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Remote Sensing (AREA)
  • Computer Graphics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the disclosure provide a pose optimization method and apparatus, an electronic device, a medium, and a program product. The pose optimization method includes: acquiring collected data of two or more passes over the same road, where one pass of collected data includes feature points of key frame images and the corresponding initial pose data; first grouping the data collected in different passes and aligning the passes within the same group to obtain first optimized pose data; performing pose graph optimization on the first optimized pose data to obtain second optimized pose data, where the pose graph optimization includes at least inter-frame relative pose constraints between key frame images that share a co-view region across two passes; and finally performing global bundle adjustment (BA) optimization, so that the precision of the pose data increases step by step. This scheme improves both the absolute and the relative precision of the overall pose and recovers higher-precision pose data, providing a sound basis for reconstructing high-precision map elements and hence a more accurate high-precision map.

Description

Pose optimization method, pose optimization device, electronic equipment, medium and program product
Technical Field
The embodiment of the disclosure relates to the technical field of high-precision maps, in particular to a pose optimization method, a pose optimization device, electronic equipment, a medium and a program product.
Background
In recent years, research on autonomous or intelligent driving has become a hotspot and trend, and one implementation path for autonomous driving depends on high-precision maps (also called high-definition or HD maps). A high-precision map provides beyond-line-of-sight perception for autonomous driving and is an important basis for driving decisions.
However, because the real world changes dynamically, a high-precision map, as a digital representation of the real world, must be updated continuously so that its geographic elements and their attributes stay consistent with reality. The inventors found that professional high-precision map collection vehicles are too expensive to deploy at scale, which makes timely and efficient updating of high-precision maps difficult to achieve.
To meet the updating requirements of high-precision maps, low-cost map construction has become a focus of industrial research, and accurate pose data is the key to extracting high-precision map elements from low-cost mapping. How to improve the accuracy of pose data is therefore a problem that those skilled in the art continuously need to solve and optimize.
Disclosure of Invention
The embodiment of the disclosure provides a pose optimization method, a pose optimization device, electronic equipment, a medium and a program product.
In a first aspect, a pose optimization method is provided in an embodiment of the present disclosure.
Specifically, the pose optimization method comprises the following steps:
acquiring collected data of two or more passes over the same road, where one pass of collected data includes: feature points of key frame images and initial pose data corresponding to the key frame images;
for each single pass of collected data, performing feature point matching between key frame images in that pass and key frame images in other passes, to obtain matching results between key frame images of different passes, where the matching results include whether a co-view region exists between two key frame images;
grouping the collected data of different passes based on the co-view regions of key frame images across passes;
for the different passes of collected data within the same group, selecting one pass and aligning the other passes in the group based on the initial pose data of the selected pass, to obtain first optimized pose data of the key frame images in the group;
performing pose graph optimization on the first optimized pose data of the key frame images to obtain second optimized pose data, where the pose graph optimization includes at least inter-frame relative pose constraints between key frame images sharing co-view regions across two passes;
and performing global bundle adjustment (BA) optimization based on the feature points of the key frame images and the second optimized pose data, to obtain three-dimensional feature points of the key frame images and third optimized pose data, the precision of the pose data increasing step by step.
With reference to the first aspect, in a first implementation manner of the first aspect, grouping the different passes of collected data based on co-view regions of key frame images across passes includes:
dividing the passes of collected data that satisfy a grouping condition into one group, where the grouping condition is that a key frame image of each pass in the group shares a co-view region with a key frame image of at least one other pass in the group.
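The grouping condition above amounts to forming connected components over passes: any two passes linked by at least one co-view key frame pair end up in the same group. A minimal sketch using union-find (the function name and the union-find approach are illustrative choices, not specified by the patent):

```python
def group_passes(n_passes, coview_pairs):
    """Group passes so that passes sharing a co-view region land in the
    same group: connected components computed with union-find."""
    parent = list(range(n_passes))

    def find(x):
        # path-halving find
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in coview_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for p in range(n_passes):
        groups.setdefault(find(p), []).append(p)
    return sorted(groups.values())
```

For example, with four passes where pass 0 co-views pass 1 and pass 1 co-views pass 2, the groups are `[[0, 1, 2], [3]]`.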
With reference to the first aspect and the first implementation manner of the first aspect, in a second implementation manner of the first aspect, selecting one pass of collected data from the different passes in the same group, and aligning the other passes in the group based on the initial pose data of the selected pass to obtain first optimized pose data of the key frame images in the group, includes:
selecting a first pass from the different passes of collected data in the same group;
obtaining first inter-frame relative pose information for each pair of co-view key frame images through Perspective-n-Point (PnP), based on the initial pose data of each pair of co-view key frame images sharing a co-view region between the second-pass and the first-pass collected data in the same group;
statistically aggregating the first inter-frame relative pose information of each pair of co-view key frame images between the second-pass and the first-pass collected data, to obtain first weighted inter-frame relative pose information between the second-pass and the first-pass collected data;
aligning the second-pass collected data to the initial pose data of the first-pass collected data based on the first weighted inter-frame relative pose information, to obtain first optimized pose data of the key frame images in the second-pass collected data;
obtaining second inter-frame relative pose information for each pair of co-view key frame images through PnP, based on the initial pose data of the key frame images of the third-pass collected data that have a co-view region and the first optimized pose data of the key frame images of the already-aligned collected data;
statistically aggregating the second inter-frame relative pose information of each pair of co-view key frame images between the third-pass collected data and the aligned data, to obtain second weighted inter-frame relative pose information between them;
and aligning the third-pass collected data to the first optimized pose data of the aligned data based on the second weighted inter-frame relative pose information, to obtain first optimized pose data of the key frame images in the third-pass collected data.
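The alignment steps above can be pictured as follows: the per-pair relative transforms obtained from PnP are combined into one weighted rigid transform, which is then applied to every keyframe pose of the pass being aligned. A toy sketch with 4x4 homogeneous matrices (the averaging scheme — a chordal rotation mean re-projected onto SO(3) via SVD plus a weighted mean translation — is an illustrative choice; the patent does not specify one):

```python
import numpy as np

def average_relative_transform(transforms, weights):
    """Combine per-pair relative transforms (4x4 SE(3) matrices, e.g.
    from PnP) into one weighted estimate: chordal mean of the rotations,
    re-orthonormalized with SVD, plus a weighted mean translation."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    R_sum = sum(wi * T[:3, :3] for wi, T in zip(w, transforms))
    U, _, Vt = np.linalg.svd(R_sum)
    R = U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vt)]) @ Vt  # back onto SO(3)
    T_avg = np.eye(4)
    T_avg[:3, :3] = R
    T_avg[:3, 3] = sum(wi * T[:3, 3] for wi, T in zip(w, transforms))
    return T_avg

def align_pass(poses, T_align):
    """Rigidly map every keyframe pose (4x4) of one pass into the
    reference pass's frame."""
    return [T_align @ P for P in poses]
```

Because the whole pass is moved by a single rigid transform, this step improves absolute consistency between passes without distorting the relative poses within a pass.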
With reference to the first aspect and the foregoing implementation manners of the first aspect, in a third implementation manner of the first aspect, acquiring collected data of two or more passes over the same road includes:
acquiring image data and motion data of two or more passes over the same road;
obtaining, from the image data and motion data of each pass, the initial pose data corresponding to the key frame images of that pass through a visual-inertial odometry (VIO) algorithm, where the feature points of the key frame images of each pass are obtained during the computation of the VIO algorithm.
With reference to the first aspect and the foregoing implementation manners of the first aspect, in a fourth implementation manner of the first aspect, performing pose graph optimization on the first optimized pose data of the key frame images to obtain second optimized pose data of the key frame images includes:
performing pose graph optimization on the first optimized pose data of the key frame images by constructing pose prior constraints, inter-frame pose constraints between key frames within a single pass of collected data, and inter-frame relative pose constraints between key frame images sharing co-view regions across two passes, to obtain the second optimized pose data of the key frame images.
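The three constraint types named above (pose prior, intra-pass inter-frame, and cross-pass co-view) can be illustrated on a toy one-dimensional pose graph, where each constraint contributes a weighted row to a linear least-squares problem. All weights and numbers are illustrative; a real pose graph works on SE(3) poses and is solved iteratively:

```python
import numpy as np

def solve_pose_graph(n, prior, odometry, loops,
                     w_prior=100.0, w_odo=1.0, w_loop=10.0):
    """Toy 1-D pose graph: unknowns are n scalar keyframe positions.
    Rows encode (a) a prior on pose 0, (b) sequential intra-pass
    odometry constraints x[i+1] - x[i] = d, and (c) cross-pass co-view
    ("loop") constraints x[j] - x[i] = d, each weighted by confidence."""
    def basis(i):
        return np.eye(n)[i]

    rows, rhs = [], []
    rows.append(w_prior * basis(0)); rhs.append(w_prior * prior)
    for i, d in enumerate(odometry):
        rows.append(w_odo * (basis(i + 1) - basis(i))); rhs.append(w_odo * d)
    for i, j, d in loops:
        rows.append(w_loop * (basis(j) - basis(i))); rhs.append(w_loop * d)
    x, *_ = np.linalg.lstsq(np.vstack(rows), np.array(rhs), rcond=None)
    return x
```

When the constraints are mutually consistent, the solver recovers the exact trajectory; when they conflict (e.g. drifted odometry against a co-view constraint), the weights decide how the error is distributed.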
With reference to the first aspect and the foregoing implementation manners of the first aspect, in a fifth implementation manner of the first aspect, performing global bundle adjustment (BA) optimization based on the feature points of the key frame images and the second optimized pose data of the key frame images, to obtain three-dimensional feature points and third optimized pose data of the key frame images, includes:
performing global BA optimization on the feature points of the key frame images and the second optimized pose data by building a factor graph of 3D feature point reprojection errors, inter-frame pose constraints between key frames within a single pass of collected data, and pose prior constraints, to obtain the three-dimensional feature points of the key frame images and the third optimized pose data.
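The 3D feature-point reprojection error that anchors the BA factor graph is the pixel-space difference between a projected 3D point and its observed feature location. A minimal pinhole-camera sketch (the intrinsics matrix K and the world-to-camera convention are illustrative assumptions):

```python
import numpy as np

def reprojection_error(K, T_cw, X_w, uv):
    """BA residual: transform world point X_w into the camera frame via
    T_cw (4x4, world -> camera), project through intrinsics K, and
    subtract the observed pixel location uv."""
    X_c = T_cw[:3, :3] @ X_w + T_cw[:3, 3]   # point in camera frame
    u_hom = K @ (X_c / X_c[2])               # perspective division, pixels
    return u_hom[:2] - uv
```

Global BA stacks this residual over every (keyframe, 3D point) observation and jointly refines the camera poses and 3D points by nonlinear least squares, which is what yields the third optimized pose data together with the three-dimensional feature points.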
In a second aspect, in an embodiment of the present disclosure, a pose optimization apparatus is provided.
Specifically, the pose optimization device includes:
an acquisition module configured to acquire collected data of two or more passes over the same road, one pass of collected data including: feature points of key frame images and initial pose data corresponding to the key frame images;
a matching module configured to perform, for each single pass of collected data, feature point matching between key frame images in that pass and key frame images in other passes, to obtain matching results between key frame images of different passes, where the matching results include whether a co-view region exists between two key frame images;
a grouping module configured to group the collected data of different passes based on the co-view regions of key frame images across passes;
an alignment module configured to select, for the different passes of collected data in the same group, one pass, and to align the other passes in the group based on the initial pose data of the selected pass, to obtain first optimized pose data of the key frame images in the group;
a pose graph optimization module configured to perform pose graph optimization on the first optimized pose data of the key frame images to obtain second optimized pose data, the pose graph optimization including at least inter-frame relative pose constraints between key frame images sharing co-view regions across two passes;
and a global optimization module configured to perform global bundle adjustment (BA) optimization based on the feature points of the key frame images and the second optimized pose data, to obtain three-dimensional feature points of the key frame images and third optimized pose data, the precision of the pose data increasing step by step.
With reference to the second aspect, in a first implementation manner of the second aspect, the grouping module is configured to:
divide the passes of collected data that satisfy a grouping condition into one group, where the grouping condition is that a key frame image of each pass in the group shares a co-view region with a key frame image of at least one other pass in the group.
With reference to the second aspect and the first implementation manner of the second aspect, in a second implementation manner of the second aspect, the alignment module is configured to:
select a first pass from the different passes of collected data in the same group;
obtain first inter-frame relative pose information for each pair of co-view key frame images through Perspective-n-Point (PnP), based on the initial pose data of each pair of co-view key frame images sharing a co-view region between the second-pass and the first-pass collected data in the same group;
statistically aggregate the first inter-frame relative pose information of each pair of co-view key frame images between the second-pass and the first-pass collected data, to obtain first weighted inter-frame relative pose information between the second-pass and the first-pass collected data;
align the second-pass collected data to the initial pose data of the first-pass collected data based on the first weighted inter-frame relative pose information, to obtain first optimized pose data of the key frame images in the second-pass collected data;
obtain second inter-frame relative pose information for each pair of co-view key frame images through PnP, based on the initial pose data of the key frame images of the third-pass collected data that have a co-view region and the first optimized pose data of the key frame images of the already-aligned collected data;
statistically aggregate the second inter-frame relative pose information of each pair of co-view key frame images between the third-pass collected data and the aligned data, to obtain second weighted inter-frame relative pose information between them;
and align the third-pass collected data to the first optimized pose data of the aligned data based on the second weighted inter-frame relative pose information, to obtain first optimized pose data of the key frame images in the third-pass collected data.
With reference to the second aspect and the foregoing implementation manners of the second aspect, in a third implementation manner of the second aspect, the acquisition module is configured to:
acquire image data and motion data of two or more passes over the same road;
obtain, from the image data and motion data of each pass, the initial pose data corresponding to the key frame images of that pass through a visual-inertial odometry (VIO) algorithm, where the feature points of the key frame images of each pass are obtained during the computation of the VIO algorithm.
With reference to the second aspect and the foregoing implementation manners of the second aspect, in a fourth implementation manner of the second aspect, the pose graph optimization module is configured to:
perform pose graph optimization on the first optimized pose data of the key frame images by constructing pose prior constraints, inter-frame pose constraints between key frames within a single pass of collected data, and inter-frame relative pose constraints between key frame images sharing co-view regions across two passes, to obtain the second optimized pose data of the key frame images.
With reference to the second aspect and the foregoing implementation manners of the second aspect, in a fifth implementation manner of the second aspect, the global optimization module is configured to:
perform global bundle adjustment (BA) optimization on the feature points of the key frame images and the second optimized pose data by building a factor graph of 3D feature point reprojection errors, inter-frame pose constraints between key frames within a single pass of collected data, and pose prior constraints, to obtain the three-dimensional feature points of the key frame images and the third optimized pose data.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including a memory and a processor, where the memory is used to store one or more computer instructions that support the pose optimization apparatus in performing the pose optimization method described above, and the processor is configured to execute the computer instructions stored in the memory. The electronic device may further include a communication interface for communicating with other devices or a communication network.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable storage medium storing computer instructions for use by the pose optimization apparatus, including the computer instructions involved in performing the pose optimization method described above.
In a fifth aspect, embodiments of the present disclosure provide a computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the pose optimization method described above.
In a sixth aspect, an embodiment of the present disclosure provides a navigation method: a navigation route, calculated based at least on a start point, an end point, and road conditions, is obtained from a high-precision map, and navigation guidance is performed along that route, where the high-precision map is reconstructed from the three-dimensional feature points and the third optimized pose data obtained by any of the methods above.
The technical scheme provided by the embodiment of the disclosure can comprise the following beneficial effects:
according to the technical scheme, based on the inter-frame relative pose relation between key frame images among the acquired data of each pass and the inter-frame relative pose relation between key image frames of the acquired data of a single pass, the three times of pose optimization including pose alignment, pose map optimization and global BA optimization are carried out on the initial pose data of each pass, so that the precision of the pose data is gradually increased. According to the technical scheme, the absolute precision and the relative precision of the whole pose can be improved, pose data with higher precision can be recovered, a good foundation is provided for reconstructing high-precision map elements, and therefore a high-precision map with higher precision can be reconstructed.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of embodiments of the disclosure.
Drawings
Other features, objects, and advantages of the embodiments of the present disclosure will become more apparent from the following detailed description of non-limiting embodiments, taken in conjunction with the accompanying drawings. In the drawings:
Fig. 1 shows a flowchart of a pose optimization method according to an embodiment of the present disclosure.
FIG. 2 illustrates a schematic diagram of a process for rigid alignment of two-pass poses according to an embodiment of the present disclosure.
Fig. 3 shows a schematic diagram of 3D feature point re-projection according to an embodiment of the present disclosure.
Fig. 4 shows a block diagram of a pose optimization device according to an embodiment of the present disclosure.
Fig. 5 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
FIG. 6 is a schematic diagram of a computer system suitable for use in implementing a pose optimization method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, exemplary implementations of the embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily implement them. In addition, for the sake of clarity, portions irrelevant to description of the exemplary embodiments are omitted in the drawings.
In the embodiments of the present disclosure, it is to be understood that terms such as "comprises" or "comprising" indicate the presence of the features, numbers, steps, acts, components, parts, or combinations thereof disclosed in this specification, and do not exclude the possibility that one or more other features, numbers, steps, acts, components, parts, or combinations thereof are present or added.
In addition, it should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other. Embodiments of the present disclosure will be described in detail below with reference to the attached drawings in conjunction with the embodiments.
According to the technical scheme provided by the embodiments of the present disclosure, three rounds of pose optimization, namely pose alignment, pose graph optimization, and global BA optimization, are performed on the initial pose data of each pass, based on the inter-frame relative pose relationships between key frame images across passes and between key frames within a single pass, so that the precision of the pose data increases step by step. The scheme improves both the absolute and the relative precision of the overall pose and recovers higher-precision pose data, providing a sound basis for reconstructing high-precision map elements and hence a more accurate high-precision map.
Fig. 1 shows a flowchart of a pose optimization method according to an embodiment of the present disclosure, as shown in fig. 1, including the following steps S101 to S106:
in step S101, collected data of two or more passes over the same road is acquired, one pass of collected data including: feature points of key frame images and initial pose data corresponding to the key frame images;
in step S102, for each single pass of collected data, feature point matching is performed between key frame images in that pass and key frame images in other passes, to obtain matching results between key frame images of different passes, where the matching results include whether a co-view region exists between two key frame images;
in step S103, the collected data of different passes is grouped based on the co-view regions of key frame images across passes;
in step S104, for the different passes of collected data in the same group, one pass is selected, and the other passes in the group are aligned based on the initial pose data of the selected pass, to obtain first optimized pose data of the key frame images in the group;
in step S105, pose graph optimization is performed on the first optimized pose data of the key frame images to obtain second optimized pose data, where the pose graph optimization includes at least inter-frame relative pose constraints between key frame images sharing co-view regions across two passes;
in step S106, global bundle adjustment (BA) optimization is performed based on the feature points of the key frame images and the second optimized pose data, to obtain three-dimensional feature points of the key frame images and third optimized pose data, the precision of the pose data increasing step by step.
As mentioned above, research on autonomous or intelligent driving has become a hotspot and trend in recent years, and one implementation path for autonomous driving depends on high-precision maps (also called high-definition or HD maps). A high-precision map provides beyond-line-of-sight perception for autonomous driving and is an important basis for driving decisions. However, because the real world changes dynamically, a high-precision map, as its digital representation, must be continuously updated so that its geographic elements and attributes stay consistent with reality. The inventors found that professional high-precision map collection vehicles are too expensive to deploy at scale, making timely and efficient map updates difficult to achieve. To meet the updating requirements of high-precision maps, low-cost mapping has become a focus of industrial research, and accurate pose data is the key to extracting high-precision map elements from low-cost mapping. How to improve the accuracy of pose data is therefore a problem that those skilled in the art continuously need to solve and optimize.
In view of the above, this embodiment proposes a pose optimization method that performs three rounds of pose optimization, namely pose alignment, pose graph optimization, and global BA optimization, on the initial pose data of each pass, based on the inter-frame relative pose relationships between key frame images across passes and between key frames within a single pass, so that the precision of the pose data increases step by step. The scheme improves both the absolute and the relative precision of the overall pose and recovers higher-precision pose data, providing a sound basis for reconstructing high-precision map elements and hence a more accurate high-precision map.
In an embodiment of the present disclosure, the pose optimization method may be applied to any computer, computing device, electronic device, server, or service cluster capable of performing pose optimization.
In an embodiment of the disclosure, when a high-precision map of a road is constructed, data must be collected over two or more passes of the same road and the collected data optimized to obtain more accurate feature points of the key frame images of each pass and the pose data corresponding to those images; the spatial three-dimensional information of the road's environment can then be recovered using Structure from Motion (SfM) to construct the high-precision map.
In one embodiment of the present disclosure, key frame images are selected from a time-ordered sequence of collected image data. A typical selection rule is that consecutive key frames must be neither too close together nor too far apart, i.e., key frames are selected within a certain spacing range.
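A minimal distance-gated sketch of that selection rule (the 0.5 m threshold and the use of travelled distance, rather than time, are illustrative assumptions):

```python
import numpy as np

def select_keyframes(positions, min_gap=0.5):
    """Keep a frame as a key frame only once the device has moved at
    least min_gap from the previous key frame, so consecutive key
    frames are never too close together."""
    keys = [0]                         # always keep the first frame
    for i in range(1, len(positions)):
        step = np.linalg.norm(np.asarray(positions[i]) -
                              np.asarray(positions[keys[-1]]))
        if step >= min_gap:
            keys.append(i)
    return keys
```

The "not too far apart" half of the rule can be enforced symmetrically by forcing a key frame whenever the gap would exceed a maximum spacing.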
In one embodiment of the present disclosure, the feature points of a key frame image are used to identify target objects in the image; typically, points where the gray value of the image changes drastically, or points of high curvature on image edges (such as the intersection of two edges), are taken as feature points. A feature point comprises two parts: a key point (Key-point), which expresses the three-dimensional position of the feature point, and a descriptor (Descriptor), which describes the visual characteristics of the feature point and is usually in vector form.
In an embodiment of the present disclosure, pose data is the position and orientation information of the acquisition device that captured the key frame image. In three dimensions it is typically written (x, y, z, yaw, pitch, roll), where the first three elements describe the three-dimensional position and the last three describe the orientation: yaw is the heading angle (rotation about the Z axis), pitch is the pitch angle (rotation about the Y axis), and roll is the roll angle (rotation about the X axis). Camera calibration is performed according to the key frame images in each pass of acquired data (camera calibration here refers to recovering objects in space from the key frame images shot by the camera); from the calibration result and the motion data of the acquisition device, the pose of the device when each key frame image was shot, i.e. the initial pose data corresponding to that key frame image, can be determined.
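The (x, y, z, yaw, pitch, roll) representation above can be turned into a homogeneous pose matrix. This is a minimal sketch assuming the common Z-Y-X (yaw-pitch-roll) composition order; the patent does not fix a convention.

```python
import numpy as np

def pose_to_matrix(x, y, z, yaw, pitch, roll):
    """Build a 4x4 pose matrix from (x, y, z, yaw, pitch, roll).

    Yaw rotates about Z, pitch about Y, roll about X, composed as
    R = Rz(yaw) @ Ry(pitch) @ Rx(roll) -- one common convention."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx   # orientation part
    T[:3, 3] = [x, y, z]       # position part
    return T
```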
In an embodiment of the present disclosure, the feature points of the key frame images of each pass of acquired data are used to identify target objects in those images. Feature point matching means computing the similarity of the descriptors of two feature points (i.e. the distance between the two descriptors in vector space); when the similarity exceeds a preset threshold, for example 99%, the two feature points match and can be recorded as a common feature point. If the number of common feature points between two key frame images exceeds a preset number, the matching result is that the two images have a common-view area; if it does not, the matching result is that they do not. A common-view area means that the two key frame images display a common target object (possibly from different viewing angles); when two key frame images share a common-view area, the acquisition devices of the two passes photographed the same target object from different poses.
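The matching and common-view test described above can be sketched as follows. This is a simplified illustration: cosine similarity, the 0.99 threshold, and the minimum-common-point count are assumptions here, and real systems usually add ratio tests and geometric verification.

```python
import numpy as np

def match_features(desc_a, desc_b, sim_threshold=0.99):
    """Match descriptors between two key frames: for each descriptor in
    desc_a, take the most similar one in desc_b and keep the pair when the
    cosine similarity exceeds the preset threshold."""
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    sim = a @ b.T                       # pairwise cosine similarity
    matches = []
    for i in range(sim.shape[0]):
        j = int(np.argmax(sim[i]))
        if sim[i, j] >= sim_threshold:
            matches.append((i, j))      # common feature point (i in A, j in B)
    return matches

def has_co_view(matches, min_common=8):
    """Two key frames share a common-view area when the number of common
    feature points exceeds a preset number."""
    return len(matches) >= min_common
```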
In an embodiment of the present disclosure, for any single pass of acquired data, feature point matching may be performed between its key frame images and the key frame images of the other passes, yielding a matching result for any two key frame images between any two passes. Preferably, since two key frame images whose acquisition positions are far apart cannot share a common-view area, the computation can be reduced: for each key frame image of the single pass, feature point matching is performed only against those key frame images of other passes whose acquisition positions lie within a certain distance, producing matching results between that key frame image and the correspondingly screened key frame images.
In an embodiment of the present disclosure, based on the co-view areas of the key frame images in the different passes of the acquired data, the passes of the acquired data satisfying a grouping condition may be grouped, wherein the grouping condition includes that each pass of the acquired data in the same group has a key frame image with a co-view area with the key frame images in other passes of the acquired data.
In an embodiment of the present disclosure, the first pose optimization of the initial pose data corresponding to the key frame images rigidly aligns the other passes of acquired data in a group to the initial pose data of one selected pass, where rigid alignment means that the initial pose data of every key frame image of a pass is transformed by the same displacement and rotation. For example, as shown in fig. 2, the same group contains two passes of acquired data, S1 and S2, whose initial pose trajectories are offset from each other; the positions and orientations in the initial pose data of S2 are moved by the same transformation parameters so that they align rigidly with the trajectory formed by the pose data of S1, yielding the first optimized pose data of S2, while the first optimized pose data of S1 is simply its initial pose data. Aligned S1 and S2 are thus obtained.
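Rigid alignment of one trajectory onto another can be sketched with the Kabsch procedure over matched positions. This is an illustrative alternative formulation: the patent derives the transform from PnP-based relative poses between co-view key frames, whereas the sketch below estimates it directly from corresponding trajectory points.

```python
import numpy as np

def rigid_align(traj_src, traj_ref):
    """Estimate the single rigid transform (R, t) that best maps the
    source trajectory onto the reference (Kabsch on matched positions),
    then apply that same transform to every pose position."""
    mu_s = traj_src.mean(axis=0)
    mu_r = traj_ref.mean(axis=0)
    H = (traj_src - mu_s).T @ (traj_ref - mu_r)   # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_r - R @ mu_s
    aligned = traj_src @ R.T + t                  # same (R, t) for all frames
    return aligned, R, t
```

Because every key frame receives the identical (R, t), the relative poses inside the aligned pass are preserved, which is exactly the "rigid" property the embodiment relies on.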
In an embodiment of the present disclosure, if two key frame images have a common-view area, the same object is displayed in both images, only at different display angles, so the relative pose of the two images can be derived accordingly. The relative pose between two passes of acquired data in the same group can therefore be obtained statistically from their key frame images that share common-view areas, and the initial pose data of the two passes can be rigidly aligned according to that relative pose, yielding the first optimized pose data of the key frame images of those two passes. Then, because key frame images of one or more further passes share common-view areas with one of the already-aligned passes, the relative pose between each such pass and an aligned pass can likewise be obtained statistically, and that pass is rigidly aligned to the aligned pose data accordingly. After all the different passes in the group have been rigidly aligned in this way, the first optimized pose data of all key frame images in the group is obtained.
Because the acquisition times are discontinuous and the acquisition devices differ, the pose trajectory in each pass of acquired data carries a certain offset. Rigidly aligning the initial pose data of each pass provides a good initial value for the subsequent pose graph optimization and avoids falling into a local optimum.
In one embodiment of the disclosure, pose graph optimization uses reliable short-interval relative pose measurements to construct a global optimization problem covering key frame images over a long time span, thereby spreading out accumulated error. The pose data corresponding to each key frame image is a node, and the relative pose computed between key frame images is a measurement, over which maximum likelihood estimation is performed. The optimization variables are the poses only: a graph containing only the pose trajectory is constructed, and the edges between pose nodes are given by the motion estimates obtained from feature matching between pairs of key frame images. Once these initial values are set, the feature point positions are no longer optimized; only the relations between the acquisition device poses are of concern. The second pose optimization is applied to the first optimized pose data of the key frame images of each pass to obtain the second optimized pose data, which restores the absolute precision of each pass's pose data while guaranteeing its relative precision, based on the inter-frame relative pose constraints within single-pass acquired data and between key frame images with common-view areas across any two passes. The precision of the second optimized pose data is greater than that of the first optimized pose data.
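The idea of pose graph optimization — nodes are poses, edges are relative measurements, and accumulated error is spread over the graph — can be illustrated with a toy one-dimensional analogue solved as linear least squares. Real pose graphs live on SE(3) and need a nonlinear solver; everything below (scalar poses, the prior weight) is an assumption made purely for illustration.

```python
import numpy as np

def optimize_pose_graph_1d(num_nodes, edges, prior=0.0, prior_weight=100.0):
    """Linear least-squares pose graph on scalar positions.

    edges: list of (i, j, m) meaning "measured x_j - x_i = m".
    A strong prior on node 0 removes the gauge freedom; the relative
    measurements then spread any accumulated error over all nodes."""
    rows, rhs = [], []
    r0 = np.zeros(num_nodes)
    r0[0] = np.sqrt(prior_weight)            # prior residual sqrt(w)*(x0 - prior)
    rows.append(r0)
    rhs.append(np.sqrt(prior_weight) * prior)
    for i, j, m in edges:                    # relative-pose residual x_j - x_i - m
        r = np.zeros(num_nodes)
        r[i], r[j] = -1.0, 1.0
        rows.append(r)
        rhs.append(m)
    x, *_ = np.linalg.lstsq(np.vstack(rows), np.array(rhs), rcond=None)
    return x
```

When the measurements are mutually consistent the solver reproduces them exactly; when they conflict (loop-closure style), the error is distributed across the trajectory instead of accumulating at the end.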
In an embodiment of the disclosure, BA (Bundle Adjustment) optimization optimizes not only the poses but also the feature points, and includes minimizing the reprojection error: each three-dimensional feature point is projected into the image plane according to its three-dimensional position coordinates and the pose data, and the error between the projection and the point extracted from the image is computed; BA optimization minimizes this error. BA adjusts the second optimized pose data and the feature points to minimize the error between the observed and predicted image positions, yielding the three-dimensional feature points of the key frame images and the third optimized pose data, whose precision is greater than that of the second optimized pose data.
In the above embodiment, acquired data for two or more passes over the same road is first obtained, one pass of acquired data comprising the feature points of key frame images and the initial pose data corresponding to those key frame images. For each single pass, feature point matching is performed between its key frame images and the key frame images of the other passes, producing matching results between key frame images of different passes, where a matching result states whether two key frame images share a common-view area. The different passes of acquired data are grouped based on the common-view areas of their key frame images. Then, for the different passes in the same group, one pass is selected, and the other passes in the group are aligned based on the initial pose data of the selected pass, yielding the first optimized pose data of the key frame images in the group. The aligned first optimized pose data provides a good initial value for the pose graph optimization; performing pose graph optimization on the first optimized pose data of each pass restores the absolute precision of the pose data while guaranteeing its relative precision, thereby providing a better initial value for the subsequent BA optimization. Finally, global BA optimization is performed on the feature points of the key frame images and the second optimized pose data, yielding high-precision three-dimensional feature points and the third optimized pose data, which provide a good basis for subsequently constructing high-precision maps and reconstructing a map with higher precision. In addition, this embodiment uses only the image feature points during pose optimization, so the data used is simpler, and higher-precision pose data can be recovered without too many passes of acquired data (in an example, two passes suffice).
In an embodiment of the present disclosure, step S103, that is, the step of grouping the different passes of acquired data based on the co-view regions of the keyframe images in the different passes of acquired data, may include the steps of:
dividing the passes of acquired data that meet a grouping condition among the different passes into a group, where the grouping condition requires that each pass of acquired data in the same group has a key frame image sharing a common-view area with key frame images of other passes of acquired data in that group.
In this embodiment, each pass of acquired data in a group necessarily shares a common-view area with the key frame images of one or more other passes in that group, while no key frame image of any pass in one group shares a common-view area with the key frame images of the passes in another group.
In this embodiment, assume 6 passes of acquired data S1, S2, S3, S4, S5, and S6 are acquired in total, where key frame images between S1 and S2 share a common-view area; key frame images between S3 and S2 share a common-view area, but none between S3 and S1 do; key frame images between S4 and S3 share a common-view area, but none between S4 and S1 or S2 do; and key frame images between S5 and S6 share a common-view area, but no key frame image of S5 or S6 shares a common-view area with any of S1 to S4. Then S1, S2, S3, and S4 can be classified into a first group, and S5 and S6 into a second group. Each pass in the first group necessarily has one or more key frame images sharing a common-view area with key frame images of one or more other passes: for example, between a key frame image of S1 and one of S2, between a key frame image of S2 and one of S1 or S3, between a key frame image of S3 and one of S2 or S4, and between a key frame image of S4 and one of S3.
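The grouping condition above amounts to finding connected components over a "shares a common-view area" relation. A minimal union-find sketch (indices 0–5 standing in for S1–S6) could look like this; the function name and interface are illustrative.

```python
def group_passes(num_passes, co_view_pairs):
    """Group passes into connected components: two passes land in the same
    group when a chain of common-view relations links them."""
    parent = list(range(num_passes))

    def find(i):                      # path-halving find
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in co_view_pairs:        # union every co-view pair
        parent[find(i)] = find(j)

    groups = {}
    for i in range(num_passes):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

With the S1–S6 example, pairs (S1,S2), (S2,S3), (S3,S4), and (S5,S6) yield exactly the two groups described in the text.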
In an embodiment of the present disclosure, step S104, that is, for different passes of acquired data in the same group, selecting one pass of acquired data and aligning the other passes in the group based on the initial pose data of the selected pass to obtain the first optimized pose data of the key frame images in the group, may include the following steps:
in step A1, for the different passes of acquired data in the same group, the first pass of acquired data is selected;

in step A2, based on the initial pose data of each pair of co-view key frame images sharing a common-view area between the second-pass and the first-pass acquired data in the same group, the first inter-frame relative pose information of each pair is obtained through the Perspective-n-Point (PnP) algorithm;

in step A3, the first inter-frame relative pose information of the pairs of co-view key frame images between the second-pass and first-pass acquired data is aggregated statistically to obtain the first inter-frame weighted relative pose information between the two passes;

in step A4, the second-pass acquired data is aligned to the initial pose data of the first-pass acquired data based on the first inter-frame weighted relative pose information, obtaining the first optimized pose data of the key frame images of the second-pass acquired data;

in step A5, based on the initial pose data of the co-view key frame images of the third-pass acquired data and the first optimized pose data of the corresponding key frame images of the already-aligned acquired data, the second inter-frame relative pose information of each pair of co-view key frame images is obtained through PnP;

in step A6, the second inter-frame relative pose information of the pairs of co-view key frame images between the third-pass acquired data and the aligned acquired data is aggregated statistically to obtain the second inter-frame weighted relative pose information between them;

in step A7, the third-pass acquired data is aligned to the first optimized pose data of the aligned acquired data based on the second inter-frame weighted relative pose information, obtaining the first optimized pose data of the key frame images of the third-pass acquired data.
In this embodiment, one pass of acquired data may be selected at random from the different passes in the same group as the first pass. The second pass is any pass in the group sharing a common-view area with the first pass. The first inter-frame relative pose information of each pair of co-view key frame images is obtained, through the PnP algorithm, from the initial pose data of each pair sharing a common-view area between the second and first passes; the first inter-frame weighted relative pose information between the two passes is then obtained statistically, for example by taking a weighted average of the per-pair relative pose information. The initial pose data of the second pass is transformed according to the first inter-frame weighted relative pose information and aligned to the initial pose data of the first pass, yielding the first optimized pose data of the key frame images of the second pass; the first optimized pose data of the key frame images of the first pass is simply its initial pose data. At this point, the aligned acquired data comprises the first and second passes.
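The statistical aggregation step (weighted averaging of per-pair relative pose information into a single pass-to-pass estimate) can be sketched for the translation part as follows. This is only illustrative: a full relative pose also has a rotation component, which would need rotation averaging rather than a plain arithmetic mean, and the weighting scheme is an assumption.

```python
import numpy as np

def weighted_relative_translation(rel_translations, weights=None):
    """Aggregate the per-pair relative translations between two passes into
    one weighted inter-frame relative translation estimate.

    rel_translations: (N, 3) translations, one per co-view key frame pair;
    weights: optional per-pair confidences (e.g. number of inliers)."""
    rel = np.asarray(rel_translations, dtype=float)
    if weights is None:
        weights = np.ones(len(rel))
    w = np.asarray(weights, dtype=float)
    return (rel * w[:, None]).sum(axis=0) / w.sum()
```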
In this embodiment, the third pass is any pass in the same group sharing a common-view area with any already-aligned pass. Based on the initial pose data of the co-view key frame images of the third pass and the first optimized pose data of the corresponding key frame images of the aligned data, the second inter-frame relative pose information of each co-view pair is obtained through PnP; weighted statistics over this information yield the second inter-frame weighted relative pose information between the third pass and the aligned data; the initial pose data of the third pass is then transformed accordingly and aligned to the first optimized pose data of the aligned data, yielding the first optimized pose data of the key frame images of the third pass. The aligned acquired data now comprises the first, second, and third passes, and steps A5 to A7 are repeated for the remaining passes in the group until every pass in the group is aligned.
Here, from the common feature points and pose data of each pair of co-view key frames, the inter-frame relative pose information of that pair can be estimated with the PnP algorithm.
By way of example, assume 4 passes of acquired data S1, S2, S3, and S4 are acquired, where key frame images between S1 and S2 share a common-view area; key frame images between S3 and S2 share a common-view area, but none between S3 and S1 do; and key frame images between S4 and S3 share a common-view area, but none between S4 and S1 or S2 do. Then S1 to S4 may be divided into the same group. S2 may be selected at random from the group as the first-pass acquired data; the second pass can be either of S1 and S3, which share common-view areas with S2. Based on the initial pose data of each pair of co-view key frame images between S1 and S2, the first inter-frame relative pose information of each pair is obtained through the PnP algorithm, and the first inter-frame weighted relative pose information between S1 and S2 is obtained statistically. The initial pose data of S1 is transformed accordingly and aligned to the initial pose data of S2, yielding the first optimized pose data of the key frame images of S1; the first optimized pose data of the key frame images of S2 is its initial pose data. The aligned acquired data is now S1 and S2. The third pass is S3, the pass in the group sharing a common-view area with S2 among the aligned S1 and S2.
Based on the initial pose data of the co-view key frame images of S3 and the first optimized pose data of the key frame images of S2, the second inter-frame relative pose information of each co-view pair between S2 and S3 is obtained through the PnP algorithm; weighted statistics yield the second inter-frame weighted relative pose information between S2 and S3; the initial pose data of S3 is transformed accordingly and aligned to the first optimized pose data of S2, yielding the first optimized pose data of the key frame images of S3. The aligned acquired data is now S1, S2, and S3. Steps A5 to A7 are repeated for the remaining pass S4 as follows: S4 and S3 share a common-view area, so the second inter-frame relative pose information of each co-view pair between S4 and S3 is obtained through the PnP algorithm from the initial pose data of the co-view key frame images of S4 and the first optimized pose data of the key frame images of the aligned pass S3; weighted statistics yield the second inter-frame weighted relative pose information between S4 and S3; the initial pose data of S4 is transformed accordingly and aligned to the first optimized pose data of S3, yielding the first optimized pose data of the key frame images of S4. The acquired data S1 to S4 in the group are thereby all aligned.
It should be noted that each pass of acquired data shares common-view areas with N other passes in the same group, where the value of N differs per pass, and the first-pass acquired data is the pass with the largest N. If every other pass in the group shares a common-view area with a certain pass, that pass can be selected as the first pass; then, following the method above, the inter-frame relative pose information of every pair of co-view key frame images between each other pass and the first pass is obtained through the PnP algorithm, aggregated statistically into the inter-frame weighted relative pose information between that pass and the first pass, and finally each other pass is aligned to the initial pose data of the first pass based on its inter-frame weighted relative pose information, yielding the first optimized pose data of the key frame images of the other passes.
In an embodiment of the disclosure, the step S101, that is, the step of acquiring the acquired data of more than two passes for the same road, may include the following steps:
acquiring image data and motion data of more than two passes for the same road;
according to the image data and motion data of each pass, obtaining the initial pose data corresponding to the key frame images of each pass through a visual-inertial odometry (VIO) algorithm, the feature points of the key frame images of each pass being obtained in the course of the VIO computation.
In this embodiment, the image data and motion data of the acquisition device during motion are acquired by sensors installed on the device, e.g. a camera, an IMU (Inertial Measurement Unit), and an RTK (Real-Time Kinematic) receiver. The image data is the sequence of images acquired by the camera during each acquisition; the motion data includes the RTK data of the camera during each acquisition, or the RTK data plus IMU data, etc. An IMU is a sensor used primarily to detect and measure acceleration and rotational motion; by integrating multiple inertial sensors it produces acceleration and angular velocity measurements along multiple axes or degrees of freedom, and the IMU data is the acceleration and angular velocity of the acquisition device measured in real time. RTK is a real-time differential GPS technique based on carrier-phase observation that yields three-dimensional pose data with centimeter-level positioning precision in real time; the RTK data is the real-time three-dimensional pose data of the acquisition device.
In this embodiment, to realize low-cost mapping, image data and motion data for two or more passes over the same road can be obtained from crowd-sourced material acquired by various devices such as vehicle-mounted cameras, cell phone cameras, digital cameras, or video cameras. Crowd-sourced data refers to open data that the public obtains by some means (e.g. shooting with a vehicle-mounted camera) and then provides to the public or to relevant institutions through the Internet; the public may upload such material voluntarily or provide it by participating in crowd-sourcing tasks issued by those institutions. The crowd-sourced data comprises the image data and motion data collected during each acquisition, where the acquired data of one pass refers to the feature points of the key frame images and the corresponding initial pose data obtained from the image data and motion data of one acquisition over a continuous time, each pass having a time sequence. When a crowd-sourced map is needed for a certain road, two or more passes of acquired data for that road can be obtained from the crowd-sourced data.
In this embodiment, the VIO (visual-inertial odometry) algorithm first extracts the feature points of the key frame images and then computes the initial pose data corresponding to each key frame image, so the feature points of each pass's key frame images are obtained as part of the VIO computation, without an additional feature extraction step, which reduces the computation.
In an embodiment of the present disclosure, the step S105, that is, the step of performing pose map optimization on the first optimized pose data of the key frame image to obtain second optimized pose data of the key frame image, may include the following steps:
and performing pose graph optimization on the first optimized pose data of the key frame images by constructing a pose prior constraint, inter-frame pose constraints between the key image frames of single-pass acquired data, and inter-frame relative pose constraints between key frame images with common-view areas between two passes of acquired data, so as to obtain the second optimized pose data of the key frame images.
In this embodiment, when pose graph optimization is applied to the first optimized pose data, the adjustment is performed under a pose prior constraint, inter-frame pose constraints between the key image frames of single-pass acquired data, and inter-frame relative pose constraints between key frame images with common-view areas between two passes. The pose prior constraint limits to a certain extent how far the first optimized pose data of a key frame image may be adjusted. The inter-frame pose constraint within a single pass means that when the first optimized pose data of a key frame image is adjusted, the relative pose between that image and the co-view key frame images of the same pass must satisfy a constraint condition and must not deviate too far from the relative pose computed from the initial pose data. The inter-frame relative pose constraint between two passes means that the relative pose between the key frame image and the co-view key frame images of other passes must likewise satisfy a constraint condition and must not deviate too far from the relative pose computed from the initial pose data. In this way, the absolute precision of the pose data of each pass's key frame images can be recovered while their relative precision is guaranteed.
In an embodiment of the disclosure, the step S106, that is, the step of performing global beam adjustment BA optimization based on the feature points of the key frame image and the second optimized pose data of the key frame image to obtain the three-dimensional feature points of the key frame image and the third optimized pose data, may include the following steps:
and performing global bundle adjustment (BA) optimization on the feature points of the key frame images and the second optimized pose data by establishing a factor graph of 3D feature point reprojection errors, inter-frame pose constraints between the key image frames of single-pass acquired data, and pose prior constraints, to obtain the three-dimensional feature points of the key frame images and the third optimized pose data.
In this embodiment, the 3D feature point reprojection error is obtained by projecting a feature point into the two-dimensional image plane according to its three-dimensional position coordinates and the pose data of the acquisition device, and then computing the distance between the projection and the point extracted from the two-dimensional image; BA optimization minimizes this distance. For example, fig. 3 shows a schematic diagram of 3D feature point reprojection according to an embodiment of the present disclosure: feature points P1, P2, and P3 are projected onto key frame images with poses T1, T2, and T3 respectively, each projected position is compared with the actual extracted position of P1, P2, and P3 on the key frame image, the distance between the two is computed, and the feature points and pose data are adjusted so that the distance is minimized.
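The reprojection error described above can be sketched for a pinhole camera as follows. This is a minimal sketch, not the patent's implementation; it assumes a world-to-camera pose (R, t), a 3x3 intrinsic matrix K, and no lens distortion.

```python
import numpy as np

def reprojection_error(points_3d, observed_2d, R, t, K):
    """Mean reprojection error of 3-D feature points on one key frame.

    points_3d: (N, 3) world coordinates; observed_2d: (N, 2) detected pixels;
    R, t: world-to-camera pose of the key frame; K: 3x3 intrinsics."""
    cam = points_3d @ R.T + t            # transform into the camera frame
    pix = cam @ K.T                      # pinhole projection (homogeneous)
    pix = pix[:, :2] / pix[:, 2:3]       # perspective divide -> pixel coords
    return float(np.linalg.norm(pix - observed_2d, axis=1).mean())
```

BA optimization then searches over the feature point positions and poses to drive this value down across all key frames jointly.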
In this embodiment, when the re-projection error is minimized, the frame pose constraint between key image frames of single-pass acquired data and the pose prior constraint are also taken into account, so that the feature points and the second optimized pose data are adjusted during global BA optimization to obtain the three-dimensional feature points and the third optimized pose data.
The following are device embodiments of the present disclosure that may be used to perform method embodiments of the present disclosure.
Fig. 4 shows a block diagram of a pose optimization device according to an embodiment of the present disclosure, which may be implemented as part or all of an electronic device by software, hardware, or a combination of both. As shown in fig. 4, the pose optimization device includes:
an acquisition module 401 configured to acquire acquisition data for more than two passes of the same road, one pass of acquisition data comprising: feature points of the key frame image and initial pose data corresponding to the key frame image;
the matching module 402 is configured to perform feature point matching on a key frame image in the single-pass acquired data and key frame images in other-pass acquired data for the single-pass acquired data, so as to obtain matching results of the key frame images in different-pass acquired data, wherein the matching results comprise matching results of whether a common-view area exists between the two key frame images;
A grouping module 403 configured to group the different passes of acquired data based on a co-view region of a keyframe image in the different passes of acquired data;
an alignment module 404 configured to, for different passes of acquired data in the same group, select one pass of acquired data from them and align the other passes of acquired data in the same group based on the initial pose data of the selected pass, so as to obtain first optimized pose data of the keyframe images in the same group;
the pose map optimization module 405 is configured to perform pose map optimization on the first optimized pose data of the key frame image to obtain second optimized pose data of the key frame image, where the pose map optimization at least includes inter-frame relative pose constraints between key frame images with common view areas between two passes of acquired data;
the global optimization module 406 is configured to perform global bundle adjustment (BA) optimization based on the feature points of the key frame images and the second optimized pose data of the key frame images, so as to obtain three-dimensional feature points of the key frame images and third optimized pose data, the precision of the pose data increasing at each optimization stage.
As mentioned above, in recent years, research into automated driving or intelligent driving technology has become a hotspot and trend, and one of the implementation paths of automated driving or intelligent driving depends on a high-precision map (simply referred to as a high-definition map or a high-standard map). The high-precision map can provide beyond-the-horizon perception for automatic driving and is also an important basis for automatic driving decision. However, since the real world is dynamically changed, the high-precision map is used as a digital representation of the real world, and needs to be continuously updated to ensure that the geographic elements and the attributes thereof in the high-precision map are consistent with the real world. The inventor finds that the cost of the professional high-precision map acquisition vehicle is too high to be deployed on a large scale, so that the professional high-precision map acquisition vehicle is difficult to realize timely and efficient updating of the high-precision map. In order to meet the updating requirement of the high-precision map, low-cost mapping becomes a hot spot of industrial research. And accurate pose data is a key for acquiring high-precision map elements by low-cost mapping. Therefore, how to improve the accuracy of pose data is a problem that those skilled in the art need to continuously solve and optimize.
In view of the above, this embodiment proposes a pose optimization apparatus that performs three successive pose optimizations, namely pose alignment, pose graph optimization, and global BA optimization, on the initial pose data of each pass, based on the inter-frame relative pose relationships between key frame images across the passes of acquired data and the inter-frame relative pose relationships between key image frames within a single pass of acquired data, so that the precision of the pose data increases stage by stage. With this technical solution, the absolute precision and the relative precision of the overall poses can be improved and pose data with higher precision can be recovered, providing a good foundation for reconstructing high-precision map elements, so that a high-precision map with higher precision can be reconstructed.
In an embodiment of the present disclosure, the pose optimization apparatus may be applied to a computer, a computing device, an electronic device, a server, a service cluster, etc. that may perform pose optimization.
In an embodiment of the disclosure, when a high-precision map of a certain road is constructed, acquisition data of more than two passes of the same road needs to be acquired, and the acquired data of the passes is optimized to obtain more accurate feature points of the key frame images of each pass and pose data corresponding to the key frame images; then the spatial three-dimensional information of the environment in which the road is located can be recovered using SfM (Structure from Motion), so that the high-precision map is constructed.
In one embodiment of the present disclosure, the key frame images are selected from a time-ordered sequence of acquired image data. A typical selection rule is that key frame images can be neither too close together nor too far apart, i.e., consecutive key frame images are selected within a certain interval range.
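As a rough sketch of such interval-based selection (the spacing threshold and the use of along-track distance are illustrative assumptions, not values from this disclosure):

```python
def select_keyframes(frame_distances, min_gap=2.0):
    """Select keyframe indices from a time-ordered frame sequence so that
    consecutive keyframes are at least `min_gap` metres apart along the
    trajectory (keyframes are neither too dense nor redundant).

    `frame_distances` holds each frame's cumulative travel distance.
    The 2.0 m default is an invented placeholder for illustration.
    """
    if not frame_distances:
        return []
    keyframes = [0]                      # always keep the first frame
    last = frame_distances[0]
    for i in range(1, len(frame_distances)):
        if frame_distances[i] - last >= min_gap:
            keyframes.append(i)
            last = frame_distances[i]
    return keyframes

# Frames at 0, 0.5, 1.0, 2.5, 3.0 and 5.0 metres along the trajectory
selected = select_keyframes([0.0, 0.5, 1.0, 2.5, 3.0, 5.0])
```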
In one embodiment of the present disclosure, the feature points of a key frame image may be used to identify target objects on the key frame image; typically, points where the gray values on the image change drastically, or points with larger curvature on image edges (such as the intersection of two edges), are considered feature points of the image. A feature point comprises two parts, a key point (Key-point) and a descriptor (Descriptor): the key point expresses the position of the feature point, and the descriptor is a description of the visual characteristics of the feature point, most often in vector form.
In an embodiment of the present disclosure, pose data is the position and orientation information of the acquisition device that acquired the keyframe image. For example, in three dimensions it is typically (x, y, z, yaw, pitch, roll), where the first three elements describe the three-dimensional position of the object and the last three describe its orientation: yaw is the heading angle (rotation about the Z axis), pitch is the pitch angle (rotation about the Y axis), and roll is the roll angle (rotation about the X axis). Camera calibration is performed according to the key frame images in each pass of acquired data (camera calibration refers to recovering objects in space from the key frame images captured by a camera), and from the calibration result and the motion data of the acquisition device, the pose data of the acquisition device at the moment each key frame image was captured, i.e., the initial pose data corresponding to the key frame image, can be determined.
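The (yaw, pitch, roll) convention above corresponds to a standard Euler-angle composition; a sketch of the resulting rotation matrix, assuming Z-Y-X composition order R = Rz(yaw) * Ry(pitch) * Rx(roll), which the disclosure does not state explicitly:

```python
import math

def ypr_to_rotation(yaw, pitch, roll):
    """Rotation matrix from (yaw, pitch, roll): yaw about Z, pitch about Y,
    roll about X, composed as R = Rz(yaw) * Ry(pitch) * Rx(roll)."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp,     cp * sr,                cp * cr],
    ]

# A pure 90-degree yaw maps the X axis onto the Y axis (first matrix column)
R = ypr_to_rotation(math.pi / 2, 0.0, 0.0)
```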
In an embodiment of the present disclosure, feature points of the key frame images of each pass of acquired data are used to identify target objects in the key frame images. Feature point matching refers to calculating the similarity of the descriptors of two feature points (i.e., calculating the distance between the two descriptors in vector space); when the similarity exceeds a preset threshold, for example 99%, the two feature points match and can be recorded as a common feature point. If the number of common feature points between two key frame images exceeds a preset number, the matching result is that the two key frame images have a common-view area; if it does not exceed the preset number, the matching result is that they have no common-view area. A common-view area means that the key frame images of two passes of acquired data display a common target object (possibly from different viewing angles); if two key frame images have a common-view area, the acquisition devices of the two passes captured the same target object from different poses when acquiring the two key frame images.
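The matching rule above (descriptor similarity over a threshold, then a minimum count of common feature points) can be sketched as follows; the cosine-similarity measure and both threshold values are illustrative assumptions, not from this disclosure:

```python
import math

def descriptor_similarity(d1, d2):
    """Cosine similarity between two descriptor vectors (1.0 = identical)."""
    dot = sum(a * b for a, b in zip(d1, d2))
    n1 = math.sqrt(sum(a * a for a in d1))
    n2 = math.sqrt(sum(b * b for b in d2))
    return dot / (n1 * n2)

def has_common_view(desc_a, desc_b, sim_threshold=0.99, min_matches=20):
    """Decide whether two keyframe images share a common-view area: count
    feature points of image A that match some feature point of image B,
    then compare against a preset number. Thresholds are placeholders."""
    matches = 0
    for da in desc_a:
        if any(descriptor_similarity(da, db) >= sim_threshold for db in desc_b):
            matches += 1
    return matches >= min_matches
```

A real pipeline would use binary or float descriptors (e.g. ORB, SIFT) with an approximate nearest-neighbour matcher rather than this brute-force loop.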
In an embodiment of the present disclosure, feature point matching may be performed between the key frame images of any single pass of acquired data and the key frame images of the other passes, so as to obtain a matching result between any two key frame images of any two passes of acquired data. Preferably, since two key frame images whose acquisition positions are far apart cannot have a common-view area, in order to reduce the amount of computation, when feature point matching is performed for a key frame image in single-pass acquired data, matching is performed only against those key frame images from other passes whose acquisition positions are within a certain distance range, obtaining matching results between that key frame image and the correspondingly screened key frame images.
In an embodiment of the present disclosure, based on the co-view areas of the key frame images in the different passes of the acquired data, the passes of the acquired data satisfying a grouping condition may be grouped, wherein the grouping condition includes that each pass of the acquired data in the same group has a key frame image with a co-view area with the key frame images in other passes of the acquired data.
In an embodiment of the present disclosure, the first pose optimization of the initial pose data corresponding to the key frame images rigidly aligns the other passes of acquired data in the same group to the initial pose data of the selected first pass, where rigid alignment means that the initial pose data corresponding to every key frame image of a pass is transformed by the same displacement and orientation change. For example, fig. 2 shows a schematic diagram of rigid alignment of two-pass poses according to an embodiment of the present disclosure. As shown in fig. 2, two passes of acquired data S1 and S2 belong to the same group, and the pose trajectory formed by the initial pose data of S1 is offset from the pose trajectory formed by the initial pose data of S2. The positions and orientations in the initial pose data of S2 are all moved by the same transformation so that they rigidly align with the pose trajectory formed by the pose data of S1, yielding the first optimized pose data of S2; the first optimized pose data of S1 is simply its initial pose data. Aligned S1 and S2 are thus obtained.
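The rigid alignment of S2 onto S1 amounts to applying one common rotation and translation to every pose of the pass; a 2D sketch (the 2D pose form (x, y, heading) and the parameter names are simplifications for illustration):

```python
import math

def rigid_align(trajectory, theta, tx, ty):
    """Apply one rigid transform (rotation `theta`, translation (tx, ty))
    to every (x, y, heading) pose in a trajectory, as in the rigid
    alignment of pass S2 onto pass S1. 2D poses are used for brevity;
    the real system transforms 6-DoF poses."""
    c, s = math.cos(theta), math.sin(theta)
    aligned = []
    for x, y, heading in trajectory:
        aligned.append((c * x - s * y + tx,   # rotate, then translate
                        s * x + c * y + ty,
                        heading + theta))      # orientations shift uniformly
    return aligned

# An offset S2 trajectory shifted back onto the reference by pure translation
s2 = [(1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
aligned = rigid_align(s2, 0.0, -1.0, -1.0)
```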
In an embodiment of the present disclosure, if two keyframe images have a common-view area, the same object is displayed in both images, only at different viewing angles, so the relative pose relationship of the two keyframe images can be derived. The relative pose relationship between two passes of acquired data can therefore be obtained by statistics over the keyframe images with common-view areas between the two passes in the same group, and the initial pose data of the two passes can be rigidly aligned according to this relationship to obtain the first optimized pose data of their keyframe images. Then, because keyframe images of the remaining passes have common-view areas with one or another of these two passes, the relative pose relationship between each remaining pass and an already aligned pass can likewise be obtained statistically, and the remaining pass is rigidly aligned to the pose data of that aligned pass accordingly. After all different passes of acquired data in the same group have been rigidly aligned according to this scheme, the first optimized pose data of the keyframe images in the same group is obtained.
Because the acquisition times are discontinuous and the acquisition devices differ, the pose trajectories in the acquired data of the different passes have a certain offset relative to one another. Rigidly aligning the initial pose data in each pass of acquired data provides a good initial value for the subsequent pose graph optimization and avoids falling into a local optimum.
In one embodiment of the disclosure, pose graph optimization uses reliable short-interval relative pose measurements to construct a global optimization problem covering key frame images over a long time span, spreading out accumulated errors: the pose data corresponding to each key frame image is taken as a node, and the relative pose calculation results between key frame images are taken as measurements for maximum likelihood estimation. The optimization variables of pose graph optimization are the poses only, so a graph containing only the pose trajectory is constructed; the edges between pose nodes are given by the motion estimates obtained after feature matching between two key frame images. Once the initial values are set, the positions of the feature points are no longer optimized and only the relationships between the poses of the acquisition device are considered. Performing this second pose optimization on the first optimized pose data of the key frame images of each pass yields the second optimized pose data of the key frame images; based on the inter-frame relative pose constraints between key image frames of single-pass acquired data and the inter-frame relative pose constraints between key frame images with common-view areas between any two passes of acquired data, the second optimized pose data recovers the absolute precision of the pose data of the key frame images of each pass while ensuring their relative precision. The precision of the second optimized pose data is greater than that of the first optimized pose data.
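The combination of a pose prior and relative pose measurements can be illustrated with a deliberately tiny 1D pose graph minimized by gradient descent; real systems optimize 6-DoF poses with nonlinear least-squares solvers, so this is only a structural sketch with invented weights and measurements:

```python
def optimize_pose_graph(poses, priors, relatives, iters=500, lr=0.1):
    """Minimize squared prior residuals (x_i - p_i) and squared relative
    residuals ((x_j - x_i) - z_ij) by gradient descent over 1D poses."""
    x = list(poses)
    for _ in range(iters):
        grad = [0.0] * len(x)
        for i, p, w in priors:                 # pose prior constraints
            grad[i] += 2 * w * (x[i] - p)
        for i, j, z, w in relatives:           # relative pose constraints
            r = (x[j] - x[i]) - z
            grad[j] += 2 * w * r
            grad[i] -= 2 * w * r
        x = [xi - lr * gi for xi, gi in zip(x, grad)]
    return x

# Three keyframes: a prior pins x0 at 0; odometry says each step is 1.0.
# Starting from drifted initial values, the optimum is x = [0, 1, 2].
x = optimize_pose_graph([0.0, 0.7, 2.4],
                        priors=[(0, 0.0, 1.0)],
                        relatives=[(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0)])
```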
In an embodiment of the disclosure, BA (Bundle Adjustment) optimization optimizes not only the poses but also the feature points, and includes minimization of the re-projection error: the three-dimensional feature points are projected into the image plane according to their three-dimensional position coordinates and the pose data, and the error between the projected points and the feature points extracted in the image is calculated; BA optimization minimizes this error. BA optimization adjusts the second optimized pose data and the feature points to minimize the error between the observed and predicted image positions, thereby obtaining the three-dimensional feature points of the key frame images and the third optimized pose data, whose precision is greater than that of the second optimized pose data.
In the above embodiment, acquisition data for two or more passes of the same road is first acquired, one pass of acquisition data including the feature points of key frame images and the initial pose data corresponding to the key frame images. For each single pass of acquired data, feature point matching is performed between its key frame images and the key frame images in the other passes, so as to obtain matching results for the key frame images in different passes, the matching results including whether a common-view area exists between two key frame images. The different passes of acquired data are grouped based on the common-view areas of their key frame images. Then, for the different passes of acquired data in the same group, one pass is selected, and the other passes in the group are aligned based on the initial pose data of the selected pass to obtain the first optimized pose data of the key frame images in the group. The aligned first optimized pose data provides a good initial value for pose graph optimization; performing pose graph optimization on the first optimized pose data of the key frame images of each pass recovers the absolute precision of the pose data of the key frame images of each pass while guaranteeing their relative precision, and in turn provides a better initial value for the subsequent BA optimization. Finally, global BA optimization is performed on the feature points of the key frame images and the second optimized pose data of the key frame images, so that high-precision three-dimensional feature points of the key frame images and third optimized pose data can be obtained, providing a good basis for subsequently constructing high-precision maps and thereby reconstructing high-precision maps with higher precision. In addition, this embodiment uses only the feature points of the images when performing pose optimization, so the data used is simpler, and pose data of higher precision can be recovered without acquiring too many passes of data (in an example, only two passes of acquired data may be used).
In an embodiment of the present disclosure, the grouping module 403 is configured to:
dividing each pass of acquired data meeting a grouping condition among the different passes of acquired data into a group, wherein the grouping condition comprises that the key frame images of each pass of acquired data in the same group have a common-view area with key frame images of other passes of acquired data in that group.
In this embodiment, each pass of acquired data in the same group necessarily has a common-view area with the keyframe images in one or more other passes of acquired data in that group; across different groups, no keyframe image of any pass of acquired data in one group has a common-view area with the keyframe images of any pass of acquired data in another group.
In this embodiment, assume that 6 passes of acquired data S1, S2, S3, S4, S5, and S6 are acquired in total, where key frame images between S1 and S2 have a common-view area; key frame images between S3 and S2 have a common-view area, but no key frame images between S3 and S1 have one; key frame images between S4 and S3 have a common-view area, but no key frame images between S4 and S1 or S2 have one; and key frame images between S5 and S6 have a common-view area, but no key frame images between S5 and S1 to S4 have one, and no key frame images between S6 and S1 to S4 have one. Then S1, S2, S3, and S4 can be classified into a first group, and S5 and S6 into a second group. Each pass of acquired data in the first group necessarily has one or more keyframe images with a common-view area with keyframe images in one or more other passes of acquired data: e.g., a keyframe image of S1 shares a common-view area with a keyframe image of S2; a keyframe image of S2 with a keyframe image of S1 or S3; a keyframe image of S3 with a keyframe image of S2 or S4; and a keyframe image of S4 with a keyframe image of S3.
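This grouping rule is equivalent to taking connected components of a graph whose nodes are passes and whose edges are co-view relations; a sketch using union-find, replaying the S1..S6 example with passes indexed from 0:

```python
def group_passes(num_passes, coview_pairs):
    """Group passes into connected components of the co-view graph.

    `coview_pairs` lists pairs (i, j) of passes that share at least one
    pair of keyframe images with a common-view area.
    """
    parent = list(range(num_passes))

    def find(x):                       # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in coview_pairs:          # union the two components
        parent[find(i)] = find(j)

    groups = {}
    for p in range(num_passes):
        groups.setdefault(find(p), []).append(p)
    return sorted(groups.values())

# The S1..S6 example: co-view pairs (S1,S2), (S2,S3), (S3,S4), (S5,S6)
groups = group_passes(6, [(0, 1), (1, 2), (2, 3), (4, 5)])
```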
In an embodiment of the present disclosure, the alignment module 404 is configured to
selecting a first pass of acquired data from the different passes of acquired data in the same group;
based on initial pose data of each pair of common-view key frame images with common-view areas between second-pass acquired data and first-pass acquired data in the same group, obtaining first inter-frame relative pose information of each pair of common-view key frame images through the Perspective-n-Point (PnP) algorithm;
counting the first inter-frame relative pose information of each pair of common-view key frame images between the second-pass acquired data and the first-pass acquired data to obtain first inter-frame weighted relative pose information between the second-pass acquired data and the first-pass acquired data;
aligning the second-pass acquired data to initial pose data of first-pass acquired data based on the first inter-frame weighted relative pose information to obtain first-pass optimized pose data of a key frame image in the second-pass acquired data;
obtaining second inter-frame relative pose information of each pair of common-view key frame images through PnP based on initial pose data of key frame images of third-pass acquired data with a common-view region and first-time optimized pose data of key frame images of aligned first-pass acquired data;
counting the second inter-frame relative pose information of each pair of common-view key frame images between the third-pass acquired data and the aligned first-pass acquired data to obtain second inter-frame weighted relative pose information between the third-pass acquired data and the aligned first-pass acquired data;
and based on the second inter-frame weighted relative pose information, aligning the third-pass acquired data to the first optimized pose data of the aligned first-pass acquired data to obtain the first optimized pose data of the key frame image in the third-pass acquired data.
In this embodiment, one pass of acquired data may be randomly selected from the different passes of acquired data in the same group as the first-pass acquired data, and the second-pass acquired data is any pass of acquired data in the same group having a common-view area with the first-pass acquired data. The first inter-frame relative pose information of each pair of common-view key frame images may be obtained through the Perspective-n-Point (PnP) algorithm based on the initial pose data of each pair of common-view key frame images having a common-view area between the second-pass and first-pass acquired data, and the first inter-frame weighted relative pose information between the second-pass and first-pass acquired data is then obtained through statistics; the statistical method may be a weighted average of the individual first inter-frame relative pose information. The initial pose data of the second-pass acquired data is transformed according to the first inter-frame weighted relative pose information and aligned to the initial pose data of the first-pass acquired data, obtaining the first optimized pose data of the key frame images in the second-pass acquired data; the first optimized pose data of the key frame images in the first-pass acquired data is simply its initial pose data. At this point, the aligned acquired data are the first and second passes of acquired data.
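The statistics step (weighted averaging of the per-pair relative poses into one inter-pass relative pose) might look like the following 2D sketch; the (dx, dy, dyaw) parameterization and the circular-mean treatment of the angle are assumptions for illustration:

```python
import math

def weighted_relative_pose(relative_poses, weights):
    """Weighted average of per-pair relative poses (dx, dy, dyaw) between
    two passes, giving one inter-pass weighted relative pose. Weights could
    come from, e.g., per-pair match counts; the angle is averaged via its
    sine and cosine so that wrap-around is handled correctly."""
    total = sum(weights)
    dx = sum(w * p[0] for p, w in zip(relative_poses, weights)) / total
    dy = sum(w * p[1] for p, w in zip(relative_poses, weights)) / total
    s = sum(w * math.sin(p[2]) for p, w in zip(relative_poses, weights))
    c = sum(w * math.cos(p[2]) for p, w in zip(relative_poses, weights))
    return (dx, dy, math.atan2(s, c))

# Two co-view pairs agree on heading but differ slightly in translation
fused = weighted_relative_pose([(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
                               [1.0, 1.0])
```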
In this embodiment, the third-pass acquired data is one pass of acquired data in the same group having a common-view area with any of the already aligned acquired data. Second inter-frame relative pose information of each pair of common-view key frame images between the third-pass acquired data and the aligned first-pass acquired data is obtained through PnP based on the initial pose data of the key frame images of the third-pass acquired data having the common-view area and the first optimized pose data of the key frame images of the aligned first-pass acquired data; weighted statistics are performed on the second inter-frame relative pose information to obtain second inter-frame weighted relative pose information between the third-pass acquired data and the aligned first-pass acquired data; and the initial pose data of the third-pass acquired data is transformed according to the second inter-frame weighted relative pose information and aligned to the first optimized pose data of the aligned first-pass acquired data, obtaining the first optimized pose data of the key frame images in the third-pass acquired data. At this time, the aligned acquired data are the first, second, and third passes of acquired data, and the process is repeated for the remaining passes of acquired data in the same group according to steps A5 to A7 until every pass of acquired data in the same group is aligned.
Here, according to the common feature point and pose data of each pair of common view key frames, the inter-frame relative pose information of the pair of common view key frames can be estimated by using PnP algorithm.
By way of example, assume that 4 passes of acquired data S1, S2, S3, and S4 are acquired in total, where key frame images between S1 and S2 have a common-view area; key frame images between S3 and S2 have a common-view area, but no key frame images between S3 and S1 have one; and key frame images between S4 and S3 have a common-view area, but no key frame images between S4 and S1 or S2 have one. Then S1, S2, S3, and S4 may be divided into the same group. S2 can be randomly selected from the group as the first-pass acquired data; the second-pass acquired data can be either of S1 and S3, which have common-view areas with S2 in the same group. First inter-frame relative pose information of each pair of common-view key frame images can be obtained through the PnP algorithm based on the initial pose data of each pair of common-view key frame images having a common-view area between S1 and S2, and first inter-frame weighted relative pose information between S1 and S2 is then obtained through statistics. The initial pose data of S1 is transformed according to the first inter-frame weighted relative pose information and aligned to the initial pose data of S2, obtaining the first optimized pose data of the key frame images in S1; the first optimized pose data of the key frame images in S2 is the initial pose data of S2. At this time, the aligned acquired data are S1 and S2. The third-pass acquired data is S3, the pass in the same group having a common-view area with S2 among the aligned S1 and S2.
Based on the initial pose data of the key frame image of the S3 with the common view area and the first optimized pose data of the key frame image of the S2, obtaining second inter-frame relative pose information of each pair of common view key frame images between the S2 and the S3 through a PnP algorithm; carrying out weighted statistics on the second inter-frame relative pose information to obtain second inter-frame weighted relative pose information between S2 and S3; and (3) transforming the initial pose data of the S3 according to the second inter-frame weighted relative pose information, and aligning the first optimized pose data of the S2 to obtain the first optimized pose data of the key frame image in the S3. At this time, the aligned acquired data are S1, S2, and S3. The acquisition data S4 of the other remaining passes is repeated as follows according to steps A5 to A7: the S4 and the S3 have a common view area, and second inter-frame relative pose information of each pair of common view key frame images between the S4 and the S3 can be obtained through a PnP algorithm based on initial pose data of the key frame image of the S4 with the common view area and first time optimized pose data of the key frame image of the aligned one-time acquisition data S3; carrying out weighted statistics on the second inter-frame relative pose information to obtain second inter-frame weighted relative pose information between S4 and S3; and transforming the initial pose data of the S4 according to the second inter-frame weighted relative pose information, and aligning the first optimized pose data of the S3 to obtain the first optimized pose data of the key frame image in the S4, so that the acquired data of the S1 to the S4 in the same group are aligned.
Here, it should be noted that each pass of acquired data has common-view areas with N other passes of acquired data in the same group, the value of N may differ from pass to pass, and the first-pass acquired data may be chosen as the pass with the largest N. If all other passes of acquired data in the same group have common-view areas with a certain pass of acquired data, that pass can be selected as the first-pass acquired data. Then, according to the method above, for each pair of common-view key frame images having a common-view area between each other pass and the first pass, inter-frame relative pose information is obtained through the Perspective-n-Point (PnP) algorithm; the inter-frame relative pose information of all pairs of common-view key frame images between each other pass and the first pass is counted to obtain the inter-frame weighted relative pose information between that pass and the first pass; and finally each other pass is aligned to the initial pose data of the first pass based on its inter-frame weighted relative pose information, obtaining the first optimized pose data of the key frame images in the other passes.
In an embodiment of the present disclosure, the obtaining module 401 is configured to:
acquiring image data and motion data of more than two passes for the same road;
according to the image data and the motion data of each pass, obtaining the initial pose data corresponding to the key frame images of each pass through a visual-inertial odometry (VIO) algorithm, wherein the feature points of the key frame images of each pass are obtained during the calculation of the VIO algorithm.
In this embodiment, the image data and motion data of the acquisition device during motion are acquired by installing sensors (e.g., a camera, an IMU (Inertial Measurement Unit), an RTK (Real-Time Kinematic) receiver, etc.) on the acquisition device. The image data is the sequence of images acquired by the camera during each acquisition pass; the motion data includes the RTK data of the camera during each acquisition pass, or the RTK data and IMU data, etc. The IMU is a sensor used primarily to detect and measure acceleration and rotational motion: by integrating multiple inertial sensors, it produces acceleration and angular velocity measurements along multiple axes or degrees of freedom, and the IMU data is the acceleration and angular velocity of the acquisition device measured by the IMU in real time. RTK is a real-time differential GPS technique based on carrier phase observation that can obtain three-dimensional pose data with centimeter-level positioning precision in real time; the RTK data is the real-time three-dimensional pose data of the acquisition device.
In this embodiment, to realize low-cost mapping, the image data and motion data for two or more passes of the same road can be obtained from crowdsourced material captured by various acquisition devices such as an in-vehicle camera, a cell phone camera, a digital camera, or a video camera. It should be understood that crowdsourced data refers to open data captured by the public by some means (e.g., shooting with an in-vehicle camera) and then provided to the public or to the relevant institutions via the internet. The public may upload crowdsourced material voluntarily or provide it by participating in crowdsourcing tasks issued by the relevant institutions. The crowdsourced data comprises the image data and motion data captured by the public during each acquisition; the acquired data of one pass refers to the feature points of the key frame images and the initial pose data corresponding to those key frame images, obtained from the image data and motion data captured in one continuous acquisition, and each pass of acquired data is time-ordered. When a crowdsourced map is needed for a certain road, two or more passes of acquired data for that road can be obtained from the crowdsourced data.
In this embodiment, the VIO (Visual-Inertial Odometry) algorithm first extracts the feature points in each key frame image during its computation and then calculates the initial pose data corresponding to that key frame image. The feature points of the key frame images of each pass can therefore be obtained as a by-product of the VIO computation, without a separate feature extraction step, which reduces the amount of computation.
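Keyframe selection itself is not detailed in the embodiment; a common heuristic (presented here as a hypothetical sketch, with illustrative thresholds and function name) is to declare a new keyframe whenever the camera has translated or rotated beyond a threshold since the last keyframe:

```python
import numpy as np

def select_keyframes(positions, yaws, min_dist=2.0, min_yaw=0.26):
    """Return indices of frames chosen as keyframes.

    A frame becomes a keyframe once the camera has moved min_dist meters
    or turned min_yaw radians since the previous keyframe. Real VIO
    systems typically also consider tracked-feature parallax and count.
    """
    keys = [0]  # the first frame is always a keyframe
    for i in range(1, len(positions)):
        last = keys[-1]
        moved = np.linalg.norm(np.asarray(positions[i]) - np.asarray(positions[last]))
        turned = abs(yaws[i] - yaws[last])
        if moved >= min_dist or turned >= min_yaw:
            keys.append(i)
    return keys
```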
In an embodiment of the present disclosure, the pose graph optimization module 405 is configured to:
and performing pose graph optimization on the first optimized pose data of the key frame image by constructing a pose prior constraint, frame pose constraints among the key frame images of single-pass acquired data, and inter-frame relative pose constraints among key frame images having common-view areas between two passes of acquired data, so as to obtain the second optimized pose data of the key frame image.
In this embodiment, when pose graph optimization is applied to the first optimized pose data, the adjustment is performed under three constraints: the pose prior constraint, the frame pose constraints between the key frame images of single-pass acquired data, and the inter-frame relative pose constraints between key frame images having common-view areas across two passes of acquired data. The pose prior constraint limits how far the first optimized pose data corresponding to each key frame image may be adjusted. The frame pose constraint within a single pass means that, when the first optimized pose data of a key frame image is adjusted, the inter-frame relative pose between that image and the common-view key frame images of the same pass must still satisfy the constraint and must not deviate too far from the inter-frame relative pose calculated from the initial pose data of those key frame images. The inter-frame relative pose constraint between two passes imposes the same condition on the relative pose between a key frame image and its common-view key frame images in other passes. In this way, the absolute accuracy of the pose data of each pass of key frame images can be recovered while their relative accuracy is preserved.
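The three constraint types can be illustrated with a toy 1D pose graph, in which each constraint contributes one weighted residual and the whole system is solved by least squares. The real problem is nonlinear on SE(3) and solved iteratively (e.g., Gauss-Newton); this linear sketch, with assumed weights and an illustrative function name, only shows how prior, within-pass, and cross-pass constraints combine:

```python
import numpy as np

def optimize_pose_graph_1d(n, prior, odom, loops,
                           w_prior=100.0, w_odom=1.0, w_loop=1.0):
    """Toy 1D pose graph over unknowns x_0..x_{n-1} with residuals:
       prior:  x_0 - prior                     (pose prior constraint)
       odom:   (x_{i+1} - x_i) - d_i           (within-pass frame pose constraint)
       loop:   (x_j - x_i) - d_ij              (cross-pass common-view constraint)
    Solved in closed form via weighted normal equations."""
    rows, rhs, weights = [], [], []
    e = np.eye(n)
    rows.append(e[0]); rhs.append(prior); weights.append(w_prior)
    for i, d in enumerate(odom):            # odom[i] links x_i -> x_{i+1}
        rows.append(e[i + 1] - e[i]); rhs.append(d); weights.append(w_odom)
    for i, j, d in loops:                   # (i, j, measured x_j - x_i)
        rows.append(e[j] - e[i]); rhs.append(d); weights.append(w_loop)
    a, b, w = np.array(rows), np.array(rhs), np.diag(weights)
    return np.linalg.solve(a.T @ w @ a, a.T @ w @ b)
```

The prior term fixes the gauge (absolute accuracy), while the odometry and loop terms preserve the relative geometry within and across passes, mirroring the roles of the three constraints described above.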
In an embodiment of the present disclosure, the global optimization module 406 is configured to:
and performing global beam adjustment method BA optimization on the feature points of the key frame image and the second optimized pose data of the key frame image by establishing a factor graph of 3D feature point reprojection errors, frame pose constraint among key image frames of single-pass acquired data and pose prior constraint to obtain three-dimensional feature points of the key frame image and third optimized pose data.
In this embodiment, the 3D feature point reprojection error refers to projecting a feature point onto the two-dimensional image plane according to its three-dimensional position coordinates and the pose data of the acquisition device, then computing the distance between the projected point and the corresponding feature point extracted from the two-dimensional image; this distance is minimized during BA optimization. For example, as shown in fig. 3, the feature points P1, P2, and P3 are projected onto the key frame images whose poses are T1, T2, and T3, respectively; the projected positions are compared with the actual positions of P1, P2, and P3 extracted on the key frame images, the distances between the two are computed, and the feature points and pose data are adjusted so as to minimize those distances.
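For a pinhole camera, the reprojection error just described can be written down directly. The following numpy sketch (illustrative names; intrinsics K, pose (R, t), and the observed pixel are assumed given) computes the pixel distance that BA minimizes for one point-image pair:

```python
import numpy as np

def reprojection_error(point_w, r, t, k, observed_px):
    """Project 3D world point into the image with pose (R, t) and
    intrinsics K, and return the pixel distance to the observed feature
    location -- the quantity BA minimizes over all points and keyframes."""
    p_cam = r @ point_w + t      # world -> camera frame
    uvw = k @ p_cam              # camera frame -> homogeneous pixel
    uv = uvw[:2] / uvw[2]        # perspective division
    return np.linalg.norm(uv - observed_px)
```

BA sums the squares of these errors over every feature point observed in every keyframe and jointly adjusts the 3D points and poses to minimize the total.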
In this embodiment, while minimizing the reprojection error, the frame pose constraints between the key frame images of single-pass acquired data and the pose prior constraint are also taken into account, so that the feature points and the second optimized pose data are adjusted through global BA optimization to obtain the three-dimensional feature points and the third optimized pose data.
The embodiment of the disclosure also discloses a navigation service, wherein a high-precision map of the area where the navigated object is located is obtained based on the above pose optimization method, and a navigation guidance service for the corresponding scene is provided for the navigated object based on the high-precision map. The corresponding scene is one or a combination of AR navigation, elevated-road navigation, and main/auxiliary road navigation.
The embodiment of the disclosure also discloses a navigation method, wherein a navigation route calculated based on at least a start point, an end point, and road conditions is obtained based on a high-precision map, and navigation guidance is performed based on the navigation route; the high-precision map is obtained by map reconstruction based on the three-dimensional feature points and third optimized pose data obtained by any one of the above methods.
The present disclosure also discloses an electronic device, fig. 5 shows a block diagram of the electronic device according to an embodiment of the present disclosure, and as shown in fig. 5, the electronic device 500 includes a memory 501 and a processor 502; wherein,
the memory 501 is configured to store one or more computer instructions that are executed by the processor 502 to implement the method steps described above.
FIG. 6 is a schematic diagram of a computer system suitable for use in implementing a pose optimization method according to an embodiment of the present disclosure.
As shown in fig. 6, the computer system 600 includes a processing unit 601, which can execute the various processes in the above-described embodiments according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The processing unit 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read therefrom is installed into the storage section 608 as needed. The processing unit 601 may be implemented as a CPU, GPU, TPU, FPGA, NPU, or other processing unit.
In particular, according to embodiments of the present disclosure, the methods described above may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a computer-readable medium, the computer program comprising program code for performing the pose optimization method. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 609 and/or installed from the removable medium 611.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules described in the embodiments of the present disclosure may be implemented by software, or may be implemented by hardware. The units or modules described may also be provided in a processor, the names of which in some cases do not constitute a limitation of the unit or module itself.
As another aspect, the embodiments of the present disclosure also provide a computer-readable storage medium, which may be the computer-readable storage medium included in the apparatus described in the above embodiment, or may be a stand-alone computer-readable storage medium not assembled into any device. The computer-readable storage medium stores one or more programs used by one or more processors to perform the methods described in the embodiments of the present disclosure.
The foregoing description covers only the preferred embodiments of the present disclosure and the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combinations of the above technical features, but also encompasses other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with technical features of similar function disclosed (but not limited to those) in the embodiments of the present disclosure.

Claims (11)

1. A pose optimization method, comprising:
acquiring acquisition data for more than two passes of the same road, the acquisition data for one pass comprising: feature points of the key frame image and initial pose data corresponding to the key frame image;
for single-pass acquired data, performing feature point matching on a key frame image in the single-pass acquired data and key frame images in other-pass acquired data to obtain matching results of the key frame images in different-pass acquired data, wherein the matching results comprise matching results of whether a common-view area exists between two key frame images;
grouping the different passes of acquisition data based on co-view regions of key frame images in the different passes of acquisition data;
for different passes of acquired data in the same group, selecting one pass of acquired data therefrom, and aligning the other passes of acquired data in the same group based on the initial pose data of the selected pass of acquired data to obtain first optimized pose data of the key frame images in the same group;
performing pose graph optimization on the first optimized pose data of the key frame image to obtain second optimized pose data of the key frame image, wherein the pose graph optimization at least comprises inter-frame relative pose constraints between key frame images having common-view areas between two passes of acquired data;
And performing global bundle adjustment (BA) optimization based on the feature points of the key frame image and the second optimized pose data of the key frame image to obtain three-dimensional feature points of the key frame image and third optimized pose data, wherein the accuracy of the pose data increases progressively.
2. The method of claim 1, the grouping the different passes of acquisition data based on a co-view region of a keyframe image in the different passes of acquisition data comprising:
dividing each pass of acquired data meeting a grouping condition in the different passes of acquired data into a group, wherein the grouping condition comprises a common view area of a key frame image of each pass of acquired data in the same group and a key frame image of other passes of acquired data.
3. The method of claim 1 or 2, wherein the selecting one pass of the acquired data for different passes in the same group, aligning other passes of the acquired data in the same group based on the initial pose data of the selected one pass of the acquired data, and obtaining first optimized pose data of the keyframe image in the same group, comprises:
selecting a first pass of acquired data from among the different passes of acquired data in the same group;
Based on initial pose data of each pair of common-view key frame images having a common-view area between the second-pass acquired data and the first-pass acquired data in the same group, obtaining first inter-frame relative pose information of each pair of common-view key frame images through Perspective-n-Point (PnP);
aggregating the first inter-frame relative pose information of each pair of common-view key frame images between the second-pass acquired data and the first-pass acquired data to obtain first inter-frame weighted relative pose information between the second-pass acquired data and the first-pass acquired data;
aligning the second-pass acquired data to initial pose data of first-pass acquired data based on the first inter-frame weighted relative pose information to obtain first-pass optimized pose data of a key frame image in the second-pass acquired data;
obtaining second inter-frame relative pose information of each pair of common-view key frame images through PnP based on initial pose data of key frame images of third-pass acquired data with a common-view region and first-time optimized pose data of key frame images of aligned first-pass acquired data;
aggregating the second inter-frame relative pose information of each pair of common-view key frame images between the third-pass acquired data and the aligned first-pass acquired data to obtain second inter-frame weighted relative pose information between the third-pass acquired data and the aligned first-pass acquired data;
And based on the second inter-frame weighted relative pose information, aligning the third-pass acquired data to the first optimized pose data of the aligned first-pass acquired data to obtain the first optimized pose data of the key frame image in the third-pass acquired data.
4. The method of claim 1, the acquiring acquisition data for more than two passes of the same road, comprising:
acquiring image data and motion data of more than two passes for the same road;
according to the image data and the motion data of each pass, acquiring initial pose data corresponding to the key frame images of each pass through a Visual-Inertial Odometry (VIO) algorithm, wherein the feature points of the key frame images of each pass are obtained during the calculation process of the VIO algorithm.
5. The method of claim 1, wherein performing pose graph optimization on the first optimized pose data of the key frame image to obtain second optimized pose data of the key frame image comprises:
performing pose graph optimization on the first optimized pose data of the key frame image by constructing a pose prior constraint, frame pose constraints among the key frame images of single-pass acquired data, and inter-frame relative pose constraints among key frame images having common-view areas between two passes of acquired data, so as to obtain the second optimized pose data of the key frame image.
6. The method according to claim 1, wherein performing global bundle adjustment (BA) optimization based on the feature points of the key frame image and the second optimized pose data of the key frame image to obtain three-dimensional feature points and third optimized pose data of the key frame image comprises:
performing global BA optimization on the feature points of the key frame image and the second optimized pose data of the key frame image by establishing a factor graph of 3D feature point reprojection errors, frame pose constraints among the key frame images of single-pass acquired data, and a pose prior constraint, to obtain the three-dimensional feature points and third optimized pose data of the key frame image.
7. A pose optimization device comprising:
an acquisition module configured to acquire acquisition data for more than two passes of the same road, the one pass acquisition data comprising: feature points of the key frame image and initial pose data corresponding to the key frame image;
the matching module is configured to, for single-pass acquired data, perform feature point matching between key frame images in the single-pass acquired data and key frame images in other passes of acquired data to obtain matching results of the key frame images across different passes of acquired data, wherein the matching results include whether a common-view area exists between two key frame images;
A grouping module configured to group the different passes of acquisition data based on a co-view region of a keyframe image in the different passes of acquisition data;
the alignment module is configured to acquire data for different passes in the same group, select one pass of acquired data from the acquired data, and align other passes of acquired data in the same group based on initial pose data of the selected one pass of acquired data to obtain first optimized pose data of a key frame image in the same group;
the pose graph optimization module is configured to perform pose graph optimization on the first optimized pose data of the key frame image to obtain second optimized pose data of the key frame image, the pose graph optimization at least comprising inter-frame relative pose constraints between key frame images having common-view areas between two passes of acquired data;
and the global optimization module is configured to perform global bundle adjustment (BA) optimization based on the feature points of the key frame image and the second optimized pose data of the key frame image to obtain three-dimensional feature points of the key frame image and third optimized pose data, wherein the accuracy of the pose data increases progressively.
8. An electronic device comprising a memory and at least one processor; wherein the memory is for storing one or more computer instructions, wherein the one or more computer instructions are executed by the at least one processor to implement the method steps of any of claims 1-6.
9. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the method steps of any of claims 1-6.
10. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the method steps of any of claims 1-6.
11. A navigation method, wherein a navigation route calculated based on at least a start point, an end point, and road conditions is obtained based on a high-precision map, and navigation guidance is performed based on the navigation route, the high-precision map being obtained by map reconstruction based on the three-dimensional feature points and third optimized pose data obtained by the method of any one of claims 1 to 6.
CN202110903076.5A 2021-08-06 2021-08-06 Pose optimization method, pose optimization device, electronic equipment, medium and program product Active CN113744308B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110903076.5A CN113744308B (en) 2021-08-06 2021-08-06 Pose optimization method, pose optimization device, electronic equipment, medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110903076.5A CN113744308B (en) 2021-08-06 2021-08-06 Pose optimization method, pose optimization device, electronic equipment, medium and program product

Publications (2)

Publication Number Publication Date
CN113744308A CN113744308A (en) 2021-12-03
CN113744308B true CN113744308B (en) 2024-02-20

Family

ID=78730389

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110903076.5A Active CN113744308B (en) 2021-08-06 2021-08-06 Pose optimization method, pose optimization device, electronic equipment, medium and program product

Country Status (1)

Country Link
CN (1) CN113744308B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114742884B (en) * 2022-06-09 2022-11-22 杭州迦智科技有限公司 Texture-based mapping, mileage calculation and positioning method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019157925A1 (en) * 2018-02-13 2019-08-22 视辰信息科技(上海)有限公司 Visual-inertial odometry implementation method and system
CN110866496A (en) * 2019-11-14 2020-03-06 合肥工业大学 Robot positioning and mapping method and device based on depth image
WO2020259248A1 (en) * 2019-06-28 2020-12-30 Oppo广东移动通信有限公司 Depth information-based pose determination method and device, medium, and electronic apparatus

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019157925A1 (en) * 2018-02-13 2019-08-22 视辰信息科技(上海)有限公司 Visual-inertial odometry implementation method and system
WO2020259248A1 (en) * 2019-06-28 2020-12-30 Oppo广东移动通信有限公司 Depth information-based pose determination method and device, medium, and electronic apparatus
CN110866496A (en) * 2019-11-14 2020-03-06 合肥工业大学 Robot positioning and mapping method and device based on depth image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CFD-SLAM: a fast and robust SLAM system fusing the feature-based and direct methods; 王化友; 代波; 何玉庆; High Technology Letters (12); full text *

Also Published As

Publication number Publication date
CN113744308A (en) 2021-12-03

Similar Documents

Publication Publication Date Title
CN110243358B (en) Multi-source fusion unmanned vehicle indoor and outdoor positioning method and system
CN106679648B (en) Visual inertia combination SLAM method based on genetic algorithm
JP7326720B2 (en) Mobile position estimation system and mobile position estimation method
CN112197770B (en) Robot positioning method and positioning device thereof
CN108986037A (en) Monocular vision odometer localization method and positioning system based on semi-direct method
CN112907678B (en) Vehicle-mounted camera external parameter attitude dynamic estimation method and device and computer equipment
US20220292711A1 (en) Pose estimation method and device, related equipment and storage medium
CN108519102B (en) Binocular vision mileage calculation method based on secondary projection
CN111261016B (en) Road map construction method and device and electronic equipment
CN111830953A (en) Vehicle self-positioning method, device and system
CN110617821A (en) Positioning method, positioning device and storage medium
JP2022542289A (en) Mapping method, mapping device, electronic device, storage medium and computer program product
JP2012118666A (en) Three-dimensional map automatic generation device
Cai et al. Mobile robot localization using gps, imu and visual odometry
Zhang et al. Vision-aided localization for ground robots
CN114623817A (en) Self-calibration-containing visual inertial odometer method based on key frame sliding window filtering
CN115690338A (en) Map construction method, map construction device, map construction equipment and storage medium
CN116295412A (en) Depth camera-based indoor mobile robot dense map building and autonomous navigation integrated method
CN114693754A (en) Unmanned aerial vehicle autonomous positioning method and system based on monocular vision inertial navigation fusion
CN115326084A (en) Vehicle positioning method and device, computer equipment and storage medium
CN113744308B (en) Pose optimization method, pose optimization device, electronic equipment, medium and program product
Xian et al. Fusing stereo camera and low-cost inertial measurement unit for autonomous navigation in a tightly-coupled approach
CN113345032B (en) Initialization map building method and system based on wide-angle camera large distortion map
KR101006977B1 (en) Method for supplementing map data during a productive digital map
CN112651991A (en) Visual positioning method, device and computer system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant