CN114966789A - Mapping method and system fusing GNSS and multi-view vision - Google Patents

Mapping method and system fusing GNSS and multi-view vision

Info

Publication number
CN114966789A
CN114966789A CN202210526760.0A CN202210526760A CN114966789A CN 114966789 A CN114966789 A CN 114966789A CN 202210526760 A CN202210526760 A CN 202210526760A CN 114966789 A CN114966789 A CN 114966789A
Authority
CN
China
Prior art keywords
view
key frame
pose
gnss
visual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210526760.0A
Other languages
Chinese (zh)
Inventor
彭刚
许镟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN202210526760.0A priority Critical patent/CN114966789A/en
Publication of CN114966789A publication Critical patent/CN114966789A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00 Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/38 Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system
    • G01S19/39 Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system the satellite radio beacon positioning system transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/42 Determining position
    • G01S19/45 Determining position by combining measurements of signals from the satellite radio beacon positioning system with a supplementary measurement
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05 Geographic models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/251 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/579 Depth or shape recovery from multiple images from motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75 Determining position or orientation of objects or cameras using feature-based methods involving models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/77 Determining position or orientation of objects or cameras using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Remote Sensing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Graphics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a mapping method and system fusing GNSS and multi-view vision. The method comprises the following steps: the update times of the multi-view visual images are constrained by the timestamps of the GNSS measurements so that the GNSS measurements are aligned with the multi-view visual images, and pose estimation is then performed on the multi-view visual images to obtain key frame poses; at each view, the current key frame pose and the previous key frame pose are converted into the station-center coordinate system and the relative pose is calculated, the sum of the current key frame pose and the relative pose is taken as the visual observation of the current key frame, and the globally optimized pose is solved with the goal of minimizing the difference between the GNSS measurement and the visual observation of the current key frame; the single-view local map is updated with the globally optimized pose under each single view, and when the time consistency constraint and the space consistency constraint are satisfied, the multi-view local maps are fused to obtain a globally consistent map. The invention can improve real-time positioning accuracy and robustness and establish a globally consistent map.

Description

Mapping method and system fusing GNSS and multi-view vision
Technical Field
The invention belongs to the technical fields of automatic driving, unmanned systems and mobile robots, and of multi-sensor fusion positioning and mapping, and particularly relates to a mapping method and system fusing GNSS and multi-view vision.
Background
GNSS signals can provide global positioning information with relatively high accuracy, but in indoor scenes the satellite signal may be blocked, and in outdoor scenes it is easily lost, which degrades positioning accuracy or even causes tracking failure. Visual information is not affected by satellite signals, and pose estimation can be completed well by extracting visual features from the environment; however, visual features are easily lost under the influence of factors such as illumination, eliminating the accumulated error relies on loop-closure detection, and if the motion cannot form a loop, the pose estimation accuracy is only moderate.
In addition, the loop-closure optimization of a visual SLAM system is time-consuming, so the optimized pose is not available in real time, and at the moment a loop closure is detected the pose optimization produces a large jump in order to eliminate the accumulated error, which is unfavorable for the motion control and decision making of the mobile robot. Meanwhile, mapping with a single-view vision sensor is easily limited by the field of view and blocked by occlusions, so the mapping effect is poor.
Therefore, the prior art suffers from the technical problems of low real-time positioning accuracy, low robustness and poor global consistency.
Disclosure of Invention
In view of the above defects or improvement needs of the prior art, the invention provides a mapping method and system fusing GNSS and multi-view vision, thereby solving the technical problems of low real-time positioning accuracy, low robustness and poor global consistency in the prior art.
To achieve the above object, according to an aspect of the present invention, there is provided a mapping method for merging GNSS and multi-view vision, including the following steps:
(1) constraining the update times of the multi-view visual images by the timestamps of the GNSS measurements so that the GNSS measurements are aligned with the multi-view visual images, and then performing pose estimation on the multi-view visual images to obtain the initial key frame pose and the subsequent key frame poses of each view;
(2) converting the current key frame pose and the previous key frame pose at each view into the station-center coordinate system and calculating the relative pose, taking the sum of the current key frame pose and the relative pose as the visual observation of the current key frame, and solving the globally optimized pose of the current key frame at each view with the goal of minimizing the difference between the aligned GNSS measurement of the current key frame and the visual observation of the current key frame;
(3) updating the single-view local map with the globally optimized pose of the current key frame under each single view, and, when the time consistency constraint and the space consistency constraint are satisfied, i.e. the multi-view visual images are aligned, the difference between the key frame depth values of the different views is within the visual error range, and the pose translation of the current key frame of each view relative to the initial key frame is within the view position error range, fusing the multi-view local maps to obtain a globally consistent map.
Further, the spatial consistency constraint is implemented as follows:

|d^F - d^O| < ε_d,    ||t^F - C · t^O|| < ε_p

wherein ε_d is the visual error, d^F is the key frame depth value of a certain view among the multiple views, d^O is the key frame depth value of another view among the multiple views, b is the camera baseline corresponding to the other view, f is the focal length of the camera corresponding to the other view, σ is the disparity between the other view and the certain view, and ε_d is determined by b, f and σ; ε_p is the view position error, 0 < ε_p < 1; t^F is the pose translation of the current key frame relative to the initial key frame at the certain view, t^O is the pose translation of the current key frame of the other view relative to the initial key frame of the certain view, and C is the relative pose of the certain view.
Further, the alignment in step (1) is performed as follows:
the time tolerance is set from the update frequency f of the GNSS measurements as ts = a/f, 0 < a < 1; around the update time t_gnss of the GNSS measurement, the timestamp window [t_gnss - ts, t_gnss + ts] is taken, and if the multi-view visual image is updated within this window, the GNSS measurement is aligned with the multi-view visual image and the subsequent operations are performed.
Further, the alignment in step (3) is performed as follows:
the tolerance time is set from the update frequency f of the visual image of a certain view among the multiple views as ts = a/f, 0 < a < 1; around the update time t of the visual image of that view, the timestamp window [t - ts, t + ts] is taken, and if the visual images of the other views among the multiple views are updated within this window, the other views are aligned with that view and the subsequent operations are performed.
Further, the relative pose is obtained as follows:
the current key frame pose and the previous key frame pose at each view are converted from the local coordinate system into the station-center coordinate system, and the relative pose between the current key frame pose and the previous key frame pose is then calculated. The local coordinate system is the camera coordinate system corresponding to each view, and its z axis is aligned with the station-center coordinate system. The x, y and z axes of the station-center coordinate system point east, north and up respectively; its origin is a point on the global coordinate system, and it is a semi-global coordinate system used to connect the global coordinate system and the local coordinate system. The global coordinate system takes the Earth's center of mass as the origin, its x-y plane coincides with the equatorial plane, its x axis points to the prime meridian and its z axis points to the North Pole.
The station-center coordinate system is also called the site coordinate system or the east-north-up (ENU) coordinate system.
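For illustration only, a small sketch (Python with numpy assumed) of how the relative pose and the visual observation of the current key frame could be formed once both key frame poses are expressed in the ENU frame; the function names, and the reading of the "sum" of the current key frame pose and the relative pose as a composition of the previously optimized pose with the relative pose, are assumptions rather than the exact construction used here:

```python
import numpy as np

def relative_pose(T_prev_enu, T_cur_enu):
    """Relative pose of the current key frame w.r.t. the previous key frame,
    both given as 4x4 homogeneous transforms already expressed in the ENU
    (station-center) coordinate system."""
    return np.linalg.inv(T_prev_enu) @ T_cur_enu

def visual_observation(T_prev_opt_enu, T_prev_enu, T_cur_enu):
    """Visual observation of the current key frame: the previously optimized
    key frame pose composed with the visually estimated relative pose."""
    return T_prev_opt_enu @ relative_pose(T_prev_enu, T_cur_enu)
```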
Further, before solving the globally optimized pose, the method further comprises:
converting the GNSS measurement from the global coordinate system into the station-center coordinate system. The coordinate value of the GNSS measurement in the global coordinate system is [x y z]^T and the corresponding coordinate value in the station-center coordinate system is [e n u]^T; the conversion is:

[e n u]^T = S [Δx Δy Δz]^T

where S is the rotation matrix determined by the longitude λ0 and latitude φ0 of the origin of the station-center coordinate system,

S = [ -sin λ0           cos λ0            0
      -sin φ0 cos λ0   -sin φ0 sin λ0    cos φ0
       cos φ0 cos λ0    cos φ0 sin λ0    sin φ0 ]

and the coordinate difference [Δx Δy Δz]^T is obtained by the following formula:

Δx = x - (N0 + h0) cos φ0 cos λ0
Δy = y - (N0 + h0) cos φ0 sin λ0
Δz = z - (N0 (1 - e^2) + h0) sin φ0

wherein e is the ellipsoid eccentricity of the global coordinate system, N is the radius of curvature of the reference ellipsoid, the longitude, latitude and height of the origin of the station-center coordinate system are λ0, φ0 and h0, and N0 is the radius of curvature of the reference ellipsoid at the origin of the station-center coordinate system.
Further, solving the globally optimized pose further comprises:
taking the difference between the aligned GNSS measurement of the current key frame and the current key frame pose as the GNSS residual term;
taking the difference between the globally optimized pose of the current key frame and the globally optimized pose of the previous key frame as the pose increment after vision-GNSS optimization, taking the difference between the current key frame pose and the previous key frame pose as the pose increment under vision, and taking the difference between the pose increment after vision-GNSS optimization and the pose increment under vision as the visual odometry residual term;
performing pose estimation on the multi-view visual images first; when a GNSS measurement is updated, fusing the visual odometry residual term and the GNSS residual term and solving the least-squares problem to obtain the globally optimized poses of the different views.
Fusing the visual odometry residual term and the GNSS residual term and solving the least-squares problem is equivalent to solving the globally optimized pose with the goal of minimizing the difference between the aligned GNSS measurement of the current key frame and the visual observation of the current key frame.
When the method is applied to a mobile robot, during forward motion of the mobile robot a dynamically changing site is mapped based on the visual images of different views, namely the front, rear, left and right views, to obtain a site map; if the site seen by the front view changes dynamically, the front-view visual image does not participate in mapping and the site is mapped based on the rear-view, left-view and right-view visual images; during backward motion of the mobile robot, the front-view visual image is fused into the site map.
According to another aspect of the present invention, there is provided a mapping system fusing GNSS and multi-view vision, comprising: cameras, a GNSS and a processor;
the number of cameras is plural, the cameras are located at different views and are used for collecting the multi-view visual images;
the GNSS is used for acquiring GNSS measurements;
the processor includes:
the position and orientation estimation module is used for constraining the updating time of the multi-view visual image through the timestamp of the GNSS measured value to enable the GNSS measured value to be aligned with the multi-view visual image, and then performing position and orientation estimation on the multi-view visual image to obtain the initial key frame position and the subsequent key frame position of each view;
the global optimization module is used for converting the current key frame pose and the previous key frame pose at each visual angle into a station center coordinate system, calculating relative poses, taking the sum of the current key frame pose and the relative poses as a visual observation value of the current key frame, and solving the global optimization pose of the current key frame at each visual angle by taking the minimum difference between the aligned current key frame GNSS measurement value and the current key frame visual observation value as a target;
and the mapping module is used for updating the single-view local map by using the global optimization pose of the current key frame under the single view, and meeting time consistency constraint and space consistency constraint when the multi-view visual image is aligned, the difference of the key frame depth values among the multiple views is in a visual error range and the pose translation relation of the current key frame of the multiple views relative to the initial key frame is in a view angle position error range, so as to fuse the multi-view local map and obtain the globally consistent map.
Furthermore, the plurality of cameras may be installed at different positions of a single ground mobile robot, or distributed over a plurality of ground mobile robots, or partly installed on ground mobile robots and partly on aerial unmanned aerial vehicles.
Further, the plurality of cameras may be installed at different positions of a single aerial unmanned aerial vehicle, or distributed over a plurality of aerial unmanned aerial vehicles, or partly installed on aerial unmanned aerial vehicles and partly on ground mobile robots.
Further, when the system is applied to a mobile robot or an aerial unmanned aerial vehicle, during its forward motion a dynamically changing site is mapped based on the visual images of the front, rear, left and right views to obtain a site map; if the site seen by the front view changes dynamically, the front-view visual image does not participate in mapping and the site is mapped based on the rear-view, left-view and right-view visual images; during backward motion of the mobile robot or the aerial unmanned aerial vehicle, the front-view visual image is fused into the site map.
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:
(1) The pose of the multi-view visual images is estimated first; the GNSS and the multi-view vision are then fused in a loosely coupled manner for global pose optimization, so that pose estimation does not depend on loop-closure detection, which improves the robustness and pose estimation accuracy of the system. The global optimization minimizes the difference between the GNSS measurement and the visual observation, which reduces the amount of computation and guarantees short-term pose estimation accuracy, so that one frame can be estimated by vision and one frame solved by global optimization, giving good real-time performance. The pose estimated by the visual SLAM and the GNSS signal are expressed in different reference coordinate systems, so global optimization is performed after coordinate transformation. The timestamps of the GNSS measurements are used to align the visual images of the different views, which guarantees the consistency of the GNSS and the visual image frames; the multi-view local maps are then fused on the premise that the time consistency constraint and the space consistency constraint are satisfied, and a globally consistent map is established. The images of the multiple views compensate each other, avoiding missed loop-closure detections caused by view limitation and by feature loss in a single view.
(2) By constraining the depth values, abnormal depth values in the depth image measured by the camera are effectively removed, so the map information is more credible; the position relation is constrained according to the key frame poses of the different views, and the cameras of the different views are jointly constrained according to their relative positions, so that the maps of the different views are fused and the map has global consistency.
(3) The method aligns the image frames of the different cameras and the GNSS measurements, which is an external alignment between different sensors. The first alignment mainly serves the fused pose estimation of the single-view camera and the GNSS to obtain a real-time key frame pose with higher accuracy; after the high-accuracy key frame pose is obtained, the maps of the different views are aligned, a second alignment is performed according to the update times, the local maps of the different views are updated in real time and fused with the global map, and the global consistency of the local maps of the different views is guaranteed. The first alignment uses the update time and frequency of the GNSS, the second alignment uses the update time and frequency of the cameras, and both alignments are automatic, which effectively improves the robustness of the system.
(4) The coordinate transformation mainly serves the fusion with the GNSS for optimization; unifying the different coordinate systems facilitates solving the least-squares problem. In the map fusion stage, the transformation and alignment between the different camera coordinate systems guarantee the global consistency of the maps of the different views. Meanwhile, the spherical coordinate system is not suited to planning and controlling the planar motion of the robot, and transforming into the ENU (station-center) plane coordinate system benefits the motion planning and control of the robot.
(5) GNSS constraints and visual feature constraints are fused in a loosely coupled manner: when no GNSS signal is updated, the pose estimation of the visual odometry dominates; when a GNSS signal is updated, the two are fused for global pose optimization, so that pose estimation does not depend on loop-closure detection and the robustness and pose estimation accuracy of the system are improved. During optimization, the visual odometry residual term is constructed from the pose increment after vision-GNSS optimization and the pose increment under vision, and the GNSS residual term is constructed from the current key frame pose and the GNSS measurement, which reduces the amount of computation and guarantees the short-term accuracy of the visual odometry pose estimation, so that the visual odometry can estimate one frame while the global optimization solver optimizes one frame, giving good real-time performance.
(6) The invention builds the map from multi-view visual images, and the images of different views can be flexibly configured for fusion to adapt to dynamic changes of the site. If the site changes dynamically because of the robot's operation while the robot moves forward, the front-view visual images do not participate in mapping, to prevent interference with mapping and loss of mapping precision and accuracy, and the site is mapped based on the visual images of the other views, i.e. the rear, left and right views; while the robot moves backward, the front-view visual images are added to sense the changed forward scene, the map of the dynamic site area is updated incrementally based on the fusion of the front, rear, left and right visual images of the different views, and the updated site map is used for the robot's next path planning, which improves the adaptability of mapping to a dynamically changing site.
Drawings
Fig. 1 is a flowchart of a mapping method for merging GNSS and multi-view vision according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of coordinate system transformation provided by an embodiment of the present invention;
fig. 3 (a) is a sequence 00 mapping result provided by the embodiment of the present invention;
fig. 3 (b) is a sequence 09 mapping result provided by the embodiment of the present invention;
FIG. 3 (c) is a sequence 08 mapping result provided by an embodiment of the present invention;
fig. 3 (d) is a sequence 10 mapping result provided by the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the respective embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention provides a mapping method fusing GNSS and multi-view vision, which can be applied to urban streets, indoor corridors, outdoor construction scenes and the like, achieves high-accuracy pose estimation, and remains robust in environments with poor illumination. The invention aims to improve the accuracy and robustness of the robot pose estimation and to guarantee the real-time availability of the pose when the motion forms no loop. The invention builds the map from multi-view depth visual images and optimizes the key frame poses; the multi-view images compensate each other, improve the time consistency check and the space consistency check of closed-loop frames, and avoid missed loop-closure detections caused by view limitation and by feature loss in a single view. As shown in fig. 1, a mapping method fusing GNSS and multi-view vision includes the following steps:
(1) constraining the update times of the multi-view visual images by the timestamps of the GNSS measurements so that the GNSS measurements are aligned with the multi-view visual images, and then performing pose estimation on the multi-view visual images to obtain the initial key frame pose and the subsequent key frame poses of each view;
(2) converting the current key frame pose and the previous key frame pose at each view into the station-center coordinate system and calculating the relative pose, taking the sum of the current key frame pose and the relative pose as the visual observation of the current key frame, and solving the globally optimized pose of the current key frame at each view with the goal of minimizing the difference between the aligned GNSS measurement of the current key frame and the visual observation of the current key frame;
(3) updating the single-view local map with the globally optimized pose of the current key frame under each single view, and, when the time consistency constraint and the space consistency constraint are satisfied, i.e. the multi-view visual images are aligned, the difference between the key frame depth values of the different views is within the visual error range, and the pose translation of the current key frame of each view relative to the initial key frame is within the view position error range, fusing the multi-view local maps to obtain a globally consistent map.
The alignment is performed as follows:
the time tolerance is set from the update frequency f of the GNSS measurements as ts = a/f, 0 < a < 1; preferably, a is 0.5.
Around the update time t_gnss of the GNSS measurement, the timestamp window [t_gnss - ts, t_gnss + ts] is taken; if the multi-view visual image is updated within this window, the two are regarded as measurements of the same frame, the GNSS measurement is aligned with the multi-view visual image, and the subsequent operations are performed. Otherwise, the GNSS measurement is discarded.
In the invention, the GNSS data update frequency is 20 Hz and the fusion optimization frequency is 10 Hz, so during the subsequent fusion optimization the GNSS frame is the most recently updated value, which guarantees the consistency of the GNSS and the visual image frames.
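A minimal sketch of this alignment rule, assuming Python with timestamps in seconds; the buffer of sorted image timestamps, the function names and the choice a = 0.5 are illustrative assumptions:

```python
# Minimal sketch of GNSS/image timestamp alignment (assumed names and structures).
from bisect import bisect_left

def make_window(t_gnss, f_gnss, a=0.5):
    """Tolerance window [t_gnss - ts, t_gnss + ts] with ts = a / f, 0 < a < 1."""
    ts = a / f_gnss
    return t_gnss - ts, t_gnss + ts

def align_gnss_to_images(t_gnss, image_stamps, f_gnss=20.0, a=0.5):
    """Return the image timestamp that falls inside the GNSS tolerance window,
    or None if no multi-view image was updated inside it (measurement discarded)."""
    lo, hi = make_window(t_gnss, f_gnss, a)
    i = bisect_left(image_stamps, lo)          # image_stamps assumed sorted
    if i < len(image_stamps) and image_stamps[i] <= hi:
        return image_stamps[i]
    return None

# Example: GNSS at 20 Hz, camera frames at 10 Hz.
frames = [0.0, 0.1, 0.2, 0.3]
print(align_gnss_to_images(0.205, frames))     # -> 0.2 (aligned, same frame)
print(align_gnss_to_images(0.155, frames))     # -> None (GNSS measurement discarded)
```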
The GNSS constraint and the visual feature constraint are fused in a loosely coupled manner: when no GNSS position signal is updated, the pose estimation of the visual odometry dominates; when a GNSS position signal is updated, the two are fused for global pose optimization, so that pose estimation does not depend on loop-closure detection and the robustness and pose estimation accuracy of the system are improved.
The pose estimation of the multi-view visual images comprises:
estimating the initial pose with a binocular (stereo) visual SLAM system: the relative motion of two frames is estimated from the pose of a homography-matrix model or a fundamental-matrix model to obtain the initial map points, the initial image frame is a key frame by default, and tracking is then performed with the reference key frame model or the constant-velocity model to obtain the key frame poses.
The initial key frame pose can be used to build the initial map, and the subsequent globally optimized poses are used to determine map points and then update the map.
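As a small illustration of the constant-velocity tracking model mentioned above, a sketch using 4x4 homogeneous keyframe poses (numpy assumed); the function name and the use of full matrices rather than the SLAM system's internal representation are assumptions:

```python
import numpy as np

def predict_pose_constant_velocity(T_prev2, T_prev1):
    """Predict the current keyframe pose from the two previous keyframe poses
    (4x4 homogeneous world-from-camera transforms), assuming the inter-frame
    motion stays constant; the prediction seeds feature tracking."""
    delta = np.linalg.inv(T_prev2) @ T_prev1   # motion from frame k-2 to k-1
    return T_prev1 @ delta                     # apply the same motion again

# Example: keyframe k-2 at x = 0 and k-1 at x = 1 predict keyframe k at x = 2.
T0 = np.eye(4)
T1 = np.eye(4)
T1[0, 3] = 1.0
print(predict_pose_constant_velocity(T0, T1)[:3, 3])   # -> [2. 0. 0.]
```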
The fusion optimization also requires converting the pose estimated by the visual SLAM and the GNSS signal into the same reference coordinate system; a schematic diagram of the coordinate system transformations is shown in FIG. 2. The ECEF (Earth-Centered, Earth-Fixed) global coordinate system takes the Earth's center of mass as the origin, its x-y plane coincides with the equatorial plane, its x axis points to the prime meridian and its z axis points to the North Pole; the ECEF reference coordinate system adopts the WGS84 (World Geodetic System 1984) coordinate system. The x, y and z axes of the ENU (East-North-Up) coordinate system point east, north and up respectively; the origin of the ENU coordinate system is a point on the ECEF coordinate system, and the ENU system is a semi-global coordinate system used to connect the ECEF global coordinate system and the local map coordinate system. The local coordinate system is the relative coordinate system in which the visual SLAM system estimates poses and builds its map, with its z axis aligned with the ENU coordinate system. The camera coordinate system is the relative coordinate system formed by observing the global coordinate system from the camera. The robot chassis control coordinate system is the coordinate system in which the mobile robot makes its decisions, and in the invention it is consistent with the global coordinate system. (·)_WE denotes the ECEF global coordinate system, (·)_WN the ENU semi-global coordinate system, (·)_WL the local coordinate system, (·)_WC the camera coordinate system, and (·)_WR the robot chassis control coordinate system. The transformation between any two of these coordinate systems consists of a rotation and a translation,

T (from W_i to W_j) = [ R  t ; 0  1 ]

where R denotes the rotation matrix from coordinate system W_i to coordinate system W_j and t denotes the position of coordinate system W_i in coordinate system W_j. The invention uses the following conversion relations:

Camera coordinate system (·)_WC and robot chassis coordinate system (·)_WR: the transformation matrix between the two is mainly determined by the mounting position of the camera on the chassis.

Local coordinate system (·)_WL and ENU coordinate system (·)_WN: the transformation matrix between the two is mainly determined by the absolute coordinate value of camera frame 0 in the ENU coordinate system.

ENU coordinate system (·)_WN and ECEF global coordinate system (·)_WE: the transformation matrix between the two is mainly determined by the GPS positioning information (longitude, latitude, elevation) of camera frame 0.
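To make the chaining of these pairwise transforms concrete, a brief sketch with 4x4 homogeneous matrices (numpy assumed); the frame names follow the notation above, while the helper names and composition order are illustrative assumptions:

```python
import numpy as np

def make_T(R, t):
    """Build a 4x4 homogeneous transform from a rotation matrix and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def camera_pose_in_ecef(T_WE_WN, T_WN_WL, T_WL_WC):
    """Chain the pairwise transforms listed above:
    camera pose in the local frame -> ENU frame -> ECEF global frame.
    T_WE_WN : ENU frame expressed in ECEF (GPS fix of camera frame 0)
    T_WN_WL : local SLAM frame expressed in ENU (absolute pose of camera frame 0)
    T_WL_WC : camera pose expressed in the local SLAM frame (visual SLAM output)"""
    return T_WE_WN @ T_WN_WL @ T_WL_WC
```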
When the GNSS and the visual SLAM are fused, the GNSS positioning information and the visual SLAM pose estimate need to be transformed into a common coordinate system so that the computation is convenient. The pose result in the local coordinate system is converted into the ENU coordinate system, and the GNSS positioning information is converted from the WGS84 spherical coordinate system into the ENU plane coordinate system. The WGS84 coordinate system is an ellipsoidal coordinate system with semi-major axis a = 6378137.0 m, semi-minor axis b = 6356752.3142 m and eccentricity e = 0.0818. Suppose the GNSS positioning information gives longitude, latitude and altitude measurements (λ, φ, h) in the WGS84 coordinate system, and the corresponding coordinate value in the ECEF coordinate system is [x y z]^T; the conversion relation between the two is:

x = (N + h) cos φ cos λ
y = (N + h) cos φ sin λ
z = (N (1 - e^2) + h) sin φ

where N is the radius of curvature of the reference ellipsoid, computed as:

N = a / sqrt(1 - e^2 sin^2 φ)

Suppose the coordinate value of the positioning information in the ENU coordinate system is [e n u]^T, the longitude, latitude and altitude of the coordinate origin are λ0, φ0 and h0, and N0 = a / sqrt(1 - e^2 sin^2 φ0) is the corresponding radius of curvature of the reference ellipsoid. The conversion between the WGS84/ECEF coordinate system and the ENU coordinate system is then:

[e n u]^T = S [Δx Δy Δz]^T

where

S = [ -sin λ0           cos λ0            0
      -sin φ0 cos λ0   -sin φ0 sin λ0    cos φ0
       cos φ0 cos λ0    cos φ0 sin λ0    sin φ0 ]

Δx = x - (N0 + h0) cos φ0 cos λ0
Δy = y - (N0 + h0) cos φ0 sin λ0
Δz = z - (N0 (1 - e^2) + h0) sin φ0
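For illustration, a compact sketch of these two conversions (WGS84 geodetic coordinates to ECEF, then ECEF to ENU about the station-center origin) using the standard closed-form expressions quoted above; numpy and the helper names are assumptions:

```python
import numpy as np

A = 6378137.0          # WGS84 semi-major axis (m)
E2 = 0.0818 ** 2       # squared eccentricity (value used in the text)

def lla_to_ecef(lam, phi, h):
    """Longitude lam, latitude phi (radians), height h (m) -> ECEF [x, y, z]."""
    N = A / np.sqrt(1.0 - E2 * np.sin(phi) ** 2)   # radius of curvature
    x = (N + h) * np.cos(phi) * np.cos(lam)
    y = (N + h) * np.cos(phi) * np.sin(lam)
    z = (N * (1.0 - E2) + h) * np.sin(phi)
    return np.array([x, y, z])

def ecef_to_enu(p_ecef, lam0, phi0, h0):
    """ECEF point -> ENU coordinates about the origin (lam0, phi0, h0)."""
    p0 = lla_to_ecef(lam0, phi0, h0)               # ENU origin in ECEF
    d = p_ecef - p0                                # coordinate difference
    S = np.array([
        [-np.sin(lam0),                np.cos(lam0),                 0.0],
        [-np.sin(phi0) * np.cos(lam0), -np.sin(phi0) * np.sin(lam0), np.cos(phi0)],
        [ np.cos(phi0) * np.cos(lam0),  np.cos(phi0) * np.sin(lam0), np.sin(phi0)],
    ])
    return S @ d                                   # [e, n, u]
```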
Provided that the extrinsic calibration of the sensors has been completed, the relative pose obtained by the visual SLAM and the pose of the GNSS positioning sensor can be fused as soon as the initial pose of the camera relative to the ENU coordinate system and the corresponding coordinate value of the GNSS positioning sensor in the ENU coordinate system are obtained; the absolute pose of the mobile robot in the world coordinate system during the actual motion can then be obtained by transformation from the initial coordinate value of the GNSS positioning sensor in the ECEF coordinate system.
The difference between the aligned GNSS measurement of the current key frame and the current key frame pose is taken as the GNSS residual term;
the difference between the globally optimized pose of the current key frame and the globally optimized pose of the previous key frame is taken as the pose increment after vision-GNSS optimization, the difference between the current key frame pose and the previous key frame pose is taken as the pose increment under vision, and the difference between the two increments is taken as the visual odometry residual term;
pose estimation is first performed on the multi-view visual images; when a GNSS measurement is updated, the visual odometry residual term and the GNSS residual term are fused and the least-squares problem is solved to obtain the globally optimized poses of the different views.
Fusing the visual odometry residual term and the GNSS residual term and solving the least-squares problem is equivalent to solving the globally optimized pose with the goal of minimizing the difference between the aligned GNSS measurement of the current key frame and the visual observation of the current key frame.
The globally optimized pose is calculated from the following formula:

χ* = argmin_χ Σ_k || z_k - h(χ_k) ||^2

where χ* denotes the optimal solution of the global optimization pose solver, z_k denotes the GNSS measurement at the k-th key frame, and h(·) is the observation equation obtained from the front-end visual odometry.
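A minimal sketch of the translational part of this least-squares problem, assuming Python with scipy; the residual layout mirrors the visual odometry increment term and the GNSS term described above, but the restriction to positions, the unit weights and all names are simplifying assumptions rather than the solver used by the method:

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(x, p_vo, z_gnss, w_vo=1.0, w_gnss=1.0):
    """Stacked residuals over K keyframe positions (x is the flattened K x 3 array).

    - visual-odometry term: optimized increment minus VO increment between
      consecutive keyframes;
    - GNSS term: GNSS measurement minus optimized keyframe position.
    """
    p = x.reshape(-1, 3)
    r_vo = w_vo * ((p[1:] - p[:-1]) - (p_vo[1:] - p_vo[:-1]))
    r_gnss = w_gnss * (z_gnss - p)
    return np.concatenate([r_vo.ravel(), r_gnss.ravel()])

def optimize_global_poses(p_vo, z_gnss):
    """Solve the least-squares problem; the VO positions serve as the initial guess."""
    sol = least_squares(residuals, p_vo.ravel(), args=(p_vo, z_gnss))
    return sol.x.reshape(-1, 3)

# Tiny example: 3 keyframes, the VO drifts while the GNSS stays near the truth.
p_vo = np.array([[0.0, 0, 0], [1.0, 0, 0], [2.1, 0, 0]])
z_gnss = np.array([[0.0, 0, 0], [1.0, 0, 0], [2.0, 0, 0]])
print(optimize_global_poses(p_vo, z_gnss))
```

In this simplified setting each new key frame only adds a handful of residuals, which is why a frame can be optimized roughly as fast as it is tracked.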
Taking the forward view as the reference, the relative pose relation is obtained and the spatial consistency constraint is applied as follows:

|d^F - d^O| < ε_d,    ||t^F - C · t^O|| < ε_p

where d^F is the key frame depth value obtained by the forward view, d^O is the key frame depth value obtained by the camera of another view, b is the camera baseline, f is the focal length of the camera, σ is the disparity, and the visual error ε_d is determined by b, f and σ; t^F is the pose translation of the current key frame relative to the initial key frame obtained by the forward view, t^O is the pose translation of the current key frame of the other view relative to the initial key frame of the forward view, C is the relative pose estimated at the current view, and ε_p is the view position error. If these relations are satisfied, the other view and the forward view are regarded as satisfying the global consistency constraint, and the maps are fused according to the relative relation of the key frame poses to obtain a globally consistent map.
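Because the error bounds above are given only symbolically, the following sketch (Python, numpy assumed) merely illustrates how the two consistency checks could be combined; the concrete expression used for the depth bound, the fixed position bound, and the treatment of the relative pose C as a rotation matrix are assumptions:

```python
import numpy as np

def spatially_consistent(d_fwd, d_other, t_fwd, t_other, C,
                         b, f, sigma, eps_pos=0.5):
    """Sketch of the two spatial-consistency tests (assumed thresholds).

    d_fwd, d_other : key frame depth values from the forward / other view
    t_fwd, t_other : translation of the current key frame w.r.t. the initial
                     key frame, forward view and other view respectively
    C              : relative pose of the other view w.r.t. the forward view,
                     treated here as a 3x3 rotation matrix
    b, f, sigma    : baseline, focal length and disparity of the other view
    """
    eps_depth = b * f / sigma          # assumed depth-error bound from stereo geometry
    depth_ok = abs(d_fwd - d_other) < eps_depth
    pos_ok = np.linalg.norm(t_fwd - C @ t_other) < eps_pos   # 0 < eps_pos < 1
    return depth_ok and pos_ok
```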
Example 1
Pose estimation experiments were performed on sequences 00, 08, 09 and 10 of the public KITTI data set, where the ground-truth trajectories of sequences 00 and 09 form a loop and those of sequences 08 and 10 do not. The computing platform used is an AMD R7-4800H 16-core CPU and an NVIDIA GeForce RTX 2060 GPU with 6 GB of video memory. The mapping results are shown in fig. 3 (a), 3 (b), 3 (c) and 3 (d), and the experimental results of pose estimation are shown in Table 1.
TABLE 1 Absolute track error results (unit: m) for different sequences of KITTI data sets
The absolute trajectory errors RMSE of the different sequences and their corresponding statistics (mean Mean, median Median, standard deviation STD, minimum Min, maximum Max) are recorded. According to the experimental results, the pose estimation method can effectively improve the pose estimation accuracy both in urban road scenes with loops and in urban road scenes without loops, so that the pose estimation does not depend on loop-closure detection.
Meanwhile, the running time of each thread of the different methods is recorded, giving Table 2. The purely visual pose estimation method comprises three threads: front-end tracking, local mapping and back-end optimization; the method fusing GNSS adds a GNSS optimization thread for global pose optimization. The experimental results show that the back-end optimization threads of sequence 00 and sequence 09 take a long time, so the optimized pose is not real-time, while sequences 08 and 10 form no trajectory loop and perform no back-end optimization. During optimization of the proposed method, only the global pose of the current key frame and the relative pose of the current key frame with respect to the previous key frame produced by the front-end tracking thread need to be added to the solver, taking about 20 ms on average; the front-end tracking thread needs to extract feature points, initialize map points, select key frames and so on from the image information, which takes longer, more than 40 ms on average. The method can therefore track one frame at the front end while the GNSS optimization optimizes one frame, and has good real-time performance while improving the pose estimation accuracy.
TABLE 2 Running time of each thread (unit: ms) for different sequences of the KITTI data set

Data set sequence | Front-end tracking thread | Local mapping thread | Back-end optimization thread | GNSS optimization thread
00 | 42.368 | 185.573 | 6640.289 | 22.710
08 | 42.971 | 176.485 | 0 | 27.290
09 | 41.441 | 157.320 | 3673.453 | 24.127
10 | 40.134 | 151.876 | 0 | 15.679
Example 2
The invention is applied to the field of construction with unmanned engineering machinery, including unmanned bulldozers, unmanned graders, unmanned road rollers and the like. During forward operation of the unmanned engineering machine, the site seen by the forward view changes dynamically because of the construction, so the forward-view visual image does not participate in mapping, to prevent interference with mapping and loss of mapping precision and accuracy; at this time the site is mapped based on the visual images of the other views, i.e. the rear view and the left and right side views. While the unmanned engineering machine reverses, the forward-view visual image is added, the changed forward scene is sensed, the map of the dynamic site area is updated incrementally based on the fusion of the visual images of the different views, and the construction operation is completed so as to adapt to the dynamic changes of the construction site.
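A tiny sketch of the view-selection logic described in this example; the function name, the view labels and the dynamic-front flag are illustrative assumptions rather than the machine's actual control interface:

```python
def active_views(direction, dynamic_front=True):
    """Select which camera views feed the mapping module (illustrative logic only).

    When the machine moves forward and the forward-facing site is being changed
    by the work tool, the front view is excluded; when it reverses, the front
    view is re-admitted so the changed area is re-mapped incrementally."""
    views = {"front", "rear", "left", "right"}
    if direction == "forward" and dynamic_front:
        views.discard("front")
    return views

print(active_views("forward"))   # {'rear', 'left', 'right'}
print(active_views("backward"))  # all four views
```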
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A mapping method fusing GNSS and multi-view vision, characterized by comprising the following steps:
(1) constraining the update times of the multi-view visual images by the timestamps of the GNSS measurements so that the GNSS measurements are aligned with the multi-view visual images, and then performing pose estimation on the multi-view visual images to obtain the initial key frame pose and the subsequent key frame poses of each view;
(2) converting the current key frame pose and the previous key frame pose at each view into the station-center coordinate system and calculating the relative pose, taking the sum of the current key frame pose and the relative pose as the visual observation of the current key frame, and solving the globally optimized pose of the current key frame at each view with the goal of minimizing the difference between the aligned GNSS measurement of the current key frame and the visual observation of the current key frame;
(3) updating the single-view local map with the globally optimized pose of the current key frame under each single view, and, when the time consistency constraint and the space consistency constraint are satisfied, i.e. the multi-view visual images are aligned, the difference between the key frame depth values of the different views is within the visual error range, and the pose translation of the current key frame of each view relative to the initial key frame is within the view position error range, fusing the multi-view local maps to obtain a globally consistent map.
2. The mapping method fusing GNSS and multi-view vision according to claim 1, wherein the spatial consistency constraint is implemented as follows:

|d^F - d^O| < ε_d,    ||t^F - C · t^O|| < ε_p

wherein ε_d is the visual error, d^F is the key frame depth value of a certain view among the multiple views, d^O is the key frame depth value of another view among the multiple views, b is the camera baseline corresponding to the other view, f is the focal length of the camera corresponding to the other view, σ is the disparity between the other view and the certain view, and ε_d is determined by b, f and σ; ε_p is the view position error, 0 < ε_p < 1; t^F is the pose translation of the current key frame relative to the initial key frame at the certain view, t^O is the pose translation of the current key frame of the other view relative to the initial key frame of the certain view, and C is the relative pose of the certain view.
3. The mapping method fusing GNSS and multi-view vision according to claim 1 or 2, wherein the alignment in step (1) is performed as follows:
the time tolerance is set from the update frequency f of the GNSS measurements as ts = a/f, 0 < a < 1; around the update time t_gnss of the GNSS measurement, the timestamp window [t_gnss - ts, t_gnss + ts] is taken, and if the multi-view visual image is updated within this window, the GNSS measurement is aligned with the multi-view visual image and the subsequent operations are performed.
4. The mapping method fusing GNSS and multi-view vision according to claim 1 or 2, wherein the alignment in step (3) is performed as follows:
the tolerance time is set from the update frequency f of the visual image of a certain view among the multiple views as ts = a/f, 0 < a < 1; around the update time t of the visual image of that view, the timestamp window [t - ts, t + ts] is taken, and if the visual images of the other views among the multiple views are updated within this window, the other views are aligned with that view and the subsequent operations are performed.
5. The mapping method fusing GNSS and multi-view vision according to claim 1 or 2, wherein the relative pose is obtained as follows:
the current key frame pose and the previous key frame pose at each view are converted from the local coordinate system into the station-center coordinate system, and the relative pose between the current key frame pose and the previous key frame pose is then calculated; the local coordinate system is the camera coordinate system corresponding to each view, and its z axis is aligned with the station-center coordinate system; the x, y and z axes of the station-center coordinate system point east, north and up respectively, its origin is a point on the global coordinate system, and it is a semi-global coordinate system used to connect the global coordinate system and the local coordinate system; the global coordinate system takes the Earth's center of mass as the origin, its x-y plane coincides with the equatorial plane, its x axis points to the prime meridian and its z axis points to the North Pole.
6. The mapping method fusing GNSS and multi-view vision according to claim 5, wherein before solving the globally optimized pose, the method further comprises:
converting the GNSS measurement from the global coordinate system into the station-center coordinate system. The coordinate value of the GNSS measurement in the global coordinate system is [x y z]^T and the corresponding coordinate value in the station-center coordinate system is [e n u]^T; the conversion is:

[e n u]^T = S [Δx Δy Δz]^T

where S is the rotation matrix determined by the longitude λ0 and latitude φ0 of the origin of the station-center coordinate system,

S = [ -sin λ0           cos λ0            0
      -sin φ0 cos λ0   -sin φ0 sin λ0    cos φ0
       cos φ0 cos λ0    cos φ0 sin λ0    sin φ0 ]

and the coordinate difference [Δx Δy Δz]^T is obtained by the following formula:

Δx = x - (N0 + h0) cos φ0 cos λ0
Δy = y - (N0 + h0) cos φ0 sin λ0
Δz = z - (N0 (1 - e^2) + h0) sin φ0

wherein e is the ellipsoid eccentricity of the global coordinate system, N is the radius of curvature of the reference ellipsoid, the longitude, latitude and height of the origin of the station-center coordinate system are λ0, φ0 and h0, and N0 is the radius of curvature of the reference ellipsoid at the origin of the station-center coordinate system.
7. The mapping method fusing GNSS and multi-view vision according to claim 1 or 2, wherein the multiple views include a front view, a rear view, a left view and a right view; when the method is applied to a mobile robot, during the motion of the mobile robot a dynamically changing site is mapped based on the visual images of the different views around the mobile robot to obtain a site map; if the site seen by the front view changes dynamically, the front-view visual image does not participate in mapping and the site is mapped based on the rear-view, left-view and right-view visual images; during backward motion of the mobile robot, the front-view visual image is fused into the site map.
8. The mapping method for fusing GNSS and multi-view vision according to claim 1 or 2, wherein the global optimization pose solving further comprises:
taking the difference between the aligned current key frame GNSS measurement value and the current key frame pose as a GNSS residual error item;
taking the difference between the global optimization pose of the current key frame and the global optimization pose of the previous key frame as the pose increment after the vision and GNSS optimization, taking the difference between the pose of the current key frame and the pose of the previous key frame as the pose increment under the vision, and taking the difference between the pose increment after the vision and GNSS optimization and the pose increment under the vision as the residual error item of the vision odometer;
firstly, pose estimation is carried out on a visual image with multiple visual angles, when a GNSS measured value is updated, a visual odometer residual error item and a GNSS residual error item are fused, and a least square problem is solved to obtain global optimization poses with different visual angles.
9. A mapping system fusing GNSS and multi-view vision, characterized by comprising: cameras, a GNSS and a processor;
the number of cameras is plural, the cameras are located at different views and are used for acquiring the multi-view visual images;
the GNSS is used for acquiring GNSS measurements;
the processor includes:
the position and orientation estimation module is used for constraining the updating time of the multi-view visual image through the timestamp of the GNSS measured value to enable the GNSS measured value to be aligned with the multi-view visual image, and then performing position and orientation estimation on the multi-view visual image to obtain the initial key frame position and the subsequent key frame position of each view;
the global optimization module is used for converting the current key frame pose and the previous key frame pose at each visual angle into a station center coordinate system, calculating relative poses, taking the sum of the current key frame pose and the relative poses as a visual observation value of the current key frame, and solving the global optimization pose of the current key frame at each visual angle by taking the minimum difference between the aligned current key frame GNSS measurement value and the current key frame visual observation value as a target;
and the mapping module is used for updating the single-view local map by using the global optimization pose of the current key frame under the single view, satisfying time consistency constraint and space consistency constraint when the multi-view visual image is aligned, the difference of the key frame depth values between the multiple views is within the visual error range and the pose translation relation of the current key frame of the multiple views relative to the initial key frame is within the view position error range, and fusing the multi-view local map to obtain the globally consistent map.
10. The mapping system fusing GNSS and multi-view vision according to claim 9, wherein the plurality of cameras are installed at different locations of a single ground mobile robot, or the plurality of cameras are respectively installed on a plurality of ground mobile robots, or part of the plurality of cameras are installed on ground mobile robots and part on aerial drones, or the plurality of cameras are installed at different locations of a single aerial drone, or the plurality of cameras are dispersedly installed on a plurality of aerial drones.
CN202210526760.0A 2022-05-12 2022-05-12 Mapping method and system fusing GNSS and multi-view vision Pending CN114966789A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210526760.0A CN114966789A (en) 2022-05-12 2022-05-12 Mapping method and system fusing GNSS and multi-view vision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210526760.0A CN114966789A (en) 2022-05-12 2022-05-12 Mapping method and system fusing GNSS and multi-view vision

Publications (1)

Publication Number Publication Date
CN114966789A true CN114966789A (en) 2022-08-30

Family

ID=82983368

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210526760.0A Pending CN114966789A (en) 2022-05-12 2022-05-12 Mapping method and system fusing GNSS and multi-view vision

Country Status (1)

Country Link
CN (1) CN114966789A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115866229A (en) * 2023-02-14 2023-03-28 北京百度网讯科技有限公司 Method, apparatus, device and medium for converting view angle of multi-view image
CN115866229B (en) * 2023-02-14 2023-05-05 北京百度网讯科技有限公司 Viewing angle conversion method, device, equipment and medium for multi-viewing angle image
CN116989772A (en) * 2023-09-26 2023-11-03 北京理工大学 Air-ground multi-mode multi-agent cooperative positioning and mapping method
CN116989772B (en) * 2023-09-26 2024-01-02 北京理工大学 Air-ground multi-mode multi-agent cooperative positioning and mapping method

Similar Documents

Publication Publication Date Title
CN110033489B (en) Method, device and equipment for evaluating vehicle positioning accuracy
Alonso et al. Accurate global localization using visual odometry and digital maps on urban environments
CN110068335B (en) Unmanned aerial vehicle cluster real-time positioning method and system under GPS rejection environment
CN101241011B (en) High precision positioning and posture-fixing device on laser radar platform and method
CN114966789A (en) Mapping method and system fusing GNSS and multi-view vision
CN112461210B (en) Air-ground cooperative building surveying and mapping robot system and surveying and mapping method thereof
CN111880207B (en) Visual inertial satellite tight coupling positioning method based on wavelet neural network
CN105160702A (en) Stereoscopic image dense matching method and system based on LiDAR point cloud assistance
CN107014399A (en) A kind of spaceborne optical camera laser range finder combined system joint calibration method
CN112556719B (en) Visual inertial odometer implementation method based on CNN-EKF
CN111815765B (en) Heterogeneous data fusion-based image three-dimensional reconstruction method
CN112734765A (en) Mobile robot positioning method, system and medium based on example segmentation and multi-sensor fusion
CN114693754B (en) Unmanned aerial vehicle autonomous positioning method and system based on monocular vision inertial navigation fusion
CN115468567A (en) Cross-country environment-oriented laser vision fusion SLAM method
CN112179338A (en) Low-altitude unmanned aerial vehicle self-positioning method based on vision and inertial navigation fusion
CN110986888A (en) Aerial photography integrated method
Zhao et al. An ORB-SLAM3 Autonomous Positioning and Orientation Approach using 360-degree Panoramic Video
Ragab et al. Leveraging vision-based structure-from-motion for robust integrated land vehicle positioning systems in challenging GNSS environments
Mounier et al. High-Precision Positioning in GNSS-Challenged Environments: A LiDAR-Based Multi-Sensor Fusion Approach with 3D Digital Maps Registration
Roncella et al. Photogrammetric bridging of GPS outages in mobile mapping
CN113403942A (en) Label-assisted bridge detection unmanned aerial vehicle visual navigation method
CN112001970A (en) Monocular vision odometer method based on point-line characteristics
Fang et al. Marker-based mapping and localization for autonomous valet parking
Gu et al. Surveying and mapping of large-scale 3D digital topographic map based on oblique photography technology
Guo et al. Research on 3D geometric modeling of urban buildings based on airborne lidar point cloud and image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination