CN112833892B - Semantic mapping method based on track alignment - Google Patents

Semantic mapping method based on track alignment

Info

Publication number
CN112833892B
CN112833892B (application CN202011644778.8A)
Authority
CN
China
Prior art keywords
data
semantic
landmark
slam
orb
Prior art date
Legal status
Active
Application number
CN202011644778.8A
Other languages
Chinese (zh)
Other versions
CN112833892A (en)
Inventor
张亮
朱光明
汪火良
蒋得志
Current Assignee
Hangzhou Adaptive Technology Co ltd
Original Assignee
Hangzhou Adaptive Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Adaptive Technology Co ltd filed Critical Hangzhou Adaptive Technology Co ltd
Priority to CN202011644778.8A priority Critical patent/CN112833892B/en
Publication of CN112833892A publication Critical patent/CN112833892A/en
Application granted granted Critical
Publication of CN112833892B publication Critical patent/CN112833892B/en

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/28 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network with correlation of data from several navigational instruments
    • G01C21/30 Map- or contour-matching
    • G01C21/32 Structuring or formatting of map data

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a semantic mapping method based on track alignment, which comprises the following steps: S1, synchronously acquiring RGB data, laser data, IMU data and odometer data along the same running track; S2, through semantic ORB-SLAM, while ORB tracking is carried out on the RGB data, performing target detection on each keyframe image, constructing a triangulation relation, and extracting 3D semantic landmarks; S3, filtering the laser data with a down-sampling operation, and processing the IMU data and odometer data; S4, processing the laser data, the IMU data and the odometer data to construct a 2D grid map; and S5, solving the similarity transformation between the trajectory sequences generated by ORB-SLAM and the grid-mapping algorithm, and fusing the 3D semantic landmarks with the 2D grid map to generate a semantic map. The method can construct an accurate grid map with semantic landmarks.

Description

Semantic mapping method based on track alignment
Technical Field
The invention belongs to the field of semantic map building, and relates to a semantic map building method based on track alignment, which can be used for building a semantic map of a surrounding environment.
Background
SLAM (Simultaneous Localization and Mapping) technology helps a robot gradually construct a map of its surroundings from sensor measurements in an unknown environment, while also localizing the robot within that map as the environment information changes; this enables mobile robots to perform autonomous navigation tasks. Traditional SLAM falls into two categories, laser SLAM and visual SLAM, but both are limited to analyzing the geometric features of the environment, so a robot using a map generated by traditional SLAM can only perform basic functions such as obstacle avoidance and path planning. In recent years, research on semantic SLAM has advanced: with increasingly mature deep-learning-based image processing, semantic information about environmental objects is extracted from images and merged into the SLAM pipeline and its results, which further improves algorithm accuracy and enriches the resulting map. Laser SLAM can generate an environment map that is accurate and well suited for robot navigation thanks to the lidar sensor, but because the lidar obtains point clouds directly by the laser time-of-flight method, semantic information about the environment is hard to extract from it. Visual SLAM uses a camera to acquire an image sequence of the environment for localization and mapping; the sensor data carries richer environment information from which semantic information can be extracted, but it is easily affected by environmental conditions, so the mapping accuracy of visual SLAM is often worse than that of laser SLAM.
Traditional SLAM can only construct a geometric feature map of the environment, so the tasks a robot can complete with such a map are very limited. Semantic SLAM further improves the robot's understanding of the environment and allows it to complete more complex tasks with a semantic map; research on semantic mapping methods that fuse multiple sensors is therefore of great significance in the semantic SLAM field.
Disclosure of Invention
In order to solve the above problems, the present invention provides a semantic mapping method based on track alignment, which adds a target detection process into ORB (Oriented FAST and Rotated BRIEF)-SLAM and extracts 3D semantic landmark information of environmental objects; meanwhile, a high-precision 2D grid map is constructed using lidar data, IMU data and odometer data; finally, the similarity transformation between the two trajectory sequences is solved to align the 3D landmarks with the 2D map and generate a semantic map, thereby solving the problems of traditional SLAM.
The technical scheme adopted by the invention is as follows:
a semantic mapping method based on track alignment comprises the following steps:
s1, synchronously acquiring RGB data, laser data, IMU (inertial measurement unit) data and odometer data along the same running track;
s2, through semantic ORB-SLAM, when ORB tracking is carried out by using RGB data, target detection is carried out on each frame of key frame image, a triangularization relation is constructed according to target detection results between adjacent key frames, and a 3D semantic landmark is extracted;
s3, filtering the laser data by utilizing a down-sampling operation, reducing the calculated amount, and calculating and processing IMU data and mileage count data;
s4, processing the laser data, the IMU data and the odometer data through a Cartogrph algorithm to construct a 2D grid map;
and S5, solving the similarity transformation between the trajectory sequences generated by ORB-SLAM and the Cartographer algorithm, and fusing the 3D semantic landmarks with the 2D grid map to generate a semantic map.
Preferably, in step S1, the RGB data, the laser data, the IMU data and the odometer data start to be acquired synchronously. The final trajectories of the visual SLAM and the laser SLAM are each composed of pose points, and solving the similarity transformation between the trajectories requires pose points that can be put in one-to-one correspondence by their timestamps; since every pose point is solved from the relative motion between adjacent frames, the laser data and the RGB image sequence must be acquired synchronously, and all sensors must move together as a single unit during acquisition so that the finally generated trajectories coincide closely.
Preferably, the step S2 is specifically performed according to the following steps:
s21, completing map initialization of the ORB-SLAM algorithm, enabling the ORB-SLAM algorithm to track the ORB of each frame of RGB image and generate a key frame;
s22, after the key frame is obtained, iteratively updating the 3D pose of the object landmark: firstly, projecting an object landmark (3D cube) serving as a boundary frame into each key frame image, matching the projection results and target detection results pairwise by using a KM algorithm, and matching the cost c (p, D) of a projection boundary frame p and a detected boundary frame D as shown in a formula (1):
c(p, d) = (|l_p - l_d| + |t_p - t_d| + |r_p - r_d| + |b_p - b_d|) / (r_d - l_d)    (1)
wherein p represents a projected bounding box, d represents a detected bounding box, and l, t, r, b represent the left, top, right and bottom edges of the bounding box in the pixel coordinate system, respectively; the denominator normalizes the cost by the width of the detected bounding box, preventing a larger bounding box from dominating the overall matching cost.
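For illustration only, a minimal sketch of this association step is given below; it pairs projected and detected boxes with the Kuhn-Munkres (Hungarian) assignment using the normalized edge-distance cost described above. The function names and the max_cost rejection threshold are assumptions made for the sketch, not values specified by the patent:

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def box_match_cost(p, d):
        # Boxes are (l, t, r, b) in pixel coordinates; the detected box width
        # normalizes the cost so that large boxes do not dominate the assignment.
        p, d = np.asarray(p, dtype=float), np.asarray(d, dtype=float)
        return np.abs(p - d).sum() / (d[2] - d[0])

    def associate(projected, detected, max_cost=2.0):
        # Kuhn-Munkres (Hungarian) assignment between projected landmark boxes
        # and detected boxes; pairs above max_cost are rejected, and their
        # detections may later spawn new object landmarks.
        cost = np.array([[box_match_cost(p, d) for d in detected] for p in projected])
        rows, cols = linear_sum_assignment(cost)
        return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < max_cost]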
s23, triangulating the position of the 3D point of the landmark by a ray intersection extending from the center of the camera to the detected bounding box; the projection point selects the center point of the 2D detection frame, which means that the triangulated 3D point is also at the exact center of the object cubic landmark; after a group of object landmark hypotheses are sampled to be 3D points, a 3D boundary cube is instantiated for each point, and the center of the top surface of the boundary cube is anchored on the 3D point;
After generating a cubic landmark hypothesis for a new object, the new hypothesis is scored; the score S(h) of an object hypothesis h is shown in formula (2):
S(h) = α Σ_{k ∈ K_o} c(f_k(h), δ_k) + Σ_{h′} Γ(h, h′)    (2)
where L is the set of landmarks, H_o the set of hypotheses of o ∈ L, K_o the keyframes containing an associated detection of o, δ_k the associated detection in keyframe k, and f_k(h) the projection of h into keyframe k; α is a scale factor used to weight the first sum, and Γ measures the compatibility of object cubes h and h′ according to their typical context (e.g., a mouse and a keyboard tend to be on the same surface). Under this formula, the lower the computed score, the better the object landmark hypothesis. After scoring every hypothesis of all landmarks, if the best score of a new hypothesis is lower than the score of the existing estimated landmark, the hypothesis with the lowest score replaces the original landmark;
s24, in each iteration, if the bounding box detected during the data association does not match any existing landmark, a new object landmark is instantiated for the detection. The accuracy of the cube at the very beginning is not important because the camera does not move much in subsequent key frames where the projection of the cube is sufficient to establish the data association. Once multiple views of landmarks are obtained through different keyframes, the pose and size scale of object landmarks can be refined gradually through multiple iterations.
Preferably, in step S3, the filtering the laser data specifically includes:
preprocessing the lidar data through a voxel filter in the PCL library: a 3D voxel grid is created over the input point cloud data, and within each voxel all existing points are approximated by their centroid.
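A minimal numpy sketch of this down-sampling operation is given below; it reproduces the centroid-per-voxel behaviour of the PCL voxel filter rather than calling PCL itself, and the leaf size is an assumed value:

    import numpy as np

    def voxel_downsample(points, leaf=0.05):
        # Assign every point to a leaf-sized voxel and replace all points that
        # fall into the same voxel by their centroid.
        idx = np.floor(points / leaf).astype(np.int64)
        _, inverse = np.unique(idx, axis=0, return_inverse=True)
        sums = np.zeros((inverse.max() + 1, points.shape[1]))
        counts = np.zeros(inverse.max() + 1)
        np.add.at(sums, inverse, points)
        np.add.at(counts, inverse, 1)
        return sums / counts[:, None]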
The specific processing of the IMU data is as follows: a quaternion-to-Euler-angle conversion is performed on the IMU data, as shown in formula (3):
α = arctan2(2(q_0 q_1 + q_2 q_3), 1 - 2(q_1² + q_2²))
β = arcsin(2(q_0 q_2 - q_3 q_1))
γ = arctan2(2(q_0 q_3 + q_1 q_2), 1 - 2(q_2² + q_3²))    (3)
wherein q = (q_0, q_1, q_2, q_3) is the quaternion and r = (α, β, γ) is the Euler angle.
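For reference, a small sketch of this conversion (assuming the common w-x-y-z quaternion order and roll-pitch-yaw angles) is:

    import numpy as np

    def quat_to_euler(q0, q1, q2, q3):
        # Formula (3): convert one IMU orientation quaternion to Euler angles.
        alpha = np.arctan2(2.0 * (q0 * q1 + q2 * q3), 1.0 - 2.0 * (q1 * q1 + q2 * q2))
        beta = np.arcsin(np.clip(2.0 * (q0 * q2 - q3 * q1), -1.0, 1.0))
        gamma = np.arctan2(2.0 * (q0 * q3 + q1 * q2), 1.0 - 2.0 * (q2 * q2 + q3 * q3))
        return alpha, beta, gamma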
The odometer data is obtained by equipping the left and right wheels of the mobile robot with speedometers to obtain the forward speed and angular speed of the robot, so the odometer data is acquired directly.
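As an illustration of how the wheel speeds turn into odometry, a minimal differential-drive integration step is sketched below; wheel_base (the distance between the two wheels) and the function name are assumptions made for the sketch:

    import numpy as np

    def wheel_odometry_step(x, y, theta, v_left, v_right, wheel_base, dt):
        # Forward and angular speed from the left/right wheel speedometers,
        # then integrate the planar pose over one time step dt.
        v = 0.5 * (v_left + v_right)
        w = (v_right - v_left) / wheel_base
        x += v * np.cos(theta) * dt
        y += v * np.sin(theta) * dt
        theta += w * dt
        return x, y, theta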
Preferably, in step S4, the 2D occupancy grid map is generated by processing the lidar data, the IMU data and the odometer data with the local SLAM and global SLAM of the Cartographer algorithm.
Preferably, the step S5 is specifically performed according to the following steps:
s51, obtaining a track sequence S generated by a semantic ORB-SLAM algorithm o And a track sequence S generated by the Cartogrier algorithm c Aligning the two sequences by means of time stamps, gradually aligning S o Filtering pose nodes without matching points, and finally generating two new sequences S which have the same number of pose points and are in one-to-one correspondence in sequence c′ And S o′
S52, after obtaining the matched sequences S_c′ and S_o′, solving the similarity transformation between the two sequences so that the transformed error value ε² is minimized, where ε² is given by formula (4):
ε² = (1/n) Σ_{i=1}^{n} ||S_c′(i) - s(R S_o′(i) + t)||²    (4)
wherein s is the scale factor, R the rotation matrix and t the translation vector; the final solution can be obtained by solving this equation with the Umeyama method:
R = U S V^T    (5)
s = tr(D S) / σ_o²    (6)
t = μ_c / s - R μ_o    (7)
where U D V^T is the singular value decomposition of the cross-covariance matrix between the centered sequences S_c′ and S_o′, S is the identity matrix (with its last diagonal entry set to -1 when a reflection must be corrected), σ_o² is the variance of the points of S_o′, and μ_o, μ_c are the means of S_o′ and S_c′.
s53, after obtaining the similarity transformation parameters S, R and t, solving the corresponding position of any point in the environment point cloud generated by the ORB-SLAM in the 2D occupied grid coordinate system, wherein the specific conversion formula is as follows:
p′=s(Rp+t) (8)
wherein p′ is a point in the grid map coordinate system, and p is the corresponding point in the semantic landmark coordinate system.
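A compact numpy sketch of the closed-form Umeyama solution and of formula (8) is given below; it assumes the matched pose positions are stacked as (n, 3) arrays and follows the p′ = s(Rp + t) convention of formula (8), with function and variable names chosen for illustration:

    import numpy as np

    def align_umeyama(src, dst):
        # src: matched ORB-SLAM positions S_o', dst: matched Cartographer positions S_c'.
        mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
        xs, xd = src - mu_s, dst - mu_d
        cov = xd.T @ xs / len(src)                      # cross-covariance matrix
        U, D, Vt = np.linalg.svd(cov)
        S = np.eye(3)
        if np.linalg.det(U) * np.linalg.det(Vt) < 0:    # correct a reflection
            S[2, 2] = -1.0
        R = U @ S @ Vt
        var_src = (xs ** 2).sum() / len(src)            # variance of the source points
        s = np.trace(np.diag(D) @ S) / var_src
        t = mu_d / s - R @ mu_s                         # so that dst ≈ s (R src + t)
        return s, R, t

    def to_grid_frame(points, s, R, t):
        # Formula (8): map semantic-landmark points into the grid-map frame.
        return s * (points @ R.T + t)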
S54, obtaining the correspondence between the 3D semantic landmarks and the 2D grid map through this conversion relation, and combining the two into a single semantic map.
The invention has the beneficial effects that:
(1) The method combines the advantages of laser SLAM and visual SLAM from traditional SLAM: lidar data and RGB data are collected along the same running track; the laser data is then processed with the Cartographer algorithm to generate a 2D grid map, while the ORB-SLAM algorithm constructs cubic semantic landmarks of the objects; finally, the two pose-point sequences generated for the same trajectory are aligned, and the semantic landmarks are mapped onto the grid map with the similarity transformation to form the final semantic map. This exploits both the ability of laser SLAM to construct an accurate 2D map and the ability of visual SLAM to extract semantic information, and solves the problem that traditional SLAM algorithms cannot construct an accurate grid map with semantic landmarks.
(2) The invention can collect a data set in an actual scene with a hardware platform consisting of a sweeping robot and a camera sensor, which gives good flexibility and expandability; fixing the sweeping robot and the camera sensor in the same device effectively avoids inconsistency between the generated trajectory sequences, and the whole system is implemented with the message publish-subscribe model of ROS, so the final system can be conveniently applied to indoor tasks of a mobile robot.
Drawings
FIG. 1 is a process diagram of a semantic mapping method based on track alignment according to the present invention;
FIG. 2 is an overall architecture diagram of the semantic SLAM system based on a four-layer structure, implemented with a sweeping robot and a camera sensor.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
The embodiment of the invention provides a semantic mapping method based on track alignment. Aiming at the problem of content coupling in traditional SLAM development, the invention provides a four-level SLAM development framework and, on this basis, implements a semantic SLAM algorithm based on multi-sensor fusion. As shown in FIGS. 1 and 2, the semantic SLAM method comprises the following steps:
a1, synchronously scanning and acquiring RGB data, laser data, IMU data and odometer data by using a monocular camera and a sweeping robot, wherein the monocular camera is fixed on the sweeping robot and connected with a notebook computer, and continuously acquires RGB images as input of semantic ORB-SLAM in the moving process. And the laser data, IMU data and odometer data collected by the sweeping robot are used as the input of the Cartogrer.
Camera sensor calibration and sensor time alignment are required before the data set is acquired. The purpose of camera calibration is to determine the geometric position relationship between a certain point on the surface of an object in a three-dimensional space and the corresponding point in a two-dimensional image, and a geometric model of camera imaging must be established, namely, internal and external parameters and distortion parameters of a camera are solved. The purpose of sensor time alignment is to enable the timestamps recorded by the nodes of the two tracks to be matched on the same timeline when the track nodes are finally matched.
The camera calibration adopts Zhang Zhengyou's checkerboard calibration method: a checkerboard calibration plate with a known number and size of squares is used, an image coordinate system is established with the upper-left corner as the origin, and a world coordinate system is established with the checkerboard plane as z = 0. Since the points where the black and white squares intersect are all corner points with known coordinates, the transformation matrix between the two coordinate systems can be obtained, and the camera parameters can then be recovered by decomposing the transformation matrix using the orthogonality constraints on the camera's intrinsic and extrinsic parameters.
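A minimal OpenCV sketch of this calibration procedure is given below; the pattern size, the square size and the list of calibration images are assumed values introduced for illustration, not parameters specified by the patent:

    import cv2
    import numpy as np

    pattern, square = (9, 6), 0.025          # inner corners per row/column, square edge in metres
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square   # z = 0 plane

    obj_pts, img_pts = [], []
    for gray in calibration_images:          # iterable of grayscale checkerboard views (assumed)
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)

    # Zhang's method: recover the intrinsics K, distortion coefficients and per-view extrinsics.
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_pts, img_pts, gray.shape[::-1], None, None)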
The sensor time comprises the time of the sweeping robot's built-in system and the system time of the notebook computer. A wireless network card is embedded in the sweeping robot; after the robot is started, the notebook connects to the robot's wireless network so that the two are on the same local area network. Before the robot is driven, the sweeping robot sends 10 UDP packets carrying its current timestamp to the notebook; each time a packet is received, the notebook records a timestamp pair formed by the robot's timestamp and its own current timestamp, and finally the average timestamp difference between the two clocks is calculated from the 10 timestamp pairs.
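The clock-offset estimation can be sketched as follows; the UDP port and the packet layout (one little-endian double carrying the robot timestamp in seconds) are assumptions made for the sketch:

    import socket
    import struct
    import time

    def average_clock_offset(port=9000, n=10):
        # Receive n UDP packets carrying the robot's current timestamp and average
        # the difference against the notebook's clock.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("", port))
        offsets = []
        for _ in range(n):
            data, _addr = sock.recvfrom(64)
            robot_time = struct.unpack("<d", data[:8])[0]   # assumed packet layout
            offsets.append(time.time() - robot_time)
        sock.close()
        return sum(offsets) / len(offsets)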
A2, extracting 3D semantic landmarks from the RGB image sequence with semantic ORB-SLAM involves several steps: extracting keyframes, obtaining target detections and optimizing landmark hypotheses.
And A21, completing map initialization of an ORB-SLAM algorithm, enabling the map initialization to perform ORB tracking on each frame of RGB image, and generating a key frame.
A22, after a keyframe is obtained, the 3D pose of the object landmarks is updated iteratively. First, each object landmark (3D cube) is projected into the keyframe image as a bounding box, and the projections are then matched pairwise with the target detection results using the KM algorithm. The cost c(p, d) of matching a projected bounding box p with a detected bounding box d is given by formula (1):
c(p, d) = (|l_p - l_d| + |t_p - t_d| + |r_p - r_d| + |b_p - b_d|) / (r_d - l_d)    (1)
where p denotes a projected bounding box, d denotes a detected bounding box, and l, t, r, b denote the left, top, right and bottom of the bounding box in the pixel coordinate system, respectively. The purpose of the denominator is to normalize the cost using the width of the detected bounding box, preventing a larger bounding box from dominating the overall matching cost.
A23, triangulating the 3D point locations of the landmarks by ray intersections extending from the camera center to the detected bounding box. The projection point is the center point of the 2D detection frame, which means that the triangulated 3D point is also at the exact center of the object cube landmark. After a group of object landmark hypotheses are sampled to be 3D points, a 3D boundary cube is instantiated for each point, and the center of the top surface of the boundary cube is anchored on the 3D point;
after generating a cubic landmark hypothesis for a new object, the new hypothesis is scored, and the score S (h) of the object hypothesis h is shown in equation (2):
S(h) = α Σ_{k ∈ K_o} c(f_k(h), δ_k) + Σ_{h′} Γ(h, h′)    (2)
where L is the set of landmarks, H_o the set of hypotheses of o ∈ L, K_o the keyframes containing an associated detection of o, δ_k the associated detection in keyframe k, and f_k(h) the projection of h into keyframe k; α is a scale factor used to weight the first sum, and Γ measures the compatibility of the object cubes h and h′ according to their typical context (e.g., the mouse and keyboard tend to be on the same surface). Under this formula, the lower the computed score, the better the object landmark hypothesis. Finally, after scoring each hypothesis of all landmarks, if the best score of the new hypothesis is lower than the score of the existing estimated landmark, the hypothesis with the lowest score replaces the original landmark.
At each iteration, if the bounding box detected during the data association does not match any existing landmarks, a new object landmark is instantiated for the detection. The accuracy of the cube at the very beginning is not important because the camera does not move much in subsequent key frames where the projection of the cube is sufficient to establish the data association. Once multiple views of landmarks are obtained through different keyframes, the pose and size scale of object landmarks can be refined gradually through multiple iterations.
A3, preprocessing the laser data, the IMU data and the odometer data before they are used to construct the grid map.
The lidar data is pre-processed by voxel filters in the PCL library, creating individual 3D voxel grids on the input point cloud data, and then in each voxel, all existing points are approximated by their centroids.
The IMU data requires a quaternion-to-Euler-angle conversion, as shown in formula (3):
α = arctan2(2(q_0 q_1 + q_2 q_3), 1 - 2(q_1² + q_2²))
β = arcsin(2(q_0 q_2 - q_3 q_1))
γ = arctan2(2(q_0 q_3 + q_1 q_2), 1 - 2(q_2² + q_3²))    (3)
wherein q = (q_0, q_1, q_2, q_3) is the quaternion and r = (α, β, γ) is the Euler angle;
and the odometer data is obtained by equipping the left and right wheels of the mobile robot with speedometers to obtain the advancing speed and the angular speed of the robot.
A4, processing the lidar data, the IMU data and the odometer data with the local SLAM and global SLAM of the Cartographer algorithm library embedded in the sweeping robot to generate the 2D occupancy grid map.
A5, after the semantic landmark generation and the grid map construction are completed, combining the two results by means of trajectory alignment.
A51, obtaining the trajectory sequence S_o generated by the semantic ORB-SLAM algorithm and the trajectory sequence S_c generated by the Cartographer algorithm, aligning the two sequences by their timestamps, filtering out of S_o the pose nodes that have no matching point, and finally generating two new sequences S_c′ and S_o′ which have the same number of pose points in one-to-one correspondence.
A52, after obtaining the matched sequences S_c′ and S_o′, solving the similarity transformation between the two sequences so that the transformed error value ε² is minimized, where ε² is given by formula (4):
ε² = (1/n) Σ_{i=1}^{n} ||S_c′(i) - s(R S_o′(i) + t)||²    (4)
wherein s is the scale factor, R the rotation matrix and t the translation vector; solving this equation with the Umeyama method gives the final solution:
R = U S V^T    (5)
s = tr(D S) / σ_o²    (6)
t = μ_c / s - R μ_o    (7)
where U D V^T is the singular value decomposition of the cross-covariance matrix between the centered sequences S_c′ and S_o′, S is the identity matrix (with its last diagonal entry set to -1 when a reflection must be corrected), σ_o² is the variance of the points of S_o′, and μ_o, μ_c are the means of S_o′ and S_c′.
a53, after obtaining the similarity transformation parameters s, R and t, knowing the conversion relation between the two coordinate systems, the corresponding position of any point in the environment point cloud generated by the ORB-SLAM in the 2D occupied grid coordinate system can be obtained. The specific conversion formula is as follows:
p′=s(Rp+t) (8)
where p' is a certain point in the grid map coordinate system and p is a certain point in the semantic landmark coordinate system.
Finally, the correspondence between the 3D semantic landmarks and the 2D grid map is obtained through this conversion relation, and the two are combined into a single semantic map.
Experiments and analysis were carried out on the cubic semantic landmark generation module and the grid mapping module, using a public data set as well as data from an actual scene. The experimental results show that the semantic landmark generation module can accurately generate a cubic box representing the position and size of an object in three-dimensional space from an image sequence acquired from multiple viewpoints, that the grid map construction module can complete the construction and optimization of a grid map in a small scene, and that the map optimization effect is best when multiple sensor data are used at the same time. Finally, a data set was acquired in an indoor scene containing several objects, the trajectory alignment between ORB-SLAM and Cartographer was completed with the semantic SLAM system implemented herein, and a semantic map fusing multiple sensor data was generated for the scene.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (4)

1. A semantic mapping method based on track alignment is characterized by comprising the following steps:
s1, synchronously acquiring RGB data, laser data, IMU data and odometer data along the same running track;
s2, through semantic ORB-SLAM, when ORB tracking is carried out by using RGB data, target detection is carried out on each frame of key frame image, a triangularization relation is constructed according to target detection results between adjacent key frames, and a 3D semantic landmark is extracted;
s3, filtering the laser data by utilizing a down-sampling operation, reducing the calculated amount, and calculating and processing IMU data and mileage count data;
s4, processing the laser data, the IMU data and the odometer data through a Cartogrer algorithm to construct a 2D grid map;
s5, solving a similarity transformation relation between the ORB-SLAM and a track sequence generated by a Cartogrer algorithm, and fusing a 3D semantic landmark and a 2D raster map to generate a semantic map;
the step S2 is specifically performed according to the following steps:
s21, completing map initialization of the ORB-SLAM algorithm, enabling the ORB-SLAM algorithm to track the ORB of each frame of RGB image and generate a key frame;
s22, after the key frame is obtained, iteratively updating the 3D pose of the object landmark: firstly, projecting an object landmark as a boundary frame into each key frame image, then matching the projection results and the target detection results pairwise by using a KM algorithm, wherein the cost c (p, d) of matching a projection boundary frame p and a detected boundary frame d is shown as a formula (1):
c(p, d) = (|l_p - l_d| + |t_p - t_d| + |r_p - r_d| + |b_p - b_d|) / (r_d - l_d)    (1)
wherein p represents a projection bounding box, d represents a detected bounding box, and l, t, r, b represent the left, top, right and bottom of the bounding box in the pixel coordinate system, respectively;
s23, triangulating the position of the 3D point of the landmark by the intersection point of the light rays extending from the center of the camera to the detected bounding box; after a group of object landmark hypotheses are sampled to be 3D points, a 3D boundary cube is instantiated for each point, and the center of the top surface of the boundary cube is anchored on the 3D point;
after generating a cubic landmark hypothesis of a new object, scoring the new hypothesis, wherein the score S (h) of the object hypothesis h is shown in formula (2):
S(h) = α Σ_{k ∈ K_o} c(f_k(h), δ_k) + Σ_{h′} Γ(h, h′)    (2)
where L is the set of landmarks, H_o the set of hypotheses of o ∈ L, K_o the keyframes containing an associated detection of o, δ_k the associated detection in keyframe k, and f_k(h) the projection of h into keyframe k; α is a scale factor used to weight the first sum, and Γ measures the compatibility of object cubes h and h′ according to their typical context; after scoring each hypothesis of all landmarks, if the best score of the new hypothesis is lower than the score of the existing estimated landmark, the hypothesis with the lowest score replaces the original landmark;
s24, in each iteration, if the bounding box detected during data association does not match any existing landmark, instantiating a new object landmark for the detection;
the step S5 is specifically performed according to the following steps:
s51, obtaining a track sequence S generated by a semantic ORB-SLAM algorithm o And a track sequence S generated by the Cartogrier algorithm c Aligning the two sequences by means of time stamps, gradually aligning S o Filtering pose nodes without matching points, and finally generating two new sequences S which have the same number of pose points and are in one-to-one correspondence in sequence c′ And S o′
S52, after obtaining the matched sequences S_c′ and S_o′, solving the similarity transformation between the two sequences so that the transformed error value ε² is minimized, where ε² is given by formula (4):
ε² = (1/n) Σ_{i=1}^{n} ||S_c′(i) - s(R S_o′(i) + t)||²    (4)
wherein s is the scale factor, R the rotation matrix and t the translation vector; the final solution can be obtained by solving this equation with the Umeyama method:
R = U S V^T    (5)
s = tr(D S) / σ_o²    (6)
t = μ_c / s - R μ_o    (7)
where U D V^T is the singular value decomposition of the cross-covariance matrix between the centered sequences S_c′ and S_o′, S is the identity matrix (with its last diagonal entry set to -1 when a reflection must be corrected), σ_o² is the variance of the points of S_o′, and μ_o, μ_c are the means of S_o′ and S_c′;
s53, after obtaining the similarity transformation parameters S, R and t, solving the corresponding position of any point in the environment point cloud generated by the ORB-SLAM in the 2D occupied grid coordinate system, wherein the specific conversion formula is as follows:
p′=s(Rp+t) (8)
wherein p' is a certain point in the grid map coordinate system, and p is a certain point in the semantic landmark coordinate system;
and S54, obtaining the correspondence between the 3D semantic landmarks and the 2D grid map through this transformation relation, and combining the two into a single semantic map.
2. The semantic mapping method based on trajectory alignment according to claim 1, wherein in step S1, the RGB data, laser data, IMU data and odometer data are synchronously acquired, and all sensors move together as a single unit during acquisition.
3. The semantic mapping method based on track alignment according to claim 1, wherein in the step S3, the filtering the laser data specifically comprises:
preprocessing the laser radar data through a voxel filter in a PCL library: creating a 3D voxel grid on the input point cloud data, then in each voxel, all existing points are approximated by their centroids;
the specific calculation processing of the IMU data is as follows: performing a quaternion to euler angle conversion on the IMU data, as shown in equation (3):
α = arctan2(2(q_0 q_1 + q_2 q_3), 1 - 2(q_1² + q_2²))
β = arcsin(2(q_0 q_2 - q_3 q_1))
γ = arctan2(2(q_0 q_3 + q_1 q_2), 1 - 2(q_2² + q_3²))    (3)
wherein q = (q_0, q_1, q_2, q_3) is the quaternion and r = (α, β, γ) is the Euler angle;
and the odometer data is obtained by equipping speedometers at the left and right wheels of the mobile robot to obtain the advancing speed and the angular speed of the robot.
4. The semantic mapping method based on trajectory alignment as claimed in claim 3, wherein in step S4, the lidar data, IMU data and odometry data are processed by means of the local SLAM and global SLAM of the Cartographer algorithm to generate a 2D occupancy grid map.
CN202011644778.8A 2020-12-31 2020-12-31 Semantic mapping method based on track alignment Active CN112833892B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011644778.8A CN112833892B (en) 2020-12-31 2020-12-31 Semantic mapping method based on track alignment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011644778.8A CN112833892B (en) 2020-12-31 2020-12-31 Semantic mapping method based on track alignment

Publications (2)

Publication Number Publication Date
CN112833892A CN112833892A (en) 2021-05-25
CN112833892B true CN112833892B (en) 2022-12-16

Family

ID=75927230

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011644778.8A Active CN112833892B (en) 2020-12-31 2020-12-31 Semantic mapping method based on track alignment

Country Status (1)

Country Link
CN (1) CN112833892B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113240755B (en) * 2021-07-12 2021-09-21 中国海洋大学 City scene composition method and system based on street view image and vehicle-mounted laser fusion
CN114782626B (en) * 2022-04-14 2024-06-07 国网河南省电力公司电力科学研究院 Transformer substation scene map building and positioning optimization method based on laser and vision fusion
CN115496923B (en) * 2022-09-14 2023-10-20 北京化工大学 Multi-mode fusion target detection method and device based on uncertainty perception
CN115388880B (en) * 2022-10-27 2023-02-03 联友智连科技有限公司 Low-cost parking map construction and positioning method and device and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020156923A2 (en) * 2019-01-30 2020-08-06 Harman Becker Automotive Systems Gmbh Map and method for creating a map
CN111665842A (en) * 2020-06-09 2020-09-15 山东大学 Indoor SLAM mapping method and system based on semantic information fusion

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107036594A (en) * 2017-05-07 2017-08-11 郑州大学 The positioning of intelligent Power Station inspection intelligent body and many granularity environment perception technologies
CN109376785B (en) * 2018-10-31 2021-09-24 东南大学 Navigation method based on iterative extended Kalman filtering fusion inertia and monocular vision
CN111263308A (en) * 2020-01-15 2020-06-09 上海交通大学 Positioning data acquisition method and system
CN111539994B (en) * 2020-04-28 2023-04-18 武汉科技大学 Particle filter repositioning method based on semantic likelihood estimation
CN111798475B (en) * 2020-05-29 2024-03-22 浙江工业大学 Indoor environment 3D semantic map construction method based on point cloud deep learning

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020156923A2 (en) * 2019-01-30 2020-08-06 Harman Becker Automotive Systems Gmbh Map and method for creating a map
CN111665842A (en) * 2020-06-09 2020-09-15 山东大学 Indoor SLAM mapping method and system based on semantic information fusion

Also Published As

Publication number Publication date
CN112833892A (en) 2021-05-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221116

Address after: No. 407-7, Shengxing Road, Ningwei Street, Xiaoshan District, Hangzhou, Zhejiang 310000

Applicant after: Hangzhou Adaptive Technology Co.,Ltd.

Address before: 311215 room b1-3-034, No. 198, Qidi Road, economic and Technological Development Zone, Xiaoshan District, Hangzhou City, Zhejiang Province

Applicant before: Hangzhou purevision Technology Co.,Ltd.

GR01 Patent grant