Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a large-model-based automatic registration method for SLAM laser point clouds and panoramic images, solving the problem in the prior art of low accuracy of the panoramic image pose obtained by registering the image point cloud with the laser point cloud.
In order to achieve the above object, the present invention provides the following solutions:
a SLAM laser point cloud and panoramic image automatic registration method based on a large model comprises the following steps:
calculating the pose of the panoramic camera to the SLAM system;
determining an exposure time interval of the panoramic camera;
determining the pose corresponding to the panoramic image captured by the panoramic camera according to the exposure time interval;
obtaining a corresponding panoramic depth map set according to the pose corresponding to the panoramic image;
based on a panoramic image depth estimation algorithm, obtaining an optimal panoramic depth map according to the panoramic depth map set;
and obtaining an optimal matching point cloud of the panoramic camera and the SLAM system according to the optimal panoramic depth map.
Preferably, the calculating the pose of the panoramic camera to the SLAM system includes:
acquiring a truth point cloud of a calibration room;
keeping the SLAM system static at a middle position of the calibration room to scan and photograph, obtaining a laser point cloud of the SLAM system and a panoramic image of the panoramic camera;
performing ICP registration on the laser point cloud and the truth point cloud to obtain the pose of the SLAM system under the truth point cloud coordinate system;
obtaining the pose of the panoramic image under the truth point cloud coordinate system according to the panoramic image and the truth point cloud;
and obtaining the pose from the panoramic camera to the SLAM system according to the laser point cloud of the SLAM system, the panoramic image of the panoramic camera, and the pose of the panoramic image under the truth point cloud coordinate system.
Preferably, the exposure time interval of the panoramic camera is determined to be 100 milliseconds to 1000 milliseconds.
Preferably, the determining, according to the exposure time interval, the pose corresponding to the panoramic image captured by the panoramic camera includes:
acquiring the positions and attitudes of the SLAM system corresponding to a first preset time and a second preset time according to the exposure time interval;
determining the pose of the SLAM system at a certain moment between the first preset time and the second preset time by interpolation, according to the SLAM positions and attitudes corresponding to the first preset time and the second preset time;
and obtaining the pose corresponding to the panoramic image according to the pose of the SLAM system and the pose from the panoramic camera to the SLAM system.
Preferably, the pose of the SLAM system includes:
a position component and a rotation component.
Preferably, the position component is obtained by linear interpolation.
Preferably, the rotation component is obtained by spherical linear interpolation.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
The invention provides a SLAM laser point cloud and panoramic image automatic registration method based on a large model, which comprises the following steps: calculating the pose of the panoramic camera to the SLAM system; determining an exposure time interval of the panoramic camera; determining the pose corresponding to the panoramic image captured by the panoramic camera according to the exposure time interval; obtaining a corresponding panoramic depth map set according to the pose corresponding to the panoramic image; based on a panoramic image depth estimation algorithm, obtaining an optimal panoramic depth map according to the panoramic depth map set; and obtaining an optimal matching point cloud of the panoramic camera and the SLAM system according to the optimal panoramic depth map. According to the invention, the exposure time interval is first determined, and a large-model-based deep learning technique is then used within that interval to select the pose with the minimum error, thereby determining the optimal pose of the panorama and improving the accuracy of point cloud registration between the panoramic image and the SLAM system.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention aims to provide an automatic registration method of SLAM laser point cloud and panoramic image based on a large model, which solves the problem of low accuracy of panoramic image pose obtained by registering the image point cloud and the laser point cloud in the prior art.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
As shown in fig. 2, the invention provides a large model-based automatic registration method of a SLAM laser point cloud and a panoramic image, which comprises the following steps:
step 100: calculating the pose of the panoramic camera to the SLAM system;
step 200: determining an exposure time interval of the panoramic camera;
step 300: determining the pose corresponding to the panoramic image captured by the panoramic camera according to the exposure time interval;
step 400: obtaining a corresponding panoramic depth map set according to the pose corresponding to the panoramic image;
step 500: based on a panoramic image depth estimation algorithm, obtaining an optimal panoramic depth map according to the panoramic depth map set;
step 600: obtaining an optimal matching point cloud of the panoramic camera and the SLAM system according to the optimal panoramic depth map.
Further, the calculating the pose of the panoramic camera to the SLAM system includes:
acquiring a truth point cloud of a calibration room;
keeping the SLAM system static at a middle position of the calibration room to scan and photograph, obtaining a laser point cloud of the SLAM system and a panoramic image of the panoramic camera;
performing ICP registration on the laser point cloud and the truth point cloud to obtain the pose of the SLAM system under the truth point cloud coordinate system;
obtaining the pose of the panoramic image under the truth point cloud coordinate system according to the panoramic image and the truth point cloud;
and obtaining the pose from the panoramic camera to the SLAM system according to the laser point cloud of the SLAM system, the panoramic image of the panoramic camera, and the pose of the panoramic image under the truth point cloud coordinate system.
Specifically: ① a hexahedral room is selected, and a plurality of two-dimensional codes of no less than 20 cm × 20 cm are printed and attached to different walls of the room, taking care to stagger their heights; an industrial-grade fixed-station laser scanner (such as Faro, Z+F, etc.) is used to scan the room, obtaining the truth point cloud of the calibration field. ② The SLAM system is kept static at a middle position of the calibration room to scan and photograph, and the laser point cloud and the panoramic image are recorded. ③ ICP registration is performed between the obtained SLAM point cloud and the truth point cloud, giving the pose of the laser under the truth point cloud coordinate system, recorded as M_laser. ④ More than 4 pairs of homonymous points are selected on the gray-scale image of the panoramic image (on which the codes remain identifiable) and the truth point cloud, and equations are established to solve the pose of the panoramic image under the truth point cloud coordinate system, recorded as M_pano. ⑤ From these two poses, the pose from the panoramic camera to the multi-line laser is calculated as: M0 = M_laser⁻¹ · M_pano.
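The pose chain of step ⑤ can be sketched as follows. This is a minimal illustration, not the patented implementation; the function name, the matrix names, and the 4×4 homogeneous-matrix representation of the two truth-frame poses are assumptions.

```python
import numpy as np

def camera_to_laser(M_laser, M_pano):
    """Relative pose of the panoramic camera in the laser (SLAM) frame.

    M_laser: 4x4 pose of the SLAM laser under the truth point cloud frame (from ICP).
    M_pano:  4x4 pose of the panoramic image under the same frame (from the
             homonymous-point solution). Returns M0 satisfying M_pano = M_laser @ M0.
    """
    return np.linalg.inv(M_laser) @ M_pano

# Toy check: laser at the origin, camera 0.1 m above it along z.
M_laser = np.eye(4)
M_pano = np.eye(4)
M_pano[2, 3] = 0.1
M0 = camera_to_laser(M_laser, M_pano)
```

Composing M_laser with the returned M0 reproduces M_pano, which is the consistency property the calibration relies on.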
Here M0 is the initial spatial transformation between the panoramic camera and the SLAM system, and it remains unchanged thereafter.
Further, the exposure time interval of the panoramic camera is determined to be 100 milliseconds to 1000 milliseconds.
Specifically, the exposure time of the panoramic image in the SLAM system cannot be determined exactly, but it can be confined to a certain time range, namely the exposure delay. A camera exposure instruction is sent from the SLAM system, and the camera completes the exposure within a controllable time range that depends on the ambient light and the preset camera parameters. Tests show that 90% of exposures complete within 300 to 600 milliseconds, so the delay is set to 100 to 1000 milliseconds; this interval fully covers the real exposure time.
Further, determining, according to the exposure time interval, the pose corresponding to the panoramic image captured by the panoramic camera includes:
acquiring the positions and attitudes of the SLAM system corresponding to a first preset time and a second preset time according to the exposure time interval;
determining the pose of the SLAM system at a certain moment between the first preset time and the second preset time by interpolation, according to the SLAM positions and attitudes corresponding to the first preset time and the second preset time;
and obtaining the pose corresponding to the panoramic image according to the pose of the SLAM system and the pose from the panoramic camera to the SLAM system.
Specifically, the SLAM trajectory (pose) at any exposure time is obtained:
the data format of the SLAM track is:
time, xyz position, 3×3 rotation matrix
Therefore, the pose of the panoramic camera can be obtained once the exposure time is determined. If the SLAM system sends an exposure instruction to the panoramic camera at time t, the true exposure necessarily falls within the interval [t + 100, t + 1000] (time unit: millisecond).
Let the exact exposure time of the panoramic camera be t_e. From the SLAM track, find time1 and time2 such that time1 ≤ t_e ≤ time2, acquire the SLAM positions and attitudes corresponding to time1 and time2, and perform interpolation to calculate the pose of the SLAM system at time t_e, denoted M_slam. Through the initial spatial transformation M0 between the panoramic camera and the SLAM coordinate system (see step 100), the pose corresponding to the panoramic image can then be obtained: M_pano = M_slam · M0.
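The bracketing lookup on the SLAM track can be sketched as below. The timestamps, function names, and the choice of `bisect` for the sorted-track search are illustrative assumptions; only the position component is interpolated here (the rotation component is handled by Slerp, discussed next).

```python
import bisect

def bracket(times, t_e):
    """Find (time1, time2) in a sorted SLAM track with time1 <= t_e <= time2."""
    j = bisect.bisect_left(times, t_e)
    if j == 0 or j == len(times):
        raise ValueError("exposure time outside the recorded track")
    return times[j - 1], times[j]

def lerp(p1, p2, r):
    """Linearly interpolate the xyz position component with weight r."""
    return [a + r * (b - a) for a, b in zip(p1, p2)]

times = [0, 100, 200, 300]      # hypothetical track timestamps (milliseconds)
t1, t2 = bracket(times, 150)    # brackets an assumed exposure time t_e = 150
r = (150 - t1) / (t2 - t1)      # interpolation weight in [0.0, 1.0]
```

With the track above, `bracket` returns the pair (100, 200) and r = 0.5, so the interpolated position lies halfway between the two recorded poses.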
The pose of the SLAM system comprises:
a position component and a rotation component.
Specifically, the position component can be obtained by direct linear interpolation, while the rotation component must be obtained by spherical linear interpolation (Slerp).
For the rotation component, the rotation matrices are first converted into quaternions before spherical interpolation: let the rotation matrices T1 and T2 be converted into quaternions Q1 and Q2, as shown in fig. 3.
The rotation Qi corresponding to the current time is:
Qi = sin((1 − r)·θ) / sin θ · Q1 + sin(r·θ) / sin θ · Q2;
wherein: θ = arccos(Q1 · Q2).
r represents the weight of the current frame within the adjacent transformation, with a value range of 0.0 to 1.0, and is calculated as:
r = (i − i1) / (i2 − i1),
where i is the current frame number, i1 is the start frame number of the segment, and i2 is the end frame number of the segment.
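A minimal implementation of this spherical linear interpolation, assuming unit quaternions stored as (w, x, y, z) lists; the short-arc flip and the nearly-parallel fallback are standard numerical safeguards added for this sketch:

```python
import math

def slerp(q1, q2, r):
    """Spherical linear interpolation between unit quaternions q1 and q2:
    Qi = sin((1 - r) * theta) / sin(theta) * Q1 + sin(r * theta) / sin(theta) * Q2,
    with cos(theta) = Q1 . Q2 and weight r in [0.0, 1.0]."""
    dot = sum(a * b for a, b in zip(q1, q2))
    if dot < 0.0:                         # flip to take the shorter arc
        q2, dot = [-c for c in q2], -dot
    if dot > 0.9995:                      # nearly parallel: linear fallback
        out = [a + r * (b - a) for a, b in zip(q1, q2)]
        n = math.sqrt(sum(c * c for c in out))
        return [c / n for c in out]
    theta = math.acos(dot)
    w1 = math.sin((1 - r) * theta) / math.sin(theta)
    w2 = math.sin(r * theta) / math.sin(theta)
    return [w1 * a + w2 * b for a, b in zip(q1, q2)]
```

For two quaternions 90° apart and r = 0.5, the result is the unit quaternion exactly halfway along the great-circle arc, which simple linear interpolation would not give.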
And (5) solving the pose of the SLAM track at any moment, namely obtaining the pose of the panoramic image at any moment.
Furthermore, the invention uses a large model with laser point cloud supervision to calculate all candidate poses within the preset interval, thereby obtaining the optimal moment and the optimal panoramic pose.
As shown in figs. 4-5, depth prediction first projects the panorama in equidistant cylindrical (equirectangular) form, maps the panorama to multiple patches at multiple angles so that each patch has negligible distortion, feeds each patch to the network to predict depth, and merges all patch predictions back into one equirectangular projection. To ensure the continuity of depth prediction across different patches, a semantic segmentation branch is introduced, and consistent depth prediction is achieved under the supervision of semantic information.
Depth estimation branch: this branch performs dense global depth estimation based on the cylindrical projection of the panoramic image. It uses an encoder-decoder architecture; the encoder extracts strong features using a vision transformer (ViT) initialized from the DINOv2 large model as the backbone network.
Semantic segmentation branch: this branch performs semantic segmentation of the panoramic image based on its hexahedral (cube-map) projection. It adopts the same backbone as the depth estimation branch with shared weights, while the two branches do not share weights in their task heads.
Decoder: consists of a reassemble module and a fusion module. The reassemble module maps, concatenates, and projects the outputs of the transformer blocks, resampling them at different spatial resolutions. The fusion module fuses the output features of the reassemble module using residual convolution units and upsamples the feature map by a factor of 2.
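The data flow of the fusion step can be sketched as follows. This is only a shape-level illustration: an identity function stands in for the residual convolution unit, and nearest-neighbour repetition stands in for the network's learned 2x upsampling.

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def fuse(reassembled, previous, conv=lambda f: f):
    """Residual fusion followed by 2x upsampling; `conv` is an identity
    stand-in for the residual convolution unit described above."""
    return upsample2x(conv(reassembled + previous))

# An 8-channel 4x4 map fused with a skip branch becomes an 8-channel 8x8 map.
out = fuse(np.zeros((8, 4, 4)), np.ones((8, 4, 4)))
```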
Loss function:
(1) The depth estimation branch adopts an inverse Huber (berHu) loss:
B(d) = |d|, if |d| ≤ c; B(d) = (d² + c²) / (2c), if |d| > c.
This loss is an L1 loss when |d| is smaller than c and an L2 loss when |d| is larger than c. Here d is the difference between the predicted depth and the true depth of each pixel at each gradient descent step, i indexes all pixels of each image in each batch, and c = 0.2 · max_i |d_i|, i.e. 20% of the maximum error per batch.
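The berHu loss above can be sketched directly; the batched-mean reduction and the function name are assumptions of this illustration.

```python
import numpy as np

def berhu_loss(pred, gt):
    """Inverse Huber (berHu) loss: L1 where |d| <= c, (d^2 + c^2) / (2c)
    where |d| > c, with c = 0.2 * max|d| over the batch."""
    d = np.abs(pred - gt)
    c = 0.2 * d.max()
    if c == 0:
        return 0.0          # perfect prediction
    per_pixel = np.where(d <= c, d, (d ** 2 + c ** 2) / (2 * c))
    return per_pixel.mean()
```

Note that the two branches agree at |d| = c, since (c² + c²) / (2c) = c, so the loss is continuous where the L1 and L2 regions meet.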
(2) The semantic segmentation branch adopts a cross entropy loss function to calculate the cross entropy Ls of the prediction segmentation result and the true value segmentation result.
The total loss function is: L = L_D + L_S, where L_D is the depth estimation loss and L_S is the semantic segmentation cross entropy.
pre-training model and dataset
The pre-training model adopted by the method is DINOv2, and the depth prediction training set is the NYU Depth dataset, which contains both depth data and semantic label data.
Further, the panoramic depth map is registered with the laser point cloud.
The invention also provides another embodiment:
As shown in fig. 6, image preprocessing: let the exposure time interval of the panoramic image be [t + 100, t + 1000] milliseconds, let the panoramic pose corresponding to a moment t_i in this interval (see step 300) be M, and generate a point cloud depth map from the result point cloud.
a) Create a virtual image of the same size as the panoramic image (for fast calculation it may be scaled down by a certain factor) and use it as the depth map, denoted depthMap.
b) Calculate the coordinates of each laser point P(x, y, z) on the panoramic sphere.
c) In the panoramic spherical coordinate system (O being the origin of coordinates, i.e. (0, 0, 0)), calculate the distance r_i = |OP| and the corresponding spherical longitude and latitude (w_u, h_v).
d) The depth map may then be expressed as depthMap(w_u, h_v) = r_i, where (w_u, h_v) must be scaled to the preset depth map size.
e) When several laser points correspond to the same (w_u, h_v), the smaller depth is kept.
All laser point clouds in the scene are processed with the above algorithm, generating the point cloud depth map. For fast calculation, only laser points within a certain distance threshold of the position at moment t_i, for example within 50 meters, may be selected to participate in the calculation.
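Steps b) through e) can be sketched as below. The longitude/latitude conventions (longitude from atan2(y, x), latitude from asin(z / r)) and the pixel-scaling formula are assumptions chosen for illustration; the patent does not fix a particular convention.

```python
import math

def project_point(x, y, z, width, height):
    """Map a laser point P(x, y, z) in the panoramic-sphere frame to pixel
    (w_u, h_v) of an equirectangular depth map, returning the depth r_i."""
    r = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(y, x)                              # [-pi, pi]
    lat = math.asin(z / r)                              # [-pi/2, pi/2]
    w_u = int((lon + math.pi) / (2 * math.pi) * (width - 1))
    h_v = int((math.pi / 2 - lat) / math.pi * (height - 1))
    return w_u, h_v, r

def depth_map(points, width, height):
    """Rasterise points; on collisions keep the smaller depth (step e)."""
    dm = {}
    for x, y, z in points:
        w_u, h_v, r = project_point(x, y, z, width, height)
        if (w_u, h_v) not in dm or r < dm[(w_u, h_v)]:
            dm[(w_u, h_v)] = r
    return dm
```

Two points on the same ray, e.g. (1, 0, 0) and (2, 0, 0), land on the same pixel, and the nearer depth survives, matching rule e).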
Feature extraction: corner features are extracted using techniques such as SIFT and FAST.
Feature matching: frequently, nearest neighbor matching is utilized, random sampling consistency is utilized, and the spatial position relation is utilized to remove mismatching points
After the matching point pairs are obtained, estimating a depth image transformation matrix by using a gradient descent method or a Gaussian-Newton method.
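The nearest-neighbor stage can be sketched with a ratio test, a common way to reject ambiguous matches before RANSAC; the descriptor arrays, function name, and the 0.8 threshold are assumptions of this sketch, not values fixed by the method.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour matching with a ratio test; desc_b needs >= 2 rows.
    A match (i, j) is kept only if the best distance is clearly smaller than
    the second best; survivors would then be pruned with RANSAC."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches
```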
Registration accuracy is then assessed. The point cloud depth map with the highest registration accuracy at moment t_i is taken as the best match for the depth map, thereby obtaining the point cloud at the best-matching moment.
The beneficial effects of the invention are as follows:
The invention provides a SLAM laser point cloud and panoramic image automatic registration method based on a large model, which comprises the following steps: calculating the pose of the panoramic camera to the SLAM system; determining an exposure time interval of the panoramic camera; determining the pose corresponding to the panoramic image captured by the panoramic camera according to the exposure time interval; obtaining a corresponding panoramic depth map set according to the pose corresponding to the panoramic image; based on a panoramic image depth estimation algorithm, obtaining an optimal panoramic depth map according to the panoramic depth map set; and obtaining an optimal matching point cloud of the panoramic camera and the SLAM system according to the optimal panoramic depth map. According to the invention, the exposure time interval is first determined, and a large-model-based deep learning technique is then used within that interval to select the pose with the minimum error, thereby determining the optimal pose of the panorama and improving the accuracy of point cloud registration between the panoramic image and the SLAM system. A mapping-grade panoramic pose is thereby obtained, providing strong technical support for applications such as surveying and mapping, modeling, and AR.
In the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, and identical and similar parts between the embodiments are all enough to refer to each other.
The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to assist in understanding the methods of the present invention and the core ideas thereof; also, it is within the scope of the present invention to be modified by those of ordinary skill in the art in light of the present teachings. In view of the foregoing, this description should not be construed as limiting the invention.