Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a large-model-based automatic registration method for SLAM laser point clouds and panoramic images, solving the problem in the prior art of low accuracy of the panoramic image pose obtained by registering the image point cloud with the laser point cloud.
In order to achieve the above object, the present invention provides the following solutions:
a SLAM laser point cloud and panoramic image automatic registration method based on a large model comprises the following steps:
calculating the pose of the panoramic camera to the SLAM system;
determining an exposure time interval of the panoramic camera;
determining the pose corresponding to the panoramic image captured by the panoramic camera according to the exposure time interval;
obtaining a corresponding panoramic depth map set according to the pose corresponding to the panoramic image;
based on a panoramic image depth estimation algorithm, obtaining an optimal panoramic depth map according to the panoramic depth map set;
and obtaining an optimal matching point cloud of the panoramic camera and the SLAM system according to the optimal panoramic depth map.
Preferably, the calculating the pose of the panoramic camera to the SLAM system includes:
acquiring a truth point cloud of a calibration room;
keeping the SLAM system static at a middle position of the calibration room to scan and photograph, obtaining a laser point cloud of the SLAM system and a panoramic image of the panoramic camera;
performing ICP registration on the laser point cloud and the truth point cloud to obtain the pose of the SLAM system under the truth point cloud coordinate system;
obtaining the pose of the panoramic image under the truth point cloud coordinate system according to the panoramic image and the truth point cloud;
and obtaining the pose from the panoramic camera to the SLAM system according to the laser point cloud of the SLAM system, the panoramic image of the panoramic camera, and the pose of the panoramic image under the truth point cloud coordinate system.
Preferably, the exposure time interval of the panoramic camera is determined to be 100 milliseconds to 1000 milliseconds.
Preferably, the determining, according to the exposure time interval, the pose corresponding to the panoramic image captured by the panoramic camera includes:
acquiring the positions and attitudes of the SLAM system corresponding to a first preset time and a second preset time according to the exposure time interval;
determining the pose of the SLAM system at a certain moment between the first preset time and the second preset time by interpolation, according to the SLAM positions and attitudes corresponding to the first preset time and the second preset time;
and obtaining the pose corresponding to the panoramic image according to the pose of the SLAM system and the pose from the panoramic camera to the SLAM system.
Preferably, the pose of the SLAM system includes:
a position component and a rotation component.
Preferably, the position component is obtained by linear interpolation.
Preferably, the rotation component is obtained by spherical linear interpolation.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
The invention provides a SLAM laser point cloud and panoramic image automatic registration method based on a large model, which comprises the following steps: calculating the pose of the panoramic camera to the SLAM system; determining an exposure time interval of the panoramic camera; determining the pose corresponding to the panoramic image captured by the panoramic camera according to the exposure time interval; obtaining a corresponding panoramic depth map set according to the pose corresponding to the panoramic image; based on a panoramic image depth estimation algorithm, obtaining an optimal panoramic depth map according to the panoramic depth map set; and obtaining an optimal matching point cloud of the panoramic camera and the SLAM system according to the optimal panoramic depth map. According to the invention, the exposure time interval is first determined, and a large-model-based deep learning technique is then used within that interval to select the pose with the minimum error, thereby determining the optimal pose of the panorama and improving the accuracy of point cloud registration between the panoramic image and the SLAM system.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention aims to provide an automatic registration method of SLAM laser point cloud and panoramic image based on a large model, which solves the problem of low accuracy of panoramic image pose obtained by registering the image point cloud and the laser point cloud in the prior art.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
As shown in fig. 2, the invention provides a large model-based automatic registration method of a SLAM laser point cloud and a panoramic image, which comprises the following steps:
step 100: calculating the pose of the panoramic camera to the SLAM system;
step 200: determining an exposure time interval of the panoramic camera;
step 300: determining the pose corresponding to the panoramic image captured by the panoramic camera according to the exposure time interval;
step 400: obtaining a corresponding panoramic depth map set according to the pose corresponding to the panoramic image;
step 500: based on a panoramic image depth estimation algorithm, obtaining an optimal panoramic depth map according to the panoramic depth map set;
step 600: obtaining an optimal matching point cloud of the panoramic camera and the SLAM system according to the optimal panoramic depth map.
Further, the calculating the pose of the panoramic camera to the SLAM system includes:
acquiring a truth point cloud of a calibration room;
keeping the SLAM system static at a middle position of the calibration room to scan and photograph, obtaining a laser point cloud of the SLAM system and a panoramic image of the panoramic camera;
performing ICP registration on the laser point cloud and the truth point cloud to obtain the pose of the SLAM system under the truth point cloud coordinate system;
obtaining the pose of the panoramic image under the truth point cloud coordinate system according to the panoramic image and the truth point cloud;
and obtaining the pose from the panoramic camera to the SLAM system according to the laser point cloud of the SLAM system, the panoramic image of the panoramic camera, and the pose of the panoramic image under the truth point cloud coordinate system.
Specifically: ① a hexahedral room is selected, and a plurality of two-dimensional codes of no less than 20 cm × 20 cm are printed and attached to different walls of the room, taking care to stagger their heights; an industrial-grade fixed-station laser scanner (such as Faro, Z+F, etc.) is used to scan the room, obtaining the truth point cloud of the calibration field. ② The SLAM system is kept static at a middle position of the calibration room to scan and photograph, and the laser point cloud and the panoramic image are recorded. ③ ICP registration is performed between the obtained SLAM point cloud and the truth point cloud, giving the pose of the laser under the truth point cloud coordinate system, recorded as M_laser. ④ More than 4 pairs of homonymous points are selected on the gray-scale image of the panoramic image (on which the codes remain identifiable) and the truth point cloud, and equations are established to solve the pose of the panoramic image under the truth point cloud coordinate system, recorded as M_pano. ⑤ From these two poses, the pose from the panoramic camera to the multi-line laser is calculated as: M0 = M_laser⁻¹ · M_pano.
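The pose chain of step ⑤ can be sketched as follows. This is a minimal illustration, not the patented implementation; the function name, the matrix names, and the 4×4 homogeneous-matrix representation of the two truth-frame poses are assumptions.

```python
import numpy as np

def camera_to_laser(M_laser, M_pano):
    """Relative pose of the panoramic camera in the laser (SLAM) frame.

    M_laser: 4x4 pose of the SLAM laser under the truth point cloud frame (from ICP).
    M_pano:  4x4 pose of the panoramic image under the same frame (from the
             homonymous-point solution). Returns M0 satisfying M_pano = M_laser @ M0.
    """
    return np.linalg.inv(M_laser) @ M_pano

# Toy check: laser at the origin, camera 0.1 m above it along z.
M_laser = np.eye(4)
M_pano = np.eye(4)
M_pano[2, 3] = 0.1
M0 = camera_to_laser(M_laser, M_pano)
```

Composing M_laser with the returned M0 reproduces M_pano, which is the consistency property the calibration relies on.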
Here M0 is the initial spatial transformation between the panoramic camera and the SLAM system, and it remains unchanged thereafter.
Further, the exposure time interval of the panoramic camera is determined to be 100 milliseconds to 1000 milliseconds.
Specifically, the exposure time of the panoramic image in the SLAM system cannot be determined exactly, but it can be confined to a certain time range, namely the exposure delay. A camera exposure instruction is sent from the SLAM system, and the camera completes the exposure within a controllable time range that depends on the ambient light and the preset camera parameters. Tests show that 90% of exposures complete within 300 to 600 milliseconds, so the delay is set to 100 to 1000 milliseconds; this interval fully covers the real exposure time.
Further, determining, according to the exposure time interval, the pose corresponding to the panoramic image captured by the panoramic camera includes:
acquiring the positions and attitudes of the SLAM system corresponding to a first preset time and a second preset time according to the exposure time interval;
determining the pose of the SLAM system at a certain moment between the first preset time and the second preset time by interpolation, according to the SLAM positions and attitudes corresponding to the first preset time and the second preset time;
and obtaining the pose corresponding to the panoramic image according to the pose of the SLAM system and the pose from the panoramic camera to the SLAM system.
Specifically, the SLAM trajectory (pose) at any exposure time is obtained:
the data format of the SLAM track is:
time, xyz position, 3×3 rotation matrix
Therefore, the pose of the panoramic camera can be obtained once the exposure time is determined. If the SLAM system sends an exposure instruction to the panoramic camera at time t, the true exposure necessarily falls within the interval [t + 100, t + 1000] (time unit: millisecond).
Let the exact exposure time of the panoramic camera be t_e. From the SLAM track, find time1 and time2 such that time1 ≤ t_e ≤ time2, acquire the SLAM positions and attitudes corresponding to time1 and time2, and perform interpolation to calculate the pose of the SLAM system at time t_e, denoted M_slam. Through the initial spatial transformation M0 between the panoramic camera and the SLAM coordinate system (see step 100), the pose corresponding to the panoramic image can then be obtained: M_pano = M_slam · M0.
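The bracketing lookup on the SLAM track can be sketched as below. The timestamps, function names, and the choice of `bisect` for the sorted-track search are illustrative assumptions; only the position component is interpolated here (the rotation component is handled by Slerp, discussed next).

```python
import bisect

def bracket(times, t_e):
    """Find (time1, time2) in a sorted SLAM track with time1 <= t_e <= time2."""
    j = bisect.bisect_left(times, t_e)
    if j == 0 or j == len(times):
        raise ValueError("exposure time outside the recorded track")
    return times[j - 1], times[j]

def lerp(p1, p2, r):
    """Linearly interpolate the xyz position component with weight r."""
    return [a + r * (b - a) for a, b in zip(p1, p2)]

times = [0, 100, 200, 300]      # hypothetical track timestamps (milliseconds)
t1, t2 = bracket(times, 150)    # brackets an assumed exposure time t_e = 150
r = (150 - t1) / (t2 - t1)      # interpolation weight in [0.0, 1.0]
```

With the track above, `bracket` returns the pair (100, 200) and r = 0.5, so the interpolated position lies halfway between the two recorded poses.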
The pose of the SLAM system comprises:
a position component and a rotation component.
Specifically, the position component can be obtained by direct linear interpolation, while the rotation component must be obtained by spherical linear interpolation (Slerp).
For the rotation component, the rotation matrices are first converted into quaternions before spherical interpolation: let the rotation matrices T1 and T2 be converted into quaternions Q1 and Q2, as shown in fig. 3.
The rotation Qi corresponding to the current time is:
Qi = sin((1 − r)·θ) / sin θ · Q1 + sin(r·θ) / sin θ · Q2;
wherein: θ = arccos(Q1 · Q2).
r represents the weight of the current frame within the adjacent transformation, with a value range of 0.0 to 1.0, and is calculated as:
r = (i − i1) / (i2 − i1),
where i is the current frame number, i1 is the start frame number of the segment, and i2 is the end frame number of the segment.
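A minimal implementation of this spherical linear interpolation, assuming unit quaternions stored as (w, x, y, z) lists; the short-arc flip and the nearly-parallel fallback are standard numerical safeguards added for this sketch:

```python
import math

def slerp(q1, q2, r):
    """Spherical linear interpolation between unit quaternions q1 and q2:
    Qi = sin((1 - r) * theta) / sin(theta) * Q1 + sin(r * theta) / sin(theta) * Q2,
    with cos(theta) = Q1 . Q2 and weight r in [0.0, 1.0]."""
    dot = sum(a * b for a, b in zip(q1, q2))
    if dot < 0.0:                         # flip to take the shorter arc
        q2, dot = [-c for c in q2], -dot
    if dot > 0.9995:                      # nearly parallel: linear fallback
        out = [a + r * (b - a) for a, b in zip(q1, q2)]
        n = math.sqrt(sum(c * c for c in out))
        return [c / n for c in out]
    theta = math.acos(dot)
    w1 = math.sin((1 - r) * theta) / math.sin(theta)
    w2 = math.sin(r * theta) / math.sin(theta)
    return [w1 * a + w2 * b for a, b in zip(q1, q2)]
```

For two quaternions 90° apart and r = 0.5, the result is the unit quaternion exactly halfway along the great-circle arc, which simple linear interpolation would not give.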
And (5) solving the pose of the SLAM track at any moment, namely obtaining the pose of the panoramic image at any moment.
Furthermore, the invention uses a large model with laser point cloud supervision to calculate all candidate poses within the preset interval, thereby obtaining the optimal moment and the optimal panoramic pose.
As shown in figs. 4-5, depth prediction first projects the panorama in equidistant cylindrical (equirectangular) form, maps the panorama to multiple patches at multiple angles so that each patch has negligible distortion, feeds each patch to the network to predict depth, and merges all patch predictions back into one equirectangular projection. To ensure the continuity of depth prediction across different patches, a semantic segmentation branch is introduced, and consistent depth prediction is achieved under the supervision of semantic information.
Depth estimation branch: this branch performs dense global depth estimation based on the cylindrical projection of the panoramic image. It uses an encoder-decoder architecture; the encoder extracts strong features using a vision transformer (ViT) initialized from the DINOv2 large model as the backbone network.
Semantic segmentation branch: this branch performs semantic segmentation of the panoramic image based on its hexahedral (cube-map) projection. It adopts the same backbone as the depth estimation branch with shared weights, while the two branches do not share weights in their task heads.
Decoder: consists of a reassemble module and a fusion module. The reassemble module maps, concatenates, and projects the outputs of the transformer blocks, resampling them at different spatial resolutions. The fusion module fuses the output features of the reassemble module using residual convolution units and upsamples the feature map by a factor of 2.
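The data flow of the fusion step can be sketched as follows. This is only a shape-level illustration: an identity function stands in for the residual convolution unit, and nearest-neighbour repetition stands in for the network's learned 2x upsampling.

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def fuse(reassembled, previous, conv=lambda f: f):
    """Residual fusion followed by 2x upsampling; `conv` is an identity
    stand-in for the residual convolution unit described above."""
    return upsample2x(conv(reassembled + previous))

# An 8-channel 4x4 map fused with a skip branch becomes an 8-channel 8x8 map.
out = fuse(np.zeros((8, 4, 4)), np.ones((8, 4, 4)))
```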
Loss function:
(1) The depth estimation branch adopts an inverse Huber (berHu) loss:
B(d) = |d|, if |d| ≤ c; B(d) = (d² + c²) / (2c), if |d| > c.
This loss is an L1 loss when |d| is smaller than c and an L2 loss when |d| is larger than c. Here d is the difference between the predicted depth and the true depth of each pixel at each gradient descent step, i indexes all pixels of each image in each batch, and c = 0.2 · max_i |d_i|, i.e. 20% of the maximum error per batch.
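The berHu loss above can be sketched directly; the batched-mean reduction and the function name are assumptions of this illustration.

```python
import numpy as np

def berhu_loss(pred, gt):
    """Inverse Huber (berHu) loss: L1 where |d| <= c, (d^2 + c^2) / (2c)
    where |d| > c, with c = 0.2 * max|d| over the batch."""
    d = np.abs(pred - gt)
    c = 0.2 * d.max()
    if c == 0:
        return 0.0          # perfect prediction
    per_pixel = np.where(d <= c, d, (d ** 2 + c ** 2) / (2 * c))
    return per_pixel.mean()
```

Note that the two branches agree at |d| = c, since (c² + c²) / (2c) = c, so the loss is continuous where the L1 and L2 regions meet.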
(2) The semantic segmentation branch adopts a cross entropy loss function to calculate the cross entropy Ls of the prediction segmentation result and the true value segmentation result.
The total loss function is: L = L_D + L_S, where L_D is the depth estimation loss and L_S is the semantic segmentation cross entropy.
pre-training model and dataset
The pre-training model adopted by the method is DINOv2, and the depth prediction training set is the NYU Depth dataset, which contains both depth data and semantic label data.
Further, the panoramic depth map is registered with the laser point cloud.
The invention also provides another embodiment:
As shown in fig. 6, image preprocessing: let the exposure time interval of the panoramic image be [t + 100, t + 1000] milliseconds, let the panoramic pose corresponding to a moment t_i in this interval (see step 300) be M, and generate a point cloud depth map from the result point cloud.
a) Create a virtual image of the same size as the panoramic image (for fast calculation it may be scaled down by a certain factor) and use it as the depth map, denoted depthMap.
b) Calculate the coordinates of each laser point P(x, y, z) on the panoramic sphere.
c) In the panoramic spherical coordinate system (O being the origin of coordinates, i.e. (0, 0, 0)), calculate the distance r_i = |OP| and the corresponding spherical longitude and latitude (w_u, h_v).
d) The depth map may then be expressed as depthMap(w_u, h_v) = r_i, where (w_u, h_v) must be scaled to the preset depth map size.
e) When several laser points correspond to the same (w_u, h_v), the smaller depth is kept.
All laser point clouds in the scene are processed with the above algorithm, generating the point cloud depth map. For fast calculation, only laser points within a certain distance threshold of the position at moment t_i, for example within 50 meters, may be selected to participate in the calculation.
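Steps b) through e) can be sketched as below. The longitude/latitude conventions (longitude from atan2(y, x), latitude from asin(z / r)) and the pixel-scaling formula are assumptions chosen for illustration; the patent does not fix a particular convention.

```python
import math

def project_point(x, y, z, width, height):
    """Map a laser point P(x, y, z) in the panoramic-sphere frame to pixel
    (w_u, h_v) of an equirectangular depth map, returning the depth r_i."""
    r = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(y, x)                              # [-pi, pi]
    lat = math.asin(z / r)                              # [-pi/2, pi/2]
    w_u = int((lon + math.pi) / (2 * math.pi) * (width - 1))
    h_v = int((math.pi / 2 - lat) / math.pi * (height - 1))
    return w_u, h_v, r

def depth_map(points, width, height):
    """Rasterise points; on collisions keep the smaller depth (step e)."""
    dm = {}
    for x, y, z in points:
        w_u, h_v, r = project_point(x, y, z, width, height)
        if (w_u, h_v) not in dm or r < dm[(w_u, h_v)]:
            dm[(w_u, h_v)] = r
    return dm
```

Two points on the same ray, e.g. (1, 0, 0) and (2, 0, 0), land on the same pixel, and the nearer depth survives, matching rule e).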
Feature extraction: corner features are extracted using techniques such as SIFT and FAST.
Feature matching: frequently, nearest neighbor matching is utilized, random sampling consistency is utilized, and the spatial position relation is utilized to remove mismatching points
After the matching point pairs are obtained, estimating a depth image transformation matrix by using a gradient descent method or a Gaussian-Newton method.
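The nearest-neighbor stage can be sketched with a ratio test, a common way to reject ambiguous matches before RANSAC; the descriptor arrays, function name, and the 0.8 threshold are assumptions of this sketch, not values fixed by the method.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour matching with a ratio test; desc_b needs >= 2 rows.
    A match (i, j) is kept only if the best distance is clearly smaller than
    the second best; survivors would then be pruned with RANSAC."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches
```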
Registration accuracy is then assessed. The point cloud depth map with the highest registration accuracy at moment t_i is taken as the best match for the depth map, thereby obtaining the point cloud at the best-matching moment.
The beneficial effects of the invention are as follows:
The invention provides a SLAM laser point cloud and panoramic image automatic registration method based on a large model, which comprises the following steps: calculating the pose of the panoramic camera to the SLAM system; determining an exposure time interval of the panoramic camera; determining the pose corresponding to the panoramic image captured by the panoramic camera according to the exposure time interval; obtaining a corresponding panoramic depth map set according to the pose corresponding to the panoramic image; based on a panoramic image depth estimation algorithm, obtaining an optimal panoramic depth map according to the panoramic depth map set; and obtaining an optimal matching point cloud of the panoramic camera and the SLAM system according to the optimal panoramic depth map. According to the invention, the exposure time interval is first determined, and a large-model-based deep learning technique is then used within that interval to select the pose with the minimum error, thereby determining the optimal pose of the panorama and improving the accuracy of point cloud registration between the panoramic image and the SLAM system. A mapping-grade panoramic pose is thereby obtained, providing strong technical support for applications such as surveying and mapping, modeling, and AR.
In the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, and identical and similar parts between the embodiments are all enough to refer to each other.
The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to assist in understanding the methods of the present invention and the core ideas thereof; also, it is within the scope of the present invention to be modified by those of ordinary skill in the art in light of the present teachings. In view of the foregoing, this description should not be construed as limiting the invention.