CN117470230A - Visual laser sensor fusion positioning algorithm based on deep learning - Google Patents
- Publication number: CN117470230A
- Application number: CN202311372884.9A
- Authority
- CN
- China
- Prior art keywords
- deep learning
- data
- sensor
- feature extraction
- environment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/253 — Pattern recognition: fusion techniques of extracted features
- G01C21/165 — Dead reckoning by integrating acceleration or speed (inertial navigation) combined with non-inertial navigation instruments
- G01C21/1652 — Inertial navigation combined with ranging devices, e.g. LIDAR or RADAR
- G01C21/1656 — Inertial navigation combined with passive imaging devices, e.g. cameras
- G01S17/86 — Combinations of lidar systems with systems other than lidar, radar or sonar
- G01S17/89 — Lidar systems specially adapted for mapping or imaging
- G06N3/0464 — Neural network architectures: convolutional networks [CNN, ConvNet]
- G06V10/454 — Local feature extraction with filters integrated into a hierarchical structure, e.g. convolutional neural networks [CNN]
- G06V10/82 — Image or video recognition or understanding using neural networks
- Y02T10/40 — Engine management systems (climate-change mitigation technologies related to transportation)
Abstract
The invention relates to the technical field of robot applications and discloses a visual-laser sensor fusion positioning algorithm based on deep learning, comprising the following steps: acquiring real-time data through the lidar, camera, and IMU sensors on the robot; performing feature extraction and environment recognition on the preprocessed data with a deep learning model; weighting and fusing the data from the different sensors according to the feature-extraction and environment-recognition results to obtain an accurate pose estimate; and optimizing with IMU data and GPS data through a back-end factor graph to obtain an accurate pose. The invention uses a deep learning model to extract point and line features and to recognize the environment, improving feature quality and positioning accuracy in low-texture environments. Through multi-sensor data fusion, the positioning accuracy and robustness of a mobile-robot SLAM system are improved in low-texture environments, and the robot's position and attitude are estimated in real time.
Description
Technical Field
The invention belongs to the technical field of robot applications and in particular relates to a visual-laser sensor fusion positioning algorithm based on deep learning.
Background
Currently, common mobile robots include home service robots, AGV logistics robots, and the like. Home service robots mainly comprise education robots, cleaning robots, and robots that assist the elderly and disabled. In special settings such as exhibitions, restaurants, reception, shopping guidance, and security surveillance, specialized robots play an increasingly important role. Although mobile robots with complex functions have brought disruptive changes to production and daily life, their capabilities remain limited. In fields with demanding scenarios and performance requirements, there is still considerable room for improvement. For an autonomous mobile robot, positioning and navigation are the key problems, involving accurate localization, scene recognition, semantic understanding, and obstacle analysis and decision-making, all of which require in-depth analysis and research.
A SLAM (Simultaneous Localization and Mapping) system is one in which a robot estimates its own position and attitude and builds a map of the scene by collecting and processing data from various sensors. SLAM systems typically include several sensors and functional modules. Mature, widely used SLAM systems can be classified by their core functional module into lidar-based SLAM (laser SLAM) and vision-based SLAM (VSLAM).
A visual sensor can collect rich environmental information at low cost, but its perception pipeline easily drops frames under high-speed motion, and vision-based environment perception is affected by illumination conditions. A laser sensor offers high accuracy and long range for environment perception and is advantageous for detecting edge information (such as road edges). It works stably and performs well in severe weather, but it cannot perceive fine environmental detail, making it difficult to implement the relocalization and loop-closure detection functions of a classical visual SLAM framework.
A single lidar sensor or vision sensor alone cannot achieve accurate map construction and navigation for mobile robots in complex scenarios. To achieve accurate positioning and navigation in complex environments and to avoid the influence of factors such as lighting changes, rapid scene motion, and dynamic objects on the SLAM system, the mainstream direction of future mobile-robot SLAM technology is the design and study of multi-sensor fusion SLAM systems, improving the stability and environmental adaptability of SLAM.
Multi-sensor fusion SLAM combines the advantages of each sensor across different environments. When the surrounding environment has little texture or a regular texture distribution, a purely visual odometer performs poorly and easily drifts. Likewise, few environmental obstacles or a regular environment layout can cause a purely laser odometer to drift. To address the loss of positioning of a single sensor in certain environments, many studies combine visual and laser sensors with other sensors (e.g., IMUs, wheel encoders). Compared with a pure-vision or pure-laser odometer, a fused-sensor odometer achieves higher positioning accuracy, is less likely to fail in complex environments, and is more robust.
While fusing multiple sensors improves SLAM robustness, many studies do not fully exploit the data of each sensor. In image processing, for example, point features are typically extracted, or pixels are tracked with optical-flow methods. In a low-texture environment, pixel variation is weak, so feature-point extraction and optical-flow tracking perform poorly. For this reason, some studies match key frames by extracting line features: in a low-texture environment, line features remain numerous, so the visual sensor data are more fully utilized.
In low-texture environments, traditional feature extraction methods are not robust and the extracted features are of poor quality. Meanwhile, conventional multi-sensor fusion generally uses loosely coupled methods and cannot cope well with sudden changes in complex environments. We therefore propose a visual-laser sensor fusion positioning algorithm based on deep learning: in low-texture environments, point and line features are extracted with deep learning, the environment is recognized with deep learning, and the multi-sensor data are then fused, improving positioning accuracy and system robustness.
Disclosure of Invention
To remedy the shortcomings of the prior art, the invention provides a visual-laser sensor fusion positioning algorithm based on deep learning, aimed at improving feature quality and positioning accuracy in low-texture environments.
To this end, the present invention provides the following technical solution: a visual-laser sensor fusion positioning algorithm based on deep learning, comprising the following steps:
S1, sensor data acquisition: acquiring real-time data through the lidar, camera, and IMU sensors on the robot;
S2, feature extraction and environment recognition: performing feature extraction and environment recognition on the preprocessed data with a deep learning model;
S3, multi-sensor data fusion: weighting and fusing the data from the different sensors according to the feature-extraction and environment-recognition results to obtain a more accurate pose estimate;
S4, back-end factor graph optimization: optimizing with IMU data and GPS data through a back-end factor graph to obtain a more accurate pose.
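The four steps above can be sketched as a minimal pipeline skeleton. This is only an illustration under assumed conditions: 1-D-per-axis poses, a hand-written weighting rule, and an arbitrary feature-count threshold; none of the function names, thresholds, or weights come from the patent.

```python
def classify_environment(num_visual_features: int) -> str:
    # S2 (environment recognition): low-texture scenes yield few visual
    # features, so treat a small feature count as "low_texture".
    # The threshold 50 is an assumed value for illustration only.
    return "low_texture" if num_visual_features < 50 else "rich_texture"

def fuse_poses(lidar_pose, visual_pose, environment):
    # S3 (weighted fusion): down-weight the camera in low-texture scenes,
    # where visual features are unreliable. The weights are assumptions.
    w_lidar = 0.8 if environment == "low_texture" else 0.5
    w_visual = 1.0 - w_lidar
    return tuple(w_lidar * a + w_visual * b
                 for a, b in zip(lidar_pose, visual_pose))

# S1 would supply real sensor data; here we fake per-sensor pose
# estimates (x, y, theta) for illustration.
env = classify_environment(num_visual_features=12)
pose = fuse_poses((1.0, 2.0, 0.1), (1.2, 2.4, 0.1), env)
# S4 would further refine `pose` with back-end factor-graph optimization.
```

In the low-texture case the fused pose sits closer to the lidar estimate, e.g. 0.8·1.0 + 0.2·1.2 = 1.04 for the x component.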
Preferably, in step S3, the multi-sensor data include lidar data, visual image data, and IMU data.
Preferably, in step S3, the robot's position and attitude are estimated in real time by a SLAM system using multi-sensor data fusion, the SLAM system being based on laser SLAM or VSLAM.
Preferably, the SLAM system is a vision-based SLAM system that captures environmental information with a camera and performs feature extraction and environment mapping using computer-vision techniques; performing feature extraction with a deep learning model, such as a convolutional neural network (CNN), improves the performance of the visual SLAM system in low-texture environments.
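As a hedged illustration of why convolutional feature extraction responds to line features: a single convolution with an edge kernel fires strongly at intensity edges. A trained CNN stacks many such learned kernels with nonlinearities; the pure-Python, hand-crafted Sobel-style kernel below is only a stand-in, not the patent's model.

```python
def conv2d(img, kernel):
    """Valid-mode 2-D cross-correlation (what deep-learning frameworks
    call "convolution"), written in pure Python for illustration."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img), len(img[0])
    return [[sum(img[i + u][j + v] * kernel[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(w - kw + 1)]
            for i in range(h - kh + 1)]

# Sobel-style kernel: strong response at vertical edges (line features).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# A tiny image with a vertical step edge between columns 1 and 2.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

response = conv2d(image, SOBEL_X)   # nonzero exactly where the edge lies
```

On this image both valid output positions straddle the edge, so the response is [[4, 4]]; on a flat (textureless) image the response is all zeros, which is precisely why low-texture scenes starve hand-crafted detectors and motivate learned features.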
Preferably, the SLAM system is a lidar-based SLAM system that performs environment perception with the lidar and uses point-cloud data for feature extraction and map construction; the lidar offers higher accuracy and robustness and achieves a better positioning effect in complex environments.
Preferably, the SLAM system is a SLAM system based on the fusion of vision and lidar, combining the advantages of both to realize multi-sensor data fusion; feature extraction by a deep learning model improves the system's performance in low-texture environments, while the addition of the lidar improves the system's robustness and accuracy.
Preferably, the SLAM system is a SLAM system based on a filtering algorithm: by designing a filter such as a Kalman filter or a particle filter, the multi-sensor data are fused and optimized to estimate the robot's position and attitude. Such a filtering algorithm improves the system's robustness and positioning accuracy to a certain extent.
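The filtering idea named above can be sketched with a scalar Kalman measurement update applied once per sensor: each measurement pulls the estimate toward itself in inverse proportion to its variance. The 1-D state and all variance values below are assumptions chosen for illustration, not the patent's design.

```python
def kalman_update(mean, var, meas, meas_var):
    """One scalar Kalman measurement update on a 1-D position state."""
    gain = var / (var + meas_var)           # trust the measurement more
    new_mean = mean + gain * (meas - mean)  # when its variance is small
    new_var = (1.0 - gain) * var
    return new_mean, new_var

# Prior belief about position, then one update per sensor in sequence.
mean, var = 0.0, 1.0
mean, var = kalman_update(mean, var, meas=1.0, meas_var=0.25)  # lidar: precise
mean, var = kalman_update(mean, var, meas=0.9, meas_var=1.0)   # camera: noisy
```

After the precise lidar update the estimate jumps to 0.8 with variance 0.2; the noisy camera measurement then nudges it only slightly, to about 0.817 — the same variance-weighted behavior a multi-sensor fusion filter relies on.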
The technical effects and advantages of the invention are as follows:
1. The invention uses a deep learning model to extract point and line features and to recognize the environment, improving feature quality and positioning accuracy in low-texture environments. Compared with the prior art, higher positioning accuracy is achieved in complex environments.
2. The invention improves the robustness of the system in complex environments through multi-sensor data fusion. In complex scenes with occlusion or missing texture, the SLAM system is more stable and reliable.
3. The SLAM system provided by the invention estimates the robot's position and attitude in real time, responds quickly to environmental changes, and improves navigation accuracy, offering clear real-time advantages over the prior art.
4. The SLAM system has higher positioning accuracy and robustness and extends the robot's range of applications in complex environments, such as autonomous driving and robot navigation.
5. Compared with traditional feature extraction methods such as SIFT and SURF, the invention reduces computational complexity and improves computational efficiency.
In conclusion, the invention improves positioning accuracy, enhances system robustness and real-time performance, extends the range of applications, and reduces computational complexity, giving it higher technical and practical value than the prior art.
Drawings
FIG. 1 is a flowchart of an algorithm of the present invention.
Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings. The embodiments described are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without inventive effort fall within the scope of the invention.
Although the prior art has made some progress on mobile-robot SLAM systems, it has the following drawbacks:
1. Limitations of a single sensor: neither a visual sensor nor a lidar sensor alone can achieve accurate map construction and navigation in complex scenarios. The visual sensor extracts features poorly in low-texture environments, and the lidar sensor struggles to capture environmental detail. As a result, the positioning and navigation performance of a single sensor in complex environments is limited.
2. Shortcomings of multi-sensor fusion techniques: although multi-sensor fusion SLAM systems have achieved some success, existing fusion methods still have problems. For example, during fusion the data of each sensor are underused, and the feature-extraction effect still needs improvement. In addition, conventional multi-sensor fusion methods do not cope well with abrupt changes in complex environments.
3. Insufficient environmental adaptability: the positioning and navigation performance of existing SLAM systems is vulnerable to factors such as lighting changes, rapid scene motion, and dynamic objects. This limits the wide application of mobile robots in complex environments.
In view of these drawbacks of the prior art, an object of the present invention is to provide a multi-sensor fusion mobile-robot SLAM system for low-texture environments with higher positioning accuracy and system robustness. Specifically, the invention extracts point and line features and recognizes the environment through deep learning, then fuses the multi-sensor data to improve positioning accuracy and system robustness. This overcomes the limitations of a single sensor, improves the effectiveness of multi-sensor fusion, and enhances the SLAM system's adaptability in complex environments.
The invention is described in further detail below with reference to FIG. 1.
The embodiment of the application discloses a visual-laser sensor fusion positioning algorithm based on deep learning, comprising the following steps:
S1, sensor data acquisition: acquire real-time data through the lidar, camera, and IMU sensors on the robot;
S2, feature extraction and environment recognition: perform feature extraction and environment recognition on the preprocessed data with a deep learning model. The trained model extracts point and line features from the sensor data with higher robustness and accuracy, and the environment is recognized at the same time; in a low-texture environment the features extracted by the model are of higher quality, improving the performance of the SLAM system;
S3, multi-sensor data fusion: weight and fuse the data from the different sensors according to the feature-extraction and environment-recognition results to obtain a more accurate pose estimate. During fusion, the deep learning model assigns weights according to the characteristics of each sensor's data, yielding a more accurate pose estimate; fusing the data of multiple sensors also improves the system's robustness in complex environments;
S4, back-end factor graph optimization: optimize with IMU data and GPS data through a back-end factor graph to obtain a more accurate pose.
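The back-end factor graph optimization of step S4 can be illustrated with a toy 1-D example: each factor (prior, odometry/IMU, GPS) contributes a weighted squared residual, and the pose estimates minimizing the total cost are found here by plain gradient descent. The 1-D states, factor weights, and measurement values are all assumptions for illustration; a real back end would use an incremental solver such as iSAM2/GTSAM over full 6-DoF poses.

```python
def optimize(x, factors, lr=0.05, iters=2000):
    """Minimize sum_i w_i * r_i(x)^2 by gradient descent.
    Each factor is (residual_fn, grad_entries_fn, weight), where
    grad_entries_fn returns (state_index, d_residual/d_state) pairs."""
    for _ in range(iters):
        grad = [0.0] * len(x)
        for res_fn, grad_fn, w in factors:
            r = res_fn(x)
            for idx, dr in grad_fn(x):
                grad[idx] += 2.0 * w * r * dr
        x = [xi - lr * g for xi, g in zip(x, grad)]
    return x

# Two 1-D poses x0, x1 connected by three factors (weights assumed):
factors = [
    (lambda x: x[0] - 0.0,        lambda x: [(0, 1.0)],            10.0),  # prior: x0 starts at 0
    (lambda x: x[1] - x[0] - 1.0, lambda x: [(0, -1.0), (1, 1.0)],  1.0),  # IMU/odometry: moved +1.0
    (lambda x: x[1] - 1.2,        lambda x: [(1, 1.0)],             1.0),  # GPS: absolute fix at 1.2
]
x0, x1 = optimize([0.0, 0.0], factors)
```

The solution compromises between the odometry (which alone would put x1 at 1.0) and the GPS fix (1.2), landing near x1 ≈ 1.105 with x0 held close to its prior — the same trade-off a factor-graph back end makes between relative and absolute constraints.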
Through this technical scheme, the positioning accuracy and robustness of the mobile-robot SLAM system are improved in low-texture environments, extending the robot's range of applications in complex environments. The deep learning model extracts point and line features and recognizes the environment, improving feature quality and positioning accuracy in low-texture environments; multi-sensor data fusion improves the system's robustness in complex environments and enables real-time estimation of the robot's position and attitude. The SLAM system thus achieves higher positioning accuracy and robustness and extends the robot's range of applications in complex environments.
Specifically, in step S3, the multi-sensor data comprise lidar data, visual image data, and IMU data. The robot's position and attitude are estimated in real time by the SLAM system using multi-sensor data fusion; the SLAM system performs pose estimation based on laser SLAM or VSLAM and the features extracted by the deep learning model, achieving higher positioning accuracy and robustness in complex environments.
Specifically, the SLAM system is a vision-based SLAM system that captures environmental information with a camera and performs feature extraction and environment mapping using computer-vision techniques; performing feature extraction with a deep learning model, such as a convolutional neural network (CNN), improves the performance of the visual SLAM system in low-texture environments.
Specifically, the SLAM system is a lidar-based SLAM system that performs environment perception with the lidar and uses point-cloud data for feature extraction and map construction; the lidar offers higher accuracy and robustness and achieves a better positioning effect in complex environments.
Specifically, the SLAM system is a SLAM system based on the fusion of vision and lidar, combining the advantages of both to realize multi-sensor data fusion; feature extraction by a deep learning model improves the system's performance in low-texture environments, while the addition of the lidar improves the system's robustness and accuracy.
Specifically, the SLAM system is a SLAM system based on a filtering algorithm: by designing a filter such as a Kalman filter or a particle filter, the multi-sensor data are fused and optimized to estimate the robot's position and attitude. Such a filtering algorithm improves the system's robustness and positioning accuracy to a certain extent.
In the proposed visual-laser sensor fusion positioning algorithm, deep learning is used to train models that improve feature quality and perform feature extraction. Compared with traditional point-line feature extraction, the features extracted by the model are of higher quality and greater number, and key-frame matching works better than with features extracted by traditional algorithms. Recognizing the environment around the robot through deep learning determines more accurately whether the current environment favors the laser sensor or the visual sensor, yielding the weights of the laser and visual data; higher-quality laser and visual features are thereby extracted, and the positioning accuracy of the system is ultimately improved.
Finally, it should be noted that the foregoing describes only preferred embodiments of the present invention. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may modify the described embodiments or substitute equivalents for some of their elements; any modification, equivalent replacement, or improvement made without departing from the spirit and principles of the invention falls within its scope.
Claims (7)
1. The visual laser sensor fusion positioning algorithm based on deep learning is characterized by comprising the following steps of:
s1, sensor data acquisition: acquiring real-time data through a laser radar, a camera and an IMU sensor on the robot;
s2, feature extraction and environment recognition: performing feature extraction and environment recognition on the preprocessed data by using a deep learning model;
s3, multi-sensor data fusion: according to the feature extraction and environment recognition results, the data from the different sensors are weighted and fused to obtain an accurate pose estimate;
s4, back-end factor graph optimization: and optimizing by using IMU data and GPS data through a back-end factor graph to obtain an accurate pose.
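As an illustration of the back-end step S4 (not part of the claims), the factor-graph idea can be reduced to a tiny least-squares problem over a 1-D trajectory, with relative (odometry/IMU-style) factors and one absolute (GPS-style) factor. The measurements, learning rate, and gradient-descent solver are toy assumptions; a real system would use a dedicated solver such as GTSAM or g2o:

```python
# Toy 1-D pose graph: three poses linked by two relative (odometry/IMU-style)
# factors, anchored by a prior and one absolute (GPS-style) factor, and
# refined by plain gradient descent on the sum of squared residuals.

def residuals(x):
    return [
        x[0] - 0.0,           # prior: trajectory starts at the origin
        (x[1] - x[0]) - 1.0,  # odometry: moved 1.0 between poses 0 and 1
        (x[2] - x[1]) - 1.0,  # odometry: moved 1.0 between poses 1 and 2
        x[2] - 2.1,           # GPS-style fix: final pose measured at 2.1
    ]

def solve(x, lr=0.1, iters=500):
    for _ in range(iters):
        r = residuals(x)
        # gradient of 0.5 * sum(r_i**2) with respect to each pose
        g = [r[0] - r[1], r[1] - r[2], r[2] + r[3]]
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

poses = solve([0.0, 0.0, 0.0])  # converges near [0.025, 1.05, 2.075],
                                # spreading the GPS/odometry disagreement
```

The optimized trajectory distributes the 0.1 m disagreement between the GPS fix and the accumulated odometry across all three poses, which is exactly the smoothing role the back-end plays at full scale.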
2. The visual laser sensor fusion positioning algorithm based on deep learning as claimed in claim 1, wherein: in the step S3, the multi-sensor data includes laser radar data, visual image data, and IMU data.
3. The visual laser sensor fusion positioning algorithm based on deep learning as claimed in claim 1, wherein: in the step S3, the real-time estimation of the position and the posture of the robot is performed by using a multi-sensor data fusion technology through a SLAM system, and the SLAM system is based on laser SLAM or VSLAM.
4. A visual laser sensor fusion positioning algorithm based on deep learning as claimed in claim 3, wherein: the SLAM system is a vision-based SLAM system which captures environmental information by using a camera, performs feature extraction and environmental mapping by combining computer vision technology, and performs feature extraction by using a deep learning model such as a convolutional neural network.
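To make the convolutional feature extraction named in this claim concrete (an illustration, not part of the claim), the sketch below slides a single hand-written 3x3 Sobel kernel over a toy grayscale image and applies a ReLU. This is the basic operation a CNN front end stacks, with the kernels learned from data rather than hand-written:

```python
# Single convolution + ReLU over a toy image: the basic building block of
# CNN feature extraction. The kernel here is a hand-written Sobel-x filter;
# a trained network would learn such kernels from data.

def conv2d_relu(img, kernel):
    h, w, k = len(img), len(img[0]), len(kernel)
    out = []
    for i in range(h - k + 1):          # valid convolution, stride 1
        row = []
        for j in range(w - k + 1):
            s = sum(img[i + a][j + b] * kernel[a][b]
                    for a in range(k) for b in range(k))
            row.append(max(s, 0))       # ReLU keeps positive responses
        out.append(row)
    return out

image = [[0, 0, 1, 1] for _ in range(4)]        # dark left, bright right half
sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # responds to vertical edges
feature_map = conv2d_relu(image, sobel_x)       # strong response at the edge
```

Every output cell straddles the dark/bright boundary, so the whole feature map responds strongly; in a full network, many such learned maps feed the keypoint and descriptor heads.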
5. A visual laser sensor fusion positioning algorithm based on deep learning as claimed in claim 3, wherein: the SLAM system is based on a laser radar, and the system uses the laser radar to perform environment sensing, and performs feature extraction and map construction by utilizing point cloud data.
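The point-cloud feature extraction in this claim can be illustrated with a LOAM-style curvature score (an assumption for illustration, not stated in the patent): points whose range differs sharply from their neighbours along a scan line are kept as edge features, while smooth points are kept as planar features. The scan values below are made up:

```python
# LOAM-style curvature score on one simulated scan line: a point whose range
# differs sharply from its neighbours scores high and is kept as an edge
# feature; flat regions score low and would be kept as planar features.

def curvature(ranges, i, k=2):
    # sum of range differences between point i and its 2k neighbours
    return abs(sum(ranges[i + j] - ranges[i] for j in range(-k, k + 1)))

scan = [1.0, 1.0, 1.0, 2.5, 1.0, 1.0, 1.0]   # one spike: an edge-like return
scores = [curvature(scan, i) for i in range(2, len(scan) - 2)]
edge_index = 2 + scores.index(max(scores))   # index of the sharpest point
```

The spike at index 3 dominates the scores, so it is the point a front end would select as an edge feature for scan matching.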
6. A visual laser sensor fusion positioning algorithm based on deep learning as claimed in claim 3, wherein: the SLAM system is based on vision and laser radar fusion, the system combines the advantages of vision and laser radar, multi-sensor data fusion is achieved, and feature extraction is carried out through a deep learning model.
7. A visual laser sensor fusion positioning algorithm based on deep learning as claimed in claim 3, wherein: the SLAM system is based on a filtering algorithm, and the system fuses and optimizes the multi-sensor data by designing the filtering algorithm so as to realize the estimation of the position and the posture of the robot.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311372884.9A CN117470230A (en) | 2023-10-23 | 2023-10-23 | Visual laser sensor fusion positioning algorithm based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117470230A true CN117470230A (en) | 2024-01-30 |
Family
ID=89624813
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311372884.9A Pending CN117470230A (en) | 2023-10-23 | 2023-10-23 | Visual laser sensor fusion positioning algorithm based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117470230A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114018236A (en) * | 2021-09-30 | 2022-02-08 | 哈尔滨工程大学 | Laser vision strong coupling SLAM method based on adaptive factor graph |
CN115200588A (en) * | 2022-09-14 | 2022-10-18 | 煤炭科学研究总院有限公司 | SLAM autonomous navigation method and device for mobile robot |
CN116222543A (en) * | 2023-04-26 | 2023-06-06 | 齐鲁工业大学(山东省科学院) | Multi-sensor fusion map construction method and system for robot environment perception |
CN116399324A (en) * | 2023-03-24 | 2023-07-07 | 清水湾(深圳)自动驾驶智能研究中心(有限合伙) | Picture construction method and device, controller and unmanned vehicle |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||