CN117906595A - Scene understanding navigation method and system based on feature point method vision SLAM - Google Patents


Info

Publication number
CN117906595A
Authority
CN
China
Prior art keywords: point cloud, map, ROS, point, camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410317058.2A
Other languages
Chinese (zh)
Inventor
鲁明丽
周理想
徐本连
从金亮
施健
周旭
顾苏杭
朱培逸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changshu Institute of Technology
Original Assignee
Changshu Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changshu Institute of Technology filed Critical Changshu Institute of Technology
Priority to CN202410317058.2A priority Critical patent/CN117906595A/en
Publication of CN117906595A publication Critical patent/CN117906595A/en
Pending legal-status Critical Current


Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a scene understanding navigation method and system based on feature point method vision SLAM, comprising the following steps: acquiring real-time pose information of a camera and publishing it through ROS; synthesizing the acquired RGB image and depth image into a 3D point cloud; acquiring the relative positional relationship between the robot and the object from semantic and depth information; stitching each frame of point cloud in the 3D point cloud into a global point cloud map according to the pose information, and publishing the global point cloud map to ROS; acquiring map point information and converting it into a point cloud to obtain a point cloud map, publishing the point cloud map data to ROS, and converting the point cloud map into a three-dimensional octree map and a two-dimensional grid map; and publishing and subscribing the results of the ORB-SLAM3 module and the instance segmentation network as topics through the ROS system for communication, and performing navigation with the ROS navigation package. ORB-SLAM3 is combined with a real-time instance segmentation method to build a dense semantic map, enabling scene understanding.

Description

Scene understanding navigation method and system based on feature point method vision SLAM
Technical Field
The invention belongs to the technical field of machine vision and robots, and relates to a scene understanding navigation method and system based on a feature point method vision SLAM.
Background
Simultaneous Localization and Mapping (SLAM) is a popular research topic in the robotics field and a key technology for enabling robots to move autonomously in complex scenes; it is widely applied in mobile robots, autonomous driving, unmanned aerial vehicles, augmented reality (AR), virtual reality (VR), and other fields.
However, conventional visual SLAM based on the feature point method (such as ORB-SLAM3) can only build a sparse point cloud map for localization; it cannot build a two-dimensional occupancy grid map or a three-dimensional octree map as laser SLAM does, so the resulting map cannot be used for navigation, obstacle avoidance, and other operations. On the other hand, the visual sensor required by visual SLAM can acquire a large amount of environmental information such as color, texture, and shape, and some special visual sensors (e.g., depth cameras) can also acquire the relative distance between the camera and objects in the environment; a sparse point cloud map wastes this abundant and useful information.
In recent years, deep learning has made great progress and achieved many breakthroughs in the field of computer vision. By combining deep learning methods, semantic information, including the identification and labeling of scene elements such as objects, can be obtained and fused into the SLAM system, so that the robot can better understand, from a human perspective, the environmental structure, the semantic meaning, and the metric information between itself and surrounding objects. In addition, the Robot Operating System (ROS) provides rich functionality and interfaces. At present, although deep learning methods have been fused into SLAM systems, communication between the deep learning method and the SLAM system is not efficient, and real-time performance is poor.
One prior disclosure describes a deep-learning-based visual SLAM method and system for indoor dynamic scenes: based on the ORB-SLAM2 framework, a GCNv feature extraction algorithm replaces the corresponding module of ORB-SLAM2, and a dynamic point removal module and a point cloud mapping thread are added, effectively reducing the interference of dynamic feature points on the visual odometry. However, the GCNv feature extraction algorithm only extracts features from the image information; it cannot build a semantic map and cannot help a robot understand the environment from a human perspective.
Disclosure of Invention
The invention aims to provide a scene understanding navigation method and system based on feature point method vision SLAM, which expands the sparse point cloud established by ORB-SLAM3 into a map usable for navigation and a semantic metric map that enables scene understanding, with the results of the modules published and subscribed as topics through the ROS system for communication.
The technical solution for realizing the purpose of the invention is as follows:
A scene understanding navigation method based on feature point method vision SLAM comprises the following steps:
S01: acquiring real-time pose information of the camera from ORB-SLAM3 and publishing it through ROS;
S02: synthesizing the RGB image and the depth image obtained by the depth camera into a 3D point cloud;
S03: acquiring the relative positional relationship between the robot and the object through the semantic and depth information obtained from an instance segmentation network;
S04: stitching each frame of point cloud in the 3D point cloud into a global point cloud map according to the real-time pose information, and publishing the global point cloud map to ROS;
S05: acquiring map point information of ORB-SLAM3 and converting it into a point cloud to obtain a point cloud map, publishing the point cloud map data to ROS, and converting the point cloud map into a three-dimensional octree map and a two-dimensional grid map;
S06: publishing and subscribing the results of the ORB-SLAM3 module and the instance segmentation network as topics through the ROS system for communication, and performing navigation using the ROS navigation package.
In a preferred embodiment, the method for acquiring real-time pose information of the camera from the ORB-SLAM3 and issuing the pose information through ROS in step S01 includes:
S11: acquiring Sophus of the camera pose in SE3f format from TrackRGBD interface of ORB-SLAM 3;
S12: the pose of Sophus:SE 3f format is converted into cv:Mat format Tcw, ,/>For rotation matrix,/>T is a translation vector,/>As a function of the rotation matrix of the camera pose,/>A translation matrix function for the camera pose;
S13: converting the reverse pose Tcw of the camera into a positive pose Twc, and extracting a rotation matrix RWC and a translation vector tWC from the positive pose Twc;
S14: constructing a rotation matrix RWC as a quaternion q;
S15: the translation vector tWC and the rotation quaternion q of the camera are respectively copied to a variable of the ROS message type and used for issuing camera pose information.
In a preferred technical solution, the method for synthesizing the 3D point cloud in step S02 includes:
S21: reading pixel points of the depth image to obtain a depth value d;
S22: calculating the three-dimensional coordinates of the current pixel point according to the camera internal parameters and the depth value d
The calculation method comprises the following steps:
Wherein: is the pixel point coordinate in the pixel coordinate system,/> Is a three-dimensional coordinate value of the pixel point p,,/>,/>,/>Is an internal reference of the camera,/>Is a scaling factor;
S23: acquiring RGB color values of pixel points corresponding to the color image, and storing the RGB color values in b, g and r member variables of a point p;
s24: performing the above processing on each frame of image to generate a 3D point cloud, performing rotation and stitching operation on the generated point cloud, and converting the generated point cloud into a specified coordinate system;
S25: voxel filtering is performed on the point cloud.
In a preferred embodiment, the method for acquiring the relative positional relationship between the robot and the object in step S03 includes:
s31: deploying YOLOv-seg instance partition network by using a C++ version OpenCV-DNN module;
S32: writing the deployed YOLOv-seg into ROS nodes, and issuing detection and segmentation results;
S33: sending the RGB image to YOLOv-seg example segmentation network to obtain target detection results of each category;
s34: reading the coordinates of the rectangular frames of each category in the pixel coordinate system from the result, and solving the coordinates of the central points of the rectangular frames
S35: indexing the point cloud coordinate system according to the center point coordinates of each category in the pixel coordinate system to obtain the center point coordinates in the image pixel coordinate systemCorresponding point cloud coordinates/>
S36: calculating the coordinate transformation relation between the pose of the camera and the position of the center point of each type of object in the point cloud through tf coordinate transformation under the ROS;
s37: the coordinate transformation relationships are published through ROS and then subscribed to in rviz.
In a preferred technical solution, the method for stitching the global point cloud map in step S04 includes:
S41: converting the acquired real-time pose of the camera into a transformation matrix from the camera to a world coordinate system;
S42: calculating the relative transformation between two adjacent frames of point clouds according to the transformation matrix;
S43: registering the point cloud in the current frame with the corresponding pose, and adding the registered point cloud into the global point cloud map.
In a preferred technical solution, the method for obtaining the point cloud map in the step S05 includes:
S51: reading map point information in ORB-SLAM 3;
S52: creating sensor_msgs::PointCloud2 type point cloud data and initializing the point cloud data;
S53: traversing the map points, converting them into Eigen::Vector3d type, storing the converted coordinates in an array, copying the data in the array to the corresponding positions of the sensor_msgs::PointCloud2 point cloud data using the memcpy function, and publishing the point cloud data to ROS.
In a preferred embodiment, the method for converting the point cloud map into the three-dimensional octree map and the two-dimensional grid map in the step S05 includes:
s61: creating and initializing an empty OccupancyGrid data structure according to the range, resolution and grid map size of the point cloud;
S62: for each point cloud data, converting the coordinates of the point cloud data into grid units on OccupancyGrid, and marking the corresponding grid units as corresponding states, wherein the states comprise idle states, free states or unknown states;
s63: publishing the generated OccupancyGrid data to corresponding topics;
s64: creating and initializing a OccupancyMap data structure and setting basic parameters thereof;
s65: traversing each cell in OccupancyGrid, mapping cell information into OccupancyMap;
S66: and publishing the generated OccupancyMap data to corresponding topics.
In a preferred embodiment, the method for performing navigation operation using the ROS navigation pack in step S06 includes:
S71: a mobile robot or a camera incrementally builds a grid map of the desired environment;
S72: after the establishment is completed, storing the grid map;
S73: using the move_base node of ROS, a navigation operation is performed at rviz.
The invention also discloses a scene understanding navigation system based on the feature point method vision SLAM, which comprises the following steps:
The camera real-time pose acquisition and release module acquires real-time pose information of the camera from the ORB-SLAM3 and releases the real-time pose information through the ROS;
The 3D point cloud synthesis module synthesizes the RGB image and the depth image obtained by the depth camera into a 3D point cloud;
the scene understanding module is used for acquiring the relative position relation between the robot and the object through semantic and depth information acquired by the instance segmentation network;
the global point cloud map generation and release module is used for splicing each frame of point cloud in the 3D point cloud into a global point cloud map according to the real-time pose information and releasing the global point cloud map into the ROS;
the ORB-SLAM3 visual navigation map module is used for acquiring map point information of ORB-SLAM3, converting the map point information into point cloud to obtain a point cloud map, releasing the point cloud map data into ROS, and converting the point cloud map into a three-dimensional octree map and a two-dimensional grid map;
And the navigation module is used for publishing and subscribing the results of the ORB-SLAM3 module and the instance segmentation network in the form of topics through the ROS system for communication transmission, and performing navigation operation by using the ROS navigation package.
The invention also discloses a computer storage medium, on which a computer program is stored, which when executed realizes the scene understanding navigation method based on the feature point method vision SLAM.
Compared with the prior art, the invention has the remarkable advantages that:
1. The invention remedies the limitation that ORB-SLAM3 can only build a sparse map: it combines the YOLOv-seg real-time instance segmentation method, which currently has the highest speed and accuracy, to build a dense semantic map, helping the robot understand the environment from a human perspective. In addition, the relative position and distance between the robot and the prior objects in the environment are published in real time, which makes it convenient for the robot to perform actions such as grasping.
2. The invention converts the sparse map into an occupancy grid map similar to that built by a laser radar. By analyzing and processing the grid map, functions such as path planning, obstacle avoidance, and navigation can be realized, which greatly expands the working range of visual SLAM and makes the visual SLAM system more comprehensive and flexible.
3. In the invention, each module runs under the ROS framework; the ORB-SLAM3 module and the YOLOv-seg instance segmentation module are rewritten as ROS nodes, and the results between the modules are published and subscribed as topics through the ROS system for communication. This architecture makes the system run more efficiently and in real time, and facilitates the extension of other functional modules. Moreover, since the ROS framework is widely used in the robotics field, the invention also makes the whole system easier to integrate and work cooperatively with other ROS nodes, further expanding the application potential of visual SLAM in robot navigation, environment perception, and other aspects.
Drawings
Fig. 1 is a flowchart of a scene understanding navigation method based on feature point method visual SLAM of the present embodiment;
FIG. 2 is a flowchart of the scene understanding navigation system based on the feature point method visual SLAM of the present embodiment;
FIG. 3 is a three-dimensional octree map built purely from vision;
FIG. 4 is a two-dimensional occupancy grid map built purely from vision;
fig. 5 is a schematic view of a scene understanding.
Detailed Description
The principle of the invention is as follows: the sparse point cloud established by ORB-SLAM3 is expanded into a map usable for navigation and a semantic metric map that enables scene understanding, and the results of the modules are published and subscribed as topics through the ROS system for communication.
Example 1:
As shown in FIG. 1, a scene understanding navigation method based on feature point method vision SLAM includes the following steps:
S01: acquiring real-time pose information of the camera from ORB-SLAM3 and publishing it through ROS;
S02: synthesizing the RGB image and the depth image obtained by the depth camera into a 3D point cloud;
S03: acquiring the relative positional relationship between the robot and the object through the semantic and depth information obtained from an instance segmentation network;
S04: stitching each frame of point cloud in the 3D point cloud into a global point cloud map according to the real-time pose information, and publishing the global point cloud map to ROS;
S05: acquiring map point information of ORB-SLAM3 and converting it into a point cloud to obtain a point cloud map, publishing the point cloud map data to ROS, and converting the point cloud map into a three-dimensional octree map and a two-dimensional grid map;
S06: publishing and subscribing the results of the ORB-SLAM3 module and the instance segmentation network as topics through the ROS system for communication, and performing navigation using the ROS navigation package.
For the specific implementation of each step, refer to the workflow of the scene understanding navigation system based on feature point method vision SLAM described below; it is not repeated here.
In another embodiment, a computer storage medium has a computer program stored thereon, the computer program when executed implementing the scene understanding navigation method based on the feature point method visual SLAM described above.
The scene understanding navigation method based on feature point method vision SLAM can be any one of the feature point method visual SLAM navigation methods described above, and its detailed implementation is not repeated here.
In yet another embodiment, a scene understanding navigation system based on feature point method visual SLAM includes:
The camera real-time pose acquisition and release module acquires real-time pose information of the camera from the ORB-SLAM3 and releases the real-time pose information through the ROS;
The 3D point cloud synthesis module synthesizes the RGB image and the depth image obtained by the depth camera into a 3D point cloud;
the scene understanding module is used for acquiring the relative position relation between the robot and the object through semantic and depth information acquired by the instance segmentation network;
the global point cloud map generation and release module is used for splicing each frame of point cloud in the 3D point cloud into a global point cloud map according to the real-time pose information and releasing the global point cloud map into the ROS;
the ORB-SLAM3 visual navigation map module is used for acquiring map point information of ORB-SLAM3, converting the map point information into point cloud to obtain a point cloud map, releasing the point cloud map data into ROS, and converting the point cloud map into a three-dimensional octree map and a two-dimensional grid map;
And the navigation module is used for publishing and subscribing the results of the ORB-SLAM3 module and the instance segmentation network in the form of topics through the ROS system for communication transmission, and performing navigation operation by using the ROS navigation package.
Specifically, taking a preferred embodiment as an example, the workflow of the scene understanding navigation system based on feature point method vision SLAM is described below, as shown in FIG. 2:
s1: real-time pose information of the camera is acquired from the ORB-SLAM3 and published by ROS.
The specific implementation method comprises the following steps:
S11: installing and running ORB-SLAM3;
S12: acquiring Sophus of the camera pose in SE3f format from TrackRGBD interface of ORB-SLAM 3;
s13: converting Sophus:: the pose in SE3f format into cv:: mat format Tcw;
Specifically, trackRGBD is the interface in ORB-SLAM3 for processing RGB-D camera data, and Sophus is returned to the camera pose in SE3f format; sophus SE3f is a data type in the Sophus C ++ library representing rigid body transformations in three dimensions, including translation and rotation, The function returns a rotating part,/>The function returns a translated portion.
Rotation matrix; Translation vectorThen/>
S14: converting the reverse pose Tcw of the camera into a positive pose Twc, and extracting a rotation matrix RWC and a translation vector tWC from the positive pose Twc;
S15: constructing a rotation matrix RWC as a quaternion q;
s16: the translation vector tWC and the rotation quaternion q of the camera are respectively copied to a variable of the ROS message type and used for issuing camera pose information.
S2: and synthesizing the 3D point cloud by the RGB image and the depth image acquired by the depth camera.
The specific implementation method comprises the following steps:
S21: reading pixel points of the depth image to obtain a depth value d;
S22: calculating the three-dimensional coordinates of the current pixel point according to the camera internal parameters and the depth value d
Specifically, the calculation method is as follows:
Wherein: is the pixel point coordinate in the pixel coordinate system,/> Is the value of the three-dimensional coordinate p-,/>,/>Is an internal reference of the camera,/>Is a scaling factor.
S23: acquiring RGB color values of pixel points corresponding to the color image, and storing the RGB color values in b, g and r member variables of a point p;
S24: repeating the steps, and generating a 3D point cloud for each frame of image;
S25: performing rotation and stitching operation on the generated point cloud, and converting the point cloud into a designated coordinate system;
s26: voxel filtering is performed on the point cloud to reduce the amount of point cloud data and noise.
S3: and combining the semantic and depth information to acquire the relative position relation between the robot and the object. As shown in fig. 5. The a priori objects in the environment have been marked in different colors while the relative position and distance relationship of the robot to the a priori objects is indicated in real time by the RGB-D camera.
The specific implementation method comprises the following steps:
s31: deploying YOLOv-seg instance partition network by using a C++ version OpenCV-DNN module;
S32: writing the deployed YOLOv-seg into ROS nodes, and issuing detection and segmentation results;
S33: sending the RGB image to YOLOv-seg example segmentation network to obtain target detection results of each category;
s34: reading the coordinates of the rectangular frames of each category in the pixel coordinate system from the result, and solving the coordinates of the central points of the rectangular frames
S35: indexing the point cloud coordinate system according to the center point coordinates of each category in the pixel coordinate system to obtain the center point coordinates in the pixel coordinate system of the pictureCorresponding point cloud coordinates/>;
Specific:
Wherein: For index value,/> Is the lateral resolution of the picture or camera. The index value is input into the generated single-frame point cloud to obtain the coordinate/>, of the point cloudWherein/>Is the distance between the robot and the object.
S36: calculating a coordinate transformation relation between the pose of the camera and the position of the center point of each type of object in the point cloud through tf coordinate transformation under the ROS, wherein the coordinate transformation relation is the relative position relation between the robot and the object; the calculated amount is small.
S37: the coordinate transformation relationships are published through ROS and then subscribed to in rviz.
S4: and (3) combining the pose information acquired in the step (S1), and splicing each frame of point cloud in the step (S2) into a 3D point cloud with the same total.
RViz (Robot Visualization) is an open source tool for visualizing the robotic system for displaying and debugging sensor data, status information, motion planning, etc. of the robot.
The specific implementation method comprises the following steps:
s41: converting the pose in the S1 into a transformation matrix from the camera to a world coordinate system;
S42: calculating the relative transformation between two adjacent frames of point clouds according to the transformation matrix;
S43: registering the point cloud in the current frame with the corresponding pose, and adding the registered point cloud into a global point cloud map;
S44: the global point cloud map is published into ROS.
S5: map point information of the ORB-SLAM3 is acquired and converted into a point cloud.
The specific implementation method comprises the following steps:
S51: reading map point information in ORB-SLAM 3;
s52: a sensor_ msgs: pointCloud2 type variable is created and initialized;
S53: traversing map points, converting the map points into Eigen, namely Vector3d type, storing the converted coordinates in an array, and copying data in the array to specific positions of point cloud data by using memcpy function;
S54: and releasing the point cloud data to the ROS to simulate the point cloud generated by the multi-line laser radar.
S6: the point cloud map is converted into a three-dimensional octree map and a two-dimensional grid map as shown in fig. 3 and 4. The square in fig. 3 represents an obstacle present in the environment. The other areas in fig. 4 represent the portions that the robot can pass through, and the black areas represent the presence of obstacles at that location, which the robot is required to avoid when planning and navigating the path.
The memcpy function is a standard library function in the C/C++ language for copying data between memories.
The specific implementation method comprises the following steps:
S61: creating and initializing an empty OccupancyGrid data structure according to the range, resolution and grid map size of the point cloud; occupancyGrid is a message type defined in ROS (robot operating system) for representing a three-dimensional probability map, i.e. octomap map.
S62: for each point cloud data, converting the coordinates of the point cloud data into grid units on OccupancyGrid, and marking the corresponding grid units as corresponding states, such as idle, free or unknown;
s63: publishing the generated OccupancyGrid data to corresponding topics;
S64: creating and initializing a OccupancyMap data structure and setting basic parameters thereof; occupancyMap are used primarily to convert a three-dimensional OccupancyGrid probability map to a two-dimensional occupancy grid map.
S65: traversing each cell in OccupancyGrid, mapping cell information into OccupancyMap;
S66: and publishing the generated OccupancyMap data to corresponding topics.
S7: and storing the established grid map, and using the ROS navigation package to implement navigation operation.
The specific implementation method comprises the following steps:
S71: slowly moving the robot or the camera, and incrementally establishing a grid map of the required environment;
S72: after the establishment is completed, storing the grid map;
S73: using the move_base node of ROS, a navigation operation is performed at rviz.
In ROS, move_base is an important software package for implementing the robot navigation function. The move_base comprises a series of nodes and algorithms for realizing the path planning, obstacle avoidance and navigation functions of the robot in an unknown environment.
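For completeness, a hypothetical C++ snippet for sending a single goal to move_base from code (instead of clicking "2D Nav Goal" in rviz) is sketched below; the goal coordinates and the node name are placeholders.
#include <ros/ros.h>
#include <move_base_msgs/MoveBaseAction.h>
#include <actionlib/client/simple_action_client.h>

int main(int argc, char** argv) {
  ros::init(argc, argv, "send_goal_example");
  actionlib::SimpleActionClient<move_base_msgs::MoveBaseAction> client("move_base", true);
  client.waitForServer();                        // wait until move_base is running

  move_base_msgs::MoveBaseGoal goal;
  goal.target_pose.header.frame_id = "map";      // goal expressed in the saved grid map
  goal.target_pose.header.stamp = ros::Time::now();
  goal.target_pose.pose.position.x = 1.0;        // placeholder target position
  goal.target_pose.pose.position.y = 0.5;
  goal.target_pose.pose.orientation.w = 1.0;

  client.sendGoal(goal);
  client.waitForResult();
  bool ok = client.getState() == actionlib::SimpleClientGoalState::SUCCEEDED;
  ROS_INFO("Navigation %s", ok ? "succeeded" : "failed");
  return 0;
}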
The foregoing examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to the foregoing examples, and any other changes, modifications, substitutions, combinations, and simplifications that do not depart from the spirit and principles of the present invention should be made therein and are intended to be equivalent substitutes within the scope of the present invention.

Claims (10)

1. A scene understanding navigation method based on feature point method vision SLAM is characterized by comprising the following steps:
S01: acquiring real-time pose information of the camera from ORB-SLAM3 and publishing it through ROS;
S02: synthesizing the RGB image and the depth image obtained by the depth camera into a 3D point cloud;
S03: acquiring the relative positional relationship between the robot and the object through the semantic and depth information obtained from an instance segmentation network;
S04: stitching each frame of point cloud in the 3D point cloud into a global point cloud map according to the real-time pose information, and publishing the global point cloud map to ROS;
S05: acquiring map point information of ORB-SLAM3 and converting it into a point cloud to obtain a point cloud map, publishing the point cloud map data to ROS, and converting the point cloud map into a three-dimensional octree map and a two-dimensional grid map;
S06: publishing and subscribing the results of the ORB-SLAM3 module and the instance segmentation network as topics through the ROS system for communication, and performing navigation using the ROS navigation package.
2. The scene understanding navigation method based on feature point method vision SLAM of claim 1, wherein the method of acquiring real-time pose information of a camera from ORB-SLAM3 and issuing by ROS in step S01 comprises:
S11: acquiring Sophus of the camera pose in SE3f format from TrackRGBD interface of ORB-SLAM 3;
S12: the pose of Sophus:SE 3f format is converted into cv:Mat format Tcw, ,/>For rotation matrix,/>T is a translation vector,/>As a function of the rotation matrix of the camera pose,/>A translation matrix function for the camera pose;
S13: converting the reverse pose Tcw of the camera into a positive pose Twc, and extracting a rotation matrix RWC and a translation vector tWC from the positive pose Twc;
S14: constructing a rotation matrix RWC as a quaternion q;
S15: the translation vector tWC and the rotation quaternion q of the camera are respectively copied to a variable of the ROS message type and used for issuing camera pose information.
3. The scene understanding navigation method based on feature point method vision SLAM of claim 1, wherein the method for synthesizing 3D point cloud in step S02 comprises:
S21: reading pixel points of the depth image to obtain a depth value d;
S22: calculating the three-dimensional coordinates of the current pixel point according to the camera internal parameters and the depth value d
The calculation method comprises the following steps:
Wherein: is the pixel point coordinate in the pixel coordinate system,/> Is the three-dimensional coordinate value of the pixel point p,/>,/>,/>Is an internal reference of the camera,/>Is a scaling factor;
S23: acquiring RGB color values of pixel points corresponding to the color image, and storing the RGB color values in b, g and r member variables of a point p;
s24: performing the above processing on each frame of image to generate a 3D point cloud, performing rotation and stitching operation on the generated point cloud, and converting the generated point cloud into a specified coordinate system;
S25: voxel filtering is performed on the point cloud.
4. The scene understanding navigation method based on feature point method vision SLAM of claim 1, wherein the method for acquiring the relative positional relationship between the robot and the object in step S03 comprises:
s31: deploying YOLOv-seg instance partition network by using a C++ version OpenCV-DNN module;
S32: writing the deployed YOLOv-seg into ROS nodes, and issuing detection and segmentation results;
S33: sending the RGB image to YOLOv-seg example segmentation network to obtain target detection results of each category;
s34: reading the coordinates of the rectangular frames of each category in the pixel coordinate system from the result, and solving the coordinates of the central points of the rectangular frames
S35: indexing the point cloud coordinate system according to the center point coordinates of each category in the pixel coordinate system to obtain the center point coordinates in the image pixel coordinate systemCorresponding point cloud coordinates/>
S36: calculating the coordinate transformation relation between the pose of the camera and the position of the center point of each type of object in the point cloud through tf coordinate transformation under the ROS;
s37: the coordinate transformation relationships are published through ROS and then subscribed to in rviz.
5. The scene understanding navigation method based on the feature point method visual SLAM according to claim 1, wherein the method for stitching into the global point cloud map in step S04 comprises:
S41: converting the acquired real-time pose of the camera into a transformation matrix from the camera to a world coordinate system;
S42: calculating the relative transformation between two adjacent frames of point clouds according to the transformation matrix;
S43: registering the point cloud in the current frame with the corresponding pose, and adding the registered point cloud into the global point cloud map.
6. The scene understanding navigation method based on feature point method vision SLAM of claim 1, wherein the method for obtaining the point cloud map in step S05 comprises:
S51: reading map point information in ORB-SLAM 3;
S52: creating sensor_msgs::PointCloud2 type point cloud data and initializing the point cloud data;
S53: traversing the map points, converting them into Eigen::Vector3d type, storing the converted coordinates in an array, copying the data in the array to the corresponding positions of the sensor_msgs::PointCloud2 point cloud data using the memcpy function, and publishing the point cloud data to ROS.
7. The scene understanding navigation method based on the feature point method vision SLAM according to claim 1, wherein the method of converting the point cloud map into a three-dimensional octree map and a two-dimensional grid map in step S05 comprises:
s61: creating and initializing an empty OccupancyGrid data structure according to the range, resolution and grid map size of the point cloud;
S62: for each point cloud data, converting the coordinates of the point cloud data into grid units on OccupancyGrid, and marking the corresponding grid units as corresponding states, wherein the states comprise idle states, free states or unknown states;
s63: publishing the generated OccupancyGrid data to corresponding topics;
s64: creating and initializing a OccupancyMap data structure and setting basic parameters thereof;
s65: traversing each cell in OccupancyGrid, mapping cell information into OccupancyMap;
S66: and publishing the generated OccupancyMap data to corresponding topics.
8. The scene understanding navigation method based on feature point method vision SLAM of claim 1, wherein the method of performing navigation operation using ROS navigation pack in step S06 comprises:
S71: a mobile robot or a camera incrementally builds a grid map of the desired environment;
S72: after the establishment is completed, storing the grid map;
S73: using the move_base node of ROS, a navigation operation is performed at rviz.
9. A scene understanding navigation system based on feature point method vision SLAM, comprising:
The camera real-time pose acquisition and release module acquires real-time pose information of the camera from the ORB-SLAM3 and releases the real-time pose information through the ROS;
The 3D point cloud synthesis module synthesizes the RGB image and the depth image obtained by the depth camera into a 3D point cloud;
the scene understanding module is used for acquiring the relative position relation between the robot and the object through semantic and depth information acquired by the instance segmentation network;
the global point cloud map generation and release module is used for splicing each frame of point cloud in the 3D point cloud into a global point cloud map according to the real-time pose information and releasing the global point cloud map into the ROS;
the ORB-SLAM3 visual navigation map module is used for acquiring map point information of ORB-SLAM3, converting the map point information into point cloud to obtain a point cloud map, releasing the point cloud map data into ROS, and converting the point cloud map into a three-dimensional octree map and a two-dimensional grid map;
And the navigation module is used for publishing and subscribing the results of the ORB-SLAM3 module and the instance segmentation network in the form of topics through the ROS system for communication transmission, and performing navigation operation by using the ROS navigation package.
10. A computer storage medium having stored thereon a computer program, wherein the computer program when executed implements the scene understanding navigation method based on feature point method vision SLAM of any of claims 1-8.
CN202410317058.2A 2024-03-20 2024-03-20 Scene understanding navigation method and system based on feature point method vision SLAM Pending CN117906595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410317058.2A CN117906595A (en) 2024-03-20 2024-03-20 Scene understanding navigation method and system based on feature point method vision SLAM

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410317058.2A CN117906595A (en) 2024-03-20 2024-03-20 Scene understanding navigation method and system based on feature point method vision SLAM

Publications (1)

Publication Number Publication Date
CN117906595A true CN117906595A (en) 2024-04-19

Family

ID=90686301

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410317058.2A Pending CN117906595A (en) 2024-03-20 2024-03-20 Scene understanding navigation method and system based on feature point method vision SLAM

Country Status (1)

Country Link
CN (1) CN117906595A (en)


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110264563A (en) * 2019-05-23 2019-09-20 武汉科技大学 A kind of Octree based on ORBSLAM2 builds drawing method
CN111814683A (en) * 2020-07-09 2020-10-23 北京航空航天大学 Robust visual SLAM method based on semantic prior and deep learning features
WO2022257801A1 (en) * 2021-06-09 2022-12-15 山东大学 Slam-based mobile robot mine scene reconstruction method and system
WO2023104207A1 (en) * 2021-12-10 2023-06-15 深圳先进技术研究院 Collaborative three-dimensional mapping method and system
CN115035260A (en) * 2022-05-27 2022-09-09 哈尔滨工程大学 Indoor mobile robot three-dimensional semantic map construction method
CN114898062A (en) * 2022-05-28 2022-08-12 广东工业大学 Map construction method and device based on SLAM in dynamic scene
CN115393538A (en) * 2022-08-22 2022-11-25 山东大学 Visual SLAM method and system for indoor dynamic scene based on deep learning
CN116030130A (en) * 2022-12-29 2023-04-28 西北工业大学 Hybrid semantic SLAM method in dynamic environment
CN116295412A (en) * 2023-03-01 2023-06-23 南京航空航天大学 Depth camera-based indoor mobile robot dense map building and autonomous navigation integrated method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈国军; 陈巍; 郁汉琪; 王涵立: "Research on autonomous navigation method for mobile robots based on the semantic ORB-SLAM2 algorithm" (基于语义ORB-SLAM2算法的移动机器人自主导航方法研究), Machine Tool & Hydraulics (机床与液压), no. 09, 15 May 2020 (2020-05-15) *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination