WO2022160430A1 - Method for obstacle avoidance of robot in the complex indoor scene based on monocular camera - Google Patents
- Publication number
- WO2022160430A1 (PCT/CN2021/081649, CN2021081649W)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0231—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
- G05D1/0246—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means
- G05D1/0253—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means extracting relative motion information from a plurality of images taken successively, e.g. visual odometry, optical flow
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/778—Active pattern-learning, e.g. online learning of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20132—Image cropping
Definitions
- The disclosure belongs to the field of robot navigation and obstacle avoidance; its concrete result is autonomous navigation and obstacle avoidance for robots, and it particularly relates to a method for fully and effectively perceiving complex obstacles.
- In the robot obstacle avoidance task, the robot must navigate autonomously to the target point in a complex scene without colliding with any obstacle, which has great practical application value.
- Robot obstacle-avoidance applications, such as sweeping robots, autonomous driving, smart warehouses, and smart logistics, have achieved significant performance improvements.
- Some recent work abandons the Lidar sensor, uses the depth and color maps captured by an RGB-D camera as input, maps them directly to actions, and trains reinforcement learning end to end.
- Compared with laser data, images carry rich semantic information, but also much redundant information that does not help obstacle avoidance, which makes reinforcement learning algorithms hard to converge and train, widens the gap between virtual and reality, and makes strategies hard to transfer.
- In addition, depth cameras produce heavy noise in sunlit indoor environments and can fail almost completely.
- The traditional method of mapping the depth map to a point cloud to remove ground interference cannot perceive low obstacles on the ground, such as clothing or swimming pools. Therefore, methods based on RGB-D end-to-end learning also have many problems: they cannot fully perceive a complex indoor environment and may even fail to navigate and avoid obstacles safely.
- Therefore, based on investigation and analysis of existing obstacle avoidance navigation technology, the present disclosure combines the advantages of Lidar and RGB cameras while discarding their disadvantages, and constructs "pseudo laser" data, thereby realizing the autonomous navigation and obstacle avoidance task in complex scenes.
- The input of the method is the image taken by the monocular RGB camera on the robot platform, and the output is the action the robot should take, consisting of linear and angular velocity. The method can effectively perceive different types of complex obstacles in indoor scenes, helping the reinforcement learning module learn and decide efficiently.
- The purpose of the present disclosure is to realize an efficient robot obstacle avoidance method by mapping monocular RGB images in complex scenes to "pseudo laser" data.
- the method includes an environment perception stage and a control decision stage.
- the environment perception stage includes a depth prediction module, a semantic segmentation module, and a depth slicing module;
- the control decision stage includes a feature extraction guidance module and a reinforcement learning module.
- the method of the present disclosure is suitable for complex obstacles of various shapes and sizes.
- A method for obstacle avoidance of a robot in complex indoor scenes based on a monocular camera includes the following steps:
- Step 1 Loading robot simulation model and building a training and testing simulation environment
- Step 2 Getting semantic depth map
- Obtaining an RGB image from the monocular camera carried by the TurtleBot-ROS robot, and inputting it into the Fastdepth depth prediction network to obtain the depth map under the current field of view; selecting the lower half of the depth map as an intermediate result. Ground pixel information in this intermediate result would interfere with obstacle avoidance and cause failures, so the RGB image is also input into the CCNet semantic segmentation model to obtain a two-class semantic segmentation mask, where 0 represents a ground pixel and 1 represents the background.
- The semantic segmentation mask and the depth map are multiplied pixel by pixel to obtain the semantic depth map, where the value of each pixel is the depth distance under the current viewing angle; the interfering ground depth values are removed at the same time;
- Step 3 Depth slicing and data enhancement module
- The pooling window size is (240, 1) and the step size is 1. Each pooling operation selects the minimum value in the window as output; performing the pooling over every column of the image yields the "pseudo laser" data;
- Step 4 Controlling decision stage
- The deep reinforcement learning module adopts the PPO algorithm, and the network structure is composed of 3 convolutional layers and 3 fully connected layers. To make the experimental robot reach the target position steadily and safely, the state input includes three parts: observation data, target-point distance, and speed. The observation data is the "pseudo laser" data obtained in step 3; the distance and speed of the target point are obtained from the robot's onboard odometer. A feature extraction guidance layer is proposed: the data features of the three modalities are extracted and fused by three convolutional layers, a feature mask is obtained through sigmoid activation and multiplied with the "pseudo laser" observation data, and the result is sent to the deep reinforcement learning module. Information helpful to the obstacle avoidance strategy is thus extracted from the multi-modal data and combined with the "pseudo laser" observation data, making the subsequent feature extraction more targeted and speeding up convergence of the network;
- Step 5 Forming a monocular obstacle avoidance navigation network and output decision results
- Connecting steps 2, 3, and 4: the input image from the monocular RGB camera is processed to obtain the depth map and the semantic segmentation mask, which are multiplied pixel by pixel and cropped. After the dynamic minimum pooling operation, the "pseudo laser" observation data is obtained. Three consecutive frames of "pseudo laser" observation data are input into the deep reinforcement learning module together with the distance and speed of the target point. The feature extraction guidance layer assigns different attention to each dimension of the "pseudo laser" observation data. After multi-layer convolution, pooling, and full connection, an LSTM layer adds temporal correlation over the entire path, and finally the robot's decision action at the current moment is output, achieving autonomous obstacle avoidance and navigation.
- The disclosure addresses the difficulty of fully perceiving complex obstacles (non-convex irregular obstacles, ferrous metals, complex ground obstacles) in indoor robot obstacle avoidance, a difficulty that otherwise leads to obstacle avoidance failure, and helps the robot use the semantic information of the environment to remove interference from redundant pixels, enabling efficient reinforcement learning training and decision-making.
- The present disclosure proposes a reinforcement learning mapping from a single RGB image directly to the robot's obstacle avoidance navigation action. The method relies on "pseudo laser" data and makes efficient decisions by encoding semantic information into the laser data. Its accuracy is demonstrated through comparative experiments, in which the method achieved the best performance on both commonly used indicators, average success rate and average time, and it has great advantages in complex scenarios.
- The disclosure is suitable for obstacle avoidance and navigation tasks in different complex indoor scenes: (a) scenes containing non-convex irregular obstacles; (b) scenes containing black, smooth metal obstacles; (c) scenes containing ground obstacles such as scattered clothing, glass, and swimming pools. The effectiveness and applicability of the method in these different scenarios are demonstrated.
- Figure 1 is the network structure of the present disclosure.
- Figure 2 is the visualization result of the experiment of the embodiment of the present disclosure.
- The state includes the "pseudo laser" data, the distance to the target point, and the velocity at the previous moment. The action consists of the linear and angular velocity of the wheeled robot. The reward function depends on the distance to the target at each moment (moving closer yields a positive return, moving away a negative one); a collision yields -15, and reaching the target point yields 15.
- The robot is encouraged not to take too large an action at each step; that is, the angular velocity cannot exceed 1.7 times the angular velocity at the previous moment.
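As a concrete illustration, the reward and the action constraint described in the two items above can be sketched as follows. The weight on the distance term is an assumption: the text fixes only its sign convention and the -15 / +15 terminal rewards.

```python
import numpy as np

R_ARRIVAL = 15.0     # reward for reaching the target point
R_COLLISION = -15.0  # reward for a collision

def step_reward(prev_dist, curr_dist, collided, reached, dist_weight=2.5):
    """Positive return when moving toward the target, negative otherwise.
    dist_weight is a hypothetical scaling, not stated in the text."""
    if reached:
        return R_ARRIVAL
    if collided:
        return R_COLLISION
    return dist_weight * (prev_dist - curr_dist)

def clip_angular(omega_cmd, omega_prev, factor=1.7):
    """The commanded angular velocity may not exceed 1.7x the previous one."""
    limit = factor * abs(omega_prev)
    return float(np.clip(omega_cmd, -limit, limit))
```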
- The reinforcement learning algorithm is implemented in PyTorch. Stochastic gradient descent is used for the reinforcement learning network, with a momentum of 0.9, a weight decay of 1e-4, a learning rate of 5e-5, a decay factor of 0.99, a KL divergence parameter of 15e-4, and a maximum of 150 steps.
- The learning process is terminated after 1.5 million training paths; training the strategy takes about 40 hours on a computer equipped with an i7-7700 CPU and an NVIDIA GTX 1080Ti GPU.
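The stated training setup can be written down as a PyTorch configuration sketch. The policy network here is only a placeholder, and mapping "attenuation factor" to the PPO discount and "maximum step size" to the per-episode step limit are assumptions.

```python
import torch
import torch.nn as nn

# Placeholder policy; the real network is 3 conv + 3 FC layers with an LSTM.
policy = nn.Sequential(nn.Linear(646, 128), nn.ReLU(), nn.Linear(128, 2))

# Hyperparameters as stated in the text.
optimizer = torch.optim.SGD(
    policy.parameters(),
    lr=5e-5,            # learning rate
    momentum=0.9,       # momentum value
    weight_decay=1e-4,  # "weight attenuation"
)

GAMMA = 0.99              # "attenuation factor" (assumed: PPO discount)
KL_COEF = 15e-4           # KL divergence parameter
MAX_EPISODE_STEPS = 150   # "maximum step size" (assumed: steps per episode)
TOTAL_PATHS = 1_500_000   # training terminates after 1.5 million paths
```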
- The method is compared with the traditional method ORCA and a recent learned multi-robot distributed obstacle avoidance strategy to verify its effectiveness, and ablation experiments are performed on all proposed modules to prove the effectiveness of each part.
- Figure 1 is the network structure of the monocular obstacle avoidance navigation network.
- the network is composed of an environment perception stage and a control decision stage, which specifically includes a depth prediction module, a semantic mask module, a depth slicing module, a feature extraction guidance module, a reinforcement learning module, and data enhancement.
- the network takes monocular RGB images as input, and after obtaining the semantic depth map, it performs a dynamic minimization operation to obtain "pseudo-laser" data, which is used as the state input of reinforcement learning to generate the final robot decision-making action.
- Figure 2 shows the process visualization of the monocular visual obstacle avoidance navigation framework, in which column (A) is a chair obstacle scene; (B) a table obstacle scene; (C) a clothing obstacle scene; and (D) a glass obstacle scene.
- the monocular camera on the robot platform captures the RGB image, predicts the semantic depth map, and then slices it to generate "pseudo laser” data.
- Comparison of the last two rows, "pseudo laser" data versus Lidar data, shows that the "pseudo laser" data captures more complete environmental information, enabling efficient reinforcement learning training and better interaction with the environment.
- A method for obstacle avoidance of a robot in complex indoor scenes based on a monocular camera includes the following steps:
- Step 1 Load the robot simulation model and build a training and testing simulation environment
- The URDF model of the TurtleBot-ROS robot is used as the experimental robot; Block, Crossing, and Passing in ROS-Stage are used as the training environments, and 24 identical TurtleBot-ROS robots are deployed for distributed training of the control decision module; the cafe environment in ROS-Gazebo is used as the background of the test scene, and complex obstacles (tables, chairs, wardrobes, moving pedestrians, etc.) are added manually in Gazebo to test the effectiveness of the entire vision system;
- Step 2 Get semantic depth map
- the RGB image is input into the CCNet semantic segmentation model to obtain a two-class semantic segmentation mask, where 0 represents the ground pixel and 1 represents the background.
- The semantic segmentation mask and the depth map are multiplied pixel by pixel to obtain the semantic depth map, where the value of each pixel is the depth distance under the current viewing angle; the interfering ground depth values are removed at the same time;
- Step 3 Depth slicing and data enhancement module
- The pooling window size is (240, 1) and the step size is 1; each pooling operation selects the minimum value in the window as output. After pooling 640 times (once per image column), data of size (1, 640) is obtained: the "pseudo laser" data.
- "Pseudo-laser” not only retains the advantages of simple, easy-to-learn, and easy-to-transfer Lidar data, but also retains the semantic information in the visual image. Since the data is obtained from a two-dimensional image through a minimum pooling operation, it can fully perceive the complex obstacles in the environment, encode semantics into the laser of each dimension, and support the subsequent efficient reinforcement learning and the implementation of safe obstacle avoidance strategies.
- Sensor data obtained in a virtual environment is often perfect, but in a real environment, when one object partially occludes another, the observation usually has large errors near the object boundary; such noise reduces the accuracy of the algorithm or even makes it fail. Therefore, a data enhancement method is introduced: noise interference is applied to the observation data of the virtual environment during training.
- To identify noise boundaries in the training laser measurements, a boundary is assumed wherever the difference between two adjacent values in the vector exceeds a threshold of 0.5; the values around the two adjacent endpoints are then replaced by linear interpolation with a window size of (1, 8).
- Gaussian white noise with a variance of 0.08 is adaptively added to all laser observation data. This data enhancement allows a strategy trained in a virtual environment to be transferred directly to real scenes full of noise.
- Step 4 Control decision stage
- the deep reinforcement learning module adopts the PPO algorithm, and the network structure is composed of 3 layers of convolutional layers and 3 layers of fully connected layers.
- the state input includes three parts: observation data, target distance, and speed.
- the observation data is the "pseudo laser" data obtained in step 3, and the distance and speed of the target point are obtained by the onboard odometer of the robot.
- Direct fusion and indirect fusion are the two commonly used approaches to combining the multi-modal inputs.
- Direct fusion in the channel dimension is not conducive to learning obstacle avoidance strategies.
- Blind indirect extraction ignores useful information in the observation data and captures useless information.
- a feature extraction guidance layer is proposed.
- The data features of the three modalities are extracted and fused by three convolutional layers; a feature mask is then obtained through sigmoid activation and multiplied with the "pseudo laser" observation data, and the result is sent to the deep reinforcement learning module. This combines the advantages of both previous methods.
- the information that is helpful for obstacle avoidance strategies is extracted from the multi-modal data, and then it is combined with the observation data so that the subsequent feature extraction process is more targeted and the convergence of the network is accelerated.
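A minimal PyTorch sketch of the feature extraction guidance layer described above. Channel counts, kernel sizes, and the broadcasting of the target and velocity inputs to the laser's length are assumptions; the text fixes only the three-conv fusion, the sigmoid mask, and the multiplication with the pseudo-laser observation.

```python
import torch
import torch.nn as nn

class FeatureGuidance(nn.Module):
    """Fuses three input modalities with three conv layers, produces a
    sigmoid mask, and reweights the pseudo-laser observation with it.
    Channel sizes and kernel widths are illustrative assumptions."""
    def __init__(self, in_ch=3, hidden=32):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv1d(in_ch, hidden, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(hidden, 1, kernel_size=3, padding=1),
        )

    def forward(self, pseudo_laser, target_info, velocity):
        # Stack the three modalities as channels: (B, 3, N). Broadcasting
        # target_info and velocity (each (B, 1)) to the laser's length is
        # one plausible way to align the modalities.
        n = pseudo_laser.shape[-1]
        x = torch.stack([
            pseudo_laser,
            target_info.expand(-1, n),
            velocity.expand(-1, n),
        ], dim=1)
        mask = torch.sigmoid(self.fuse(x))       # (B, 1, N), values in (0, 1)
        return pseudo_laser.unsqueeze(1) * mask  # reweighted observation
```

Each laser dimension is thus scaled by a learned attention value in (0, 1) before entering the reinforcement learning network.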
- Because a monocular RGB camera is used as the sensor, the robot has a narrow forward field of view of 60°. Therefore, the second fully connected layer of the reinforcement learning network is replaced with an LSTM layer, adding temporal correlation to the reinforcement learning module so that the robot can make decisions based on all observations along the entire path.
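The resulting policy network, with the second fully connected layer replaced by an LSTM, might look like the following sketch; all layer widths, kernel sizes, and strides are illustrative assumptions, since the text specifies only "3 conv layers + 3 FC layers" and the LSTM substitution.

```python
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """3 conv layers + 3 FC layers, with the second FC replaced by an LSTM
    so decisions can depend on the whole path. Sizes are illustrative."""
    def __init__(self, laser_dim=640, aux_dim=3, hidden=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, 7, 3), nn.ReLU(),
            nn.Conv1d(16, 32, 5, 2), nn.ReLU(),
            nn.Conv1d(32, 32, 3, 2), nn.ReLU(),
            nn.Flatten(),
        )
        # Dry run to compute the flattened conv output size.
        conv_out = self.conv(torch.zeros(1, 1, laser_dim)).shape[-1]
        self.fc1 = nn.Linear(conv_out + aux_dim, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)  # replaces FC2
        self.fc3 = nn.Linear(hidden, 2)  # linear and angular velocity

    def forward(self, laser, aux, state=None):
        # laser: (B, 640) pseudo-laser; aux: (B, 3) target distance/angle, speed
        feat = self.conv(laser.unsqueeze(1))
        x = torch.relu(self.fc1(torch.cat([feat, aux], dim=-1)))
        x, state = self.lstm(x.unsqueeze(1), state)  # carry state along the path
        return self.fc3(x.squeeze(1)), state
```

Passing the returned `state` back in on the next step is what lets the LSTM accumulate observations over the whole trajectory.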
- Step 5 Form a monocular obstacle avoidance navigation network and output decision results
- Steps 2, 3, and 4 are connected: the input image from the monocular RGB camera is processed to obtain the depth map and semantic mask, which are multiplied pixel by pixel and then cropped.
- After the dynamic minimum pooling operation, the "pseudo laser" data is obtained.
- The "pseudo laser" data of three consecutive frames is input into the reinforcement learning network together with the distance and speed of the target point.
- The feature extraction guidance layer applies a different degree of attention to each dimension of the "pseudo laser" data.
- The LSTM layer adds temporal correlation over the entire path, and finally the robot's decision action at the current moment is output, achieving autonomous obstacle avoidance and navigation.
Abstract
The invention discloses a monocular-camera-based method for robot obstacle avoidance in complex indoor scenes, belonging to the field of robot navigation and obstacle avoidance. The monocular obstacle avoidance navigation network of the present invention is composed of an environment perception stage and a control decision-making stage, and specifically includes a depth prediction module, a semantic mask module, a depth slicing module, a feature extraction guidance module, a reinforcement learning module, and data enhancement. The network takes monocular RGB images as input and, after obtaining the semantic depth map, performs a dynamic minimization operation to obtain "pseudo laser" data, which is used as the state input of reinforcement learning to generate the final robot decision-making action. The invention solves the difficulty of fully perceiving complex obstacles in indoor robot obstacle avoidance, which otherwise leads to obstacle avoidance failure, and helps the robot use the semantic information of the environment to remove interference from redundant pixels, thereby performing efficient reinforcement learning training and decision-making, with validity and applicability in different scenarios.
Description
The disclosure belongs to the field of robot navigation and obstacle avoidance; its concrete result is autonomous navigation and obstacle avoidance for robots, and it particularly relates to a method for fully and effectively perceiving complex obstacles.
In the robot obstacle avoidance task, the robot must navigate autonomously to the target point in a complex scene without colliding with any obstacle, which has great practical application value. With the rapid development of artificial intelligence technology, robot obstacle-avoidance applications, such as sweeping robots, autonomous driving, smart warehouses, and smart logistics, have achieved significant performance improvements.
However, indoor obstacle avoidance scenes often contain complex obstacles, such as non-convex irregular objects like tables and chairs, black metal objects, and obstacles such as clothing lying flat on the ground. These objects seriously affect traditional one-dimensional Lidar obstacle avoidance strategies, yet no related research deals with such objects. Their presence prevents the Lidar system from fully perceiving the environment and thus renders the navigation and obstacle avoidance system ineffective. Specifically, for irregular objects such as tables, one-dimensional Lidar can only perceive the table legs, which makes the robot mistakenly believe the gap between the legs is passable, so a taller robot will collide with the tabletop. Black metal objects seriously interfere with the Lidar by absorbing the emitted laser light, rendering it completely ineffective. For complex ground obstacles, traditional methods cannot perceive low obstacles, even ones embedded in the ground, such as swimming pools, which are nonetheless impassable. Therefore, fully and efficiently perceiving complex obstacles is an urgent task in the field of robot obstacle avoidance.
Most existing robot obstacle avoidance navigation methods use deep reinforcement learning, which is popular because it learns autonomously without manually collected labeled data sets. Reinforcement learning is a "trial and error" process: policies are usually learned in a virtual environment and then transferred to real scenes. To narrow the gap between virtual and reality, Lidar data, with its simple format and ease of learning, is usually used. However, Lidar data cannot fully perceive complex obstacles and cannot support efficient obstacle avoidance strategies.
Some recent work abandons the Lidar sensor, uses the depth and color maps captured by an RGB-D camera as input, maps them directly to actions, and trains reinforcement learning end to end. Compared with laser data, images carry rich semantic information, but also much redundant information that does not help obstacle avoidance, which makes reinforcement learning algorithms hard to converge and train, widens the gap between virtual and reality, and makes strategies hard to transfer. In addition, depth cameras produce heavy noise in sunlit indoor environments and can fail almost completely. The traditional method of mapping the depth map to a point cloud to remove ground interference cannot perceive low obstacles on the ground, such as clothing or swimming pools. Therefore, methods based on RGB-D end-to-end learning also have many problems: they cannot fully perceive a complex indoor environment and may even fail to navigate and avoid obstacles safely.
Therefore, based on investigation and analysis of existing obstacle avoidance navigation technology, the present disclosure combines the advantages of Lidar and RGB cameras while discarding their disadvantages, and constructs "pseudo laser" data, thereby realizing the autonomous navigation and obstacle avoidance task in complex scenes. The input of the method is the image taken by the monocular RGB camera on the robot platform, and the output is the action the robot should take, consisting of linear and angular velocity. The method can effectively perceive different types of complex obstacles in indoor scenes, helping the reinforcement learning module learn and decide efficiently.
Summary of the Disclosure
The purpose of the present disclosure is to realize an efficient robot obstacle avoidance method by mapping monocular RGB images in complex scenes to "pseudo laser" data. The method includes an environment perception stage and a control decision stage. The environment perception stage includes a depth prediction module, a semantic segmentation module, and a depth slicing module; the control decision stage includes a feature extraction guidance module and a reinforcement learning module. The method of the present disclosure is suitable for complex obstacles of various shapes and sizes.
The technical solution of the present disclosure is:
A method for obstacle avoidance of robot in the complex indoor scene based on monocular camera, the method includes the following steps:
Step 1, Loading robot simulation model and building a training and testing simulation environment
In order to solve the obstacle avoidance problem in complex scenes, using the URDF model of the TurtleBot-ROS robot as the experimental robot; using Block, Crossing, and Passing in ROS-Stage as the training environment, and deploying 24 identical TurtleBot-ROS robots for distributed control decision module training; using the cafe environment in ROS-Gazebo as the background of the test scene, and manually adding complex obstacles in Gazebo to test the effectiveness of the entire visual system;
Step 2, Getting the semantic depth map
Obtaining an RGB image from the monocular camera carried by the TurtleBot-ROS robot, and inputting the RGB image into the Fastdepth depth prediction network to obtain the depth map under the current field of view; selecting the lower half of the depth map as an intermediate result; the ground pixel information in the intermediate result will interfere with obstacle avoidance, resulting in obstacle avoidance failure, so inputting the RGB image into the CCNet semantic segmentation model to obtain a two-class semantic segmentation mask, where 0 represents a ground pixel and 1 represents the background; multiplying the semantic segmentation mask and the depth map pixel by pixel to obtain the semantic depth map, where the value of each pixel in the semantic depth map is the depth distance at the current viewing angle, while at the same time removing the disturbing ground depth values;
Step 3, Depth slicing and data enhancement
Performing a dynamic minimum pooling operation on the depth value pixels in the semantic depth map, with a pooling window size of (240, 1) and a step size of 1; selecting the minimum value in the window of each pooling operation as the output, and performing the pooling operation on each column of the image; the result obtained is the "pseudo laser" data;
By introducing a data enhancement method, applying noise interference to the observation data of the virtual environment during training; in order to identify noise boundaries from the training laser measurements, assuming that a boundary exists wherever the difference between two adjacent values in the vector is greater than a threshold of 0.5; replacing the values around the two adjacent endpoints by linear interpolation with a window size of (1, 8); and at the same time, for all laser observation data, adaptively adding Gaussian white noise with a variance of 0.08;
Step 4, Controlling decision stage
After obtaining the "pseudo laser" data, placing the "pseudo laser" data for three consecutive moments in three channels, and using the formed tensor as the input of the deep reinforcement learning module, so that the experimental robot can effectively perceive the optical-flow effect of dynamic obstacles over a short period of time and thus make correct decisions about dynamic obstacles;
The deep reinforcement learning module adopts the PPO algorithm, and its network structure is composed of 3 convolutional layers and 3 fully connected layers; in order for the experimental robot to reach the target position steadily and safely, the state input includes three parts: the observation data, the target point distance, and the speed; the observation data is the "pseudo laser" data obtained in step 3, while the distance to the target point and the speed are obtained from the onboard odometer of the robot; proposing a feature extraction guidance layer, extracting and fusing the data features of the three modes through three convolutional layers, then obtaining a feature mask through sigmoid activation, multiplying it with the "pseudo laser" observation data, and sending the result to the deep reinforcement learning module; extracting information helpful to the obstacle avoidance strategy from the multi-modal data and combining it with the "pseudo laser" observation data makes the subsequent feature extraction process more targeted and speeds up the convergence of the network;
Modifying the second fully connected layer of the deep reinforcement learning module into an LSTM layer, increasing the temporal correlation of the deep reinforcement learning module and enabling the experimental robot to make decisions based on all observations along the entire path;
Step 5, Forming a monocular obstacle avoidance navigation network and outputting decision results
Splicing steps 2, 3, and 4: obtaining the input image from the monocular RGB camera; after processing, obtaining the depth map and the semantic segmentation mask, multiplying them pixel by pixel and cropping; after the dynamic minimum pooling operation, obtaining the "pseudo laser" observation data; inputting three consecutive frames of "pseudo laser" observation data into the deep reinforcement learning module together with the distance and speed of the target point; after the feature extraction guidance layer, different attention is paid to each dimension of the "pseudo laser" observation data; after multi-layer convolution, pooling, and full connection, the LSTM layer is used to increase the temporal correlation over the entire path, and finally the decision-making action of the robot at the current moment is output, so as to achieve autonomous obstacle avoidance and navigation.
The beneficial effects of the present disclosure:
(1) Obstacle avoidance test results and efficiency
The disclosure solves the difficulty of fully perceiving complex obstacles (non-convex irregular obstacles, ferrous metals, complex ground obstacles) in the obstacle avoidance task of a robot in an indoor environment, a difficulty that otherwise leads to obstacle avoidance failure, and helps the robot use the semantic information of the environment to remove the interference of redundant pixels, enabling efficient reinforcement learning training and decision-making. The present disclosure proposes a reinforcement learning mapping method from a single RGB image directly to the robot's obstacle avoidance navigation action. The method relies on "pseudo laser" data and performs efficient decision-making by encoding semantic information into the laser data, and its accuracy is demonstrated through comparative experiments. In the comparative experiments, the method obtained the best performance on both commonly used indicators, average success rate and average time, and has great advantages in complex scenarios.
(2) Broad applicability
The disclosure is suitable for obstacle avoidance and navigation tasks in different complex indoor scenes: (a) scenes containing non-convex irregular obstacles; (b) scenes containing obstacles of black, smooth metallic material; (c) scenes containing messy clothing on the ground and obstacles such as glass and swimming pools. The effectiveness and applicability of the method in these different scenarios are demonstrated.
Description of the Drawings
Figure 1 is the network structure of the present disclosure.
Figure 2 is the visualization result of the experiment of the embodiment of the present disclosure.
The specific embodiments of the present disclosure will be further described below in conjunction with the drawings and technical solutions.
This method uses PPO as the framework of deep reinforcement learning. The state includes the "pseudo laser" data, the distance to the target point, and the velocity at the previous moment; the action is composed of the linear velocity and angular velocity of the wheeled robot; the reward function includes a distance term at each moment (a positive return the closer the robot gets to the target, and vice versa), a reward of -15 if there is a collision, and a reward of 15 if the target point is reached. The robot is also discouraged from taking overly large actions at each step: its angular velocity cannot exceed 1.7 times the angular velocity at the previous moment.
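The reward terms above can be collected into a single shaping function. This is a minimal sketch: the collision and goal rewards (-15 and +15) are stated in the disclosure, but the scale of the distance-progress term (`shaping` below) is an assumption, since the text only says that moving closer yields a positive return and moving away a negative one.

```python
def reward(dist_prev, dist_now, collided, reached,
           goal_reward=15.0, collision_penalty=-15.0, shaping=2.5):
    """Per-step reward for the obstacle avoidance policy.

    dist_prev / dist_now : distance to the target at the previous and
                           current moment (from the onboard odometer)
    collided / reached   : terminal-event flags for this step

    The `shaping` scale is an illustrative assumption, not a value
    given in the disclosure.
    """
    if reached:
        return goal_reward
    if collided:
        return collision_penalty
    # Positive when the robot moved closer to the target, negative otherwise.
    return shaping * (dist_prev - dist_now)
```

The angular-velocity limit (no more than 1.7 times the previous step's value) would typically be enforced as an action clip before the command is sent to the robot, rather than inside the reward.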
The reinforcement learning algorithm is implemented in PyTorch. Stochastic gradient descent is used to train the reinforcement learning network, with a momentum of 0.9, a weight decay of 1e-4, a learning rate of 5e-5, a decay factor of 0.99, a KL divergence parameter of 15e-4, and a maximum step size of 150. In the embodiment of the present disclosure, the learning process is terminated after 1.5 million training paths, and it takes about 40 hours to train the strategy on a computer equipped with an i7-7700 CPU and an NVIDIA GTX 1080Ti GPU. To verify the effectiveness of the network, it is compared with the traditional method ORCA and with a recent learning-based multi-robot distributed obstacle avoidance strategy. Ablation experiments are also performed on all the modules proposed in the network to prove the effectiveness of each part.
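The optimizer settings listed above translate directly into a PyTorch configuration. In this sketch `policy_net` is a stand-in module for illustration only; the actual PPO network (3 convolutional plus 3 fully connected layers) is described in step 4.

```python
import torch

# Stand-in for the PPO policy network described in the disclosure;
# the real network has 3 convolutional and 3 fully connected layers.
policy_net = torch.nn.Linear(640, 2)

# SGD with the hyperparameters given in the embodiment:
# momentum 0.9, weight decay 1e-4, learning rate 5e-5.
optimizer = torch.optim.SGD(policy_net.parameters(),
                            lr=5e-5, momentum=0.9, weight_decay=1e-4)
```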
Figure 1 is the network structure of the monocular obstacle avoidance navigation network. The network is composed of an environment perception stage and a control decision stage, which specifically includes a depth prediction module, a semantic mask module, a depth slicing module, a feature extraction guidance module, a reinforcement learning module, and data enhancement. The network takes monocular RGB images as input and, after obtaining the semantic depth map, performs a dynamic minimum pooling operation to obtain the "pseudo laser" data, which is used as the state input of reinforcement learning to generate the final robot decision-making action.
Figure 2 shows the process visualization results of the monocular visual obstacle avoidance navigation framework, in which column (A) is a chair obstacle scene; column (B) is a table obstacle scene; column (C) is a clothing obstacle scene; and column (D) is a glass obstacle scene. The monocular camera on the robot platform captures the RGB image, predicts the semantic depth map, and then slices it to generate the "pseudo laser" data. The comparison between the last two rows of "pseudo laser" data and Lidar data shows that the "pseudo laser" data can capture more complete environmental information, allowing efficient reinforcement learning training and better environmental interaction.
A method for obstacle avoidance of robot in the complex indoor scene based on monocular camera, the method includes the following steps:
Step 1 Load the robot simulation model and build a training and testing simulation environment
In order to solve the obstacle avoidance problem in complex scenes, the URDF model of the TurtleBot-ROS robot is used as the experimental robot; Block, Crossing, and Passing in ROS-Stage are used as the training environment, and 24 identical TurtleBot-ROS robots are deployed for distributed training of the control decision module; the cafe environment in ROS-Gazebo is used as the background of the test scene, and complex obstacles (tables, chairs, wardrobes, moving pedestrians, etc.) are manually added in Gazebo to test the effectiveness of the entire vision system;
Step 2 Get the semantic depth map
The RGB image is obtained from the monocular camera carried by the TurtleBot-ROS robot and input into the Fastdepth depth prediction network to obtain the depth map under the current field of view; the lower half of the depth map is selected as an intermediate result; the ground pixel information in this result will interfere with obstacle avoidance, resulting in obstacle avoidance failure, so the RGB image is input into the CCNet semantic segmentation model to obtain a two-class semantic segmentation mask, where 0 represents a ground pixel and 1 represents the background. The semantic segmentation mask and the depth map are multiplied pixel by pixel to obtain the semantic depth map, where the value of each pixel in the semantic depth map is the depth distance at the current viewing angle; at the same time, the disturbing ground depth values are removed;
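The combination of depth map and segmentation mask can be sketched in a few lines of NumPy. The network names (Fastdepth, CCNet) and the lower-half crop come from the disclosure; the helper below is an illustrative sketch with an assumed 480x640 input resolution, not the patented implementation.

```python
import numpy as np

def semantic_depth_map(depth, ground_mask):
    """Combine a predicted depth map with a binary segmentation mask.

    depth       : (H, W) float array from the depth-prediction network
    ground_mask : (H, W) array, 0 for ground pixels, 1 for background
    Returns the lower half of the masked depth map, with ground
    depth values zeroed out by the pixel-wise product.
    """
    h = depth.shape[0]
    # Keep only the lower half of the view, where nearby obstacles appear.
    depth_lower = depth[h // 2:]
    mask_lower = ground_mask[h // 2:]
    # Pixel-wise multiplication removes the disturbing ground depths.
    return depth_lower * mask_lower
```

For a 480x640 depth map this yields the (240, 640) semantic depth map that the dynamic minimum pooling of step 3 operates on.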
Step 3 Depth slicing and data enhancement
A dynamic minimum pooling operation is performed on the depth value pixels in the semantic depth map with a pooling window size of (240, 1) and a step size of 1; each pooling operation selects the minimum value in the window as the output. After pooling 640 times, data with a size of (1, 640) is obtained, which is the "pseudo laser" data. The "pseudo laser" not only retains the advantages of Lidar data (simple, easy to learn, and easy to transfer) but also retains the semantic information of the visual image. Since the data is obtained from a two-dimensional image through a minimum pooling operation, it can fully perceive the complex obstacles in the environment, encode semantics into each laser dimension, and support the subsequent efficient reinforcement learning and the implementation of a safe obstacle avoidance strategy.
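The column-wise minimum pooling above reduces each (240, 1) column to a single range value. One detail the disclosure leaves implicit is how the zeroed-out ground pixels interact with the minimum: this sketch assumes they are treated as "no return" so they cannot masquerade as zero-distance obstacles.

```python
import numpy as np

def pseudo_laser(semantic_depth):
    """Dynamic minimum pooling with window (240, 1) and stride 1.

    For a (240, 640) semantic depth map this yields the (640,)
    "pseudo laser" scan: one ray per image column.  Treating masked
    ground pixels (value 0) as "no return" is an assumption of this
    sketch, not a step stated in the disclosure.
    """
    d = semantic_depth.astype(float).copy()
    d[d == 0] = np.inf          # ignore masked-out ground pixels
    scan = d.min(axis=0)        # min over each column = one laser ray
    scan[np.isinf(scan)] = 0.0  # columns that were entirely ground
    return scan
```

Each entry of the scan is the depth of the closest non-ground pixel in that column, which is why low obstacles such as clothing still register even though a real planar Lidar would miss them.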
The sensor data obtained in a virtual environment is often perfect, but in a real environment, if part of one object occludes another object, the observation usually contains an error near the object boundary. Large noise will reduce the accuracy of the algorithm or even cause it to fail. Therefore, a data enhancement method is introduced, and noise interference is applied to the observation data of the virtual environment during training. In order to identify noise boundaries from the training laser measurements, it is assumed that a boundary exists wherever the difference between two adjacent values in the vector is greater than a threshold of 0.5, and the values around the two adjacent endpoints are replaced by linear interpolation with a window size of (1, 8). At the same time, Gaussian white noise with a variance of 0.08 is adaptively added to all laser observation data. This data enhancement method enables the strategy, although trained in a virtual environment, to be directly transferred and adapted to real scenes full of noise.
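The two augmentation steps (boundary smoothing and Gaussian noise) can be sketched as follows. The 0.5 threshold, the window size of 8, and the 0.08 variance come from the disclosure; how exactly the interpolation window is centred on a boundary is an assumption of this sketch.

```python
import numpy as np

def augment_scan(scan, boundary_thresh=0.5, window=8, noise_var=0.08,
                 rng=None):
    """Noise augmentation for simulated "pseudo laser" scans.

    1. Wherever two adjacent values differ by more than
       `boundary_thresh`, an occlusion boundary is assumed and the
       values in a window of size `window` around it are replaced by
       linear interpolation (centring the window on the boundary is
       an assumption of this sketch).
    2. Gaussian white noise with variance `noise_var` is added to
       every ray.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = scan.astype(float).copy()
    n = len(out)
    for i in np.flatnonzero(np.abs(np.diff(out)) > boundary_thresh):
        lo = max(0, i - window // 2)
        hi = min(n - 1, i + 1 + window // 2)
        # Replace the window around the boundary with a linear ramp.
        out[lo:hi + 1] = np.linspace(out[lo], out[hi], hi - lo + 1)
    out += rng.normal(0.0, np.sqrt(noise_var), size=n)
    return out
```

Applied to every training observation, this lets the policy see boundary artifacts and sensor noise during simulation, which is what allows direct transfer to real scenes.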
Step 4 Control decision stage
After obtaining the "pseudo laser" data, the "pseudo laser" data at three consecutive moments are placed in three channels to form a tensor with a size of (3, 640) as the input of the deep reinforcement learning module, which enables the robot to effectively perceive the optical-flow effect of dynamic obstacles over a short period of time and thus make correct decisions about dynamic obstacles.
The deep reinforcement learning module adopts the PPO algorithm, and the network structure is composed of 3 convolutional layers and 3 fully connected layers. In order for the robot to reach the target position steadily and safely, the state input includes three parts: the observation data, the target distance, and the speed. The observation data is the "pseudo laser" data obtained in step 3, and the distance to the target point and the speed are obtained from the onboard odometer of the robot. Currently, there are two commonly used fusion methods: direct fusion and indirect fusion. However, because the information comes from different modes, direct fusion in the channel dimension is not conducive to learning obstacle avoidance strategies, while blind indirect extraction ignores useful information in the observation data and captures useless information. To this end, a feature extraction guidance layer is proposed. The data features of the three modalities are extracted and fused by three convolutional layers, a feature mask is then obtained by sigmoid activation and multiplied with the "pseudo laser" observation data, and the result is sent to the deep reinforcement learning module. This combines the advantages of both previous methods: the information helpful to the obstacle avoidance strategy is extracted from the multi-modal data and then combined with the observation data, so that the subsequent feature extraction process is more targeted and the convergence of the network is accelerated.
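A PyTorch sketch of the feature extraction guidance layer follows. The disclosure fixes the three convolutional layers, the sigmoid gating, and the multiplication with the "pseudo laser" observation; the channel widths, kernel sizes, and the way the low-dimensional goal/velocity state is broadcast along the scan are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FeatureGuidance(nn.Module):
    """Sketch of the feature extraction guidance layer.

    Inputs: the stacked pseudo-laser tensor (B, 3, 640), i.e. three
    consecutive scans in three channels, plus a low-dimensional state
    vector (target distance/angle and velocity).  Channel widths and
    kernel sizes are assumptions; only the three convolutions and the
    sigmoid mask are fixed by the disclosure.
    """
    def __init__(self, scan_len=640, state_dim=4):
        super().__init__()
        # Broadcast the state along the scan as one extra channel.
        self.state_proj = nn.Linear(state_dim, scan_len)
        self.convs = nn.Sequential(
            nn.Conv1d(3 + 1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(16, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(16, 1, kernel_size=5, padding=2),
        )

    def forward(self, scans, state):
        s = self.state_proj(state).unsqueeze(1)           # (B, 1, 640)
        fused = torch.cat([scans, s], dim=1)              # (B, 4, 640)
        mask = torch.sigmoid(self.convs(fused))           # (B, 1, 640)
        # Gate each pseudo-laser dimension by the learned attention mask.
        return scans * mask
```

The gated output keeps the (3, 640) shape, so it drops into the PPO network in place of the raw observation; because the mask lies in (0, 1), the layer can only attenuate rays, focusing the subsequent feature extraction on the dimensions relevant to obstacle avoidance.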
Because a monocular RGB camera is used as the sensor, the robot has a small forward field of view of 60°. Therefore, the second fully connected layer of the reinforcement learning network is modified into an LSTM layer to increase the temporal correlation of the reinforcement learning module, so that the robot can make decisions based on all observations along the entire path.
Step 5 Form a monocular obstacle avoidance navigation network and output decision results
Claims (1)
- A method for obstacle avoidance of a robot in a complex indoor scene based on a monocular camera, wherein the method includes the following steps:
step 1, loading a robot simulation model and building a training and testing simulation environment: in order to solve the obstacle avoidance problem in complex scenes, using the URDF model of the TurtleBot-ROS robot as the experimental robot; using Block, Crossing, and Passing in ROS-Stage as the training environment, and deploying 24 identical TurtleBot-ROS robots for distributed control decision module training; using the cafe environment in ROS-Gazebo as the background of the test scene, and manually adding complex obstacles in Gazebo to test the effectiveness of the entire visual system;
step 2, getting the semantic depth map: obtaining an RGB image from the monocular camera carried by the TurtleBot-ROS robot, and inputting the RGB image into the Fastdepth depth prediction network to obtain the depth map under the current field of view; selecting the lower half of the depth map as an intermediate result; the ground pixel information in the intermediate result will interfere with obstacle avoidance, resulting in obstacle avoidance failure, so inputting the RGB image into the CCNet semantic segmentation model to obtain a two-class semantic segmentation mask, where 0 represents a ground pixel and 1 represents the background; multiplying the semantic segmentation mask and the depth map pixel by pixel to obtain the semantic depth map, where the value of each pixel in the semantic depth map is the depth distance at the current viewing angle, while at the same time removing the disturbing ground depth values;
step 3, depth slicing and data enhancement: performing a dynamic minimum pooling operation on the depth value pixels in the semantic depth map, with a pooling window size of (240, 1) and a step size of 1; selecting the minimum value in the window of each pooling operation as the output, and performing the pooling operation on each column of the image, the result obtained being the "pseudo laser" data; by introducing a data enhancement method, applying noise interference to the observation data of the virtual environment during training; in order to identify noise boundaries from the training laser measurements, assuming that a boundary exists wherever the difference between two adjacent values in the vector is greater than a threshold of 0.5; replacing the values around the two adjacent endpoints by linear interpolation with a window size of (1, 8); and at the same time, for all laser observation data, adaptively adding Gaussian white noise with a variance of 0.08;
step 4, controlling the decision stage: after obtaining the "pseudo laser" data, placing the "pseudo laser" data for three consecutive moments in three channels, and using the formed tensor as the input of the deep reinforcement learning module, so that the experimental robot can effectively perceive the optical-flow effect of dynamic obstacles over a short period of time and make correct decisions about dynamic obstacles; the deep reinforcement learning module adopts the PPO algorithm, and its network structure is composed of 3 convolutional layers and 3 fully connected layers; in order for the experimental robot to reach the target position steadily and safely, the state input includes three parts: the observation data, the target point distance, and the speed; the observation data is the "pseudo laser" data obtained in step 3, and the distance to the target point and the speed are obtained from the onboard odometer of the robot; proposing a feature extraction guidance layer, extracting and fusing the data features of the three modes through three convolutional layers, then obtaining a feature mask through sigmoid activation, multiplying it with the "pseudo laser" observation data, and sending the result to the deep reinforcement learning module; extracting information helpful to the obstacle avoidance strategy from the multi-modal data and combining it with the "pseudo laser" observation data to make the subsequent feature extraction process more targeted and speed up the convergence of the network; modifying the second fully connected layer of the deep reinforcement learning module into an LSTM layer, increasing the temporal correlation of the deep reinforcement learning module, and enabling the experimental robot to make decisions based on all observations along the entire path;
step 5, forming a monocular obstacle avoidance navigation network and outputting decision results: splicing steps 2, 3, and 4; obtaining the input image from the monocular RGB camera; after processing, obtaining the depth map and the semantic segmentation mask, multiplying them pixel by pixel and cropping; after the dynamic minimum pooling operation, obtaining the "pseudo laser" observation data; inputting three consecutive frames of "pseudo laser" observation data into the deep reinforcement learning module together with the distance and speed of the target point; after the feature extraction guidance layer, different attention is paid to each dimension of the "pseudo laser" observation data; after multi-layer convolution, pooling, and full connection, the LSTM layer is used to increase the temporal correlation over the entire path, and finally the decision-making action of the robot at the current moment is output, so as to achieve autonomous obstacle avoidance and navigation.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110106801.6 | 2021-01-27 | ||
CN202110106801.6A CN112767373B (en) | 2021-01-27 | 2021-01-27 | Robot indoor complex scene obstacle avoidance method based on monocular camera |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022160430A1 true WO2022160430A1 (en) | 2022-08-04 |
Family
ID=75705880
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/081649 WO2022160430A1 (en) | 2021-01-27 | 2021-03-19 | Method for obstacle avoidance of robot in the complex indoor scene based on monocular camera |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112767373B (en) |
WO (1) | WO2022160430A1 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115416047A (en) * | 2022-09-02 | 2022-12-02 | 北京化工大学 | Blind assisting system and method based on multi-sensor quadruped robot |
CN116089798A (en) * | 2023-02-07 | 2023-05-09 | 华东理工大学 | Decoding method and device for finger movement |
CN117593517A (en) * | 2024-01-19 | 2024-02-23 | 南京信息工程大学 | Camouflage target detection method based on complementary perception cross-view fusion network |
CN117670184A (en) * | 2024-01-31 | 2024-03-08 | 埃罗德智能科技(辽宁)有限公司 | Robot scene simulation method and system applied to digital robot industrial chain |
CN117707204A (en) * | 2024-01-30 | 2024-03-15 | 清华大学 | Unmanned aerial vehicle high-speed obstacle avoidance system and method based on photoelectric end-to-end network |
CN117697769A (en) * | 2024-02-06 | 2024-03-15 | 成都威世通智能科技有限公司 | Robot control system and method based on deep learning |
CN117830991A (en) * | 2024-03-04 | 2024-04-05 | 山东大学 | Multimode fusion-based four-foot robot complex scene sensing method and system |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113240723A (en) * | 2021-05-18 | 2021-08-10 | 中德(珠海)人工智能研究院有限公司 | Monocular depth estimation method and device and depth evaluation equipment |
CN113593035A (en) * | 2021-07-09 | 2021-11-02 | 清华大学 | Motion control decision generation method and device, electronic equipment and storage medium |
CN114037050B (en) * | 2021-10-21 | 2022-08-16 | 大连理工大学 | Robot degradation environment obstacle avoidance method based on internal plasticity of pulse neural network |
CN114581684B (en) * | 2022-01-14 | 2024-06-18 | 山东大学 | Active target tracking method, system and equipment based on semantic space-time representation learning |
CN114526738B (en) * | 2022-01-25 | 2023-06-16 | 中国科学院深圳先进技术研究院 | Mobile robot visual navigation method and device based on deep reinforcement learning |
CN115805595B (en) * | 2023-02-09 | 2023-12-26 | 白杨时代(北京)科技有限公司 | Robot navigation method and device and sundry cleaning robot |
CN116382267B (en) * | 2023-03-09 | 2023-09-05 | 大连理工大学 | Robot dynamic obstacle avoidance method based on multi-mode pulse neural network |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107553490A (en) * | 2017-09-08 | 2018-01-09 | 深圳市唯特视科技有限公司 | A kind of monocular vision barrier-avoiding method based on deep learning |
CN109871011A (en) * | 2019-01-15 | 2019-06-11 | 哈尔滨工业大学(深圳) | A kind of robot navigation method based on pretreatment layer and deeply study |
WO2020056299A1 (en) * | 2018-09-14 | 2020-03-19 | Google Llc | Deep reinforcement learning-based techniques for end to end robot navigation |
US20200166896A1 (en) * | 2018-11-26 | 2020-05-28 | Uber Technologies, Inc. | Deep reinforcement learning based models for hard-exploration problems |
CN111367282A (en) * | 2020-03-09 | 2020-07-03 | 山东大学 | Robot navigation method and system based on multimode perception and reinforcement learning |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104299244B (en) * | 2014-09-26 | 2017-07-25 | 东软集团股份有限公司 | Obstacle detection method and device based on monocular camera |
CN107478220B (en) * | 2017-07-26 | 2021-01-15 | 中国科学院深圳先进技术研究院 | Unmanned aerial vehicle indoor navigation method and device, unmanned aerial vehicle and storage medium |
CN111611869B (en) * | 2020-04-25 | 2021-06-01 | 哈尔滨理工大学 | End-to-end monocular vision obstacle avoidance method based on serial deep neural network |
CN111311685B (en) * | 2020-05-12 | 2020-08-07 | 中国人民解放军国防科技大学 | Motion scene reconstruction unsupervised method based on IMU and monocular image |
CN111949032A (en) * | 2020-08-18 | 2020-11-17 | 中国科学技术大学 | 3D obstacle avoidance navigation system and method based on reinforcement learning |
-
2021
- 2021-01-27 CN CN202110106801.6A patent/CN112767373B/en active Active
- 2021-03-19 WO PCT/CN2021/081649 patent/WO2022160430A1/en active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107553490A (en) * | 2017-09-08 | 2018-01-09 | 深圳市唯特视科技有限公司 | A kind of monocular vision barrier-avoiding method based on deep learning |
WO2020056299A1 (en) * | 2018-09-14 | 2020-03-19 | Google Llc | Deep reinforcement learning-based techniques for end to end robot navigation |
US20200166896A1 (en) * | 2018-11-26 | 2020-05-28 | Uber Technologies, Inc. | Deep reinforcement learning based models for hard-exploration problems |
CN109871011A (en) * | 2019-01-15 | 2019-06-11 | 哈尔滨工业大学(深圳) | A kind of robot navigation method based on pretreatment layer and deeply study |
CN111367282A (en) * | 2020-03-09 | 2020-07-03 | 山东大学 | Robot navigation method and system based on multimode perception and reinforcement learning |
Non-Patent Citations (1)
Title |
---|
LINGPING GAO: "Agent Environment Perception and Control Decision Based on Deep Reinforcement Learning", MASTER THESIS, 13 May 2020 (2020-05-13), pages 1 - 60, XP055957664, DOI: 10.26991/d.cnki.gdllu.2020.002070 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115416047A (en) * | 2022-09-02 | 2022-12-02 | 北京化工大学 | Blind assisting system and method based on multi-sensor quadruped robot |
CN116089798A (en) * | 2023-02-07 | 2023-05-09 | 华东理工大学 | Decoding method and device for finger movement |
CN117593517A (en) * | 2024-01-19 | 2024-02-23 | 南京信息工程大学 | Camouflage target detection method based on complementary perception cross-view fusion network |
CN117593517B (en) * | 2024-01-19 | 2024-04-16 | 南京信息工程大学 | Camouflage target detection method based on complementary perception cross-view fusion network |
CN117707204A (en) * | 2024-01-30 | 2024-03-15 | 清华大学 | Unmanned aerial vehicle high-speed obstacle avoidance system and method based on photoelectric end-to-end network |
CN117670184A (en) * | 2024-01-31 | 2024-03-08 | 埃罗德智能科技(辽宁)有限公司 | Robot scene simulation method and system applied to digital robot industrial chain |
CN117670184B (en) * | 2024-01-31 | 2024-05-03 | 埃罗德智能科技(辽宁)有限公司 | Robot scene simulation method and system applied to digital robot industrial chain |
CN117697769A (en) * | 2024-02-06 | 2024-03-15 | 成都威世通智能科技有限公司 | Robot control system and method based on deep learning |
CN117697769B (en) * | 2024-02-06 | 2024-04-30 | 成都威世通智能科技有限公司 | Robot control system and method based on deep learning |
CN117830991A (en) * | 2024-03-04 | 2024-04-05 | 山东大学 | Multimode fusion-based four-foot robot complex scene sensing method and system |
CN117830991B (en) * | 2024-03-04 | 2024-05-24 | 山东大学 | Multimode fusion-based four-foot robot complex scene sensing method and system |
Also Published As
Publication number | Publication date |
---|---|
CN112767373B (en) | 2022-09-02 |
CN112767373A (en) | 2021-05-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022160430A1 (en) | Method for obstacle avoidance of robot in the complex indoor scene based on monocular camera | |
Ruan et al. | Mobile robot navigation based on deep reinforcement learning | |
US10824947B2 (en) | Learning method for supporting safer autonomous driving without danger of accident by estimating motions of surrounding objects through fusion of information from multiple sources, learning device, testing method and testing device using the same | |
EP3690727B1 (en) | Learning method and learning device for sensor fusion to integrate information acquired by radar capable of distance estimation and information acquired by camera to thereby improve neural network for supporting autonomous driving, and testing method and testing device using the same | |
Tai et al. | Towards cognitive exploration through deep reinforcement learning for mobile robots | |
JP7112752B2 (en) | Method for detecting pseudo 3D bounding box, method for testing the same, device using method for detecting pseudo 3D bounding box, and device for testing the same | |
Ou et al. | Autonomous quadrotor obstacle avoidance based on dueling double deep recurrent Q-learning with monocular vision | |
CN109964237A (en) | Picture depth prediction neural network | |
CN108230361A (en) | Enhance target tracking method and system with unmanned plane detector and tracker fusion | |
Sales et al. | Adaptive finite state machine based visual autonomous navigation system | |
Han et al. | Deep reinforcement learning for robot collision avoidance with self-state-attention and sensor fusion | |
Asadi et al. | Building an integrated mobile robotic system for real-time applications in construction | |
EP3686776B1 (en) | Method for detecting pseudo-3d bounding box to be used for military purpose, smart phone or virtual driving based on cnn capable of converting modes according to conditions of objects | |
CN105014675A (en) | Intelligent mobile robot visual navigation system and method in narrow space | |
Dionísio et al. | Nereon-an underwater dataset for monocular depth estimation | |
Zhu et al. | Autonomous reinforcement control of visual underwater vehicles: Real-time experiments using computer vision | |
Mishra et al. | A review on vision based control of autonomous vehicles using artificial intelligence techniques | |
Li et al. | Driver drowsiness behavior detection and analysis using vision-based multimodal features for driving safety | |
Khalil et al. | Integration of motion prediction with end-to-end latent RL for self-driving vehicles | |
Souza et al. | Template-based autonomous navigation in urban environments | |
CN114326821A (en) | Unmanned aerial vehicle autonomous obstacle avoidance system and method based on deep reinforcement learning | |
Wang et al. | Autonomous docking of the USV using deep reinforcement learning combine with observation enhanced | |
Adiuku et al. | Mobile robot obstacle detection and avoidance with NAV-YOLO | |
Xie et al. | Multiple autonomous robotic fish collaboration | |
Dong | Visual Guidance for Unmanned Aerial Vehicles with Deep Learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21922029 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21922029 Country of ref document: EP Kind code of ref document: A1 |