CN116295401A - Indoor pure vision robot obstacle sensing method - Google Patents
- Publication number: CN116295401A (application CN202310090323.3A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G01C21/206 — Instruments for performing navigational calculations specially adapted for indoor navigation
- G01S17/86 — Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
- G01S17/93 — Lidar systems specially adapted for anti-collision purposes
- G01S7/497 — Means for monitoring or calibrating
- G06T7/85 — Stereo camera calibration
- Y02T10/40 — Engine management systems
Abstract
The invention relates to the technical field of robots and discloses an indoor pure-vision robot obstacle sensing method. On the basis of calibrated camera-lidar extrinsic parameters, a mobile robot collects time-synchronized images and point clouds; a SLAM method densely reconstructs the scene from the images and point clouds and converts it into a bird's-eye view (BEV), providing obstacle ground truth for the images; a data set is then constructed with the images as input and the BEV obstacle values as output; a network model is trained on this data set; and the trained model is deployed on an autonomous robot that uses only cameras. By automatically generating the data set, the invention reduces manual involvement. A deep-learning method projects surrounding obstacle information onto the ground plane in bird's-eye view and supplies it to the planning module for obstacle collision avoidance, so the method is low-cost, robust, and applicable to a wide range of scenes.
Description
Technical Field
The invention relates to the technical field of robots, in particular to an indoor pure vision robot obstacle sensing method.
Background
In a robot's planning and navigation tasks, obstacles must be perceived in real time; this is a prerequisite for robot navigation. In an indoor environment the scene is relatively simple, so the problem can often be simplified to two dimensions: the obstacle information scanned by a single-line or multi-line laser is used directly as the boundary of the traversable area, within which safe motion of the robot can be guaranteed.
In existing indoor autonomous robot navigation systems, a combination of binocular cameras and lidar is mostly used. A binocular camera can provide three-dimensional information about surrounding objects, but suffers from a close-range blind zone and severely degraded depth accuracy at long range. Single-line lidars are cheaper than multi-line lidars but provide obstacle information only in a two-dimensional plane, while multi-line lidars can provide three-dimensional information but are expensive, and their point clouds are usually sparse. These sensors give autonomous mobile robots some degree of environmental awareness, but each is limited, and a single robot often needs to carry several binocular cameras and lidars, which complicates the hardware deployment of the autonomous robot.
Autonomous driving technology is now developing rapidly, and the pure-vision approach represented by Tesla iterates quickly, so it can already handle many autonomous driving tasks. Tesla applies neural networks to the input images to accomplish a variety of detection tasks, including object detection, lane-line detection, road-edge detection, and decisions for various scenarios. This technology mainly extracts features with a convolutional neural network (CNN), which consists of one or more convolutional layers and fully connected layers on top (corresponding to a classical neural network), together with associated weights and pooling layers. This structure lets a CNN exploit the two-dimensional structure of the input data; compared with other deep-learning architectures, CNNs give better results in image and speech recognition. The model can be trained with the back-propagation algorithm, and it requires fewer parameters than other deep feed-forward neural networks.
Using such neural networks, many tasks can be accomplished. The mainstream solution today is to detect obstacles from a bird's-eye-view (BEV) perspective, since most target objects in driving scenes lie on the same ground plane. Compared with the front view, targets in the BEV have a fixed size regardless of their position, so detection can be optimized using known typical object sizes.
At present, accurate obstacle collision avoidance in indoor scenes generally requires a single-line or multi-line lidar, but lidars bring high hardware cost, which hinders the deployment and popularization of autonomous navigation. Most obstacle sensing systems acquire environmental depth with binocular cameras according to the parallax principle, but these suffer from a close-range blind zone and severely degraded depth accuracy at long range.
Disclosure of Invention
The invention aims to provide an indoor pure-vision robot obstacle sensing method that integrates the experimental equipment into a single device, so as to solve the problems described in the background above.
In order to achieve the above purpose, the present invention provides the following technical solution: an indoor pure-vision robot obstacle sensing method, comprising: S1, a mobile robot collects time-synchronized images and point clouds on the basis of calibrated camera-lidar extrinsic parameters; S2, the scene is densely reconstructed from the images and point clouds with a SLAM method and converted into a BEV view, providing obstacle ground truth for the images, and a data set is constructed with the images as input and the BEV obstacle values as output; S3, a network model is trained on the data set; and S4, the trained network model is deployed on an autonomous robot that uses only cameras.
Further, the mobile robot collecting images and point clouds in a time-synchronized manner on the basis of calibrated camera-lidar extrinsic parameters comprises the following steps: arranging a lidar sensor and several monocular cameras on the robot's mobile chassis; calibrating the camera intrinsic parameters with a camera calibration method; and calibrating the camera-lidar extrinsic parameters with a laser-camera calibration method.
Further, calibrating the camera internal parameters by using the camera calibration method includes: camera matrix calibration, covering the focal length (f_x, f_y) and the optical center (C_x, C_y); the intrinsic matrix can be written as:

K = [[f_x, 0, C_x], [0, f_y, C_y], [0, 0, 1]];

the distortion coefficients, i.e. the 5 parameters of the distortion mathematical model, D = (k_1, k_2, p_1, p_2, k_3), where k_1, k_2, k_3 are the radial distortion coefficients and p_1, p_2 the tangential distortion coefficients;

and calibrating the camera intrinsics with a checkerboard calibration method to obtain the camera intrinsic matrix K.
Further, the distortion coefficients include radial distortion and tangential distortion;

the radial distortion expressions are:

x_corrected = x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)
y_corrected = y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6);

the tangential distortion expressions are:

x_corrected = x + [2 p_1 x y + p_2 (r^2 + 2x^2)]
y_corrected = y + [p_1 (r^2 + 2y^2) + 2 p_2 x y].
further, calibrating the external parameters of the camera and the laser radar by using the laser-camera calibration method comprises the following steps: fixing a laser-camera, detecting the gesture of a checkerboard calibration plate by using a PnP algorithm by a camera part, performing RANSAC fitting on the collected laser point cloud after circling an area where the calibration plate is positioned by the laser radar part to obtain the spatial position of the calibration plate under the laser radar coordinate, moving the calibration plate, and calculating external parameters of the calibration plate and the laser radar part after sampling a plurality of groups of data;
wherein P is c For the collected visual recognition of spatial information, P L For the spatial information identified by the laser light,the obtained external reference relationship.
Further, collecting the images and the point clouds in the time-synchronized manner comprises: a camera implements a visual odometer, with scene loop closure detection achieved by a bag-of-words method; the lidar extracts feature points from the laser point cloud with an edge-point/plane-point extractor and estimates the robot's pose change between consecutive frames by feature matching, forming a laser odometer; and the laser odometer, the visual-inertial odometer, and the loop-closure constraints are fused by nonlinear pose-graph optimization to obtain the point cloud of the whole scene.
On the other hand, the invention also provides an indoor pure vision robot obstacle sensing device, which comprises: the data collection module is used for collecting images and point clouds in a time synchronization mode through the mobile robot on the basis that vision and laser radar external parameters are calibrated; the data preprocessing module is used for densely reconstructing a scene based on the image and the point cloud by using the SLAM method, converting the scene into a BEV view angle, providing an obstacle truth value for the image, and constructing a data set by taking the image as input and the obstacle value in the BEV as output; the network training module is used for training a network model on the basis of the data set; and the real machine deployment module is used for deploying the trained network model on the autonomous robot only using the camera.
Further, the data collection module comprises a mobile base, a laser radar sensor and a camera; the cameras are arranged in four groups and distributed in four directions of the movable base, and the laser radar sensors are connected to the top of the movable base.
In another aspect, the present invention also proposes a computer-readable storage medium storing a plurality of instructions adapted to be loaded by a processor to execute the indoor pure-vision robot obstacle sensing method described above.
In another aspect, the invention also provides a terminal device, which comprises a processor and a memory, wherein the memory stores a plurality of instructions, and the processor loads the instructions to execute the indoor pure vision robot obstacle sensing method.
In summary, owing to the adoption of the above techniques, the beneficial effects of the invention are as follows:
the invention uses the method of automatically generating the data set to reduce the manual participation, and in the real machine deployment, only uses the monocular camera, reduces the hardware cost and the complexity of the sensor deployment, and can also make the system continuously generate new data sets by improving the method of scene reconstruction after the real machine deployment, thereby continuously and iteratively improving the effect; in addition, based on the characteristic that the indoor robot moves on the plane, surrounding obstacle information is projected on the plane by a deep learning method in a bird's eye view mode, and then the plane is provided for a planning module to realize obstacle collision prevention, so that the cost is low, the robustness is high, and the application scene is wide.
Drawings
FIG. 1 is a flow chart of a method for sensing an obstacle of an indoor purely visual robot according to the present invention;
FIG. 2 is a schematic diagram of the obstacle sensing device of the indoor pure vision robot;
FIG. 3 is a schematic diagram of a data collection module according to the present invention;
FIG. 4 is a flow chart of the data set creation of the present invention;
fig. 5 is a flow chart of the network learning of the present invention.
In the figure: 1. a movable base; 2. a lidar sensor; 3. and a camera.
Detailed Description
For the purpose of making the objects, technical solutions, and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the invention; thus, the following detailed description, as presented in the figures, is not intended to limit the scope of the claimed invention but is merely representative of selected embodiments. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
The embodiment of the invention provides an indoor pure-vision robot obstacle sensing method as shown in fig. 1. It mainly addresses the problem that, in the prior art, single-line or multi-line lidars are required for accurate obstacle collision avoidance, yet lidars bring high hardware cost and hinder the deployment and popularization of autonomous navigation; and that most obstacle sensing systems acquire environmental depth with binocular cameras according to the parallax principle, which suffer from a close-range blind zone and severely degraded depth accuracy at long range.
To address these problems, the basic hardware of the invention comprises sensors such as a lidar and cameras, a chassis, and an industrial computer with a GPU as the processing unit. The method comprises the following steps:
s1, collecting images and point clouds in a time synchronization mode by a mobile robot on the basis that vision and laser radar external parameters are calibrated;
s2, dense reconstruction is carried out on a scene based on the image and the point cloud by using a SLAM method, the scene is converted into a BEV view angle, an obstacle truth value is provided for the image, the image is taken as input, and the obstacle value in the BEV is taken as output to construct a data set;
s3, training a network model on the basis of the data set;
and S4, deploying the trained network model on an autonomous robot only using a camera.
Specifically, the mobile robot collecting images and point clouds in a time-synchronized manner on the basis of calibrated camera-lidar extrinsic parameters comprises the following steps: arranging a lidar sensor and several monocular cameras on the robot's mobile chassis; calibrating the camera intrinsic parameters with a camera calibration method; and calibrating the camera-lidar extrinsic parameters with a laser-camera calibration method.
As shown in fig. 2 and 3, an indoor purely visual robot obstacle sensing device includes:
the data collection module is used for collecting images and point clouds in a time synchronization mode through the mobile robot on the basis that vision and laser radar external parameters are calibrated;
the data preprocessing module is used for densely reconstructing a scene based on the image and the point cloud by using the SLAM method, converting the scene into a BEV view angle, providing an obstacle truth value for the image, and constructing a data set by taking the image as input and the obstacle value in the BEV as output;
the network training module is used for training a network model on the basis of the data set;
and the real machine deployment module is used for deploying the trained network model on the autonomous robot only using the camera. The data collection module comprises a mobile base 1, a laser radar sensor 2 and a camera 3; the cameras 3 are arranged in four groups and distributed in four directions of the mobile base 1, and the laser radar sensors 2 are connected to the top of the mobile base 1.
The computing processing unit runs the Ubuntu operating system using a common x86 architecture computer, on the basis of which the ROS robot operating system is installed. The sensors used are lidar sensors (e.g., velodyne16, robosense16, or Livox, etc., without limitation when using the method of the present invention) and a plurality of monocular cameras. And calibrating the camera internal parameters by using a camera calibration method, and calibrating the camera and the external parameters of the laser radar by using a laser-camera calibration method.
Calibrating the camera internal parameters by using a camera calibration method comprises the following steps: camera matrix calibration, covering the focal length (f_x, f_y) and the optical center (C_x, C_y); the intrinsic matrix can be written as:

K = [[f_x, 0, C_x], [0, f_y, C_y], [0, 0, 1]];

the distortion coefficients, i.e. the 5 parameters of the distortion mathematical model, D = (k_1, k_2, p_1, p_2, k_3), where k_1, k_2, k_3 are the radial distortion coefficients and p_1, p_2 the tangential distortion coefficients;

and calibrating the camera intrinsics with a checkerboard calibration method to obtain the camera intrinsic matrix K.
The distortion coefficients include radial distortion and tangential distortion;

the radial distortion expressions are:

x_corrected = x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)
y_corrected = y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6);

the tangential distortion expressions are:

x_corrected = x + [2 p_1 x y + p_2 (r^2 + 2x^2)]
y_corrected = y + [p_1 (r^2 + 2y^2) + 2 p_2 x y].
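The radial and tangential expressions above can be applied directly to normalized image coordinates. The following is a minimal illustrative sketch (not code from the patent):

```python
def distort(x, y, k1, k2, k3, p1, p2):
    """Apply the radial + tangential distortion model to the
    normalized image coordinates (x, y)."""
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_corrected = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    y_corrected = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return x_corrected, y_corrected

# With all coefficients zero the model reduces to the identity.
assert distort(0.1, 0.2, 0, 0, 0, 0, 0) == (0.1, 0.2)
```

In practice the same model (with the coefficients estimated by the checkerboard calibration) is evaluated once per pixel when undistorting an image.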
Calibrating the external parameters: before visual-laser fusion the sensors must be calibrated, using the checkerboard calibration board method. After the two sensors are fixed, the camera part detects the pose of the checkerboard calibration board with the PnP algorithm, while for the lidar part the region containing the board is selected in the collected laser point cloud and fitted with RANSAC to obtain the board's spatial position in the lidar coordinate frame. The board is then moved, and after sampling several groups of data the external parameters between the two sensors are computed:

P_C = T_L^C · P_L;

where P_C is the spatial information recognized by vision, P_L is the spatial information recognized by the laser, and T_L^C is the resulting extrinsic relationship.
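The patent does not specify how the extrinsic parameters are solved from the sampled board positions; one common choice is the SVD-based Kabsch algorithm, which recovers the rigid transform (R, t) with P_C ≈ R·P_L + t from paired point sets. The function name and interface below are illustrative assumptions:

```python
import numpy as np

def fit_rigid_transform(p_lidar, p_cam):
    """Estimate the rigid transform (R, t) with p_cam ~= R @ p_lidar + t
    from two paired (N, 3) point sets, via the SVD-based Kabsch algorithm."""
    p_lidar = np.asarray(p_lidar, dtype=float)
    p_cam = np.asarray(p_cam, dtype=float)
    mu_l, mu_c = p_lidar.mean(axis=0), p_cam.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (p_lidar - mu_l).T @ (p_cam - mu_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (det = -1) in the least-squares rotation.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_c - R @ mu_l
    return R, t
```

Each group of data contributes paired board positions in the camera frame (from PnP) and the lidar frame (from RANSAC fitting); stacking several groups and fitting once yields the extrinsic transform.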
Collecting images and point clouds in the time-synchronized manner, comprising: a camera is used for realizing a visual odometer, and scene loop detection is realized through a bag-of-word method; the laser radar extracts characteristic points in the laser point cloud through an edge point-plane point extractor, and estimates the pose change of the robot between the front frame and the rear frame through characteristic point matching to form a laser odometer; and fusing the laser odometer, the visual inertial odometer and loop constraint in a nonlinear pose chart optimization mode to obtain the point cloud of the whole scene.
In addition to the symbols above, the following are defined: w is the width of the image and h is the height of the image; (x*, y*, z*) is the three-dimensional position of a laser point in the lidar coordinate system, and (x, y, z) its position in the camera coordinate system. The pixel onto which a laser point projects is found as follows, using the normalized coordinates X = x/z and Y = y/z:

r^2 = X^2 + Y^2;
d = 1 + k_1 r^2 + k_2 r^4 + k_3 r^6;
P_x = f_x (X·d + 2 p_1 X Y + p_2 (r^2 + 2X^2)) + C_x;
P_y = f_y (Y·d + 2 p_2 X Y + p_1 (r^2 + 2Y^2)) + C_y.

The obtained (P_x, P_y) is the pixel position of the projected point cloud point; points that do not fall within [0, w] and [0, h] are truncated. Following this method, the front, rear, left, and right cameras on the robot are each calibrated against the lidar and their relative poses obtained.
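Putting the extrinsic transform, the intrinsics, and the distortion model together, the projection of a lidar point to a pixel can be sketched as follows (an illustrative sketch; the image size defaults and the parameter names are assumptions, and R, t denote the calibrated extrinsics):

```python
import numpy as np

def project_to_image(pt_lidar, R, t, fx, fy, cx, cy,
                     k1=0.0, k2=0.0, k3=0.0, p1=0.0, p2=0.0,
                     w=640, h=480):
    """Project a 3D lidar-frame point into pixel coordinates using the
    calibrated extrinsics (R, t), the pinhole intrinsics, and the
    distortion coefficients.  Returns (Px, Py), or None when the point
    is behind the camera or falls outside [0, w] x [0, h]."""
    x, y, z = R @ np.asarray(pt_lidar, dtype=float) + t  # camera frame
    if z <= 0:
        return None                      # behind the image plane
    X, Y = x / z, y / z                  # normalized image coordinates
    r2 = X * X + Y * Y
    d = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    Px = fx * (X * d + 2 * p1 * X * Y + p2 * (r2 + 2 * X * X)) + cx
    Py = fy * (Y * d + 2 * p2 * X * Y + p1 * (r2 + 2 * Y * Y)) + cy
    if not (0 <= Px <= w and 0 <= Py <= h):
        return None                      # truncated, off the image
    return Px, Py
```

Running this over a whole scan colors the point cloud with camera data; the truncation check implements the [0, w] × [0, h] clipping described above.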
In some embodiments, the scene reconstruction may use different kinds of methods such as SLAM and SFM, and the SLAM method is mainly described herein.
As shown in fig. 4, the main purpose of using the SLAM method is to obtain the pose of the robot; the invention uses a SLAM method with laser-visual fusion. A camera implements Visual Odometry (VO), and scene loop closure detection is achieved with a Bag of Words (BoW) method. The lidar extracts feature points from the laser point cloud with an edge-point/plane-point extractor and estimates the robot's pose change between consecutive frames by feature matching, forming Laser Odometry (LO). Finally, the laser odometry, visual-inertial odometry, and loop-closure constraints are fused by nonlinear pose-graph optimization, realizing a maximum a posteriori estimate of the robot's state space based on multi-sensor information and guaranteeing the accuracy of the robot's real-time pose. An IMU (inertial measurement unit) can additionally be added to further improve accuracy, but is not described here. After the robot traverses the entire scene, the point cloud of the whole scene is obtained.
The above process mainly yields the pose of each image and the three-dimensional point cloud of the whole scene. During preprocessing, the robot's position is restored into the scene using the recorded pose information, and the extent of the scene point cloud to be projected is determined by the configured region of interest (ROI), typically 10 m × 10 m. The z-axis of the three-dimensional point cloud (represented by unordered x, y, and z values) then needs to be compressed, projecting the cloud onto the x-y plane to form a two-dimensional space. The projection rule can be set according to the robot's own height and shape; taking a circular robot as an example, assume its height is h (its radius r is also required, and a circular mask is placed over the robot's footprint in the center area). The scene point cloud is first gridded on the x-y plane, and rays are emitted toward the surroundings at a fixed angular resolution from the robot's circle, as shown in the accompanying figure (where red denotes a ray and green an obstacle; the figure is only schematic). Each ray is traced until it hits an obstacle or the boundary of the map; for each cell the ray passes, it is judged whether any point exists within height h above the ground, and the cell is set as a passable area if no such point exists, or as an obstacle area otherwise.
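The gridding and ray-casting step can be sketched as follows. This is a simplified illustration, not the patent's exact implementation: the robot is assumed to sit at the grid center, the circular robot mask is omitted, and the resolution and ray count are arbitrary choices:

```python
import math

import numpy as np

UNKNOWN, FREE, OBSTACLE = 0, 1, 2

def bev_occupancy(points, roi=10.0, res=0.25, robot_h=0.5, n_rays=360):
    """Compress an (N, 3) point cloud around the robot (assumed at the
    grid center) into a 2D BEV grid.  A cell containing any point whose
    height is within the robot height robot_h is an obstacle candidate;
    rays cast from the robot mark cells FREE until they hit an obstacle
    candidate or leave the region of interest."""
    n = int(roi / res)
    half = roi / 2
    # 1) Grid the cloud on the x-y plane, keeping only points that
    #    could collide with the robot (0 <= z <= robot_h).
    occ = np.zeros((n, n), dtype=bool)
    for x, y, z in np.asarray(points, dtype=float):
        if 0.0 <= z <= robot_h:
            i, j = int((x + half) / res), int((y + half) / res)
            if 0 <= i < n and 0 <= j < n:
                occ[i, j] = True
    # 2) Cast rays outward at a fixed angular resolution.
    grid = np.full((n, n), UNKNOWN, dtype=np.int8)
    for k in range(n_rays):
        a = 2 * math.pi * k / n_rays
        for step in range(n):            # march one cell-length at a time
            i = int((half + step * res * math.cos(a)) / res)
            j = int((half + step * res * math.sin(a)) / res)
            if not (0 <= i < n and 0 <= j < n):
                break                    # ray left the ROI
            if occ[i, j]:
                grid[i, j] = OBSTACLE    # ray stops at the first obstacle
                break
            grid[i, j] = FREE            # passable area
    return grid
```

For a single low obstacle 2 m ahead of the robot, the cells between the robot and the obstacle along that ray are marked FREE and the obstacle cell OBSTACLE, mirroring the ray-stopping rule described above.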
The network architecture used in the invention is shown in fig. 5. The images obtained by the four cameras are first fed into a shared CNN backbone, producing four feature maps at 1/2, 1/4, 1/8, and 1/16 of the original resolution. Before the next step, the four maps are stitched and fused into one tensor using a 1×1 convolution. These image features then undergo a 2D-3D projection, similar to the inverse of the transformation described above; however, the depth corresponding to each pixel is unknown, so a depth distribution over the image is obtained through network learning. After the projection, the last two channels z and c are multiplied to obtain a three-dimensional tensor. This tensor is fed into convolution layers to re-extract features, and a semantic segmentation module then segments out the drivable area and the obstacle area.
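The 2D-3D projection with a learned depth distribution can be illustrated in the spirit of lift-splat methods: for each pixel the network predicts a categorical distribution over D discrete depth bins, and the feature vector is spread along the camera ray by an outer product of the depth probabilities (the z channel) and the feature channels (c). A minimal numpy sketch, with shapes chosen purely for illustration:

```python
import numpy as np

def lift_features(feat, depth_logits):
    """Lift 2D image features into a per-depth feature volume.

    feat:         (C, H, W) fused image features
    depth_logits: (D, H, W) unnormalized per-pixel depth-bin scores
    returns:      (D, C, H, W) volume -- each pixel's feature vector
                  weighted by its softmax depth probability.
    """
    # Softmax over the depth dimension gives a per-pixel depth distribution.
    e = np.exp(depth_logits - depth_logits.max(axis=0, keepdims=True))
    depth_prob = e / e.sum(axis=0, keepdims=True)        # (D, H, W)
    # Outer product of the depth channel (z) and the feature channels (c).
    return depth_prob[:, None, :, :] * feat[None, :, :, :]
```

Because the depth probabilities sum to one per pixel, summing the volume over the depth axis recovers the original feature map; later convolution layers re-extract features from this volume before segmentation.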
The embodiment also provides a terminal device, which comprises a processor and a memory, wherein the memory stores a plurality of instructions, and the processor loads the instructions to execute the indoor pure vision robot obstacle sensing method. The terminal equipment can be smart phones, computers, tablet computers and other equipment.
The terminal device comprises a memory, a readable storage medium, an input unit, a display unit, sensors, an audio circuit, a transmission module, a processor with one or more processing cores, a power supply, and the like.
The memory may be used to store software programs and modules, such as program instructions/modules corresponding to the method in the above embodiments, and the processor executes various functional applications and data processing by running the software programs and modules stored in the memory, so that the vibration reminding mode can be automatically selected according to the current scene where the terminal device is located to perform the processing, thereby not only ensuring that the scenes such as a conference are not disturbed, but also ensuring that the user can perceive an incoming call, and improving the intelligence of the terminal device. The memory may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory. In some examples, the memory may further include memory remotely located with respect to the processor, the remote memory being connectable to the terminal device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input unit may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, the input unit may comprise a touch-sensitive surface as well as other input devices. The touch-sensitive surface, also referred to as a touch display screen or touch pad, may collect touch operations on or near it by a user (e.g., operations by the user with a finger, stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a predetermined program. Alternatively, the touch-sensitive surface may comprise two parts: a touch detection device and a touch controller. The touch detection device detects the position touched by the user, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor, and can receive and execute commands sent by the processor. In addition, the touch-sensitive surface may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave types. The input unit may comprise other input devices in addition to the touch-sensitive surface; in particular, these may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick.
The display unit may be used to display information input by the user or information provided to the user, as well as the various graphical user interfaces of the terminal device, which may be composed of graphics, text, icons, video, and any combination thereof. The display unit may include a display panel, and the display panel may optionally be configured in the form of an LCD (liquid crystal display), an OLED (organic light-emitting diode), or the like. Further, the touch-sensitive surface may overlay the display panel; upon detecting a touch operation on or near it, the touch-sensitive surface communicates the operation to the processor to determine the type of touch event, and the processor then provides the corresponding visual output on the display panel based on the type of touch event. Although in the figures the touch-sensitive surface and the display panel are implemented as two separate components, in some embodiments they may be integrated to implement the input and output functions.
The processor is the control center of the terminal device; it connects the various parts of the entire terminal device through various interfaces and lines, and performs the various functions of the terminal device and processes data by running or executing the software programs and/or modules stored in the memory and calling the data stored in the memory, thereby monitoring the terminal device as a whole. Optionally, the processor may include one or more processing cores. In some embodiments, the processor may integrate an application processor, which primarily handles the operating system, user interface, and applications, with a modem processor, which primarily handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor.
The terminal device also includes a power supply for powering the various components. In some embodiments, the power supply may be logically connected to the processor through a power management system, so that functions such as managing charging, discharging, and power consumption are performed by the power management system. The power supply may also include one or more of a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
This embodiment also provides a computer-readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the indoor pure vision robot obstacle sensing method. It will be understood by those skilled in the art that all or part of the flow of the indoor pure vision robot obstacle sensing method described in the embodiments of the present application may be implemented by controlling related hardware through a computer program; the computer program may be stored in a computer-readable storage medium, such as the memory of the terminal device, and executed by at least one processor in the terminal device, and the execution process may include the flow of the embodiment of the method. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
The foregoing is merely a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto; any equivalent substitution or modification made, within the technical scope disclosed by the present invention, by a person skilled in the art according to the technical scheme of the present invention and its inventive concept shall be covered by the scope of protection of the present invention.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Claims (8)
1. An indoor purely visual robot obstacle sensing method is characterized by comprising the following steps:
collecting, by a mobile robot, images and point clouds in a time-synchronized manner on the basis that the vision and laser radar extrinsic parameters have been calibrated;
using a SLAM method to densely reconstruct the scene from the images and point clouds and converting the reconstruction to a BEV (bird's-eye view) perspective, thereby providing ground-truth obstacle labels for the images; then constructing a data set with the images as input and the obstacle values in the BEV as output;
training a network model based on the data set;
deploying the trained network model on an autonomous robot that uses only cameras.
2. The method for sensing the obstacle of the indoor pure vision robot according to claim 1, wherein the step of collecting the image and the point cloud in a time-synchronous manner by the mobile robot on the basis that the vision and the laser radar external parameters are calibrated comprises the steps of:
arranging a laser radar sensor and a plurality of monocular cameras on a mobile chassis of the robot;
and calibrating the camera internal parameters by using a camera calibration method, and calibrating the camera and the external parameters of the laser radar by using a laser-camera calibration method.
3. The method for sensing the obstacle of the indoor pure vision robot according to claim 2, wherein the calibrating the camera internal parameters by using the camera calibration method comprises the following steps:
camera matrix calibration, including the focal length (f_x, f_y) and the optical center (c_x, c_y); the intrinsic matrix can be represented as follows:

K = [ f_x   0    c_x ]
    [ 0    f_y   c_y ]
    [ 0     0     1  ]

distortion coefficients, the 5 parameters of the distortion mathematical model, D = (k_1, k_2, p_1, p_2, k_3), where k_1, k_2, k_3 are the radial distortion coefficients and p_1, p_2 are the tangential distortion coefficients;
and calibrating the camera intrinsic parameters by a checkerboard calibration method to obtain the camera intrinsic matrix K.
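As a sketch of how the intrinsic matrix K obtained above is used, the following assumes a standard pinhole model; the numeric values are hypothetical, not calibrated results:

```python
import numpy as np

# Hypothetical intrinsics as would be obtained from checkerboard calibration.
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

def project(K, p):
    """Project a camera-frame 3D point (X, Y, Z) to pixel coordinates (u, v)."""
    uvw = K @ p
    return uvw[:2] / uvw[2]  # perspective division by depth

u, v = project(K, np.array([0.1, -0.05, 2.0]))
print(u, v)  # 350.0 225.0
```

The same matrix, inverted, underlies the 2D-3D projection in the network: a pixel plus a depth hypothesis maps back to a camera-frame ray.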
4. A method for sensing an obstacle in an indoor purely visual robot according to claim 3, wherein: the distortion coefficients include radial distortion and tangential distortion;
the distortion coefficient expression is:
x corrected =x(1+k 1 r 2 +k 2 r 4 +k 3 r 6 )
y corrected =y(1+k 1 r 2 +k 2 r 4 +k 3 r 6 );
the tangential distortion expression is:
x corrected =x+[2p 1 xy+p 2 (x 2 +2x 2 )]
y corrected =y+[p 1 (r 2 +2y 2 )+2p 2 xy+]。
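The corrected-coordinate expressions in claim 4 follow the standard Brown-Conrady model and can be implemented directly on normalized image coordinates; the coefficient values below are zero purely for illustration:

```python
import numpy as np

def distort(x, y, k1, k2, k3, p1, p2):
    """Apply radial + tangential distortion to normalized coords (x, y)."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return x_d, y_d

# With all five coefficients zero the mapping reduces to the identity.
print(distort(0.3, -0.2, 0, 0, 0, 0, 0))  # (0.3, -0.2)
```

Undistortion (the direction actually needed at calibration time) inverts this mapping iteratively, since the polynomial has no closed-form inverse.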
5. the method for sensing the obstacle of the indoor pure vision robot according to claim 4, wherein the method comprises the following steps: calibrating the external parameters of the camera and the laser radar by using a laser-camera calibration method comprises the following steps:
fixing the laser-camera rig; the camera part detects the pose of a checkerboard calibration plate using a PnP algorithm; the laser radar part, after the area where the calibration plate is located has been circled, performs RANSAC fitting on the collected laser point cloud to obtain the spatial position of the calibration plate in the laser radar coordinate frame; the calibration plate is then moved, and after several groups of data are sampled, the extrinsic parameters between the camera and the laser radar are calculated.
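The plane fit on the lidar side can be illustrated with a least-squares (SVD) plane estimate; a full pipeline would wrap this in RANSAC hypothesis sampling, which is omitted here for brevity. The points are synthetic, standing in for lidar returns on the calibration plate:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through Nx3 points: returns (unit normal, centroid)."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    return vt[-1], centroid  # normal = direction of least variance

rng = np.random.default_rng(1)
# Synthetic calibration-board returns lying on the plane z = 0.5 (lidar frame).
pts = np.column_stack([rng.uniform(-1, 1, 100),
                       rng.uniform(-1, 1, 100),
                       np.full(100, 0.5)])
normal, c = fit_plane(pts)
print(np.abs(normal))  # approximately [0, 0, 1]
```

Matching several such plane observations against the camera's PnP poses of the same board yields the camera-lidar extrinsics.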
6. The indoor purely visual robotic obstacle awareness method of claim 1, wherein: collecting images and point clouds in the time-synchronized manner, comprising:
implementing a visual odometer with a camera, and realizing scene loop detection through a bag-of-words method;
the laser radar extracts feature points from the laser point cloud through an edge-point/plane-point extractor, and estimates the pose change of the robot between consecutive frames through feature point matching, forming a laser odometer;
and fusing the laser odometer, the visual inertial odometer and loop constraint in a nonlinear pose chart optimization mode to obtain the point cloud of the whole scene.
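The nonlinear pose-graph optimization in this claim can be illustrated on a 1D toy graph, where the problem reduces to linear least squares; real systems optimize SE(3) poses with solvers such as g2o or GTSAM. The edge measurements here are synthetic:

```python
import numpy as np

# Toy pose graph: 4 poses on a line, odometry edges plus one loop-closure edge.
# Each edge (i, j, meas) constrains x_j - x_i ≈ meas.
edges = [(0, 1, 1.0), (1, 2, 1.1), (2, 3, 0.9),  # odometry
         (0, 3, 3.0)]                             # loop-closure constraint
n = 4
A = np.zeros((len(edges) + 1, n))
b = np.zeros(len(edges) + 1)
for row, (i, j, meas) in enumerate(edges):
    A[row, i], A[row, j], b[row] = -1.0, 1.0, meas
A[-1, 0] = 1.0  # anchor the first pose at 0 to fix the gauge freedom

x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(x, 3))
```

The loop-closure row is what corrects accumulated odometry drift; with more edges than unknowns the solver finds the poses that best satisfy all constraints jointly.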
7. The method for sensing the obstacle of the indoor pure vision robot according to claim 4, wherein the method further comprises a sensing device, the sensing device comprising:
the data collection module is used for collecting images and point clouds in a time synchronization mode through the mobile robot on the basis that vision and laser radar external parameters are calibrated;
the data preprocessing module is used for densely reconstructing a scene based on the image and the point cloud by using the SLAM method, converting the scene into a BEV view angle, providing an obstacle truth value for the image, and constructing a data set by taking the image as input and the obstacle value in the BEV as output;
the network training module is used for training a network model on the basis of the data set;
and the real machine deployment module is used for deploying the trained network model on the autonomous robot only using the camera.
8. The method for sensing the obstacle of the indoor pure vision robot according to claim 7, wherein: the data collection module comprises a mobile base, a laser radar sensor and cameras; the cameras are arranged in four groups distributed in the four directions of the mobile base, and the laser radar sensor is connected to the top of the mobile base.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310090323.3A CN116295401A (en) | 2023-02-09 | 2023-02-09 | Indoor pure vision robot obstacle sensing method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116295401A true CN116295401A (en) | 2023-06-23 |
Family
ID=86819498
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310090323.3A Pending CN116295401A (en) | 2023-02-09 | 2023-02-09 | Indoor pure vision robot obstacle sensing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116295401A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||