CN116295401A - Indoor pure vision robot obstacle sensing method - Google Patents
- Publication number: CN116295401A (application CN202310090323.3A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G01C21/206 — Instruments for performing navigational calculations specially adapted for indoor navigation
- G01S17/86 — Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
- G01S17/93 — Lidar systems specially adapted for anti-collision purposes
- G01S7/497 — Means for monitoring or calibrating
- G06T7/85 — Stereo camera calibration
- Y02T10/40 — Engine management systems
Abstract
The invention relates to the technical field of robots and discloses an indoor pure-vision robot obstacle sensing method. On the basis of calibrated camera-lidar extrinsic parameters, a mobile robot collects time-synchronized images and point clouds; a SLAM method densely reconstructs the scene from the images and point clouds and converts it into a bird's-eye view (BEV), providing obstacle ground truth for the images; a data set is then constructed with the images as input and the BEV obstacle values as output; a network model is trained on this data set; and the trained model is deployed on an autonomous robot that uses only cameras. By automatically generating the data set, the invention reduces manual involvement. A deep-learning method projects surrounding obstacle information onto the ground plane in bird's-eye view and supplies it to the planning module for obstacle collision avoidance, so the method is low-cost, robust, and applicable to a wide range of scenes.
Description
Technical Field
The invention relates to the technical field of robots, in particular to an indoor pure vision robot obstacle sensing method.
Background
In a robot's planning and navigation tasks, obstacles must be perceived in real time; this is a prerequisite for robot navigation. In an indoor environment the scene is relatively simple, so the problem can often be simplified to two dimensions: the obstacle information scanned by a single-line or multi-line laser is used directly as the boundary of the traversable area, within which safe motion of the robot can be guaranteed.
In existing indoor autonomous robot navigation systems, a combination of binocular cameras and lidar is mostly used. A binocular camera can provide three-dimensional information about surrounding objects, but suffers from a close-range blind zone and severely degraded depth accuracy at long range. Single-line lidars are cheaper than multi-line lidars but provide obstacle information only in a two-dimensional plane, while multi-line lidars can provide three-dimensional information but are expensive, and their point clouds are usually sparse. These sensors give autonomous mobile robots some degree of environmental awareness, but each is limited, and a single robot often needs to carry several binocular cameras and lidars, which complicates the hardware deployment of the autonomous robot.
Autonomous driving technology is now developing rapidly, and the pure-vision approach represented by Tesla iterates quickly, so it can already handle many autonomous driving tasks. Tesla applies neural networks to the input images to accomplish a variety of detection tasks, including object detection, lane-line detection, road-edge detection, and decisions for various scenarios. This technology mainly extracts features with a convolutional neural network (CNN), which consists of one or more convolutional layers and fully connected layers on top (corresponding to a classical neural network), together with associated weights and pooling layers. This structure lets a CNN exploit the two-dimensional structure of the input data; compared with other deep-learning architectures, CNNs give better results in image and speech recognition. The model can be trained with the back-propagation algorithm, and it requires fewer parameters than other deep feed-forward neural networks.
Using such neural networks, many tasks can be accomplished. The mainstream solution today is to detect obstacles from a bird's-eye-view (BEV) perspective, since most target objects in driving scenes lie on the same ground plane. Compared with the front view, targets in the BEV have a fixed size regardless of their position, so detection can be optimized using known typical object sizes.
At present, accurate obstacle collision avoidance in indoor scenes generally requires a single-line or multi-line lidar, but lidars bring high hardware cost, which hinders the deployment and popularization of autonomous navigation. Most obstacle sensing systems acquire environmental depth with binocular cameras according to the parallax principle, but these suffer from a close-range blind zone and severely degraded depth accuracy at long range.
Disclosure of Invention
The invention aims to provide an indoor pure-vision robot obstacle sensing method that integrates the experimental equipment into a single device, so as to solve the problems described in the background above.
In order to achieve the above purpose, the present invention provides the following technical solution: an indoor pure-vision robot obstacle sensing method, comprising: S1, a mobile robot collects time-synchronized images and point clouds on the basis of calibrated camera-lidar extrinsic parameters; S2, the scene is densely reconstructed from the images and point clouds with a SLAM method and converted into a BEV view, providing obstacle ground truth for the images, and a data set is constructed with the images as input and the BEV obstacle values as output; S3, a network model is trained on the data set; and S4, the trained network model is deployed on an autonomous robot that uses only cameras.
Further, the mobile robot collecting images and point clouds in a time-synchronized manner on the basis of calibrated camera-lidar extrinsic parameters comprises the following steps: arranging a lidar sensor and several monocular cameras on the robot's mobile chassis; calibrating the camera intrinsic parameters with a camera calibration method; and calibrating the camera-lidar extrinsic parameters with a laser-camera calibration method.
Further, calibrating the camera internal parameters by using the camera calibration method includes: camera matrix calibration, covering the focal length (f_x, f_y) and the optical center (C_x, C_y); the intrinsic matrix can be written as:

K = [[f_x, 0, C_x], [0, f_y, C_y], [0, 0, 1]];

the distortion coefficients, i.e. the 5 parameters of the distortion mathematical model, D = (k_1, k_2, p_1, p_2, k_3), where k_1, k_2, k_3 are the radial distortion coefficients and p_1, p_2 the tangential distortion coefficients;

and calibrating the camera intrinsics with a checkerboard calibration method to obtain the camera intrinsic matrix K.
Further, the distortion coefficients include radial distortion and tangential distortion;

the radial distortion expressions are:

x_corrected = x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)
y_corrected = y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6);

the tangential distortion expressions are:

x_corrected = x + [2 p_1 x y + p_2 (r^2 + 2x^2)]
y_corrected = y + [p_1 (r^2 + 2y^2) + 2 p_2 x y].
further, calibrating the external parameters of the camera and the laser radar by using the laser-camera calibration method comprises the following steps: fixing a laser-camera, detecting the gesture of a checkerboard calibration plate by using a PnP algorithm by a camera part, performing RANSAC fitting on the collected laser point cloud after circling an area where the calibration plate is positioned by the laser radar part to obtain the spatial position of the calibration plate under the laser radar coordinate, moving the calibration plate, and calculating external parameters of the calibration plate and the laser radar part after sampling a plurality of groups of data;
wherein P is c For the collected visual recognition of spatial information, P L For the spatial information identified by the laser light,the obtained external reference relationship.
Further, collecting the images and the point clouds in the time-synchronized manner comprises: a camera implements a visual odometer, with scene loop closure detection achieved by a bag-of-words method; the lidar extracts feature points from the laser point cloud with an edge-point/plane-point extractor and estimates the robot's pose change between consecutive frames by feature matching, forming a laser odometer; and the laser odometer, the visual-inertial odometer, and the loop-closure constraints are fused by nonlinear pose-graph optimization to obtain the point cloud of the whole scene.
On the other hand, the invention also provides an indoor pure vision robot obstacle sensing device, which comprises: the data collection module is used for collecting images and point clouds in a time synchronization mode through the mobile robot on the basis that vision and laser radar external parameters are calibrated; the data preprocessing module is used for densely reconstructing a scene based on the image and the point cloud by using the SLAM method, converting the scene into a BEV view angle, providing an obstacle truth value for the image, and constructing a data set by taking the image as input and the obstacle value in the BEV as output; the network training module is used for training a network model on the basis of the data set; and the real machine deployment module is used for deploying the trained network model on the autonomous robot only using the camera.
Further, the data collection module comprises a mobile base, a laser radar sensor and a camera; the cameras are arranged in four groups and distributed in four directions of the movable base, and the laser radar sensors are connected to the top of the movable base.
In another aspect, the present invention also proposes a computer-readable storage medium storing a plurality of instructions adapted to be loaded by a processor to execute the indoor pure-vision robot obstacle sensing method described above.
In another aspect, the invention also provides a terminal device, which comprises a processor and a memory, wherein the memory stores a plurality of instructions, and the processor loads the instructions to execute the indoor pure vision robot obstacle sensing method.
In summary, owing to the adoption of the above techniques, the beneficial effects of the invention are as follows:
the invention uses the method of automatically generating the data set to reduce the manual participation, and in the real machine deployment, only uses the monocular camera, reduces the hardware cost and the complexity of the sensor deployment, and can also make the system continuously generate new data sets by improving the method of scene reconstruction after the real machine deployment, thereby continuously and iteratively improving the effect; in addition, based on the characteristic that the indoor robot moves on the plane, surrounding obstacle information is projected on the plane by a deep learning method in a bird's eye view mode, and then the plane is provided for a planning module to realize obstacle collision prevention, so that the cost is low, the robustness is high, and the application scene is wide.
Drawings
FIG. 1 is a flow chart of a method for sensing an obstacle of an indoor purely visual robot according to the present invention;
FIG. 2 is a schematic diagram of the obstacle sensing device of the indoor pure vision robot;
FIG. 3 is a schematic diagram of a data collection module according to the present invention;
FIG. 4 is a flow chart of the data set creation of the present invention;
fig. 5 is a flow chart of the network learning of the present invention.
In the figure: 1. a movable base; 2. a lidar sensor; 3. and a camera.
Detailed Description
For the purpose of making the objects, technical solutions, and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the invention; thus, the following detailed description, as presented in the figures, is not intended to limit the scope of the claimed invention but is merely representative of selected embodiments. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
The embodiment of the invention provides an indoor pure-vision robot obstacle sensing method as shown in fig. 1. It mainly addresses the problem that, in the prior art, single-line or multi-line lidars are required for accurate obstacle collision avoidance, yet lidars bring high hardware cost and hinder the deployment and popularization of autonomous navigation; and that most obstacle sensing systems acquire environmental depth with binocular cameras according to the parallax principle, which suffer from a close-range blind zone and severely degraded depth accuracy at long range.
To address these problems, the basic hardware of the invention comprises sensors such as a lidar and cameras, a chassis, and an industrial computer with a GPU as the processing unit. The method comprises the following steps:
s1, collecting images and point clouds in a time synchronization mode by a mobile robot on the basis that vision and laser radar external parameters are calibrated;
s2, dense reconstruction is carried out on a scene based on the image and the point cloud by using a SLAM method, the scene is converted into a BEV view angle, an obstacle truth value is provided for the image, the image is taken as input, and the obstacle value in the BEV is taken as output to construct a data set;
s3, training a network model on the basis of the data set;
and S4, deploying the trained network model on an autonomous robot only using a camera.
Specifically, the mobile robot collecting images and point clouds in a time-synchronized manner on the basis of calibrated camera-lidar extrinsic parameters comprises the following steps: arranging a lidar sensor and several monocular cameras on the robot's mobile chassis; calibrating the camera intrinsic parameters with a camera calibration method; and calibrating the camera-lidar extrinsic parameters with a laser-camera calibration method.
As shown in fig. 2 and 3, an indoor purely visual robot obstacle sensing device includes:
the data collection module is used for collecting images and point clouds in a time synchronization mode through the mobile robot on the basis that vision and laser radar external parameters are calibrated;
the data preprocessing module is used for densely reconstructing a scene based on the image and the point cloud by using the SLAM method, converting the scene into a BEV view angle, providing an obstacle truth value for the image, and constructing a data set by taking the image as input and the obstacle value in the BEV as output;
the network training module is used for training a network model on the basis of the data set;
and the real machine deployment module is used for deploying the trained network model on the autonomous robot only using the camera. The data collection module comprises a mobile base 1, a laser radar sensor 2 and a camera 3; the cameras 3 are arranged in four groups and distributed in four directions of the mobile base 1, and the laser radar sensors 2 are connected to the top of the mobile base 1.
The computing processing unit runs the Ubuntu operating system using a common x86 architecture computer, on the basis of which the ROS robot operating system is installed. The sensors used are lidar sensors (e.g., velodyne16, robosense16, or Livox, etc., without limitation when using the method of the present invention) and a plurality of monocular cameras. And calibrating the camera internal parameters by using a camera calibration method, and calibrating the camera and the external parameters of the laser radar by using a laser-camera calibration method.
Calibrating the camera internal parameters by using a camera calibration method comprises the following steps: camera matrix calibration, covering the focal length (f_x, f_y) and the optical center (C_x, C_y); the intrinsic matrix can be written as:

K = [[f_x, 0, C_x], [0, f_y, C_y], [0, 0, 1]];

the distortion coefficients, i.e. the 5 parameters of the distortion mathematical model, D = (k_1, k_2, p_1, p_2, k_3), where k_1, k_2, k_3 are the radial distortion coefficients and p_1, p_2 the tangential distortion coefficients;

and calibrating the camera intrinsics with a checkerboard calibration method to obtain the camera intrinsic matrix K.
The distortion coefficients include radial distortion and tangential distortion;

the radial distortion expressions are:

x_corrected = x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)
y_corrected = y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6);

the tangential distortion expressions are:

x_corrected = x + [2 p_1 x y + p_2 (r^2 + 2x^2)]
y_corrected = y + [p_1 (r^2 + 2y^2) + 2 p_2 x y].
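The radial and tangential expressions above can be applied directly to normalized image coordinates. The following is a minimal illustrative sketch (not code from the patent):

```python
def distort(x, y, k1, k2, k3, p1, p2):
    """Apply the radial + tangential distortion model to the
    normalized image coordinates (x, y)."""
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_corrected = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    y_corrected = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return x_corrected, y_corrected

# With all coefficients zero the model reduces to the identity.
assert distort(0.1, 0.2, 0, 0, 0, 0, 0) == (0.1, 0.2)
```

In practice the same model (with the coefficients estimated by the checkerboard calibration) is evaluated once per pixel when undistorting an image.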
Calibrating the external parameters: before visual-laser fusion the sensors must be calibrated, using the checkerboard calibration board method. After the two sensors are fixed, the camera part detects the pose of the checkerboard calibration board with the PnP algorithm, while for the lidar part the region containing the board is selected in the collected laser point cloud and fitted with RANSAC to obtain the board's spatial position in the lidar coordinate frame. The board is then moved, and after sampling several groups of data the external parameters between the two sensors are computed:

P_C = T_L^C · P_L;

where P_C is the spatial information recognized by vision, P_L is the spatial information recognized by the laser, and T_L^C is the resulting extrinsic relationship.
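The patent does not specify how the extrinsic parameters are solved from the sampled board positions; one common choice is the SVD-based Kabsch algorithm, which recovers the rigid transform (R, t) with P_C ≈ R·P_L + t from paired point sets. The function name and interface below are illustrative assumptions:

```python
import numpy as np

def fit_rigid_transform(p_lidar, p_cam):
    """Estimate the rigid transform (R, t) with p_cam ~= R @ p_lidar + t
    from two paired (N, 3) point sets, via the SVD-based Kabsch algorithm."""
    p_lidar = np.asarray(p_lidar, dtype=float)
    p_cam = np.asarray(p_cam, dtype=float)
    mu_l, mu_c = p_lidar.mean(axis=0), p_cam.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (p_lidar - mu_l).T @ (p_cam - mu_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (det = -1) in the least-squares rotation.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_c - R @ mu_l
    return R, t
```

Each group of data contributes paired board positions in the camera frame (from PnP) and the lidar frame (from RANSAC fitting); stacking several groups and fitting once yields the extrinsic transform.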
Collecting images and point clouds in the time-synchronized manner, comprising: a camera is used for realizing a visual odometer, and scene loop detection is realized through a bag-of-word method; the laser radar extracts characteristic points in the laser point cloud through an edge point-plane point extractor, and estimates the pose change of the robot between the front frame and the rear frame through characteristic point matching to form a laser odometer; and fusing the laser odometer, the visual inertial odometer and loop constraint in a nonlinear pose chart optimization mode to obtain the point cloud of the whole scene.
In addition to the symbols above, the following are defined: w is the width of the image and h is the height of the image; (x*, y*, z*) is the three-dimensional position of a laser point in the lidar coordinate system, and (x, y, z) its position in the camera coordinate system. The pixel onto which a laser point projects is found as follows, using the normalized coordinates X = x/z and Y = y/z:

r^2 = X^2 + Y^2;
d = 1 + k_1 r^2 + k_2 r^4 + k_3 r^6;
P_x = f_x (X·d + 2 p_1 X Y + p_2 (r^2 + 2X^2)) + C_x;
P_y = f_y (Y·d + 2 p_2 X Y + p_1 (r^2 + 2Y^2)) + C_y.

The obtained (P_x, P_y) is the pixel position of the projected point cloud point; points that do not fall within [0, w] and [0, h] are truncated. Following this method, the front, rear, left, and right cameras on the robot are each calibrated against the lidar and their relative poses obtained.
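Putting the extrinsic transform, the intrinsics, and the distortion model together, the projection of a lidar point to a pixel can be sketched as follows (an illustrative sketch; the image size defaults and the parameter names are assumptions, and R, t denote the calibrated extrinsics):

```python
import numpy as np

def project_to_image(pt_lidar, R, t, fx, fy, cx, cy,
                     k1=0.0, k2=0.0, k3=0.0, p1=0.0, p2=0.0,
                     w=640, h=480):
    """Project a 3D lidar-frame point into pixel coordinates using the
    calibrated extrinsics (R, t), the pinhole intrinsics, and the
    distortion coefficients.  Returns (Px, Py), or None when the point
    is behind the camera or falls outside [0, w] x [0, h]."""
    x, y, z = R @ np.asarray(pt_lidar, dtype=float) + t  # camera frame
    if z <= 0:
        return None                      # behind the image plane
    X, Y = x / z, y / z                  # normalized image coordinates
    r2 = X * X + Y * Y
    d = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    Px = fx * (X * d + 2 * p1 * X * Y + p2 * (r2 + 2 * X * X)) + cx
    Py = fy * (Y * d + 2 * p2 * X * Y + p1 * (r2 + 2 * Y * Y)) + cy
    if not (0 <= Px <= w and 0 <= Py <= h):
        return None                      # truncated, off the image
    return Px, Py
```

Running this over a whole scan colors the point cloud with camera data; the truncation check implements the [0, w] × [0, h] clipping described above.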
In some embodiments, the scene reconstruction may use different kinds of methods such as SLAM and SFM, and the SLAM method is mainly described herein.
As shown in fig. 4, the main purpose of using the SLAM method is to obtain the pose of the robot; the invention uses a SLAM method with laser-visual fusion. A camera implements Visual Odometry (VO), and scene loop closure detection is achieved with a Bag of Words (BoW) method. The lidar extracts feature points from the laser point cloud with an edge-point/plane-point extractor and estimates the robot's pose change between consecutive frames by feature matching, forming Laser Odometry (LO). Finally, the laser odometry, visual-inertial odometry, and loop-closure constraints are fused by nonlinear pose-graph optimization, realizing a maximum a posteriori estimate of the robot's state space based on multi-sensor information and guaranteeing the accuracy of the robot's real-time pose. An IMU (inertial measurement unit) can additionally be added to further improve accuracy, but is not described here. After the robot traverses the entire scene, the point cloud of the whole scene is obtained.
The above process mainly yields the pose of each image and the three-dimensional point cloud of the whole scene. During preprocessing, the robot's position is restored into the scene using the recorded pose information, and the extent of the scene point cloud to be projected is determined by the configured region of interest (ROI), typically 10 m × 10 m. The z-axis of the three-dimensional point cloud (represented by unordered x, y, and z values) then needs to be compressed, projecting the cloud onto the x-y plane to form a two-dimensional space. The projection rule can be set according to the robot's own height and shape; taking a circular robot as an example, assume its height is h (its radius r is also required, and a circular mask is placed over the robot's footprint in the center area). The scene point cloud is first gridded on the x-y plane, and rays are emitted toward the surroundings at a fixed angular resolution from the robot's circle, as shown in the accompanying figure (where red denotes a ray and green an obstacle; the figure is only schematic). Each ray is traced until it hits an obstacle or the boundary of the map; for each cell the ray passes, it is judged whether any point exists within height h above the ground, and the cell is set as a passable area if no such point exists, or as an obstacle area otherwise.
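The gridding and ray-casting step can be sketched as follows. This is a simplified illustration, not the patent's exact implementation: the robot is assumed to sit at the grid center, the circular robot mask is omitted, and the resolution and ray count are arbitrary choices:

```python
import math

import numpy as np

UNKNOWN, FREE, OBSTACLE = 0, 1, 2

def bev_occupancy(points, roi=10.0, res=0.25, robot_h=0.5, n_rays=360):
    """Compress an (N, 3) point cloud around the robot (assumed at the
    grid center) into a 2D BEV grid.  A cell containing any point whose
    height is within the robot height robot_h is an obstacle candidate;
    rays cast from the robot mark cells FREE until they hit an obstacle
    candidate or leave the region of interest."""
    n = int(roi / res)
    half = roi / 2
    # 1) Grid the cloud on the x-y plane, keeping only points that
    #    could collide with the robot (0 <= z <= robot_h).
    occ = np.zeros((n, n), dtype=bool)
    for x, y, z in np.asarray(points, dtype=float):
        if 0.0 <= z <= robot_h:
            i, j = int((x + half) / res), int((y + half) / res)
            if 0 <= i < n and 0 <= j < n:
                occ[i, j] = True
    # 2) Cast rays outward at a fixed angular resolution.
    grid = np.full((n, n), UNKNOWN, dtype=np.int8)
    for k in range(n_rays):
        a = 2 * math.pi * k / n_rays
        for step in range(n):            # march one cell-length at a time
            i = int((half + step * res * math.cos(a)) / res)
            j = int((half + step * res * math.sin(a)) / res)
            if not (0 <= i < n and 0 <= j < n):
                break                    # ray left the ROI
            if occ[i, j]:
                grid[i, j] = OBSTACLE    # ray stops at the first obstacle
                break
            grid[i, j] = FREE            # passable area
    return grid
```

For a single low obstacle 2 m ahead of the robot, the cells between the robot and the obstacle along that ray are marked FREE and the obstacle cell OBSTACLE, mirroring the ray-stopping rule described above.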
The network architecture used in the invention is shown in fig. 5. The images obtained by the four cameras are first fed into a shared CNN backbone, producing four feature maps at 1/2, 1/4, 1/8, and 1/16 of the original resolution. Before the next step, the four maps are stitched and fused into one tensor using a 1×1 convolution. These image features then undergo a 2D-3D projection, similar to the inverse of the transformation described above; however, the depth corresponding to each pixel is unknown, so a depth distribution over the image is obtained through network learning. After the projection, the last two channels z and c are multiplied to obtain a three-dimensional tensor. This tensor is fed into convolution layers to re-extract features, and a semantic segmentation module then segments out the drivable area and the obstacle area.
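The 2D-3D projection with a learned depth distribution can be illustrated in the spirit of lift-splat methods: for each pixel the network predicts a categorical distribution over D discrete depth bins, and the feature vector is spread along the camera ray by an outer product of the depth probabilities (the z channel) and the feature channels (c). A minimal numpy sketch, with shapes chosen purely for illustration:

```python
import numpy as np

def lift_features(feat, depth_logits):
    """Lift 2D image features into a per-depth feature volume.

    feat:         (C, H, W) fused image features
    depth_logits: (D, H, W) unnormalized per-pixel depth-bin scores
    returns:      (D, C, H, W) volume -- each pixel's feature vector
                  weighted by its softmax depth probability.
    """
    # Softmax over the depth dimension gives a per-pixel depth distribution.
    e = np.exp(depth_logits - depth_logits.max(axis=0, keepdims=True))
    depth_prob = e / e.sum(axis=0, keepdims=True)        # (D, H, W)
    # Outer product of the depth channel (z) and the feature channels (c).
    return depth_prob[:, None, :, :] * feat[None, :, :, :]
```

Because the depth probabilities sum to one per pixel, summing the volume over the depth axis recovers the original feature map; later convolution layers re-extract features from this volume before segmentation.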
The embodiment also provides a terminal device, which comprises a processor and a memory, wherein the memory stores a plurality of instructions, and the processor loads the instructions to execute the indoor pure vision robot obstacle sensing method. The terminal equipment can be smart phones, computers, tablet computers and other equipment.
The terminal device comprises a memory, a readable storage medium, an input unit, a display unit, sensors, an audio circuit, a transmission module, a processor with one or more processing cores, a power supply, and the like.
The memory may be used to store software programs and modules, such as program instructions/modules corresponding to the method in the above embodiments, and the processor executes various functional applications and data processing by running the software programs and modules stored in the memory, so that the vibration reminding mode can be automatically selected according to the current scene where the terminal device is located to perform the processing, thereby not only ensuring that the scenes such as a conference are not disturbed, but also ensuring that the user can perceive an incoming call, and improving the intelligence of the terminal device. The memory may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory. In some examples, the memory may further include memory remotely located with respect to the processor, the remote memory being connectable to the terminal device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input unit may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, the input unit may comprise a touch-sensitive surface as well as other input devices. The touch-sensitive surface, also referred to as a touch display screen or touch pad, may collect touch operations on or near it by a user (e.g., operations by the user with a finger, stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a predetermined program. Alternatively, the touch-sensitive surface may comprise two parts: a touch detection device and a touch controller. The touch detection device detects the position touched by the user, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor, and can receive and execute commands sent by the processor. In addition, the touch-sensitive surface may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave types. The input unit may comprise other input devices in addition to the touch-sensitive surface; in particular, these may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick.
The display unit may be used to display information input by the user or information provided to the user, as well as the various graphical user interfaces of the terminal device, which may be composed of graphics, text, icons, video, and any combination thereof. The display unit may include a display panel, and the display panel may optionally be configured in the form of an LCD (liquid crystal display), an OLED (organic light-emitting diode), or the like. Further, the touch-sensitive surface may overlay the display panel; upon detecting a touch operation on or near it, the touch-sensitive surface communicates the operation to the processor to determine the type of touch event, and the processor then provides the corresponding visual output on the display panel based on the type of touch event. Although in the figures the touch-sensitive surface and the display panel are implemented as two separate components, in some embodiments they may be integrated to implement the input and output functions.
The processor is the control center of the terminal device; it connects the various parts of the entire terminal device through various interfaces and lines, and performs the various functions of the terminal device and processes data by running or executing the software programs and/or modules stored in the memory and calling the data stored in the memory, thereby monitoring the terminal device as a whole. Optionally, the processor may include one or more processing cores. In some embodiments, the processor may integrate an application processor, which primarily handles the operating system, user interface, and applications, with a modem processor, which primarily handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor.
The terminal device also includes a power supply for powering the various components. In some embodiments, the power supply may be logically connected to the processor through a power management system, so that functions such as managing charging, discharging, and power consumption are performed by the power management system. The power supply may also include one or more of a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
This embodiment also provides a computer-readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the indoor pure vision robot obstacle sensing method. It will be understood by those skilled in the art that all or part of the flow of the indoor pure vision robot obstacle sensing method described in the embodiments of the present application may be implemented by controlling related hardware through a computer program; the computer program may be stored in a computer-readable storage medium, such as the memory of the terminal device, and executed by at least one processor in the terminal device, and the execution process may include the flow of the embodiment of the method. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
The foregoing is merely a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto; any equivalent substitution or modification made, within the technical scope disclosed by the present invention, by a person skilled in the art according to the technical scheme of the present invention and its inventive concept shall be covered by the scope of protection of the present invention.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Claims (8)
1. An indoor purely visual robot obstacle sensing method is characterized by comprising the following steps:
collecting, by a mobile robot, images and point clouds in a time-synchronized manner on the basis that the vision and laser radar extrinsic parameters have been calibrated;
using a SLAM method to densely reconstruct the scene from the images and point clouds and converting the reconstruction to a BEV (bird's-eye view) perspective, thereby providing ground-truth obstacle labels for the images; then constructing a data set with the images as input and the obstacle values in the BEV as output;
training a network model based on the data set;
deploying the trained network model on an autonomous robot that uses only cameras.
2. The method for sensing the obstacle of the indoor pure vision robot according to claim 1, wherein the step of collecting the image and the point cloud in a time-synchronous manner by the mobile robot on the basis that the vision and the laser radar external parameters are calibrated comprises the steps of:
arranging a laser radar sensor and a plurality of monocular cameras on a mobile chassis of the robot;
and calibrating the camera internal parameters by using a camera calibration method, and calibrating the camera and the external parameters of the laser radar by using a laser-camera calibration method.
3. The method for sensing the obstacle of the indoor pure vision robot according to claim 2, wherein the calibrating the camera internal parameters by using the camera calibration method comprises the following steps:
camera matrix calibration, including the focal length (f_x, f_y) and the optical center (c_x, c_y); the intrinsic matrix can be represented as follows:

K = [ f_x   0    c_x ]
    [ 0    f_y   c_y ]
    [ 0     0     1  ]

distortion coefficients, the 5 parameters of the distortion mathematical model, D = (k_1, k_2, p_1, p_2, k_3), where k_1, k_2, k_3 are the radial distortion coefficients and p_1, p_2 are the tangential distortion coefficients;
and calibrating the camera intrinsic parameters by a checkerboard calibration method to obtain the camera intrinsic matrix K.
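As a sketch of how the intrinsic matrix K obtained above is used, the following assumes a standard pinhole model; the numeric values are hypothetical, not calibrated results:

```python
import numpy as np

# Hypothetical intrinsics as would be obtained from checkerboard calibration.
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

def project(K, p):
    """Project a camera-frame 3D point (X, Y, Z) to pixel coordinates (u, v)."""
    uvw = K @ p
    return uvw[:2] / uvw[2]  # perspective division by depth

u, v = project(K, np.array([0.1, -0.05, 2.0]))
print(u, v)  # 350.0 225.0
```

The same matrix, inverted, underlies the 2D-3D projection in the network: a pixel plus a depth hypothesis maps back to a camera-frame ray.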
4. A method for sensing an obstacle in an indoor purely visual robot according to claim 3, wherein: the distortion coefficients include radial distortion and tangential distortion;
the distortion coefficient expression is:
x corrected =x(1+k 1 r 2 +k 2 r 4 +k 3 r 6 )
y corrected =y(1+k 1 r 2 +k 2 r 4 +k 3 r 6 );
the tangential distortion expression is:
x corrected =x+[2p 1 xy+p 2 (x 2 +2x 2 )]
y corrected =y+[p 1 (r 2 +2y 2 )+2p 2 xy+]。
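The corrected-coordinate expressions in claim 4 follow the standard Brown-Conrady model and can be implemented directly on normalized image coordinates; the coefficient values below are zero purely for illustration:

```python
import numpy as np

def distort(x, y, k1, k2, k3, p1, p2):
    """Apply radial + tangential distortion to normalized coords (x, y)."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return x_d, y_d

# With all five coefficients zero the mapping reduces to the identity.
print(distort(0.3, -0.2, 0, 0, 0, 0, 0))  # (0.3, -0.2)
```

Undistortion (the direction actually needed at calibration time) inverts this mapping iteratively, since the polynomial has no closed-form inverse.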
5. the method for sensing the obstacle of the indoor pure vision robot according to claim 4, wherein the method comprises the following steps: calibrating the external parameters of the camera and the laser radar by using a laser-camera calibration method comprises the following steps:
fixing the laser-camera rig; the camera part detects the pose of a checkerboard calibration plate using a PnP algorithm; the laser radar part, after the area where the calibration plate is located has been circled, performs RANSAC fitting on the collected laser point cloud to obtain the spatial position of the calibration plate in the laser radar coordinate frame; the calibration plate is then moved, and after several groups of data are sampled, the extrinsic parameters between the camera and the laser radar are calculated.
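The plane fit on the lidar side can be illustrated with a least-squares (SVD) plane estimate; a full pipeline would wrap this in RANSAC hypothesis sampling, which is omitted here for brevity. The points are synthetic, standing in for lidar returns on the calibration plate:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through Nx3 points: returns (unit normal, centroid)."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    return vt[-1], centroid  # normal = direction of least variance

rng = np.random.default_rng(1)
# Synthetic calibration-board returns lying on the plane z = 0.5 (lidar frame).
pts = np.column_stack([rng.uniform(-1, 1, 100),
                       rng.uniform(-1, 1, 100),
                       np.full(100, 0.5)])
normal, c = fit_plane(pts)
print(np.abs(normal))  # approximately [0, 0, 1]
```

Matching several such plane observations against the camera's PnP poses of the same board yields the camera-lidar extrinsics.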
6. The indoor purely visual robotic obstacle awareness method of claim 1, wherein: collecting images and point clouds in the time-synchronized manner, comprising:
implementing a visual odometer with a camera, and realizing scene loop detection through a bag-of-words method;
the laser radar extracts feature points from the laser point cloud through an edge-point/plane-point extractor, and estimates the pose change of the robot between consecutive frames through feature point matching, forming a laser odometer;
and fusing the laser odometer, the visual inertial odometer and loop constraint in a nonlinear pose chart optimization mode to obtain the point cloud of the whole scene.
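The nonlinear pose-graph optimization in this claim can be illustrated on a 1D toy graph, where the problem reduces to linear least squares; real systems optimize SE(3) poses with solvers such as g2o or GTSAM. The edge measurements here are synthetic:

```python
import numpy as np

# Toy pose graph: 4 poses on a line, odometry edges plus one loop-closure edge.
# Each edge (i, j, meas) constrains x_j - x_i ≈ meas.
edges = [(0, 1, 1.0), (1, 2, 1.1), (2, 3, 0.9),  # odometry
         (0, 3, 3.0)]                             # loop-closure constraint
n = 4
A = np.zeros((len(edges) + 1, n))
b = np.zeros(len(edges) + 1)
for row, (i, j, meas) in enumerate(edges):
    A[row, i], A[row, j], b[row] = -1.0, 1.0, meas
A[-1, 0] = 1.0  # anchor the first pose at 0 to fix the gauge freedom

x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(x, 3))
```

The loop-closure row is what corrects accumulated odometry drift; with more edges than unknowns the solver finds the poses that best satisfy all constraints jointly.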
7. The method for sensing the obstacle of the indoor pure vision robot according to claim 4, wherein the method further comprises a sensing device, the sensing device comprising:
the data collection module is used for collecting images and point clouds in a time synchronization mode through the mobile robot on the basis that vision and laser radar external parameters are calibrated;
the data preprocessing module is used for densely reconstructing a scene based on the image and the point cloud by using the SLAM method, converting the scene into a BEV view angle, providing an obstacle truth value for the image, and constructing a data set by taking the image as input and the obstacle value in the BEV as output;
the network training module is used for training a network model on the basis of the data set;
and the real machine deployment module is used for deploying the trained network model on the autonomous robot only using the camera.
8. The method for sensing the obstacle of the indoor pure vision robot according to claim 7, wherein: the data collection module comprises a mobile base, a laser radar sensor and cameras; the cameras are arranged in four groups distributed in the four directions of the mobile base, and the laser radar sensor is connected to the top of the mobile base.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310090323.3A CN116295401A (en) | 2023-02-09 | 2023-02-09 | Indoor pure vision robot obstacle sensing method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116295401A true CN116295401A (en) | 2023-06-23 |
Family
ID=86819498
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310090323.3A Pending CN116295401A (en) | 2023-02-09 | 2023-02-09 | Indoor pure vision robot obstacle sensing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116295401A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||