CN110609560A - Mobile robot obstacle avoidance planning method and computer storage medium

Info

Publication number: CN110609560A
Application number: CN201911037590.4A
Authority: CN
Inventors: 李振; 柏林; 刘彪; 舒海燕; 宿凯; 沈创芸; 祝涛剑; 雷宜辉; 张绍飞; 刘涛
Original assignee: Guangzhou High Rising Robot Co Ltd
Current assignee: Guangzhou High Rising Robot Co Ltd
Priority date: 2019-10-29
Filing date: 2019-10-29
Publication date: 2019-12-24

Abstract

The invention provides a mobile robot obstacle avoidance planning method and a computer storage medium. The method comprises the following steps: S1, establishing a task model, and recording the environment state of the mobile robot in the task model; S2, according to the task model, automatically generating samples based on the A* algorithm; S3, learning the samples with a convolutional neural network to obtain a training result; and S4, using the training result for obstacle avoidance planning of the mobile robot. Because the samples are generated automatically by the A* algorithm, obstacle avoidance paths can be generated for a variety of environment states. A convolutional neural network then learns from these samples and is used for obstacle avoidance planning, which ensures real-time performance while also taking into account path optimality, suitability for dynamic environments, and better generalization capability.

Description

Mobile robot obstacle avoidance planning method and computer storage medium

Technical Field

The present invention relates to the field of mobile robots, and more particularly, to a method for planning obstacle avoidance for a mobile robot and a computer storage medium.

Background

With the continuous development of artificial intelligence technology and robot application fields, there is great demand for mobile robots in many fields at home and abroad, such as mine environment detection, automated workshop transport, and indoor and outdoor security patrol. Autonomous navigation capability is a basic prerequisite for deploying mobile robots in practice. Because the actual environment changes dynamically, designing a better obstacle avoidance planning method that lets the robot adapt to various environments is a very important problem.

The obstacle avoidance planning means that the mobile robot bypasses an obstacle and finally reaches a target point when sensing that a static or dynamic obstacle exists on a planned route through a sensor in the autonomous moving process. At present, the obstacle avoidance planning method of the mobile robot mainly includes the following two methods:

(1) Fuzzy control method: prior knowledge is encoded as a set of rules to build a fuzzy controller that steers the robot around obstacles.

(2) Dynamic window method: several groups of linear and angular velocities are sampled, the robot's future trajectory is simulated for each group using a motion model, the trajectories are evaluated, and the linear and angular velocity corresponding to the best trajectory are selected to control the robot.

The two obstacle avoidance planning methods have many defects:

(1) Fuzzy control method: obstacle avoidance planning is achieved for a specific scene by formulating fuzzy rules, but it is difficult to formulate rules that suit a wide range of environments, so generalization capability is poor.

(2) Dynamic window method: the method performs poorly with dynamic obstacles, and real-time performance is difficult to guarantee when the velocity sampling space is too large.

Disclosure of Invention

In view of this, the present invention provides an obstacle avoidance planning method for a mobile robot and a computer storage medium, which can improve the generalization ability of the mobile robot and ensure the real-time performance.

In order to solve the technical problem, in one aspect, the present invention provides an obstacle avoidance planning method for a mobile robot, where the method includes: S1, establishing a task model, and recording the environment state of the mobile robot in the task model; S2, according to the task model, automatically generating samples based on the A* algorithm; S3, learning the samples with a convolutional neural network to obtain a training result; and S4, using the training result for obstacle avoidance planning of the mobile robot.

According to the obstacle avoidance planning method for the mobile robot provided by the embodiment of the invention, the samples are generated automatically by the A* algorithm, so obstacle avoidance paths can be generated for a variety of environment states. A convolutional neural network then learns from these samples and is used for obstacle avoidance planning, which ensures real-time performance while also taking into account path optimality, suitability for dynamic environments, and better generalization capability.

According to some embodiments of the invention, in step S1, the task model is a grid map.

According to some embodiments of the present invention, in step S1, the walking task of the mobile robot is described by different grid values.

According to some embodiments of the invention, step S2 includes: S21, randomly generating an occupied grid, a free grid and a target grid according to the task model, and respectively assigning values to the occupied grid, the free grid and the target grid; S22, based on the walking task of the mobile robot, planning a path with the A* algorithm, recording the path if planning succeeds, and marking the target as unreachable if planning fails; S23, generating a label according to the planning result; and S24, repeating steps S21 to S23 to generate a plurality of samples.

According to some embodiments of the invention, in step S21, a 30 × 30 grid map is created, where the value of an occupied grid is 1, the value of a free grid is 0, and the value of the target grid is the original value of the occupied grid or the free grid plus 99.

According to some embodiments of the present invention, in step S23, the labels include 9 labels from 0 to 8, where label 0 indicates that the mobile robot does not move, label 1 indicates that the mobile robot moves to the grid marked 1, label 2 indicates that the mobile robot moves to the grid marked 2, and so on. If path planning fails, the label generated for the sample is 0; if path planning succeeds, the label value corresponding to the next grid on the path is selected as the label of the sample.

According to some embodiments of the invention, in step S3, the sample is learned using a 6-layer neural network.

According to some embodiments of the invention, each layer of the neural network contains training parameters, and the layers include: a C1 convolutional layer, which includes 20 convolution kernels of size 3 × 3 and outputs 20 feature maps of 28 × 28 after convolution; an S2 down-sampling layer, which takes the maximum value of each 2 × 2 neighborhood in the feature map and passes it to the next layer, outputting 20 feature maps of 14 × 14; a C3 convolutional layer, which convolves the output of the S2 down-sampling layer with 5 × 5 convolution kernels, includes 50 convolution kernels, and outputs 50 feature maps of 10 × 10; an S4 down-sampling layer, which takes the maximum value of each 2 × 2 neighborhood in the feature map and passes it to the next layer, outputting 50 feature maps of 5 × 5; an F5 fully connected layer, which comprises 100 units, each fully connected to the S4 down-sampling layer; and an L6 Softmax regression classifier, which accepts 100 inputs and produces 9 outputs.

According to some embodiments of the invention, in step S3, the training result is an obstacle avoidance decision of the mobile robot.

In a second aspect, embodiments of the present invention provide a computer storage medium comprising one or more computer instructions that, when executed, implement a method as in the above embodiments.

Drawings

Fig. 1 is a flowchart of an obstacle avoidance planning method for a mobile robot according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of a task model in the obstacle avoidance planning method for a mobile robot according to the embodiment of the present invention;

Fig. 3 is a schematic diagram of a sample label in the obstacle avoidance planning method for a mobile robot according to the embodiment of the present invention;

Fig. 4 is a schematic diagram of a neural network in the obstacle avoidance planning method for a mobile robot according to the embodiment of the present invention;

Fig. 5 is a schematic diagram of an electronic device according to an embodiment of the invention.

Reference numerals:

an electronic device 300;

a memory 310; an operating system 311; an application 312;

a processor 320; a network interface 330; an input device 340; a hard disk 350; a display device 360.

Detailed Description

The following detailed description of embodiments of the present invention will be made with reference to the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.

First, an obstacle avoidance planning method for a mobile robot according to an embodiment of the present invention is described in detail with reference to the accompanying drawings.

As shown in fig. 1 to 4, the obstacle avoidance planning method for a mobile robot according to the embodiment of the present invention includes the following steps:

S1, establishing a task model, and recording the environment state of the mobile robot in the task model.

S2, according to the task model, automatically generating samples based on the A* algorithm.

S3, learning the samples with a convolutional neural network to obtain a training result.

S4, using the training result for obstacle avoidance planning of the mobile robot.

In other words, when the obstacle avoidance planning method according to the embodiment of the invention is used, a task model capable of describing the environment state of the mobile robot is first established; the task model can be adjusted as appropriate to that environment state. Obstacle avoidance planning samples are then generated automatically from the task model by the A* algorithm. Finally, a convolutional neural network is trained on the samples, and the obtained training result can be used for obstacle avoidance planning of the mobile robot.

Therefore, according to the mobile robot obstacle avoidance planning method provided by the embodiment of the invention, the samples are generated automatically by the A* algorithm, so obstacle avoidance paths can be generated for a variety of environment states. A convolutional neural network then learns from these samples and is used for obstacle avoidance planning, which ensures real-time performance while also taking into account path optimality, suitability for dynamic environments, and better generalization capability.

According to an embodiment of the invention, in step S1, the task model is a grid map. Preferably, the walking task of the mobile robot is described in different grid values.

That is to say, in the obstacle avoidance planning method for the mobile robot according to the embodiment of the present invention, a grid map is used as the task model to describe the environment state of the mobile robot. Different grids in the grid map may take different grid values, and the walking task of the mobile robot is described by these values. Describing the environment state through a grid map gives the method better generalization capability: it can be applied to various scenes and can generate obstacle avoidance paths in various environment states.

In some embodiments of the invention, step S2 includes:

S21, randomly generating an occupied grid, a free grid and a target grid according to the task model, and respectively assigning values to the occupied grid, the free grid and the target grid.

S22, based on the walking task of the mobile robot, planning a path with the A* algorithm, recording the path if planning succeeds, and marking the target as unreachable if planning fails.

S23, generating a label according to the planning result.

S24, repeating steps S21 to S23 to generate a plurality of samples.

In step S21, a 30 × 30 grid map is created, where the value of an occupied grid is 1, the value of a free grid is 0, and the value of the target grid is the original value of the occupied grid or the free grid plus 99. Further, in step S23, the labels include 9 labels from 0 to 8, where label 0 indicates that the mobile robot does not move, label 1 indicates that the mobile robot moves to the grid marked 1, label 2 indicates that the mobile robot moves to the grid marked 2, and so on. If path planning fails, the label generated for the sample is 0; if path planning succeeds, the label value corresponding to the next grid on the path is selected as the label of the sample.

It should be noted that the operations of step S21, namely randomly generating the occupied grid, the free grid and the target grid according to the task model and assigning values to them, may be completed when the task model is established in step S1. Alternatively, step S1 may establish only the task model itself, with the random generation of the grids and the assignment of their values completed in step S21.

Specifically, Fig. 2 shows a 30 × 30 grid map. A grid value of 1 indicates an occupied grid, that is, a grid onto which the mobile robot cannot travel, as shown by the 2 black grids in the figure. The remaining grids are free grids with a value of 0, onto which the mobile robot can travel. The mobile robot is located in a fixed grid in the grid map, and the starting point is the grid where the mobile robot is located, as shown in Fig. 2. The grid value of the target point is its original value plus 99: if the target grid lies on a free grid, its value is 99, and if it lies on an occupied grid, its value is 100.
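The grid encoding described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `make_task_grid`, the obstacle ratio, and the choice of (15, 15) as the robot's fixed cell are assumptions, since the description states only that the map is 30 × 30, that cells are generated randomly, and that the target value is the original value plus 99.

```python
import random

GRID_SIZE = 30      # 30 x 30 grid map, as in the description
FREE, OCCUPIED = 0, 1
TARGET_OFFSET = 99  # target cell value = original value + 99

def make_task_grid(obstacle_ratio=0.2, start=(15, 15), rng=None):
    """Return (grid, target) for one randomly generated task.

    obstacle_ratio and start are illustrative assumptions; the source only
    says the robot sits in a fixed cell and that cells are generated randomly.
    """
    rng = rng or random.Random()
    grid = [[OCCUPIED if rng.random() < obstacle_ratio else FREE
             for _ in range(GRID_SIZE)] for _ in range(GRID_SIZE)]
    grid[start[0]][start[1]] = FREE  # assumption: the robot's own cell is free
    # Pick any cell other than the start as the target; it may be occupied.
    while True:
        target = (rng.randrange(GRID_SIZE), rng.randrange(GRID_SIZE))
        if target != start:
            break
    grid[target[0]][target[1]] += TARGET_OFFSET  # 0 -> 99, 1 -> 100
    return grid, target
```

Allowing the target to fall on an occupied cell (value 100) is deliberate in this sketch, because such tasks are the ones that later produce the "unreachable" label 0.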

In step S22, a path is planned with the A* algorithm based on the walking task of the mobile robot generated in step S21. If planning succeeds, the path is recorded; if planning fails, the target is marked as unreachable. Labels are then generated according to the planning result. As shown in Fig. 3, the labels include 9 labels from 0 to 8, where label 0 indicates that the mobile robot does not move, label 1 indicates that the mobile robot moves to the grid marked 1, label 2 indicates that the mobile robot moves to the grid marked 2, and so on. If path planning fails, the label generated for the sample is 0; if path planning succeeds, the label value corresponding to the next grid on the path is selected as the label of the sample. The label corresponding to the sample shown in Fig. 2 is 3.
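Steps S22 and S23 can be illustrated with a small A* planner that returns the label of the first move. The sketch below rests on several assumptions that are not in the source: the numbering of the eight neighbouring cells in `MOVES` is hypothetical (the actual numbering is defined by Fig. 3), diagonal moves cost √2, diagonal moves between two obstacles are allowed, and a target placed on an occupied cell (value 100) is treated as unreachable.

```python
import heapq
import math

# Hypothetical numbering of the 8 neighbouring cells (row offset, col offset).
# The patent defines labels 1 to 8 in its Fig. 3, which is not reproduced here.
MOVES = {1: (-1, -1), 2: (-1, 0), 3: (-1, 1), 4: (0, -1),
         5: (0, 1), 6: (1, -1), 7: (1, 0), 8: (1, 1)}

def is_blocked(value):
    # 1 = occupied cell, 100 = target placed on an occupied cell.
    return value in (1, 100)

def astar(grid, start, goal):
    """Return the list of cells from start to goal, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    if is_blocked(grid[goal[0]][goal[1]]):
        return None

    def h(cell):  # octile distance, admissible for 8-connected grids
        dr, dc = abs(cell[0] - goal[0]), abs(cell[1] - goal[1])
        return max(dr, dc) + (math.sqrt(2) - 1) * min(dr, dc)

    open_heap = [(h(start), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry
        for dr, dc in MOVES.values():
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if is_blocked(grid[nxt[0]][nxt[1]]):
                continue
            new_cost = cost + math.hypot(dr, dc)
            if new_cost < best_cost.get(nxt, math.inf):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(open_heap, (new_cost + h(nxt), new_cost, nxt))
    return None

def label_for(grid, start, goal):
    """Label 0 if planning fails, else the label of the first move on the path."""
    path = astar(grid, start, goal)
    if path is None or len(path) < 2:
        return 0
    step = (path[1][0] - start[0], path[1][1] - start[1])
    return next(k for k, v in MOVES.items() if v == step)
```

Only the first move of the planned path becomes the label, so each sample pairs one full grid with one of nine classes; this is what lets the later network be trained as an ordinary classifier.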

After a label is generated, the steps S21-S23 are repeated until enough samples are generated, the number of samples being adjustable according to the actual use condition.

Optionally, in some embodiments of the present invention, in step S3, the samples are learned using a 6-layer neural network. Preferably, each layer of the neural network contains training parameters, and the layers include: a C1 convolutional layer, an S2 down-sampling layer, a C3 convolutional layer, an S4 down-sampling layer, an F5 fully connected layer, and an L6 Softmax regression classifier.

Specifically, as shown in Fig. 4, the C1 convolutional layer includes 20 convolution kernels, each of size 3 × 3, and outputs 20 feature maps of 28 × 28 after convolution; the S2 down-sampling layer takes the maximum value of each 2 × 2 neighborhood in the feature map and passes it to the next layer, outputting 20 feature maps of 14 × 14; the C3 convolutional layer convolves the output of the S2 down-sampling layer with 5 × 5 convolution kernels, includes 50 convolution kernels, and outputs 50 feature maps of 10 × 10; the S4 down-sampling layer takes the maximum value of each 2 × 2 neighborhood in the feature map and passes it to the next layer, outputting 50 feature maps of 5 × 5; the F5 fully connected layer includes 100 units, each fully connected to the S4 down-sampling layer; and the L6 Softmax regression classifier accepts 100 inputs and produces 9 outputs.
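The feature-map sizes listed above follow from the usual output-size formula, out = (in + 2·padding − kernel) / stride + 1, applied to a 30 × 30 input. The short check below traces the sizes layer by layer and counts parameters. It assumes a single input channel, no padding, stride-1 convolutions, stride-2 pooling, and one bias per kernel or unit; these choices reproduce the stated sizes but are not spelled out in the source.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

# (name, kind, kernel, stride, output channels) for the layers described above.
# No padding and stride-1 convolutions are assumptions that reproduce the
# stated feature-map sizes; the source does not state them explicitly.
LAYERS = [
    ("C1", "conv", 3, 1, 20),
    ("S2", "pool", 2, 2, 20),
    ("C3", "conv", 5, 1, 50),
    ("S4", "pool", 2, 2, 50),
]

def trace_shapes(input_size=30, input_channels=1):
    """Return per-layer (channels, height, width), flattened size, parameter count."""
    size, channels, params, shapes = input_size, input_channels, 0, {}
    for name, kind, kernel, stride, out_ch in LAYERS:
        if kind == "conv":
            params += (kernel * kernel * channels + 1) * out_ch  # weights + biases
        size, channels = conv_out(size, kernel, stride), out_ch
        shapes[name] = (channels, size, size)
    flat = channels * size * size
    params += (flat + 1) * 100   # F5: 100 fully connected units
    params += (100 + 1) * 9      # L6: softmax over the 9 labels
    return shapes, flat, params
```

Under these assumptions the S4 output flattens to 50 × 5 × 5 = 1250 values feeding F5, and the whole network has 151,259 trainable parameters, most of them in F5.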

Finally, in step S3, the training result is an obstacle avoidance decision of the mobile robot. Therefore, the neural network structure is adopted to train and learn the samples, and the training result can be used for obstacle avoidance planning of the mobile robot.

In a specific implementation of the obstacle avoidance planning method according to the embodiment of the invention, a 30 × 30 local grid map is first established according to the current position of the mobile robot. The grids of this map are then assigned values according to the walking task, map information, sensor information, or other known information of the mobile robot. The resulting grid information is input into the trained network, which outputs an obstacle avoidance decision. In actual use, one-step or multi-step prediction can be carried out depending on the specific application, and the robot is finally controlled to avoid obstacles.
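The inference step just described can be sketched as follows. Everything named here is hypothetical: `local_window`, `decide`, and the `network` callable (assumed to return 9 raw scores for a 30 × 30 grid) are illustrative names, and centring the robot at row and column 15 and treating cells outside the known map as occupied are assumptions rather than details given in the source.

```python
import math

WINDOW = 30  # side length of the local grid map

def local_window(global_map, robot, target):
    """Cut a WINDOW x WINDOW grid around the robot from a larger 0/1 map.

    Cells outside the global map are treated as occupied (an assumption).
    The robot is placed at row/column WINDOW // 2 (also an assumption).
    """
    half = WINDOW // 2
    rows, cols = len(global_map), len(global_map[0])
    window = []
    for r in range(robot[0] - half, robot[0] - half + WINDOW):
        row = []
        for c in range(robot[1] - half, robot[1] - half + WINDOW):
            inside = 0 <= r < rows and 0 <= c < cols
            value = global_map[r][c] if inside else 1
            if (r, c) == target:
                value += 99  # mark the target cell as in the training samples
            row.append(value)
        window.append(row)
    return window

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def decide(window, network):
    """Return the label (0 to 8) with the highest softmax probability."""
    probs = softmax(network(window))
    return max(range(len(probs)), key=probs.__getitem__)
```

A multi-step prediction would apply `decide` repeatedly, moving the robot one cell per call and rebuilding the window each time. Note that this sketch marks no target when the goal lies outside the 30 × 30 window; how the method handles that case is not stated in the source.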

In summary, according to the obstacle avoidance planning method for the mobile robot in the embodiment of the invention, the samples used to train the neural network are generated from the results of A* path planning, so the obstacle avoidance decisions inherit the path optimality of the A* algorithm. The trained neural network makes decisions efficiently, is suitable for dynamic environments to a certain extent, has good generalization capability, and is applicable to various scenes.

In addition, the present invention also provides a computer storage medium, where the computer storage medium includes one or more computer instructions, and when executed, the one or more computer instructions implement any one of the above-mentioned obstacle avoidance planning methods for a mobile robot.

That is, the computer storage medium stores a computer program, and when the computer program is executed by a processor, the processor executes any one of the above-mentioned obstacle avoidance planning methods for a mobile robot.

As shown in fig. 5, an embodiment of the present invention provides an electronic device 300, which includes a memory 310 and a processor 320, where the memory 310 is configured to store one or more computer instructions, and the processor 320 is configured to call and execute the one or more computer instructions, so as to implement any one of the methods described above.

That is, the electronic device 300 includes: a processor 320 and a memory 310, in which memory 310 computer program instructions are stored, wherein the computer program instructions, when executed by the processor, cause the processor 320 to perform any of the methods described above.

Further, as shown in fig. 5, the electronic device 300 further includes a network interface 330, an input device 340, a hard disk 350, and a display device 360.

The various interfaces and devices described above may be interconnected by a bus architecture. A bus architecture may be any architecture that may include any number of interconnected buses and bridges. Various circuits of one or more Central Processing Units (CPUs), represented in particular by processor 320, and one or more memories, represented by memory 310, are coupled together. The bus architecture may also connect various other circuits such as peripherals, voltage regulators, power management circuits, and the like. It will be appreciated that a bus architecture is used to enable communications among the components. The bus architecture includes a power bus, a control bus, and a status signal bus, in addition to a data bus, all of which are well known in the art and therefore will not be described in detail herein.

The network interface 330 may be connected to a network (e.g., the internet, a local area network, etc.), and may obtain relevant data from the network and store the relevant data in the hard disk 350.

The input device 340 may receive various commands input by an operator and send the commands to the processor 320 for execution. The input device 340 may include a keyboard or a pointing device (e.g., a mouse, a trackball, a touch pad, a touch screen, or the like).

The display device 360 may display the result of the instructions executed by the processor 320.

The memory 310 is used for storing programs and data necessary for operating the operating system, and data such as intermediate results in the calculation process of the processor 320.

It will be appreciated that memory 310 in embodiments of the invention may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), or a flash memory. Volatile memory can be Random Access Memory (RAM), which acts as external cache memory. The memory 310 of the apparatus and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.

In some embodiments, memory 310 stores the following elements, executable modules or data structures, or a subset thereof, or an expanded set thereof: an operating system 311 and application programs 312.

The operating system 311 includes various system programs, such as a framework layer, a core library layer, and a driver layer, and is used for implementing various basic services and processing hardware-based tasks. The application programs 312 include various application programs, such as a browser, and are used for implementing various application services. A program implementing the methods of embodiments of the present invention may be included in the application programs 312.

The method disclosed by the above embodiment of the present invention can be applied to the processor 320, or implemented by the processor 320. Processor 320 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 320. The processor 320 may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 310, and the processor 320 reads the information in the memory 310 and completes the steps of the method in combination with its hardware.

It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units designed to perform the functions described herein, or a combination thereof.

For a software implementation, the techniques described herein may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.

In particular, the processor 320 is also configured to read the computer program and execute any of the methods described above.

In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical division, and other divisions are possible in practice: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in another form.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be physically included alone, or two or more units may be integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.

The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) to execute some steps of the method according to various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A mobile robot obstacle avoidance planning method is characterized by comprising the following steps:

S1, establishing a task model, and recording the environment state of the mobile robot in the task model;

S2, according to the task model, automatically generating samples based on an A* algorithm;

S3, learning the samples by adopting a convolutional neural network to obtain a training result; and

S4, using the training result for obstacle avoidance planning of the mobile robot.

2. The method of claim 1, wherein in step S1, the task model is a grid map.

3. The method according to claim 2, characterized in that in step S1, the walking task of the mobile robot is described in different grid values.

4. The method according to claim 3, wherein step S2 includes:

S21, randomly generating an occupied grid, a free grid and a target grid according to the task model, and respectively assigning values to the occupied grid, the free grid and the target grid;

S22, based on the walking task of the mobile robot, planning a path by adopting an A* algorithm, recording the path if the path is planned successfully, and marking the target as unreachable if the path is planned unsuccessfully;

S23, generating a label according to the planning result; and

S24, repeating steps S21 to S23 to generate a plurality of samples.

5. The method according to claim 4, wherein in step S21, a 30 x 30 grid map is created, wherein the occupancy grid has a value of 1, the free grid has a value of 0, and the target grid has a value of 99 added to the original value of the occupancy grid or the free grid.

6. The method according to claim 5, wherein in step S23, the labels include 9 labels from 0 to 8, where the label 0 indicates that the mobile robot does not move, the label 1 indicates that the mobile robot moves to the grid where 1 is located, the label 2 indicates that the mobile robot moves to the grid where 2 is located, and so on, if the mobile robot plans the path unsuccessfully, the label generated by the sample is 0, and if the mobile robot plans the path successfully, the label value corresponding to the next grid that the path passes through is selected as the label of the sample.

7. The method of claim 3, wherein in step S3, the sample is learned using a 6-layer neural network.

8. The method of claim 7, wherein each layer of neural network contains training parameters comprising:

a C1 convolutional layer, the C1 convolutional layer including 20 convolution kernels, each of size 3 × 3, and outputting 20 feature maps of 28 × 28 after convolution;

an S2 down-sampling layer, wherein the S2 down-sampling layer takes the maximum value of each 2 × 2 neighborhood in the feature map and outputs the maximum value to the next layer, the output being 20 feature maps of 14 × 14;

a C3 convolutional layer, the C3 convolutional layer convolving the output of the S2 down-sampling layer with 5 × 5 convolution kernels, including 50 convolution kernels, and outputting 50 feature maps of 10 × 10;

an S4 down-sampling layer, wherein the S4 down-sampling layer takes the maximum value of each 2 × 2 neighborhood in the feature map and outputs the maximum value to the next layer, the output being 50 feature maps of 5 × 5;

an F5 fully connected layer, the F5 fully connected layer comprising 100 units, each unit fully connected to the S4 down-sampling layer; and

an L6 Softmax regression classifier that accepts 100 inputs and produces 9 outputs.

9. The method of claim 1, wherein in step S3, the training result is an obstacle avoidance decision of the mobile robot.

10. A computer storage medium comprising one or more computer instructions which, when executed, implement the method of any one of claims 1-9.