CN117243539A - Artificial intelligence obstacle surmounting and escaping method, device and control system

Artificial intelligence obstacle surmounting and escaping method, device and control system

Info

Publication number
CN117243539A
Authority
CN
China
Prior art keywords
feature
target
obstacle
fusion
infrared
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311457468.9A
Other languages
Chinese (zh)
Inventor
王忠林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jihai Science And Technology Shenzhen Co ltd
Original Assignee
Jihai Science And Technology Shenzhen Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jihai Science And Technology Shenzhen Co ltd filed Critical Jihai Science And Technology Shenzhen Co ltd
Priority to CN202311457468.9A
Publication of CN117243539A
Legal status: Pending

Classifications

    • A47L 11/4061: Steering means; means for avoiding obstacles; details related to the place where the driver is accommodated
    • A47L 11/24: Floor-sweeping machines, motor-driven
    • A47L 11/4002: Installations of electric equipment
    • G06V 10/766: Image or video recognition using pattern recognition or machine learning, using regression, e.g. by projecting features on hyperplanes
    • G06V 10/806: Fusion of extracted features, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level
    • G06V 10/82: Image or video recognition using pattern recognition or machine learning, using neural networks
    • G06V 20/58: Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • A47L 2201/04: Robotic cleaning machines; automatic control of the travelling movement; automatic obstacle detection
    • A47L 2201/06: Robotic cleaning machines; control of the cleaning action for autonomous devices; automatic detection of the surface condition before, during or after cleaning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The invention relates to the technical field of artificial intelligence, and discloses an artificial intelligence obstacle surmounting and escaping method, device and control system, which are used for improving the accuracy of artificial intelligence obstacle surmounting and escaping. The method comprises the following steps: acquiring a plurality of infrared environment pictures; inputting the plurality of infrared environment pictures into a multi-scale ASFF network for infrared feature extraction to obtain a plurality of infrared feature maps of different scales; performing cross-layer fusion processing on the plurality of infrared feature maps of different scales to obtain a plurality of fusion feature maps; inputting the plurality of fusion feature maps into an attention-enhancing encoding network for target feature enhancement processing to obtain a plurality of enhanced feature maps; inputting the plurality of enhanced feature maps into an obstacle recognition model for obstacle recognition to obtain target obstacle information; constructing a static grid map from the target obstacle information, and generating a cost potential field according to the target obstacle information; and based on the static grid map, planning an escape path for the cleaning robot through the cost potential field to generate a target escape route.

Description

Artificial intelligence obstacle surmounting and escaping method, device and control system
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an artificial intelligence obstacle surmounting and escaping method, an artificial intelligence obstacle surmounting and escaping device and a control system.
Background
The use of cleaning robots has become increasingly popular in modern society, especially in home and commercial environments. These robots are designed to perform cleaning tasks autonomously, reducing people's workload. In practical environments, however, a cleaning robot often faces complicated obstacles, narrow passages, and insufficient light, which degrade its performance or cause it to become stuck.
Many cleaning robots use sensors to perceive the surrounding environment, but these sensors are limited, so obstacles are sensed less accurately. This results in collisions or incomplete obstacle avoidance. Most cleaning robots use basic path planning algorithms, but these algorithms are not flexible enough in complex environments: they cannot account for the dynamics of obstacles or complex terrain. Some cleaning robot systems suffer delay problems due to limited computational resources or inefficient algorithms; when a fast response is required, this can mean failing to avoid an obstacle or to escape in time. Many cleaning robot systems also model the environment inaccurately, which means they cannot understand the environment effectively or predict the position or movement of obstacles.
Disclosure of Invention
The invention provides an artificial intelligence obstacle surmounting and escaping method, device and control system, which are used for improving the accuracy of artificial intelligence obstacle surmounting and escaping.
The first aspect of the invention provides an artificial intelligence obstacle surmounting and escaping method, which comprises the following steps:
acquiring a plurality of infrared environment pictures acquired by a preset cleaning robot;
inputting a plurality of the infrared environment pictures into a preset multi-scale ASFF network to extract infrared characteristics, so as to obtain a plurality of infrared characteristic pictures with different scales;
performing cross-layer fusion processing on a plurality of infrared feature images with different scales to obtain a plurality of fusion feature images;
inputting a plurality of the fusion feature images into a preset attention-enhancing coding network to perform target feature enhancement processing to obtain a plurality of enhancement feature images;
inputting a plurality of enhancement feature maps into a preset obstacle recognition model to perform obstacle recognition to obtain target obstacle information, wherein the target obstacle information comprises: obstacle position information and obstacle size information;
constructing a static grid map through the target obstacle information, and generating a cost potential field according to the target obstacle information;
and planning an escape path for the cleaning robot through the cost potential field based on the static grid map, generating a target escape route, and controlling the cleaning robot to travel along the target escape route.
With reference to the first aspect, in a first implementation manner of the first aspect of the present invention, the performing cross-layer fusion processing on the multiple infrared feature maps with different scales to obtain multiple fusion feature maps includes:
respectively carrying out weight coefficient calculation on each infrared characteristic graph with different scales to obtain weight coefficient values corresponding to each infrared characteristic graph with different scales;
based on the weight coefficient value corresponding to each infrared characteristic image with different scales, carrying out weighted fusion on a plurality of infrared characteristic images with different scales to obtain a plurality of candidate fusion characteristic images;
performing feature level division on the candidate fusion feature graphs to obtain a plurality of shallow network high-resolution low-level features and a plurality of deep network low-resolution high-level semantic features;
and performing cross-layer fusion processing on the candidate fusion feature graphs based on the high-resolution low-level features of the shallow networks and the low-resolution high-level semantic features in the deep networks to obtain a plurality of fusion feature graphs.
With reference to the first aspect, in a second implementation manner of the first aspect of the present invention, inputting the plurality of fusion feature maps into a preset attention-enhancing encoding network to perform target feature enhancement processing, to obtain a plurality of enhancement feature maps, includes:
inputting a plurality of fusion feature images into the attention-enhancing coding network to perform hole convolution operation to obtain infrared features corresponding to each fusion feature image and receptive field size data corresponding to each fusion feature image;
based on the receptive field size data corresponding to each fusion feature map, inputting the infrared features corresponding to each fusion feature map into a first pooling layer for carrying out an average pooling operation to generate an average pooling feature map;
inputting the infrared features corresponding to each fusion feature map into a second pooling layer for maximum pooling operation based on the receptive field size data corresponding to each fusion feature map, and generating a maximum pooling feature map;
performing feature addition processing on the average pooling feature map and the maximum pooling feature map to obtain summed feature data, and performing data mapping on the summed feature data through a preset activation function to obtain the channel weight coefficients corresponding to the attention-enhancing encoding network;
And carrying out target feature enhancement processing on the fusion features based on the channel weight coefficient to obtain a plurality of enhancement feature graphs.
With reference to the first aspect, in a third implementation manner of the first aspect of the present invention, the inputting a plurality of enhancement feature maps into a preset obstacle recognition model to perform obstacle recognition, to obtain target obstacle information includes:
inputting a plurality of enhancement feature images into the obstacle recognition model for primary recognition to obtain initial obstacle information, and inputting a plurality of enhancement feature images into the obstacle recognition model for class loss calculation to obtain class loss data corresponding to each enhancement feature image;
inputting a plurality of enhancement feature images into the obstacle recognition model to calculate offset loss, so as to obtain offset loss data corresponding to each enhancement feature image;
inputting a plurality of enhancement feature images into the obstacle recognition model for regression loss calculation to obtain regression loss data corresponding to each enhancement feature image;
carrying out overall network loss calculation on the category loss data, the offset loss data and the regression loss data to obtain target overall network loss;
And carrying out information correction on the initial obstacle information based on the target overall network loss to obtain the target obstacle information.
With reference to the third implementation manner of the first aspect, in a fourth implementation manner of the first aspect of the present invention, the inputting the plurality of enhancement feature maps into a class loss function of a preset obstacle recognition model to perform class loss calculation, to obtain class loss data corresponding to each enhancement feature map includes:
inputting a plurality of enhancement feature images into a class loss function of a preset obstacle recognition model to analyze a target real region frame, so as to obtain a target real region frame corresponding to each enhancement feature image;
respectively carrying out target center point calculation on the target real region frames corresponding to each enhanced feature map to obtain a plurality of target center points;
performing self-adaptive feature fusion processing on a plurality of target center points to obtain a plurality of down-sampling center point coordinates;
performing Gaussian kernel mapping on the coordinates of the down-sampling center points to obtain a plurality of heatmaps;
and performing confidence prediction value fusion on each heatmap respectively to obtain the category loss data corresponding to each enhanced feature map.
With reference to the first aspect, in a fifth implementation manner of the first aspect of the present invention, the constructing a static grid map according to the target obstacle information, and generating a cost potential field according to the target obstacle information, includes:
constructing a two-dimensional grid for the target obstacle information to obtain two-dimensional grid data corresponding to the target obstacle information;
performing position calibration on the cleaning robot to obtain a spatial position coordinate corresponding to the cleaning robot;
based on the space position coordinates, carrying out three-dimensional space mapping on the two-dimensional grid data to obtain corresponding three-dimensional grid data;
performing static grid map construction based on the three-dimensional grid data to obtain a corresponding static grid map;
performing grid cell traversal on the static grid map to obtain a plurality of grid cells corresponding to the static grid map;
performing initial potential energy value matching on each grid cell to obtain an initial potential energy value corresponding to each grid cell;
based on the initial potential energy value corresponding to each grid cell, performing grid distance calculation on the target obstacle information to obtain a grid distance set, and generating a corresponding cost potential energy value set through the grid distance set;
And constructing a cost potential field on the static grid map based on the cost potential value set to obtain the cost potential field.
With reference to the first aspect, in a sixth implementation manner of the first aspect of the present invention, the planning, based on the static grid map, an escape path for the cleaning robot through the cost potential field, generating a target escape route, and controlling the cleaning robot to travel along the target escape route, includes:
calculating equipotential lines through the cost potential field and the static grid map to obtain corresponding equipotential lines;
based on the equipotential lines, carrying out passing point coordinate calculation on the cleaning robot to obtain a corresponding passing point coordinate set;
generating an initial path based on the passing point coordinate set to obtain an initial escape path corresponding to the cleaning robot;
calculating the path length and smoothness of the initial escape path to obtain a corresponding initial path length and path smoothness;
calculating path correction point positions based on the initial path length and the path smoothness to obtain a plurality of path correction point coordinates;
and performing path correction on the initial escape path based on the path correction point coordinates to obtain the target escape route, and controlling the cleaning robot to travel along the target escape route.
The second aspect of the present invention provides an artificial intelligence obstacle surmounting and escaping device, comprising:
the acquisition module is used for acquiring a plurality of infrared environment pictures acquired by a preset cleaning robot;
the input module is used for inputting a plurality of the infrared environment pictures into a preset multi-scale ASFF network to extract infrared characteristics so as to obtain a plurality of infrared characteristic diagrams with different scales;
the fusion module is used for carrying out cross-layer fusion processing on the infrared characteristic images with different scales to obtain a plurality of fusion characteristic images;
the enhancement module is used for inputting the multiple fusion feature images into a preset enhanced attention coding network to perform target feature enhancement processing to obtain multiple enhancement feature images;
the recognition module is used for inputting a plurality of the enhancement feature images into a preset obstacle recognition model to perform obstacle recognition to obtain target obstacle information, wherein the target obstacle information comprises: obstacle position information and obstacle size information;
the construction module is used for constructing a static grid map through the target obstacle information and generating a cost potential field according to the target obstacle information;
and the planning module is used for planning an escape path for the cleaning robot through the cost potential field based on the static grid map, generating a target escape route, and controlling the cleaning robot to travel along the target escape route.
The third aspect of the invention provides an artificial intelligence obstacle surmounting and escaping device, comprising: a memory and at least one processor, the memory having instructions stored therein; the at least one processor invokes the instructions in the memory to cause the artificial intelligence obstacle surmounting and escaping device to perform the artificial intelligence obstacle surmounting and escaping method described above.
A fourth aspect of the present invention provides a computer readable storage medium having instructions stored therein that, when executed on a computer, cause the computer to perform the artificial intelligence obstacle surmounting and escaping method described above.
In the technical scheme provided by the invention, a plurality of infrared environment pictures acquired by the cleaning robot are acquired; the plurality of infrared environment pictures are input into a multi-scale ASFF network for infrared feature extraction to obtain a plurality of infrared feature maps of different scales; cross-layer fusion processing is performed on the plurality of infrared feature maps of different scales to obtain a plurality of fusion feature maps; the plurality of fusion feature maps are input into an attention-enhancing encoding network for target feature enhancement processing to obtain a plurality of enhanced feature maps; the plurality of enhanced feature maps are input into an obstacle recognition model for obstacle recognition to obtain target obstacle information, wherein the target obstacle information comprises obstacle position information and obstacle size information; a static grid map is constructed from the target obstacle information, and a cost potential field is generated according to the target obstacle information; and based on the static grid map, an escape path is planned for the cleaning robot through the cost potential field, a target escape route is generated, and the cleaning robot is controlled to travel along the target escape route. By using infrared environment pictures and a multi-scale ASFF network, the robot can perceive the environment better, especially when light is insufficient or vision is limited, improving its ability to identify obstacles, including their position and size information. The multi-scale ASFF network and the attention-enhancing encoding network help extract and enhance features in infrared images, which helps the robot understand the environment better, especially in the presence of complex textures or low contrast. By converting the target obstacle information into a static grid map and a cost potential field, the robot can plan paths better: the cost potential field reflects the degree of risk at different positions in the environment, helping the robot select a safe path and avoid collisions. The static grid map and the cost potential field enable the robot to plan an escape route in a complex environment, meaning that it can avoid obstacles more effectively. Through the above processing steps, the cleaning robot can navigate and handle complex environments more independently, thereby improving the accuracy of artificial intelligence obstacle surmounting and escaping.
Drawings
FIG. 1 is a schematic diagram of an embodiment of an artificial intelligence obstacle surmounting and escaping method according to an embodiment of the present invention;
FIG. 2 is a flowchart of inputting a plurality of fusion feature maps into a preset attention-enhancing encoding network for target feature enhancement processing in an embodiment of the present invention;
FIG. 3 is a flowchart of inputting a plurality of enhanced feature maps into a preset obstacle recognition model for obstacle recognition to obtain target obstacle information in an embodiment of the present invention;
FIG. 4 is a flowchart of inputting a plurality of enhanced feature maps into an obstacle recognition model for category loss calculation to obtain category loss data corresponding to each enhanced feature map in an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of an artificial intelligence obstacle surmounting and escaping device according to the present invention;
FIG. 6 is a schematic diagram of another embodiment of an artificial intelligence obstacle surmounting and escaping device according to the present invention.
Detailed Description
The embodiment of the invention provides an artificial intelligence obstacle surmounting and escaping method, device and control system, which are used for improving the accuracy of artificial intelligence obstacle surmounting and escaping.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
For ease of understanding, a specific flow of an embodiment of the present invention is described below. Referring to FIG. 1, an embodiment of the artificial intelligence obstacle surmounting and escaping method in an embodiment of the present invention includes:
s101, acquiring a plurality of infrared environment pictures acquired by a preset cleaning robot;
it can be understood that the execution body of the invention may be an artificial intelligence obstacle surmounting and escaping device, or a terminal or a server; the execution body is not specifically limited here. The embodiment of the invention is described by taking a server as the execution body as an example.
Specifically, in order to collect infrared environment pictures, the cleaning robot needs to be equipped with infrared sensors. These sensors are mounted at suitable positions on the robot, typically at the front, to cover the area ahead. The infrared sensors detect the infrared radiation emitted by objects in the environment; this radiation gives the robot distance and position information about objects, which is key to obstacle surmounting and avoidance. Once the infrared sensors are in place, the cleaning robot can begin taking infrared environment pictures. Data acquisition is a periodic process in which the infrared sensors regularly capture infrared images of the environment and transmit them, in the form of digital data, to the robot's control system. The data acquisition process requires detailed configuration and setup. The frequency of data acquisition, i.e. how often an image is captured, must be determined according to the task requirements of the robot; for example, if the robot needs to move at higher speed, more frequent data acquisition and faster decisions are required. Another key setting is image resolution. Resolution determines the sharpness of an image and is typically expressed in pixels. Higher resolution provides more detailed information but also increases the complexity of data processing and the storage requirements, so a balance must be found between image quality and resource consumption. The acquired infrared image data then needs to be processed and managed to ensure data quality and usability. This includes data calibration to correct errors present in the sensors, ensuring that the distance and temperature information in the images is accurate. Each infrared image should be time-stamped and its acquisition time recorded for subsequent data analysis and synchronization. Furthermore, an appropriate data storage format, typically an image file format such as JPEG or PNG, needs to be selected for subsequent processing. A data indexing system should also be built for ease of retrieval and management; this system records relevant information for each image, such as the acquisition site and robot status. Such information facilitates subsequent obstacle recognition and path planning.
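As an illustration of the acquisition loop described above, the following Python sketch shows periodic capture with time-stamping and a simple index file. It is a minimal sketch under stated assumptions: read_infrared_frame is a hypothetical stand-in for the sensor driver, and the 5 Hz rate, 640x480 resolution, and PNG storage are illustrative choices, not values from the patent.

```python
import json
import os
import time

CAPTURE_HZ = 5           # acquisition frequency (assumed): 5 frames per second
RESOLUTION = (640, 480)  # image resolution in pixels (assumed)

def read_infrared_frame(resolution):
    """Hypothetical stand-in for the robot's infrared sensor driver;
    expected to return an image object with a .save(path) method."""
    raise NotImplementedError

def capture_loop(out_dir="ir_frames", index_path="ir_index.json"):
    os.makedirs(out_dir, exist_ok=True)
    index = []
    period = 1.0 / CAPTURE_HZ
    while True:
        t0 = time.time()
        frame = read_infrared_frame(RESOLUTION)
        stamp = time.time()                        # time-stamp each image
        name = os.path.join(out_dir, f"ir_{stamp:.3f}.png")
        frame.save(name)                           # PNG chosen for lossless storage
        # index entry: file name plus acquisition time (robot state could be added)
        index.append({"file": name, "timestamp": stamp})
        with open(index_path, "w") as f:
            json.dump(index, f)
        # sleep off the remainder of the capture period
        time.sleep(max(0.0, period - (time.time() - t0)))
```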
S102, inputting a plurality of infrared environment pictures into a preset multi-scale ASFF network to extract infrared characteristics, and obtaining a plurality of infrared characteristic pictures with different scales;
it should be noted that the multi-scale ASFF (adaptive spatial feature fusion) network is a deep learning model used specifically for image feature fusion. Its main task is to combine feature maps of different scales to generate a richer, more informative feature representation. A multi-scale ASFF network typically includes the following key components. Multi-scale feature map input: the feature maps are derived from a plurality of infrared environment pictures, each of which is processed by a convolutional neural network (CNN) to extract feature information; feature maps of different scales capture different aspects of the environment, e.g. lower scales focus on details while higher scales focus on the overall structure. Attention mechanism: the ASFF network introduces an attention mechanism that enables the network to automatically select and focus on important information in the feature maps of different scales, a process similar to the human visual system selecting a region of interest when processing an image. Feature fusion: using the attention mechanism, the multi-scale ASFF network fuses the feature maps to generate fused feature maps of different scales; these fused feature maps contain information from different scales that describes the environment more completely. The cleaning robot first needs to acquire a plurality of infrared environment pictures. These pictures are taken by the infrared sensors and are typically stored in digital form; each image contains infrared information about the environment. Before being input into the multi-scale ASFF network, some preprocessing steps are typically required to ensure consistent image format and quality, including adjusting the size, brightness, and contrast of the images. The preprocessed images are input into the multi-scale ASFF network, whose structure ensures that features are extracted at different scales. Feature extraction is the core part of a deep learning model: features are extracted from images through convolution, pooling, and other operations, and have different levels of abstraction and scale. The attention mechanism in the multi-scale ASFF network decides which features are important and performs weighted fusion of the feature maps of different scales. This process is dynamic, adapting itself to the content of the input image. The multi-scale ASFF network then generates a plurality of infrared feature maps of different scales that contain information from different scales and perspectives; these feature maps can be used for subsequent tasks such as obstacle recognition and path planning. For example, suppose the cleaning robot needs to traverse a steep step. It acquires a plurality of infrared environment pictures with its infrared sensors, capturing the shape and temperature differences of the step, and these pictures are input into the multi-scale ASFF network. In the network, the low-scale feature maps identify the edge details of the step, while the high-scale feature maps focus on the structure of the entire step. Through the attention mechanism, the network automatically selects and fuses these features to generate a plurality of infrared feature maps of different scales, which help the robot better understand the characteristics of the step and thus pass over it more effectively.
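To make the fusion step concrete, the following PyTorch sketch implements ASFF-style fusion for one output scale: a 1x1 convolution produces a weight map per input level, the maps are normalized with a softmax across levels, and the resized inputs are summed under those per-pixel weights. This is a minimal sketch, not the patented network; the three-level layout, shared channel width, and nearest-neighbour resizing are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASFFFusion(nn.Module):
    """Adaptively fuse three feature maps with learned, spatially varying
    weights (ASFF-style), producing one fused map at the finest scale."""

    def __init__(self, channels):
        super().__init__()
        # one 1x1 conv per input level produces a single-channel weight map
        self.w0 = nn.Conv2d(channels, 1, kernel_size=1)
        self.w1 = nn.Conv2d(channels, 1, kernel_size=1)
        self.w2 = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, f0, f1, f2):
        # resize the coarser levels to f0's spatial size
        size = f0.shape[-2:]
        f1 = F.interpolate(f1, size=size, mode="nearest")
        f2 = F.interpolate(f2, size=size, mode="nearest")
        # per-pixel weights, normalized across the three levels
        w = torch.softmax(
            torch.cat([self.w0(f0), self.w1(f1), self.w2(f2)], dim=1), dim=1)
        return w[:, 0:1] * f0 + w[:, 1:2] * f1 + w[:, 2:3] * f2
```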
S103, performing cross-layer fusion processing on a plurality of infrared feature images with different scales to obtain a plurality of fusion feature images;
specifically, for each infrared feature map of a different scale, a corresponding weight coefficient value needs to be calculated. These weight coefficients determine the extent to which each feature map contributes to the final fusion result. The calculation of the weight coefficients may be based on various methods, for example learning with a convolutional neural network (CNN), or may be determined using empirical rules or prior knowledge. These coefficients are used for weighted fusion of the feature maps of different scales: for each pixel location, each feature map is multiplied by its corresponding weight coefficient, and the results are added to generate a fused feature map. This process yields multiple candidate fusion feature maps. Feature hierarchy division is then performed on the candidate fusion feature maps to extract features of different levels. Typically, the features are divided into shallow-network high-resolution low-level features and deep-network low-resolution high-level semantic features: shallow features contain more detailed information, while deep features contain higher-level semantic information. Cross-layer fusion processing is performed on the candidate fusion feature maps based on the plurality of shallow-network high-resolution low-level features and the deep-network low-resolution high-level semantic features. This process may include stitching and weighting the shallow features with the deep features to generate the final fusion feature maps. For example, suppose the cleaning robot needs to traverse a complex environment including obstacles, steps, and slopes. The robot is equipped with infrared sensors and can acquire a plurality of infrared images of different scales. In the obstacle surmounting task, this image information must be fully utilized to ensure the robot can pass the obstacles safely. For each infrared feature map of a different scale, a weight coefficient is calculated. For example, lower-scale feature maps are very sensitive to details such as obstacle edges and should therefore be assigned higher weights, while higher-scale feature maps are more sensitive to the overall environmental structure and likewise require appropriate weights. The weight coefficients are applied to the feature maps for weighted fusion, generating a plurality of candidate fusion feature maps, each incorporating information of different scales with different weights. Feature hierarchy division then splits the candidate fusion feature maps into shallow features and deep features; for example, shallow features include the details of obstacle edges and steps, while deep features include the overall structural and semantic information of the environment. Finally, the shallow and deep features are combined through cross-layer fusion to generate the final fusion feature maps. These feature maps contain information from different scales and different levels and can be used for subsequent obstacle surmounting path planning and control decisions.
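The cross-layer step can be illustrated with an FPN-like sketch: a deep low-resolution semantic map is projected with a 1x1 lateral convolution, upsampled to the shallow map's resolution, added to the projected shallow map, and smoothed. The channel widths below are illustrative assumptions, and addition is only one of the stitching/weighting options the text mentions.

```python
import torch.nn as nn
import torch.nn.functional as F

class CrossLayerFusion(nn.Module):
    """Merge a shallow high-resolution feature map with a deep
    low-resolution semantic map (one possible cross-layer fusion)."""

    def __init__(self, shallow_ch=64, deep_ch=256, out_ch=128):
        super().__init__()
        self.lat_shallow = nn.Conv2d(shallow_ch, out_ch, kernel_size=1)
        self.lat_deep = nn.Conv2d(deep_ch, out_ch, kernel_size=1)
        self.smooth = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, shallow, deep):
        # project both maps to a common width, upsample the deep map,
        # add, then smooth the result with a 3x3 convolution
        deep = F.interpolate(self.lat_deep(deep), size=shallow.shape[-2:],
                             mode="nearest")
        return self.smooth(self.lat_shallow(shallow) + deep)
```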
S104, inputting the multiple fusion feature images into a preset attention-enhancing coding network to perform target feature enhancement processing to obtain multiple enhancement feature images;
the multiple fusion feature maps are input into the attention-enhancing coding network to perform the hole convolution operation. Hole convolution is a convolution operation that can expand the receptive field of the convolution kernel to capture more extensive information. Each fusion profile is associated with corresponding receptive field size data to guide the convolution operation. And according to the receptive field size data corresponding to each fusion feature map, inputting the infrared features into the first pooling layer for average pooling operation, and simultaneously inputting the same infrared features into the second pooling layer for maximum pooling operation. This will generate an average pooled feature map and a maximum pooled feature map, each capturing different feature information. And carrying out feature addition processing on the average pooling feature map and the maximum pooling feature map. This may be a simple element-level addition operation, adding the two feature maps element-by-element, generating summed feature data. And carrying out data mapping on the added characteristic data through a preset activation function to obtain a channel weight coefficient. The channel weight coefficients are used to weight the information of the different channels (feature maps) to enhance a particular feature. The channel weight coefficients are calculated from the summed feature data. Channel weight coefficients are applied to the fused feature map to emphasize or suppress particular features. This will produce a plurality of enhanced feature maps, each highlighting a different environmental feature. For example, suppose that the cleaning robot encounters an obstacle having different temperatures and reflectivities in the environment. Robots capture a plurality of infrared signatures of different dimensions using infrared sensors, some of which emphasize temperature information and others of which emphasize reflectivity information. And inputting the fusion characteristic diagrams with different scales into an enhanced attention coding network, and performing hole convolution operation. This will increase the receptive field of each feature map to better capture the details and structure of the obstacle. And respectively inputting each fusion characteristic map into a first pooling layer for average pooling operation and a second pooling layer for maximum pooling operation according to the receptive field size data. This will generate an average pooling profile and a maximum pooling profile, which contain average temperature information and maximum reflectivity information, respectively. And carrying out feature addition on the two feature graphs to obtain added feature data. The summation feature data is data mapped by an activation function to generate channel weight coefficients. And applying the channel weight coefficients to the fused feature maps of different scales to generate a plurality of enhanced feature maps. These enhanced feature maps emphasize information from different channels, enhancing temperature and reflectivity characteristics, etc., to help the cleaning robot more accurately identify and understand obstacles.
S105, inputting a plurality of enhancement feature maps into a preset obstacle recognition model to perform obstacle recognition to obtain target obstacle information, wherein the target obstacle information comprises: obstacle position information and obstacle size information;
specifically, the plurality of enhanced feature maps are input into the obstacle recognition model for primary recognition. This process maps the feature maps to the category and location information of obstacle recognition through a neural network or similar means. Primary recognition generates initial obstacle information, but this is not yet accurate enough. At the same time, the plurality of enhanced feature maps are input into the obstacle recognition model for category loss calculation. The category loss evaluates the difference between the primary recognition result and the actual category; this loss value represents the accuracy of obstacle category recognition and is used in the subsequent network loss calculation. In addition to the category loss, an offset loss needs to be calculated. The offset loss evaluates whether the initially identified location is accurate: it measures the error between the initially identified location and the actual location, which is critical information, especially in obstacle surmounting tasks. Furthermore, a regression loss is calculated to evaluate the difference between the initially identified position coordinates and the actual position coordinates. This loss value reflects the accuracy of the position coordinates and is very important for path planning and obstacle avoidance. The category loss, offset loss, and regression loss are combined into an overall network loss, which comprehensively considers the accuracy and consistency of the recognition result and provides feedback for subsequent information correction. The initially identified obstacle information is then corrected according to the target overall network loss. This correction may be an optimization or back-propagation process that adjusts the primary recognition result by minimizing the overall network loss so that it reflects the actual situation more accurately; the corrected information is the target obstacle information. For example, assume the cleaning robot acquires a plurality of enhanced feature maps with its infrared sensors while traversing a set of obstacles. Primary recognition identifies the approximate location and class of these obstacles, such as a step or a box. Category loss calculation then evaluates the accuracy of the category recognition: if primary recognition mistakes a step for a box, the category loss will be high. Offset loss calculation evaluates the accuracy of the location: if primary recognition places the step several pixels away from its actual position, the offset loss will be high. Regression loss calculation evaluates the accuracy of the position coordinates: any difference between the initially identified coordinates and the actual coordinates is reflected in the regression loss. These loss values are combined into an overall network loss, and the initially identified obstacle information is corrected through an optimization or back-propagation process. The corrected information reflects the obstacles in the environment more accurately, including their categories and positions, helping the robot perform obstacle surmounting tasks more safely.
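How the three losses might combine into one overall network loss can be sketched in the CenterNet style (a penalty-reduced focal loss on the class heatmap, plus L1 losses on centre offsets and box sizes at object locations). This is one plausible reading, not the patent's formula; the loss weights and the mask convention are assumptions.

```python
import torch
import torch.nn.functional as F

def total_network_loss(pred_heat, gt_heat, pred_off, gt_off,
                       pred_size, gt_size, mask,
                       w_cls=1.0, w_off=1.0, w_reg=0.1):
    """Overall loss = class loss + offset loss + regression loss.
    `mask` is 1 at object centre locations and 0 elsewhere (assumed)."""
    # class (heatmap) loss: penalty-reduced focal loss on predicted scores
    pos = gt_heat.eq(1).float()
    neg = 1.0 - pos
    n_pos = pos.sum().clamp(min=1)
    cls = -(pos * (1 - pred_heat) ** 2 * torch.log(pred_heat + 1e-6)
            + neg * (1 - gt_heat) ** 4 * pred_heat ** 2
            * torch.log(1 - pred_heat + 1e-6)).sum() / n_pos
    # offset loss: L1 on sub-pixel centre offsets, at object locations only
    off = F.l1_loss(pred_off * mask, gt_off * mask,
                    reduction="sum") / mask.sum().clamp(min=1)
    # regression loss: L1 on predicted obstacle sizes, at object locations
    reg = F.l1_loss(pred_size * mask, gt_size * mask,
                    reduction="sum") / mask.sum().clamp(min=1)
    return w_cls * cls + w_off * off + w_reg * reg
```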
The plurality of enhanced feature maps are input into a preset obstacle recognition model for category loss calculation. In this process, target real region frames containing the accurate position and size of each obstacle must be provided; these frames may be obtained by manual labeling or other methods. The target center point coordinates are then calculated for the target real region frame corresponding to each enhanced feature map. The target center point is the geometric center of the target region and helps determine the location of the target in the image. Adaptive feature fusion processing is performed on the plurality of target center points. This may be a feature weighting operation that assigns different weights on the feature map according to the location of the target center point, helping fuse the target information into the feature map so that the robot perceives the target location better. Gaussian kernel mapping is then performed on the target center point coordinates in the fused feature map. Gaussian kernel mapping is a technique for generating a heatmap: it radiates a Gaussian distribution of heat values around each target center point. This heatmap indicates the confidence of the target location; higher heat values mean the model is more confident about the target location. Confidence prediction value fusion is performed on the heatmap of each enhanced feature map. This may be a weighted average in which the confidence prediction of each feature map is weighted according to its importance; the resulting confidence prediction value is used as part of the category loss calculation to evaluate the accuracy of target category recognition. For example, suppose the cleaning robot captures a plurality of enhanced feature maps with its infrared sensors to identify different obstacles in the environment, such as boxes and steps. Each enhanced feature map contains information of different scales and levels. These feature maps are input into the obstacle recognition model for category loss calculation, and at the same time the target real region frames of each obstacle, containing its accurate position and category information, are provided. The target center point of the real region frame corresponding to each enhanced feature map is calculated to determine the center of the target, and the center point information is fused into the feature map through adaptive feature fusion to enhance the perception of the target position. Gaussian kernel mapping of the center point coordinates in the fused feature map then generates a heatmap representing the confidence of the target position. Finally, confidence prediction value fusion on the heatmap of each enhanced feature map yields the category loss data corresponding to each feature map. These loss data can be used to evaluate the accuracy of the robot's recognition of different obstacle categories and to improve recognition performance through optimization.
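The Gaussian kernel mapping step can be illustrated directly: each down-sampled centre point radiates a Gaussian of heat values, and overlapping responses keep the maximum. The down-sampling stride and kernel width below are illustrative assumptions.

```python
import numpy as np

def gaussian_heatmap(shape, centers, stride=4, sigma=2.0):
    """Build a heatmap of the given (height, width) from object centre
    points in input-image pixels; `stride` is the assumed down-sampling
    factor of the feature map, `sigma` the Gaussian kernel width."""
    h, w = shape
    heat = np.zeros((h, w), dtype=np.float32)
    ys, xs = np.mgrid[0:h, 0:w]
    for cx, cy in centers:
        cx, cy = cx / stride, cy / stride   # down-sampled centre coordinates
        g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
        heat = np.maximum(heat, g)          # keep the strongest response
    return heat

# usage: two obstacle centres mapped onto a 128x128 heatmap
heat = gaussian_heatmap((128, 128), [(200, 120), (400, 310)])
```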
S106, constructing a static grid map through the target obstacle information, and generating a cost potential field according to the target obstacle information;
specifically, the target obstacle information is first converted into two-dimensional grid data. This may be achieved by dividing the environment into regular grid cells and indicating in each grid cell whether an obstacle is present, discretizing the environmental information for subsequent processing. The position of the cleaning robot is then calibrated to determine its spatial position coordinates in the environment; this coordinate information is used to map the two-dimensional grid data into three-dimensional space. Based on the robot's position coordinates, three-dimensional spatial mapping of the two-dimensional grid data is performed: each two-dimensional grid cell is mapped to a three-dimensional coordinate location in the environment to construct grid data with spatial information. A static grid map is constructed from the three-dimensional grid data. A static grid map is a discrete spatial representation describing the locations of obstacles in the environment; each grid cell may contain information about obstacle presence, for example whether there is an obstacle or the obstacle type. Initial potential energy value matching is performed for each grid cell. These initial potential energy values represent the potential energy of the robot in the grid cell; typically the potential energy of obstacle cells is set high and that of obstacle-free cells low. Based on these potential energy values, the grid distance between each grid cell and the target obstacle information is calculated. This distance information reflects the distance of the robot from obstacles in the environment and is the basis for constructing the cost potential field. Finally, the cost potential field is constructed from the grid distance calculation results. The cost potential field is a field that reflects cost (typically distance) across the environment and is used to guide the robot's path planning: the robot tends to avoid high-cost areas, i.e. obstacles. For example, suppose the cleaning robot is in an indoor environment and needs to traverse a set of boxes. The target obstacle information (the location and type of each box) is converted into two-dimensional grid data, discretizing the environment into grid cells. The robot's position is calibrated and its spatial position coordinates in the environment are determined. Taking the robot's position into account, the two-dimensional grid data is mapped into three-dimensional space, creating a three-dimensional grid in which each cell is associated with a particular location in the environment. A static grid map is then constructed from the three-dimensional grid data, showing the locations of the boxes and the environmental information around the robot. Initial potential energy matching assigns higher potential energy to grid cells containing obstacles and lower potential energy to obstacle-free cells, and the grid distance between each cell and the target obstacle information is calculated from these values. This distance information is used to generate the cost potential field, which reflects the distance between the robot and the obstacles in the environment.
Finally, the cost potential field can be used for path planning: the robot chooses to avoid the higher-cost areas so as to cross the boxes safely. In this way, the cleaning robot can plan its obstacle surmounting path more intelligently and ensure successful completion of the task.
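A minimal sketch of the cost potential field, assuming a 2-D occupancy grid (1 = obstacle, 0 = free): potential is highest on obstacle cells and decays with grid distance from the nearest obstacle. The exponential decay and its constants are illustrative choices.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def cost_potential_field(grid, obstacle_weight=100.0, decay=1.0):
    """Cost potential over a static grid map: `grid` marks obstacle
    cells with 1 and free cells with 0."""
    # distance from every free cell to the nearest obstacle cell
    dist = distance_transform_edt(grid == 0)
    # high potential on and near obstacles, decaying with distance
    potential = obstacle_weight * np.exp(-decay * dist)
    potential[grid == 1] = obstacle_weight
    return potential
```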
S107, performing escape route planning on the cleaning robot through a cost potential field based on the static grid map, generating a target escape route and controlling the cleaning robot to travel through the target escape route.
Equipotential line calculation is performed based on the cost potential field and the static grid map. Equipotential lines represent positions of equal potential value in the environment, typically reflecting the robot's cost at different locations; they help determine where the robot should move to avoid obstacles. A series of passing point coordinates is calculated based on the equipotential lines, and these coordinates are used to construct the initial escape path. An initial escape path is generated by connecting the passing point coordinates: the path is a series of line segments or curves joining the passing points between the starting point and the target point. The length of the initial escape path is calculated to assess the path, and its smoothness is evaluated to determine the degree of curvature; a smooth path helps ensure that the robot's movements are stable. Based on the path length and smoothness, a plurality of path correction point coordinates are calculated. These points are used to adjust the initial path to ensure safe clearance over obstacles. The initial escape path is corrected using the path correction point coordinates; this process involves stretching, shortening, or adjusting the curve shape of the path so that the robot can pass obstacles safely. Applying the path corrections produces the final target escape route, a corrected path that ensures safe obstacle surmounting while accounting for path length and smoothness. The generated target escape route is used to control the travel of the cleaning robot: the robot proceeds along the points on the path in sequence until it successfully passes the obstacles and completes the escape task. For example, suppose the cleaning robot needs to cross a set of boxes to complete an escape task. Equipotential line calculations based on the static grid map and the cost potential field determine which areas are safer for the robot and which should be avoided. A series of passing point coordinates is calculated and connected to form the initial escape path. The length and smoothness of the initial path are computed to evaluate its quality; if the path is too long or too strongly curved, correction is required. Based on this evaluation, path correction point coordinates are calculated and used to adjust the initial path so the robot can cross the boxes. Applying the corrections produces the final target escape route, which instructs the robot how to traverse the boxes safely and complete the task; the cleaning robot then advances along the points on the target escape route in turn, completing the escape.
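One way to sketch the planning and correction steps: descend the cost potential field toward the goal (a greedy stand-in for following equipotential lines), then smooth the interior waypoints as a simple path correction. This is an illustrative planner under stated assumptions, not the patented algorithm; the goal-pull weight and 3-point moving average are arbitrary choices.

```python
import numpy as np

def plan_escape_path(potential, start, goal, goal_pull=1.0, max_steps=500):
    """Greedy descent over the cost potential field plus a pull toward
    the goal cell, followed by a moving-average correction pass."""
    h, w = potential.shape
    pos, path = start, [start]
    for _ in range(max_steps):
        if pos == goal:
            break
        best, best_cost = pos, float("inf")
        for dy in (-1, 0, 1):               # examine the 8-neighbourhood
            for dx in (-1, 0, 1):
                ny, nx = pos[0] + dy, pos[1] + dx
                if 0 <= ny < h and 0 <= nx < w:
                    # cell cost = potential + weighted distance to goal
                    cost = potential[ny, nx] + goal_pull * np.hypot(
                        ny - goal[0], nx - goal[1])
                    if cost < best_cost:
                        best, best_cost = (ny, nx), cost
        if best == pos:                     # stuck in a local minimum
            break
        pos = best
        path.append(pos)
    # path correction: smooth interior points with a 3-point moving average
    pts = np.array(path, dtype=float)
    if len(pts) > 2:
        pts[1:-1] = (pts[:-2] + pts[1:-1] + pts[2:]) / 3.0
    return pts
```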
In the embodiment of the invention, a plurality of infrared environment pictures acquired by the cleaning robot are acquired; the plurality of infrared environment pictures are input into a multi-scale ASFF network for infrared feature extraction to obtain a plurality of infrared feature maps of different scales; cross-layer fusion processing is performed on the plurality of infrared feature maps of different scales to obtain a plurality of fusion feature maps; the plurality of fusion feature maps are input into an attention-enhancing encoding network for target feature enhancement processing to obtain a plurality of enhanced feature maps; the plurality of enhanced feature maps are input into an obstacle recognition model for obstacle recognition to obtain target obstacle information, wherein the target obstacle information comprises obstacle position information and obstacle size information; a static grid map is constructed from the target obstacle information, and a cost potential field is generated according to the target obstacle information; and based on the static grid map, an escape path is planned for the cleaning robot through the cost potential field, a target escape route is generated, and the cleaning robot is controlled to travel along the target escape route. By using infrared environment pictures and a multi-scale ASFF network, the robot can perceive the environment better, especially when light is insufficient or vision is limited, improving its ability to identify obstacles, including their position and size information. The multi-scale ASFF network and the attention-enhancing encoding network help extract and enhance features in infrared images, which helps the robot understand the environment better, especially in the presence of complex textures or low contrast. By converting the target obstacle information into a static grid map and a cost potential field, the robot can plan paths better: the cost potential field reflects the degree of risk at different positions in the environment, helping the robot select a safe path and avoid collisions. The static grid map and the cost potential field enable the robot to plan an escape route in a complex environment, meaning that it can avoid obstacles more effectively. Through the above processing steps, the cleaning robot can navigate and handle complex environments more independently, thereby improving the accuracy of artificial intelligence obstacle surmounting and escaping.
In a specific embodiment, the process of executing step S103 may specifically include the following steps:
(1) Respectively carrying out weight coefficient calculation on each infrared characteristic graph with different scales to obtain weight coefficient values corresponding to each infrared characteristic graph with different scales;
(2) Based on the weight coefficient value corresponding to each infrared characteristic image with different scales, carrying out weighted fusion on a plurality of infrared characteristic images with different scales to obtain a plurality of candidate fusion characteristic images;
(3) Performing feature level division on the candidate fusion feature graphs to obtain a plurality of shallow network high-resolution low-level features and a plurality of deep network low-resolution high-level semantic features;
(4) And performing cross-layer fusion processing on the candidate fusion feature graphs based on the high-resolution low-level features of the shallow networks and the low-resolution high-level semantic features in the deep networks to obtain a plurality of fusion feature graphs.
Specifically, for each infrared feature map of a different scale, a corresponding weight coefficient needs to be calculated. These weight coefficients reflect how much each scale's feature map contributes to the final fusion result. In general, the calculation of the weight coefficients may be based on the quality of the feature map, its information content, or other relevance indicators. For example, the weight coefficients may be calculated using an attention mechanism in a convolutional neural network (CNN), so that more important feature maps receive higher weights. The infrared feature maps of different scales are then weighted and fused based on the calculated weight coefficients. This can be achieved by a simple weighted summation, in which the value of each feature map is multiplied by its corresponding weight coefficient and the products are added to obtain a fused feature map. This process ensures that information of different scales is fused together reasonably. The fused feature map typically contains information at multiple levels. To better exploit this information, feature level division may be used to categorize the fused feature map; this may be achieved through the multiple levels or pyramid structure of a convolutional neural network, separating the shallow-network high-resolution low-level features from the deep-network low-resolution high-level semantic features. Cross-layer fusion processing is then performed on the features based on the plurality of shallow-network high-resolution low-level features and the deep-network low-resolution high-level semantic features. This process aims to integrate the information at different levels to obtain a richer and more semantic feature representation. In general, cross-layer fusion may be implemented using different convolutional layers or pooling layers in a convolutional neural network. For example, suppose the cleaning robot needs to traverse a complex environment containing both short-range and long-range obstacles. Through its infrared sensors, the robot gathers infrared feature maps of different scales, including high-resolution short-range features and low-resolution long-range features. A weight coefficient is calculated for the feature map of each scale. Assume the short-range feature map is more important in this environment, so it is given a higher weight coefficient, while the long-range feature map receives a lower weight coefficient. The feature maps are weighted and fused using the calculated weight coefficients, yielding a fused feature map in which short-range information is enhanced while long-range information is retained. The fused feature map is divided by feature level into shallow-network high-resolution low-level features and deep-network low-resolution high-level semantic features. The shallow and deep features are combined by cross-layer fusion to generate the final feature representation. This comprehensive representation can be used for obstacle recognition, path planning, and escape decisions, ensuring that the robot can safely clear various obstacles and complete its tasks. The process makes full use of information at different scales and improves the perception and decision-making capability of the system.
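A minimal PyTorch sketch of such adaptively weighted multi-scale fusion is given below. The disclosure does not specify the network, so the per-pixel softmax normalization and the 1x1 convolutions that produce the weight logits are assumptions in the spirit of ASFF.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleWeightedFusion(nn.Module):
    """Fuse feature maps of different scales with learned weight coefficients."""

    def __init__(self, channels: int, num_scales: int):
        super().__init__()
        # one 1x1 convolution per scale produces a single-channel weight logit
        self.weight_convs = nn.ModuleList(
            nn.Conv2d(channels, 1, kernel_size=1) for _ in range(num_scales)
        )

    def forward(self, feature_maps):
        # resize every map to the spatial size of the first (finest) one
        target = feature_maps[0].shape[-2:]
        resized = [
            F.interpolate(f, size=target, mode="bilinear", align_corners=False)
            for f in feature_maps
        ]
        # weight coefficient per scale and per pixel, normalized with softmax
        logits = torch.cat(
            [conv(f) for conv, f in zip(self.weight_convs, resized)], dim=1
        )
        weights = torch.softmax(logits, dim=1)
        # weighted fusion: multiply each map by its weight and sum over scales
        return sum(weights[:, i : i + 1] * resized[i] for i in range(len(resized)))
```

With, say, three 64-channel maps at different strides, ScaleWeightedFusion(64, 3) returns a single fused map at the finest resolution.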
In a specific embodiment, as shown in fig. 2, the process of executing step S104 may specifically include the following steps:
S201, inputting a plurality of fusion feature maps into an enhanced attention coding network to perform a hole (dilated) convolution operation to obtain infrared features corresponding to each fusion feature map and receptive field size data corresponding to each fusion feature map;
S202, inputting the infrared features corresponding to each fusion feature map into a first pooling layer for an average pooling operation based on the receptive field size data corresponding to each fusion feature map, generating an average pooling feature map;
S203, inputting the infrared features corresponding to each fusion feature map into a second pooling layer for a maximum pooling operation based on the receptive field size data corresponding to each fusion feature map, generating a maximum pooling feature map;
S204, carrying out feature addition processing on the average pooling feature map and the maximum pooling feature map to obtain addition feature data, and carrying out data mapping on the addition feature data through a preset activation function to obtain channel weight coefficients corresponding to the enhanced attention coding network;
S205, performing target feature enhancement processing on the multiple fusion features based on the channel weight coefficients to obtain multiple enhancement feature maps.
It should be noted that the multiple fusion feature maps are input into the attention-enhancing coding network to perform the hole convolution operation. Hole (dilated) convolution is a convolution variant that enlarges the receptive field so as to capture a wider range of feature information. Each fusion feature map is subjected to the hole convolution operation to obtain infrared features and receptive field size data. Based on the receptive field size data corresponding to each fusion feature map, the infrared features corresponding to each fusion feature map are input into a first pooling layer for an average pooling operation, generating an average pooling feature map. At the same time, the infrared features corresponding to each fusion feature map are input into a second pooling layer for a maximum pooling operation, generating a maximum pooling feature map. These two pooling operations help extract different statistics of the feature map. Feature addition processing is performed on the average pooling feature map and the maximum pooling feature map to obtain addition feature data. This step adds the pooled feature maps to obtain a more comprehensive feature representation; the addition feature data contains information under the two pooling strategies. Data mapping is applied to the addition feature data through a preset activation function to obtain the channel weight coefficients corresponding to the enhanced attention coding network. These weight coefficients reflect the importance of the different channels to the features, helping the network focus on the key features. Target feature enhancement processing is then performed on the multiple fusion features based on the channel weight coefficients. This process weights the fused features so that important feature channels are enhanced while unimportant channels are suppressed. In this way, a plurality of enhancement feature maps are generated that are better suited to obstacle recognition, path planning, and decision making. For example, suppose a cleaning robot is performing an escape task in an environment that includes different types of obstacles, such as boxes, walls, and furniture. The robot collects a plurality of fusion feature maps through its infrared sensors, each capturing information at different scales and semantic levels. These feature maps are subjected to the hole convolution operation to enlarge the receptive field and extract higher-level feature information; for example, a particular convolution kernel helps detect the edges and texture of an obstacle. Average pooling and maximum pooling operations are then carried out through the different pooling layers to extract different statistics: average pooling captures the mean response of the features, while maximum pooling captures their peak response. The average pooling feature map and the maximum pooling feature map are added to obtain the addition feature data, a comprehensive representation of the two statistics. The addition feature data is mapped through the activation function to generate channel weight coefficients. These coefficients reflect the importance of different channels; for example, some channels are more critical to identifying boxes, while others are more critical to identifying walls.
Target feature enhancement processing is carried out on the fusion features based on the channel weight coefficients, generating a plurality of enhancement feature maps. These feature maps are more semantically informative and discriminative, so the robot can identify obstacles and plan paths more accurately and complete the escape task smoothly. The process makes full use of the different scales and statistics and improves the perception and decision-making capability of the system.
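The pooling-and-activation pipeline of steps S201 to S205 resembles a channel attention module, and the following PyTorch sketch is written under that assumption; the dilation rate, the shared 1x1 mapping, and the sigmoid activation are illustrative choices rather than parameters disclosed by the application.

```python
import torch
import torch.nn as nn

class HoleConvChannelAttention(nn.Module):
    """A dilated (hole) convolution enlarges the receptive field, then
    average- and max-pooled statistics are added and mapped through an
    activation function to per-channel weight coefficients."""

    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        self.hole_conv = nn.Conv2d(
            channels, channels, kernel_size=3, padding=dilation, dilation=dilation
        )
        self.mapping = nn.Conv2d(channels, channels, kernel_size=1)  # shared mapping

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.hole_conv(x)                        # infrared features
        avg = feats.mean(dim=(2, 3), keepdim=True)       # first (average) pooling layer
        mx = feats.amax(dim=(2, 3), keepdim=True)        # second (max) pooling layer
        weights = torch.sigmoid(self.mapping(avg + mx))  # channel weight coefficients
        return feats * weights                           # enhanced feature map
```

Important channels receive weights close to 1 and are kept, while less informative channels are scaled down, matching the enhancement behavior described above.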
In a specific embodiment, as shown in fig. 3, the process of executing step S105 may specifically include the following steps:
S301, inputting a plurality of enhancement feature maps into an obstacle recognition model for primary recognition to obtain initial obstacle information, and simultaneously inputting the plurality of enhancement feature maps into the obstacle recognition model for class loss calculation to obtain class loss data corresponding to each enhancement feature map;
S302, inputting the plurality of enhancement feature maps into the obstacle recognition model to perform offset loss calculation, obtaining offset loss data corresponding to each enhancement feature map;
S303, inputting the plurality of enhancement feature maps into the obstacle recognition model to perform regression loss calculation, obtaining regression loss data corresponding to each enhancement feature map;
S304, performing overall network loss calculation on the category loss data, the offset loss data and the regression loss data to obtain the target overall network loss;
and S305, carrying out information correction on the initial obstacle information based on the target overall network loss to obtain target obstacle information.
The plurality of enhancement feature maps are input into an obstacle recognition model, and primary recognition is performed. The goal of this step is to detect and locate obstacles in the environment and obtain initial obstacle information; the obstacle recognition model outputs information such as the position of each obstacle and its bounding box. The plurality of enhancement feature maps are entered into the obstacle recognition model again, this time to calculate the class loss. The class loss is used to determine which category an obstacle belongs to, such as a box, wall, or furniture; the model calculates the loss from the provided feature maps and the known class labels. In addition to the class loss, an offset loss needs to be calculated. The offset loss is used to determine the exact location of the obstacle bounding box, because the initial identification carries a positional deviation; this step calculates the difference between the model-predicted bounding box and the real bounding box to further optimize the position information. A regression loss also needs to be calculated to further refine the position and shape of the obstacle; this helps ensure that the model can locate the boundaries of the obstacle more accurately, especially for obstacles of complex shape. The class loss, offset loss, and regression loss are then integrated, and the overall network loss is calculated. The overall network loss is a comprehensive indicator used to evaluate the model's performance; by minimizing it, the model continually improves its ability to identify and locate obstacles. Finally, information correction is performed on the initial obstacle information based on the target overall network loss. This step corrects errors present in the primary identification, yielding more accurate and reliable obstacle information; the corrected information is used for path planning and escape operations. For example, suppose the cleaning robot encounters a box as an obstacle while performing an escape task. By means of its infrared sensors, the robot acquires a plurality of enhancement feature maps capturing different scales and visual features of the box. The feature maps are input into the obstacle recognition model for primary recognition, and the model detects the location of the box and its bounding box. The feature maps are entered into the model again, this time to calculate the class loss: the model determines that this obstacle belongs to the "box" category and calculates the corresponding loss to ensure accurate classification. Offset loss calculation fine-tunes the position of the bounding box, ensuring that it accurately covers the box, and regression loss calculation further refines the box's location and shape. The overall network loss integrates the category, offset, and regression losses and helps the model continuously improve its recognition and localization of obstacles.
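One plausible way to combine the three terms into the target overall network loss is sketched below, in the style of center-point detectors; the binary cross-entropy and L1 loss choices and the weighting factors are assumptions, since the disclosure only names the loss terms.

```python
import torch
import torch.nn.functional as F

def overall_network_loss(
    pred_heatmap, gt_heatmap,   # category branch (per-class heat values in [0, 1])
    pred_offset, gt_offset,     # offset branch (sub-cell center correction)
    pred_size, gt_size,         # regression branch (bounding-box size)
    w_offset: float = 1.0,
    w_size: float = 0.1,
) -> torch.Tensor:
    """Target overall network loss as a weighted sum of the three terms."""
    category_loss = F.binary_cross_entropy(pred_heatmap, gt_heatmap)
    offset_loss = F.l1_loss(pred_offset, gt_offset)
    regression_loss = F.l1_loss(pred_size, gt_size)
    return category_loss + w_offset * offset_loss + w_size * regression_loss
```

Minimizing this scalar jointly improves classification, center localization, and bounding-box refinement, which is the role the overall network loss plays in the information-correction step.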
In a specific embodiment, as shown in fig. 4, the process of performing step S301 may specifically include the following steps:
S401, inputting a plurality of enhancement feature maps into the class loss function of a preset obstacle recognition model to analyze the target real region frame, obtaining a target real region frame corresponding to each enhancement feature map;
S402, respectively carrying out target center point calculation on the target real region frame corresponding to each enhancement feature map to obtain a plurality of target center points;
S403, performing adaptive feature fusion processing on the plurality of target center points to obtain a plurality of downsampled center point coordinates;
S404, performing Gaussian kernel mapping on the plurality of downsampled center point coordinates to obtain a plurality of heat value maps;
S405, respectively carrying out confidence prediction value fusion on each heat value map to obtain category loss data corresponding to each enhancement feature map.
The plurality of enhancement feature maps are input into the class loss function of a preset obstacle recognition model, and the target real region frame is analyzed. The aim is to extract the target real region frame in each enhancement feature map, that is, the exact position and bounding box of the obstacle. Target center point calculation is then executed for the target real region frame corresponding to each enhancement feature map. The target center point is the center coordinate of the target region, typically calculated from the bounding box of the target region; these target center points represent the exact location of the obstacle. Adaptive feature fusion processing is carried out on the plurality of target center points. The purpose is to fuse the target center points extracted from different feature maps to obtain more accurate target position information; adaptive feature fusion helps the model better combine the information of the different feature maps, improving recognition accuracy. The fused target center points are typically located on the high-resolution feature map, so their coordinates need to be downsampled to align with the resolution at which the heat value maps are generated. Gaussian kernel mapping is then applied to the downsampled center point coordinates. The object is to convert the coordinate information of each target center point into a heat value map, in which the position of the center point has a high heat value and positions far from the center point have low heat values; Gaussian kernel mapping helps the model locate the target region more accurately. Confidence prediction value fusion is performed on each heat value map. The goal is to determine whether each pixel contains a target and to assign a confidence value to each pixel; by fusing the confidence predictions of the different feature maps, the position of the obstacle can be determined more reliably. For example, suppose a cleaning robot acquires a plurality of enhancement feature maps with an infrared camera to detect obstacles in the environment. In one of the feature maps, the model identifies the image of a table and extracts the table's target real region frame. The target center point of the table, the center coordinate of the target region bounding box, is calculated; this center point represents the approximate location of the table. The model performs adaptive feature fusion to combine the target center points extracted from the different feature maps and represent the table's position more accurately. The fused target center point coordinates are then downsampled and rendered as a heat value map by Gaussian kernel mapping. Confidence prediction value fusion is performed on the heat value map to determine whether each pixel belongs to the table, while assigning a confidence value to each pixel; this fuses the information of the different feature maps and improves the accuracy of obstacle recognition. Through these steps, the cleaning robot can better understand the position and shape of obstacles in the environment and plan paths more safely to accomplish its tasks.
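The Gaussian kernel mapping of step S404 can be illustrated with the short NumPy sketch below; the kernel width sigma is an illustrative parameter, and overlapping peaks from nearby centers are resolved by keeping the per-pixel maximum.

```python
import numpy as np

def gaussian_heat_map(centers, shape, sigma: float = 2.0) -> np.ndarray:
    """Render downsampled center point coordinates (x, y) as a heat value
    map: the position of each center gets a high heat value, and positions
    far from a center decay toward zero."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    heat = np.zeros((h, w), dtype=np.float32)
    for cx, cy in centers:
        g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
        heat = np.maximum(heat, g)  # keep the strongest response per pixel
    return heat
```

For instance, gaussian_heat_map([(10, 12)], (32, 32)) yields a 32 x 32 map whose values peak at column 10, row 12 and fall off with distance from that center.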
In a specific embodiment, the process of executing step S106 may specifically include the following steps:
(1) Constructing a two-dimensional grid for the target obstacle information to obtain two-dimensional grid data corresponding to the target obstacle information;
(2) Calibrating the position of the cleaning robot to obtain a spatial position coordinate corresponding to the cleaning robot;
(3) Based on the space position coordinates, carrying out three-dimensional space mapping on the two-dimensional grid data to obtain corresponding three-dimensional grid data;
(4) Carrying out static grid map construction based on the three-dimensional grid data to obtain a corresponding static grid map;
(5) Traversing the grid cells of the static grid map to obtain a plurality of grid cells corresponding to the static grid map;
(6) Performing initial potential energy value matching on each grid cell to obtain an initial potential energy value corresponding to each grid cell;
(7) Based on the initial potential energy value corresponding to each grid cell, grid distance calculation is carried out on the target obstacle information to obtain a grid distance set, and a corresponding cost potential energy value set is generated through the grid distance set;
(8) And constructing a cost potential field on the static grid map based on the cost potential value set to obtain the cost potential field.
The target obstacle information is converted into two-dimensional grid data. This may be achieved by dividing the environment into grid cells, where each grid cell represents a small region of the environment; the location of an obstacle is marked as occupied in the corresponding grid cell. Position calibration is performed on the cleaning robot to acquire its spatial position coordinates in the environment; this is typically accomplished by sensors or a positioning system that determines the precise position and orientation of the robot. The two-dimensional grid data is mapped to three-dimensional space based on the spatial position coordinates of the cleaning robot; this maps the planar representation of the environment into the three-dimensional space in which the robot operates, for subsequent path planning and construction of the cost potential field. A static grid map is constructed from the mapped grid data in three-dimensional space; this map divides the environment into three-dimensional grid cells that carry the obstacle information. Grid cell traversal is performed on the static grid map, visiting all three-dimensional grid cells; this step obtains the information of each grid cell, including its location and obstacle status. An initial potential energy value is calculated for each grid cell. These potential energy values may be calculated from the obstacle information and distances of the grid cells; the potential energy value reflects the difficulty of passing through each grid cell, and is generally higher at obstacles and lower in free space. Grid distance calculation is then performed on the target obstacle information based on the initial potential energy values; this step helps determine the distance between the robot and the target and provides important information for path planning. A set of cost potential energy values is generated from the grid distance calculation. These values are used to construct a cost potential field that represents the robot's traversal cost at different locations; in general, the closer an area is to an obstacle, the higher its traversal cost. The cost potential field is constructed from the set of cost potential energy values and helps the robot plan the path with the lowest traversal cost, escaping safely. For example, assume the cleaning robot is trapped in a room filled with obstacles and the target obstacle information has been acquired. The room is divided into a two-dimensional grid, each grid cell representing a small area of the room, with the positions of obstacles marked in the grid. The cleaning robot uses its laser sensor to determine its own position coordinates, for example (2, 3). The two-dimensional grid data is mapped to three-dimensional space, taking into account the height and orientation of the robot, and a static grid map is created in three-dimensional space, including obstacle information such as walls and furniture. The three-dimensional grid cells are traversed and the information of each cell is acquired. An initial potential energy value is calculated from the obstacle information in each grid cell and its distance from the robot, and the distance between the robot and the target area is calculated, which helps determine their relative position.
A cost potential energy value is generated based on the distance and the obstacle information; for example, the closer an area is to an obstacle, the higher its potential energy value. The cost potential field constructed in this way guides the cleaning robot to select the path with the lowest traversal cost so as to escape safely.
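The grid-distance and cost-potential computation can be sketched as follows; the reciprocal decay law and the constant k are assumptions, since the disclosure states only that cost rises near obstacles.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def cost_potential_field(occupancy: np.ndarray, k: float = 5.0) -> np.ndarray:
    """Cost potential field over a grid map (1 = obstacle cell, 0 = free):
    cost decays with the distance to the nearest obstacle, so cells next
    to obstacles are expensive and open space is cheap."""
    # Euclidean distance from each free cell to the nearest obstacle cell
    dist = distance_transform_edt(occupancy == 0)
    field = k / (1.0 + dist)           # illustrative decay law
    field[occupancy == 1] = np.inf     # obstacle cells are impassable
    return field
```

For the room example above, wall and furniture cells would be the 1-cells; open space and doorways end up with the lowest potential values, which is exactly where the planner should route the robot.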
In a specific embodiment, the process of executing step S107 may specifically include the following steps:
(1) Calculating equipotential lines through the cost potential field and the static grid map to obtain corresponding equipotential lines;
(2) Based on the equipotential lines, carrying out passing point coordinate calculation on the cleaning robot to obtain a corresponding passing point coordinate set;
(3) Generating an initial path based on the passing point coordinate set to obtain an initial escape path corresponding to the cleaning robot;
(4) Calculating the path length of the initial escape path to obtain the corresponding initial path length and path smoothness;
(5) Calculating path correction point positions for the initial path length and the path smoothness to obtain a plurality of path correction point coordinates;
(6) And carrying out path correction on the initial escape path based on the coordinates of the plurality of path correction points to obtain a target escape path, and controlling the cleaning robot to travel through the target escape path.
In particular, the equipotential line calculation operates on the cost potential field and the static grid map. The cost potential field assigns different cost values to different areas in the environment, typically calculated by taking into account obstacle positions and other factors; these cost values constitute the cost potential field. By calculating the equipotential lines, the connections between points with the same cost value in the cost potential field can be found; these equipotential lines represent paths of equal cost when moving through the cost potential field. Based on the equipotential lines, the coordinates of the passing points the robot needs to traverse can be calculated. The passing points are typically located on equipotential lines, so the robot can travel along an equipotential line without changing its cost. The calculation of these passing points may take into account the current position and the target position of the robot, ensuring that it moves along a safe path; in an office escape example, the passing points may be located at doorways or open areas, which are typically safe to traverse. Once the passing point coordinates are obtained, an initial escape path can be generated. This path is connected by a series of passing points and describes how the robot should move to avoid obstacles and escape safely. The path length of the initial escape path is calculated to determine its total length; path length is an important indicator because shorter paths are generally more efficient. The path smoothness calculation evaluates the curvature of the path; smooth paths are easier to navigate because they reduce sharp turns or pauses of the robot, so calculating smoothness helps select an optimal path. Based on indicators such as path length and smoothness, the coordinates of a plurality of path correction points can be calculated. These correction points are used to further optimize the initial escape path and are typically located on the path, adjusting the robot's movement trajectory. The coordinates of the correction points are applied to the initial escape path, generating the target escape path; by adjusting the passing points on the path, the navigation performance of the path is improved and the robot is ensured to escape smoothly. The travel of the cleaning robot is controlled through the target escape path: the robot advances in turn through the passing points on the path, avoiding obstacles and escaping safely. For example, consider a cleaning robot in an office environment that is trapped in a room with furniture, walls, and a door. The initial position of the robot is on one side of the room, and the goal is to reach the doorway and leave the room. In this example, the cost potential field assigns cost values: furniture and walls have higher cost values, while open space and the doorway have lower cost values. The initial position and the target position of the cleaning robot are known. Equipotential lines in the cost potential field represent paths of equal cost; the robot wants to follow the least costly path, so the equipotential lines indicate which routes it should travel.
Based on the equipotential lines, the robot's passing points are calculated to ensure that it follows the equipotential lines in the cost potential field. These passing points are located at specific positions in the open space, such as the doorway or a central passageway. By connecting the passing points, an initial escape path is created that describes how the robot should move from the initial position to the target position. The total length of the initial escape path is calculated to evaluate the path, and its smoothness is calculated to determine the path's curvature; if the path bends sharply or changes direction frequently, further optimization is required. Based on the path length and smoothness, the coordinates of the path correction points are calculated; these correction points may be located on the path and are used to improve it. The coordinates of the correction points are applied to the initial escape path to generate the target escape path, which is smoother and shorter and better suited to robot navigation. Finally, the travel of the cleaning robot is controlled through the target escape path: the robot advances through the passing points in sequence, following the path and escaping safely.
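Equipotential lines of such a field are simply contour lines of its cost values. The sketch below extracts one contour with scikit-image and subsamples it into passing-point coordinates; it assumes impassable cells are first clipped to a large finite value, and the subsampling step is an illustrative choice.

```python
import numpy as np
from skimage import measure

def equipotential_waypoints(field: np.ndarray, level: float, step: int = 5) -> np.ndarray:
    """Passing-point coordinates (row, col) sampled along one equipotential
    line of the cost potential field."""
    # clip infinite (impassable) cells so the contour finder sees finite values
    finite = np.nan_to_num(field, posinf=field[np.isfinite(field)].max() * 10)
    contours = measure.find_contours(finite, level)
    if not contours:
        return np.empty((0, 2))
    longest = max(contours, key=len)   # follow the longest equal-cost line
    return longest[::step]             # every step-th point becomes a passing point
```

Choosing a low contour level keeps the waypoints in cheap, open areas such as doorways, which matches the behavior described for the office example.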
The artificial intelligence obstacle surmounting and escaping method in the embodiment of the invention is described above. The artificial intelligence obstacle surmounting and escaping device in the embodiment of the invention is described below. Referring to fig. 5, one embodiment of the artificial intelligence obstacle surmounting and escaping device in the embodiment of the invention comprises:
an acquisition module 501, configured to acquire a plurality of infrared environmental pictures acquired by a preset cleaning robot;
the input module 502 is configured to input a plurality of the infrared environmental pictures into a preset multi-scale ASFF network to perform infrared feature extraction, so as to obtain a plurality of infrared feature graphs with different scales;
a fusion module 503, configured to perform cross-layer fusion processing on a plurality of the infrared feature maps with different scales, so as to obtain a plurality of fusion feature maps;
the enhancement module 504 is configured to input the multiple fusion feature maps into a preset attention enhancement coding network to perform target feature enhancement processing, so as to obtain multiple enhancement feature maps;
the identifying module 505 is configured to input a plurality of the enhanced feature maps into a preset obstacle identifying model to identify an obstacle, so as to obtain target obstacle information, where the target obstacle information includes: obstacle position information and obstacle size information;
a construction module 506, configured to construct a static grid map according to the target obstacle information, and generate a cost potential field according to the target obstacle information;
The planning module 507 is configured to plan an escape path for the cleaning robot through the cost potential field based on the static grid map, generate a target escape route, and control the cleaning robot to travel along the target escape route.
Through the cooperation of the above components, a plurality of infrared environment pictures acquired by the cleaning robot are acquired; the plurality of infrared environment pictures are input into a multi-scale ASFF network for infrared feature extraction to obtain a plurality of infrared feature maps of different scales; cross-layer fusion processing is performed on the plurality of infrared feature maps of different scales to obtain a plurality of fusion feature maps; the plurality of fusion feature maps are input into an attention-enhancing coding network for target feature enhancement processing to obtain a plurality of enhancement feature maps; the plurality of enhancement feature maps are input into an obstacle recognition model for obstacle recognition to obtain target obstacle information, wherein the target obstacle information comprises: obstacle position information and obstacle size information; a static grid map is constructed from the target obstacle information, and a cost potential field is generated according to the target obstacle information; based on the static grid map, escape route planning is performed for the cleaning robot through the cost potential field, a target escape route is generated, and the cleaning robot is controlled to travel along the target escape route. By using infrared environment pictures and a multi-scale ASFF network, the robot can better perceive the environment, especially under insufficient light or limited vision, which improves its ability to identify obstacles, including their position and size information. The multi-scale ASFF network and the attention-enhancing coding network facilitate extracting and enhancing features in infrared images, helping the robot understand the environment better, especially in the presence of complex textures or low contrast. By converting the target obstacle information into a static grid map and a cost potential field, the robot can plan paths better: the cost potential field reflects the degree of risk at different positions in the environment, helping the robot select a safe path and avoid collisions. The static grid map and the cost potential field enable the robot to plan an escape path in a complex environment, meaning it can avoid obstacles more effectively. Through the above processing steps, the cleaning robot can navigate and handle complex environments more autonomously, thereby improving the accuracy of artificial intelligence obstacle surmounting and escape.
Fig. 5 above describes the artificial intelligence obstacle surmounting and escaping device in the embodiment of the invention in detail from the perspective of modular functional entities; the artificial intelligence obstacle surmounting and escaping device in the embodiment of the invention is described in detail below from the perspective of hardware processing.
Fig. 6 is a schematic structural diagram of an artificial intelligence obstacle surmounting and escaping device 600 according to an embodiment of the present invention. The artificial intelligence obstacle surmounting and escaping device 600 may vary considerably in configuration or performance, and may include one or more processors (CPUs) 610 (e.g., one or more processors), a memory 620, and one or more storage media 630 (e.g., one or more mass storage devices) storing application programs 633 or data 632. The memory 620 and the storage medium 630 may be transitory or persistent storage. A program stored on the storage medium 630 may include one or more modules (not shown), each of which may include a series of instruction operations for the artificial intelligence obstacle surmounting and escaping device 600. Still further, the processor 610 may be configured to communicate with the storage medium 630 to execute the series of instruction operations in the storage medium 630 on the artificial intelligence obstacle surmounting and escaping device 600.
The artificial intelligence obstacle surmounting and escaping device 600 may also include one or more power supplies 640, one or more wired or wireless network interfaces 650, one or more input-output interfaces 660, and/or one or more operating systems 631, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. It will be appreciated by those skilled in the art that the device configuration shown in fig. 6 does not constitute a limitation of the artificial intelligence obstacle surmounting and escaping device, which may comprise more or fewer components than shown, combine certain components, or arrange the components differently.
The invention also provides an artificial intelligence obstacle surmounting and escaping device, which comprises a memory and a processor, wherein the memory stores computer readable instructions which, when executed by the processor, cause the processor to execute the steps of the artificial intelligence obstacle surmounting and escaping method in the above embodiments.
The invention also provides a computer readable storage medium, which may be a nonvolatile computer readable storage medium or a volatile computer readable storage medium, wherein the computer readable storage medium stores instructions which, when run on a computer, cause the computer to execute the steps of the artificial intelligence obstacle surmounting and escaping method.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The integrated units, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An artificial intelligence obstacle surmounting and escaping method, characterized by comprising the following steps:
acquiring a plurality of infrared environment pictures acquired by a preset cleaning robot;
inputting a plurality of the infrared environment pictures into a preset multi-scale ASFF network to extract infrared characteristics, so as to obtain a plurality of infrared characteristic pictures with different scales;
performing cross-layer fusion processing on a plurality of infrared feature images with different scales to obtain a plurality of fusion feature images;
inputting a plurality of the fusion feature images into a preset attention-enhancing coding network to perform target feature enhancement processing to obtain a plurality of enhancement feature images;
inputting a plurality of enhancement feature maps into a preset obstacle recognition model to perform obstacle recognition to obtain target obstacle information, wherein the target obstacle information comprises: obstacle position information and obstacle size information;
Constructing a static grid map through the target obstacle information, and generating a cost potential field according to the target obstacle information;
and planning an escape path of the cleaning robot through the cost potential field based on the static grid map, generating a target escape route, and controlling the cleaning robot to travel along the target escape route.
2. The artificial intelligence obstacle surmounting and escaping method according to claim 1, wherein performing cross-layer fusion processing on a plurality of the infrared feature maps of different scales to obtain a plurality of fusion feature maps comprises:
respectively carrying out weight coefficient calculation on each infrared characteristic graph with different scales to obtain weight coefficient values corresponding to each infrared characteristic graph with different scales;
based on the weight coefficient value corresponding to each infrared characteristic image with different scales, carrying out weighted fusion on a plurality of infrared characteristic images with different scales to obtain a plurality of candidate fusion characteristic images;
performing feature level division on the candidate fusion feature graphs to obtain a plurality of shallow network high-resolution low-level features and a plurality of deep network low-resolution high-level semantic features;
and performing cross-layer fusion processing on the candidate fusion feature graphs based on the high-resolution low-level features of the shallow networks and the low-resolution high-level semantic features in the deep networks to obtain a plurality of fusion feature graphs.
3. The artificial intelligence obstacle surmounting and escaping method according to claim 1, wherein inputting a plurality of the fusion feature maps into a preset attention-enhancing coding network for target feature enhancement processing to obtain a plurality of enhancement feature maps comprises:
inputting a plurality of fusion feature images into the attention-enhancing coding network to perform hole convolution operation to obtain infrared features corresponding to each fusion feature image and receptive field size data corresponding to each fusion feature image;
based on the receptive field size data corresponding to each fusion feature map, inputting the infrared features corresponding to each fusion feature map into a first pooling layer for carrying out an average pooling operation to generate an average pooling feature map;
inputting the infrared features corresponding to each fusion feature map into a second pooling layer for maximum pooling operation based on the receptive field size data corresponding to each fusion feature map, and generating a maximum pooling feature map;
carrying out feature addition processing on the average pooling feature map and the maximum pooling feature map to obtain addition feature data, and carrying out data mapping on the addition feature data through a preset activation function to obtain a channel weight coefficient corresponding to the enhanced attention coding network;
And carrying out target feature enhancement processing on the fusion features based on the channel weight coefficient to obtain a plurality of enhancement feature graphs.
4. The artificial intelligence obstacle surmounting and escaping method according to claim 1, wherein inputting a plurality of the enhancement feature maps into a preset obstacle recognition model for obstacle recognition to obtain target obstacle information comprises:
inputting a plurality of enhancement feature images into the obstacle recognition model for primary recognition to obtain initial obstacle information, and inputting a plurality of enhancement feature images into the obstacle recognition model for class loss calculation to obtain class loss data corresponding to each enhancement feature image;
inputting a plurality of enhancement feature images into the obstacle recognition model to calculate offset loss, so as to obtain offset loss data corresponding to each enhancement feature image;
inputting a plurality of enhancement feature images into the obstacle recognition model for regression loss calculation to obtain regression loss data corresponding to each enhancement feature image;
carrying out overall network loss calculation on the category loss data, the offset loss data and the regression loss data to obtain target overall network loss;
And carrying out information correction on the initial obstacle information based on the target overall network loss to obtain the target obstacle information.
5. The artificial intelligence obstacle surmounting and escaping method according to claim 4, wherein inputting a plurality of the enhancement feature maps into the obstacle recognition model for category loss calculation to obtain category loss data corresponding to each enhancement feature map comprises:
inputting a plurality of enhancement feature images into a class loss function of a preset obstacle recognition model to analyze a target real region frame, so as to obtain a target real region frame corresponding to each enhancement feature image;
respectively carrying out target center point calculation on the target real region frames corresponding to each enhanced feature map to obtain a plurality of target center points;
performing self-adaptive feature fusion processing on a plurality of target center points to obtain a plurality of down-sampling center point coordinates;
performing Gaussian kernel mapping on the coordinates of the down-sampling center points to obtain a plurality of heat value maps;
and respectively carrying out confidence prediction value fusion on each heat value map to obtain category loss data corresponding to each enhancement feature map.
6. The artificial intelligence obstacle surmounting and escaping method according to claim 1, wherein the constructing a static grid map through the target obstacle information and generating a cost potential field according to the target obstacle information comprises:
Constructing a two-dimensional grid for the target obstacle information to obtain two-dimensional grid data corresponding to the target obstacle information;
performing position calibration on the cleaning robot to obtain a spatial position coordinate corresponding to the cleaning robot;
based on the space position coordinates, carrying out three-dimensional space mapping on the two-dimensional grid data to obtain corresponding three-dimensional grid data;
performing static grid map construction based on the three-dimensional grid data to obtain a corresponding static grid map;
performing grid cell traversal on the static grid map to obtain a plurality of grid cells corresponding to the static grid map;
performing initial potential energy value matching on each grid cell to obtain an initial potential energy value corresponding to each grid cell;
based on the initial potential energy value corresponding to each grid cell, performing grid distance calculation on the target obstacle information to obtain a grid distance set, and generating a corresponding cost potential energy value set through the grid distance set;
and constructing a cost potential field on the static grid map based on the cost potential value set to obtain the cost potential field.
7. The artificial intelligence obstacle surmounting and escaping method according to claim 1, wherein the planning an escape path of the cleaning robot through the cost potential field based on the static grid map, generating a target escape route, and controlling the cleaning robot to travel along the target escape route comprises:
Calculating equipotential lines through the cost potential field and the static grid map to obtain corresponding equipotential lines;
based on the equipotential lines, carrying out passing point coordinate calculation on the cleaning robot to obtain a corresponding passing point coordinate set;
generating an initial path based on the passing point coordinate set to obtain an initial escape path corresponding to the cleaning robot;
calculating the path length of the initial escape path to obtain the corresponding initial path length and path smoothness;
calculating path correction point positions for the initial path length and the path smoothness to obtain a plurality of path correction point coordinates;
and carrying out path correction on the initial escape path based on the plurality of path correction point coordinates to obtain the target escape path, and controlling the cleaning robot to travel along the target escape path.
8. An artificial intelligence obstacle surmounting and escaping device, characterized in that the artificial intelligence obstacle surmounting and escaping device comprises:
the acquisition module is used for acquiring a plurality of infrared environment pictures acquired by a preset cleaning robot;
the input module is used for inputting a plurality of the infrared environment pictures into a preset multi-scale ASFF network to extract infrared characteristics so as to obtain a plurality of infrared characteristic diagrams with different scales;
The fusion module is used for carrying out cross-layer fusion processing on the infrared characteristic images with different scales to obtain a plurality of fusion characteristic images;
the enhancement module is used for inputting the multiple fusion feature images into a preset enhanced attention coding network to perform target feature enhancement processing to obtain multiple enhancement feature images;
the recognition module is used for inputting a plurality of the enhancement feature images into a preset obstacle recognition model to perform obstacle recognition to obtain target obstacle information, wherein the target obstacle information comprises: obstacle position information and obstacle size information;
the construction module is used for constructing a static grid map through the target obstacle information and generating a cost potential field according to the target obstacle information;
and the planning module is used for planning an escape path of the cleaning robot through the cost potential field based on the static grid map, generating a target escape route, and controlling the cleaning robot to travel along the target escape route.
9. An artificial intelligence obstacle surmounting and escaping device, characterized in that the artificial intelligence obstacle surmounting and escaping device comprises: a memory and at least one processor, the memory having instructions stored therein;
the at least one processor invoking the instructions in the memory to cause the artificial intelligence obstacle surmounting and escaping device to perform the artificial intelligence obstacle surmounting and escaping method of any one of claims 1-7.
10. A computer readable storage medium having instructions stored thereon which, when executed by a processor, implement the artificial intelligence obstacle surmounting and escaping method of any one of claims 1-7.
CN202311457468.9A 2023-11-03 2023-11-03 Artificial intelligence obstacle surmounting and escaping method, device and control system Pending CN117243539A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311457468.9A CN117243539A (en) 2023-11-03 2023-11-03 Artificial intelligence obstacle surmounting and escaping method, device and control system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311457468.9A CN117243539A (en) 2023-11-03 2023-11-03 Artificial intelligence obstacle surmounting and escaping method, device and control system

Publications (1)

Publication Number Publication Date
CN117243539A true CN117243539A (en) 2023-12-19

Family

ID=89133432

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311457468.9A Pending CN117243539A (en) 2023-11-03 2023-11-03 Artificial intelligence obstacle surmounting and escaping method, device and control system

Country Status (1)

Country Link
CN (1) CN117243539A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117452955A (en) * 2023-12-22 2024-01-26 珠海格力电器股份有限公司 Control method, control device and cleaning system of cleaning equipment
CN117452955B (en) * 2023-12-22 2024-04-02 珠海格力电器股份有限公司 Control method, control device and cleaning system of cleaning equipment

Similar Documents

Publication Publication Date Title
CN111536964B (en) Robot positioning method and device, and storage medium
CN112966696B (en) Method, device, equipment and storage medium for processing three-dimensional point cloud
Triggs et al. Automatic camera placement for robot vision tasks
CN117243539A (en) Artificial intelligence obstacle surmounting and escaping method, device and control system
JP2020530598A (en) Map construction method, image collection processing system and positioning method
Kaufman et al. Autonomous exploration by expected information gain from probabilistic occupancy grid mapping
TW202230290A (en) Map construction apparatus and method
CN114708435A (en) Obstacle size prediction and uncertainty analysis method based on semantic segmentation
CN111709988A (en) Method and device for determining characteristic information of object, electronic equipment and storage medium
CN113538620A (en) SLAM mapping result evaluation method oriented to two-dimensional grid map
WO2022031232A1 (en) Method and device for point cloud based object recognition
Foix et al. Task-driven active sensing framework applied to leaf probing
Potthast et al. Active multi-view object recognition and online feature selection
Wallner et al. Real-time map refinement by fusing sonar and active stereo-vision
CN113838135A (en) Pose estimation method, system and medium based on LSTM double-current convolution neural network
Valente et al. Evidential SLAM fusing 2D laser scanner and stereo camera
Ruiz-Mayor et al. Perceptual ambiguity maps for robot localizability with range perception
CN113658274B (en) Automatic individual spacing calculation method for primate population behavior analysis
Porębski Customizable inverse sensor model for Bayesian and Dempster-Shafer occupancy grid frameworks
CN113158816A (en) Visual odometer quadric-surface road sign construction method for outdoor scene object
JP2021196632A (en) Prediction device, prediction method, program and vehicle control system
CN117214908B (en) Positioning control method and system based on intelligent cable cutting machine
Xia et al. Monte Carlo localization based on off-line feature matching and improved particle swarm optimization for mobile robots
CN117537803B (en) Robot inspection semantic-topological map construction method, system, equipment and medium
CN116503406B (en) Hydraulic engineering information management system based on big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination