CN116092042A - Mesh obstacle recognition method, mesh obstacle recognition device, electronic equipment and computer storage medium


Info

Publication number
CN116092042A
CN116092042A (application CN202210907845.3A)
Authority
CN
China
Prior art keywords: image; mesh; obstacle; identified; semantic segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210907845.3A
Other languages
Chinese (zh)
Inventor
魏翼鹰
姜一阳
江澳
张渝沄
杨训鑑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date (the priority date is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed): 2022-07-29
Filing date: 2022-07-29
Publication date: 2023-05-09
Application filed by Wuhan University of Technology WUT
Priority to CN202210907845.3A
Publication of CN116092042A
Legal status: Pending

Classifications

    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06N3/08 Learning methods
    • G06T7/55 Depth or shape recovery from multiple images
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/764 Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V10/803 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of input or preprocessed data
    • G06V10/82 Image or video recognition or understanding using pattern recognition or machine learning, using neural networks
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a mesh obstacle identification method and device, an electronic device, and a computer storage medium. The method comprises: acquiring an image to be identified, wherein the image to be identified contains a mesh obstacle; inputting the image to be identified into a fully trained semantic segmentation prediction model, and outputting a semantic segmentation map of the image to be identified; acquiring a depth map of the image to be identified; and fusing the semantic segmentation map with the depth map, and performing pixel analysis on the fused image to determine the depth information of the mesh obstacle. By fusing the semantic segmentation map with the depth map and exploiting the accurate classification provided by the semantic segmentation map, accurate depth information for the mesh obstacle is obtained, improving the accuracy with which unmanned equipment recognizes mesh obstacles and safeguarding driving safety.

Description

Mesh obstacle recognition method, mesh obstacle recognition device, electronic equipment and computer storage medium
Technical Field
The present invention relates to the field of computer vision, and in particular, to a method and apparatus for identifying a mesh obstacle, an electronic device, and a computer storage medium.
Background
Unmanned driving technology can be divided into three modules: perception, cognition, and control. The environment is first perceived accurately, the resulting information is then processed, and finally instructions are sent to the vehicle's control system to realize specific functions.
In the perception module, a large number of sensors work in coordination, acquiring as much useful information as possible so that the vehicle can follow the correct path: lidar, millimeter-wave radar, ultrasonic radar, cameras, inertial navigation units (IMU), wheel odometers, and so on. Among these, the most flexible and extensible sensor is the camera. Because it comes closest to the principle by which the human eye perceives the environment, it is used very widely in unmanned driving and has attracted a large number of students and engineers to study it.
In general, unmanned vehicles are expected to run in daytime or under sufficient light, and as hardware has developed, computers have become powerful enough that on-board cameras can satisfy the environment perception task under most conditions. For mesh obstacles, however, the mesh targets are too fine and the meshes are usually distributed horizontally, so the mesh seen by the left and right cameras of a binocular camera shows no obvious parallax. The mesh obstacle is therefore difficult to recognize, which creates a potential safety hazard.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a mesh obstacle recognition method, apparatus, electronic device, and computer storage medium that address the collision risk posed to unmanned equipment by the low accuracy of mesh obstacle recognition in the prior art.
To solve the above problems, in a first aspect the present invention provides a mesh obstacle recognition method, comprising:
acquiring an image to be identified, wherein the image to be identified contains a mesh obstacle;
inputting the image to be identified into a fully trained semantic segmentation prediction model, and outputting a semantic segmentation map of the image to be identified;
acquiring a depth map of the image to be identified;
and fusing the semantic segmentation map with the depth map, and performing pixel analysis on the fused image to determine the depth information of the mesh obstacle.
Further, the fully trained semantic segmentation prediction model is trained on a PSPNet neural network;
the PSPNet neural network structure comprises a feature extraction sub-network, a pooling sub-network, and a convolution sub-network.
Further, the training process of the semantic segmentation prediction model includes:
acquiring a picture set containing mesh obstacles, and labeling the picture set with classification labels to obtain a classification result set;
forming a data set from the picture set and the classification result corresponding to each picture, wherein the data set comprises a training set, a test set, and a prediction set;
inputting the training set into the PSPNet neural network for training, acquiring the trained model parameters once a preset loss condition is reached, and loading the trained model parameters into the PSPNet neural network to finish training the semantic segmentation prediction model;
wherein inputting the training set into the PSPNet neural network for training specifically comprises:
extracting a first picture feature layer from the training set using the feature extraction sub-network;
performing pooling operations at different scales on the first picture feature layer using the pooling sub-network to obtain a second picture feature layer;
and adjusting the feature layers and the number of channels of the second picture feature layer using the convolution sub-network, so that the output picture and the input picture are the same size.
Further, obtaining the depth map of the image to be identified includes:
acquiring calibration parameters of a depth camera, and correcting the image to be identified according to the calibration parameters;
and matching the corrected images, and calculating the depth of each pixel in the image to be identified from the matching result, so as to obtain the depth map of the image to be identified.
Further, fusing the semantic segmentation map with the depth map includes:
constructing a camera coordinate system based on the depth camera;
and fusing the semantic segmentation map with the depth map in the camera coordinate system to obtain a fused image.
Further, performing pixel analysis on the fused image to determine the depth information of the mesh obstacle includes:
acquiring a histogram corresponding to the fused image, performing pixel statistics on the histogram, and determining the depth information of the mesh obstacle based on the statistical result.
Further, the method further comprises:
filling and repairing the depth map based on the depth information.
In a second aspect, the present invention also provides a mesh obstacle identification apparatus, comprising:
a first acquisition module, configured to acquire an image to be identified, wherein the image to be identified contains a mesh obstacle;
an output module, configured to input the image to be identified into a fully trained semantic segmentation prediction model and output a semantic segmentation map of the image to be identified;
a second acquisition module, configured to acquire the depth map of the image to be identified;
and a determination module, configured to fuse the semantic segmentation map with the depth map and perform pixel analysis on the fused image to determine the depth information of the mesh obstacle.
In a third aspect, the present invention also provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the mesh obstacle recognition method described above.
In a fourth aspect, the present invention also provides a computer storage medium storing a computer program which, when executed by a processor, implements the steps of the mesh obstacle recognition method described above.
The beneficial effects of adopting these embodiments are as follows:
the image to be identified is acquired in real time by a camera, and a semantic segmentation map of the image is obtained with semantic segmentation technology, so that the mesh obstacle is detected accurately and in real time; the depth map generated by binocular stereoscopic imaging is then corrected and supplemented according to the semantic segmentation map, improving the accuracy with which unmanned equipment recognizes mesh obstacles. A robot or unmanned vehicle can thus perceive its environment comprehensively even in complex surroundings, avoid obstacles accurately, and further improve driving safety.
Drawings
Fig. 1 is a flowchart of an embodiment of a mesh obstacle recognition method according to the present invention;
Fig. 2 is a reference diagram of an image to be identified according to an embodiment of the present invention;
Fig. 3 is a reference diagram of the semantic segmentation map of an image to be identified according to an embodiment of the present invention;
Fig. 4 is an overall framework diagram of PSPNet according to an embodiment of the present invention;
Fig. 5 is a diagram of the label-production effect according to an embodiment of the present invention;
Fig. 6 is a depth map of an image to be identified according to an embodiment of the present invention;
Fig. 7 is a fused image of the semantic segmentation map and the depth map of an image to be identified according to an embodiment of the present invention;
Fig. 8 is a histogram corresponding to a partial region of the fused image according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of an embodiment of a mesh obstacle recognition device according to the present invention;
Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings, which form a part hereof, and together with the description serve to explain the principles of the invention, and are not intended to limit the scope of the invention.
In the description of the present invention, "a plurality" means two or more unless explicitly defined otherwise. Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those skilled in the art will appreciate, both explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
Vehicles using unmanned driving technology are generally expected to run in daytime or under sufficient light, and with the development of hardware, computers have become increasingly powerful, so the depth cameras used in unmanned driving can satisfy the environment perception task under most conditions. For mesh obstacles, however, the mesh target is too fine and the mesh is usually distributed horizontally, so the mesh seen by the depth camera shows no obvious parallax, and the mesh obstacle is difficult to identify. A semantic segmentation map can represent the area occupied by each object in the image, so combining the depth map with the semantic segmentation map allows the mesh obstacle to be identified well.
The invention provides a mesh obstacle recognition method, a mesh obstacle recognition device, electronic equipment and a computer storage medium, which are respectively described below.
Referring to fig. 1, fig. 1 is a flow chart of an embodiment of a mesh obstacle recognition method according to the present invention, and a mesh obstacle recognition method according to an embodiment of the present invention is disclosed, including:
step S101: acquiring an image to be identified, wherein the image to be identified comprises a net-shaped obstacle;
step S102: inputting the image to be identified into a complete semantic segmentation prediction model, and outputting a semantic segmentation graph of the image to be identified;
step S103: acquiring a depth map of an image to be identified;
step S104: and carrying out image fusion on the semantic segmentation map and the depth map, and carrying out pixel analysis on the fused image to determine the depth information of the mesh obstacle.
The image to be identified is an image containing a mesh obstacle. It will be appreciated that during automatic driving, unmanned equipment (including but not limited to robots and unmanned vehicles) can satisfy the environment perception task under most conditions. For mesh obstacles, however, the mesh target is too fine and the mesh is usually distributed horizontally and therefore hard to identify, so mesh obstacles in the field of view of the unmanned equipment must be handled specially for the equipment to run autonomously.
Specifically, the image to be identified can be obtained through the depth camera of the unmanned equipment, input into a fully trained semantic segmentation prediction model, and the semantic segmentation map of the image output. Referring to Figs. 2 and 3, Fig. 2 is a reference diagram of an image to be identified according to an embodiment of the present invention, and Fig. 3 is a reference diagram of its semantic segmentation map. Semantics here moves from the concrete to the abstract: semantic segmentation means having the computer segment an image according to its semantics, and in the image domain, semantics refers to the content of the image. The semantic segmentation map of the image to be identified is thus a presentation of the image with its content classified.
The depth information of the mesh obstacle obtained by the depth camera alone is inaccurate, because the mesh seen by the depth camera shows no obvious parallax, whereas the classified region contents are presented well in the semantic segmentation map. The depth map captured by the depth camera can therefore be fused with the semantic segmentation map, the fused image analyzed pixel by pixel, and the depth information of the mesh obstacle determined, so that the unmanned equipment can perceive its surroundings comprehensively according to that depth information and avoid obstacles accurately.
In this method, the image to be identified is acquired in real time by a camera, and a semantic segmentation map of the image is obtained with semantic segmentation technology, so that the mesh obstacle is detected accurately and in real time; the depth map generated by binocular stereoscopic imaging is then corrected and supplemented according to the semantic segmentation map, improving the accuracy with which unmanned equipment recognizes mesh obstacles. A robot or unmanned vehicle can thus perceive its environment comprehensively even in complex surroundings, avoid obstacles accurately, and further improve driving safety.
In one embodiment of the present application, the fully trained semantic segmentation prediction model is trained on a PSPNet neural network;
the structure of the PSPNet neural network comprises a feature extraction sub-network, a pooling sub-network, and a convolution sub-network.
First, it should be noted that PSPNet (Pyramid Scene Parsing Network) is used as the neural network model for identifying mesh obstacles. Its core module is the pyramid pooling module, which divides a feature layer into grids of different sizes and aggregates the context information of different regions, improving the network's ability to capture global information. Referring to Fig. 4, Fig. 4 is an overall framework diagram of PSPNet according to an embodiment of the present invention.
Because the technical solution of the present invention will run on an embedded system or another mobile platform, its demand on computing performance cannot be too high, yet a certain accuracy is still required. After balancing performance against speed, the feature extraction sub-network of the present invention uses a ResNet-50 network, whose backbone produces one feature layer after another to serve as the input to the subsequent processing stages.
Turning to the pooling sub-network (part (c) of Fig. 4), features at four scales are fused together: the top row is the coarsest, global pooling, and the rows below pool at different, finer scales. After this processing, the pooled features are up-sampled to restore the original image size and stacked together, forming the overall framework of PSPNet.
The two steps above yield the features of the input picture. To obtain a picture of the same dimensions as the input, a final channel adjustment is required, which is the role of the convolution sub-network: for example, a 3x3 convolution adjusts the feature layer, a 1x1 convolution adjusts the number of channels, and a final resizing step makes the picture consistent with the input picture, producing the final semantic segmentation map.
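To make the pyramid pooling module concrete, the following is a minimal sketch in PyTorch (the patent does not specify a framework, so this is an assumption); the pool scales (1, 2, 3, 6) follow the original PSPNet paper, and all names are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    """Sketch of PSPNet-style pyramid pooling: pool the feature layer into
    grids of several sizes, compress channels, and fuse with the input."""
    def __init__(self, in_channels, pool_scales=(1, 2, 3, 6)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(scale),  # pool into a scale x scale grid
                nn.Conv2d(in_channels, in_channels // len(pool_scales), 1),
                nn.ReLU(inplace=True),
            )
            for scale in pool_scales
        ])

    def forward(self, x):
        h, w = x.shape[2:]
        # Up-sample each pooled branch back to the feature-map size, then
        # stack everything along the channel axis (part (c) of Fig. 4).
        pooled = [
            F.interpolate(b(x), size=(h, w), mode='bilinear',
                          align_corners=False)
            for b in self.branches
        ]
        return torch.cat([x] + pooled, dim=1)
```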
In one embodiment of the present application, the training process of the semantic segmentation prediction model includes:
acquiring a picture set containing mesh obstacles, and labeling the picture set with classification labels to obtain a classification result set;
forming a data set from the picture set and the classification result corresponding to each picture, wherein the data set comprises a training set, a test set, and a prediction set;
and inputting the training set into the PSPNet neural network for training, acquiring the trained model parameters once a preset loss condition is reached, and loading the trained model parameters into the PSPNet neural network to finish training the semantic segmentation prediction model.
Unmanned equipment such as automated transport vehicles is used increasingly in factory logistics, where obstacles such as fence nets are common, so the picture set containing mesh obstacles mainly comes from factory environments. After a sufficient number of photographs has been taken, the picture set can be labeled with classification labels; specifically, the labels can be produced manually with labeling-tool software, with the effect shown in Fig. 5, a label-production effect diagram provided by an embodiment of the invention. Producing the labels manually amounts to manually classifying the content of the picture set, which yields a data set composed of the pictures and their corresponding classification results for use in subsequent training.
To speed up training and achieve a better prediction effect, transfer learning is adopted. The pre-trained model outputs 20 classes in total; to simplify the model, the output classification is reduced to 5 classes during training, namely background, fence, ground, nylon net, and person. This greatly accelerates training and also substantially improves the later prediction speed.
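A hedged sketch of this transfer-learning step, assuming a PyTorch PSPNet implementation whose final classifier is a 1x1 convolution; the class name, attribute name, and checkpoint path are illustrative, not the patent's actual code:

```python
import torch
import torch.nn as nn

# Load pre-trained weights (20-class head), then swap the head for the
# 5 classes used here: background, fence, ground, nylon net, person.
model = PSPNet(backbone='resnet50', num_classes=20)  # assumed model class
state = torch.load('pspnet_pretrained.pth')          # assumed checkpoint
model.load_state_dict(state, strict=False)
model.classifier = nn.Conv2d(model.classifier.in_channels, 5, kernel_size=1)
```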
The training set is then input into the constructed PSPNet neural network for training. In one embodiment of the present invention, inputting the training set into the PSPNet neural network for training specifically comprises the following steps:
extracting a first picture feature layer from the training set using the feature extraction sub-network;
performing pooling operations at different scales on the first picture feature layer using the pooling sub-network to obtain a second picture feature layer;
and adjusting the feature layers and the number of channels of the second picture feature layer using the convolution sub-network, so that the output picture and the input picture are the same size.
After balancing performance against speed, the feature extraction sub-network uses a ResNet-50 network, which yields the first picture feature layer corresponding to the training set as the input to the subsequent processing stages. The pooling sub-network then performs pooling at different scales on the first picture feature layer to obtain the second picture feature layer: it divides the first feature layer into grids of different sizes and aggregates the context information of different regions into global information. Compared with the first picture feature layer, the second picture feature layer carries more detail, so the semantic content of a picture can be identified accurately. These two steps yield the features of the input picture; to obtain a picture of the same dimensions as the input, a final channel adjustment is needed, i.e. the convolution sub-network adjusts the feature layer (for example with a 3x3 convolution), adjusts the number of channels with a 1x1 convolution, and finally resizes the picture to be consistent with the input, producing the final semantic segmentation map.
After the preset training condition is reached, for example 100 iterations of training, the model file with the minimum combined loss on the training set and the test set is selected from the model files generated in each generation to fix the parameters of the neural network model, and the trained parameters are loaded into the constructed PSPNet model, completing the training of the semantic segmentation prediction model.
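The loop below sketches this procedure in the same assumed PyTorch setting; the 100 iterations and the minimum-combined-loss selection come from the text, while the optimizer, learning rate, and loader names (train_loader, test_loader) are illustrative:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr illustrative

best_loss = float('inf')
for epoch in range(100):                    # 100 iterations of training
    model.train()
    train_loss = 0.0
    for images, labels in train_loader:     # assumed DataLoader (training set)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
    model.eval()
    with torch.no_grad():                   # evaluate on the test set
        test_loss = sum(criterion(model(x), y).item()
                        for x, y in test_loader)
    if train_loss + test_loss < best_loss:  # keep the minimum combined loss
        best_loss = train_loss + test_loss
        torch.save(model.state_dict(), 'best_pspnet.pth')
```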
After training is completed, prediction follows. This process requires no back-propagation and no updating or learning of the neural network parameters: once the original picture is input into the fully trained semantic segmentation prediction model, the semantic segmentation picture is output.
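Prediction then reduces to a single forward pass with gradients disabled; a minimal sketch, where image_tensor is assumed to be a preprocessed (1, 3, H, W) tensor:

```python
model.eval()
with torch.no_grad():                # no back-propagation, no parameter updates
    logits = model(image_tensor)     # shape (1, 5, H, W): one score map per class
    seg_map = logits.argmax(dim=1)   # per-pixel class index, shape (1, H, W)
```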
In one embodiment of the present invention, after training of the semantic segmentation prediction model is completed, the method further includes:
and evaluating the completely trained semantic segmentation prediction model by using a preset evaluation index to obtain an evaluation result.
The preset evaluation index includes the mIoU index, which computes the ratio of the intersection to the union of the set of ground-truth values and the set of predicted values. Evaluating the semantic segmentation prediction model trained above with mIoU gives a result of 76.03%, indicating that the model's prediction effect is good.
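For reference, the mIoU index can be computed as below; a sketch assuming the prediction and ground truth are integer label maps of per-pixel class indices:

```python
import numpy as np

def mean_iou(pred, target, num_classes=5):
    """Mean intersection-over-union across classes for two label maps."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                 # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))
```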
In one embodiment of the present application, obtaining a depth map of an image to be identified includes:
acquiring calibration parameters of the depth camera, and correcting an image to be identified according to the calibration parameters;
and matching the corrected images, and calculating the depth of each pixel point in the image to be identified according to the matching result so as to obtain a depth map of the image to be identified.
It should be noted that there are currently three main depth-camera technologies: structured light, binocular vision, and time-of-flight (TOF). The present invention adopts the binocular vision scheme, i.e. the depth camera is a binocular camera. A binocular camera imitates the ranging principle of human eyes, using the parallax between the images acquired by the left and right cameras to recover the depth information of the picture. A binocular camera resists strong light interference, can work in outdoor environments, has the lowest manufacturing cost, and can be combined with deep learning to further optimize imaging.
The calibration parameters of the depth camera comprise the intrinsic and extrinsic parameters of the two cameras of the binocular camera and the homography matrix between them. Calibrating a camera can be understood as establishing the mapping from world coordinates to pixel coordinates; once this mapping relation is obtained by calibration, the world coordinates of a pixel can be inferred back from its pixel coordinates. The intrinsic parameters are those tied to the camera's own characteristics, such as its focal length and pixel size; the extrinsic parameters describe the camera in the world coordinate system, such as its position and rotation; and the homography between the two cameras describes the mapping between two planes, i.e. the transformation between the two images of points lying on a common plane.
When calculating the depth of a pixel, besides the focal length and baseline among the camera parameters, the parallax between the two cameras must be known, i.e. the correspondence between each pixel of the left camera and its counterpart in the right camera. The two corrected images are therefore matched pixel by pixel, either using the homography matrix of the two cameras or using the epipolar constraint. Once matching is complete, the parallax between the two cameras is available, the depth of every pixel can be calculated, and the depth map of the image to be recognized is obtained. Referring to Fig. 6, Fig. 6 is a depth map of an image to be identified according to an embodiment of the present invention. The boxed region is the depth map of a mesh obstacle, and the effect is clearly not ideal: the mesh targets are too fine and the mesh is distributed horizontally, so the mesh seen by the left and right cameras of the binocular camera shows no obvious parallax, making the mesh obstacle difficult to identify.
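As a sketch of this pipeline, OpenCV's semi-global block matcher can stand in for the matching step (the patent does not name a matcher, and the focal length and baseline below are illustrative values, not the actual calibration). Depth then follows from Z = f * B / d:

```python
import cv2
import numpy as np

left = cv2.imread('left.png', cv2.IMREAD_GRAYSCALE)    # rectified left image
right = cv2.imread('right.png', cv2.IMREAD_GRAYSCALE)  # rectified right image

matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128,
                                blockSize=5)
# SGBM returns fixed-point disparity scaled by 16.
disparity = matcher.compute(left, right).astype(np.float32) / 16.0

focal_px = 700.0    # focal length in pixels, from calibration (illustrative)
baseline_m = 0.12   # baseline in meters (illustrative)

depth = np.zeros_like(disparity)
valid = disparity > 0                                    # unmatched pixels stay 0
depth[valid] = focal_px * baseline_m / disparity[valid]  # Z = f * B / d
```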
In one embodiment of the present application, image fusion of a semantic segmentation map with a depth map includes:
constructing a camera coordinate system based on a depth camera;
and carrying out image fusion on the semantic segmentation map and the depth map in a camera coordinate system to obtain a fusion image.
Since the depth map and the semantic segmentation map are both obtained by further processing the image captured by the depth camera, and no change of coordinate system is involved in that processing, the depth map and the semantic segmentation map share the same coordinate system; the two can therefore be fused in the camera coordinate system of the depth camera. Referring to Fig. 7, Fig. 7 is the fused image obtained by fusing the semantic segmentation map and the depth map of an image to be identified according to an embodiment of the present invention.
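Because the two maps are pixel-aligned in the shared camera coordinate system, fusion for visualization can be a per-pixel overlay; a minimal sketch, with the blending weight as an assumption:

```python
import cv2
import numpy as np

def fuse(seg_map, depth_map, alpha=0.5):
    """Blend a color (BGR, uint8) segmentation map with a depth map of the
    same resolution; depth is normalized to 8 bits for display first."""
    depth_vis = cv2.normalize(depth_map, None, 0, 255, cv2.NORM_MINMAX)
    depth_vis = cv2.cvtColor(depth_vis.astype(np.uint8), cv2.COLOR_GRAY2BGR)
    return cv2.addWeighted(seg_map, alpha, depth_vis, 1 - alpha, 0)
```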
In one embodiment of the present application, performing pixel analysis on the fused image to determine depth information of the mesh obstacle includes:
and acquiring a histogram corresponding to the fusion image, carrying out pixel statistics on the histogram, and determining the depth information of the mesh obstacle based on a statistical result.
When performing pixel analysis on the fused image, analysis efficiency can be improved by taking the histogram of the region in which the mesh obstacle lies: that region can be cut out of the fused image as the region of interest and histogram pixel statistics computed on it. Referring to Fig. 8, Fig. 8 is the histogram corresponding to a partial region of the fused image according to an embodiment of the present invention.
The histogram shows the pixel values distributed mainly around two peaks. The region of peak P1 has the highest pixel values and corresponds to the closest obstacle distance, i.e. the distance of the mesh obstacle; peak P2 and the regions of lower pixel values correspond to other objects behind the mesh. The depth information of the mesh obstacle can then be recalculated from the pixel values of the region around peak P1.
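One way to realize this statistic, assuming the segmentation label map and the depth map are available side by side; MESH_CLASS is an illustrative label id, and taking the dominant histogram bin as peak P1 is an assumption about how the peak is picked:

```python
import numpy as np

MESH_CLASS = 3  # illustrative id of the "nylon net" class in the 5-class labeling

def mesh_depth(depth_map, label_map, bins=64):
    """Histogram the depth values of mesh-labeled pixels and return the
    center of the dominant bin as the mesh obstacle's depth (peak P1)."""
    values = depth_map[(label_map == MESH_CLASS) & (depth_map > 0)]
    hist, edges = np.histogram(values, bins=bins)
    peak = np.argmax(hist)
    return 0.5 * (edges[peak] + edges[peak + 1])
```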
In one embodiment of the present application, the method further includes:
and filling and repairing the depth map based on the depth information.
After the depth information of the mesh obstacle is obtained, the depth map can be filled and repaired according to it; specifically, the affected depth values are replaced by the recalculated depth information of the mesh obstacle.
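A sketch of this fill-and-repair step, reusing the assumed MESH_CLASS label from the previous sketch:

```python
def repair_depth(depth_map, label_map, mesh_depth_value):
    """Overwrite the depth of every pixel the segmentation map classifies
    as mesh with the recalculated mesh obstacle depth."""
    repaired = depth_map.copy()
    repaired[label_map == MESH_CLASS] = mesh_depth_value
    return repaired
```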
By correcting and supplementing the depth map with the mesh obstacle depth information generated from binocular stereoscopic imaging, a robot or unmanned vehicle can perceive its environment comprehensively even in complex surroundings and avoid obstacles accurately.
To better implement the mesh obstacle recognition method of the embodiments of the present invention, reference is made to Fig. 9, a schematic structural diagram of an embodiment of a mesh obstacle recognition device according to the present invention. The mesh obstacle recognition device 900 includes:
a first acquisition module 901, configured to acquire an image to be identified, wherein the image to be identified contains a mesh obstacle;
an output module 902, configured to input the image to be identified into a fully trained semantic segmentation prediction model and output the semantic segmentation map of the image to be identified;
a second acquisition module 903, configured to acquire the depth map of the image to be identified;
and a determination module 904, configured to fuse the semantic segmentation map with the depth map and perform pixel analysis on the fused image to determine the depth information of the mesh obstacle.
It should be noted that the apparatus 900 provided in the foregoing embodiment can implement the technical solutions described in the method embodiments above; the specific implementation principles of each module or unit may be found in the corresponding content of those method embodiments and are not repeated here.
Based on the above mesh obstacle recognition method, the embodiment of the invention further provides an electronic device, which includes: a processor and a memory, and a computer program stored in the memory and executable on the processor; the steps in the mesh obstacle recognition method of the embodiments described above are implemented when the processor executes a computer program.
A schematic structural diagram of an electronic device 1000 suitable for use in implementing embodiments of the present invention is shown in fig. 10. The electronic device in the embodiment of the present invention may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a car-mounted terminal (e.g., car navigation terminal), etc., and a stationary terminal such as a digital TV, a desktop computer, etc. The electronic device shown in fig. 10 is merely an example, and should not impose any limitation on the functionality and scope of use of embodiments of the present invention.
The electronic device includes a memory and a processor, where the processor may be referred to below as a processing device 1001, and the memory may include at least one of a read-only memory (ROM) 1002, a random access memory (RAM) 1003, and a storage device 1008, as described below:
As shown in Fig. 10, the electronic device 1000 may include a processing device (e.g. a central processing unit or a graphics processor) 1001, which can perform various appropriate actions and processes according to a program stored in the read-only memory (ROM) 1002 or a program loaded from the storage device 1008 into the random access memory (RAM) 1003. The RAM 1003 also stores various programs and data necessary for the operation of the electronic device 1000. The processing device 1001, the ROM 1002, and the RAM 1003 are connected to each other by a bus 1004, and an input/output (I/O) interface 1005 is also connected to the bus 1004.
In general, the following devices may be connected to the I/O interface 1005: input devices 1006 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 1007 including, for example, a Liquid Crystal Display (LCD), speaker, vibrator, etc.; storage 1008 including, for example, magnetic tape, hard disk, etc.; and communication means 1009. The communication means 1009 may allow the electronic device 1000 to communicate wirelessly or by wire with other devices to exchange data. While fig. 10 shows an electronic device 1000 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present invention include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 1009, installed from the storage device 1008, or installed from the ROM 1002. When executed by the processing device 1001, the computer program performs the above-described functions defined in the method of the embodiments of the present invention.
Based on the mesh obstacle recognition method, the embodiment of the present invention further provides a computer readable storage medium storing one or more programs, where the one or more programs may be executed by one or more processors to implement the steps in the mesh obstacle recognition method according to the foregoing embodiments.
Those skilled in the art will appreciate that all or part of the flow of the methods of the embodiments described above may be accomplished by a computer program instructing the associated hardware, where the program may be stored on a computer-readable storage medium such as a magnetic disk, an optical disk, a read-only memory, or a random access memory.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention.

Claims (10)

1. A mesh obstacle recognition method, comprising:
acquiring an image to be identified, wherein the image to be identified contains a mesh obstacle;
inputting the image to be identified into a fully trained semantic segmentation prediction model, and outputting a semantic segmentation map of the image to be identified;
acquiring a depth map of the image to be identified;
and fusing the semantic segmentation map with the depth map, and performing pixel analysis on the fused image to determine the depth information of the mesh obstacle.
2. The mesh obstacle recognition method of claim 1, wherein the fully trained semantic segmentation prediction model is trained on a PSPNet neural network;
the PSPNet neural network structure comprises a feature extraction sub-network, a pooling sub-network, and a convolution sub-network.
3. The mesh obstacle recognition method of claim 2, wherein the training process of the semantic segmentation prediction model comprises:
acquiring a picture set containing mesh obstacles, and labeling the picture set with classification labels to obtain a classification result set;
forming a data set from the picture set and the classification result corresponding to each picture, wherein the data set comprises a training set, a test set, and a prediction set;
inputting the training set into the PSPNet neural network for training, acquiring the trained model parameters once a preset loss condition is reached, and loading the trained model parameters into the PSPNet neural network to finish training the semantic segmentation prediction model;
wherein inputting the training set into the PSPNet neural network for training specifically comprises:
extracting a first picture feature layer from the training set using the feature extraction sub-network;
performing pooling operations at different scales on the first picture feature layer using the pooling sub-network to obtain a second picture feature layer;
and adjusting the feature layers and the number of channels of the second picture feature layer using the convolution sub-network, so that the output picture and the input picture are the same size.
4. The mesh obstacle recognition method according to claim 1, wherein acquiring the depth map of the image to be identified comprises:
acquiring calibration parameters of a depth camera, and correcting the image to be identified according to the calibration parameters;
and matching the corrected images, and calculating the depth of each pixel point in the image to be identified according to the matching result so as to obtain a depth map of the image to be identified.
5. The mesh obstacle recognition method of claim 1, wherein fusing the semantic segmentation map with the depth map comprises:
constructing a camera coordinate system based on a depth camera;
and fusing the semantic segmentation map with the depth map in the camera coordinate system to obtain a fused image.
6. The mesh obstacle recognition method according to claim 1 or 5, wherein performing pixel analysis on the fused image to determine the depth information of the mesh obstacle comprises:
acquiring a histogram corresponding to the fused image, performing pixel statistics on the histogram, and determining the depth information of the mesh obstacle based on the statistical result.
7. The mesh obstacle recognition method of claim 6, further comprising:
filling and repairing the depth map based on the depth information.
8. A mesh obstacle recognition device, comprising:
a first acquisition module, configured to acquire an image to be identified, wherein the image to be identified contains a mesh obstacle;
an output module, configured to input the image to be identified into a fully trained semantic segmentation prediction model and output a semantic segmentation map of the image to be identified;
a second acquisition module, configured to acquire the depth map of the image to be identified;
and a determination module, configured to fuse the semantic segmentation map with the depth map and perform pixel analysis on the fused image to determine the depth information of the mesh obstacle.
9. An electronic device comprising a memory and a processor, wherein the memory is configured to store a program, and the processor, coupled to the memory, is configured to execute the program stored in the memory to implement the steps of the mesh obstacle recognition method of any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program or instructions which, when executed by a processor, implement the steps of the mesh obstacle recognition method of any one of claims 1 to 7.
CN202210907845.3A, filed 2022-07-29 (priority date 2022-07-29): Mesh obstacle recognition method, mesh obstacle recognition device, electronic equipment and computer storage medium. Status: Pending.

Priority Applications (1)

CN202210907845.3A (priority date 2022-07-29, filing date 2022-07-29): Mesh obstacle recognition method, mesh obstacle recognition device, electronic equipment and computer storage medium

Applications Claiming Priority (1)

CN202210907845.3A (priority date 2022-07-29, filing date 2022-07-29): Mesh obstacle recognition method, mesh obstacle recognition device, electronic equipment and computer storage medium

Publications (1)

CN116092042A, published 2023-05-09

Family

ID=86206915

Family Applications (1)

CN202210907845.3A (priority date 2022-07-29, filing date 2022-07-29): Mesh obstacle recognition method, mesh obstacle recognition device, electronic equipment and computer storage medium (pending)

Country Status (1)

CN (1): CN116092042A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination