CN114140765A - Obstacle sensing method and device and storage medium - Google Patents

Obstacle sensing method and device and storage medium

Info

Publication number
CN114140765A
CN114140765A (application CN202111338928.7A)
Authority
CN
China
Prior art keywords
point cloud
semantic
picture
information
camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111338928.7A
Other languages
Chinese (zh)
Other versions
CN114140765B (en)
Inventor
Wu Xinkai
Xu Shaoqing
Wang Pengcheng
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202111338928.7A priority Critical patent/CN114140765B/en
Publication of CN114140765A publication Critical patent/CN114140765A/en
Application granted granted Critical
Publication of CN114140765B publication Critical patent/CN114140765B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S 17/88 Lidar systems specially adapted for specific applications
    • G01S 17/93 Lidar systems specially adapted for specific applications for anti-collision purposes
    • G01S 17/931 Lidar systems specially adapted for specific applications for anti-collision purposes of land vehicles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10032 Satellite or aerial image; Remote sensing
    • G06T 2207/10044 Radar image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning

Abstract

The application discloses an obstacle sensing method, an obstacle sensing device, and a storage medium, which are used for reducing the false detection rate and the missed detection rate of obstacles and improving detection precision. The obstacle sensing method disclosed by the application comprises the following steps: acquiring an original point cloud and a camera picture at the same moment; acquiring calibrated internal and external parameters for projection conversion; performing semantic segmentation on the original point cloud to obtain a second point cloud; performing semantic segmentation on the camera picture to obtain a second picture; projecting the original point cloud onto the second picture according to the internal and external parameters to obtain a third point cloud, wherein each point in the third point cloud comprises second semantic category information corresponding to the second picture; voxelizing the second semantic category information in the third point cloud, the first semantic category information in the second point cloud, and the feature information of the original point cloud, and inputting the voxelized result into an adaptive attention mechanism network for learning to obtain weighted semantic information; and detecting the obstacle target according to the weighted semantic information. The application also provides an obstacle sensing device and a storage medium.

Description

Obstacle sensing method and device and storage medium
Technical Field
The present application relates to the field of automatic driving, and in particular, to a method and an apparatus for sensing an obstacle, and a storage medium.
Background
With the continuous development of automatic driving technology, various sensors have become important components of automatic driving systems. The environment sensing part of an automatic driving system usually needs to acquire a large amount of surrounding information to ensure that the autonomous vehicle correctly understands the environment around the vehicle body and makes corresponding decisions. However, sensing with a single sensor has limitations: on one hand, a single sensing device may have detection blind areas due to the limitation of its installation position; on the other hand, each sensor has its own characteristic defects.
Therefore, obstacle sensing performed with a single sensor suffers from low identification precision.
Disclosure of Invention
In view of the foregoing technical problems, embodiments of the present application provide an obstacle sensing method, an apparatus, and a storage medium, so as to improve the accuracy of obstacle sensing.
In a first aspect, an obstacle sensing method provided in an embodiment of the present application includes:
acquiring an original point cloud and a camera picture at the same time;
acquiring calibrated internal and external parameters for projection conversion;
performing semantic segmentation on the original point cloud to obtain a second point cloud;
performing semantic segmentation on the camera picture to obtain a second picture;
according to the internal parameters and the external parameters, projecting the original point cloud to the second picture to obtain a third point cloud, wherein each point in the third point cloud comprises second semantic category information corresponding to the second picture;
voxelizing the second semantic category information in the third point cloud, the first semantic category information in the second point cloud, and the feature information of the original point cloud, and inputting the voxelized result into an adaptive attention mechanism network for learning to obtain weighted semantic information;
detecting an obstacle target according to the weighted semantic information;
wherein the second point cloud and the third point cloud include obstacle category information.
Preferably, the learning in the adaptive attention mechanism network includes:
learning the local features in the adaptive attention mechanism network to obtain the learned local features V_i;
learning the global feature in the adaptive attention mechanism network to obtain the learned global feature V_global;
concatenating the learned global feature V_global to each local feature V_i to obtain the enhanced feature V_gl.
Preferably, the detecting an obstacle target according to the weighted semantic information includes:
inputting the weighted semantic information into a target detector to detect the obstacle target.
Preferably, the acquiring the original point cloud and the camera picture at the same time includes:
performing software synchronization or hardware synchronization on the point cloud and the camera;
and obtaining the original point cloud and the camera picture at the same moment.
Preferably, the semantic segmentation of the original point cloud to obtain the second point cloud comprises:
and inputting the original point cloud into a point cloud semantic segmentation network to obtain a second point cloud.
Preferably, the semantic segmentation of the camera picture to obtain the second picture includes:
and inputting the camera picture into a picture semantic segmentation network to obtain a second picture.
Preferably, before voxelizing the second semantic category information in the third point cloud, the first semantic category information in the second point cloud, and the feature information of the original point cloud, the method further comprises the following step:
and converting the first semantic category information and the second semantic category information into a One-Hot coding format.
Preferably, the projecting the original point cloud to the second picture includes:
projection is performed according to the following formula:
P′=Proj(K,M,P),
wherein Proj denotes the projection matrix processing;
K is the internal parameter matrix of the camera;
M is the external parameter matrix from the camera to the laser radar;
P is the laser radar point cloud set;
P′ is the laser radar point cloud projected into the camera coordinate system.
The local features are learned in the adaptive attention mechanism network to obtain the learned local features V_i, which comprises the following steps:
learning of the local features is performed according to the following formula:
V_i = max_{i=1,2,…,N} { MLP_l(p_i) },
wherein V_i is the learned feature in the i-th voxel grid;
p_i is the i-th point in the spatial point cloud;
MLP_l is the multi-layer perceptron for local features;
max is the maximum pooling operation over all points in a voxel;
C_1 is the number of channels of the local feature map;
N is the number of voxel grids.
Preferably, the global feature is learned in the adaptive attention mechanism network to obtain the learned global feature V_global, which comprises the following steps:
the global feature is learned according to the following formula:
V_global = max_{i=1,2,…,N} { MLP_g(V_i) },
wherein MLP_g is the multi-layer perceptron for the global feature;
max is the maximum pooling operation over all voxels;
C_2 is the number of channels of the entire feature map;
N is the number of voxel grids;
V_i is the learned feature in the i-th voxel grid.
The obtaining of the weighted semantic information comprises:
obtaining the weighted semantic information according to the following formulas:
(The two weighting formulas appear only as equation images in the original publication and cannot be reproduced from the text.)
wherein P_{a,s} and P_{a,t} are the weighted semantic information;
P_2D is the second semantic information;
P_3D is the first semantic information;
MLP_att is a multi-layer perceptron;
σ is the Sigmoid activation function.
With the obstacle sensing method provided by the present invention, the laser radar sensor and the camera sensor are fused, so that the advantages of different sensors are exploited while their respective defects are compensated, and the sensing and identification precision of point cloud target detection is improved. Meanwhile, in this scheme, a deep learning network combining three-dimensional point cloud semantic segmentation information with two-dimensional picture semantic segmentation information, that is, semantic information from different sensors, is used to reduce the false detection rate and the missed detection rate of obstacles.
In a second aspect, an embodiment of the present application further provides an obstacle sensing device, including:
the picture acquisition module is configured to acquire a camera picture and to acquire the calibrated internal and external parameters for projection conversion;
a point cloud acquisition module configured to acquire an original point cloud;
the picture semantic segmentation module is configured for performing semantic segmentation on the camera picture to obtain a second picture;
the point cloud semantic segmentation module is configured for performing semantic segmentation on the original point cloud to obtain a second point cloud;
the picture semantic projection module is configured to project the original point cloud onto the second picture according to the internal parameters and the external parameters to obtain a third point cloud, wherein each point in the third point cloud comprises second semantic category information corresponding to the second picture;
the semantic fusion module is configured to voxelize the second semantic category information in the third point cloud, the first semantic category information in the second point cloud, and the feature information of the original point cloud, and to input the voxelized result into an adaptive attention mechanism network for learning to obtain weighted semantic information;
an obstacle sensing module configured to detect an obstacle target according to the weighted semantic information;
wherein the second point cloud and the second picture include obstacle category information.
In a third aspect, an embodiment of the present application further provides an obstacle sensing device, including: a memory, a processor, and a user interface;
the memory for storing a computer program;
the user interface is used for realizing interaction with a user;
the processor is used for reading the computer program in the memory, and when the processor executes the computer program, the obstacle sensing method provided by the invention is realized.
In a fourth aspect, an embodiment of the present invention further provides a processor-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the processor implements the obstacle sensing method provided by the present invention.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic diagram of obstacle sensing provided in an embodiment of the present application;
FIG. 2 is a schematic flow chart of an adaptive attention mechanism provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of an obstacle sensing device according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of another obstacle sensing device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Some of the words that appear in the text are explained below:
1. The term "and/or" in the embodiments of the present invention describes an association relationship of associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A exists alone, both A and B exist, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
2. In the embodiments of the present application, the term "plurality" means two or more, and other terms are similar thereto.
3. One-Hot coding format, also known as one-bit effective coding, is often used in classification prediction and is usually presented in the form of a binary vector. The class to which an object belongs is first mapped to an integer value, and the integer value is then converted into a binary vector in which the dimension corresponding to the class has the value 1 and the remaining dimensions have the value 0.
4. Voxelization means that a three-dimensional point cloud is divided into grids of the same resolution (e.g., 0.75 m × 0.75 m), and each point is placed into a voxel grid according to its spatial position in the point cloud.
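For illustration, minimal Python sketches of the One-Hot encoding and the voxelization follow; the voxel size, point cloud origin, and class ids are assumed example values, not parameters fixed by this application.

    import numpy as np

    def one_hot(class_ids, num_classes):
        # Map integer class ids to One-Hot row vectors (definition 3 above)
        encoded = np.zeros((len(class_ids), num_classes), dtype=np.float32)
        encoded[np.arange(len(class_ids)), class_ids] = 1.0
        return encoded

    def voxelize(points, voxel_size=0.75, origin=(-50.0, -50.0, -3.0)):
        # Assign each point (x, y, z, ...) to a voxel cell of equal resolution
        # (definition 4 above); origin is an assumed lower bound of the range
        coords = np.floor((points[:, :3] - np.array(origin)) / voxel_size).astype(np.int64)
        voxels = {}
        for idx, cell in enumerate(map(tuple, coords)):
            voxels.setdefault(cell, []).append(idx)  # group point indices per voxel
        return voxels

    # e.g. three points labelled with classes 0, 1 and 2 out of m = 4 classes
    print(one_hot([0, 1, 2], 4))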
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the display sequence of the embodiment of the present application only represents the sequence of the embodiment, and does not represent the merits of the technical solutions provided by the embodiments.
Example one
Referring to fig. 1, a schematic diagram of an obstacle sensing method provided in an embodiment of the present application is shown in fig. 1, where the method includes steps S101 to S107:
s101, acquiring an original point cloud and a camera picture at the same moment;
in the embodiment of the invention, the original point cloud is a three-dimensional point cloud and can be obtained through a laser radar. The camera pictures are acquired by the cameras, and if a plurality of cameras are installed, the camera pictures of the plurality of cameras are acquired simultaneously. The acquisition time of the original point cloud is the same as that of the camera picture, namely the original point cloud and the camera picture at the same moment are acquired.
As a preferred example, in this step, the original point cloud and the camera picture may not be acquired at exactly the same moment; instead, the difference between the acquisition time of the original point cloud and that of the camera picture lies within a predetermined time difference range, for example, 0.001 second.
As a preferred example, in order to acquire the original point cloud and the camera picture at the same moment, software synchronization or hardware synchronization may be performed on the point cloud and the camera. Software synchronization refers to providing the same time source for different sensors, with each sensor timestamping its own recorded data; hardware synchronization refers to using a hardware trigger to make different sensors record time directly through physical signals, such as PPS time service.
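A Python sketch of software synchronization under these assumptions: both sensors stamp their data from a shared time source, and a pair is accepted when the timestamps differ by no more than the 0.001-second tolerance mentioned above.

    def match_nearest(cloud_stamps, picture_stamps, max_dt=0.001):
        # Pair each point cloud timestamp with the nearest picture timestamp
        pairs = []
        for tc in cloud_stamps:
            tp = min(picture_stamps, key=lambda t: abs(t - tc))
            if abs(tp - tc) <= max_dt:  # accept only within the tolerance
                pairs.append((tc, tp))
        return pairs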
S102, obtaining calibrated internal parameters and external parameters of projection conversion;
in this step, calibration internal parameters and external parameters of projection conversion corresponding to the camera and the laser radar are obtained.
In the embodiment of the invention, the internal parameters are calibrated internal parameters of the camera; the external parameters are external parameters of the camera and the laser radar; the internal parameters and the external parameters are used for projection conversion of the point cloud.
Preferably, the internal parameters include, but are not limited to: distortion coefficients, focal length, pixel size, and the like; the external parameters include, but are not limited to: rotation and translation matrices, and the like.
It should be noted that, in the embodiment of the present invention, the internal parameters and the external parameters are calibrated and stored in advance.
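For illustration, the calibrated parameters can be held as matrices, as in the Python sketch below; the numeric values are placeholders, not calibration results from this application.

    import numpy as np

    fx, fy, cx, cy = 721.5, 721.5, 609.6, 172.9  # assumed focal lengths and principal point
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]])              # internal parameter matrix of the camera

    R = np.eye(3)                                # assumed rotation between lidar and camera
    t = np.array([[0.2], [0.0], [-0.1]])         # assumed translation (metres)
    M = np.hstack([R, t])                        # 3x4 external parameter matrix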
S103, performing semantic segmentation on the original point cloud to obtain a second point cloud;
as a preferred example, the original point cloud is input into a point cloud semantic segmentation network.
In the step, a frame of point cloud is transmitted to a point cloud semantic segmentation network, and a second point cloud containing fine-grained semantic information is obtained, namely the second point cloud comprises the first semantic information.
It should be noted that the fine-grained semantic information means that the category information of each point is clear and is not interfered by other external conditions, for example, is not influenced by internal parameters and external parameters.
S104, performing semantic segmentation on the camera picture to obtain a second picture;
in this step, the camera picture is input into a picture semantic segmentation network to obtain a second picture.
It should be noted that, as a preferable example, in the above steps S103 and S104, the semantic segmentation network may be a Cylinder3D network or the like.
S105, projecting the original point cloud to the second picture according to the internal parameters and the external parameters to obtain a third point cloud, wherein each point in the third point cloud comprises second semantic category information corresponding to the second picture;
in this step, the projection method of the original point cloud to the second picture may be:
projection is performed according to the following formula:
P′=Proj(K,M,P),
wherein Proj denotes the projection matrix processing;
K is the internal parameter matrix of the camera;
M is the external parameter matrix from the camera to the laser radar;
P is the laser radar point cloud set;
P′ is the laser radar point cloud projected into the camera coordinate system.
For example, given the internal parameter matrix K and the external parameter matrix M, the original point cloud P is projected onto the second picture according to the above formula, obtaining the third point cloud P′.
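A minimal Python sketch of the projection P′ = Proj(K, M, P), reusing the K and M matrices sketched earlier; each projected point can then read the second semantic category at its pixel in the second picture.

    import numpy as np

    def project_points(K, M, P):
        # P: (N, 3) lidar points; returns (u, v) pixels and a positive-depth mask
        P_h = np.hstack([P, np.ones((P.shape[0], 1))])  # homogeneous coordinates (N, 4)
        cam = (M @ P_h.T).T                             # points in camera coordinates (N, 3)
        in_front = cam[:, 2] > 0                        # only points in front of the camera
        pix = (K @ cam.T).T
        pix = pix[:, :2] / pix[:, 2:3]                  # perspective divide to pixel coordinates
        return pix, in_front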
After the above steps S101 to S105, each point in the point cloud carries two kinds of semantic information, that is, the first semantic information from the original point cloud and the second semantic information from the point cloud projected onto the picture. It should be noted that, as a preferred example, the obtained first semantic information and second semantic information may also be converted into the One-Hot format.
S106, voxelizing the second semantic category information in the third point cloud, the first semantic category information in the second point cloud, and the feature information of the original point cloud, and inputting the voxelized result into an adaptive attention mechanism network for learning to obtain weighted semantic information;
In this step, the learning in the adaptive attention mechanism network includes:
learning the local features in the adaptive attention mechanism network to obtain the learned local features V_i;
learning the global feature in the adaptive attention mechanism network to obtain the learned global feature V_global;
concatenating the learned global feature V_global to each local feature V_i to obtain the enhanced feature V_gl.
As a preferred example, the local features are learned in the attention mechanism network to obtain the learned local features V_i, which comprises the following steps:
learning of the local features is performed according to the following formula:
V_i = max_{i=1,2,…,N} { MLP_l(p_i) },
wherein V_i is the learned feature in the i-th voxel grid;
p_i is the i-th point in the spatial point cloud;
MLP_l is the multi-layer perceptron for local features;
max is the maximum pooling operation over all points in a voxel;
C_1 is the number of channels of the local feature map;
N is the number of voxel grids.
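A minimal PyTorch sketch of this per-voxel computation, assuming in_dim is the per-point feature length and c1 corresponds to C_1:

    import torch
    import torch.nn as nn

    class LocalFeatureEncoder(nn.Module):
        # V_i = max over the points in voxel i of MLP_l(p_i)
        def __init__(self, in_dim, c1):
            super().__init__()
            self.mlp_l = nn.Sequential(nn.Linear(in_dim, c1), nn.ReLU())

        def forward(self, voxel_points):
            # voxel_points: (N, T, in_dim), N voxels holding T points each
            feats = self.mlp_l(voxel_points)  # (N, T, C1) per-point features
            v_i, _ = feats.max(dim=1)         # max pooling over the points in a voxel
            return v_i                        # (N, C1) local features V_i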
The global feature is learned in the adaptive attention mechanism network to obtain the learned global feature V_global, which comprises the following steps:
the global feature is learned according to the following formula:
V_global = max_{i=1,2,…,N} { MLP_g(V_i) },
wherein MLP_g is the multi-layer perceptron for the global feature;
max is the maximum pooling operation over all voxels;
C_2 is the number of channels of the entire feature map;
N is the number of voxel grids;
V_i is the learned feature in the i-th voxel grid.
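Continuing the sketch, the global feature and the enhanced feature V_gl can be formed as follows; c2 corresponds to C_2 and is an assumed size:

    import torch
    import torch.nn as nn

    class GlobalFeatureEncoder(nn.Module):
        # V_global = max over all voxels of MLP_g(V_i), concatenated back to each V_i
        def __init__(self, c1, c2):
            super().__init__()
            self.mlp_g = nn.Sequential(nn.Linear(c1, c2), nn.ReLU())

        def forward(self, v_local):
            g = self.mlp_g(v_local)                       # (N, C2)
            v_global, _ = g.max(dim=0, keepdim=True)      # (1, C2) pooled over all voxels
            v_global = v_global.expand(v_local.size(0), -1)
            return torch.cat([v_local, v_global], dim=1)  # (N, C1 + C2) enhanced V_gl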
As a preferred example, the weighted semantic information is obtained according to the following formulas:
(The two weighting formulas appear only as equation images in the original publication and cannot be reproduced from the text.)
wherein P_{a,s} and P_{a,t} are the weighted semantic information;
P_2D is the second semantic information;
P_3D is the first semantic information;
MLP_att is a multi-layer perceptron;
σ is the Sigmoid activation function.
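Since the weighting formulas are available only as images, the following PyTorch sketch shows one plausible form consistent with the listed symbols: a shared multi-layer perceptron MLP_att followed by a Sigmoid gate. This form is an assumption, not necessarily the exact formula of this application.

    import torch
    import torch.nn as nn

    class SemanticAttention(nn.Module):
        # Assumed form: weights = sigma(MLP_att([P_2D, P_3D])), applied to both inputs
        def __init__(self, dim):
            super().__init__()
            self.mlp_att = nn.Linear(2 * dim, dim)
            self.sigma = nn.Sigmoid()

        def forward(self, p_2d, p_3d):
            # p_2d, p_3d: (N, dim) One-Hot semantic information per voxel
            weights = self.sigma(self.mlp_att(torch.cat([p_2d, p_3d], dim=1)))
            return weights * p_2d, weights * p_3d  # P_{a,s} and P_{a,t} (assumed)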
As a preferred example, the process of this step is shown in FIG. 2. In fig. 2, the input point cloud is the original point cloud, the 2D semantic information is the data obtained by converting the second semantic information into the One-Hot format, and the 3D semantic information is the data obtained by converting the first semantic information into the One-Hot format. The processing proceeds as follows:
After the 2D and 3D semantic information converted into the One-Hot format is spliced to the original point cloud feature information, if the number of categories to be predicted is m, each point contains 2m dimensions of semantic category information from the 3D and 2D segmentation. This semantic information is then spliced with the original data of the point cloud, such as the XYZ coordinates, into an N × (2m + 3)-dimensional feature vector. The feature vector is voxelized and input into the adaptive attention mechanism network combining local and global features, which yields the weighted category feature of each voxel grid; finally, the weighted category features are input into the target detection network. A sketch of assembling this feature vector is given below.
S107, detecting an obstacle target according to the weighted semantic information;
in this step, the weighted semantic information is input to a target detector to detect an obstacle target.
With the obstacle sensing method provided by the present invention, the laser radar sensor and the camera sensor are fused, so that the advantages of different sensors are exploited while their respective defects are compensated, and the sensing and identification precision of point cloud target detection is improved. Meanwhile, in this scheme, a deep learning network combining three-dimensional point cloud semantic segmentation information with two-dimensional picture semantic segmentation information, that is, semantic information from different sensors, is used to reduce the false detection rate and the missed detection rate of obstacles.
Example two
Based on the same inventive concept, an embodiment of the present invention further provides an obstacle sensing device, as shown in fig. 3, the device includes:
a picture obtaining module 303, configured to obtain a camera picture and to obtain the calibrated internal and external parameters for projection conversion;
a point cloud obtaining module 301 configured to obtain an original point cloud;
a picture semantic segmentation module 304, configured to perform semantic segmentation on the camera picture to obtain a second picture;
a point cloud semantic segmentation module 302 configured to perform semantic segmentation on the original point cloud to obtain a second point cloud;
a picture semantic projection module 305, configured to project the original point cloud onto the second picture according to the internal parameters and the external parameters to obtain a third point cloud, where each point in the third point cloud includes second semantic category information corresponding to the second picture;
a semantic fusion module 306, configured to voxelize the second semantic category information in the third point cloud, the first semantic category information in the second point cloud, and the feature information of the original point cloud, and to input the voxelized result into an adaptive attention mechanism network for learning to obtain weighted semantic information;
an obstacle sensing module 307 configured to detect an obstacle target according to the weighted semantic information;
wherein the second point cloud and the second picture include obstacle category information.
As a preferred example, the picture acquiring module 303 is further configured to:
and acquiring a camera picture at the same time as the original point cloud.
Specifically, software synchronization or hardware synchronization may be performed on the point cloud and the camera, and then a camera picture at the same time as the original point cloud is obtained.
As a preferred example, the picture semantic segmentation module 304 is further configured to:
and inputting the camera picture into a picture semantic segmentation network to obtain a second picture.
As a preferred example, the point cloud semantic segmentation module 302 is further configured to:
and inputting the original point cloud into a point cloud semantic segmentation network to obtain a second point cloud.
As a preferred example, the picture semantic projection module 305 is further configured to perform the projection of the original point cloud to the second picture according to the following manner:
projection is performed according to the following formula:
P′=Proj(K,M,P),
wherein Proj denotes the projection matrix processing;
K is the internal parameter matrix of the camera;
M is the external parameter matrix from the camera to the laser radar;
P is the laser radar point cloud set;
P′ is the laser radar point cloud projected into the camera coordinate system.
As a preferred example, the learning in the adaptive attention mechanism network includes:
learning the local features in the adaptive attention mechanism network to obtain the learned local features V_i;
learning the global feature in the adaptive attention mechanism network to obtain the learned global feature V_global;
concatenating the learned global feature V_global to each local feature V_i to obtain the enhanced feature V_gl.
The global feature is learned in the adaptive attention mechanism network to obtain the learned global feature V_global, which comprises the following steps:
the global feature is learned according to the following formula:
V_global = max_{i=1,2,…,N} { MLP_g(V_i) },
wherein MLP_g is the multi-layer perceptron for the global feature;
max is the maximum pooling operation over all voxels;
C_2 is the number of channels of the entire feature map;
N is the number of voxel grids;
V_i is the learned feature in the i-th voxel grid.
As a preferred example, the semantic fusion module 306 is further configured to obtain the weighted semantic information according to the following formulas:
(The two weighting formulas appear only as equation images in the original publication and cannot be reproduced from the text.)
wherein P_{a,s} and P_{a,t} are the weighted semantic information;
P_2D is the second semantic information;
P_3D is the first semantic information;
MLP_att is a multi-layer perceptron;
σ is the Sigmoid activation function.
It should be noted that the apparatus provided in the second embodiment and the method provided in the first embodiment belong to the same inventive concept, solve the same technical problem, and achieve the same technical effect, and the apparatus provided in the second embodiment can implement all the methods of the first embodiment, and the same parts are not described again.
EXAMPLE III
Based on the same inventive concept, an embodiment of the present invention further provides an obstacle sensing device, as shown in fig. 4, the device includes:
a memory 402, a processor 401, and a user interface 403;
the memory 402 for storing a computer program;
the user interface 403 is used for realizing interaction with a user;
the processor 401 is configured to read the computer program in the memory 402, and when the processor 401 executes the computer program, the processor implements:
acquiring an original point cloud and a camera picture at the same time;
acquiring calibrated internal and external parameters for projection conversion;
performing semantic segmentation on the original point cloud to obtain a second point cloud;
performing semantic segmentation on the camera picture to obtain a second picture;
according to the internal parameters and the external parameters, projecting the original point cloud to the second picture to obtain a third point cloud, wherein each point in the third point cloud comprises second semantic category information corresponding to the second picture;
voxelizing the second semantic category information in the third point cloud, the first semantic category information in the second point cloud, and the feature information of the original point cloud, and inputting the voxelized result into an adaptive attention mechanism network for learning to obtain weighted semantic information;
detecting an obstacle target according to the weighted semantic information;
wherein the second point cloud and the third point cloud include obstacle category information.
In fig. 4, the bus architecture may include any number of interconnected buses and bridges, specifically linking together one or more processors, represented by processor 401, and various circuits of memory, represented by memory 402. The bus architecture may also link together various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and are therefore not described further herein. The bus interface provides an interface. The processor 401 is responsible for managing the bus architecture and general processing, and the memory 402 may store data used by the processor 401 in performing operations.
The processor 401 may be a CPU, ASIC, FPGA or CPLD, and the processor 401 may also employ a multi-core architecture.
The processor 401, when executing the computer program stored in the memory 402, implements any of the obstacle sensing methods of the first embodiment.
It should be noted that the apparatus provided in the third embodiment and the method provided in the first embodiment belong to the same inventive concept, solve the same technical problem, and achieve the same technical effect, and the apparatus provided in the third embodiment can implement all the methods of the first embodiment, and the same parts are not described again.
The present application also proposes a processor-readable storage medium. The processor-readable storage medium stores a computer program, and the processor implements any obstacle sensing method in the first embodiment when executing the computer program.
It should be noted that the division of the unit in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation. In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (14)

1. An obstacle sensing method, comprising:
acquiring an original point cloud and a camera picture at the same time;
acquiring calibrated internal and external parameters for projection conversion;
performing semantic segmentation on the original point cloud to obtain a second point cloud;
performing semantic segmentation on the camera picture to obtain a second picture;
according to the internal parameters and the external parameters, projecting the original point cloud to the second picture to obtain a third point cloud, wherein each point in the third point cloud comprises second semantic category information corresponding to the second picture;
voxelizing the second semantic category information in the third point cloud, the first semantic category information in the second point cloud, and the feature information of the original point cloud, and inputting the voxelized result into an adaptive attention mechanism network for learning to obtain weighted semantic information;
detecting an obstacle target according to the weighted semantic information;
wherein the second point cloud and the third point cloud include obstacle category information.
2. The method of claim 1, wherein the learning in the adaptive attention mechanism network comprises:
learning the local features in the adaptive attention mechanism network to obtain the learned local features V_i;
learning the global feature in the adaptive attention mechanism network to obtain the learned global feature V_global;
concatenating the learned global feature V_global to each local feature V_i to obtain the enhanced feature V_gl.
3. The method of claim 2, wherein detecting an obstacle target based on the weighted semantic information comprises:
and inputting the weighted semantic information into a target detector to detect the obstacle target.
4. The method of claim 1, wherein the obtaining the original point cloud and the camera picture at the same time comprises:
performing software synchronization or hardware synchronization on the point cloud and the camera;
and obtaining the original point cloud and the camera picture at the same moment.
5. The method of claim 1, wherein semantically segmenting the original point cloud to obtain a second point cloud comprises:
and inputting the original point cloud into a point cloud semantic segmentation network to obtain a second point cloud.
6. The method of claim 1, wherein the semantically segmenting the camera picture to obtain a second picture comprises:
and inputting the camera picture into a picture semantic segmentation network to obtain a second picture.
7. The method of claim 1, wherein before voxelizing the second semantic category information in the third point cloud, the first semantic category information in the second point cloud, and the feature information of the original point cloud, the method further comprises:
and converting the first semantic category information and the second semantic category information into a One-Hot coding format.
8. The method of claim 1, wherein said projecting the original point cloud to the second picture comprises:
projection is performed according to the following formula:
P′=Proj(K,M,P),
wherein Proj denotes the projection matrix processing;
K is the internal parameter matrix of the camera;
M is the external parameter matrix from the camera to the laser radar;
P is the laser radar point cloud set;
P′ is the laser radar point cloud projected into the camera coordinate system.
9. The method of claim 2, wherein learning the local features in the adaptive attention mechanism network to obtain the learned local features V_i comprises:
learning of the local features is performed according to the following formula:
V_i = max_{i=1,2,…,N} { MLP_l(p_i) },
wherein V_i is the learned feature in the i-th voxel grid;
p_i is the i-th point in the spatial point cloud;
MLP_l is the multi-layer perceptron for local features;
max is the maximum pooling operation over all points in a voxel;
C_1 is the number of channels of the local feature map;
N is the number of voxel grids.
10. The method of claim 2, wherein learning the global feature in the adaptive attention mechanism network to obtain the learned global feature V_global comprises:
the global feature is learned according to the following formula:
V_global = max_{i=1,2,…,N} { MLP_g(V_i) },
wherein MLP_g is the multi-layer perceptron for the global feature;
max is the maximum pooling operation over all voxels;
C_2 is the number of channels of the entire feature map;
N is the number of voxel grids;
V_i is the learned feature in the i-th voxel grid.
11. The method of claim 2, wherein the obtaining the weighted semantic information comprises:
obtaining the weighted semantic information according to the following formulas:
(The two weighting formulas appear only as equation images in the original publication and cannot be reproduced from the text.)
wherein P_{a,s} and P_{a,t} are the weighted semantic information;
P_2D is the second semantic information;
P_3D is the first semantic information;
MLP_att is a multi-layer perceptron;
σ is the Sigmoid activation function.
12. An obstacle sensing device, comprising:
a picture acquisition module configured to acquire a camera picture and to acquire the calibrated internal and external parameters for projection conversion;
a point cloud acquisition module configured to acquire an original point cloud;
the picture semantic segmentation module is configured for performing semantic segmentation on the camera picture to obtain a second picture;
the point cloud semantic segmentation module is configured for performing semantic segmentation on the original point cloud to obtain a second point cloud;
the image semantic projection module is configured to perform projection from the original point cloud to the second image according to the internal parameters and the external parameters to obtain a third point cloud, wherein each point in the third point cloud comprises second semantic category information corresponding to the second image;
the semantic fusion module is configured to perform voxelization on second semantic category information in the third point cloud, first semantic category information in the second point cloud and feature information of the original point cloud, and input the voxelization information into an adaptive attention mechanism network for learning to obtain weighted semantic information;
an obstacle sensing module configured to detect an obstacle target according to the weighted semantic information;
wherein the second point cloud and the second picture include obstacle category information.
13. An obstacle sensing apparatus comprising a memory, a processor and a user interface;
the memory for storing a computer program;
the user interface is used for realizing interaction with a user;
the processor for reading a computer program in the memory, the processor, when executing the computer program, implementing the obstacle sensing method according to one of claims 1 to 11.
14. A processor-readable storage medium, characterized in that the processor-readable storage medium stores a computer program which, when executed by a processor, implements an obstacle sensing method according to one of claims 1 to 11.
CN202111338928.7A 2021-11-12 2021-11-12 Obstacle sensing method and device and storage medium Active CN114140765B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111338928.7A CN114140765B (en) 2021-11-12 2021-11-12 Obstacle sensing method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111338928.7A CN114140765B (en) 2021-11-12 2021-11-12 Obstacle sensing method and device and storage medium

Publications (2)

Publication Number Publication Date
CN114140765A true CN114140765A (en) 2022-03-04
CN114140765B CN114140765B (en) 2022-06-24

Family

ID=80393919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111338928.7A Active CN114140765B (en) 2021-11-12 2021-11-12 Obstacle sensing method and device and storage medium

Country Status (1)

Country Link
CN (1) CN114140765B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116416586A (en) * 2022-12-19 2023-07-11 香港中文大学(深圳) Map element sensing method, terminal and storage medium based on RGB point cloud

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050015201A1 (en) * 2003-07-16 2005-01-20 Sarnoff Corporation Method and apparatus for detecting obstacles
US10609148B1 (en) * 2019-09-17 2020-03-31 Ha Q Tran Smart vehicle
CN111583337A (en) * 2020-04-25 2020-08-25 华南理工大学 Omnibearing obstacle detection method based on multi-sensor fusion
CN111709343A (en) * 2020-06-09 2020-09-25 广州文远知行科技有限公司 Point cloud detection method and device, computer equipment and storage medium
CN112101092A (en) * 2020-07-31 2020-12-18 北京智行者科技有限公司 Automatic driving environment sensing method and system
CN112419494A (en) * 2020-10-09 2021-02-26 腾讯科技(深圳)有限公司 Obstacle detection and marking method and device for automatic driving and storage medium
CN112560774A (en) * 2020-12-25 2021-03-26 广州文远知行科技有限公司 Obstacle position detection method, device, equipment and storage medium
CN113095172A (en) * 2021-03-29 2021-07-09 天津大学 Point cloud three-dimensional object detection method based on deep learning
CN113111887A (en) * 2021-04-26 2021-07-13 河海大学常州校区 Semantic segmentation method and system based on information fusion of camera and laser radar
CN113128348A (en) * 2021-03-25 2021-07-16 西安电子科技大学 Laser radar target detection method and system fusing semantic information

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050015201A1 (en) * 2003-07-16 2005-01-20 Sarnoff Corporation Method and apparatus for detecting obstacles
US10609148B1 (en) * 2019-09-17 2020-03-31 Ha Q Tran Smart vehicle
CN111583337A (en) * 2020-04-25 2020-08-25 华南理工大学 Omnibearing obstacle detection method based on multi-sensor fusion
CN111709343A (en) * 2020-06-09 2020-09-25 广州文远知行科技有限公司 Point cloud detection method and device, computer equipment and storage medium
CN112101092A (en) * 2020-07-31 2020-12-18 北京智行者科技有限公司 Automatic driving environment sensing method and system
CN112419494A (en) * 2020-10-09 2021-02-26 腾讯科技(深圳)有限公司 Obstacle detection and marking method and device for automatic driving and storage medium
CN112560774A (en) * 2020-12-25 2021-03-26 广州文远知行科技有限公司 Obstacle position detection method, device, equipment and storage medium
CN113128348A (en) * 2021-03-25 2021-07-16 西安电子科技大学 Laser radar target detection method and system fusing semantic information
CN113095172A (en) * 2021-03-29 2021-07-09 天津大学 Point cloud three-dimensional object detection method based on deep learning
CN113111887A (en) * 2021-04-26 2021-07-13 河海大学常州校区 Semantic segmentation method and system based on information fusion of camera and laser radar

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BAI L et al.: "Region-proposal Convolutional Network-driven Point Cloud Voxelization and Over-segmentation for 3D Object Detection", 2019 Chinese Control and Decision Conference (CCDC) *
MO J W et al.: "An obstacle-detecting algorithm based on image and 3D point cloud segmentation", 2014 International Conference on Artificial Intelligence and Industrial Application *
MO Jianwen et al.: "An obstacle detection algorithm combining image segmentation and point cloud segmentation", Computer Engineering and Design *
CHEN Junying et al.: "3D object detection with mutual-attention fusion of image and point cloud data", Optics and Precision Engineering *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116416586A (en) * 2022-12-19 2023-07-11 香港中文大学(深圳) Map element sensing method, terminal and storage medium based on RGB point cloud
CN116416586B (en) * 2022-12-19 2024-04-02 香港中文大学(深圳) Map element sensing method, terminal and storage medium based on RGB point cloud

Also Published As

Publication number Publication date
CN114140765B (en) 2022-06-24

Similar Documents

Publication Publication Date Title
CN111201451B (en) Method and device for detecting object in scene based on laser data and radar data of scene
EP3516624B1 (en) A method and system for creating a virtual 3d model
CN112292711B (en) Associating LIDAR data and image data
EP3822910A1 (en) Depth image generation method and device
EP3620966A1 (en) Object detection method and apparatus for object detection
CN112991413A (en) Self-supervision depth estimation method and system
AU2017324923A1 (en) Predicting depth from image data using a statistical model
US20220108544A1 (en) Object detection apparatus, system and method
CN111985458B (en) Method for detecting multiple targets, electronic equipment and storage medium
dos Santos Rosa et al. Sparse-to-continuous: Enhancing monocular depth estimation using occupancy maps
CN112927279A (en) Image depth information generation method, device and storage medium
EP3965052A1 (en) Device and method of training a generative neural network
EP3953903A1 (en) Scale-aware monocular localization and mapping
CN114648758A (en) Object detection method and device, computer readable storage medium and unmanned vehicle
CN114140765B (en) Obstacle sensing method and device and storage medium
CN114998856B (en) 3D target detection method, device, equipment and medium for multi-camera image
US20220327730A1 (en) Method for training neural network, system for training neural network, and neural network
Cui et al. Dense depth-map estimation based on fusion of event camera and sparse LiDAR
CN112989877A (en) Method and device for labeling object in point cloud data
CN113439289A (en) Image processing for determining the thickness of an object
CN114997264A (en) Training data generation method, model training method, model detection method, device and electronic equipment
CN115546476A (en) Multi-object detection method and data platform based on multi-scale features
CN116188349A (en) Image processing method, device, electronic equipment and storage medium
CN113869440A (en) Image processing method, apparatus, device, medium, and program product
US20230342944A1 (en) System and Method for Motion Prediction in Autonomous Driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
Inventor after: Wu Xinkai; Xu Shaoqing; Wang Pengcheng
Inventor before: Wu Xinkai; Xu Shaoqing; Wang Pengcheng
GR01 Patent grant