CN113361601A - Method for fusing perspective and aerial view characteristics based on unmanned vehicle laser radar data - Google Patents

Method for fusing perspective and aerial view characteristics based on unmanned vehicle laser radar data

Info

Publication number
CN113361601A
CN113361601A CN202110627186.3A CN202110627186A
Authority
CN
China
Prior art keywords
tensor
feature
perspective
sub
point cloud
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110627186.3A
Other languages
Chinese (zh)
Inventor
张雨 (Zhang Yu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qingzhou Zhihang Technology Co ltd
Original Assignee
Beijing Qingzhou Zhihang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qingzhou Zhihang Technology Co ltd filed Critical Beijing Qingzhou Zhihang Technology Co ltd
Priority to CN202110627186.3A
Publication of CN113361601A
Current legal status: Withdrawn


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Optical Radar Systems And Details Thereof (AREA)

Abstract

The embodiment of the invention relates to a method for fusing perspective and aerial view features based on unmanned vehicle laser radar data, which comprises the following steps: acquiring a first point cloud tensor generated by scanning a first target environment with the unmanned vehicle laser radar; performing two-dimensional voxel feature extraction processing on the first point cloud tensor to generate a first overhead view feature tensor; according to a preset perspective mode, performing corresponding two-dimensional voxel feature extraction processing on the first point cloud tensor to generate a first perspective feature tensor; performing feature fusion processing on the first overhead view feature tensor by using the first perspective feature tensor to generate a first fused feature tensor; and performing dimension reduction processing on the first fused feature tensor using a 1*1 convolutional network. The method can reduce the computational complexity of the voxelization algorithm, meet the low-latency requirement of the automatic driving field, and avoid the loss of height information about the ego vehicle's surroundings that occurs when only overhead view features are used.

Description

Method for fusing perspective and aerial view characteristics based on unmanned vehicle laser radar data
Technical Field
The invention relates to the technical field of data processing, in particular to a method for fusing perspective and aerial view characteristics based on unmanned vehicle laser radar data.
Background
Point cloud data are data that record scanning information in the form of points; each point obtained by laser radar scanning includes three-dimensional coordinates (X, Y, Z) and laser reflection intensity information (Intensity). A voxel (volume element) is the smallest unit into which a digital three-dimensional space is segmented. In a digital three-dimensional space, a solid containing voxels can be represented in the form of polygons or isosurfaces through operations such as voxel feature extraction and solid rendering. Applied to the field of automatic driving, the voxel features extracted from point cloud data can be input into an artificial intelligence model for target detection or target classification, so as to obtain classification and recognition results for objects around the driving route.
A common method for extracting point cloud voxel features is a three-dimensional voxelization method based on a Cartesian coordinate system (e.g., the VoxelNet method). However, such three-dimensional voxelization has high computational complexity and a long computation time, and cannot satisfy the low-latency requirement of the automatic driving field.
Disclosure of Invention
The invention aims to provide a method for fusing perspective and aerial view features based on unmanned vehicle laser radar data, an electronic device and a computer-readable storage medium, in which aerial-view two-dimensional feature extraction based on a Cartesian coordinate system and perspective two-dimensional feature extraction based on a perspective three-dimensional coordinate system are respectively performed on the point cloud data, and the extracted two-dimensional aerial view features and perspective features are fused so as to obtain semantic information with three-dimensional characteristics. By using the method, the computational complexity of the voxelization algorithm can be reduced, the low-latency requirement of the automatic driving field can be met, and the problem of losing height information about the ego vehicle's surroundings caused by using only overhead view features can be solved.
In order to achieve the above object, a first aspect of the embodiments of the present invention provides a method for fusion of perspective and overhead characteristics based on unmanned vehicle lidar data, where the method includes:
acquiring a first point cloud tensor generated by scanning a first target environment by the unmanned vehicle laser radar;
performing two-dimensional voxel feature extraction processing on the first point cloud tensor to generate a first overhead view feature tensor;
according to a preset perspective mode, performing corresponding two-dimensional voxel feature extraction processing on the first point cloud tensor to generate a first perspective feature tensor;
performing feature fusion processing on the first overhead view feature tensor by using the first perspective feature tensor to generate a first fused feature tensor;
performing dimension reduction processing on the first fused feature tensor using a 1*1 convolutional network.
Preferably, the shape of the first point cloud tensor is X*Y*Z*I, where X is a cross-axis coordinate dimension parameter of the three-dimensional coordinate system, Y is a longitudinal-axis coordinate dimension parameter of the three-dimensional coordinate system, Z is a vertical-axis coordinate dimension parameter of the three-dimensional coordinate system, and I is a laser reflection intensity dimension parameter;
the first overhead view feature tensor has a shape of H1*W1*C1, where H1 is a height dimension parameter, W1 is a width dimension parameter, and C1 is a channel dimension parameter;
the first perspective feature tensor has a shape of H2*W2*C2, where H2 is a height dimension parameter, W2 is a width dimension parameter, and C2 is a channel dimension parameter, with C2=C1;
the first fused feature tensor has a shape of H3*W3*C3, where H3 is a height dimension parameter, W3 is a width dimension parameter, and C3 is a channel dimension parameter, with H3=H1, W3=W1, and C3=H2*C1.
Preferably, the performing two-dimensional voxel feature extraction processing on the first point cloud tensor to generate a first overhead view feature tensor specifically includes:
using the PointPillars algorithm, performing two-dimensional voxel feature extraction in a Cartesian coordinate system on the first point cloud tensor [X*Y*Z*I] to generate the first overhead view feature tensor [H1*W1*C1].
Preferably, the performing, according to a preset perspective mode, corresponding two-dimensional voxel feature extraction processing on the first point cloud tensor to generate a first perspective feature tensor specifically includes:
when the perspective mode is a spherical mode, performing two-dimensional voxel feature extraction processing in a spherical coordinate system on the first point cloud tensor [X*Y*Z*I] to generate the first perspective feature tensor [H2*W2*C2];
when the perspective mode is a cylindrical mode, performing two-dimensional voxel feature extraction processing in a cylindrical coordinate system on the first point cloud tensor [X*Y*Z*I] to generate the first perspective feature tensor [H2*W2*C2].
Preferably, the performing feature fusion processing on the first overhead view feature tensor by using the first perspective feature tensor to generate a first fused feature tensor specifically includes:
extracting, from the first perspective feature tensor, W2 first sub-tensors [H2*1*C2] of shape H2*1*C2;
extracting, from the first overhead view feature tensor, H1*W1 second sub-tensors [1*1*C1] of shape 1*1*C1;
for each second sub-tensor [1*1*C1], selecting a corresponding one of the first sub-tensors [H2*1*C2] as a first corresponding sub-tensor [H2*1*C2];
performing feature fusion between each second sub-tensor [1*1*C1] and its first corresponding sub-tensor [H2*1*C2] to obtain a corresponding third sub-tensor [1*1*C3] of shape 1*1*C3, where C3=H2*C1;
finally, composing the first fused feature tensor [H3*W3*C3] from the H1*W1 third sub-tensors [1*1*C3] obtained, where H3=H1 and W3=W1.
Further, the performing feature fusion between each second sub-tensor [1*1*C1] and its first corresponding sub-tensor [H2*1*C2] to obtain a corresponding third sub-tensor [1*1*C3] of shape 1*1*C3 specifically includes:
extracting, from the first corresponding sub-tensor [H2*1*C2], H2 fourth sub-tensors [1*1*C2] of shape 1*1*C2;
performing tensor cross multiplication between the second sub-tensor [1*1*C1] and each of the fourth sub-tensors [1*1*C2], respectively, to obtain H2 fifth sub-tensors [1*1*C1] of shape 1*1*C1;
performing channel merging on the H2 fifth sub-tensors [1*1*C1] to obtain the third sub-tensor [1*1*C3] of shape 1*1*C3, where C3=H2*C1.
A second aspect of an embodiment of the present invention provides an electronic device, including: a memory, a processor, and a transceiver;
the processor is configured to be coupled to the memory, read and execute instructions in the memory, so as to implement the method steps of the first aspect;
the transceiver is coupled to the processor, and the processor controls the transceiver to transmit and receive messages.
A third aspect of embodiments of the present invention provides a computer-readable storage medium storing computer instructions that, when executed by a computer, cause the computer to perform the method of the first aspect.
The embodiment of the invention provides a method for fusing perspective and aerial view characteristics based on unmanned vehicle laser radar data, electronic equipment and a computer readable storage medium. By using the method, the calculation complexity of the voxelization algorithm is reduced, the requirement of low time delay in the field of automatic driving is met, and the problem of height information loss of the surrounding environment of the vehicle caused by only using the overlooking characteristic is solved.
Drawings
Fig. 1 is a schematic diagram of a method for fusion of perspective and aerial view characteristics based on unmanned vehicle lidar data according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an electronic device according to a second embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the present invention provides a method for fusing perspective and overhead view characteristics based on unmanned vehicle lidar data, as shown in fig. 1, which is a schematic diagram of the method for fusing perspective and overhead view characteristics based on unmanned vehicle lidar data according to the embodiment of the present invention, and the method mainly includes the following steps:
step 1, acquiring a first point cloud tensor generated by scanning a first target environment by an unmanned vehicle laser radar;
the shape of the first point cloud tensor is X*Y*Z*I, where X is a cross-axis coordinate dimension parameter of the three-dimensional coordinate system, Y is a longitudinal-axis coordinate dimension parameter of the three-dimensional coordinate system, Z is a vertical-axis coordinate dimension parameter of the three-dimensional coordinate system, and I is a laser reflection intensity dimension parameter.
Here, as described above, each point cloud datum includes three-dimensional coordinates and a laser reflection intensity. During target environment scanning, the unmanned vehicle radar scans the specified scanning range at a set scanning frequency to obtain a plurality of scan data; the scan data are converted into point cloud three-dimensional coordinates to obtain a plurality of point cloud data; and a calculation tensor is created for all the point cloud data in the form of three-dimensional coordinates plus scanning intensity, yielding the first point cloud tensor. If the laser reflection intensity is a scalar, the shape of the first point cloud tensor is X*Y*Z*1; if the laser reflection intensity is a vector, the shape of the first point cloud tensor is X*Y*Z*I.
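To make this concrete, the following is a minimal sketch of how such a first point cloud tensor could be assembled, assuming the raw scan arrives as an N x 4 array of (x, y, z, intensity) points and that the tensor is a dense X*Y*Z*1 grid of scalar intensities; the function name, coordinate ranges, cell size and last-write aggregation rule are illustrative assumptions, not values prescribed by the embodiment:

```python
import numpy as np

def build_point_cloud_tensor(points, x_range, y_range, z_range, cell=0.2):
    """Sketch: scatter raw lidar points (an N x 4 array of x, y, z, intensity)
    into a dense X*Y*Z*1 grid, keeping the last scalar intensity seen per cell.
    Ranges, cell size and the aggregation rule are illustrative assumptions."""
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    nz = int((z_range[1] - z_range[0]) / cell)
    grid = np.zeros((nx, ny, nz, 1), dtype=np.float32)

    ix = np.floor((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = np.floor((points[:, 1] - y_range[0]) / cell).astype(int)
    iz = np.floor((points[:, 2] - z_range[0]) / cell).astype(int)
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny) & (iz >= 0) & (iz < nz)
    grid[ix[keep], iy[keep], iz[keep], 0] = points[keep, 3]  # scalar intensity channel
    return grid
```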
Step 2, performing two-dimensional voxel feature extraction processing on the first point cloud tensor to generate a first overhead view feature tensor;
wherein the first overhead view feature tensor has a shape of H1*W1*C1, H1 being a height dimension parameter, W1 a width dimension parameter, and C1 a channel dimension parameter;
The step specifically comprises: using the PointPillars algorithm, performing two-dimensional voxel feature extraction in a Cartesian coordinate system on the first point cloud tensor [X*Y*Z*I] to generate the first overhead view feature tensor [H1*W1*C1].
Here, the principle of the PointPillars algorithm can be found in the technical paper "PointPillars: Fast Encoders for Object Detection from Point Clouds". The embodiment of the invention uses a trained PointPillars model to carry out both the projection of the point cloud's Cartesian-coordinate-system three-dimensional coordinates onto two-dimensional coordinates and the extraction of two-dimensional features from the projected point cloud. Specifically, the trained PointPillars model first obtains the two-dimensional maxima (the X-direction maximum and the Y-direction maximum) of the input first point cloud tensor on the xy plane of the overhead view; then an H1*W1 grid graph is drawn according to the X-direction maximum, the Y-direction maximum and the set grid cell size, each grid cell being set to correspond to one pillar tensor; then, referring to the x/y two-dimensional coordinates of each point cloud datum in the first point cloud tensor, each point is assigned to its corresponding pillar so as to fill each pillar tensor, which is also the process by which the model projects the three-dimensional coordinates of the point cloud onto two-dimensional coordinates; then, according to a set sampling strategy, data preprocessing such as noise filtering and sampling is carried out on the point cloud data in each pillar tensor, the preprocessed pillar tensors are input into a convolutional network for feature extraction, and the final two-dimensional feature tensor, namely the first overhead view feature tensor [H1*W1*C1], is obtained.
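As a rough illustration of the gridding part of this step only (the PointNet-style pillar encoder and the convolutional feature network of PointPillars are not sketched), the following assumes points are (x, y, z, intensity) tuples; the function and parameter names are assumptions, not part of the PointPillars interface:

```python
import math

def pillarize_bev(points, x_range, y_range, grid_h, grid_w):
    """Sketch of the overhead-view gridding described above: assign each point
    to one of H1*W1 grid cells (pillars) by its x/y coordinates. The per-pillar
    feature encoder that produces the C1-channel map is omitted; all names and
    parameters are illustrative assumptions."""
    cell_x = (x_range[1] - x_range[0]) / grid_w
    cell_y = (y_range[1] - y_range[0]) / grid_h
    pillars = [[[] for _ in range(grid_w)] for _ in range(grid_h)]
    for x, y, z, intensity in points:
        col = math.floor((x - x_range[0]) / cell_x)
        row = math.floor((y - y_range[0]) / cell_y)
        if 0 <= row < grid_h and 0 <= col < grid_w:
            pillars[row][col].append((x, y, z, intensity))
    return pillars  # each non-empty cell would then be encoded into a C1-dim feature
```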
Step 3, according to a preset perspective mode, performing corresponding two-dimensional voxel feature extraction processing on the first point cloud tensor to generate a first perspective feature tensor;
wherein the first perspective feature tensor has a shape of H2*W2*C2, H2 being a height dimension parameter, W2 a width dimension parameter, and C2 a channel dimension parameter, with C2=C1.
Here, the perspective mode according to the embodiment of the present invention supports two options: a spherical mode, in which a spherical coordinate system is used as the perspective structure, and a cylindrical mode, in which a cylindrical coordinate system is used as the perspective structure;
the method specifically comprises the following steps: step 31, when the perspective mode is the spherical mode, the first point cloud tensor [ X Y Z I ] is processed]Two-dimensional voxel characteristic extraction processing of a spherical coordinate system is carried out to generate a first perspective characteristic tensor [ H2*W2*C2];
Here, the perspective mode is a spherical mode, which means that when two-dimensional feature extraction of spherical three-dimensional coordinates is performed on the input first point cloud tensor, spherical coordinate conversion is performed on the input first point cloud tensor; then the spherical surface is unfolded and marked into the shape of H2*W2Each grid cell is set to correspond to a cell tensor; then, referring to the spherical coordinates of each point cloud data in the first point cloud tensor, projecting each point cloud to a corresponding grid unit, and completing filling of each unit tensor; then according to a set sampling strategy, carrying out data preprocessing such as noise filtering and sampling on the point cloud data in each unit tensor, and inputting the preprocessed unit tensor into a convolution networkExtracting line features to obtain a final two-dimensional feature tensor, namely a first perspective feature tensor [ H ]2*W2*C2];
Step 32, when the perspective mode is the cylindrical mode, performing two-dimensional voxel feature extraction processing in a cylindrical coordinate system on the first point cloud tensor [X*Y*Z*I] to generate the first perspective feature tensor [H2*W2*C2].
Here, the cylindrical perspective mode means that, to perform two-dimensional feature extraction in cylindrical three-dimensional coordinates on the input first point cloud tensor, cylindrical coordinate conversion is first applied to it; the cylinder is then unrolled and divided into an H2*W2 grid, each grid cell being set to correspond to one cell tensor; then, referring to the cylindrical coordinates of each point cloud datum in the first point cloud tensor, each point is projected onto its corresponding grid cell so as to fill each cell tensor; then, according to a set sampling strategy, data preprocessing such as noise filtering and sampling is carried out on the point cloud data in each cell tensor, the preprocessed cell tensors are input into a convolutional network for feature extraction, and the final two-dimensional feature tensor, namely the first perspective feature tensor [H2*W2*C2], is obtained.
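The coordinate conversions of steps 31 and 32 can be sketched as a single routine that assigns every point a row and column of the H2*W2 perspective grid; the vertical field-of-view limits, height range and function name below are illustrative assumptions rather than values fixed by the embodiment:

```python
import numpy as np

def perspective_grid_indices(points, grid_h, grid_w, mode="spherical"):
    """Sketch of the coordinate conversion in steps 31/32: map each point's
    Cartesian (x, y, z) to a row/column of the H2*W2 perspective grid. Azimuth
    indexes the columns in both modes; elevation (spherical) or height z
    (cylindrical) indexes the rows. Binning ranges are illustrative assumptions."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    azimuth = np.arctan2(y, x)                                   # in [-pi, pi)
    col = ((azimuth + np.pi) / (2 * np.pi) * grid_w).astype(int) % grid_w
    if mode == "spherical":
        elevation = np.arctan2(z, np.sqrt(x ** 2 + y ** 2))
        lo, hi = np.radians(-25.0), np.radians(3.0)              # assumed vertical FoV
        row = ((elevation - lo) / (hi - lo) * grid_h).astype(int)
    else:  # cylindrical: the vertical axis is the z coordinate itself
        z_lo, z_hi = -3.0, 1.0                                   # assumed height range (m)
        row = ((z - z_lo) / (z_hi - z_lo) * grid_h).astype(int)
    row = np.clip(row, 0, grid_h - 1)
    return row, col  # points falling in the same (row, col) cell fill one cell tensor
```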
Step 4, performing feature fusion processing on the first overhead view feature tensor by using the first perspective feature tensor to generate a first fused feature tensor;
wherein the first fused feature tensor has a shape of H3*W3*C3, H3 being a height dimension parameter, W3 a width dimension parameter, and C3 a channel dimension parameter, with H3=H1, W3=W1, and C3=H2*C1.
Here, the two-dimensional perspective features, which carry the perspective relation, are used to perform feature fusion on the two-dimensional overhead view features, so that a vertical-dimension feature perpendicular to the horizontal/longitudinal dimensions is obtained on top of the two-dimensional overhead view features, thereby adding three-dimensional semantic information to the two-dimensional overhead view features;
the method specifically comprises the following steps: step 41, extracting W from the first perspective feature tensor2Is H in shape2*1*C2First sub-tensor [ H ] of2*1*C2];
Here, if a grid graph is drawn with H2 as the maximum of its vertical axis and W2 as the maximum of its horizontal axis, a first perspective grid map of H2 rows and W2 orthogonal columns is obtained; each first sub-tensor [H2*1*C2] then actually corresponds to one column of this first perspective grid map;
Step 42, extracting, from the first overhead view feature tensor, H1*W1 second sub-tensors [1*1*C1] of shape 1*1*C1;
Here, if a grid graph is drawn with H1 as the maximum of its vertical axis and W1 as the maximum of its horizontal axis, a first overhead grid map of H1 rows and W1 orthogonal columns is obtained; each second sub-tensor [1*1*C1] then actually corresponds to one cell of this first overhead grid map;
Step 43, for each second sub-tensor [1*1*C1], selecting a corresponding one of the first sub-tensors [H2*1*C2] as a first corresponding sub-tensor [H2*1*C2];
Step 44, performing feature fusion between each second sub-tensor [1*1*C1] and its first corresponding sub-tensor [H2*1*C2] to obtain a corresponding third sub-tensor [1*1*C3] of shape 1*1*C3, where C3=H2*C1;
Here, in effect, each one-dimensional overhead feature vector is cross-multiplied with a column of one-dimensional perspective feature vectors, and all the resulting feature tensors carrying third-dimension information are then combined into a single one-dimensional vector holding multiple third-dimension features, namely the third sub-tensor [1*1*C3];
This specifically comprises the following steps: Step 441, extracting, from the first corresponding sub-tensor [H2*1*C2], H2 fourth sub-tensors [1*1*C2] of shape 1*1*C2;
Step 442, performing tensor cross multiplication between the second sub-tensor [1*1*C1] and each of the fourth sub-tensors [1*1*C2], respectively, to obtain H2 fifth sub-tensors [1*1*C1] of shape 1*1*C1;
Here, this is the process of cross-multiplying the one-dimensional overhead feature vector with each vector in a column of one-dimensional perspective feature vectors;
Step 443, performing channel merging on the H2 fifth sub-tensors [1*1*C1] to obtain the third sub-tensor [1*1*C3] of shape 1*1*C3, where C3=H2*C1;
Here, this concatenates all the cross-multiplication results along the channel dimension;
Step 45, finally composing the first fused feature tensor [H3*W3*C3] from the H1*W1 third sub-tensors [1*1*C3] obtained, where H3=H1 and W3=W1.
Here, the process is to integrate all the one-dimensional overhead feature vectors that have completed the third-dimensional feature fusion.
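A minimal sketch of steps 41 through 45 follows. It assumes that the "tensor cross multiplication" of two 1*1*C vectors yielding another 1*1*C vector is an element-wise product, and that the mapping from an overhead cell to its corresponding perspective column is supplied externally; neither detail is fixed by the text:

```python
import numpy as np

def fuse_bev_with_perspective(bev, persp, column_of):
    """Sketch of steps 41-45: bev has shape (H1, W1, C1), persp has shape
    (H2, W2, C2) with C2 == C1, and column_of(i, j) returns the perspective
    column matched to overhead cell (i, j). The cell-to-column mapping and the
    reading of 'tensor cross multiplication' as an element-wise product are
    assumptions; channel merging is concatenation."""
    H1, W1, C1 = bev.shape
    H2, W2, C2 = persp.shape
    fused = np.zeros((H1, W1, H2 * C1), dtype=bev.dtype)
    for i in range(H1):
        for j in range(W1):
            second = bev[i, j]                    # second sub-tensor [1*1*C1]
            corresp = persp[:, column_of(i, j)]   # first corresponding sub-tensor [H2*1*C2]
            fifths = second[None, :] * corresp    # H2 fifth sub-tensors [1*1*C1]
            fused[i, j] = fifths.reshape(-1)      # channel merge -> [1*1*C3], C3 = H2*C1
    return fused
```

A plausible choice of column_of would map an overhead cell to the azimuth sector it occupies around the ego vehicle, mirroring how the columns of the perspective grid are laid out in steps 31 and 32.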
Step 5, performing dimension reduction processing on the first fused feature tensor [H3*W3*C3] using a 1*1 convolutional network.
Here, a 1*1 convolutional network is commonly used to perform dimension reduction on a high-dimensional tensor.
After that, the dimension-reduced first fused feature tensor can be input, as the three-dimensional voxel feature information of the point cloud, into an artificial intelligence model for target detection or target classification, so as to obtain classification and recognition results for objects around the driving route.
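As a minimal sketch of the dimension-reduction step in step 5 (written in PyTorch; the tensor sizes and output channel count are illustrative assumptions, not values from the patent):

```python
import torch
import torch.nn as nn

# Minimal sketch of step 5. The fused tensor is H3*W3*C3 in the text, whereas
# Conv2d expects (N, C, H, W), so the channel axis is moved first. H3, W3, H2,
# C1 and the output channel count are illustrative assumptions.
H3, W3, H2, C1 = 64, 64, 8, 32
fused = torch.randn(H3, W3, H2 * C1)                    # [H3*W3*C3] with C3 = H2*C1
x = fused.permute(2, 0, 1).unsqueeze(0)                 # -> (1, C3, H3, W3)
reduce_1x1 = nn.Conv2d(in_channels=H2 * C1, out_channels=C1, kernel_size=1)
reduced = reduce_1x1(x)                                 # -> (1, C1, H3, W3)
```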
Fig. 2 is a schematic structural diagram of an electronic device according to a second embodiment of the present invention. The electronic device may be the terminal device or the server, or may be a terminal device or a server connected to the terminal device or the server and implementing the method according to the embodiment of the present invention. As shown in fig. 2, the electronic device may include: a processor 301 (e.g., a CPU), a memory 302, a transceiver 303; the transceiver 303 is coupled to the processor 301, and the processor 301 controls the transceiving operation of the transceiver 303. Various instructions may be stored in memory 302 for performing various processing functions and implementing the processing steps described in the foregoing method embodiments. Preferably, the electronic device according to an embodiment of the present invention further includes: a power supply 304, a system bus 305, and a communication port 306. The system bus 305 is used to implement communication connections between the elements. The communication port 306 is used for connection communication between the electronic device and other peripherals.
The system bus 305 mentioned in fig. 2 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The system bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus. The communication interface is used for realizing communication between the database access device and other devices (such as a client, a read-write library and a read-only library). The memory may include Random Access Memory (RAM) and may also include Non-Volatile Memory (NVM), such as at least one disk memory.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), a Graphics Processing Unit (GPU), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It should be noted that the embodiment of the present invention also provides a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to execute the method and the processing procedure provided in the above-mentioned embodiment.
The embodiment of the present invention further provides a chip for executing the instructions, where the chip is configured to execute the processing steps described in the foregoing method embodiment.
The embodiment of the invention provides a method for fusing perspective and aerial view characteristics based on unmanned vehicle laser radar data, electronic equipment and a computer readable storage medium. By using the method, the calculation complexity of the voxelization algorithm is reduced, the requirement of low time delay in the field of automatic driving is met, and the problem of height information loss of the surrounding environment of the vehicle caused by only using the overlooking characteristic is solved.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above-mentioned embodiments, objects, technical solutions and advantages of the present invention are further described in detail, it should be understood that the above-mentioned embodiments are only examples of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. A method for fusion of perspective and aerial view features based on unmanned vehicle lidar data, the method comprising:
acquiring a first point cloud tensor generated by scanning a first target environment by the unmanned vehicle laser radar;
performing two-dimensional voxel feature extraction processing on the first point cloud tensor to generate a first overhead view feature tensor;
according to a preset perspective mode, carrying out corresponding two-dimensional voxel feature extraction processing on the first point cloud tensor to generate a first perspective feature tensor;
performing feature fusion processing on the first overhead view feature tensor by using the first perspective feature tensor to generate a first fused feature tensor;
performing dimension reduction processing on the first fused feature tensor using a 1*1 convolutional network.
2. The method for perspective and aerial view feature fusion based on unmanned vehicle lidar data of claim 1,
the shape of the first point cloud tensor is X*Y*Z*I, wherein X is a cross-axis coordinate dimension parameter of a three-dimensional coordinate system, Y is a longitudinal-axis coordinate dimension parameter of the three-dimensional coordinate system, Z is a vertical-axis coordinate dimension parameter of the three-dimensional coordinate system, and I is a laser reflection intensity dimension parameter;
the first overhead view feature tensor has a shape of H1*W1*C1, wherein H1 is a height dimension parameter, W1 is a width dimension parameter, and C1 is a channel dimension parameter;
the first perspective feature tensor has a shape of H2*W2*C2, wherein H2 is a height dimension parameter, W2 is a width dimension parameter, and C2 is a channel dimension parameter, with C2=C1;
the first fused feature tensor has a shape of H3*W3*C3, wherein H3 is a height dimension parameter, W3 is a width dimension parameter, and C3 is a channel dimension parameter, with H3=H1, W3=W1, and C3=H2*C1.
3. The method for fusion of perspective and overhead view features based on the unmanned vehicle lidar data according to claim 2, wherein the generating a first overhead view feature tensor by performing the two-dimensional voxel feature extraction processing on the first point cloud tensor specifically comprises:
using the PointPillars algorithm, performing two-dimensional voxel feature extraction in a Cartesian coordinate system on the first point cloud tensor [X*Y*Z*I] to generate the first overhead view feature tensor [H1*W1*C1].
4. The method according to claim 2, wherein the generating a first perspective feature tensor by performing corresponding two-dimensional voxel feature extraction processing on the first point cloud tensor according to a preset perspective mode specifically includes:
when the perspective mode is a spherical mode, performing two-dimensional voxel feature extraction processing in a spherical coordinate system on the first point cloud tensor [X*Y*Z*I] to generate the first perspective feature tensor [H2*W2*C2];
when the perspective mode is a cylindrical mode, performing two-dimensional voxel feature extraction processing in a cylindrical coordinate system on the first point cloud tensor [X*Y*Z*I] to generate the first perspective feature tensor [H2*W2*C2].
5. The method according to claim 2, wherein the generating a first fusion feature tensor by performing feature fusion processing on the first overhead feature tensor using the first perspective feature tensor includes:
extracting, from the first perspective feature tensor, W2 first sub-tensors [H2*1*C2] of shape H2*1*C2;
extracting, from the first overhead view feature tensor, H1*W1 second sub-tensors [1*1*C1] of shape 1*1*C1;
for each second sub-tensor [1*1*C1], selecting a corresponding one of the first sub-tensors [H2*1*C2] as a first corresponding sub-tensor [H2*1*C2];
performing feature fusion between each second sub-tensor [1*1*C1] and its first corresponding sub-tensor [H2*1*C2] to obtain a corresponding third sub-tensor [1*1*C3] of shape 1*1*C3, wherein C3=H2*C1;
finally, composing the first fused feature tensor [H3*W3*C3] from the H1*W1 third sub-tensors [1*1*C3] obtained, wherein H3=H1 and W3=W1.
6. The method for perspective and overhead view feature fusion based on unmanned vehicle lidar data of claim 5, wherein the performing feature fusion between each second sub-tensor [1*1*C1] and its first corresponding sub-tensor [H2*1*C2] to obtain a corresponding third sub-tensor [1*1*C3] of shape 1*1*C3 specifically comprises:
extracting, from the first corresponding sub-tensor [H2*1*C2], H2 fourth sub-tensors [1*1*C2] of shape 1*1*C2;
performing tensor cross multiplication between the second sub-tensor [1*1*C1] and each of the fourth sub-tensors [1*1*C2], respectively, to obtain H2 fifth sub-tensors [1*1*C1] of shape 1*1*C1;
performing channel merging on the H2 fifth sub-tensors [1*1*C1] to obtain the third sub-tensor [1*1*C3] of shape 1*1*C3, wherein C3=H2*C1.
7. An electronic device, comprising: a memory, a processor, and a transceiver;
the processor is used for being coupled with the memory, reading and executing the instructions in the memory to realize the method steps of any one of claims 1-6;
the transceiver is coupled to the processor, and the processor controls the transceiver to transmit and receive messages.
8. A computer-readable storage medium having stored thereon computer instructions which, when executed by a computer, cause the computer to perform the method of any of claims 1-6.
CN202110627186.3A 2021-06-04 2021-06-04 Method for fusing perspective and aerial view characteristics based on unmanned vehicle laser radar data Withdrawn CN113361601A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110627186.3A CN113361601A (en) 2021-06-04 2021-06-04 Method for fusing perspective and aerial view characteristics based on unmanned vehicle laser radar data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110627186.3A CN113361601A (en) 2021-06-04 2021-06-04 Method for fusing perspective and aerial view characteristics based on unmanned vehicle laser radar data

Publications (1)

Publication Number Publication Date
CN113361601A true CN113361601A (en) 2021-09-07

Family

ID=77532406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110627186.3A Withdrawn CN113361601A (en) 2021-06-04 2021-06-04 Method for fusing perspective and aerial view characteristics based on unmanned vehicle laser radar data

Country Status (1)

Country Link
CN (1) CN113361601A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023165220A1 (en) * 2022-03-04 2023-09-07 京东鲲鹏(江苏)科技有限公司 Target object detection method and apparatus


Similar Documents

Publication Publication Date Title
CN109901567B (en) Method and apparatus for outputting obstacle information
US20210270609A1 (en) Method, apparatus, computing device and computer-readable storage medium for positioning
CN110264502B (en) Point cloud registration method and device
CN111709923B (en) Three-dimensional object detection method, three-dimensional object detection device, computer equipment and storage medium
US20230386076A1 (en) Target detection method, storage medium, electronic device, and vehicle
CN113420637A (en) Laser radar detection method under multi-scale aerial view angle in automatic driving
US10467359B2 (en) Special-purpose programmed computer for numerical simulation of a metal forming process having a predefined load path with corresponding mesh adjustment scheme
CN114091521B (en) Method, device and equipment for detecting vehicle course angle and storage medium
CN113361601A (en) Method for fusing perspective and aerial view characteristics based on unmanned vehicle laser radar data
CN115909269A (en) Three-dimensional target detection method and device and computer storage medium
CN111488783A (en) Method and device for detecting pseudo-3D bounding box based on CNN
EP4207072A1 (en) Three-dimensional data augmentation method, model training and detection method, device, and autonomous vehicle
CN117031491A (en) Map construction method and device, automatic navigation trolley and electronic equipment
CN116188931A (en) Processing method and device for detecting point cloud target based on fusion characteristics
CN115761425A (en) Target detection method, device, terminal equipment and computer readable storage medium
CN115856874A (en) Millimeter wave radar point cloud noise reduction method, device, equipment and storage medium
WO2022017129A1 (en) Target object detection method and apparatus, electronic device, and storage medium
CN114663478A (en) Method for estimating anchor point position according to multi-reference point prediction information
CN115527187A (en) Method and device for classifying obstacles
CN114966736A (en) Processing method for predicting target speed based on point cloud data
CN114549764A (en) Obstacle identification method, device, equipment and storage medium based on unmanned vehicle
CN114140660A (en) Vehicle detection method, device, equipment and medium
CN114820416A (en) Vehicle course angle calculation method, vehicle pose calculation method, device and equipment
CN113129437B (en) Method and device for determining space coordinates of markers
CN116740682B (en) Vehicle parking route information generation method, device, electronic equipment and readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210907