CN113240038A - Point cloud target detection method based on height-channel feature enhancement - Google Patents

Point cloud target detection method based on height-channel feature enhancement

Info

Publication number
CN113240038A
CN113240038A
Authority
CN
China
Prior art keywords
point cloud
feature vector
multiplied
backbone network
height
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110605139.9A
Other languages
Chinese (zh)
Other versions
CN113240038B (en)
Inventor
张静
王佳军
许达
李云松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202110605139.9A priority Critical patent/CN113240038B/en
Publication of CN113240038A publication Critical patent/CN113240038A/en
Application granted granted Critical
Publication of CN113240038B publication Critical patent/CN113240038B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention discloses a point cloud target detection method based on height-channel feature enhancement, which addresses two problems of existing detection methods: the information loss caused by compressing the point cloud space, and detection performance limited by the near-dense, far-sparse distribution of point clouds. The method comprises the following steps: (1) converting point cloud data blocks received in real time from a lidar into aggregated feature vectors; (2) extracting the attention weight of the height dimension of the aggregated feature vector; (3) extracting the attention weight of the channel dimension of the aggregated feature vector; (4) weighting the aggregated feature vector; (5) constructing a backbone network; (6) training the backbone network; (7) detecting point cloud targets. By dividing the point cloud data equally into four parts, the invention reduces the information loss of the point cloud data; by extracting attention weights and weighting the feature vectors, it enhances the key features of the point cloud and improves the average precision of point cloud target detection.

Description

Point cloud target detection method based on height-channel feature enhancement
Technical Field
The invention belongs to the technical field of radar, and more particularly relates to a point cloud target detection method based on height-channel feature enhancement in the field of radar target detection. By enhancing point cloud features, the invention can detect distant targets in a point cloud scene.
Background
As the basic data format output by lidar, a point cloud preserves the original geometric information of three-dimensional space and provides rich shape and scale information. It is therefore a preferred representation for scene perception and understanding in autonomous driving and robotics. Owing to the scanning characteristics of lidar, every 360° of point cloud data forms a three-dimensional point cloud scene. However, because point clouds are unevenly distributed, dense near the sensor and sparse far from it, detecting distant sparse targets in a point cloud scene remains a major challenge, and how to detect them accurately is an urgent problem in this technical field.
A point cloud target detection method based on a front-view grid is disclosed in the patent document "Laser radar point cloud target detection method, system and device" (application No. CN2020110603176, publication No. CN112183393A) filed by Shenlan Artificial Intelligence (Shenzhen) Co., Ltd. The specific steps are as follows: 1. An input feature construction step: a front-view grid is constructed for the acquired point cloud, each point within the field of view is projected into the grid, and the statistics of the point closest to the lidar center in each cell are recorded as that cell's feature value, forming the grid input features; the coordinates and reflection intensity of the point closest to the lidar center in each cell are extracted to obtain the point cloud input features. 2. A feature extraction step: a convolutional neural network extracts the front-view grid output features from the grid input features; a point cloud feature extraction network adjusts and outputs per-point features; using the correspondence between points and grid cells, the point cloud output features are projected into a feature map of the same size as the grid output feature map while keeping the original dimensions, and the front-view grid output features are combined with the point cloud output features. 3. A detector step: three-dimensional objects are detected as obstacles based on the front view. By combining point cloud features with front-view grid features, this method alleviates the loss of inter-point information that grid features alone would cause. However, it remains limited by the near-dense, far-sparse distribution of point clouds: distant targets have few points, which can cause missed detections of distant targets.
Lang et al., in their paper "PointPillars: Fast Encoders for Object Detection from Point Clouds" (Computer Vision and Pattern Recognition, CVPR 2019 IEEE Conference on), disclose a point cloud target detection method based on pillar division. The specific steps are as follows: 1. dividing each frame of point cloud data into pillars; 2. extracting the features within each pillar with a neural network to generate an initial point cloud feature map; 3. using a 2D convolutional neural network as the backbone to extract features from the initial feature map; 4. predicting the targets in the point cloud from the extracted features. By dividing the point cloud space into pillars, this method achieves high detection efficiency. However, pillar division compresses the entire point cloud space along the vertical (height) direction, and this compression can lose point cloud information, leading to false or missed detections of point cloud targets.
Disclosure of Invention
The invention aims to provide a point cloud target detection method based on height-channel feature enhancement that addresses the shortcomings of the prior art: the information loss caused by compressing the point cloud space, and detection performance limited by the near-dense, far-sparse distribution of point clouds.
The idea behind the invention is as follows: the point cloud data to be detected is split into four parts along the vertical (height) direction to obtain four point cloud feature vectors, and the point cloud features are enhanced by attention weighting. The four-way split preserves point cloud information at a finer granularity, reducing information loss while maintaining detection speed; attention weighting of the feature vectors along the height and channel dimensions enhances the key features of the point cloud to counteract its near-dense, far-sparse distribution.
The specific steps of the invention are as follows:
(1) converting point cloud data blocks received in real time from the lidar into aggregated feature vectors;
(2) extracting the attention weight of the height dimension of the aggregated feature vector:
(2a) compressing the channel dimension of the aggregated feature vector to 1 with a max-pooling operation to obtain a height feature vector of size 496 × 432 × 1 × 4;
(2b) inputting the height feature vector into a convolutional layer and outputting the height-dimension attention weight of size 496 × 432 × 4 for the aggregated feature vector;
(3) extracting the attention weight of the channel dimension of the aggregated feature vector:
(3a) compressing the height dimension of the aggregated feature vector to 1 with a max-pooling operation to obtain a channel feature vector of size 496 × 432 × 32 × 1;
(3b) inputting the channel feature vector into a convolutional layer and outputting the channel-dimension attention weight of size 496 × 432 × 32 for the aggregated feature vector;
(4) weighting the aggregated feature vector:
(4a) cross-multiplying the height-dimension attention weight with the channel-dimension attention weight to obtain an aggregate attention weight of size 496 × 432 × 32 × 4;
(4b) multiplying the aggregate attention weight element-wise with the aggregated feature vector to obtain a weighted feature vector of size 496 × 432 × 32 × 4;
(4c) compressing the height dimension of the weighted feature vector to 1 with a max-pooling operation to obtain an enhanced feature vector of size 496 × 432 × 32;
(5) constructing a backbone network:
building a PointPillars backbone network and setting its number of input channels to 32;
(6) training the backbone network:
inputting the enhanced feature vector into the backbone network and iteratively updating the network parameters with the Adam optimization algorithm until the loss function of the backbone network converges, yielding the trained backbone network;
(7) detecting the point cloud target:
(7a) converting the point cloud data to be detected into an aggregated feature vector using the same method as step (1);
(7b) weighting the aggregated feature vector using the same method as steps (2), (3), and (4) to obtain an enhanced feature vector;
(7c) inputting the enhanced feature vector into the trained backbone network to complete point cloud target detection.
Compared with the prior art, the invention has the following advantages:
First, the invention divides the point cloud data block equally into four parts along the vertical direction to obtain four sets of point cloud features, overcoming the information loss that arises in the prior art when point cloud data is compressed into a single set of features; this reduces false detections of point cloud targets and improves the average precision of point cloud target detection.
Second, the invention extracts the attention weight of the height dimension and of the channel dimension of the aggregated feature vector and weights the aggregated feature vector accordingly, enhancing the key features of the point cloud; this overcomes the prior art's limitation by the near-dense, far-sparse distribution of point clouds, reduces missed detections of distant targets, and improves the average precision of point cloud target detection.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The implementation steps of the invention are described further with reference to Fig. 1.
Step 1: converting point cloud data blocks received in real time from the lidar into aggregated feature vectors.
Using pass-through filtering, a point cloud data block of length 79.36 m, width 69.12 m, and height 4 m is obtained from the point cloud received in real time from the lidar.
The point cloud data block is divided equally into 4 slices along the vertical direction.
Each slice is divided uniformly into pillars of equal size, each 0.16 m long, 0.16 m wide, and 1 m high.
Each pillar is input into a trained PointNet network, which outputs the 32-dimensional point cloud feature of that pillar.
The 32-dimensional point cloud features of the pillars in each slice are arranged, according to their division positions, into a feature vector of size 496 × 432 × 32 for that slice.
The four feature vectors are spliced into an aggregated feature vector of size 496 × 432 × 32 × 4.
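To make Step 1 concrete, the sketch below shows how the four per-slice pillar feature maps could be assembled into the 496 × 432 × 32 × 4 aggregated feature vector. This is a minimal PyTorch sketch; the per-pillar PointNet encoder and the pillar-to-grid coordinates (`slice_features`, `slice_coords`) are assumed inputs, not specified by the patent.

```python
import torch

def build_aggregated_features(slice_features, slice_coords, H=496, W=432, C=32):
    """Assemble the aggregated feature vector of Step 1.

    slice_features: list of 4 tensors, each (num_pillars_i, C), holding the
        32-dim PointNet features of the non-empty pillars in one height slice.
    slice_coords: list of 4 long tensors, each (num_pillars_i, 2), holding
        the (row, col) grid index of each pillar on the 496 x 432 grid.
    """
    slices = []
    for feats, coords in zip(slice_features, slice_coords):
        canvas = torch.zeros(H, W, C)                 # empty 496 x 432 x 32 grid
        canvas[coords[:, 0], coords[:, 1]] = feats    # scatter pillar features to grid
        slices.append(canvas)
    return torch.stack(slices, dim=-1)                # (496, 432, 32, 4)
```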
Step 2: extracting the attention weight of the height dimension of the aggregated feature vector.
Using a max-pooling operation, the channel dimension of the aggregated feature vector is compressed to 1, yielding a height feature vector of size 496 × 432 × 1 × 4.
The height feature vector is input into the first convolutional layer, which outputs the height-dimension attention weight of size 496 × 432 × 4 for the aggregated feature vector.
The first convolutional layer is a 2D convolutional layer with a 1 × 1 kernel, 4 input channels, and 4 output channels.
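A minimal PyTorch sketch of Step 2, assuming a batched tensor of the stated shape: max-pooling removes the channel dimension, and a 1 × 1 convolution produces the height attention weights. The module name and batch handling are illustrative, not from the patent.

```python
import torch
import torch.nn as nn

class HeightAttention(nn.Module):
    """Height-dimension attention of Step 2 (sizes as stated in the text)."""
    def __init__(self, num_slices=4):
        super().__init__()
        # 1 x 1 2D convolution, 4 input channels, 4 output channels
        self.conv = nn.Conv2d(num_slices, num_slices, kernel_size=1)

    def forward(self, agg):                   # agg: (N, 496, 432, 32, 4)
        h = agg.max(dim=3).values             # channels pooled to 1 -> (N, 496, 432, 4)
        h = h.permute(0, 3, 1, 2)             # (N, 4, 496, 432) for Conv2d
        w = self.conv(h)                      # height attention weights
        return w.permute(0, 2, 3, 1)          # back to (N, 496, 432, 4)
```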
Step 3: extracting the attention weight of the channel dimension of the aggregated feature vector.
Using a max-pooling operation, the height dimension of the aggregated feature vector is compressed to 1, yielding a channel feature vector of size 496 × 432 × 32 × 1.
The channel feature vector is input into the second convolutional layer, which outputs the channel-dimension attention weight of size 496 × 432 × 32 for the aggregated feature vector.
The second convolutional layer is a 2D convolutional layer with a 1 × 1 kernel, 32 input channels, and 32 output channels.
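Step 3 mirrors Step 2 with the roles of the height and channel dimensions swapped; a matching sketch under the same assumptions:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel-dimension attention of Step 3 (sizes as stated in the text)."""
    def __init__(self, num_channels=32):
        super().__init__()
        # 1 x 1 2D convolution, 32 input channels, 32 output channels
        self.conv = nn.Conv2d(num_channels, num_channels, kernel_size=1)

    def forward(self, agg):                   # agg: (N, 496, 432, 32, 4)
        c = agg.max(dim=4).values             # heights pooled to 1 -> (N, 496, 432, 32)
        c = c.permute(0, 3, 1, 2)             # (N, 32, 496, 432) for Conv2d
        w = self.conv(c)                      # channel attention weights
        return w.permute(0, 2, 3, 1)          # back to (N, 496, 432, 32)
```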
Step 4: weighting the aggregated feature vector.
The height-dimension attention weight is cross-multiplied with the channel-dimension attention weight to obtain an aggregate attention weight of size 496 × 432 × 32 × 4.
The aggregate attention weight is multiplied element-wise with the aggregated feature vector to obtain a weighted feature vector of size 496 × 432 × 32 × 4.
The height dimension of the weighted feature vector is compressed to 1 using a max-pooling operation, yielding an enhanced feature vector of size 496 × 432 × 32.
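The weighting of Step 4 can be read as an outer product of the two weight maps, followed by element-wise multiplication and a final max-pool over the height slices; a sketch under the same shape assumptions as above:

```python
import torch

def enhance(agg, height_w, channel_w):
    """Step 4: fuse the two attention maps and weight the aggregated features.

    agg:       (N, 496, 432, 32, 4) aggregated feature vector
    height_w:  (N, 496, 432, 4)     height-dimension attention weights
    channel_w: (N, 496, 432, 32)    channel-dimension attention weights
    """
    # Cross-multiply: (..., 32, 1) x (..., 1, 4) -> (..., 32, 4) aggregate weights
    agg_w = channel_w.unsqueeze(-1) * height_w.unsqueeze(-2)
    weighted = agg * agg_w                    # element-wise weighting
    return weighted.max(dim=-1).values        # (N, 496, 432, 32) enhanced features
```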
Step 5: constructing the backbone network.
A PointPillars backbone network is built, and its number of input channels is set to 32.
Step 6: training the backbone network.
The enhanced feature vector is input into the backbone network, and the network parameters are iteratively updated with the Adam optimization algorithm until the loss function of the backbone network converges, yielding the trained backbone network.
The Adam optimization algorithm is configured as follows: the exponential decay rate of the first moment estimate is set to 0.95, the exponential decay rate of the second moment estimate to 0.85, and the learning rate to 0.003.
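Assuming a PyTorch training loop, the stated hyper-parameters map directly onto Adam's `betas` argument (the decay rates of the first and second moment estimates); a sketch, where `backbone` is the network from Step 5:

```python
import torch

# `backbone` is the PointPillars backbone from Step 5 (assumed constructed).
optimizer = torch.optim.Adam(
    backbone.parameters(),
    lr=0.003,            # learning rate from the text
    betas=(0.95, 0.85),  # decay rates of the first and second moment estimates
)
```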
The loss function of the backbone network is as follows:

$L = \frac{1}{N}\left(\beta_{loc} L_{loc} + \beta_{cls} L_{cls} + \beta_{dir} L_{dir}\right)$

where L denotes the loss function of the backbone network; N denotes the number of positive samples predicted by the backbone network from the enhanced feature vector; β_loc denotes the weight of the localization loss, with value range (0, 10]; L_loc denotes the localization loss; β_cls denotes the weight of the classification loss, with value range (0, 5]; L_cls denotes the classification loss; β_dir denotes the weight of the orientation loss, with value range (0, 1]; and L_dir denotes the orientation loss.

The localization loss is as follows:

$L_{loc} = \sum_{b \in (x,\,y,\,z,\,w,\,l,\,h,\,\theta)} \mathrm{SmoothL1}(\Delta b)$

where Σ(·) denotes summation; b denotes a parameter of the point cloud target box predicted by the backbone network from the enhanced feature vector; x, y, and z denote the x-, y-, and z-axis coordinates of the box center; w, l, and h denote the width, length, and height of the box; θ denotes the orientation angle of the box; SmoothL1(·) denotes the SmoothL1 loss function; and Δb denotes the encoded residual between the predicted point cloud target box and the ground-truth box.

The classification loss is as follows:

$L_{cls} = -0.25\,(1 - p^{a})^{2} \log p^{a}$

where p^a denotes the classification confidence predicted by the backbone network from the enhanced feature vector, and log(·) denotes the logarithm with base e.

The orientation loss is as follows:

$L_{dir} = -\log \frac{e^{f_i}}{\sum_j e^{f_j}}$

where e^(·) denotes the exponential with base e; f_j denotes the score of each possible target orientation class predicted by the backbone network from the enhanced feature vector; and f_i denotes the score of the target's orientation class.
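A hedged PyTorch sketch of the composite loss as reconstructed above. The β weights shown are illustrative picks from the stated ranges (0, 10], (0, 5], (0, 1], and the prediction/target tensors are assumed to be matched per positive sample; none of these specifics are fixed by the patent.

```python
import torch
import torch.nn.functional as F

def detection_loss(box_preds, box_targets, cls_probs, dir_logits, dir_targets,
                   num_pos, beta_loc=2.0, beta_cls=1.0, beta_dir=0.2):
    """Composite loss L = (beta_loc*L_loc + beta_cls*L_cls + beta_dir*L_dir) / N.

    box_preds/box_targets: (P, 7) encoded residuals over (x, y, z, w, l, h, theta)
    cls_probs:   (P,) predicted classification confidences p^a
    dir_logits:  (P, K) orientation class scores f_j
    dir_targets: (P,) ground-truth orientation class indices
    num_pos:     number of positive samples N
    """
    l_loc = F.smooth_l1_loss(box_preds, box_targets, reduction='sum')    # sum of SmoothL1(db)
    l_cls = (-0.25 * (1 - cls_probs) ** 2 * torch.log(cls_probs)).sum()  # focal-style term
    l_dir = F.cross_entropy(dir_logits, dir_targets, reduction='sum')    # -log softmax(f_i)
    return (beta_loc * l_loc + beta_cls * l_cls + beta_dir * l_dir) / num_pos
```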
Step 7: detecting the point cloud target.
The point cloud data to be detected is converted into an aggregated feature vector using the same method as Step 1.
The aggregated feature vector is weighted using the same method as Steps 2, 3, and 4 to obtain an enhanced feature vector.
The enhanced feature vector is input into the trained backbone network to complete point cloud target detection.
The effect of the present invention will be further described with reference to simulation experiments.
1. Simulation experiment conditions:
The hardware platform of the simulation experiment is: an Intel(R) Core(TM) i9-9900K CPU @ 3.6 GHz with 16 GB of memory, and an NVIDIA GeForce RTX 2080Ti GPU with 11 GB of video memory.
The software platform of the simulation experiment is: the Ubuntu 18.04 operating system and Python 3.6.
The point cloud dataset used in the simulation experiment is the KITTI dataset, in which the point cloud data was acquired with a 64-beam lidar. The dataset is described in "Vision meets Robotics: The KITTI Dataset" by Geiger, Andreas, et al., The International Journal of Robotics Research 32.11 (2013): 1231-1237, and divides data samples into three types: easy samples, moderate samples, and hard samples.
2. Simulation content and analysis of results:
The simulation experiment applies the invention and three prior-art detection methods (VoxelNet, SECOND, PointPillars) to detect vehicles in the input point cloud dataset, obtaining predicted point cloud targets.
The prior-art method VoxelNet refers to the voxel-based 3D convolutional network detection method, VoxelNet for short, proposed by Zhou, Y., et al. in "VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4490-4499.
The prior-art method SECOND refers to the point cloud target detection method using 3D sparse convolution, SECOND for short, proposed by Yan, Yan, et al. in "SECOND: Sparsely Embedded Convolutional Detection", Sensors 18.10 (2018): 3337.
The prior-art method PointPillars refers to the pillar-division point cloud target detection method proposed by Lang, A. H., et al. in "PointPillars: Fast Encoders for Object Detection from Point Clouds", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 12697-12705.
To verify the effect of the invention, the average detection precision is computed for the point cloud targets predicted by each of the four detection methods. All results are listed in Table 1, where "Ours" denotes the simulation results of the invention.
TABLE 1. Average precision (%) of the four detection methods

Method         Easy sample   Moderate sample   Hard sample
VoxelNet       81.97         65.46             62.85
SECOND         85.50         75.04             68.78
PointPillars   87.50         77.01             74.77
Ours           88.45         78.01             76.72
As Table 1 shows, the invention outperforms the prior art on the easy, moderate, and hard samples of the KITTI dataset, demonstrating that it achieves higher average detection precision.

Claims (6)

1. A point cloud target detection method based on height-channel feature enhancement, characterized by extracting the attention weight of the height dimension of an aggregated feature vector, extracting the attention weight of the channel dimension of the aggregated feature vector, and weighting the aggregated feature vector to obtain an enhanced feature vector; the method comprises the following steps:
(1) converting point cloud data blocks received in real time from a lidar into aggregated feature vectors;
(2) extracting the attention weight of the height dimension of the aggregated feature vector:
(2a) compressing the channel dimension of the aggregated feature vector to 1 with a max-pooling operation to obtain a height feature vector of size 496 × 432 × 1 × 4;
(2b) inputting the height feature vector into a first convolutional layer and outputting the height-dimension attention weight of size 496 × 432 × 4 for the aggregated feature vector;
(3) extracting the attention weight of the channel dimension of the aggregated feature vector:
(3a) compressing the height dimension of the aggregated feature vector to 1 with a max-pooling operation to obtain a channel feature vector of size 496 × 432 × 32 × 1;
(3b) inputting the channel feature vector into a second convolutional layer and outputting the channel-dimension attention weight of size 496 × 432 × 32 for the aggregated feature vector;
(4) weighting the aggregated feature vector:
(4a) cross-multiplying the height-dimension attention weight with the channel-dimension attention weight to obtain an aggregate attention weight of size 496 × 432 × 32 × 4;
(4b) multiplying the aggregate attention weight element-wise with the aggregated feature vector to obtain a weighted feature vector of size 496 × 432 × 32 × 4;
(4c) compressing the height dimension of the weighted feature vector to 1 with a max-pooling operation to obtain an enhanced feature vector of size 496 × 432 × 32;
(5) constructing a backbone network:
building a PointPillars backbone network and setting its number of input channels to 32;
(6) training the backbone network:
inputting the enhanced feature vector into the backbone network and iteratively updating the network parameters with the Adam optimization algorithm until the loss function of the backbone network converges, yielding the trained backbone network;
(7) detecting the point cloud target:
(7a) converting the point cloud data to be detected into an aggregated feature vector using the same method as step (1);
(7b) weighting the aggregated feature vector using the same method as steps (2), (3), and (4) to obtain an enhanced feature vector;
(7c) inputting the enhanced feature vector into the trained backbone network to complete point cloud target detection.
2. The point cloud target detection method based on height-channel feature enhancement according to claim 1, wherein converting the point cloud data block received in real time from the lidar into an aggregated feature vector in step (1) comprises the following steps:
first, using pass-through filtering, a point cloud data block of length 79.36 m, width 69.12 m, and height 4 m is obtained from the point cloud received in real time from the lidar;
second, the point cloud data block is divided equally into 4 slices along the vertical direction;
third, each slice is divided uniformly into pillars of equal size, each 0.16 m long, 0.16 m wide, and 1 m high;
fourth, each pillar is input into a trained PointNet network, which outputs the 32-dimensional point cloud feature of that pillar;
fifth, the 32-dimensional point cloud features of the pillars in each slice are arranged, according to their division positions, into a feature vector of size 496 × 432 × 32 for that slice;
sixth, the four feature vectors are concatenated into an aggregated feature vector of size 496 × 432 × 32 × 4.
3. The point cloud target detection method based on height-channel feature enhancement according to claim 1, wherein the first convolutional layer in step (2b) is a 2D convolutional layer with a 1 × 1 kernel, 4 input channels, and 4 output channels.
4. The point cloud target detection method based on height-channel feature enhancement according to claim 1, wherein the second convolutional layer in step (3b) is a 2D convolutional layer with a 1 × 1 kernel, 32 input channels, and 32 output channels.
5. The point cloud target detection method based on height-channel feature enhancement according to claim 1, wherein the parameters of the Adam optimization algorithm in step (6) are: the exponential decay rate of the first moment estimate is set to 0.95, the exponential decay rate of the second moment estimate to 0.85, and the learning rate to 0.003.
6. The point cloud target detection method based on height-channel feature enhancement according to claim 1, wherein the loss function of the backbone network in step (6) is as follows:

$L = \frac{1}{N}\left(\beta_{loc} L_{loc} + \beta_{cls} L_{cls} + \beta_{dir} L_{dir}\right)$

where L denotes the loss function of the backbone network; N denotes the number of positive samples predicted by the backbone network from the enhanced feature vector; β_loc denotes the weight of the localization loss, with value range (0, 10]; L_loc denotes the localization loss; β_cls denotes the weight of the classification loss, with value range (0, 5]; L_cls denotes the classification loss; β_dir denotes the weight of the orientation loss, with value range (0, 1]; and L_dir denotes the orientation loss;

the localization loss is as follows:

$L_{loc} = \sum_{b \in (x,\,y,\,z,\,w,\,l,\,h,\,\theta)} \mathrm{SmoothL1}(\Delta b)$

where Σ(·) denotes summation; b denotes a parameter of the point cloud target box predicted by the backbone network from the enhanced feature vector; x, y, and z denote the x-, y-, and z-axis coordinates of the box center; w, l, and h denote the width, length, and height of the box; θ denotes the orientation angle of the box; SmoothL1(·) denotes the SmoothL1 loss function; and Δb denotes the encoded residual between the predicted point cloud target box and the ground-truth box;

the classification loss is as follows:

$L_{cls} = -0.25\,(1 - p^{a})^{2} \log p^{a}$

where p^a denotes the classification confidence predicted by the backbone network from the enhanced feature vector, and log(·) denotes the logarithm with base e;

the orientation loss is as follows:

$L_{dir} = -\log \frac{e^{f_i}}{\sum_j e^{f_j}}$

where e^(·) denotes the exponential with base e; f_j denotes the score of each possible target orientation class predicted by the backbone network from the enhanced feature vector; and f_i denotes the score of the target's orientation class.
CN202110605139.9A 2021-05-31 2021-05-31 Point cloud target detection method based on height-channel characteristic enhancement Active CN113240038B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110605139.9A CN113240038B (en) 2021-05-31 2021-05-31 Point cloud target detection method based on height-channel characteristic enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110605139.9A CN113240038B (en) 2021-05-31 2021-05-31 Point cloud target detection method based on height-channel characteristic enhancement

Publications (2)

Publication Number Publication Date
CN113240038A (en) 2021-08-10
CN113240038B CN113240038B (en) 2024-02-09

Family

ID=77136065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110605139.9A Active CN113240038B (en) 2021-05-31 2021-05-31 Point cloud target detection method based on height-channel characteristic enhancement

Country Status (1)

Country Link
CN (1) CN113240038B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114862957A (en) * 2022-07-08 2022-08-05 西南交通大学 Subway car bottom positioning method based on 3D laser radar
CN115526936A (en) * 2022-11-29 2022-12-27 长沙智能驾驶研究院有限公司 Training method of positioning model and point cloud data positioning method and device
CN115965928A (en) * 2023-03-16 2023-04-14 安徽蔚来智驾科技有限公司 Point cloud feature enhancement method, target detection method, device, medium and vehicle

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020151109A1 (en) * 2019-01-22 2020-07-30 中国科学院自动化研究所 Three-dimensional target detection method and system based on point cloud weighted channel feature
CN112347987A (en) * 2020-11-30 2021-02-09 江南大学 Multimode data fusion three-dimensional target detection method
CN112668469A (en) * 2020-12-28 2021-04-16 西安电子科技大学 Multi-target detection and identification method based on deep learning
US20210142106A1 (en) * 2019-11-13 2021-05-13 Niamul QUADER Methods and systems for training convolutional neural network using built-in attention

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020151109A1 (en) * 2019-01-22 2020-07-30 中国科学院自动化研究所 Three-dimensional target detection method and system based on point cloud weighted channel feature
US20210142106A1 (en) * 2019-11-13 2021-05-13 Niamul QUADER Methods and systems for training convolutional neural network using built-in attention
CN112347987A (en) * 2020-11-30 2021-02-09 江南大学 Multimode data fusion three-dimensional target detection method
CN112668469A (en) * 2020-12-28 2021-04-16 西安电子科技大学 Multi-target detection and identification method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
严娟; 方志军; 高永彬: "3D object detection combining mixed-domain attention and dilated convolution", Journal of Image and Graphics, no. 06 *
王康如; 谭锦钢; 杜量; 陈利利; 李嘉茂; 张晓林: "Three-dimensional object detection based on iterative self-learning", Acta Optica Sinica, no. 09 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114862957A (en) * 2022-07-08 2022-08-05 西南交通大学 Subway car bottom positioning method based on 3D laser radar
CN114862957B (en) * 2022-07-08 2022-09-27 西南交通大学 Subway car bottom positioning method based on 3D laser radar
CN115526936A (en) * 2022-11-29 2022-12-27 长沙智能驾驶研究院有限公司 Training method of positioning model and point cloud data positioning method and device
CN115965928A (en) * 2023-03-16 2023-04-14 安徽蔚来智驾科技有限公司 Point cloud feature enhancement method, target detection method, device, medium and vehicle

Also Published As

Publication number Publication date
CN113240038B (en) 2024-02-09

Similar Documents

Publication Publication Date Title
CN111832655B (en) Multi-scale three-dimensional target detection method based on characteristic pyramid network
CN113240038A (en) Point cloud target detection method based on height-channel feature enhancement
Chen et al. Distribution line pole detection and counting based on YOLO using UAV inspection line video
CN111626128B (en) Pedestrian detection method based on improved YOLOv3 in orchard environment
CN111145174B (en) 3D target detection method for point cloud screening based on image semantic features
CN113902897A (en) Training of target detection model, target detection method, device, equipment and medium
CN110349260B (en) Automatic pavement marking extraction method and device
WO2023193401A1 (en) Point cloud detection model training method and apparatus, electronic device, and storage medium
CN106952274A (en) Pedestrian detection and distance-finding method based on stereoscopic vision
CN110991444A (en) Complex scene-oriented license plate recognition method and device
CN114463736A (en) Multi-target detection method and device based on multi-mode information fusion
JP5870011B2 (en) Point cloud analysis device, point cloud analysis method, and point cloud analysis program
CN112668469A (en) Multi-target detection and identification method based on deep learning
CN113420819A (en) Lightweight underwater target detection method based on CenterNet
CN116189147A (en) YOLO-based three-dimensional point cloud low-power-consumption rapid target detection method
CN112613450A (en) 3D target detection method for enhancing performance on difficult sample
Hu et al. Traffic density recognition based on image global texture feature
CN116778341A (en) Multi-view feature extraction and identification method for radar image
CN116222577A (en) Closed loop detection method, training method, system, electronic equipment and storage medium
CN109934151B (en) Face detection method based on movidius computing chip and Yolo face
CN113269147B (en) Three-dimensional detection method and system based on space and shape, and storage and processing device
CN112597875A (en) Multi-branch network anti-missing detection aerial photography target detection method
CN116678418A (en) Improved laser SLAM quick loop-back detection method
Wang et al. Research on vehicle detection based on faster R-CNN for UAV images
CN114240940B (en) Cloud and cloud shadow detection method and device based on remote sensing image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant