WO2022141718A1 - Method and system for assisting point cloud-based object detection - Google Patents


Info

Publication number
WO2022141718A1
Authority
WO
WIPO (PCT)
Prior art keywords
sampling
point cloud
cloud set
target point
point
Application number
PCT/CN2021/074199
Other languages
French (fr)
Chinese (zh)
Inventor
张翔
黄尚锋
杜静
夏启明
陈延行
江文涛
Original Assignee
罗普特科技集团股份有限公司
罗普特(厦门)系统集成有限公司
Application filed by 罗普特科技集团股份有限公司 and 罗普特(厦门)系统集成有限公司
Publication of WO2022141718A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06T5/70
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00Indexing scheme for image data processing or generation, in general
    • G06T2200/04Indexing scheme for image data processing or generation, in general involving 3D image data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20092Interactive image processing based on input by user
    • G06T2207/20104Interactive definition of region of interest [ROI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Definitions

  • The present disclosure belongs to the field of computer technology and in particular relates to a method and system for assisting point cloud target detection.
  • 3D laser scanning technology can continuously, rapidly, and at large scale acquire 3D point data from object surfaces, known as a point cloud.
  • In autonomous driving, a vehicle-mounted lidar rapidly scans the objects in front of the vehicle to obtain a large number of points rich in spatial structure information, and detection of targets in front of the vehicle is then performed on the acquired point cloud data. This technology is widely used in the autonomous driving domain.
  • In the field of artificial intelligence, 3D object detection has attracted increasing attention from researchers. It plays an important role in autonomous driving, robot trajectory planning, virtual reality, and other applications.
  • Depending on the input format, 3D object detection methods are mainly divided into image-based methods, point-cloud-based methods, and methods combining images and point clouds.
  • Point-cloud-based 3D object detection estimates 3D detection boxes directly from vehicle-mounted lidar data; there are currently two main approaches to processing raw point clouds.
  • The first method converts the entire point cloud into voxels and then predicts 3D detection boxes on the voxels.
  • However, such methods not only lose rich spatial structure information but also must rely on 3D CNNs to learn features from the voxels, which is computationally expensive.
  • In 2017, Yin Zhou et al. proposed VoxelNet, which voxelizes the point cloud and then uses a PointNet network to learn point cloud features from the voxels, while SECOND, proposed by Yan Y. et al., replaces PointNet with sparse convolutional layers; to reduce the computation caused by voxelization, PointPillars, proposed by Lang A. H. et al., replaces the voxel grid with voxel pillars, but the computational load remains considerable.
  • The second method uses the raw point cloud directly as input. Because PointNet and PointNet++, proposed by Qi C. R. et al., have been highly successful on point clouds, more and more 3D object detection methods build on them to process point clouds directly.
  • In 2019, Shi S. et al. proposed the two-stage network PointRCNN, which first uses PointNet++ as a semantic segmentation backbone to separate foreground points from background points and then estimates 3D detection boxes from the foreground points.
  • In the same year, the two-stage network STD proposed by Yang Z. et al. also used PointNet++ to learn point-wise features of the point cloud and converted the point features inside candidate boxes from a sparse to a dense representation through its PointsPool module. In both two-stage methods the first-stage semantic segmentation network plays a crucial role and directly affects final performance, but the prohibitive inference time makes them hard to apply in practical autonomous driving systems.
  • Unlike two-stage methods, one-stage methods that consume the point cloud directly are known for their efficiency.
  • In 2020, Shi W. et al. proposed PointGNN, a network that uses a graph neural network (GNN) to extract point-wise features. It fully exploits the GNN's ability to perceive spatial structure and achieves excellent performance on KITTI. A GNN perceives the spatial structure of a point cloud well, but it requires that the point cloud not be downsampled, which means the network must run the GNN over the entire point cloud; its computational cost is therefore high compared with other one-stage methods.
  • The purpose of the embodiments of the present application is to propose a method and system for assisting point cloud target detection that combines the efficiency of a one-stage network with the accuracy of a two-stage network, solving the problems described in the background above.
  • An embodiment of the present application provides a method for assisting point cloud target detection whose steps include: S100, segmenting the initial target point cloud set from the overall point cloud scene; S200, performing feature aggregation on the initial target point cloud set to obtain the spatial structure information of each sampling point and assigning it to the corresponding sampling points to obtain the sampling target point cloud set; S300, expanding the sampling target point cloud set back to the initial target point cloud set through three-point linear interpolation, updating the point features by interpolation, and using a fully connected layer to output the probability that each sampling point is a foreground point or a background point; and S400, feeding the sampling target point cloud set into a multilayer perceptron, outputting the offset to the corresponding real object center point, adding the offset to the features of the sampling target point cloud set to obtain the features of the predicted center point, and generating a three-dimensional target detection box from those features.
  • The feature aggregation of step S200 includes: S201, downsampling the initial target point cloud set to obtain K interest sampling points; S202, selecting from the interest sampling points to obtain K' quality sampling points; S203, weighting the neighboring points around each quality sampling point to obtain the weighted feature matrix of the neighboring points; S204, updating the sampling point features in the initial target point cloud set with the weighted neighboring-point features to obtain the process target point cloud set; and S205, downsampling the process target point cloud set to obtain K new interest sampling points and repeating S202-S204, and S205 itself where needed, until the sampling target point cloud set is obtained.
  • The initial target point cloud set thus undergoes downsample-select-weight-update processing, repeated as many times as required until a satisfactory sampling target point cloud set is obtained. The interest sampling points are the points retained by the downsampling operation; the quality sampling points are the interest sampling points of the same iteration that are scored, after denoising and feature learning, and selected accordingly. This ensures that the sampling target point cloud set contains as many foreground points as possible, obtained through feature-based farthest point sampling, while also preserving the overall shape of the point cloud through distance-based farthest point sampling, laying the foundation for the smooth execution of steps S300 and S400.
  • In some embodiments, the downsampling of step S201 includes: taking the initial target point cloud set as input and downsampling its points by independently applying distance-based farthest point sampling and eigenvalue-based farthest point sampling, yielding K interest sampling points. This lets the K interest sampling points retain as many foreground points as possible while also preserving the shape of the point cloud as much as possible.
  • In some embodiments, the selection of step S202 includes: denoising the K interest sampling points, using the number of neighboring points around each interest sampling point and the distances from those neighbors to it to obtain, through feature learning, a quality score for each interest sampling point, and selecting according to the scores the top K' quality sampling points from the K interest sampling points. This effectively removes sampling points that are themselves noise or whose spatial structure information is sparse, facilitating the subsequent weighting of neighboring points by their contributions.
  • In some embodiments, the weighting of step S203 includes: for each of the K' quality sampling points, randomly sampling m neighboring points within a spherical region of radius r centered on it, taking the features of the quality sampling point, the features of the neighbors, and the relative coordinates of the neighbors with respect to the quality sampling point as input to compute each neighbor's contribution to the quality sampling point, and multiplying each neighbor's feature by its contribution to obtain the weighted feature of that neighbor. This yields the weighted feature matrix of the neighboring points, ready for the subsequent point-update operation.
  • In some embodiments, the contribution in step S203 ranges from 0 to 1, which makes the weighting results more reasonable.
  • In some embodiments, updating the sampling points in the initial target point cloud set in step S204 includes: applying a MaxPooling operation that takes, on each channel of the weighted neighboring-point features, the most salient value to generate a new feature, thereby forming the process target point cloud set. This enables the process target point cloud set to perceive the spatial structure well and makes the prediction of the center point of the final three-dimensional detection box more accurate, while the downsampling operation reduces the number of sampling points to process and improves speed, achieving both efficiency and accuracy.
  • In some embodiments, step S205 operates as follows: the process target point cloud set is taken as input and its points are downsampled by the joint use of distance-based farthest point sampling and eigenvalue-based farthest point sampling to obtain K new interest sampling points; operations S202-S204 are then performed to obtain a second process target point cloud set or the sampling target point cloud set. When a second process target point cloud set is obtained, step S205 is repeated until the sampling target point cloud set is finally obtained. Repeating the operation as needed further improves the prediction of the center point of the final three-dimensional detection box.
  • In some embodiments, step S205 is performed one to four times. The number of points remaining after each downsampling can be set individually, and one to four repetitions distribute the downsampled point counts more reasonably, keeping both the computation time and the prediction accuracy of the final three-dimensional detection box within a good range. Too many repetitions easily lead to excessive runtime, while too few easily lead to insufficient accuracy.
  • In some embodiments, step S300 further includes penalizing mispredicted foreground or background labels with a focal loss function, which as an auxiliary supervision signal effectively supervises the accuracy of the final prediction.
  • In some embodiments, generating the three-dimensional detection box in step S400 includes: first predefining the length, width, and height of the box, then taking the features of the predicted center point as input and using a fully connected layer to output the residuals relative to the predefined length, width, and height together with the rotation angle, thereby obtaining the final three-dimensional detection box. This makes the generated box better match the actual object.
  • The present application also provides a system for assisting point cloud target detection, the system comprising:
  • an extraction module configured to segment the initial target point cloud set from the overall point cloud scene;
  • a feature aggregation module configured to perform feature aggregation on the initial target point cloud set to obtain the spatial structure information of each sampling point, and to assign the spatial structure information to the corresponding sampling points to obtain the sampling target point cloud set;
  • a feature propagation module configured to expand the sampling target point cloud set back to the initial target point cloud set through three-point linear interpolation, update the features of the points in the initial target point cloud set by interpolation, and, based on those features, use a fully connected layer to output the probability that each sampling point in the initial target point cloud set is a foreground point or a background point;
  • a detection box generation module configured to feed the sampling target point cloud set into a multilayer perceptron, output the offset from the sampling target point cloud set to the corresponding real object center point, add the offset to the features of the sampling target point cloud set to obtain the features of the predicted center point, and finally generate a three-dimensional target detection box from the features of the predicted center point.
  • In some embodiments, the feature aggregation module includes:
  • a downsampling module configured to downsample the initial target point cloud set to obtain K interest sampling points;
  • a point selection module configured to select from the interest sampling points to obtain K' quality sampling points;
  • a weighting module configured to weight the neighboring points around each quality sampling point to obtain the weighted feature matrix of the neighboring points;
  • a sampling point update module configured to update the sampling point features in the initial target point cloud set with the weighted neighboring-point features to obtain the process target point cloud set;
  • a loop module configured to downsample the process target point cloud set to obtain K new interest sampling points and repeat operations S202-S204 to obtain a second process target point cloud set or the sampling target point cloud set, repeating step S205 when a second process target point cloud set is obtained until the sampling target point cloud set is finally obtained.
  • The combined action of the modules within the feature aggregation module enables the process target point cloud set to perceive the spatial structure well, makes the prediction of the center point of the final three-dimensional detection box more accurate, and, through the downsampling operation, reduces the number of sampling points to process and improves speed, achieving both efficiency and accuracy.
  • The present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of the embodiments described above.
  • The method and system for assisting point cloud target detection have the following advantages. 1. Feature aggregation over the initial target point cloud set obtains the spatial structure information of each sampling point and assigns it to the corresponding sampling points, producing the sampling target point cloud set; the downsampling used during aggregation reduces the number of points to process and increases speed, making the method more efficient. 2. The sampling target point cloud set is expanded back to the initial target point cloud set through three-point linear interpolation, the features of the points in the initial set are updated by interpolation, and, based on those features, a fully connected layer outputs the probability that each sampling point is a foreground point or a background point, which sharpens the final semantic segmentation and thus the displayed detection results. 3. The sampling target point cloud set is fed into a multilayer perceptron that outputs the offset to the corresponding real object center point; the offset is added to the features of the sampling target point cloud set to obtain the features of the predicted center point, from which the final three-dimensional detection box is generated, yielding an accurate box position and making the method more precise.
  • FIG. 1 is an exemplary basic flowchart according to an embodiment of the present disclosure;
  • FIG. 2 is a schematic flowchart of a feature aggregation step in a method for assisting point cloud target detection according to an embodiment of the present disclosure
  • FIG. 3 is a schematic structural diagram of a system for assisting point cloud target detection according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a feature aggregation module in a system for assisting point cloud target detection according to an embodiment of the present disclosure.
  • FIG. 1 shows an exemplary basic flow of a method for assisting point cloud target detection of the present disclosure.
  • The basic process includes:
  • Step S100: segment the initial target point cloud set from the overall point cloud scene.
  • In this step, a full-pixel semantic segmentation method is used to segment the initial target point cloud set from the overall point cloud scene; assume the initial target point cloud set is point set A. Semantic segmentation classifies each pixel at the pixel level so that pixels of the same class fall into the same category, which makes extracting the initial target point cloud set easier.
  • Step S200: perform feature aggregation on the initial target point cloud set to obtain the spatial structure information of each sampling point in the initial target point cloud set, and assign the spatial structure information to the corresponding sampling points to obtain the sampling target point cloud set.
  • Step S200 includes steps S201-S205. In step S201, point set A is taken as input and downsampled by independently applying distance-based farthest point sampling and eigenvalue-based farthest point sampling, yielding K interest sampling points. Distance-based farthest point sampling better preserves the shape of the point cloud, while eigenvalue-based farthest point sampling captures as many foreground points as possible; sampling with both independently gives the final K interest sampling points the advantages of each.
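As an illustrative aside (not part of the original disclosure), distance-based farthest point sampling can be sketched in a few lines of NumPy. The eigenvalue-based variant is approximated here by running the same greedy procedure in feature space, which is an assumption about its intended behavior; all names and sizes below are hypothetical.

```python
import numpy as np

def farthest_point_sampling(points, k):
    """Greedy farthest point sampling: repeatedly pick the point farthest
    from everything selected so far. `points` is (N, D); pass raw XYZ
    (D=3) for distance-based FPS, or per-point feature vectors for the
    eigenvalue/feature-based variant (an assumed generalization)."""
    n = points.shape[0]
    selected = [0]                              # deterministic seed point
    min_dist = np.full(n, np.inf)
    for _ in range(k - 1):
        d = np.linalg.norm(points - points[selected[-1]], axis=1)
        min_dist = np.minimum(min_dist, d)      # distance to nearest selected point
        selected.append(int(np.argmax(min_dist)))
    return np.asarray(selected)

# Independent sampling as in step S201: run FPS once on coordinates and
# once on features, keeping both index sets of interest sampling points.
xyz = np.random.rand(1024, 3)                   # hypothetical point set A
feats = np.random.rand(1024, 32)                # hypothetical per-point features
idx_dist = farthest_point_sampling(xyz, 64)     # shape-preserving samples
idx_feat = farthest_point_sampling(feats, 64)   # foreground-biased samples
```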
  • In step S202, the K interest sampling points are denoised: using the number of neighboring points around each interest sampling point and the distances from those neighbors to it, a quality score is obtained for each interest sampling point through feature learning; the points are ranked by score and the top K' quality sampling points are selected. The K interest sampling points from S201 may include points with relatively sparse spatial structure information or points that are themselves noise, and this step removes them. Relative to the interest sampling points, the quality sampling points form a set with richer spatial structure information and higher quality scores.
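A minimal sketch of the selection idea, assuming the learned quality score of step S202 is replaced by a hand-written density heuristic built from the same two inputs (neighbor count and neighbor distances); the radius and point counts are hypothetical.

```python
import numpy as np

def quality_scores(points, idx_interest, radius=0.2):
    """Stand-in for the learned scoring of step S202: reward interest
    points with many close neighbors and penalize isolated, likely
    noisy ones via the mean distance to their neighbors."""
    scores = []
    for i in idx_interest:
        d = np.linalg.norm(points - points[i], axis=1)
        nbr = d[(d > 0) & (d < radius)]
        scores.append(len(nbr) - nbr.mean() if len(nbr) else -np.inf)
    return np.asarray(scores)

def select_quality_points(points, idx_interest, k_prime):
    """Rank interest points by quality score and keep the top K'."""
    order = np.argsort(-quality_scores(points, idx_interest))
    return np.asarray(idx_interest)[order[:k_prime]]

pts = np.random.rand(1024, 3)
idx_quality = select_quality_points(pts, np.arange(0, 1024, 16), 32)
```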
  • In step S203, for each of the K' quality sampling points, m neighboring points are randomly sampled within a spherical region of radius r centered on the point. The features of the quality sampling point, the features of the neighbors, and the relative coordinates of the neighbors with respect to the quality sampling point are taken as input to compute each neighbor's contribution to the quality sampling point, and the product of each neighbor's feature with its contribution gives the weighted feature of that neighbor. The contribution ranges from 0 to 1.
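The disclosure specifies the inputs to the contribution computation and its 0-1 range but not its exact form; the sketch below assumes a single linear layer followed by a sigmoid as the simplest mapping with those properties, with random placeholder weights standing in for learned parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def weight_neighbors(center_xyz, center_feat, nbr_xyz, nbr_feat, w, b):
    """Step S203 sketch: concatenate [center feature | neighbor feature |
    relative coordinates], map through one linear layer plus a sigmoid so
    each contribution lands in (0, 1), then scale the neighbor features."""
    m = nbr_xyz.shape[0]
    rel = nbr_xyz - center_xyz                            # (m, 3) relative coords
    x = np.hstack([np.tile(center_feat, (m, 1)), nbr_feat, rel])
    contrib = sigmoid(x @ w + b)                          # (m, 1), each in (0, 1)
    return nbr_feat * contrib                             # weighted features (m, C)

C = 32
w = np.random.randn(2 * C + 3, 1) * 0.1                   # placeholder weights
weighted = weight_neighbors(np.zeros(3), np.random.rand(C),
                            np.random.rand(8, 3), np.random.rand(8, C), w, 0.0)
```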
  • In step S204, a MaxPooling operation takes the most salient value on each channel of the weighted neighboring-point features to generate a new feature, thereby forming the process target point cloud set, denoted point set B. Concretely, the network applies MaxPooling to each channel of the weighted feature matrix F' to generate the new features New_F ∈ R^{K'×C}; MaxPooling also resolves the unordered nature of point clouds. New_F is the feature set of point set B.
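A one-line rendering of the S204 update, assuming the weighted neighbor features of all K' quality points are stacked into a single (K', m, C) array:

```python
import numpy as np

# F_prime: weighted neighbor features for K' quality points, (K', m, C).
# Channel-wise MaxPooling over the m neighbors keeps the most salient
# response per channel and is invariant to neighbor order, producing
# New_F of shape (K', C) as the feature set of point set B.
F_prime = np.random.rand(64, 8, 32)        # hypothetical (K', m, C)
New_F = F_prime.max(axis=1)                # (K', C)
```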
  • Step S205 is performed one to four times. The number of points remaining after each downsampling can be set individually, and one to four repetitions distribute the downsampled point counts more reasonably, keeping both the computation time and the prediction accuracy of the final 3D detection box within a good range. Too many repetitions easily lead to excessive runtime, while too few easily lead to insufficient accuracy.
  • In an embodiment, the downsampling of point set B first performs distance-based farthest point sampling to obtain point set C and eigenvalue-based farthest point sampling to obtain point set D, and then applies steps S202-S204 to point sets C and D to obtain their feature sets. Point sets C and D together constitute the second process target point cloud set, which is then downsampled again: point set C, obtained by distance-based farthest point sampling, is further downsampled by distance-based farthest point sampling to obtain point set E, and point set D, obtained by eigenvalue-based farthest point sampling, is further downsampled by eigenvalue-based farthest point sampling to obtain point set G. Steps S202-S204 are then applied to point sets E and G to obtain their feature sets, and point sets E and G together constitute the sampling target point cloud set.
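The two-branch cascade can be wired together as follows, reusing the farthest_point_sampling sketch shown earlier (feature-based FPS again approximated as FPS in feature space); the point counts are hypothetical.

```python
import numpy as np

A_xyz = np.random.rand(1024, 3)                  # point set A coordinates
A_feat = np.random.rand(1024, 32)                # point set A features

idx_C = farthest_point_sampling(A_xyz, 256)      # distance branch: A -> C
idx_D = farthest_point_sampling(A_feat, 256)     # feature branch:  A -> D
# ... steps S202-S204 would refine the features of C and D here ...
idx_E = farthest_point_sampling(A_xyz[idx_C], 64)    # C -> E (distance FPS)
idx_G = farthest_point_sampling(A_feat[idx_D], 64)   # D -> G (feature FPS)
# Point sets E and G together form the sampling target point cloud set.
```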
  • The basic process also includes step S300: expand the sampling target point cloud set back to the initial target point cloud set through three-point linear interpolation, update the features of the points in the initial target point cloud set by interpolation, and, based on those features, use a fully connected layer to output the probability that each sampling point in the initial target point cloud set is a foreground point or a background point.
  • The sampling target point cloud set input at this step mainly uses point set E, obtained by distance-based farthest point sampling. Three-point linear interpolation (an existing technique, not further explained here) expands point set E back to the higher-resolution point set C and updates the features of the points in C; point set C is then expanded by three-point linear interpolation back to the still higher-resolution point set B, whose features are updated; and point set B is expanded back to the original high-resolution point cloud scene, point set A, whose point features are updated. Finally, a fully connected layer outputs the probability that each point is a foreground point or a background point.
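A sketch of the propagation step, assuming the "three-point linear interpolation" is the standard PointNet++-style inverse-distance blend of the three nearest sparse points:

```python
import numpy as np

def three_point_interpolate(src_xyz, src_feat, dst_xyz, eps=1e-8):
    """For every point in the dense set `dst_xyz`, find its 3 nearest
    neighbors in the sparse set `src_xyz` and blend their features with
    normalized inverse-distance weights."""
    out = np.empty((dst_xyz.shape[0], src_feat.shape[1]))
    for i, p in enumerate(dst_xyz):
        d = np.linalg.norm(src_xyz - p, axis=1)
        nn = np.argsort(d)[:3]                 # 3 nearest sparse points
        w = 1.0 / (d[nn] + eps)
        w /= w.sum()                           # normalized weights
        out[i] = w @ src_feat[nn]              # (3,) @ (3, C) -> (C,)
    return out

# E -> C -> B -> A: the same routine is applied at each resolution step.
feat_C = three_point_interpolate(np.random.rand(64, 3), np.random.rand(64, 32),
                                 np.random.rand(256, 3))
```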
  • The focal loss function may also be used to penalize mispredicted foreground or background labels; as an auxiliary supervision signal, it effectively supervises the accuracy of the final prediction.
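Assuming "F local loss" is a garbling of the standard focal loss of Lin et al., a minimal binary version for foreground/background supervision looks like the following; the hyperparameters are the common defaults, not values from the disclosure.

```python
import numpy as np

def focal_loss(p_fg, is_fg, alpha=0.25, gamma=2.0, eps=1e-8):
    """Binary focal loss: well-classified points are down-weighted by
    (1 - p_t)^gamma so training focuses on hard, mislabeled points."""
    p_t = np.where(is_fg, p_fg, 1.0 - p_fg)    # probability of the true class
    a_t = np.where(is_fg, alpha, 1.0 - alpha)  # class-balancing factor
    return float(np.mean(-a_t * (1.0 - p_t) ** gamma * np.log(p_t + eps)))

loss = focal_loss(np.array([0.9, 0.2, 0.6]), np.array([True, False, True]))
```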
  • Step S400: feed the sampling target point cloud set into the multilayer perceptron, output the offset from the sampling target point cloud set to the corresponding real object center point, add the offset to the features of the sampling target point cloud set to obtain the features of the predicted center point, and finally generate a three-dimensional target detection box from the features of the predicted center point.
  • Here the sampling target point cloud set mainly uses point set G, obtained by eigenvalue-based farthest point sampling. The multilayer perceptron outputs the offset to the real object center point, and the predicted center position is obtained by adding the offset to the coordinates of point set G. The features of the predicted center point can be obtained through the neighbor weighting of step S203 and the sampling point update of step S204.
  • The three-dimensional detection box may be generated by predefining its length, width, and height, then taking the features of the predicted center point as input and using a fully connected layer to output the residuals relative to the predefined length, width, and height together with the rotation angle, yielding the final three-dimensional target detection box.
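A sketch of the step S400 heads under stated assumptions: random placeholder weights stand in for the trained multilayer perceptron and fully connected layer, and the predefined box dimensions use a commonly cited car-size anchor purely as an example, not a value from the disclosure.

```python
import numpy as np

def linear(x, w, b):
    return x @ w + b                           # placeholder for a trained layer

C = 32
G_xyz = np.random.rand(64, 3)                  # point set G coordinates
G_feat = np.random.rand(64, C)                 # point set G features

w_off, b_off = np.random.randn(C, 3) * 0.1, np.zeros(3)
offsets = linear(G_feat, w_off, b_off)         # per-point offset to object center
centers = G_xyz + offsets                      # predicted center positions

anchor_lwh = np.array([3.9, 1.6, 1.56])        # example predefined length/width/height
w_box, b_box = np.random.randn(C, 4) * 0.1, np.zeros(4)
deltas = linear(G_feat, w_box, b_box)          # (dl, dw, dh, rotation) per point
boxes_lwh = anchor_lwh + deltas[:, :3]         # final box sizes
yaw = deltas[:, 3]                             # rotation angle of each box
```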
  • The present application also provides a system for assisting point cloud target detection.
  • A system 500 for assisting point cloud target detection includes an extraction module 510, a feature aggregation module 520, a feature propagation module 530, and a detection box generation module 540.
  • The extraction module 510 is used to segment the initial target point cloud set from the overall point cloud scene.
  • The feature aggregation module 520 is configured to perform feature aggregation on the initial target point cloud set to obtain the spatial structure information of each sampling point, and to assign the spatial structure information to the corresponding sampling points to obtain the sampling target point cloud set.
  • The feature aggregation module 520 includes:
  • a downsampling module 521 used to downsample the initial target point cloud set to obtain K interest sampling points;
  • a point selection module 522 used to select from the interest sampling points to obtain K' quality sampling points;
  • a weighting module 523 used to weight the neighboring points around each quality sampling point to obtain the weighted feature matrix of the neighboring points;
  • a sampling point update module 524 used to update the sampling point features in the initial target point cloud set with the weighted neighboring-point features to obtain the process target point cloud set;
  • a loop module 525 used to downsample the process target point cloud set to obtain K new interest sampling points and repeat operations S202-S204 to obtain a second process target point cloud set or the sampling target point cloud set, repeating step S205 when a second process target point cloud set is obtained until the sampling target point cloud set is finally obtained.
  • The system 500 for assisting point cloud target detection further includes a feature propagation module 530, used to expand the sampling target point cloud set back to the initial target point cloud set through three-point linear interpolation, update the features of the points in the initial target point cloud set by interpolation, and, based on those features, use a fully connected layer to output the probability that each sampling point in the initial target point cloud set is a foreground point or a background point.
  • The detection box generation module 540 is used to feed the sampling target point cloud set into the multilayer perceptron, output the offset from the sampling target point cloud set to the corresponding real object center point, add the offset to the features of the sampling target point cloud set to obtain the features of the predicted center point, and finally generate a three-dimensional target detection box from the features of the predicted center point.
  • The processes described above with reference to the flowcharts may be implemented as computer software programs.
  • The computer-readable storage medium described in this application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • The computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above.
  • Computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • A computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code.
  • Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable storage medium may be transmitted using any suitable medium including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for performing the operations of the present application may be written in one or more programming languages, including object-oriented languages such as Java, Smalltalk, and C++, as well as conventional procedural languages such as the "C" language or similar.
  • The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • The remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
  • Each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • It should also be noted that the functions noted in the blocks may occur out of the order noted in the figures; for example, two blocks shown in succession may in fact be executed substantially concurrently, or sometimes in the reverse order, depending on the functionality involved.
  • Each block of the block diagrams and/or flowcharts, and combinations of blocks therein, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • The modules involved in the embodiments of the present application may be implemented in software or in hardware.
  • The described modules may also be provided in a processor; for example, a processor may be described as including an acquisition module, an analysis module, and an output module, where in some cases the names of these modules do not limit the modules themselves.

Abstract

The present disclosure relates to the field of computer technology, and specifically to a method and system for assisting point cloud-based object detection. The method and system provided in embodiments of the present application have the following advantages: 1. the number of sampled points to process is reduced and the calculation speed is increased; 2. based on the features of the points in the initial target point cloud set, a fully connected layer outputs the probability that each sampled point is a foreground point or a background point, sharpening the final semantic segmentation and thus the displayed detection results; 3. a precise three-dimensional bounding box position can be obtained, so the method achieves high precision.

Description

A Method and System for Assisting Point Cloud Target Detection

Related Applications

This application claims priority to Chinese Patent Application No. 202011633104.8, filed on December 31, 2020, the entire contents of which are incorporated herein by reference.
Technical Field

The present disclosure belongs to the field of computer technology and in particular relates to a method and system for assisting point cloud target detection.
Background

3D laser scanning technology can continuously, rapidly, and at large scale acquire 3D point data from object surfaces, known as a point cloud. The field of autonomous driving currently exploits this technology: a vehicle-mounted lidar rapidly scans the objects in front of the vehicle to obtain a large number of points rich in spatial structure information, and detection of targets in front of the vehicle is then performed on the acquired point cloud data. This technology is widely used in the autonomous driving domain.

In the field of artificial intelligence, 3D object detection has attracted increasing attention from researchers. It plays an important role in autonomous driving, robot trajectory planning, virtual reality, and other applications. Depending on the input format, 3D object detection methods are mainly divided into image-based methods, point-cloud-based methods, and methods combining images and point clouds.

Among these, point-cloud-based 3D object detection estimates 3D detection boxes directly from vehicle-mounted lidar data. There are currently two main approaches to processing raw point clouds. The first converts the entire point cloud into voxels and then predicts 3D detection boxes on the voxels. However, such methods not only lose rich spatial structure information but also must rely on 3D CNNs to learn features from the voxels, which is computationally expensive. In 2017, Yin Zhou et al. proposed VoxelNet, which voxelizes the point cloud and then uses a PointNet network to learn point cloud features from the voxels, while SECOND, proposed by Yan Y. et al., replaces PointNet with sparse convolutional layers; these methods all inevitably incur heavy computation. To reduce the computation caused by voxelization, PointPillars, proposed by Lang A. H. et al., replaces the voxel grid with voxel pillars, but the computational load remains considerable.

The second approach uses the raw point cloud directly as input, without altering it. Because PointNet and PointNet++, proposed by Qi C. R. et al., have been highly successful on point clouds, more and more 3D object detection methods build on them to process point clouds directly. Specifically, in 2019, Shi S. et al. proposed the two-stage network PointRCNN, which first uses PointNet++ as a semantic segmentation backbone to separate foreground points from background points and then estimates 3D detection boxes from the foreground points. In the same year, the two-stage network STD proposed by Yang Z. et al. also used PointNet++ to learn point-wise features of the point cloud and converted the point features inside candidate boxes from a sparse to a dense representation through its PointsPool module. A review of these two two-stage methods shows that the first-stage semantic segmentation network plays a crucial role and directly affects final performance, but the unbearable inference time makes such methods difficult to apply in practical autonomous driving systems, so improving on this difficulty became an important task of this work.

Unlike two-stage methods, one-stage methods that consume the point cloud directly are known for their efficiency. In 2020, Shi W. et al. proposed PointGNN, a network that uses a graph neural network (GNN) to extract point-wise features. It fully exploits the GNN's ability to perceive spatial structure and achieves excellent performance on KITTI. A GNN perceives the spatial structure of a point cloud well, but it requires that the point cloud not be downsampled, which means the network must run the GNN over the entire point cloud; its computational cost is therefore high compared with other one-stage methods.
Summary

The purpose of the embodiments of the present application is to propose a method and system for assisting point cloud target detection that combines the efficiency of a one-stage network with the accuracy of a two-stage network, solving the problems described in the background above.
In a first aspect, an embodiment of the present application provides a method for assisting point cloud target detection, the steps of which include:

S100: segmenting the initial target point cloud set from the overall point cloud scene;

S200: performing feature aggregation on the initial target point cloud set to obtain the spatial structure information of each sampling point in the initial target point cloud set, and assigning the spatial structure information to the corresponding sampling points to obtain the sampling target point cloud set;

S300: expanding the sampling target point cloud set back to the initial target point cloud set through three-point linear interpolation, updating the features of the points in the initial target point cloud set by interpolation, and, based on those features, using a fully connected layer to output the probability that each sampling point in the initial target point cloud set is a foreground point or a background point;

S400: feeding the sampling target point cloud set into a multilayer perceptron, outputting the offset from the sampling target point cloud set to the corresponding real object center point, adding the offset to the features of the sampling target point cloud set to obtain the features of the predicted center point, and finally generating a three-dimensional target detection box from the features of the predicted center point.
In some embodiments, the feature aggregation of step S200 includes:

S201: downsampling the initial target point cloud set to obtain K interest sampling points;

S202: selecting from the interest sampling points to obtain K' quality sampling points;

S203: weighting the neighboring points around each quality sampling point to obtain the weighted feature matrix of the neighboring points;

S204: updating the sampling point features in the initial target point cloud set with the weighted neighboring-point features to obtain the process target point cloud set;

S205: downsampling the process target point cloud set to obtain K new interest sampling points and repeating S202-S204 to obtain a second process target point cloud set or the sampling target point cloud set; when a second process target point cloud set is obtained, repeating step S205 until the sampling target point cloud set is finally obtained.
The initial target point cloud set thus undergoes downsample-select-weight-update processing, repeated as many times as required until a satisfactory sampling target point cloud set is obtained. The interest sampling points are the points retained by the downsampling operation; the quality sampling points are the interest sampling points of the same iteration that are scored, after denoising and feature learning, and selected accordingly. This ensures that the sampling target point cloud set contains as many foreground points as possible, obtained through feature-based farthest point sampling, while also preserving the overall shape of the point cloud through distance-based farthest point sampling, laying the foundation for the smooth execution of steps S300 and S400.
In some embodiments, the downsampling of step S201 includes: taking the initial target point cloud set as input and downsampling its points by independently applying distance-based farthest point sampling and eigenvalue-based farthest point sampling, yielding K interest sampling points. This lets the K interest sampling points retain as many foreground points as possible while also preserving the shape of the point cloud as much as possible.
In some embodiments, the selection of step S202 includes: denoising the K interest sampling points by using the number of neighboring points around each interest sampling point and the distances from those neighbors to it; a quality score for each interest sampling point is obtained through feature learning, and according to the scores the top K' quality sampling points are selected from the K interest sampling points. This effectively removes sampling points that are themselves noise or whose spatial structure information is sparse, facilitating the subsequent weighting of neighboring points by their contributions.
In some embodiments, the weighting of step S203 includes: for each of the K' quality sampling points, randomly sampling m neighboring points within a spherical region of radius r centered on it, taking the features of the quality sampling point, the features of the neighbors, and the relative coordinates of the neighbors with respect to the quality sampling point as input to compute each neighbor's contribution to the quality sampling point, and multiplying each neighbor's feature by its contribution to obtain the weighted feature of that neighbor. This yields the weighted feature matrix of the neighboring points, ready for the subsequent point-update operation.

In some embodiments, the contribution in step S203 ranges from 0 to 1, which makes the weighting results more reasonable.
In some embodiments, updating the sampling points in the initial target point cloud set in step S204 includes: applying a MaxPooling operation that takes, on each channel of the weighted neighboring-point features, the most salient value to generate a new feature, thereby forming the process target point cloud set.

This enables the process target point cloud set to perceive the spatial structure well and makes the prediction of the center point of the final three-dimensional detection box more accurate, while the downsampling operation reduces the number of sampling points to process and improves speed, achieving both efficiency and accuracy.
In some embodiments, step S205 operates as follows: the process target point cloud set is taken as input and its points are downsampled by the joint use of distance-based farthest point sampling and eigenvalue-based farthest point sampling to obtain K new interest sampling points; operations S202-S204 are then performed to obtain a second process target point cloud set or the sampling target point cloud set, and when a second process target point cloud set is obtained, step S205 is repeated until the sampling target point cloud set is finally obtained.

Repeating the operation as needed can further improve the prediction of the center point of the final three-dimensional detection box.
In some embodiments, step S205 is performed one to four times. The number of points remaining after each downsampling can be set individually, and one to four repetitions distribute the downsampled point counts more reasonably, keeping both the computation time and the prediction accuracy of the final three-dimensional detection box within a good range. Too many repetitions easily lead to excessive runtime, while too few easily lead to insufficient accuracy.
In some embodiments, step S300 further includes penalizing mispredicted foreground or background labels with a focal loss function. As an auxiliary supervision signal, the focal loss effectively supervises the accuracy of the final prediction.
In some embodiments, generating the three-dimensional detection box in step S400 includes: first predefining the length, width, and height of the box, then taking the features of the predicted center point as input and using a fully connected layer to output the residuals relative to the predefined length, width, and height together with the rotation angle, thereby obtaining the final three-dimensional detection box. This makes the generated box better match the actual object.
In a second aspect, the present application provides a system for assisting point cloud target detection, the system comprising:

an extraction module configured to segment the initial target point cloud set from the overall point cloud scene;

a feature aggregation module configured to perform feature aggregation on the initial target point cloud set to obtain the spatial structure information of each sampling point in the initial target point cloud set, and to assign the spatial structure information to the corresponding sampling points to obtain the sampling target point cloud set;

a feature propagation module configured to expand the sampling target point cloud set back to the initial target point cloud set through three-point linear interpolation, update the features of the points in the initial target point cloud set by interpolation, and, based on those features, use a fully connected layer to output the probability that each sampling point in the initial target point cloud set is a foreground point or a background point;

a detection box generation module configured to feed the sampling target point cloud set into a multilayer perceptron, output the offset from the sampling target point cloud set to the corresponding real object center point, add the offset to the features of the sampling target point cloud set to obtain the features of the predicted center point, and finally generate a three-dimensional target detection box from the features of the predicted center point.
In some embodiments, the feature aggregation module includes:

a downsampling module configured to downsample the initial target point cloud set to obtain K interest sampling points;

a point selection module configured to select from the interest sampling points to obtain K' quality sampling points;

a weighting module configured to weight the neighboring points around each quality sampling point to obtain the weighted feature matrix of the neighboring points;

a sampling point update module configured to update the sampling point features in the initial target point cloud set with the weighted neighboring-point features to obtain the process target point cloud set;

a loop module configured to downsample the process target point cloud set to obtain K new interest sampling points and repeat operations S202-S204 to obtain a second process target point cloud set or the sampling target point cloud set, repeating step S205 when a second process target point cloud set is obtained until the sampling target point cloud set is finally obtained.
The combined action of the modules within the feature aggregation module enables the process target point cloud set to perceive the spatial structure well, makes the prediction of the center point of the final three-dimensional detection box more accurate, and, through the downsampling operation, reduces the number of sampling points to process and improves speed, achieving both efficiency and accuracy.
In a third aspect, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of the embodiments of the first aspect.
本申请实施例提供的一种辅助点云目标检测的方法及系统具有如下优势,1、通过将初始目标点云集进行特征聚合,以获得初始目标点云集中每个采样点的空间结构信息,并将空间结构信息赋值给对应的所述采样点,得到采样目标点云集,聚合过程采用下采样减少了处理的采样点数量,提高运算速度,使本方法更具高效性;2、通过将采样目标点云集通过三点线性插值扩张回初始目标点云集,并用插值法更新初始目标点云集的中点的特征,依据初始目标点云集的中点的特征,使用一层全连接层输出初始目标点云集中每个采样点是前景点还是背景点的概率,使得最终语义分割的效果更明显,从而使点云目标检测的显示结果更明显。3、将采样目标点云集输入到多层感知机中,输出采样目标点云集到对应的真实物体中心点的偏移量,再将偏移量与采样目标点云集的特征相加后得到预测中心点的特征,最后根据预测中心点的特征生成三维目标检测框,能够得到精确的三维目标检测框位置,使本方法更具精确性。The method and system for assisting point cloud target detection provided by the embodiments of the present application have the following advantages: 1. By performing feature aggregation on the initial target point cloud set, the spatial structure information of each sampling point in the initial target point cloud set is obtained, and The spatial structure information is assigned to the corresponding sampling points, and the sampling target point cloud set is obtained. In the aggregation process, downsampling is used to reduce the number of processing sampling points, improve the operation speed, and make the method more efficient; 2. By dividing the sampling target The point cloud set is expanded back to the initial target point cloud set through three-point linear interpolation, and the feature of the midpoint of the initial target point cloud set is updated by interpolation. According to the feature of the midpoint of the initial target point cloud set, a fully connected layer is used to output the initial target point cloud. Concentrating the probability of whether each sampling point is a foreground point or a background point makes the final semantic segmentation effect more obvious, so that the display result of point cloud target detection is more obvious. 3. Input the sampling target point cloud set into the multi-layer perceptron, output the offset of the sampling target point cloud set to the corresponding real object center point, and then add the offset and the characteristics of the sampling target point cloud set to obtain the prediction center Finally, the three-dimensional target detection frame is generated according to the characteristics of the predicted center point, and the accurate position of the three-dimensional target detection frame can be obtained, which makes the method more accurate.
Description of Drawings
Other features, objects, and advantages of the present application will become more apparent upon reading the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings:
FIG. 1 is an exemplary basic flowchart according to an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of the feature aggregation step in a method for assisting point cloud object detection according to an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of a system for assisting point cloud object detection according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of the feature aggregation module in a system for assisting point cloud object detection according to an embodiment of the present disclosure.
Detailed Description
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the related disclosure, not to limit it. It should also be noted that, for convenience of description, only the parts related to the relevant disclosure are shown in the drawings.
It should be noted that, where no conflict arises, the embodiments in the present application and the features in the embodiments may be combined with each other. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary basic flow of a method for assisting point cloud object detection according to the present disclosure.
As shown in FIG. 1, the basic flow includes:
Step S100: segment the initial target point cloud set from the overall point cloud scene.
In some embodiments of the present disclosure, a full-pixel semantic segmentation method is used to segment the initial target point cloud set from the overall point cloud scene; denote the initial target point cloud set as point set A. Semantic segmentation classifies each pixel at the pixel level, so that pixels of the same class are grouped together, which makes it easier to extract the initial target point cloud set.
Step S200: perform feature aggregation on the initial target point cloud set to obtain the spatial structure information of each sampling point in the initial target point cloud set, and assign the spatial structure information to the corresponding sampling point to obtain the sampled target point cloud set.
As shown in FIG. 2, in some embodiments of the present disclosure, a specific implementation of step S200 includes:
S201: downsample the initial target point cloud set to obtain K interest sampling points. Specifically: input point set A and downsample its sampling points by independent sampling, using farthest point sampling based on distance and farthest point sampling based on feature values; the downsampling yields K interest sampling points. Distance-based farthest point sampling better preserves the shape of the point cloud, while feature-value-based farthest point sampling captures as many foreground points as possible; sampling independently with both gives the resulting K interest sampling points the advantages of both, as in the sketch below.
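The following is a minimal sketch of farthest point sampling as one plausible reading of the downsampling in S201: the same routine gives the distance-based variant when run on xyz coordinates and the feature-value-based variant when run on per-point feature vectors. The function name, the NumPy implementation, and the point counts are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def farthest_point_sampling(points: np.ndarray, k: int) -> np.ndarray:
    """Select k indices from an (N, d) array by repeatedly taking the point
    farthest from the set already chosen (D-FPS when `points` holds xyz
    coordinates, F-FPS when it holds feature vectors)."""
    n = points.shape[0]
    chosen = np.zeros(k, dtype=np.int64)
    dist = np.full(n, np.inf)            # distance to nearest chosen point
    chosen[0] = np.random.randint(n)     # arbitrary seed point
    for i in range(1, k):
        delta = points - points[chosen[i - 1]]
        dist = np.minimum(dist, np.einsum("nd,nd->n", delta, delta))
        chosen[i] = int(np.argmax(dist)) # farthest from the current set
    return chosen

# Independent sampling as described for S201: D-FPS on coordinates and
# F-FPS on per-point features, keeping the two index sets separately.
xyz = np.random.rand(4096, 3).astype(np.float32)     # stand-in for point set A
feats = np.random.rand(4096, 32).astype(np.float32)  # stand-in features
idx_d = farthest_point_sampling(xyz, 512)    # distance-based interest points
idx_f = farthest_point_sampling(feats, 512)  # feature-value-based interest points
```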
S202: select K' quality sampling points from the interest sampling points. Specifically: denoise the K interest sampling points, and, using the number of neighboring points around each interest sampling point and the distances from those neighbors to it, learn a quality score for each interest sampling point through feature learning; according to the scores, select the top K' best-scoring points from the K interest sampling points. The K interest sampling points obtained in S201 may include points whose spatial structure information is sparse or points that are themselves noise; step S202 denoises them, obtains a quality score for each sampling point through feature learning, sorts the sampling points by score, and keeps the top K'. Compared with the interest sampling points, the quality sampling points retain the subset with richer spatial structure information and higher quality scores. A possible realization is sketched below.
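One way the learned quality score of S202 could be realized is to featurize each interest point by neighborhood statistics (neighbor count and mean neighbor distance, the two cues the text names) and score it with a small MLP; the exact statistics, the network shape, and the radius are assumptions for illustration, not specified by the patent.

```python
import torch
import torch.nn as nn

class QualityScorer(nn.Module):
    """Score each interest point from simple neighborhood statistics."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, xyz: torch.Tensor, radius: float = 0.2) -> torch.Tensor:
        # xyz: (K, 3) interest sampling points
        d = torch.cdist(xyz, xyz)                      # (K, K) pairwise distances
        mask = (d < radius) & (d > 0)                  # neighbors within the radius
        count = mask.sum(dim=1, keepdim=True).float()  # number of neighbors
        mean_d = torch.where(mask, d, torch.zeros_like(d)).sum(1, keepdim=True) \
                 / count.clamp(min=1)                  # mean neighbor distance
        return self.mlp(torch.cat([count, mean_d], dim=1)).squeeze(-1)  # (K,)

# Keep the top-K' scoring points as the quality sampling points.
scorer = QualityScorer()
pts = torch.rand(512, 3)
scores = scorer(pts)
quality_idx = scores.topk(256).indices   # K' = 256, an illustrative choice
```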
S203: weight the neighboring points around each quality sampling point to obtain the feature matrix of each weighted neighboring point. Specifically: for each of the K' quality sampling points, taking the point itself as the center, randomly sample m neighboring points within a spherical region of radius r; take the features of the quality sampling point, the features of the neighboring points, and the coordinates of the neighboring points relative to the quality sampling point as input to compute the contribution of each neighboring point to the quality sampling point; the weighted feature of each neighboring point is the product of its feature and its contribution. The contribution takes values in the range 0 to 1, which makes the weighting more reasonable and principled.
In this embodiment, let W ∈ R^(k'×m) be the matrix of contribution degrees of all neighboring points, and let F ∈ R^(k'×m×c) be the set of feature vectors of all neighboring points, where c is the depth of the neighboring point features. The feature matrix obtained after the neighbor-weighting module is F' = W · F, where the scalar weight of each neighbor multiplies its feature vector across all c channels, so that F' ∈ R^(k'×m×c).
S204: update the sampling point features in the initial target point cloud set with the weighted neighboring point features to obtain the process target point cloud set. Specifically: use a MaxPooling operation to take, on each channel of the weighted neighboring point features, the most salient value and generate a new feature, thereby forming the process target point cloud set, denoted point set B.
In this embodiment, for the feature matrix F' ∈ R^(k'×m×c) obtained in the previous step, the network applies a MaxPooling operation to take the most salient feature on each channel of F' and generate a new feature, which can be written as New_F ∈ R^(k'×c). MaxPooling also resolves the unordered nature of the point cloud. New_F is the feature set of point set B. A combined sketch of S203-S204 follows.
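A combined sketch of S203-S204 under stated assumptions: the contribution degree of each neighbor comes from a small MLP over the concatenated center feature, neighbor feature, and relative coordinates, squashed into [0, 1] by a sigmoid, after which F' = W · F is max-pooled over the m neighbors. Only the inputs, the [0, 1] weighting, and the MaxPooling are taken from the text; the layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class NeighborWeighting(nn.Module):
    """S203-S204 sketch: weight m neighbors per quality point, then max-pool."""
    def __init__(self, c: int):
        super().__init__()
        # Per-neighbor input: center feature (c) + neighbor feature (c) + relative xyz (3).
        self.weight_mlp = nn.Sequential(
            nn.Linear(2 * c + 3, 32), nn.ReLU(),
            nn.Linear(32, 1), nn.Sigmoid(),   # contribution degree in [0, 1]
        )

    def forward(self, center_feat, nbr_feat, rel_xyz):
        # center_feat: (k', c), nbr_feat: (k', m, c), rel_xyz: (k', m, 3)
        kp, m, c = nbr_feat.shape
        center = center_feat.unsqueeze(1).expand(kp, m, c)
        w = self.weight_mlp(torch.cat([center, nbr_feat, rel_xyz], dim=-1))  # (k', m, 1)
        f_weighted = w * nbr_feat             # F' = W * F, broadcast over channels
        new_f = f_weighted.max(dim=1).values  # MaxPooling over neighbors -> (k', c)
        return new_f                          # New_F, the features of point set B

# Illustrative shapes: k' = 256 quality points, m = 16 neighbors, c = 32 channels.
mod = NeighborWeighting(c=32)
out = mod(torch.rand(256, 32), torch.rand(256, 16, 32), torch.rand(256, 16, 3))
print(out.shape)  # torch.Size([256, 32])
```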
S205: downsample the process target point cloud set to obtain K new interest sampling points, and repeat operations S202-S204 to obtain the second process target point cloud set or the sampled target point cloud set; when the second process target point cloud set is obtained, repeat step S205 until the sampled target point cloud set is finally obtained. Specifically: input the process target point cloud set and downsample its sampling points by joint sampling, using farthest point sampling based on distance together with farthest point sampling based on feature values; then perform operations S202-S204 to obtain the second process target point cloud set or the sampled target point cloud set; when the second process target point cloud set is obtained, repeat step S205 until the sampled target point cloud set is finally obtained.
Step S205 is executed 1 to 4 times. The number of points kept after each downsampling can be configured, and 1 to 4 iterations allocate the ladder of downsampled point counts more reasonably, keeping both the computation time and the prediction accuracy of the final three-dimensional object detection frame in a good range. Too many iterations make the computation too slow; too few leave the accuracy insufficient.
In this embodiment, assume the number of repetitions is 2. Downsampling point set B then proceeds as follows: distance-based farthest point sampling first yields point set C, and feature-value-based farthest point sampling yields point set D; steps S202-S204 are performed on point sets C and D to obtain their feature sets, and point sets C and D together constitute the second process target point cloud set. The second process point cloud set is then downsampled again: point set C, obtained by distance-based farthest point sampling, is sampled again by distance-based farthest point sampling to give point set E, and point set D, obtained by feature-value-based farthest point sampling, is sampled again by feature-value-based farthest point sampling to give point set G; steps S202-S204 are performed on point sets E and G to obtain their feature sets, and point sets E and G together constitute the sampled target point cloud set. A sketch of this two-round cascade follows.
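The two-round cascade could be wired up roughly as below, reusing the farthest_point_sampling function from the sketch after S201 (assumed to be in scope here); the per-stage point counts are illustrative, since the patent leaves each downsampling size configurable, and the S202-S204 refresh is indicated only by a comment.

```python
# Reuses farthest_point_sampling() from the S201 sketch above.
import numpy as np

def cascade(a_xyz, a_feat, sizes=(1024, 256)):
    """B -> (C, D) -> (E, G): D-FPS branch on coordinates, F-FPS branch on features."""
    d_xyz, d_feat = a_xyz, a_feat   # distance-based branch state
    f_xyz, f_feat = a_xyz, a_feat   # feature-value-based branch state
    for k in sizes:
        i_d = farthest_point_sampling(d_xyz, k)   # point set C, then E
        i_f = farthest_point_sampling(f_feat, k)  # point set D, then G
        d_xyz, d_feat = d_xyz[i_d], d_feat[i_d]
        f_xyz, f_feat = f_xyz[i_f], f_feat[i_f]
        # S202-S204 (selection, weighting, max-pool update) would refresh
        # d_feat / f_feat here at every stage.
    return (d_xyz, d_feat), (f_xyz, f_feat)  # (E, G) after two rounds
```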
As shown in FIG. 1, the basic flow further includes step S300: expand the sampled target point cloud set back to the initial target point cloud set through three-point linear interpolation, update the features of the points in the initial target point cloud set by interpolation, and, based on those features, use a fully connected layer to output the probability that each sampling point in the initial target point cloud set is a foreground point or a background point.
In this embodiment, the sampled target point cloud set input to this step is mainly point set E, obtained by distance-based farthest point sampling. Three-point linear interpolation (an existing technique, not elaborated here) expands point set E back to the higher-resolution point set C and updates the features of the points in point set C; point set C is then expanded by three-point linear interpolation back to the still higher-resolution point set B, whose point features are updated; finally, point set B is expanded by three-point linear interpolation back to the original high-resolution point cloud scene, point set A, whose point features are updated. A fully connected layer then outputs the probability that each point is a foreground point or a background point. An illustrative implementation of the interpolation follows.
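A sketch of three-point interpolation for feature propagation, assuming the common inverse-distance-weighted formulation in which each high-resolution point averages the features of its three nearest low-resolution points; the patent names the technique as existing art without giving formulas, so the weighting below is an assumption.

```python
import torch

def three_point_interpolate(dst_xyz, src_xyz, src_feat):
    """Propagate features from a sparse set (src) back to a dense set (dst).

    dst_xyz: (N, 3), src_xyz: (S, 3), src_feat: (S, c) -> (N, c). Each dst
    point is interpolated from its 3 nearest src points with inverse-distance
    weights."""
    d = torch.cdist(dst_xyz, src_xyz)              # (N, S) pairwise distances
    dist3, idx3 = d.topk(3, dim=1, largest=False)  # 3 nearest src points
    w = 1.0 / (dist3 + 1e-8)                       # inverse-distance weights
    w = w / w.sum(dim=1, keepdim=True)             # normalize per dst point
    return (src_feat[idx3] * w.unsqueeze(-1)).sum(dim=1)  # (N, c)

# Expand E -> C -> B -> A, updating features at each stage (shapes illustrative).
e_xyz, e_feat = torch.rand(64, 3), torch.rand(64, 128)
c_xyz = torch.rand(256, 3)
c_feat = three_point_interpolate(c_xyz, e_xyz, e_feat)  # features for point set C
```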
In some implementations of this embodiment, a focal loss function can also be used to penalize incorrectly predicted foreground or background labels. As an auxiliary supervision signal, the focal loss effectively supervises the accuracy of the final prediction; a minimal sketch follows.
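A minimal sketch of a binary focal loss for the foreground/background head, assuming the standard formulation with focusing parameter gamma and balance factor alpha; the patent names the loss but fixes no hyperparameters, so the values below are illustrative.

```python
import torch
import torch.nn.functional as F

def binary_focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """logits: (N,) raw scores; targets: (N,) in {0., 1.} (1 = foreground)."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)        # prob of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()  # down-weight easy points
```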
Step S400: input the sampled target point cloud set into a multilayer perceptron, output the offset from the sampled target point cloud set to the corresponding real object center point, add the offset to the features of the sampled target point cloud set to obtain the features of the predicted center point, and finally generate a three-dimensional object detection frame from the features of the predicted center point.
In this embodiment, the sampled target point cloud set used here is mainly point set G, obtained by feature-value-based farthest point sampling. The features of point set G are input into a multilayer perceptron (MLP), which outputs the offset from point set G to the corresponding real object center point; the offset is added to the coordinates of point set G to obtain the predicted center point positions. The features of the predicted center points are obtained by applying the neighbor weighting of step S203 and the sampling point update of step S204 at those positions.
In some implementations of this embodiment, the three-dimensional object detection frame can be generated by first predefining the length, width, and height of the frame, then taking the features of the predicted center point as input and using a fully connected layer to output the deltas relative to the predefined length, width, and height together with a rotation angle, which gives the final three-dimensional object detection frame; a sketch follows.
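A sketch of the center-shift and box heads of S400 under stated assumptions: a shared MLP regresses a 3-D offset for each point of set G, and a fully connected layer regresses the length/width/height deltas and a rotation angle against a predefined anchor size. The layer widths, the anchor values, and the way the re-aggregated center features are passed in are all illustrative.

```python
import torch
import torch.nn as nn

ANCHOR_LWH = torch.tensor([3.9, 1.6, 1.56])  # predefined l, w, h (illustrative)

class CenterAndBoxHead(nn.Module):
    def __init__(self, c: int):
        super().__init__()
        self.offset_mlp = nn.Sequential(nn.Linear(c, 64), nn.ReLU(), nn.Linear(64, 3))
        self.box_fc = nn.Linear(c, 4)  # (dl, dw, dh, rotation angle)

    def forward(self, g_xyz, g_feat, center_feat):
        # g_xyz: (K, 3) coords of point set G; g_feat: (K, c) its features;
        # center_feat: (K, c) features re-aggregated at the predicted centers
        # (via the S203/S204 weighting-and-pooling around each center).
        offset = self.offset_mlp(g_feat)          # shift toward object centers
        center = g_xyz + offset                   # predicted center positions
        dl_dw_dh_theta = self.box_fc(center_feat)
        lwh = ANCHOR_LWH + dl_dw_dh_theta[:, :3]  # anchor size + regressed delta
        theta = dl_dw_dh_theta[:, 3:4]            # yaw rotation
        return torch.cat([center, lwh, theta], dim=1)  # (K, 7) boxes

head = CenterAndBoxHead(c=128)
boxes = head(torch.rand(64, 3), torch.rand(64, 128), torch.rand(64, 128))
print(boxes.shape)  # torch.Size([64, 7])
```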
To implement the above method, the present application further provides a system for assisting point cloud object detection.
As shown in FIG. 3, a system 500 for assisting point cloud object detection includes an extraction module 510, a feature aggregation module 520, a feature propagation module 530, and a detection frame generation module 540, wherein:
The extraction module 510 is used to segment the initial target point cloud set from the overall point cloud scene.
The feature aggregation module 520 is used to perform feature aggregation on the initial target point cloud set to obtain the spatial structure information of each sampling point in the initial target point cloud set, and assign the spatial structure information to the corresponding sampling point to obtain the sampled target point cloud set.
As shown in FIG. 4, in some embodiments of the present disclosure, the feature aggregation module 520 includes:
A downsampling module 521, used to downsample the initial target point cloud set to obtain K interest sampling points;
A point selection module 522, used to select K' quality sampling points from the interest sampling points;
A weighting module 523, used to weight the neighboring points around each quality sampling point to obtain the feature matrix of each weighted neighboring point;
A sampling point update module 524, used to update the sampling point features in the initial target point cloud set with the weighted neighboring point features to obtain the process target point cloud set;
A loop module 525, used to downsample the process target point cloud set to obtain K new interest sampling points and repeat operations S202-S204 to obtain the second process target point cloud set or the sampled target point cloud set; when the second process target point cloud set is obtained, step S205 is repeated until the sampled target point cloud set is finally obtained.
As shown in FIG. 3, the system 500 for assisting point cloud object detection further includes a feature propagation module 530, used to expand the sampled target point cloud set back to the initial target point cloud set through three-point linear interpolation, update the features of the points in the initial target point cloud set by interpolation, and, based on those features, use a fully connected layer to output the probability that each sampling point in the initial target point cloud set is a foreground point or a background point.
The detection frame generation module 540 is used to input the sampled target point cloud set into a multilayer perceptron, output the offset from the sampled target point cloud set to the corresponding real object center point, add the offset to the features of the sampled target point cloud set to obtain the features of the predicted center point, and finally generate a three-dimensional object detection frame from the features of the predicted center point.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. The computer-readable storage medium described in this application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this application, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer-readable storage medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, or any suitable combination of the above.
Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or sometimes in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules described in the embodiments of the present application may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, a processor may be described as including an acquisition module, an analysis module, and an output module, where the names of these modules do not, in some cases, constitute a limitation on the modules themselves.
The above description is only a preferred embodiment of the present application and an illustration of the technical principles applied. Those skilled in the art should understand that the scope of the disclosure involved in this application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the disclosed concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) this application.

Claims (14)

  1. A method for assisting point cloud object detection, characterized in that the steps of the method include:
    S100: segmenting an initial target point cloud set from an overall point cloud scene;
    S200: performing feature aggregation on the initial target point cloud set to obtain spatial structure information of each sampling point in the initial target point cloud set, and assigning the spatial structure information to the corresponding sampling point to obtain a sampled target point cloud set;
    S300: expanding the sampled target point cloud set back to the initial target point cloud set through three-point linear interpolation, updating the features of the points in the initial target point cloud set by interpolation, and, based on those features, using a fully connected layer to output the probability that each sampling point in the initial target point cloud set is a foreground point or a background point;
    S400: inputting the sampled target point cloud set into a multilayer perceptron, outputting the offset from the sampled target point cloud set to the corresponding real object center point, adding the offset to the features of the sampled target point cloud set to obtain the features of a predicted center point, and finally generating a three-dimensional object detection frame from the features of the predicted center point.
  2. The method according to claim 1, characterized in that the feature aggregation in step S200 includes:
    S201: downsampling the initial target point cloud set to obtain K interest sampling points;
    S202: selecting K' quality sampling points from the interest sampling points;
    S203: weighting the neighboring points around each quality sampling point to obtain the feature matrix of each weighted neighboring point;
    S204: updating the sampling point features in the initial target point cloud set with the weighted neighboring point features to obtain a process target point cloud set;
    S205: downsampling the process target point cloud set to obtain K new interest sampling points and repeating operations S202-S204 to obtain a second process target point cloud set or the sampled target point cloud set; when the second process target point cloud set is obtained, repeating step S205 until the sampled target point cloud set is finally obtained.
  3. The method according to claim 2, characterized in that the downsampling in step S201 includes: inputting the initial target point cloud set, and downsampling the sampling points in the initial target point cloud set by independent sampling using distance-based farthest point sampling or feature-value-based farthest point sampling, obtaining K interest sampling points after downsampling.
  4. The method according to claim 2, characterized in that the selection in step S202 includes: denoising the K interest sampling points; using the number of neighboring points around each interest sampling point and the distances from those neighbors to the interest sampling point, obtaining a quality score for each interest sampling point through feature learning; and, according to the quality scores, selecting the top K' best-scoring quality sampling points from the K interest sampling points.
  5. The method according to claim 2, characterized in that the weighting in step S203 includes: for each of the K' quality sampling points, taking the point itself as the center, randomly sampling m neighboring points within a spherical region of radius r; taking the features of the quality sampling point, the features of the neighboring points, and the coordinates of the neighboring points relative to the quality sampling point as input to compute the contribution of each neighboring point to the quality sampling point; and multiplying the feature of each neighboring point by its contribution to obtain the weighted feature of each neighboring point.
  6. The method according to claim 5, characterized in that the contribution in step S203 takes values in the range 0 to 1.
  7. The method according to claim 2, characterized in that the updating of the sampling points in the initial target point cloud set in step S204 includes: using a MaxPooling operation to take, on each channel of the weighted neighboring point features, the most salient feature and generate a new feature, thereby forming the process target point cloud set.
  8. The method according to claim 2, characterized in that the specific operation of step S205 is: inputting the process target point cloud set, downsampling its sampling points by joint sampling using distance-based farthest point sampling and feature-value-based farthest point sampling, and then performing operations S202-S204 to obtain the second process target point cloud set or the sampled target point cloud set; when the second process target point cloud set is obtained, repeating step S205 until the sampled target point cloud set is finally obtained.
  9. The method according to claim 8, characterized in that step S205 is executed 1 to 4 times.
  10. The method according to claim 1, characterized in that step S300 further includes: penalizing incorrectly predicted foreground or background labels with a focal loss function.
  11. The method according to claim 1, characterized in that the generation of the three-dimensional object detection frame in step S400 includes: first predefining the length, width, and height of the three-dimensional object detection frame, then taking the features of the predicted center point as input and using a fully connected layer to output the deltas relative to the predefined length, width, and height together with a rotation angle, thereby obtaining the final three-dimensional object detection frame.
  12. A system for assisting point cloud object detection, characterized in that the system includes:
    an extraction module, configured to segment an initial target point cloud set from an overall point cloud scene;
    a feature aggregation module, configured to perform feature aggregation on the initial target point cloud set to obtain spatial structure information of each sampling point in the initial target point cloud set, and assign the spatial structure information to the corresponding sampling point to obtain a sampled target point cloud set;
    a feature propagation module, configured to expand the sampled target point cloud set back to the initial target point cloud set through three-point linear interpolation, update the features of the points in the initial target point cloud set by interpolation, and, based on those features, use a fully connected layer to output the probability that each sampling point in the initial target point cloud set is a foreground point or a background point;
    a detection frame generation module, configured to input the sampled target point cloud set into a multilayer perceptron, output the offset from the sampled target point cloud set to the corresponding real object center point, add the offset to the features of the sampled target point cloud set to obtain the features of a predicted center point, and finally generate a three-dimensional object detection frame from the features of the predicted center point.
  13. The system for assisting point cloud object detection according to claim 12, characterized in that the feature aggregation module includes:
    a downsampling module, configured to downsample the initial target point cloud set to obtain K interest sampling points;
    a point selection module, configured to select K' quality sampling points from the interest sampling points;
    a weighting module, configured to weight the neighboring points around each quality sampling point to obtain the feature matrix of each weighted neighboring point;
    a sampling point update module, configured to update the sampling point features in the initial target point cloud set with the weighted neighboring point features to obtain a process target point cloud set;
    a loop module, configured to downsample the process target point cloud set to obtain K new interest sampling points and repeat operations S202-S204 to obtain a second process target point cloud set or the sampled target point cloud set; when the second process target point cloud set is obtained, step S205 is repeated until the sampled target point cloud set is finally obtained.
  14. A computer-readable storage medium in which a computer program is stored, which, when executed by a processor, implements the method according to any one of claims 1-11.
PCT/CN2021/074199 2020-12-31 2021-01-28 Method and system for assisting point cloud-based object detection WO2022141718A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011633104.8 2020-12-31
CN202011633104.8A CN112734931B (en) 2020-12-31 2020-12-31 Method and system for assisting point cloud target detection

Publications (1)

Publication Number Publication Date
WO2022141718A1 (en) 2022-07-07

Family ID: 75608394

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/074199 WO2022141718A1 (en) 2020-12-31 2021-01-28 Method and system for assisting point cloud-based object detection

Country Status (2)

Country Link
CN (1) CN112734931B (en)
WO (1) WO2022141718A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113284163B (en) * 2021-05-12 2023-04-07 Xi'an Jiaotong University Three-dimensional target self-adaptive detection method and system based on vehicle-mounted laser radar point cloud

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9582939B2 (en) * 2015-06-11 2017-02-28 Nokia Technologies Oy Structure preserved point cloud simplification
CN110969210A (en) * 2019-12-02 2020-04-07 CETC Special Aircraft System Engineering Co., Ltd. Small and slow target identification and classification method, device, equipment and storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105989604A (en) * 2016-02-18 2016-10-05 Hefei University of Technology Target object three-dimensional color point cloud generation method based on KINECT
US20180276885A1 (en) * 2017-03-27 2018-09-27 3Dflow Srl Method for 3D modelling based on structure from motion processing of sparse 2D images
CN110632608A (en) * 2018-06-21 2019-12-31 Beijing Jingdong Shangke Information Technology Co., Ltd. Target detection method and device based on laser point cloud
CN110032962A (en) * 2019-04-03 2019-07-19 Tencent Technology (Shenzhen) Co., Ltd. Object detection method and apparatus, network device, and storage medium
CN110991468A (en) * 2019-12-13 2020-04-10 Shenzhen SenseTime Technology Co., Ltd. Three-dimensional target detection and intelligent driving method, device and equipment
CN111753698A (en) * 2020-06-17 2020-10-09 Southeast University Multi-mode three-dimensional point cloud segmentation system and method
CN111915746A (en) * 2020-07-16 2020-11-10 Beijing Institute of Technology Weak-labeling-based three-dimensional point cloud target detection method and labeling tool

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116051633A (en) * 2022-12-15 2023-05-02 Tsinghua University 3D point cloud target detection method and device based on weighted relation perception
CN116051633B (en) * 2022-12-15 2024-02-13 Tsinghua University 3D point cloud target detection method and device based on weighted relation perception

Also Published As

Publication number Publication date
CN112734931A (en) 2021-04-30
CN112734931B (en) 2021-12-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21912498

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21912498

Country of ref document: EP

Kind code of ref document: A1