WO2023193400A1 - Point cloud detection and segmentation method and apparatus, and electronic device - Google Patents

Point cloud detection and segmentation method and apparatus, and electronic device Download PDF

Info

Publication number
WO2023193400A1
WO2023193400A1 (PCT/CN2022/117322, CN2022117322W)
Authority
WO
WIPO (PCT)
Prior art keywords
point cloud
columnar
voxel
segmentation
features
Prior art date
Application number
PCT/CN2022/117322
Other languages
English (en)
Chinese (zh)
Inventor
赵天坤
唐佳
Original Assignee
合众新能源汽车股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 合众新能源汽车股份有限公司
Publication of WO2023193400A1

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details

Definitions

  • the present application relates to the field of computer technology, and in particular to point cloud detection and segmentation methods and devices, as well as electronic equipment and computer-readable storage media.
  • Point cloud data refers to a set of vectors in a three-dimensional coordinate system. Spatial information is recorded in the form of points, and each point contains three-dimensional coordinates. Depending on the collection capabilities of the point cloud acquisition device, some point cloud data may also contain color information (RGB) or reflection intensity information (intensity). Taking point cloud data collected by lidar as an example, the data includes the position coordinates and reflection intensity of points in three-dimensional space. Point cloud data is widely used for target detection and recognition in the field of autonomous driving, for example by autonomous cars and drones. In such applications, point cloud detection and segmentation techniques are usually used to perform target object detection and point cloud segmentation based on the point cloud data.
  • point cloud detection refers to processing point cloud data to detect the position of the target object in the scene represented by the point cloud data;
  • point cloud segmentation refers to identifying the semantic category of the target object matched by each point in the point cloud data, so as to facilitate subsequent autonomous driving control.
  • the embodiment of the present application provides a point cloud detection and segmentation method, which helps to improve the efficiency of point cloud detection and point cloud segmentation.
  • embodiments of the present application provide a point cloud detection and segmentation method, including: performing columnar voxelization processing on the point cloud to be processed to obtain a number of columnar voxels that constitute the point cloud to be processed; performing feature extraction and mapping on the columnar voxels to obtain the voxel features of the point cloud to be processed; mapping the voxel features to a bird's-eye view to obtain the bird's-eye view features corresponding to the point cloud to be processed; performing feature extraction on the bird's-eye view features through the backbone network of a pre-trained multi-task neural network to obtain a point cloud feature vector; through the point cloud detection network branch of the multi-task neural network, detecting the target object based on the point cloud feature vector and outputting the point cloud detection result; and, through the point cloud segmentation network branch of the multi-task neural network, performing point cloud segmentation based on the point cloud feature vector and outputting the point cloud segmentation result.
  • embodiments of the present application provide a point cloud detection and segmentation device, including:
  • a columnar voxelization module used to perform columnar voxelization processing on the point cloud to be processed, and obtain a number of columnar voxels that constitute the point cloud to be processed;
  • a voxel feature acquisition module used to perform feature extraction and mapping on the plurality of columnar voxels, and obtain the voxel features of the point cloud to be processed;
  • a bird's-eye view feature mapping module used to map the voxel features to a bird's-eye view to obtain the bird's-eye view features corresponding to the point cloud to be processed;
  • a point cloud feature extraction module is used to extract features of the bird's-eye view features through the backbone network of a pre-trained multi-task neural network to obtain a point cloud feature vector;
  • a point cloud detection and segmentation module is configured to perform target object detection based on the point cloud feature vector through the point cloud detection network branch of the multi-task neural network and output the point cloud detection result, and to perform point cloud segmentation based on the point cloud feature vector through the point cloud segmentation network branch of the multi-task neural network and output the point cloud segmentation result.
  • embodiments of the present application also disclose an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor.
  • when the processor executes the computer program, the point cloud detection and segmentation method described in the embodiments of this application is implemented.
  • embodiments of the present application provide a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the point cloud detection and segmentation method disclosed in the embodiments of the present application are implemented.
  • the point cloud detection and segmentation method disclosed in the embodiments of the present application performs columnar voxelization processing on the point cloud to be processed to obtain a number of columnar voxels that constitute the point cloud; it then performs feature extraction and mapping on these columnar voxels to obtain the voxel features of the point cloud, and maps the voxel features to a bird's-eye view to obtain the bird's-eye view features corresponding to the point cloud; finally, the backbone network of a pre-trained multi-task neural network extracts features from the bird's-eye view features to obtain a point cloud feature vector, the point cloud detection network branch of the multi-task neural network detects target objects based on the point cloud feature vector and outputs the point cloud detection result, and the point cloud segmentation network branch of the multi-task neural network performs point cloud segmentation based on the point cloud feature vector and outputs the point cloud segmentation result. This helps to improve the efficiency of point cloud detection and point cloud segmentation.
  • Figure 1 is a schematic flow chart of the point cloud detection and segmentation method in Embodiment 1 of the present application
  • Figure 2 is a schematic diagram of the effect of point cloud voxelization processing in Embodiment 1 of the present application.
  • Figure 3 is a schematic structural diagram of the multi-task neural network used in Embodiment 1 of the present application.
  • Figure 4 is a schematic diagram of point cloud segmentation result mapping in Embodiment 1 of the present application.
  • Figure 5 is one of the structural schematic diagrams of the point cloud detection and segmentation device in Embodiment 2 of the present application.
  • Figure 6 is the second structural schematic diagram of the point cloud detection and segmentation device in Embodiment 2 of the present application.
  • Figure 7 schematically shows a block diagram of an electronic device for performing a method according to the present application.
  • Figure 8 schematically shows a storage unit for holding or carrying program code for implementing the method according to the present application.
  • An embodiment of the present application discloses a point cloud detection and segmentation method, as shown in Figure 1.
  • the method includes: steps 110 to 150.
  • Step 110: Perform columnar voxelization processing on the point cloud to be processed, and obtain a number of columnar voxels that constitute the point cloud to be processed.
  • the point cloud to be processed described in the embodiment of this application is: the point cloud in the area of interest in the point cloud collected by a point cloud collection device (such as a lidar sensor).
  • the original point cloud collected by the lidar sensor installed on the vehicle is a data set of several disordered points, where each point can be represented by data with a dimension of 4, for example expressed as (x, y, z, i), where x, y, z are the spatial position coordinates of the point and i represents its reflection intensity.
  • for the original point cloud collected by the point cloud acquisition device, point cloud preprocessing first needs to be performed to obtain a point set that meets the requirements. For example, the nan (null) values are removed from the original point cloud, or points with very large values are removed, so as to filter point cloud noise.
  • for the specific implementation of point cloud preprocessing, please refer to the prior art. The embodiments of this application do not limit the technical solution adopted for point cloud preprocessing, and it will not be described again here.
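  • As a rough illustration of such preprocessing (not part of this application's disclosure), the sketch below assumes the point cloud is an N x 4 NumPy array of (x, y, z, i) values and uses an illustrative 200 m distance cutoff; the actual filtering rules and thresholds are left open by the application.

    import numpy as np

    def preprocess_point_cloud(points: np.ndarray, max_range: float = 200.0) -> np.ndarray:
        """Remove nan rows and points with implausibly large coordinates.

        points: (N, 4) array of (x, y, z, intensity).
        max_range: illustrative cutoff in meters (an assumption, not stated in the application).
        """
        valid = ~np.isnan(points).any(axis=1)        # drop rows containing nan (null) values
        points = points[valid]
        dist = np.linalg.norm(points[:, :2], axis=1)
        return points[dist <= max_range]             # drop points that are implausibly far away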
  • the point cloud collected by point cloud collection equipment is a point cloud in a three-dimensional irregular spatial area.
  • the data of the points located in the area of interest within the previously determined large cube area is obtained, so as to facilitate subsequent point cloud detection and point cloud segmentation of the point cloud in the area of interest.
  • the coordinates of points within the area of interest can be expressed as (x, y, z), where xmin ≤ x ≤ xmax, ymin ≤ y ≤ ymax, zmin ≤ z ≤ zmax, and the unit is meters.
  • points in the region of interest are determined based on point cloud quality. For example, if the point cloud far away from the vehicle is sparse and the number of points hitting a vehicle there is small, the minimum number of points can be set to a small value (for example, 5); the farthest distance at which this number of points is still obtained is then found, and a spatial area is determined based on this maximum-distance point. In some embodiments of the present application, for the same point cloud quality (such as point clouds collected by the same point cloud collection device), this distance can be predetermined from the quality of the collected point cloud data and does not change during the application process.
  • the method of determining the region of interest please refer to the method of determining the region of interest used in point cloud detection or point cloud segmentation solutions in the prior art.
  • the specific implementation method of determining the region of interest is not limited.
  • in some embodiments, performing columnar voxelization processing on the point cloud to be processed to obtain a number of columnar voxels that constitute the point cloud to be processed includes: dividing the points in the point cloud to be processed into several columnar voxels according to their coordinate distribution along the first coordinate axis and the second coordinate axis.
  • the first coordinate axis and the second coordinate axis are two different coordinate axes of a three-dimensional spatial coordinate system
  • the columnar voxels are prismatic voxels. For example, after the point cloud shown on the left side in Figure 2 is voxelized, a cuboid voxel (ie, a columnar voxel) 210 shown on the right side in Figure 2 can be obtained.
  • each voxel can be expressed as [x_v, y_v, zmax - zmin], where x_v represents the length of the voxel along the x-axis, y_v represents the length of the voxel along the y-axis, and zmax - zmin represents the height of the voxel along the z-axis, in meters.
  • in this way, the region of interest can be divided into W × H columnar voxels, where W and H are determined by the extent of the region of interest along the x-axis and y-axis and by the voxel sizes x_v and y_v.
  • for example, the area of interest is divided into 512 × 250 columnar voxels. Subsequently, these columnar voxels are regarded as image pixels and used for feature extraction of the region of interest.
  • in this way, the point cloud of the area of interest can be represented as a voxel image of dimension W × H × 1.
  • the size of the columnar voxels is determined experimentally. For example, several candidate voxel sizes can be preset, point cloud detection and point cloud segmentation experiments can be conducted for each, the impact of voxel size on the detection and segmentation results and performance can be analyzed, and the optimal voxel size can finally be determined.
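  • For illustration only, the sketch below assigns each point of the region of interest to a columnar voxel ("pillar") index on the x-y grid; the coordinate ranges and the 0.2 m pillar size are assumptions chosen so that the example yields the 512 × 250 grid mentioned above, not values prescribed by the application.

    import numpy as np

    def assign_columnar_voxels(points, x_range=(0.0, 102.4), y_range=(-25.0, 25.0), voxel_xy=(0.2, 0.2)):
        """Return the (W, H) grid shape, the (ix, iy) pillar index of each kept point, and the kept points.

        The coordinate ranges and the 0.2 m pillar size are illustrative assumptions.
        """
        xv, yv = voxel_xy
        W = int(round((x_range[1] - x_range[0]) / xv))   # 512 with these assumed bounds
        H = int(round((y_range[1] - y_range[0]) / yv))   # 250 with these assumed bounds
        ix = np.floor((points[:, 0] - x_range[0]) / xv).astype(np.int64)
        iy = np.floor((points[:, 1] - y_range[0]) / yv).astype(np.int64)
        inside = (ix >= 0) & (ix < W) & (iy >= 0) & (iy < H)   # keep only points inside the ROI
        return (W, H), np.stack([ix[inside], iy[inside]], axis=1), points[inside]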
  • in some embodiments, the method further includes: obtaining the first point cloud segmentation label of the plurality of columnar voxels, where the first point cloud segmentation label includes the position information of each columnar voxel. The W × H columnar voxels obtained by division form a voxel image with a voxel dimension of W × H × 1, and the first point cloud segmentation label of this voxel image, i.e. of the above W × H columnar voxels, can be represented by a position information table of size W × H, for example expressed as (W, H, 1).
  • the first point cloud segmentation label is used to subsequently determine the segmentation result of the point cloud based on the segmentation result of the columnar voxels.
  • in some embodiments, obtaining the first point cloud segmentation labels of the several columnar voxels includes: for each columnar voxel, using the position information of the columnar voxel as the first point cloud segmentation label matched by that columnar voxel. The first point cloud segmentation label can be represented by a position information table of size W × H, for example expressed as (W, H, 1).
  • the position information table includes W × H sets of position information, and each set of position information corresponds to a columnar voxel.
  • each set of position information is used to represent the coordinate range of the corresponding columnar voxel on the x-axis and y-axis.
  • each set of position information can also be used to represent the coordinate range of points in the point cloud divided into columnar voxels corresponding to the set of position information.
  • the mapping relationship between the points in the point cloud and the columnar voxel can be established by recording the coordinate range of the corresponding columnar voxel in the position information table.
  • other methods may be used to establish the mapping relationship between points in the point cloud and columnar voxels.
  • the specific expression form of the mapping relationship is not limited.
  • Step 120: Perform feature extraction and mapping on the plurality of columnar voxels to obtain voxel features of the point cloud to be processed.
  • after obtaining a number of columnar voxels that constitute the point cloud to be processed (such as the point cloud of the aforementioned area of interest), the columnar voxels can be regarded as pixels of an image, and feature extraction and mapping can be performed on the voxel image composed of these columnar voxels to obtain the features of the voxel image. Since the features of the voxel image are extracted based on the distribution data of the points within the columnar voxels, they can fully express the features of the point cloud to be processed.
  • in some embodiments, performing feature extraction and mapping on the plurality of columnar voxels to obtain the voxel features of the point cloud to be processed includes: for each columnar voxel, obtaining the center point of all points divided into that columnar voxel, and calculating the coordinate distance between each point divided into the columnar voxel and the center point; for each columnar voxel, splicing the point features of all points divided into the columnar voxel into the voxel feature of that columnar voxel, where the point features of each point include the position coordinates and reflection intensity information of the point; splicing the voxel features of the several columnar voxels to obtain the splicing feature of the several columnar voxels; and performing feature mapping on the splicing feature to obtain the voxel features of the point cloud to be processed.
  • each columnar voxel will contain a certain number of points. Taking a columnar voxel that contains K points as an example, first calculate the average coordinate of these K points based on the position coordinates in the original point cloud data of these K points.
  • the features of a columnar voxel containing K points can be expressed as a feature of length K × 7; that is, the features of the columnar voxel can be represented by the point features of all included points.
  • the voxel characteristics can be obtained by voxelizing the point cloud to be processed.
  • the features of the N columnar voxels obtained after voxelization processing (such as the aforementioned features of length K × 7) are spliced to obtain a splicing feature of length N × K × 7.
  • the columnar voxel can be discarded.
  • the spliced feature can be feature-mapped through a pre-trained feature extraction network to obtain a feature of length N × D, where D represents the number of feature dimensions of each columnar voxel.
  • the feature extraction network can be constructed by serial connection of a fully connected layer, a normalization layer and a one-dimensional maximum pooling layer MaxPool1D.
  • in this way, N × D-dimensional features are output, where D is the output dimension of the fully connected layer.
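  • A minimal PyTorch sketch of such a mapping network is given below, assuming the spliced input has shape (N, K, 7) (N pillars, i.e. columnar voxels, each with K point features of length 7); the output width D = 64 and the ReLU activation are assumptions made for illustration, only the fully connected layer + normalization + MaxPool1D structure follows the description.

    import torch
    import torch.nn as nn

    class PillarFeatureNet(nn.Module):
        """Fully connected layer + normalization + 1D max pooling over the points of each pillar."""

        def __init__(self, in_dim: int = 7, out_dim: int = 64):
            super().__init__()
            self.linear = nn.Linear(in_dim, out_dim)
            self.norm = nn.BatchNorm1d(out_dim)
            self.relu = nn.ReLU()   # the activation is an assumption, not stated explicitly above

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (N, K, 7) spliced point features of N columnar voxels
            n, k, _ = x.shape
            x = self.linear(x)                                  # (N, K, D)
            x = self.relu(self.norm(x.transpose(1, 2)))         # (N, D, K), normalized over D
            x = nn.functional.max_pool1d(x, kernel_size=k)      # max over the K points of each pillar
            return x.squeeze(-1)                                # (N, D) voxel features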
  • Step 130: Map the voxel features to a bird's-eye view to obtain the bird's-eye view features corresponding to the point cloud to be processed.
  • in some embodiments, mapping the voxel features to a bird's-eye view to obtain the bird's-eye view features corresponding to the point cloud to be processed includes: obtaining the number of points included in each columnar voxel according to the position information of each columnar voxel in the first point cloud segmentation label; and, for each columnar voxel, mapping the feature corresponding to that columnar voxel in the voxel features to the corresponding position of the bird's-eye view that matches the first point cloud segmentation label according to the number of points included in the columnar voxel, so as to obtain the bird's-eye view features corresponding to the point cloud to be processed. Here, mapping the feature corresponding to the columnar voxel to the corresponding position of the bird's-eye view according to the number of points included in the columnar voxel includes: when the number of points included in the columnar voxel is greater than 0, mapping the feature vector corresponding to the columnar voxel in the voxel features to the corresponding position of the bird's-eye view that matches the first point cloud segmentation label; when the number of points included in the columnar voxel is equal to 0, setting the feature vector at the corresponding position of the bird's-eye view that matches the first point cloud segmentation label to 0.
  • each columnar voxel corresponds to a label data (i.e. a set of position information) in the first point cloud segmentation label.
  • for example, the first label data corresponds to the columnar voxel whose coordinates range from (0, 0) to (0.2, 0.2).
  • the feature of each pixel is represented by a D-dimensional feature vector, and each pixel corresponds to a columnar voxel.
  • in this way, the respective feature vectors of the N columnar voxels included in the voxel features can be mapped to the corresponding positions on the bird's-eye view to obtain a bird's-eye view feature of size W × H × D.
  • some columnar voxels may not have points.
  • at the positions of the bird's-eye view corresponding to columnar voxels that do not include any points, the feature vector can be set to a zero vector.
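  • The mapping described above can be sketched as a simple scatter operation: each pillar's feature vector is written at its (ix, iy) grid position, and cells of empty pillars stay zero. The sketch below assumes NumPy arrays and the pillar indices from the voxelization example above.

    import numpy as np

    def scatter_to_bev(pillar_feats: np.ndarray, pillar_coords: np.ndarray, grid_shape) -> np.ndarray:
        """pillar_feats: (N, D) voxel features; pillar_coords: (N, 2) integer (ix, iy) indices.

        Returns a (W, H, D) bird's-eye-view feature map; positions of empty voxels remain zero.
        """
        W, H = grid_shape
        bev = np.zeros((W, H, pillar_feats.shape[1]), dtype=pillar_feats.dtype)
        bev[pillar_coords[:, 0], pillar_coords[:, 1]] = pillar_feats
        return bev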
  • Step 140: Perform feature extraction on the bird's-eye view features through the backbone network of the pre-trained multi-task neural network to obtain a point cloud feature vector.
  • the multi-task neural network includes: a backbone network 310, a point cloud detection network branch 320 and a point cloud segmentation network branch 330.
  • the backbone network 310 may adopt a convolutional neural network commonly used in the prior art.
  • in some embodiments, the backbone network 310 further includes: three cascaded feature extraction modules of different scales and a feature concatenation layer (Concat), where each feature extraction module includes a different number of feature mapping modules (CBR), an upsampling layer, and a further feature mapping module (CBR).
  • the number of feature mapping modules (CBR) included in each feature extraction module can be 4, 6, and 6 respectively.
  • the feature mapping module (CBR) can be composed of a cascade of a convolution layer, a batch normalization layer, and a ReLU activation function.
  • the sizes of the features output by these three feature extraction modules are respectively
  • the feature splicing layer is used to splice the features output by the above three feature extraction modules.
  • the above three feature extraction modules perform convolution operation, upsampling, normalization and activation processing on the input bird's-eye view features respectively.
  • in the dimension of the obtained feature vector, C is the number of feature channels.
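  • A highly simplified PyTorch sketch of this backbone layout is shown below; only the CBR / upsampling / concatenation structure and the 4-6-6 block counts follow the description above, while the channel widths, strides and upsampling factors are assumptions made for illustration.

    import torch
    import torch.nn as nn

    def cbr(cin: int, cout: int, stride: int = 1) -> nn.Sequential:
        # Feature mapping module (CBR): convolution + batch normalization + ReLU.
        return nn.Sequential(
            nn.Conv2d(cin, cout, kernel_size=3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(cout),
            nn.ReLU(inplace=True),
        )

    class Backbone(nn.Module):
        def __init__(self, in_ch: int = 64, ch: int = 64):
            super().__init__()
            # Three feature extraction modules at different scales, with 4, 6 and 6 CBR blocks.
            self.stage1 = nn.Sequential(cbr(in_ch, ch, 2), *[cbr(ch, ch) for _ in range(3)])
            self.stage2 = nn.Sequential(cbr(ch, 2 * ch, 2), *[cbr(2 * ch, 2 * ch) for _ in range(5)])
            self.stage3 = nn.Sequential(cbr(2 * ch, 4 * ch, 2), *[cbr(4 * ch, 4 * ch) for _ in range(5)])
            # Each scale is upsampled and passed through one more CBR before concatenation.
            self.up1 = nn.Sequential(nn.Upsample(scale_factor=1), cbr(ch, ch))
            self.up2 = nn.Sequential(nn.Upsample(scale_factor=2), cbr(2 * ch, ch))
            self.up3 = nn.Sequential(nn.Upsample(scale_factor=4), cbr(4 * ch, ch))

        def forward(self, bev: torch.Tensor) -> torch.Tensor:   # bev: (B, in_ch, W, H)
            f1 = self.stage1(bev)    # 1/2 resolution
            f2 = self.stage2(f1)     # 1/4 resolution
            f3 = self.stage3(f2)     # 1/8 resolution
            # Concatenate the three rescaled feature maps along the channel dimension.
            return torch.cat([self.up1(f1), self.up2(f2), self.up3(f3)], dim=1)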
  • Step 150: Through the point cloud detection network branch of the multi-task neural network, perform target object detection based on the point cloud feature vector and output the point cloud detection result; and, through the point cloud segmentation network branch of the multi-task neural network, perform point cloud segmentation based on the point cloud feature vector and output the point cloud segmentation result.
  • the point cloud feature vectors output by the backbone network 310 are input to the point cloud detection network branch 320 and the point cloud segmentation network branch 330 respectively, and these two network branches each perform the next step of processing.
  • the following describes the execution of the point cloud detection task and the point cloud segmentation task in conjunction with the network structures of the point cloud detection network branch 320 and the point cloud segmentation network branch 330, respectively.
  • the point cloud detection network branch 320 includes four detection heads, which are respectively used to output the target-presence heat map, the detected target position, the size of the target, and the rotation angle of the target.
  • each detection head included in the point cloud detection network branch 320 is composed of a feature extraction module and a convolutional layer.
  • the feature extraction module is further composed of a convolutional layer, a batch normalization layer and an activation function.
  • Each detection head performs feature encoding and transformation mapping on the input point cloud feature vector, and finally outputs the corresponding prediction result.
  • for example, the detection head corresponding to the heat map makes a prediction for each position in the point cloud feature vector and outputs whether the corresponding position is a key point on the heat map; for another example, the detection head corresponding to target position prediction predicts from the point cloud feature vector and outputs the position (x, y, z) of the detected target; for another example, the detection head corresponding to target size outputs the size (dx, dy, dz) of the target; and the detection head corresponding to the rotation angle of the target outputs the rotation angle θ of the target.
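  • A minimal sketch of four such detection heads is given below, each built as a feature extraction module (convolution + batch normalization + activation) followed by a prediction convolution; the channel widths, the single-class heat map and the sigmoid on the heat map output are assumptions for illustration.

    import torch.nn as nn

    def det_head(cin: int, cout: int, mid: int = 64) -> nn.Sequential:
        # Feature extraction module (conv + BN + ReLU) followed by a 1x1 prediction convolution.
        return nn.Sequential(
            nn.Conv2d(cin, mid, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, cout, kernel_size=1),
        )

    class DetectionBranch(nn.Module):
        def __init__(self, cin: int = 192, num_classes: int = 1):
            super().__init__()
            self.heatmap = det_head(cin, num_classes)  # key points on the heat map
            self.position = det_head(cin, 3)           # target position (x, y, z)
            self.size = det_head(cin, 3)               # target size (dx, dy, dz)
            self.angle = det_head(cin, 1)              # rotation angle theta

        def forward(self, feats):
            return {
                "heatmap": self.heatmap(feats).sigmoid(),
                "position": self.position(feats),
                "size": self.size(feats),
                "angle": self.angle(feats),
            }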
  • the point cloud segmentation network branch 330 is composed of an upsampling module, a feature extraction module and a convolutional layer.
  • the feature extraction module is further composed of a convolutional layer, a batch normalization layer and an activation function.
  • the upsampling module first upsamples the point cloud feature vectors output by the backbone network 310; the convolution layer, batch normalization layer and activation function then sequentially perform feature conversion and mapping on the upsampled vectors, and finally the segmentation results of the corresponding columnar voxels are output through the convolutional layer.
  • taking the point cloud feature vector output by the backbone network as input, the point cloud segmentation network branch 330 performs upsampling, convolution, batch normalization, activation mapping and other processing, and finally outputs data of shape (W, H, n_class).
  • W and H refer to the dimensions of the output data corresponding to the width and height of the input feature map
  • n_class represents the number of point cloud semantic categories.
  • for example, if the output data size of the point cloud segmentation network branch 330 is 512 × 512 × 11, this means that each of the 512 × 512 positions has a set of 11 segmentation result prediction values; these 11 prediction values lie between 0 and 1 and sum to 1, and represent the probability that the corresponding columnar voxel belongs to each point cloud semantic category. Furthermore, the point cloud semantic category corresponding to the maximum probability value can be taken as the point cloud semantic category matched by the corresponding columnar voxel.
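  • The per-voxel category can be read out from the branch output as in the sketch below (a softmax over the class dimension followed by an argmax); the (W, H, n_class) shape follows the description above, the rest is illustrative.

    import torch

    def pillar_categories(seg_logits: torch.Tensor) -> torch.Tensor:
        """seg_logits: (W, H, n_class) output of the segmentation branch.

        Returns a (W, H) tensor holding, for every columnar voxel, the index of the
        semantic category with the maximum probability (the probabilities per position sum to 1).
        """
        probs = torch.softmax(seg_logits, dim=-1)
        return probs.argmax(dim=-1)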
  • the point cloud semantic categories are determined according to specific application scenarios.
  • point cloud semantic categories can be defined to include but are not limited to any one or more of the following: buildings, green plants, ground, fences, curbs, lane lines, vehicles, etc.
  • point cloud segmentation is performed based on the point cloud feature vector, and the point cloud segmentation network branch outputs the point cloud segmentation results of the plurality of columnar voxels matched by the point cloud feature vector (that is, of all columnar voxels obtained after voxelizing the point cloud to be processed).
  • the segmentation result output by the point cloud segmentation network branch is the segmentation result obtained by semantic segmentation based on the features projected onto the bird's-eye view.
  • in some embodiments, the point cloud segmentation result includes the point cloud semantic category matched by each columnar voxel. After performing point cloud segmentation based on the point cloud feature vector through the point cloud segmentation network branch of the multi-task neural network and outputting the point cloud segmentation result, the method further includes: mapping the point cloud semantic category matched by each columnar voxel to the points in the point cloud to be processed according to the position information of the columnar voxel, so as to obtain the segmentation result of the points in the point cloud to be processed.
  • in some embodiments, mapping the point cloud semantic category matched by each columnar voxel to the points in the point cloud to be processed according to the position information of the columnar voxel to obtain the segmentation result of the points in the point cloud to be processed includes: obtaining, according to the position information of each columnar voxel, the points in the point cloud to be processed contained in that columnar voxel; and, for each columnar voxel, using the point cloud semantic category matched by the columnar voxel as the point cloud semantic category matched by the points contained in that columnar voxel.
  • each columnar voxel corresponds to a position in the bird's-eye view.
  • the segmentation result of the columnar voxel is obtained, which can be considered as the point cloud semantic segmentation result of the columnar area in the point cloud.
  • each box in the bird's-eye view corresponds to a columnar voxel.
  • the segmentation result corresponding to the image position matched by each box in the bird's-eye view can be regarded as the segmentation result of the columnar voxel corresponding to the box.
  • each columnar voxel corresponds to a spatial area in the point cloud to be processed.
  • This spatial area may contain 0 or more points.
  • therefore, the segmentation result of each columnar voxel (i.e. the matched point cloud semantic category) is used as the point cloud semantic category of each point included in that columnar voxel, thereby completing the semantic segmentation of the points in the point cloud. For example, for a columnar voxel whose coordinates range from (0, 0) to (0.2, 0.2), if the segmentation result of the columnar voxel is "curb", it can be determined that the points in the point cloud to be processed whose coordinates fall within the range (0, 0) to (0.2, 0.2) match the point cloud semantic category "curb".
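  • A sketch of this mapping back to points, assuming the per-point pillar indices from the voxelization example above are available:

    import numpy as np

    def labels_for_points(voxel_labels: np.ndarray, point_voxel_idx: np.ndarray) -> np.ndarray:
        """voxel_labels: (W, H) semantic category of each columnar voxel.
        point_voxel_idx: (N, 2) integer (ix, iy) pillar index of every point.

        Every point simply inherits the category of the columnar voxel it falls into.
        """
        return voxel_labels[point_voxel_idx[:, 0], point_voxel_idx[:, 1]]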
  • the pre-trained multi-task neural network includes: a backbone network 310, a point cloud detection network branch 320, and a point cloud segmentation network branch 330.
  • in some embodiments, before performing feature extraction on the bird's-eye view features through the backbone network of the pre-trained multi-task neural network to obtain the point cloud feature vector, the method further includes: training the multi-task neural network based on several voxelized point cloud training samples. The voxelized point cloud training samples are constructed based on the columnar voxels obtained by performing columnar voxelization on several point clouds respectively; for each voxelized point cloud training sample, the sample data includes several columnar voxels, and the sample label includes the second point cloud segmentation label matching the corresponding sample data; the second point cloud segmentation label is used to identify the true value of the point cloud semantic category matched by each columnar voxel in the corresponding sample data. The true value of the point cloud semantic category matched by a columnar voxel is the point cloud semantic category with the largest coverage among the point cloud semantic categories covered by the points divided into that columnar voxel.
  • the specific implementation method of generating sample data refers to the corresponding implementation method in the previous steps, such as obtaining the point cloud to be processed, and voxelizing the point cloud to be processed to obtain a number of columnar voxels.
  • the specific implementation will not be described again here.
  • each columnar voxel will contain a certain number of points, and these points are manually labeled with point cloud semantic categories.
  • point cloud semantic category matched by the largest number of points is annotated as the point cloud semantic category of the columnar voxel.
  • for example, suppose a certain columnar voxel includes 3 points, each marked with a point cloud semantic category (such as car, truck, bicycle, tricycle, pedestrian, cone, green plant, ground, fence, curb, lane line, etc.); assuming the labels are (building, building, green plant), the most frequent category, building, is taken as the point cloud semantic category matched by this columnar voxel.
  • the point cloud semantic categories matched by all columnar voxels obtained after voxelization of a certain point cloud are arranged according to voxel positions; this yields the point cloud semantic category labels matching the sample data generated from that point cloud (i.e., the second point cloud segmentation label).
  • the sample label of the sample data can be expressed as a W × H label matrix.
  • each element in the label matrix is the identifier of the point cloud semantic category matched by the corresponding columnar voxel.
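  • For illustration, the second point cloud segmentation label can be built with a per-pillar majority vote over the manually annotated point categories, as in the sketch below (NumPy, with integer category identifiers; the ignore value for empty pillars is an assumption):

    import numpy as np

    def build_voxel_label(point_voxel_idx, point_categories, grid_shape, ignore_id=-1):
        """point_voxel_idx: (N, 2) pillar index per point; point_categories: (N,) integer labels.

        Returns a (W, H) label matrix whose entries are the most frequent category among the
        points falling into each columnar voxel; empty voxels keep ignore_id.
        """
        W, H = grid_shape
        label = np.full((W, H), ignore_id, dtype=np.int64)
        flat = point_voxel_idx[:, 0] * H + point_voxel_idx[:, 1]
        for cell in np.unique(flat):
            cats = point_categories[flat == cell]
            values, counts = np.unique(cats, return_counts=True)
            label[cell // H, cell % H] = values[np.argmax(counts)]   # majority vote
        return label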
  • in some embodiments, the sample label further includes a point cloud detection label, which is used to identify the true value of the target detection result in the corresponding sample data. For example, for each point cloud used to generate training samples, the key points of the target objects on the heat map and the spatial position coordinates, three-dimensional size and rotation angle of the target objects in the point cloud are manually annotated, and this annotation information is used as the point cloud detection label of the training sample generated from that point cloud.
  • in some embodiments, training the multi-task neural network based on several voxelized point cloud training samples includes: for each voxelized point cloud training sample, performing the following point cloud detection and segmentation operations to obtain the point cloud detection result prediction value and the point cloud segmentation result prediction value of the corresponding voxelized point cloud training sample: performing feature extraction and mapping on the several columnar voxels included in the voxelized point cloud training sample to obtain the voxel features of the voxelized point cloud training sample; mapping the voxel features to a bird's-eye view to obtain the bird's-eye view features corresponding to the voxelized point cloud training sample; performing feature extraction on the bird's-eye view features through the backbone network to obtain the point cloud feature vector; performing target object detection based on the point cloud feature vector through the point cloud detection network branch and outputting the point cloud detection result prediction value of the voxelized point cloud training sample; and performing point cloud segmentation based on the point cloud feature vector through the point cloud segmentation network branch and outputting the point cloud segmentation result prediction value of the voxelized point cloud training sample. For the specific implementation, please refer to the foregoing description of obtaining the detection and segmentation results of the point cloud to be processed, which will not be repeated here.
  • the point cloud detection loss of the multi-task neural network is calculated based on the point cloud detection result prediction value and the corresponding point cloud detection label of each voxelized point cloud training sample.
  • the point cloud detection loss includes four parts, namely: heat map prediction loss, position prediction loss, size prediction loss and rotation angle prediction loss.
  • the position prediction loss, size prediction loss, and rotation angle prediction loss can be expressed by mean square error.
  • the position prediction loss of the multi-task neural network is represented by the mean square error of the predicted values of the target object position (such as spatial position coordinates) of all the voxelized point cloud training samples and the true value of the target object position in the sample label.
  • the size prediction loss of the multi-task neural network is represented by the mean square error between the predicted values of the target size (such as the three-dimensional size) of all the voxelized point cloud training samples and the true values of the target size in the sample labels; the rotation angle prediction loss of the multi-task neural network is represented by the mean square error between the predicted values of the target rotation angle of all the voxelized point cloud training samples and the true values of the target rotation angle in the sample labels.
  • the heat map prediction loss is calculated using a pixel-by-pixel focal loss function.
  • for example, suppose the position of a target object is p.
  • the corresponding key point (p_x, p_y) on the heat map is obtained, and the result is distributed onto the heat map through a Gaussian kernel; if the Gaussian kernels of multiple targets overlap, the element-wise maximum is taken.
  • the formula of the Gaussian kernel can be expressed as:
  • Y_xyc = exp( -((x - p_x)^2 + (y - p_y)^2) / (2 * σ_p^2) )
  • where x and y are the enumerated position indices in the image to be detected, σ_p is the target scale-adaptive variance, and Y_xyc is the Gaussian heat map value of each key point after Gaussian kernel mapping.
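  • For illustration, the sketch below renders such a Gaussian heat map and evaluates a pixel-by-pixel focal loss of the commonly used CenterNet form; the alpha = 2 and beta = 4 exponents are conventional choices assumed here, not values stated in this application.

    import torch

    def draw_gaussian(heatmap: torch.Tensor, center, sigma: float) -> torch.Tensor:
        """Splat one key point (px, py) onto an (H, W) heat map; overlaps keep the element-wise maximum."""
        H, W = heatmap.shape
        ys = torch.arange(H, dtype=torch.float32).view(-1, 1)
        xs = torch.arange(W, dtype=torch.float32).view(1, -1)
        px, py = center
        g = torch.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2 * sigma ** 2))
        return torch.maximum(heatmap, g)

    def focal_loss(pred: torch.Tensor, gt: torch.Tensor, alpha: float = 2.0, beta: float = 4.0, eps: float = 1e-6):
        """Pixel-by-pixel focal loss for heat map prediction; pred in (0, 1), gt in [0, 1]."""
        pos = gt.eq(1.0).float()
        neg = 1.0 - pos
        pos_loss = -((1.0 - pred) ** alpha) * torch.log(pred + eps) * pos
        neg_loss = -((1.0 - gt) ** beta) * (pred ** alpha) * torch.log(1.0 - pred + eps) * neg
        num_pos = pos.sum().clamp(min=1.0)
        return (pos_loss.sum() + neg_loss.sum()) / num_pos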
  • the point cloud segmentation loss of the multi-task neural network is calculated based on the point cloud segmentation result prediction value of each voxelized point cloud training sample and the corresponding second point cloud segmentation label.
  • the point cloud segmentation loss can be expressed by the cross entropy of the point cloud segmentation result prediction value and the corresponding second point cloud segmentation label.
  • the point cloud detection loss and the point cloud segmentation loss are combined to calculate the loss of the multi-task neural network; with the goal of minimizing the loss of the entire network, the network parameters of the backbone network, the point cloud detection network branch and the point cloud segmentation network branch are optimized to complete the training of the multi-task neural network.
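  • A sketch of how these terms might be combined into a single training objective is given below, reusing the focal_loss sketch above; the equal weighting of the terms is an assumption, as the application only states that the detection and segmentation losses are integrated and jointly minimized.

    import torch
    import torch.nn.functional as F

    def multitask_loss(pred: dict, target: dict, w_det: float = 1.0, w_seg: float = 1.0) -> torch.Tensor:
        """pred/target: dicts with 'heatmap', 'position', 'size', 'angle' and 'seg' tensors."""
        det_loss = (
            focal_loss(pred["heatmap"], target["heatmap"])        # heat map prediction loss
            + F.mse_loss(pred["position"], target["position"])    # position prediction loss
            + F.mse_loss(pred["size"], target["size"])            # size prediction loss
            + F.mse_loss(pred["angle"], target["angle"])          # rotation angle prediction loss
        )
        # Cross entropy between the predicted pillar categories and the second point cloud segmentation label.
        seg_loss = F.cross_entropy(pred["seg"], target["seg"])
        return w_det * det_loss + w_seg * seg_loss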
  • the point cloud detection and segmentation method disclosed in the embodiments of the present application performs columnar voxelization processing on the point cloud to be processed to obtain a number of columnar voxels that constitute the point cloud; it then performs feature extraction and mapping on these columnar voxels to obtain the voxel features of the point cloud, and maps the voxel features to a bird's-eye view to obtain the bird's-eye view features corresponding to the point cloud; finally, the backbone network of a pre-trained multi-task neural network extracts features from the bird's-eye view features to obtain a point cloud feature vector, the point cloud detection network branch of the multi-task neural network detects target objects based on the point cloud feature vector and outputs the point cloud detection result, and the point cloud segmentation network branch of the multi-task neural network performs point cloud segmentation based on the point cloud feature vector and outputs the point cloud segmentation result. This helps to improve the efficiency of point cloud detection and point cloud segmentation.
  • the point cloud feature vector is extracted and mapped through the backbone network of a multi-task neural network and then input to the network branch corresponding to the point cloud detection task and the network branch corresponding to the point cloud segmentation task for point cloud detection and point cloud segmentation, respectively. This enables the point cloud detection task and the point cloud segmentation task to share the point cloud feature extraction network; compared with using two neural networks to perform point cloud detection and point cloud segmentation independently, this saves the computation consumed by point cloud feature extraction and effectively improves the efficiency of point cloud detection and point cloud segmentation.
  • Point cloud detection tasks in the prior art usually include: point cloud preprocessing, feature extraction, and detection head prediction steps.
  • point cloud and its point cloud segmentation labels are converted to a bird's-eye view, and feature extraction, detection and segmentation are performed under the bird's-eye view, which is fast and effective.
  • since the point cloud semantic segmentation results output by the model are converted back to each point in the point cloud, the task of point-based semantic segmentation of the point cloud is completed, which effectively improves the speed of point cloud segmentation.
  • a point cloud detection and segmentation device disclosed in the embodiment of the present application, as shown in Figure 5, includes:
  • the columnar voxelization module 510 is used to perform columnar voxelization processing on the point cloud to be processed, and obtain a number of columnar voxels that constitute the point cloud to be processed;
  • the voxel feature acquisition module 520 is used to perform feature extraction and mapping on the plurality of columnar voxels, and obtain the voxel features of the point cloud to be processed;
  • a bird's-eye view feature mapping module 530 is used to map the voxel features to a bird's-eye view to obtain the bird's-eye view features corresponding to the point cloud to be processed;
  • the point cloud feature extraction module 540 is used to extract features of the bird's-eye view features through the backbone network of a pre-trained multi-task neural network to obtain a point cloud feature vector;
  • the point cloud detection and segmentation module 550 is used to perform target object detection based on the point cloud feature vector through the point cloud detection network branch of the multi-task neural network and output the point cloud detection result, and to perform point cloud segmentation based on the point cloud feature vector through the point cloud segmentation network branch of the multi-task neural network and output the point cloud segmentation result.
  • the point cloud segmentation result includes: the point cloud semantic category matched by each columnar voxel, and the device further includes:
  • the first point cloud segmentation label acquisition module 511 is used to obtain the first point cloud segmentation label of the plurality of columnar voxels, wherein the first point cloud segmentation label includes: position information of each columnar voxel. ;
  • the segmentation result conversion module 560 is used to map the point cloud semantic category matched by the columnar voxel to the point cloud to be processed according to the position information of the columnar voxel, and obtain the point cloud to be processed. Point segmentation results.
  • in some embodiments, mapping the point cloud semantic category matched by the columnar voxel to the point cloud to be processed according to the position information of the columnar voxel to obtain the segmentation result of the points in the point cloud to be processed includes:
  • using the point cloud semantic category matched by the columnar voxel as the point cloud semantic category matched by the points contained in that columnar voxel.
  • the voxel feature acquisition module 520 is further used to:
  • for each columnar voxel, splicing the point features of all points divided into the columnar voxel into the voxel feature of that columnar voxel, where the point features of each point include the position coordinates and reflection intensity information of the point;
  • the bird's-eye view feature mapping module 530 is further used to:
  • mapping the feature corresponding to the columnar voxel in the voxel features to the corresponding position of the bird's-eye view that matches the first point cloud segmentation label, so as to obtain the bird's-eye view features corresponding to the point cloud to be processed;
  • wherein mapping, according to the number of points included in the columnar voxel, the feature corresponding to the columnar voxel in the voxel features to the corresponding position of the bird's-eye view that matches the first point cloud segmentation label includes:
  • when the number of points included in the columnar voxel is greater than 0, mapping the feature vector corresponding to the columnar voxel in the voxel features to the corresponding position of the bird's-eye view that matches the first point cloud segmentation label;
  • when the number of points included in the columnar voxel is equal to 0, setting the feature vector at the corresponding position of the bird's-eye view that matches the first point cloud segmentation label to 0.
  • the pre-trained multi-task neural network includes: a backbone network, a point cloud detection network branch, and a point cloud segmentation network branch.
  • the device further includes:
  • a multi-task neural network training module (not shown in the figure) is used to train a multi-task neural network based on several voxelized point cloud training samples;
  • the voxelized point cloud training sample is constructed based on the columnar voxels obtained after columnar voxelization processing of several point clouds respectively; for each of the voxelized point cloud training samples, the sample data includes: Several columnar voxels, the sample label includes: a second point cloud segmentation label matching the corresponding sample data; the second point cloud segmentation label is used to identify the true point cloud semantic category of each columnar voxel matching in the corresponding sample data. value; the true value of the point cloud semantic category matched by the columnar voxel is: among the point cloud semantic categories covered by the points in the point cloud that are divided into corresponding columnar voxels, the point cloud semantic category with the largest coverage rate.
  • the sample label also includes: a point cloud detection label, which is used to identify the true value of the target detection result in the corresponding sample data.
  • in some embodiments, training the multi-task neural network based on several voxelized point cloud training samples includes:
  • for each voxelized point cloud training sample, performing the following point cloud detection and segmentation operations to obtain the point cloud detection result prediction value and the point cloud segmentation result prediction value of the corresponding voxelized point cloud training sample:
  • through the point cloud detection network branch, performing target object detection based on the point cloud feature vector and outputting the point cloud detection result prediction value of the voxelized point cloud training sample; and, through the point cloud segmentation network branch, performing point cloud segmentation based on the point cloud feature vector and outputting the point cloud segmentation result prediction value of the voxelized point cloud training sample;
  • the point cloud detection and segmentation device disclosed in the embodiments of this application is used to implement the point cloud detection and segmentation method described in Embodiment 1 of this application.
  • the specific implementation of each module of the device will not be described in detail here; please refer to the specific implementation of the corresponding steps in the method embodiment.
  • the point cloud detection and segmentation device disclosed in the embodiments of the present application performs columnar voxelization processing on the point cloud to be processed to obtain a number of columnar voxels that constitute the point cloud; it then performs feature extraction and mapping on these columnar voxels to obtain the voxel features of the point cloud, and maps the voxel features to a bird's-eye view to obtain the bird's-eye view features corresponding to the point cloud; finally, the backbone network of a pre-trained multi-task neural network extracts features from the bird's-eye view features to obtain a point cloud feature vector, the point cloud detection network branch of the multi-task neural network detects target objects based on the point cloud feature vector and outputs the point cloud detection result, and the point cloud segmentation network branch of the multi-task neural network performs point cloud segmentation based on the point cloud feature vector and outputs the point cloud segmentation result. This helps to improve the efficiency of point cloud detection and point cloud segmentation.
  • the point cloud feature vector is extracted and mapped through the backbone network of a multi-task neural network and then input to the network branch corresponding to the point cloud detection task and the network branch corresponding to the point cloud segmentation task for point cloud detection and point cloud segmentation, respectively. This enables the point cloud detection task and the point cloud segmentation task to share the point cloud feature extraction network; compared with using two neural networks to perform point cloud detection and point cloud segmentation independently, this saves the computation consumed by point cloud feature extraction and effectively improves the efficiency of point cloud detection and point cloud segmentation.
  • point cloud and its point cloud segmentation labels are converted to a bird's-eye view, and feature extraction, detection and segmentation are performed under the bird's-eye view, which is fast and effective.
  • since the point cloud semantic segmentation results output by the model are converted back to each point in the point cloud, the task of point-based semantic segmentation of the point cloud is completed, which effectively improves the speed of point cloud segmentation.
  • the device embodiments described above are only illustrative.
  • the units described as separate components may or may not be physically separated.
  • the components shown as units may or may not be physical units; that is, they may be located in one location or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. Persons of ordinary skill in the art can understand and implement the solution without creative effort.
  • Various component embodiments of the present application may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof.
  • a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all functions of some or all components in the electronic device according to embodiments of the present application.
  • the present application may also be implemented as an apparatus or device program (eg, computer program and computer program product) for performing part or all of the methods described herein.
  • Such a program implementing the present application may be stored on a computer-readable medium, or may be in the form of one or more signals. Such signals may be downloaded from an Internet website, or provided on a carrier signal, or in any other form.
  • Figure 7 shows an electronic device that can implement the method according to the present application.
  • the electronic device may be a PC, a mobile terminal, a personal digital assistant, a tablet computer, etc.
  • the electronic device conventionally includes a processor 710 and a memory 720 and program code 730 stored on the memory 720 and executable on the processor 710.
  • when the processor 710 executes the program code 730, the method described in the above embodiments is implemented.
  • the memory 720 may be a computer program product or a computer-readable medium.
  • Memory 720 may be electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
  • the memory 720 has a storage space 7201 for program code 730 of a computer program for executing any of the method steps described above.
  • the storage space 7201 for the program code 730 may include various computer programs respectively used to implement various steps in the above method.
  • the program code 730 is computer readable code. These computer programs can be read from or written into one or more computer program products. These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks.
  • the computer program includes computer readable code that, when run on an electronic device, causes the electronic device to perform the method according to the above embodiments.
  • An embodiment of the present application also discloses a computer-readable storage medium on which a computer program is stored.
  • the program is executed by a processor, the steps of the point cloud detection and segmentation method described in Embodiment 1 of the present application are implemented.
  • Such a computer program product may be a computer-readable storage medium, which may have storage segments, storage spaces, etc. arranged similarly to the memory 720 in the electronic device shown in FIG. 7 .
  • the program code may, for example, be compressed and stored in the computer-readable storage medium in a suitable form.
  • the computer-readable storage medium is typically a portable or fixed storage unit as described with reference to FIG. 8 .
  • the storage unit includes computer readable code 730', which can be read by a processor; when these codes are executed by the processor, each step in the method described above is implemented.
  • any reference signs placed between parentheses shall not be construed as limiting the claim.
  • the word “comprising” does not exclude the presence of elements or steps not listed in a claim.
  • the word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements.
  • the application may be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In the element claim enumerating several means, several of these means may be embodied by the same item of hardware.
  • the use of the words first, second, third, etc. does not indicate any order. These words can be interpreted as names.

Abstract

A point cloud detection and segmentation method, relating to the field of computer technology. The method comprises: performing columnar voxelization processing on a point cloud to be processed, so as to obtain a plurality of columnar voxels forming said point cloud; performing feature extraction and mapping on the plurality of columnar voxels to obtain the voxel features of said point cloud, and mapping the voxel features to a bird's-eye view so as to obtain bird's-eye view features corresponding to said point cloud; performing feature extraction on the bird's-eye view features by means of a backbone network of a pre-trained multi-task neural network to obtain a point cloud feature vector; and, by means of a point cloud detection network branch and a point cloud segmentation network branch of the multi-task neural network, respectively performing point cloud detection and point cloud segmentation on the basis of the point cloud feature vector. Reducing the repeated execution of point cloud feature extraction helps improve the efficiency of point cloud detection and point cloud segmentation.
PCT/CN2022/117322 2022-04-06 2022-09-06 Point cloud detection and segmentation method and apparatus, and electronic device WO2023193400A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210353486.1 2022-04-06
CN202210353486.1A CN114820463A (zh) 2022-04-06 2022-04-06 点云检测和分割方法、装置,以及,电子设备

Publications (1)

Publication Number Publication Date
WO2023193400A1 true WO2023193400A1 (fr) 2023-10-12

Family

ID=82533341

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/117322 WO2023193400A1 (fr) Point cloud detection and segmentation method and apparatus, and electronic device

Country Status (2)

Country Link
CN (1) CN114820463A (fr)
WO (1) WO2023193400A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114820463A (zh) * 2022-04-06 2022-07-29 合众新能源汽车有限公司 点云检测和分割方法、装置,以及,电子设备
CN115358413A (zh) * 2022-09-14 2022-11-18 清华大学 一种点云多任务模型的训练方法、装置及电子设备


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476822A (zh) * 2020-04-08 2020-07-31 浙江大学 一种基于场景流的激光雷达目标检测与运动跟踪方法
CN111862101A (zh) * 2020-07-15 2020-10-30 西安交通大学 一种鸟瞰图编码视角下的3d点云语义分割方法
CN114140470A (zh) * 2021-12-07 2022-03-04 群周科技(上海)有限公司 一种基于直升机机载激光雷达的地物语义分割方法
CN114820463A (zh) * 2022-04-06 2022-07-29 合众新能源汽车有限公司 点云检测和分割方法、装置,以及,电子设备

Also Published As

Publication number Publication date
CN114820463A (zh) 2022-07-29

Similar Documents

Publication Publication Date Title
WO2023193400A1 (fr) Point cloud detection and segmentation method and apparatus, and electronic device
US11037305B2 (en) Method and apparatus for processing point cloud data
US9424493B2 (en) Generic object detection in images
US9147255B1 (en) Rapid object detection by combining structural information from image segmentation with bio-inspired attentional mechanisms
WO2023193401A1 (fr) Point cloud detection model training method and apparatus, electronic device and storage medium
Liu et al. Fg-net: A fast and accurate framework for large-scale lidar point cloud understanding
CN111242122B (zh) Lightweight deep neural network rotated-object detection method and system
WO2021217924A1 (fr) Method and apparatus for identifying a vehicle type at a traffic checkpoint, and device and storage medium
Shen et al. Vehicle detection in aerial images based on lightweight deep convolutional network and generative adversarial network
CN112016638B (zh) Rebar cluster identification method, apparatus, device and storage medium
Karim et al. A brief review and challenges of object detection in optical remote sensing imagery
CN113762003B (zh) Target object detection method, apparatus, device and storage medium
Guo et al. DF-SSD: a deep convolutional neural network-based embedded lightweight object detection framework for remote sensing imagery
WO2019100348A1 (fr) Image retrieval method and device, and image library generation method and device
Shao et al. Semantic segmentation for free space and lane based on grid-based interest point detection
Zhang et al. Recognition of bird nests on power transmission lines in aerial images based on improved YOLOv4
CN112200191B (zh) Image processing method, apparatus, computing device and medium
CN114115993A (zh) Apparatus for use in a processing device, and apparatus and method for an artificial neural network
Geng et al. SANet: A novel segmented attention mechanism and multi-level information fusion network for 6D object pose estimation
CN115620081A (zh) Object detection model training method, and object detection method and apparatus
CN114155524A (zh) Single-stage 3D point cloud object detection method and apparatus, computer device, and medium
Marine et al. Pothole Detection on Urban Roads Using YOLOv8
Li et al. An FPGA-based tree crown detection approach for remote sensing images
Yang et al. Improved YOLOv4 based on dilated coordinate attention for object detection
CN116152345B (zh) Real-time object 6D pose and distance estimation method for embedded systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22936328

Country of ref document: EP

Kind code of ref document: A1