CN111145174A - 3D target detection method for point cloud screening based on image semantic features - Google Patents
- Publication number
- CN111145174A (application CN202010000186.6A)
- Authority
- CN
- China
- Prior art keywords
- point cloud
- image
- semantic
- reg
- target detection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
- G06T2207/10012—Stereo images
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Quality & Reliability (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a 3D target detection method that screens point clouds based on image semantic features. The method comprises the following steps: first, a 2D semantic segmentation method is used to segment the image data and obtain a semantic prediction. The generated semantic prediction is projected into the LIDAR point cloud space through a known projection matrix, so that each point in the point cloud obtains the semantic category attribute of the corresponding image position. Points related to vehicles, pedestrians and cyclists are extracted from the original point cloud to form viewing cones. Second, the viewing cones are used as the input of a deep 3D target detector, and a loss function suited to the characteristics of the viewing cones is designed for network training. The invention designs a 3D target detection algorithm that screens point clouds based on image semantic features, thereby greatly reducing the time and computation requirements of 3D detection. Finally, the performance of the method on the 3D target detection benchmark dataset KITTI shows that the method has good real-time target detection performance.
Description
Technical Field
The invention relates to 3D target detection, in particular to a 3D target detection algorithm for point cloud screening based on image semantic features, and belongs to the field of pattern recognition.
Background
Point cloud based 3D object detection plays an important role in many real-life applications, such as autonomous driving, domestic robots, augmented reality and virtual reality. Compared to traditional image-based target detection methods, LIDAR point clouds provide more accurate depth information that can be used to locate objects and delineate their shapes. However, unlike conventional images, LIDAR point clouds are sparse and vary greatly in local density, owing to factors such as non-uniform 3D spatial sampling, the effective range of the sensor, and object occlusion and relative position. To address this, many methods convert the 3D point cloud into features that a corresponding target detector can process, using hand-crafted feature extraction. However, these methods take the entire point cloud as input, require substantial computing resources, and cannot achieve real-time detection.
Disclosure of Invention
The purpose of the invention is: aiming at the problems in the prior art, to provide a 3D target detection algorithm that screens point clouds based on image semantic features. The algorithm is an end-to-end deep 3D target detection method. A 2D image semantic segmentation method is used to obtain the category attribute of each pixel in an image of the same scene; the prediction result serves as a prior category attribute, each point in the point cloud is labeled through a known projection matrix, and the points whose categories are cars, pedestrians and cyclists are extracted from the point cloud to form viewing cones, which are used as the input of a 3D target detection network. At the same time, we design a 3D object detector that handles the viewing cones. In addition to the basic components of the object detector, i.e., a grid-based point cloud feature extractor, convolutional intermediate layers and a region pre-selection network (RPN), we also optimize the loss function to make the entire network more sensitive to objects in view frustums that lack reference context. Our algorithm includes the following steps:
step (1): performing semantic segmentation on the two-dimensional image on the image data to obtain semantic prediction;
step (2): projecting the semantic prediction into a point cloud space, and screening points of a specific category to form a view cone;
step (3): building a 3D target detection network, with the viewing cones as the input of the 3D target detector;
step (4): enhancing the sensitivity of the loss function to the position of the 3D target box;
step (5): obtaining the total objective function and performing algorithm optimization.
Further, the specific method for performing semantic segmentation on the image data in the step (1) is as follows:
images are segmented using the DeepLabv3+ semantic segmentation method: first, the image portion of the training set in the dataset is manually labeled; then, DeepLabv3+ is pre-trained for 200 epochs on the Cityscapes dataset and fine-tuned for 50 epochs on the manually labeled semantic dataset; the resulting semantic segmentation network is trained to classify each pixel in the picture as one of 19 classes.
Further, in the step (2), based on the result predicted by the 2D semantic segmentation method, projecting the region of each category in each image into the LIDAR point cloud space by using a known projection matrix, wherein the region corresponding to the LIDAR point cloud space has a category attribute consistent with the image region; points about vehicles, pedestrians and cyclists are then screened from the original point cloud and extracted to form viewing cones.
Further, in step (3), a deep object detection network is constructed using PyTorch, and the network comprises three parts: a grid-based point cloud feature extractor, convolutional intermediate layers and a region pre-selection network (RPN):
in a grid point cloud feature extractor, orderly cutting the whole view cone by using a 3D grid with a set size, and sending all points in each grid to the grid feature extractor, wherein the grid feature extractor consists of a linear layer, a batch normalization layer BatchNorm and a nonlinear activation layer ReLU;
in the convolutional intermediate layers, 3 convolutional intermediate modules are used, each formed by sequentially connecting a 3D convolutional layer, a batch normalization layer and a nonlinear activation layer; the output of the grid point cloud extractor is taken as the input, and the feature with 3D structure is converted into a 2D pseudo-image feature as the output;
the input of the region pre-selection network RPN is provided by the convolutional intermediate layers; the architecture of the RPN consists of three fully convolutional modules, each containing a downsampling convolutional layer followed by two convolutional layers matching the feature map size, with BatchNorm and ReLU applied after each convolutional layer; then, the output of each module is upsampled to feature maps of the same size, and these feature maps are concatenated into one whole; finally, three 1 × 1 2D convolutional layers are applied for the desired learning objectives to generate: (1) a probability score map, (2) regression offsets, and (3) a direction prediction.
Further, in step (4), an overall loss function L_total is added to the model, as follows:
L_total = β1·L_cls + β2·(L_reg_θ + L_reg_other) + β3·L_dir + β4·L_corner
wherein L_cls is the predicted classification loss, L_reg_θ is the predicted angle loss of the 3D bounding box, L_reg_other is the predicted correction loss for the remaining parameters of the 3D bounding box, L_dir is the predicted direction loss, and L_corner is the predicted vertex-coordinate loss of the 3D bounding box; β1, β2, β3, β4 are hyper-parameters, set to 1.0, 2.0, 0.2 and 0.5, respectively;
for Lreg_θAnd Lreg_otherThe following variables were used:
Δθ=θg-θa
whereinwg,lg,hg,θgThe parameters for each bounding box provided for the tag,is a prediction parameter of an anchor point, where xc,yc,zcW, l, h and theta respectively refer to the central coordinate of the three-dimensional bounding box, the length, the width, the height and the overlooking course angle of the three-dimensional bounding box; wherein d isa=(la)2+(wa)2The length of the diagonal line of the anchor point floor; for the predicted angle thetapAngle loss Lreg_θThe concrete expression is as follows:
Lreg_θ=SnoothL1(sin(θp-Δθ))
correcting the loss L for a parameterreg_otherIn particular, it is the SmoothL1 function of the differences Δ x, Δ y, Δ z, Δ w, Δ L, Δ h, Δ θ, while the loss of coordinates of the vertices L of the 3D bounding boxcornerThe composition of (A) is as follows:
wherein NS, NH traverses all bounding boxes; p, P*,P**Denotes the predicted bounding box vertex, the vertex of the label bounding box, the vertex of the inverse bounding box, deltaijTo balance the coefficients, i, j are the indices of the targets generated by the final profile.
Further, in step (4), the balance of positive and negative anchors is adjusted using the focal loss:
FL(p_t) = −α_t·(1 − p_t)^γ·log(p_t)
wherein p_t is the estimated probability of the model, and α_t and γ are hyper-parameter adjustment coefficients, set to 0.5 and 2, respectively.
Further, in step (5), the whole model is trained according to steps (2), (3) and (4); that is, the 3D target detection network is trained on the KITTI dataset with the following parameters and implementation: training for 200,000 iterations (160 epochs) on a 1080Ti GPU using stochastic gradient descent (SGD) and the Adam optimizer; the initial learning rate is set to 0.0002, with an exponential decay factor of 0.8 applied every 15 epochs.
Advantageous effects: in the invention, the image is semantically segmented, the obtained prediction result is used as a prior class label, and the corresponding points in the point cloud are screened out to form viewing cones. This operation greatly reduces the high complexity of the original input, so that the 3D object detector achieves good accuracy while maintaining real-time detection.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The invention provides a 3D target detection method that screens point clouds based on image semantic features. The method comprises the following steps: first, a 2D semantic segmentation method is used to segment the image data and obtain a semantic prediction. The generated semantic prediction is projected into the LIDAR point cloud space through a known projection matrix, so that each point in the point cloud obtains the semantic category attribute of the corresponding image position. We extract points related to vehicles, pedestrians and cyclists from the original point cloud to form viewing cones. Second, the viewing cones are used as the input of a deep 3D target detector, and a loss function suited to the characteristics of the viewing cones is designed for network training. Because of the large amount of background and noise in the point cloud, the original unstructured point cloud data is very difficult to process and requires much special handling, consuming substantial computing resources and training and inference time. The invention designs a 3D target detection algorithm that screens point clouds based on image semantic features, thereby greatly reducing the time and computation requirements of 3D detection. Finally, the performance of the method on the 3D target detection benchmark dataset KITTI shows that the method has good real-time target detection performance.
The invention is further described with reference to the following figures and examples.
Examples
The invention provides a 3D target detection algorithm that screens point clouds based on image semantic features; the specific flow is shown in FIG. 1.
Step (1): we segmented the image using the currently outstanding semantic segmentation method, deplab v3 +. The image data of the 3D object detection data set does not contain markers for segmentation. We first hand label the image portion of the training set in the dataset. We pre-train the deplab v3+ now cityscaps dataset for 200 epochs and then perform a 50epoch fine-tuning on the manually labeled semantic tags. The resulting semantic segmentation network is trained to classify each pixel in the picture as one of 19 classes.
Step (2): projecting the semantic prediction into a point cloud space, and screening points of a specific category to form a view cone, wherein the specific method comprises the following steps: based on the result predicted by the 2D semantic segmentation method, the region of each category in each image is projected into the point cloud space by using a known projection matrix, so that the region of the corresponding point cloud space has the category attribute consistent with the image region. Then we screen the points about the car, pedestrian, cyclist from the original point cloud, forming the viewing cone.
Step (3): we build a deep target detection network using the PyTorch deep learning framework. The network comprises three parts: a grid-based point cloud feature extractor, convolutional intermediate layers and a region pre-selection network (RPN).
In the grid point cloud feature extractor, the viewing cone is first cut in an orderly manner using a 3D grid of a set size, and all points in each grid cell are taken as input to the grid feature extractor. Our grid feature extractor consists of a linear layer, a batch normalization layer (BatchNorm) and a nonlinear activation layer (ReLU).
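A minimal PyTorch sketch of such a grid feature extractor; the input/output dimensions and the max over each cell's points as the aggregation step are illustrative assumptions not fixed by the text:

```python
import torch
import torch.nn as nn

class GridFeatureExtractor(nn.Module):
    """Per-cell feature extractor as described above: a linear layer,
    BatchNorm and ReLU applied to every point in a grid cell, followed
    by a max over the points so each cell yields one feature vector."""
    def __init__(self, in_dim=4, out_dim=64):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.bn = nn.BatchNorm1d(out_dim)
        self.relu = nn.ReLU()

    def forward(self, pts):            # pts: (num_cells, pts_per_cell, in_dim)
        n, t, c = pts.shape
        x = self.linear(pts.reshape(n * t, c))
        x = self.relu(self.bn(x)).reshape(n, t, -1)
        return x.max(dim=1).values     # (num_cells, out_dim)
```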
In the convolutional middle layers, we use 3 convolutional middle blocks to enlarge the receptive field and gather more context. Each block consists of a 3D convolutional layer, a batch normalization layer (BatchNorm) and a nonlinear activation layer (ReLU) connected in sequence. It takes the output of the grid point cloud extractor as input and converts this feature with 3D structure into a 2D pseudo-image feature, which is the final output.
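A sketch of the three convolutional middle blocks and the 3D-to-2D pseudo-image conversion; the channel counts and strides are illustrative assumptions, since the text does not fix them:

```python
import torch
import torch.nn as nn

class ConvMiddle(nn.Module):
    """Three Conv3d + BatchNorm + ReLU blocks; the depth (height) axis
    is then folded into the channel axis to form a 2D pseudo-image."""
    def __init__(self, c_in=64):
        super().__init__()
        def block(ci, co, stride):
            return nn.Sequential(
                nn.Conv3d(ci, co, 3, stride=stride, padding=1),
                nn.BatchNorm3d(co), nn.ReLU())
        self.blocks = nn.Sequential(
            block(c_in, 64, (2, 1, 1)),   # shrink only the vertical (D) axis
            block(64, 64, (1, 1, 1)),
            block(64, 64, (2, 1, 1)))

    def forward(self, x):                 # x: (B, C, D, H, W)
        x = self.blocks(x)
        b, c, d, h, w = x.shape
        return x.reshape(b, c * d, h, w)  # 2D pseudo-image feature
```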
The input to the region pre-selection network (RPN) is provided by the convolutional intermediate layers. The architecture of the RPN consists of three fully convolutional modules. Each module contains a downsampling convolutional layer followed by two convolutional layers matching the feature map size. After each convolutional layer, we apply BatchNorm and ReLU. We then upsample the output of each module to feature maps of the same size and concatenate them into one whole. Finally, three 1 × 1 2D convolutional layers are applied for the desired learning objectives to generate: (1) a probability score map, (2) regression offsets, and (3) a direction prediction.
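Putting the RPN description together, a sketch with assumed channel widths and anchor count (neither is specified in the text); the three 1 × 1 heads output the score map, seven box-regression offsets (x, y, z, w, l, h, θ) per anchor, and direction bins:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RPN(nn.Module):
    """Three fully convolutional blocks (a downsampling conv followed by
    two stride-1 convs, each with BatchNorm and ReLU); block outputs are
    upsampled to a common size, concatenated, and fed to three 1x1 heads."""
    def __init__(self, c_in=64, n_anchors=2):
        super().__init__()
        def conv(ci, co, stride=1):
            return nn.Sequential(nn.Conv2d(ci, co, 3, stride, 1),
                                 nn.BatchNorm2d(co), nn.ReLU())
        def block(ci, co):
            return nn.Sequential(conv(ci, co, 2), conv(co, co), conv(co, co))
        self.b1, self.b2, self.b3 = block(c_in, 64), block(64, 128), block(128, 256)
        c_cat = 64 + 128 + 256
        self.head_cls = nn.Conv2d(c_cat, n_anchors, 1)      # (1) probability score map
        self.head_reg = nn.Conv2d(c_cat, n_anchors * 7, 1)  # (2) x,y,z,w,l,h,theta offsets
        self.head_dir = nn.Conv2d(c_cat, n_anchors * 2, 1)  # (3) direction prediction

    def forward(self, x):
        x1 = self.b1(x)                  # 1/2 resolution
        x2 = self.b2(x1)                 # 1/4 resolution
        x3 = self.b3(x2)                 # 1/8 resolution
        size = x1.shape[-2:]
        f = torch.cat([x1, F.interpolate(x2, size=size),
                       F.interpolate(x3, size=size)], dim=1)
        return self.head_cls(f), self.head_reg(f), self.head_dir(f)
```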
Step (4): the screening of the point cloud deprives the viewing cones of their original context information. Target point cloud data without such context makes the detection task more difficult, so a special loss function is added to the model to strengthen its sensitivity to the target. The overall loss function L_total is as follows:
L_total = β1·L_cls + β2·(L_reg_θ + L_reg_other) + β3·L_dir + β4·L_corner
wherein L_cls is the classification loss, L_reg_θ is the angle loss of the 3D bounding box, L_reg_other is the correction loss for the remaining parameters of the 3D bounding box, L_dir is the direction loss, and L_corner is the vertex-coordinate loss of the 3D bounding box; β1, β2, β3, β4 are hyper-parameters, set to 1.0, 2.0, 0.2 and 0.5, respectively.
L_reg_θ and L_reg_other can be determined from the following variables:
Δθ = θ_g − θ_a
wherein (x_c_g, y_c_g, z_c_g, w_g, l_g, h_g, θ_g) are the parameters of the corresponding bounding box provided by the label and (x_c_a, y_c_a, z_c_a, w_a, l_a, h_a, θ_a) are the parameters of the prediction anchor, where x_c, y_c, z_c, w, l, h and θ respectively refer to the center coordinates of the three-dimensional bounding box and its length, width, height and top-view heading angle; d_a = √((l_a)² + (w_a)²) is the length of the diagonal of the anchor's top-view detection box. For the predicted angle θ_p, the angle loss L_reg_θ can be expressed as:
L_reg_θ = SmoothL1(sin(θ_p − Δθ))
The parameter correction loss L_reg_other is the SmoothL1 function of the differences Δx, Δy, Δz, Δw, Δl, Δh, Δθ. The vertex-coordinate loss L_corner of the 3D bounding box is composed as follows:
where NS and NH traverse all bounding boxes, and P, P* and P** denote the predicted bounding-box vertices, the vertices of the label bounding box, and the vertices of the heading-inverted label bounding box. In addition to the above losses based on bounding box prediction, to solve the imbalance of positive and negative anchors in the RPN, we add the focal loss:
FL(p_t) = −α_t·(1 − p_t)^γ·log(p_t)
wherein p_t is the estimated probability of the model, and α_t and γ are hyper-parameter adjustment coefficients, set to 0.5 and 2, respectively; log(p_t) takes the cross-entropy form, and the base of the logarithm may be e or 10.
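The loss terms above, except the corner loss (whose formula is not reproduced in this text), can be sketched numerically as follows; all function names are ours:

```python
import numpy as np

def smooth_l1(x):
    """SmoothL1: 0.5*x^2 for |x| < 1, |x| - 0.5 otherwise."""
    x = np.abs(x)
    return np.where(x < 1.0, 0.5 * x * x, x - 0.5)

def angle_loss(theta_p, theta_g, theta_a):
    """L_reg_theta = SmoothL1(sin(theta_p - delta_theta)), with
    delta_theta = theta_g - theta_a as defined in the text."""
    delta_theta = theta_g - theta_a
    return smooth_l1(np.sin(theta_p - delta_theta))

def focal_loss(p_t, alpha_t=0.5, gamma=2.0):
    """FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t); the modulating
    factor down-weights easy anchors to balance positives and negatives."""
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

def total_loss(l_cls, l_reg_theta, l_reg_other, l_dir, l_corner,
               betas=(1.0, 2.0, 0.2, 0.5)):
    """L_total = b1*L_cls + b2*(L_reg_theta + L_reg_other) + b3*L_dir + b4*L_corner."""
    b1, b2, b3, b4 = betas
    return b1 * l_cls + b2 * (l_reg_theta + l_reg_other) + b3 * l_dir + b4 * l_corner
```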
Step (5): obtain the total objective function and perform algorithm optimization:
the entire model is trained according to the previous steps 2, 3, 4. We train the 3D target detection network on the KITTI dataset. We trained on a 1080Ti GPU using random gradient descent (SGD) and Adam optimizers. Our model was trained 20 ten thousand times (160 epochs). The initial learning rate was set to 0.0002, the exponential decay factor was 0.8 and decayed every 15 epochs.
And (4) analyzing results:
to verify the superiority of the algorithm, we compared the proposed method with several of the most advanced target tests published recently, including MV3D, MV3D (LIDAR), F-PointNet, AVOD, AVOD-FCN and VoxelNet. As shown in tables 1 and 2, our method achieved the best performance in the most difficult target tests. Furthermore, table 3 provides a time-efficient comparison of the respective methods, and our method is also a real-time target detection method considering that it itself has used a 2D semantic segmentation method, which consumes too much time.
The experimental results are as follows:
table 1 counts the AP (%) values of 3D detection on the KITTI dataset.
Table 2 counts the AP (%) value of BEV detection on the KITTI dataset.
Table 3 counts the time(s) required for each method to process a scene over the KITTI dataset.
TABLE 1 AP-value comparison of 3D detection on KITTI data set
TABLE 2 AP-value comparison of BEV detection on KITTI data set
TABLE 3 Comparison of time spent by each method on the KITTI dataset
Method | MV3D | MV3D (LIDAR) | F-PointNet | AVOD | AVOD-FCN | VoxelNet | Ours
---|---|---|---|---|---|---|---
Time (s) | 0.36 | 0.24 | 0.17 | 0.08 | 0.10 | 0.23 | 0.18
The above description covers only preferred embodiments of the present invention. It should be noted that those skilled in the art can make various modifications and adaptations without departing from the principles of the invention, and these are intended to fall within the scope of the invention.
Claims (7)
1. A 3D target detection algorithm for point cloud screening based on image semantic features, characterized by comprising the following steps:
step (1): performing semantic segmentation on the two-dimensional image on the image data to obtain semantic prediction;
step (2): projecting the semantic prediction into a point cloud space, and screening points of a specific category to form a view cone;
step (3): building a 3D target detection network, with the viewing cones as the input of the 3D target detector;
step (4): enhancing the sensitivity of the loss function to the position of the 3D target box;
step (5): obtaining the total objective function and performing algorithm optimization.
2. The 3D target detection algorithm for point cloud screening based on image semantic features as claimed in claim 1, wherein the specific method for performing semantic segmentation on the image data in step (1) is as follows:
images are segmented using the DeepLabv3+ semantic segmentation method: first, the image portion of the training set in the dataset is manually labeled; then, DeepLabv3+ is pre-trained for 200 epochs on the Cityscapes dataset and fine-tuned for 50 epochs on the manually labeled semantic dataset; the resulting semantic segmentation network is trained to classify each pixel in the picture as one of 19 classes.
3. The 3D object detection algorithm for point cloud screening based on image semantic features as claimed in claim 1, wherein in step (2), based on the result predicted by the 2D semantic segmentation method, the region of each category in each image is projected into the LIDAR point cloud space by using a known projection matrix, and the region corresponding to the LIDAR point cloud space has a category attribute consistent with the image region; points about vehicles, pedestrians and cyclists are then screened from the original point cloud and extracted to form viewing cones.
4. The 3D object detection algorithm for point cloud screening based on image semantic features as claimed in claim 1, wherein in step (3), a deep object detection network is constructed using PyTorch, and the network comprises three parts: a grid-based point cloud feature extractor, convolutional intermediate layers and a region pre-selection network (RPN):
in a grid point cloud feature extractor, orderly cutting the whole view cone by using a 3D grid with a set size, and sending all points in each grid to the grid feature extractor, wherein the grid feature extractor consists of a linear layer, a batch normalization layer BatchNorm and a nonlinear activation layer ReLU;
in the convolutional intermediate layers, 3 convolutional intermediate modules are used, each formed by sequentially connecting a 3D convolutional layer, a batch normalization layer and a nonlinear activation layer; the output of the grid point cloud extractor is taken as the input, and the feature with 3D structure is converted into a 2D pseudo-image feature as the output;
the input of the region pre-selection network RPN is provided by the convolutional intermediate layers; the architecture of the RPN consists of three fully convolutional modules, each containing a downsampling convolutional layer followed by two convolutional layers matching the feature map size, with BatchNorm and ReLU applied after each convolutional layer; then, the output of each module is upsampled to feature maps of the same size, and these feature maps are concatenated into one whole; finally, three 1 × 1 2D convolutional layers are applied for the desired learning objectives to generate: (1) a probability score map, (2) regression offsets, and (3) a direction prediction.
5. The 3D target detection algorithm for point cloud screening based on image semantic features as claimed in claim 1, wherein in step (4), an overall loss function L_total is added to the model, as follows:
L_total = β1·L_cls + β2·(L_reg_θ + L_reg_other) + β3·L_dir + β4·L_corner
wherein L_cls is the predicted classification loss, L_reg_θ is the predicted angle loss of the 3D bounding box, L_reg_other is the predicted correction loss for the remaining parameters of the 3D bounding box, L_dir is the predicted direction loss, and L_corner is the predicted vertex-coordinate loss of the 3D bounding box; β1, β2, β3, β4 are hyper-parameters, set to 1.0, 2.0, 0.2 and 0.5, respectively;
for L_reg_θ and L_reg_other, the following variables are used:
Δθ = θ_g − θ_a
wherein (x_c_g, y_c_g, z_c_g, w_g, l_g, h_g, θ_g) are the parameters of each bounding box provided by the label and (x_c_a, y_c_a, z_c_a, w_a, l_a, h_a, θ_a) are the parameters of the prediction anchor, where x_c, y_c, z_c, w, l, h and θ respectively refer to the center coordinates of the three-dimensional bounding box and its length, width, height and top-view heading angle; d_a = √((l_a)² + (w_a)²) is the length of the diagonal of the anchor's top-view footprint; for the predicted angle θ_p, the angle loss L_reg_θ is specifically expressed as:
L_reg_θ = SmoothL1(sin(θ_p − Δθ))
the parameter correction loss L_reg_other is specifically the SmoothL1 function of the differences Δx, Δy, Δz, Δw, Δl, Δh, Δθ, while the vertex-coordinate loss L_corner of the 3D bounding box is composed as follows:
wherein NS and NH traverse all bounding boxes; P, P* and P** denote the predicted bounding-box vertices, the vertices of the label bounding box, and the vertices of the heading-inverted label bounding box; δ_ij is a balance coefficient, and i, j are the indices of the targets generated by the final feature map.
6. The 3D target detection algorithm for point cloud screening based on image semantic features as claimed in claim 1, wherein in step (4), the balance of positive and negative anchors is adjusted using the focal loss:
FL(p_t) = −α_t·(1 − p_t)^γ·log(p_t)
wherein p_t is the estimated probability of the model, and α_t and γ are hyper-parameter adjustment coefficients, set to 0.5 and 2, respectively.
7. The 3D object detection algorithm for point cloud screening based on image semantic features as claimed in claim 1, wherein in step (5), the whole model is trained according to steps (2), (3) and (4); that is, the 3D target detection network is trained on the KITTI dataset with the following parameters and implementation: training for 200,000 iterations (160 epochs) on a 1080Ti GPU using stochastic gradient descent (SGD) and the Adam optimizer; the initial learning rate is set to 0.0002, with an exponential decay factor of 0.8 applied every 15 epochs.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010000186.6A CN111145174B (en) | 2020-01-02 | 2020-01-02 | 3D target detection method for point cloud screening based on image semantic features |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010000186.6A CN111145174B (en) | 2020-01-02 | 2020-01-02 | 3D target detection method for point cloud screening based on image semantic features |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111145174A true CN111145174A (en) | 2020-05-12 |
CN111145174B CN111145174B (en) | 2022-08-09 |
Family
ID=70523228
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010000186.6A Active CN111145174B (en) | 2020-01-02 | 2020-01-02 | 3D target detection method for point cloud screening based on image semantic features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111145174B (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112183358A (en) * | 2020-09-29 | 2021-01-05 | 新石器慧拓(北京)科技有限公司 | Training method and device for target detection model |
CN112184589A (en) * | 2020-09-30 | 2021-01-05 | 清华大学 | Point cloud intensity completion method and system based on semantic segmentation |
CN112200303A (en) * | 2020-09-28 | 2021-01-08 | 杭州飞步科技有限公司 | Laser radar point cloud 3D target detection method based on context-dependent encoder |
CN112464905A (en) * | 2020-12-17 | 2021-03-09 | 湖南大学 | 3D target detection method and device |
CN112541081A (en) * | 2020-12-21 | 2021-03-23 | 中国人民解放军国防科技大学 | Migratory rumor detection method based on field self-adaptation |
CN112562093A (en) * | 2021-03-01 | 2021-03-26 | 湖北亿咖通科技有限公司 | Object detection method, electronic medium, and computer storage medium |
CN112598635A (en) * | 2020-12-18 | 2021-04-02 | 武汉大学 | Point cloud 3D target detection method based on symmetric point generation |
GB2591171A (en) * | 2019-11-14 | 2021-07-21 | Motional Ad Llc | Sequential fusion for 3D object detection |
CN113343886A (en) * | 2021-06-23 | 2021-09-03 | 贵州大学 | Tea leaf identification grading method based on improved capsule network |
CN113378760A (en) * | 2021-06-25 | 2021-09-10 | 北京百度网讯科技有限公司 | Training target detection model and method and device for detecting target |
CN113984037A (en) * | 2021-09-30 | 2022-01-28 | 电子科技大学长三角研究院(湖州) | Semantic map construction method based on target candidate box in any direction |
CN114677568A (en) * | 2022-05-30 | 2022-06-28 | 山东极视角科技有限公司 | Linear target detection method, module and system based on neural network |
US11500063B2 (en) | 2018-11-08 | 2022-11-15 | Motional Ad Llc | Deep learning for object detection using pillars |
CN116385452A (en) * | 2023-03-20 | 2023-07-04 | 广东科学技术职业学院 | LiDAR point cloud panorama segmentation method based on polar coordinate BEV graph |
CN116912238A (en) * | 2023-09-11 | 2023-10-20 | 湖北工业大学 | Weld joint pipeline identification method and system based on multidimensional identification network cascade fusion |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109523552A (en) * | 2018-10-24 | 2019-03-26 | 青岛智能产业技术研究院 | Three-dimension object detection method based on cone point cloud |
US20190108639A1 (en) * | 2017-10-09 | 2019-04-11 | The Board Of Trustees Of The Leland Stanford Junior University | Systems and Methods for Semantic Segmentation of 3D Point Clouds |
CN109784333A (en) * | 2019-01-22 | 2019-05-21 | 中国科学院自动化研究所 | Based on an objective detection method and system for cloud bar power channel characteristics |
CN110032962A (en) * | 2019-04-03 | 2019-07-19 | 腾讯科技(深圳)有限公司 | A kind of object detecting method, device, the network equipment and storage medium |
- 2020-01-02 CN CN202010000186.6A patent/CN111145174B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190108639A1 (en) * | 2017-10-09 | 2019-04-11 | The Board Of Trustees Of The Leland Stanford Junior University | Systems and Methods for Semantic Segmentation of 3D Point Clouds |
CN109523552A (en) * | 2018-10-24 | 2019-03-26 | 青岛智能产业技术研究院 | Three-dimensional object detection method based on cone point cloud |
CN109784333A (en) * | 2019-01-22 | 2019-05-21 | 中国科学院自动化研究所 | Object detection method and system based on point cloud channel characteristics |
CN110032962A (en) * | 2019-04-03 | 2019-07-19 | 腾讯科技(深圳)有限公司 | Object detection method and apparatus, network device, and storage medium |
Cited By (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11500063B2 (en) | 2018-11-08 | 2022-11-15 | Motional Ad Llc | Deep learning for object detection using pillars |
US11214281B2 (en) | 2019-11-14 | 2022-01-04 | Motional Ad Llc | Sequential fusion for 3D object detection |
GB2591171B (en) * | 2019-11-14 | 2023-09-13 | Motional Ad Llc | Sequential fusion for 3D object detection |
US11634155B2 (en) | 2019-11-14 | 2023-04-25 | Motional Ad Llc | Sequential fusion for 3D object detection |
GB2591171A (en) * | 2019-11-14 | 2021-07-21 | Motional Ad Llc | Sequential fusion for 3D object detection |
CN112200303A (en) * | 2020-09-28 | 2021-01-08 | 杭州飞步科技有限公司 | Laser radar point cloud 3D target detection method based on context-dependent encoder |
CN112200303B (en) * | 2020-09-28 | 2022-10-21 | 杭州飞步科技有限公司 | Laser radar point cloud 3D target detection method based on context-dependent encoder |
CN112183358B (en) * | 2020-09-29 | 2024-04-23 | 新石器慧通(北京)科技有限公司 | Training method and device for target detection model |
CN112183358A (en) * | 2020-09-29 | 2021-01-05 | 新石器慧拓(北京)科技有限公司 | Training method and device for target detection model |
US11315271B2 (en) | 2020-09-30 | 2022-04-26 | Tsinghua University | Point cloud intensity completion method and system based on semantic segmentation |
CN112184589A (en) * | 2020-09-30 | 2021-01-05 | 清华大学 | Point cloud intensity completion method and system based on semantic segmentation |
CN112464905A (en) * | 2020-12-17 | 2021-03-09 | 湖南大学 | 3D target detection method and device |
CN112464905B (en) * | 2020-12-17 | 2022-07-26 | 湖南大学 | 3D target detection method and device |
CN112598635A (en) * | 2020-12-18 | 2021-04-02 | 武汉大学 | Point cloud 3D target detection method based on symmetric point generation |
CN112598635B (en) * | 2020-12-18 | 2024-03-12 | 武汉大学 | Point cloud 3D target detection method based on symmetric point generation |
CN112541081A (en) * | 2020-12-21 | 2021-03-23 | 中国人民解放军国防科技大学 | Transferable rumor detection method based on domain adaptation |
CN112541081B (en) * | 2020-12-21 | 2022-09-16 | 中国人民解放军国防科技大学 | Transferable rumor detection method based on domain adaptation |
CN112562093A (en) * | 2021-03-01 | 2021-03-26 | 湖北亿咖通科技有限公司 | Object detection method, electronic medium, and computer storage medium |
CN113343886A (en) * | 2021-06-23 | 2021-09-03 | 贵州大学 | Tea leaf identification and grading method based on an improved capsule network |
CN113378760A (en) * | 2021-06-25 | 2021-09-10 | 北京百度网讯科技有限公司 | Method and apparatus for training a target detection model and for detecting targets |
CN113984037B (en) * | 2021-09-30 | 2023-09-12 | 电子科技大学长三角研究院(湖州) | Semantic map construction method based on target candidate boxes in any direction |
CN113984037A (en) * | 2021-09-30 | 2022-01-28 | 电子科技大学长三角研究院(湖州) | Semantic map construction method based on target candidate box in any direction |
CN114677568A (en) * | 2022-05-30 | 2022-06-28 | 山东极视角科技有限公司 | Linear target detection method, module and system based on neural network |
CN116385452A (en) * | 2023-03-20 | 2023-07-04 | 广东科学技术职业学院 | LiDAR point cloud panoptic segmentation method based on polar-coordinate BEV map |
CN116912238A (en) * | 2023-09-11 | 2023-10-20 | 湖北工业大学 | Weld joint pipeline identification method and system based on multidimensional identification network cascade fusion |
CN116912238B (en) * | 2023-09-11 | 2023-11-28 | 湖北工业大学 | Weld joint pipeline identification method and system based on multidimensional identification network cascade fusion |
Also Published As
Publication number | Publication date |
---|---|
CN111145174B (en) | 2022-08-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111145174B (en) | 3D target detection method for point cloud screening based on image semantic features | |
CN111598030B (en) | Method and system for detecting and segmenting vehicle in aerial image | |
CN108647585B (en) | Traffic sign detection method based on multi-scale recurrent attention network | |
CN109886066B (en) | Rapid target detection method based on multi-scale and multi-layer feature fusion | |
CN111832655B (en) | Multi-scale three-dimensional target detection method based on characteristic pyramid network | |
CN111640125B (en) | Building detection and segmentation method and device for aerial images based on Mask R-CNN | |
CN111461212B (en) | Compression method for point cloud target detection model | |
CN111695514B (en) | Vehicle detection method in foggy days based on deep learning | |
CN110084817B (en) | Digital elevation model production method based on deep learning | |
CN113160062B (en) | Infrared image target detection method, device, equipment and storage medium | |
CN110909623B (en) | Three-dimensional target detection method and three-dimensional target detector | |
CN104134234A (en) | Full-automatic three-dimensional scene construction method based on single image | |
CN112560675B (en) | Bird visual target detection method combining YOLO and rotation-fusion strategy | |
CN109583483A (en) | Object detection method and system based on convolutional neural networks | |
CN109801297B (en) | Image panorama segmentation prediction optimization method based on convolution | |
CN111640116B (en) | Building segmentation method and device for aerial images based on deep convolutional residual network | |
CN113191204B (en) | Multi-scale occluded pedestrian detection method and system | |
CN111738206A (en) | Excavator detection method for unmanned aerial vehicle inspection based on CenterNet | |
CN114463736A (en) | Multi-target detection method and device based on multi-mode information fusion | |
CN114519819B (en) | Remote sensing image target detection method based on global context awareness | |
CN115424017B (en) | Building inner and outer contour segmentation method, device and storage medium | |
CN108074232A (en) | Airborne LiDAR building detection method based on voxel segmentation | |
CN111738114A (en) | Vehicle target detection method for remote sensing images based on anchor-free accurate sampling | |
EP4174792A1 (en) | Method for scene understanding and semantic analysis of objects | |
CN115115917A (en) | 3D point cloud target detection method based on attention mechanism and image feature fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
CB02 | Change of applicant information ||
Address after: 66 New Model Street, Gulou District, Nanjing, Jiangsu, 210000
Applicant after: Nanjing University of Posts and Telecommunications
Address before: No. 9 Yuen Road, Qixia District, Nanjing, Jiangsu, 210000
Applicant before: Nanjing University of Posts and Telecommunications
GR01 | Patent grant | ||