CN116206308A - Semantic segmentation method and device for point cloud data and excavator


Info

Publication number
CN116206308A
Authority
CN
China
Prior art keywords
point cloud
point cloud data
three-dimensional point cloud
semantic segmentation
Prior art date
Legal status
Pending
Application number
CN202310165367.8A
Other languages
Chinese (zh)
Inventor
刘钢洋 (Liu Gangyang)
樊登云 (Fan Dengyun)
董洋 (Dong Yang)
Current Assignee
Sany Heavy Machinery Ltd
Original Assignee
Sany Heavy Machinery Ltd
Priority date
Filing date
Publication date
Application filed by Sany Heavy Machinery Ltd
Priority to CN202310165367.8A
Publication of CN116206308A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/70 - Labelling scene content, e.g. deriving syntactic or semantic representations
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/36 - Applying a local operator, i.e. means to operate on image points situated in the vicinity of a given point; Non-linear local filtering operations, e.g. median filtering
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 - Arrangements using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V 10/765 - Arrangements using classification with rules for classification or partitioning the feature space
    • G06V 10/766 - Arrangements using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
    • G06V 10/82 - Arrangements using pattern recognition or machine learning using neural networks
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 - Road transport of goods or passengers
    • Y02T 10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Nonlinear Science (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a semantic segmentation method and device for point cloud data, and an excavator. Three-dimensional point cloud data are first converted into two-dimensional point cloud data, which makes the data more compact and dense and avoids voxelization. The two-dimensional point cloud data are then input into a neural network model of the kind that is mature for processing two-dimensional images, and the semantic information of the three-dimensional point cloud data is generated directly, thereby realizing semantic segmentation of the three-dimensional environment scene. The training samples of the neural network model are generated in real time by a scene point cloud generation model, which produces training samples for the corresponding scene according to the scene requirements. That is, when semantic segmentation of an environment scene is performed, corresponding training samples are generated in real time according to the type of the environment scene and used to further train the neural network model, which improves both the accuracy of the model's semantic segmentation for that environment scene and its generalization capability.

Description

Semantic segmentation method and device for point cloud data and excavator
Technical Field
The application relates to the technical field of excavator operation-scene acquisition, and in particular to a semantic segmentation method and device for point cloud data, and an excavator.
Background
With the continuous development of automation technology, more and more engineering equipment and engineering vehicles adopt automated operation, such as the automated operation of an excavator, i.e., unmanned excavation. To realize unmanned excavation, the excavator needs to understand its working scene: it predicts actual class labels from sensor data, so that information such as the working area and obstacles in the working scene becomes clear.
Three-dimensional environment perception is one of the core difficulties of automated operation. It is responsible for identifying pedestrians, vehicles, and other dynamic and static elements around the excavator, so as to provide comprehensive environment information for planning operation routes and avoiding static obstacles as well as dynamic pedestrians, vehicles, and the like. Lidar-based three-dimensional semantic segmentation aims to identify the semantic categories of the elements in the three-dimensional scene point cloud scanned by the lidar, and is a basic task of the whole three-dimensional environment perception technology.
Existing three-dimensional point cloud semantic recognition is tied to a fixed working mode or scene, for example by using a model trained for that scene. However, for engineering equipment such as excavators, the operation scene varies greatly, and using the same model for semantic recognition across scenes yields low recognition accuracy.
Disclosure of Invention
The present application has been made in order to solve the above technical problems. Embodiments of the application provide a semantic segmentation method and device for point cloud data, and an excavator, which solve these technical problems.
According to one aspect of the present application, there is provided a semantic segmentation method of point cloud data, including: projecting the three-dimensional point cloud data to a two-dimensional image space to obtain two-dimensional point cloud data; inputting the two-dimensional point cloud data into a neural network model to generate semantic information of the three-dimensional point cloud data; the training samples of the neural network model are generated in real time by a scene point cloud generating model, and the scene point cloud generating model generates the training samples under the corresponding scene according to scene requirements.
In an embodiment, before the inputting the two-dimensional point cloud data into the neural network model and generating the semantic information of the three-dimensional point cloud data, the semantic segmentation method of the point cloud data further includes: and training the neural network model by adopting the training sample.
In an embodiment, the inputting the two-dimensional point cloud data into a neural network model, and generating the semantic information of the three-dimensional point cloud data includes: inputting the two-dimensional point cloud data into the neural network model to obtain semantic information of the three-dimensional point cloud data; the neural network model comprises a semantic segmentation model and a point cloud post-processing model, and the semantic segmentation model and the point cloud post-processing model are both converted into ONNX-format files and are integrated into one ONNX-format file.
In an embodiment, the inputting the two-dimensional point cloud data into a neural network model, and generating the semantic information of the three-dimensional point cloud data includes: carrying out semantic segmentation on the two-dimensional point cloud data to obtain semantic information of a two-dimensional image; and mapping the two-dimensional point cloud data back to a three-dimensional point cloud space based on the semantic information of the two-dimensional image to obtain the semantic information of the three-dimensional point cloud data.
In an embodiment, the mapping the two-dimensional point cloud data back to a three-dimensional point cloud space based on the semantic information of the two-dimensional image to obtain the semantic information of the three-dimensional point cloud data includes: mapping the two-dimensional point cloud data back to a three-dimensional point cloud space based on semantic information of the two-dimensional image to obtain initial information of the three-dimensional point cloud data; and post-processing the initial information to obtain semantic information of the three-dimensional point cloud data.
In an embodiment, the post-processing the initial information to obtain semantic information of the three-dimensional point cloud data includes: and eliminating edge errors in the initial information by adopting a proximity algorithm to obtain semantic information of the three-dimensional point cloud data.
In an embodiment, before the projecting the three-dimensional point cloud data into the two-dimensional image space to obtain the two-dimensional point cloud data, the semantic segmentation method of the point cloud data further includes: filtering the three-dimensional point cloud data to obtain filtered three-dimensional point cloud data;
the projecting the three-dimensional point cloud data into the two-dimensional image space to obtain the two-dimensional point cloud data comprises the following steps:
and projecting the filtered three-dimensional point cloud data to a two-dimensional image space to obtain the two-dimensional point cloud data.
In an embodiment, the filtering the three-dimensional point cloud data includes: and adopting any one or a combination of a plurality of filtering processing modes for the three-dimensional point cloud data: outlier filtering, null filtering, intensity filtering, box filtering, sphere filtering.
In an embodiment, the projecting the three-dimensional point cloud data into the two-dimensional image space to obtain the two-dimensional point cloud data includes: and projecting the three-dimensional point cloud data to a two-dimensional image space by adopting a spherical projection method to obtain the two-dimensional point cloud data.
According to another aspect of the present application, there is provided a semantic segmentation apparatus for point cloud data, including: the data projection module is used for projecting the three-dimensional point cloud data to a two-dimensional image space to obtain two-dimensional point cloud data; the semantic generation module is used for inputting the two-dimensional point cloud data into a neural network model and generating semantic information of the three-dimensional point cloud data; the training samples of the neural network model are generated in real time by a scene point cloud generating model, and the scene point cloud generating model generates the training samples under the corresponding scene according to scene requirements.
According to another aspect of the present application, there is provided an excavator, including: a body; the laser radar is arranged on the machine body and is used for collecting three-dimensional point cloud data; and the semantic segmentation device of the point cloud data is connected with the laser radar.
According to the semantic segmentation method and device for point cloud data and the excavator provided by this application, the three-dimensional point cloud data are projected into a two-dimensional image space to obtain two-dimensional point cloud data; the two-dimensional point cloud data are input into a neural network model to generate semantic information for the three-dimensional point cloud data; and the training samples of the neural network model are generated in real time by a scene point cloud generation model according to the scene requirements. Converting the three-dimensional point cloud data into two-dimensional point cloud data first makes the data more compact and dense and avoids voxelization; the two-dimensional point cloud data are then input into a neural network model of the kind that is mature for two-dimensional image processing, and the semantic information of the three-dimensional point cloud data is generated directly, thereby realizing semantic segmentation of the three-dimensional environment scene.
Drawings
The foregoing and other objects, features and advantages of the present application will become more apparent from the following more particular description of embodiments of the present application, as illustrated in the accompanying drawings. The accompanying drawings are included to provide a further understanding of embodiments of the application and are incorporated in and constitute a part of this specification, illustrate the application and not constitute a limitation to the application. In the drawings, like reference numerals generally refer to like parts or steps.
Fig. 1 is a flowchart of a semantic segmentation method of point cloud data according to an exemplary embodiment of the present application.
Fig. 2 is a flowchart of a semantic segmentation method of point cloud data according to another exemplary embodiment of the present application.
Fig. 3 is a flowchart of a semantic segmentation method of point cloud data according to another exemplary embodiment of the present application.
Fig. 4 is a flowchart of a semantic segmentation method of point cloud data according to another exemplary embodiment of the present application.
Fig. 5 is a schematic diagram of a semantic segmentation method of point cloud data according to an exemplary embodiment of the present application.
Fig. 6 is a schematic structural diagram of a semantic segmentation device for point cloud data according to an exemplary embodiment of the present application.
Fig. 7 is a schematic structural diagram of a semantic segmentation device for point cloud data according to another exemplary embodiment of the present application.
Fig. 8 is a block diagram of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application and not all of the embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.
Fig. 1 is a flowchart of a semantic segmentation method of point cloud data according to an exemplary embodiment of the present application. As shown in fig. 1, the semantic segmentation method of the point cloud data includes the following steps:
step 110: and projecting the three-dimensional point cloud data to a two-dimensional image space to obtain two-dimensional point cloud data.
Specifically, this semantic segmentation method for point cloud data can be applied to environment perception for the working scene of an unmanned excavator: point cloud data in the working scene are collected and their categories identified, so that the environment is perceived. In this application, one or more laser radars and one or more cameras can be arranged on the excavator body. The laser radar or camera is preferably arranged on top of the excavator cab, so as to obtain the largest working field of view and to be less affected by vibrations of the excavator's working device; other laser radars or cameras can be arranged at the boom to supplement the blind zone above the cab. Compared with a camera sensor, a laser radar sensor has a wider field of view, returns more accurate distance measurements, and is less affected by environmental changes such as illumination, so it has great application prospects for perceiving the complex and changeable working scenes of the excavator.
In one embodiment, the specific implementation of step 110 may be: projecting the three-dimensional point cloud data into a two-dimensional image space by a spherical projection method to obtain the two-dimensional point cloud data. In order to represent the sparse and irregular three-dimensional point cloud data compactly and structurally and to allow standard convolution operations, each point (x, y, z) of the three-dimensional point cloud data is projected into the two-dimensional image space by the following formulas:

u = (1/2) * (1 - arctan(y, x) / π) * w
v = (1 - (arcsin(z / r) + |f_down|) / f) * h

where w and h represent the width and height, respectively, of the projected image; r = √(x² + y² + z²) represents the distance between each point and the origin of coordinates; and f represents the vertical field of view of the lidar (i.e., the visibility of the lidar in the vertical direction), f = |f_down| + |f_up|.
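For illustration, the following is a minimal Python/NumPy sketch of this spherical projection; the image size (1024 x 64) and the vertical field of view (+3° to -25°) are hypothetical example values for a typical 64-beam lidar, not parameters specified by this application:

    import numpy as np

    def spherical_projection(points, w=1024, h=64, fov_up_deg=3.0, fov_down_deg=-25.0):
        """Project (N, 3) lidar points onto a w x h range image by spherical projection."""
        fov_up = np.radians(fov_up_deg)
        fov_down = np.radians(fov_down_deg)
        fov = abs(fov_up) + abs(fov_down)                     # f = |f_down| + |f_up|

        x, y, z = points[:, 0], points[:, 1], points[:, 2]
        r = np.maximum(np.linalg.norm(points, axis=1), 1e-6)  # distance to the origin

        yaw = np.arctan2(y, x)
        pitch = np.arcsin(z / r)

        u = 0.5 * (1.0 - yaw / np.pi) * w                     # horizontal image coordinate
        v = (1.0 - (pitch + abs(fov_down)) / fov) * h         # vertical image coordinate

        u = np.clip(np.floor(u), 0, w - 1).astype(np.int64)
        v = np.clip(np.floor(v), 0, h - 1).astype(np.int64)
        return u, v, r

The resulting (u, v, r) triples can then be scattered into an image tensor (range, x, y, z, and intensity channels are a common choice) that serves as the two-dimensional point cloud data.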
Step 120: inputting the two-dimensional point cloud data into a neural network model, and generating semantic information of the three-dimensional point cloud data.
The training samples of the neural network model are generated in real time by a scene point cloud generation model, which produces training samples for the corresponding scene according to the scene requirements. After the two-dimensional point cloud data are obtained, they are input into the trained neural network model, which directly generates the semantic information of the three-dimensional point cloud data, thereby realizing perception of the working environment. In this application, the scene point cloud generation model is built according to the actual scene: for different working scenes, it can generate corresponding scene data (three-dimensional point cloud data of the corresponding scene), from which sample data for the different working scenes are obtained; labeling the sample data (for example, the two-dimensional images converted from the three-dimensional point cloud data) yields the training samples for the different working scenes. When the excavator performs actual operations, the scene point cloud generation model generates corresponding training samples according to the current working scene and further trains the neural network model, so as to improve its semantic segmentation capability and accuracy for the current working scene. Specifically, the scene point cloud generation model can generate training samples for the current operation based on data similarity, using methods such as image bag-of-words feature matching and point cloud ICP matching, and iteratively train the neural network model until it meets the requirements of the excavator's actual working scene.
Specifically, the neural network model comprises a semantic segmentation model and a point cloud post-processing model, both of which are converted into ONNX-format files and fused into a single ONNX-format file. The Open Neural Network Exchange (ONNX) format is a standard for representing deep learning models that enables models to be transferred between different frameworks. ONNX is an open file format designed for machine learning, used to store trained models; it allows different artificial intelligence frameworks to store model data and interact in the same format. Regardless of the training framework used (e.g., TensorFlow, PyTorch, OneFlow, Paddle), the trained model can be converted into the unified ONNX format for storage. In this application, the semantic segmentation model and the point cloud post-processing model are each converted into an ONNX-format file, and the two files are then fused into one ONNX-format file. This removes one link in the model inference stage: the point cloud data are input directly into the fused ONNX file, which outputs the segmented and post-processed results, reducing the time consumption of the algorithm.
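As an illustrative sketch (not the application's own tooling), the fusion of the two exported models into a single ONNX file can be expressed with the onnx package's compose utility; the file names and the tensor names in io_map below are hypothetical placeholders that must match the names actually used when the models were exported:

    import onnx
    from onnx import compose

    # Load the two separately exported models (file names are placeholders).
    seg_model = onnx.load("semantic_segmentation.onnx")
    post_model = onnx.load("pointcloud_postprocess.onnx")

    # Fuse into a single graph by wiring the segmentation output tensor
    # into the post-processing input tensor.
    fused = compose.merge_models(
        seg_model,
        post_model,
        io_map=[("seg_logits", "post_input")],  # (output of model 1, input of model 2)
    )
    onnx.save(fused, "fused_pipeline.onnx")

At inference time, only the single fused file needs to be loaded, which is the removal of one link described above.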
According to the semantic segmentation method for point cloud data provided by this application, the three-dimensional point cloud data are projected into a two-dimensional image space to obtain two-dimensional point cloud data; the two-dimensional point cloud data are input into a neural network model to generate semantic information for the three-dimensional point cloud data; and the training samples of the neural network model are generated in real time by a scene point cloud generation model according to the scene requirements. Converting the three-dimensional point cloud data into two-dimensional point cloud data first makes the data more compact and dense and avoids voxelization; the two-dimensional point cloud data are then input into a neural network model of the kind that is mature for two-dimensional image processing, and the semantic information of the three-dimensional point cloud data is generated directly, thereby realizing semantic segmentation of the three-dimensional environment scene.
Fig. 2 is a flowchart of a semantic segmentation method of point cloud data according to another exemplary embodiment of the present application. As shown in fig. 2, before step 120, the semantic segmentation method of the point cloud data may further include:
step 130: training the neural network model by using a training sample.
Before semantic segmentation is executed, the scene point cloud generation model generates corresponding training samples according to the current working scene and iteratively trains the neural network model, so that the model meets the requirements of the excavator's actual working scene, improving its semantic segmentation capability and accuracy for the current working scene.
Fig. 3 is a flowchart of a semantic segmentation method of point cloud data according to another exemplary embodiment of the present application. As shown in fig. 3, the step 120 may include:
step 121: and carrying out semantic segmentation on the two-dimensional point cloud data to obtain semantic information of the two-dimensional image.
Semantic segmentation of the two-dimensional point cloud data can be realized with a neural network model designed for two-dimensional images (including standard convolution modules and the like), which reduces the difficulty and the computational load of semantic segmentation; the sparse, unordered laser radar point cloud data have already been converted into more compact, structured two-dimensional point cloud data, which improves segmentation efficiency. Specifically, neural network models such as SalsaNext, RangeNet++, or SqueezeSegV3 can be used for the semantic segmentation.
Step 122: and mapping the two-dimensional point cloud data back to the three-dimensional point cloud space based on the semantic information of the two-dimensional image so as to obtain the semantic information of the three-dimensional point cloud data.
In one embodiment, the implementation of step 122 may be: mapping the two-dimensional point cloud data back to the three-dimensional point cloud space based on the semantic information of the two-dimensional image to obtain initial information of the three-dimensional point cloud data, and then post-processing the initial information to obtain the semantic information of the three-dimensional point cloud data. After the semantic information of the two-dimensional image is obtained, the two-dimensional point cloud data are mapped back to the three-dimensional point cloud space, i.e., restored to three-dimensional point cloud data, to obtain the scene information in three-dimensional space (including information about target objects and the like). The post-processing of the three-dimensional point cloud data (the initial information) resolves the discretization errors caused by spherical projection (i.e., the misclassification of object edges that occurs when the two-dimensional image is mapped back to the three-dimensional point cloud space), so that accurate semantic information of the three-dimensional point cloud data is obtained. Specifically, a proximity algorithm (such as a KNN post-processing algorithm) is used to eliminate the edge errors in the initial information. KNN refers to the k-nearest-neighbor classification algorithm, a data mining classification method in which each sample is represented by its k nearest neighbors. The core idea of the kNN algorithm is that if the majority of the k nearest samples of a sample in the feature space belong to a certain class, the sample also belongs to that class and shares the characteristics of the samples of that class. The method determines the class of a sample to be classified only according to the classes of its nearest sample or samples, so the classification decision depends on a very small number of neighboring samples. Because the kNN method relies mainly on the limited surrounding neighboring samples rather than on discriminating class domains, it is more suitable than other methods for sample sets whose class domains overlap or cross considerably.
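A minimal sketch of such proximity-based post-processing is given below, assuming the labels are smoothed by a plain majority vote over the k nearest neighbors in three-dimensional space; practical KNN post-processing (e.g., as in RangeNet++) additionally weights neighbors by range, which is omitted here:

    import numpy as np
    from scipy.spatial import cKDTree

    def knn_label_smoothing(points, labels, k=5):
        """Replace each point's label by the majority label of its k nearest neighbors."""
        tree = cKDTree(points)             # spatial index over the (N, 3) points
        _, idx = tree.query(points, k=k)   # indices of the k nearest neighbors, shape (N, k)
        neighbor_labels = labels[idx]      # (N, k) matrix of neighbor labels
        smoothed = np.empty_like(labels)
        for i, row in enumerate(neighbor_labels):
            values, counts = np.unique(row, return_counts=True)
            smoothed[i] = values[np.argmax(counts)]  # majority class wins
        return smoothed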
Fig. 4 is a flowchart of a semantic segmentation method of point cloud data according to another exemplary embodiment of the present application. As shown in fig. 4, before step 110, the semantic segmentation method of the point cloud data may further include:
step 140: and filtering the three-dimensional point cloud data to obtain filtered three-dimensional point cloud data.
Because the laser radar is mounted on top of the excavator cab or on the boom, the emitted beams may be blocked by the excavator body, producing invalid returns, or the point cloud data may be overly concentrated at the excavator body; in addition, the complex working scenes of the excavator may produce abnormal point cloud data. The three-dimensional point cloud data are therefore filtered first, so that the accuracy of the subsequent semantic segmentation is ensured.
In one embodiment, the specific implementation of step 140 may be: any one or a combination of a plurality of the following filtering processing modes is adopted for the three-dimensional point cloud data: outlier filtering, null filtering, intensity filtering, box filtering, sphere filtering.
Correspondingly, the step 110 may include:
step 111: and projecting the filtered three-dimensional point cloud data to a two-dimensional image space to obtain two-dimensional point cloud data.
After the filtering processing, the filtered three-dimensional point cloud data are projected into the two-dimensional image space. This not only improves the accuracy of the underlying point cloud data but also reduces the amount of point cloud computation, improving segmentation efficiency.
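As an illustration, the sketch below combines three of the filtering modes listed above (null filtering, intensity filtering, and box filtering); the thresholds and the excavator-body bounding box are illustrative assumptions rather than values from this application:

    import numpy as np

    def prefilter_point_cloud(points, intensity,
                              body_min=(-2.0, -1.5, -2.0),
                              body_max=(4.0, 1.5, 1.0),
                              intensity_min=0.05):
        """Drop invalid, weak, and excavator-body returns from an (N, 3) point cloud."""
        body_min = np.asarray(body_min)
        body_max = np.asarray(body_max)

        finite = np.isfinite(points).all(axis=1)   # null filtering: drop NaN/inf points
        strong = intensity >= intensity_min        # intensity filtering: drop weak returns
        in_body = ((points >= body_min) & (points <= body_max)).all(axis=1)

        keep = finite & strong & ~in_body          # box filtering removes the body's own returns
        return points[keep], intensity[keep]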
Fig. 5 is a schematic diagram of a semantic segmentation method of point cloud data according to an exemplary embodiment of the present application. As shown in fig. 5, the semantic segmentation method of the point cloud data includes the following steps:
step 510: and the laser radar acquires the point cloud data.
The method and the device acquire three-dimensional point cloud data in a working scene in real time by using the laser radar arranged on the excavator.
Step 520: training samples are generated.
In this application, the scene point cloud generation model generates training samples for the corresponding scene according to the scene requirements, so as to keep the training data up to date.
Step 530: and (5) preprocessing point cloud data.
In this application, the three-dimensional point cloud data are pre-filtered to obtain more accurate three-dimensional point cloud data, providing an accurate data foundation for the subsequent semantic segmentation.
Step 540: and (5) projecting point cloud data.
In this application, the three-dimensional point cloud data are projected into a two-dimensional image space by the spherical projection method to obtain two-dimensional point cloud data, so that the sparse, irregular three-dimensional point cloud data are represented compactly and structurally and standard convolution operations can be performed, reducing the difficulty of semantic segmentation and improving its efficiency.
Step 550: and (5) model training.
The neural network model of this application comprises a semantic segmentation model and a point cloud data post-processing model; that is, semantic segmentation and point cloud post-processing are integrated into one neural network model, which performs both the segmentation and the post-processing of the point cloud data. Specifically, the semantic segmentation model can be stored as an ONNX network, the point cloud post-processing model can be expressed with ONNX operators, and those operators are integrated into the ONNX network.
Step 560: model deployment quantifies acceleration.
TensorRT is a high-performance deep learning inference optimizer developed by NVIDIA that provides low-latency, high-throughput deployment inference for deep learning applications. It can be used for inference acceleration in very large-scale data centers, on embedded platforms, or on autonomous driving platforms. In this application, the neural network model is deployed and optimized with a neural network inference optimizer such as TensorRT; in particular, TensorRT can perform FP16 quantization to reduce inference links, improve inference speed, and facilitate deploying the neural network model in the excavator's actual working environment.
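The following is a minimal sketch of building an FP16 engine from the fused ONNX file with the TensorRT Python API (TensorRT 8.x-style calls; the file names are hypothetical):

    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)

    with open("fused_pipeline.onnx", "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)  # the FP16 quantization described above

    engine = builder.build_serialized_network(network, config)
    with open("fused_pipeline.engine", "wb") as f:
        f.write(engine)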
Step 570: and (5) point cloud data segmentation.
After the neural network model is trained, it performs semantic segmentation on the point cloud data collected by the laser radar to obtain the semantic segmentation result for the working scene.
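For the inference call itself, a sketch using ONNX Runtime is shown below, assuming the fused model takes a single range-image tensor and returns a single label tensor; the tensor shape and file name are illustrative:

    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession("fused_pipeline.onnx",
                                   providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name

    # A dummy (batch, channels, H, W) range image standing in for real lidar data,
    # e.g. 5 channels (range, x, y, z, intensity) on a 64 x 1024 projection.
    range_image = np.zeros((1, 5, 64, 1024), dtype=np.float32)

    (labels,) = session.run(None, {input_name: range_image})
    print(labels.shape)  # per-pixel semantic classes, mapped back to the 3D points afterwards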
Fig. 6 is a schematic structural diagram of a semantic segmentation device for point cloud data according to an exemplary embodiment of the present application. As shown in fig. 6, the semantic segmentation apparatus 60 of the point cloud data includes: the data projection module 61 is configured to project the three-dimensional point cloud data into a two-dimensional image space to obtain two-dimensional point cloud data; the semantic generation module 62 is used for inputting the two-dimensional point cloud data into the neural network model to generate semantic information of the three-dimensional point cloud data; the training samples of the neural network model are generated in real time by a scene point cloud generating model, and the scene point cloud generating model generates the training samples under the corresponding scene according to scene requirements.
According to the semantic segmentation device for point cloud data, the data projection module 61 projects the three-dimensional point cloud data into a two-dimensional image space to obtain two-dimensional point cloud data; the semantic generation module 62 then inputs the two-dimensional point cloud data into a neural network model to generate semantic information for the three-dimensional point cloud data; and the training samples of the neural network model are generated in real time by a scene point cloud generation model according to the scene requirements. Converting the three-dimensional point cloud data into two-dimensional point cloud data first makes the data more compact and dense and avoids voxelization; the two-dimensional point cloud data are then processed by a neural network model that is mature for two-dimensional images, and the semantic information of the three-dimensional point cloud data is generated directly, realizing semantic segmentation of the three-dimensional environment scene.
In an embodiment, the data projection module 61 may be further configured to: and projecting the three-dimensional point cloud data to a two-dimensional image space by adopting a spherical projection method to obtain two-dimensional point cloud data.
Fig. 7 is a schematic structural diagram of a semantic segmentation device for point cloud data according to another exemplary embodiment of the present application. As shown in fig. 7, the semantic segmentation apparatus 60 of the point cloud data may further include: the model training module 63 is configured to train the neural network model by using the training samples.
In one embodiment, as shown in FIG. 7, the semantic generation module 62 may include: a two-dimensional segmentation unit 621, configured to perform semantic segmentation on the two-dimensional point cloud data to obtain semantic information of a two-dimensional image; the point cloud mapping unit 622 is configured to map the two-dimensional point cloud data back to the three-dimensional point cloud space based on the semantic information of the two-dimensional image, so as to obtain the semantic information of the three-dimensional point cloud data.
In an embodiment, the point cloud mapping unit 622 may be further configured to: mapping the two-dimensional point cloud data back to a three-dimensional point cloud space based on semantic information of the two-dimensional image to obtain initial information of the three-dimensional point cloud data; and post-processing is carried out on the initial information to obtain semantic information of the three-dimensional point cloud data.
In an embodiment, as shown in fig. 7, the semantic segmentation apparatus 60 of the point cloud data may further include: the preprocessing module 64 is configured to perform filtering processing on the three-dimensional point cloud data, so as to obtain filtered three-dimensional point cloud data.
In an embodiment, the pre-processing module 64 may be further configured to: any one or a combination of a plurality of the following filtering processing modes is adopted for the three-dimensional point cloud data: outlier filtering, null filtering, intensity filtering, box filtering, sphere filtering.
Correspondingly, the data projection module 61 may be further configured to: and projecting the filtered three-dimensional point cloud data to a two-dimensional image space to obtain two-dimensional point cloud data.
The application also provides an excavator, comprising: a body; the laser radar is arranged on the machine body and is used for collecting three-dimensional point cloud data; and the semantic segmentation device of the point cloud data is connected with the laser radar.
According to the excavator provided by this application, the three-dimensional point cloud data are projected into a two-dimensional image space to obtain two-dimensional point cloud data; the two-dimensional point cloud data are input into a neural network model to generate semantic information for the three-dimensional point cloud data; and the training samples of the neural network model are generated in real time by a scene point cloud generation model according to the scene requirements. The three-dimensional point cloud data are thus converted into more compact, dense two-dimensional data without voxelization, processed by a neural network model that is mature for two-dimensional images, and given semantic information directly, realizing semantic segmentation of the three-dimensional environment scene.
Next, an electronic device according to an embodiment of the present application is described with reference to fig. 8. The electronic device may be either or both of the first device and the second device, or a stand-alone device independent thereof, which may communicate with the first device and the second device to receive the acquired input signals therefrom.
Fig. 8 illustrates a block diagram of an electronic device according to an embodiment of the present application.
As shown in fig. 8, the electronic device 10 includes one or more processors 11 and a memory 12.
The processor 11 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 10 to perform desired functions.
Memory 12 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), hard disks, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 11 to implement the methods of the various embodiments of the present application described above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, and the like may also be stored in the computer-readable storage medium.
In one example, the electronic device 10 may further include: an input device 13 and an output device 14, which are interconnected by a bus system and/or other forms of connection mechanisms (not shown).
When the electronic device is a stand-alone device, the input means 13 may be a communication network connector for receiving the acquired input signals from the first device and the second device.
In addition, the input device 13 may also include, for example, a keyboard, a mouse, and the like.
The output device 14 may output various information to the outside, including the determined distance information, direction information, and the like. The output means 14 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, etc.
Of course, for simplicity, only some of the components of the electronic device 10 that are relevant to the present application are shown in fig. 8; components such as buses and input/output interfaces are omitted. In addition, the electronic device 10 may include any other suitable components depending on the particular application.
The computer program product may write program code for performing the operations of embodiments of the present application in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server.
The computer readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the application to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.

Claims (10)

1. The semantic segmentation method of the point cloud data is characterized by comprising the following steps of:
projecting the three-dimensional point cloud data to a two-dimensional image space to obtain two-dimensional point cloud data; and
inputting the two-dimensional point cloud data into a neural network model to generate semantic information of the three-dimensional point cloud data; the training samples of the neural network model are generated in real time by a scene point cloud generating model, and the scene point cloud generating model generates the training samples under the corresponding scene according to scene requirements.
2. The semantic segmentation method of point cloud data according to claim 1, wherein before the inputting of the two-dimensional point cloud data into a neural network model to generate semantic information of the three-dimensional point cloud data, the semantic segmentation method of point cloud data further comprises:
and training the neural network model by adopting the training sample.
3. The method for semantic segmentation of point cloud data according to claim 1, wherein the inputting the two-dimensional point cloud data into a neural network model, generating semantic information of the three-dimensional point cloud data comprises:
inputting the two-dimensional point cloud data into the neural network model to obtain semantic information of the three-dimensional point cloud data; the neural network model comprises a semantic segmentation model and a point cloud post-processing model, and the semantic segmentation model and the point cloud post-processing model are both converted into ONNX-format files and are integrated into one ONNX-format file.
4. The method for semantic segmentation of point cloud data according to claim 1, wherein the inputting the two-dimensional point cloud data into a neural network model, generating semantic information of the three-dimensional point cloud data comprises:
carrying out semantic segmentation on the two-dimensional point cloud data to obtain semantic information of a two-dimensional image; and
and mapping the two-dimensional point cloud data back to a three-dimensional point cloud space based on the semantic information of the two-dimensional image to obtain the semantic information of the three-dimensional point cloud data.
5. The method of claim 4, wherein mapping the two-dimensional point cloud data back to a three-dimensional point cloud space based on semantic information of the two-dimensional image to obtain semantic information of the three-dimensional point cloud data comprises:
mapping the two-dimensional point cloud data back to a three-dimensional point cloud space based on semantic information of the two-dimensional image to obtain initial information of the three-dimensional point cloud data; and
and carrying out post-processing on the initial information to obtain semantic information of the three-dimensional point cloud data.
6. The method for semantic segmentation of point cloud data according to claim 5, wherein the post-processing the initial information to obtain semantic information of the three-dimensional point cloud data comprises:
and eliminating edge errors in the initial information by adopting a proximity algorithm to obtain semantic information of the three-dimensional point cloud data.
7. The semantic segmentation method of point cloud data according to claim 1, wherein before the projecting three-dimensional point cloud data into a two-dimensional image space to obtain two-dimensional point cloud data, the semantic segmentation method of point cloud data further comprises:
filtering the three-dimensional point cloud data to obtain filtered three-dimensional point cloud data;
the projecting the three-dimensional point cloud data into the two-dimensional image space to obtain the two-dimensional point cloud data comprises the following steps:
and projecting the filtered three-dimensional point cloud data to a two-dimensional image space to obtain the two-dimensional point cloud data.
8. The semantic segmentation method of point cloud data according to claim 7, wherein the filtering the three-dimensional point cloud data comprises:
and adopting any one or a combination of a plurality of filtering processing modes for the three-dimensional point cloud data: outlier filtering, null filtering, intensity filtering, box filtering, sphere filtering.
9. A semantic segmentation apparatus for point cloud data, comprising:
the data projection module is used for projecting the three-dimensional point cloud data to a two-dimensional image space to obtain two-dimensional point cloud data; and
the semantic generation module is used for inputting the two-dimensional point cloud data into a neural network model and generating semantic information of the three-dimensional point cloud data; the training samples of the neural network model are generated in real time by a scene point cloud generating model, and the scene point cloud generating model generates the training samples under the corresponding scene according to scene requirements.
10. An excavator, comprising:
a body;
the laser radar is arranged on the machine body and is used for collecting three-dimensional point cloud data; and
the semantic segmentation device for point cloud data according to claim 9, wherein the semantic segmentation device for point cloud data is connected with the lidar.
Application CN202310165367.8A, filed 2023-02-24 (priority date 2023-02-24): Semantic segmentation method and device for point cloud data and excavator. Status: Pending (CN116206308A).

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202310165367.8A | 2023-02-24 | 2023-02-24 | Semantic segmentation method and device for point cloud data and excavator (CN116206308A)

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202310165367.8A | 2023-02-24 | 2023-02-24 | Semantic segmentation method and device for point cloud data and excavator (CN116206308A)

Publications (1)

Publication Number | Publication Date
CN116206308A | 2023-06-02

Family

ID: 86509063

Family Applications (1)

Application Number | Title | Status
CN202310165367.8A | Semantic segmentation method and device for point cloud data and excavator | Pending (CN116206308A)

Country Status (1)

Country | Link
CN (1) | CN116206308A (en)

Similar Documents

Publication | Publication Date | Title
CN110363058B (en) Three-dimensional object localization for obstacle avoidance using one-shot convolutional neural networks
CN111615703B (en) Sensor Data Segmentation
US11017244B2 (en) Obstacle type recognizing method and apparatus, device and storage medium
JP2020042816A (en) Object detection method, device, apparatus, storage media, and vehicle
CN110176078B (en) Method and device for labeling training set data
CN111816020A (en) Migrating synthetic lidar data to a real domain for autonomous vehicle training
JP7224682B1 (en) 3D multiple object detection device and method for autonomous driving
CN113761999A (en) Target detection method and device, electronic equipment and storage medium
CN115273002A (en) Image processing method, device, storage medium and computer program product
CN115810133B (en) Welding control method based on image processing and point cloud processing and related equipment
Shepel et al. Occupancy grid generation with dynamic obstacle segmentation in stereo images
CN111695497B (en) Pedestrian recognition method, medium, terminal and device based on motion information
KR20210121628A (en) Method and system for automatically processing point cloud based on Reinforcement learning
Wen et al. CAE-RLSM: Consistent and efficient redundant line segment merging for online feature map building
Guo et al. Road environment perception for safe and comfortable driving
JP2022035033A (en) Information processing system, information processing method, program and vehicle control system
CN117032215A (en) Mobile robot object identification and positioning method based on binocular vision
CN116311114A (en) Method and device for generating drivable region, electronic equipment and storage medium
Liu et al. A lightweight lidar-camera sensing method of obstacles detection and classification for autonomous rail rapid transit
CN116206308A (en) Semantic segmentation method and device for point cloud data and excavator
KR20230120116A (en) Electronic device for detecting object and method for controlling the same
CN115937817A (en) Target detection method and system and excavator
CN115565072A (en) Road garbage recognition and positioning method and device, electronic equipment and medium
CN117152240A (en) Object detection method, device, equipment and storage medium based on monocular camera
Katare et al. Autonomous embedded system enabled 3-D object detector:(With point cloud and camera)

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination