CN114359868A - Method and device for detecting 3D point cloud target - Google Patents

Method and device for detecting 3D point cloud target

Info

Publication number
CN114359868A
CN114359868A
Authority
CN
China
Prior art keywords
model
target
point cloud
detection
error
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111679534.8A
Other languages
Chinese (zh)
Inventor
郭昌野
王宇
耿真
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
FAW Group Corp
Original Assignee
FAW Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by FAW Group Corp filed Critical FAW Group Corp
Priority to CN202111679534.8A priority Critical patent/CN114359868A/en
Publication of CN114359868A publication Critical patent/CN114359868A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems


Abstract

The invention discloses a method and a device for detecting a 3D point cloud target. The method comprises the following steps: acquiring three-dimensional point cloud data of the driving road while an autonomous vehicle is driving, wherein the autonomous vehicle is equipped with a low-power-consumption embedded platform and the three-dimensional point cloud data of the driving road are collected by an installed laser radar sensor; and inputting the three-dimensional point cloud data of the driving road into an optimization model of a 3D point cloud target detection model and identifying at least one type of target information within the area of the driving road, wherein the optimization model of the 3D point cloud target detection model runs on the low-power-consumption embedded platform. The invention solves the technical problem that 3D target detection models have a narrow range of application.

Description

Method and device for detecting 3D point cloud target
Technical Field
The invention relates to the field of vehicles, in particular to a method and a device for detecting a 3D point cloud target.
Background
At present, to meet the requirements of environmental perception, autonomous vehicles are often equipped with a laser radar sensor that obtains real-time three-dimensional point cloud data of the road. The point cloud data can be fed into a 3D point cloud target detection algorithm, and a deep-learning-based 3D point cloud target detection algorithm realizes an end-to-end perception function, thereby obtaining various kinds of target information about the surrounding environment and providing accurate information for the environmental perception of the autonomous vehicle.
In lidar-based autonomous driving schemes, general-purpose chip solutions such as graphics processors still dominate. However, although graphics processors are highly versatile, fast, and efficient, they also consume a great deal of power and are therefore poorly suited to autonomous vehicles; for this reason, artificial intelligence processors have been created specifically for autonomous driving. Although such processors can achieve high computing power, high performance, and low power consumption, their design structure is complex and places high demands on model portability. Even though many high-precision 3D target detection models have appeared in current research, they are limited by factors such as computing power, portability, and precision and cannot be deployed on autonomous vehicles at scale, so the technical problem remains that 3D target detection models have a narrow range of application.
No effective solution has yet been proposed for the prior-art problem that 3D target detection models have a narrow range of application.
Disclosure of Invention
The embodiments of the invention provide a method and a device for detecting a 3D point cloud target, which at least solve the technical problem that 3D target detection models have a narrow range of application.
According to an aspect of the embodiments of the present invention, there is provided a method for detecting a 3D point cloud target, including: acquiring an original model of a 3D point cloud target detection model; obtaining a backbone network model by cutting an original model; preprocessing a backbone network model by combining the light-weight point cloud characteristics to generate a target model; and generating the total error of the target model based on the detection error of the original model and the detection error of the target model.
Optionally, the method further comprises: cutting the heavyweight backbone network part of the original model with a pruning algorithm to obtain the backbone network model, wherein the backbone network model serves as the lightweight backbone network of the target model.
Optionally, multiple rounds of iterative training are performed on the original model based on a preset loss function, and when the model precision of the original model after the multiple rounds of iterative training reaches the target precision value, the optimized original model is generated, wherein the loss function is based on a stochastic gradient descent algorithm.
Optionally, before inputting the three-dimensional point cloud data of the driving road into the optimization model of the 3D point cloud target detection model, the method further comprises: after performing multiple rounds of iterative training on the original model, if the total error of the target model has fallen to the target value, verifying the model precision of the original model; if the model precision reaches the target precision value, determining that the precision of the target model has also reached the target precision value; and performing fixed-point compression on the target model that reaches the target precision value to generate the optimization model of the 3D point cloud target detection model.
Optionally, generating the total error of the target model based on the detection error of the original model and the detection error of the target model includes: obtaining the detection error L_teacher of the original model, where L_teacher = L_reg + λ·L_cls, L_reg is the error between the detection-frame regression values and the labeling information, L_cls is the error between the detection-frame classification values and the labeling information, and λ characterizes the weight of the detection branch of the original model; obtaining the detection error L_student of the target model, where L_student = L_reg + λ1·L_cls and λ1 characterizes the weight of the target model branch; passing the classification feature maps of the original model and the target model through an activation layer respectively and calculating the root mean square error L_hm; using the size and center-point detection heads of the original model and the target model to calculate the error absolute values L_wlh and L_xyz of the two models; generating the distillation error between the two models, L_kd = L_hm + λ2·L_xyz + λ3·L_wlh, based on the root mean square error and the error absolute values L_wlh and L_xyz; and, during the multiple rounds of iterative training of the original model, synchronously performing multiple rounds of iterative training on the target model based on the distillation error to obtain the total error of the target model, wherein, when the total error of the target model falls to the target value and the model precision reaches the target precision value, the precision of the target model is determined to have also reached the target precision value.
According to another aspect of the embodiments of the present invention, there is also provided a device for detecting a 3D point cloud target, including: the system comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring three-dimensional point cloud data of a driving road in the driving process of an automatic driving vehicle, the automatic driving vehicle is provided with a low-power-consumption embedded platform, and the three-dimensional point cloud data of the driving road is acquired through an installed laser radar sensor; the identification module is used for inputting the three-dimensional point cloud data of the driving road into an optimization model of the 3D point cloud target detection model and identifying at least one type of target information in the area range of the driving road; the optimization model of the 3D point cloud target detection model is a model running on a low-power-consumption embedded platform.
Optionally, the apparatus further comprises: the second acquisition module is used for acquiring an original model of the 3D point cloud target detection model; the cutting module is used for obtaining a backbone network model by cutting the original model; the preprocessing module is used for preprocessing the trunk network model by combining the lightweight point cloud characteristics to generate a target model; and the generating module is used for generating the total error of the target model based on the detection error of the original model and the detection error of the target model.
Optionally, the cropping module comprises: and the sub-cutting module is used for cutting the heavy-weight trunk network part in the original model by adopting a pruning algorithm to obtain a trunk network model, wherein the trunk network model is a light-weight trunk network in the target model.
Optionally, the apparatus further comprises: and the training module is used for carrying out multi-round iterative training on the original model based on a preset loss function, and generating the optimized original model when the model precision of the original model after the multi-round iterative training reaches a target precision value, wherein the loss function is a function based on a random gradient descent algorithm.
According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium. The computer readable storage medium includes a stored program, wherein when the program runs, the apparatus where the computer readable storage medium is located is controlled to execute the method for detecting the 3D point cloud target according to the embodiment of the present invention.
According to another aspect of the embodiments of the present invention, there is also provided a processor. The processor is used for running a program, wherein the program executes the detection method of the 3D point cloud target in the embodiment of the invention during running.
In the embodiment of the invention, three-dimensional point cloud data of the driving road are obtained while the autonomous vehicle is driving, wherein the autonomous vehicle is equipped with a low-power-consumption embedded platform and the three-dimensional point cloud data of the driving road are collected by an installed laser radar sensor; the three-dimensional point cloud data of the driving road are input into the optimization model of the 3D point cloud target detection model, and at least one type of target information within the area of the driving road is identified; the optimization model of the 3D point cloud target detection model runs on the low-power-consumption embedded platform. That is to say, based on the low-power-consumption embedded platform, the three-dimensional point cloud data of the driving road are collected by the installed laser radar sensor while the autonomous vehicle is driving and are input into the optimization model of the 3D point cloud target detection model on the low-power-consumption embedded platform, so that at least one type of target information within the area of the driving road is identified. This achieves the technical effect of expanding the range of application of 3D target detection models and solves the technical problem that 3D target detection models have a narrow range of application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of a method of detecting a 3D point cloud target according to an embodiment of the invention;
fig. 2 is a schematic diagram of a detection apparatus for a 3D point cloud target according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
In accordance with an embodiment of the present invention, there is provided an embodiment of a method for detecting a 3D point cloud target, it is noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as a set of computer executable instructions, and that while a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different than here.
Fig. 1 is a flowchart of a method for detecting a 3D point cloud target according to an embodiment of the present invention, and the method for detecting a 3D point cloud target shown in fig. 1 includes the following steps:
and S102, acquiring three-dimensional point cloud data of a driving road in the driving process of the automatic driving vehicle.
In the technical scheme provided by the step S102 of the present invention, a low power consumption embedded platform is installed in the autonomous vehicle, and three-dimensional point cloud data in the driving road is acquired by the installed laser radar sensor.
In this embodiment, the embedded platform is a special-purpose computer system whose software and hardware modules are flexibly tailored to user requirements (such as function, reliability, cost, volume, power consumption, and environment). By measuring the propagation distance between the sensor transmitter and the target object, the laser radar sensor analyzes the amount of energy reflected from the object's surface and the amplitude, frequency, phase, and other properties of the reflected spectrum, thereby presenting three-dimensional point cloud data collected while the autonomous vehicle is driving. The laser radar sensor may be, for example, a binocular camera or a three-dimensional scanner.
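The distance measurement described above rests on the time-of-flight principle. A minimal sketch (my illustration, not part of the patent): the range is half the round-trip time of the pulse multiplied by the speed of light.

```python
C = 299_792_458.0  # speed of light in m/s

def tof_range(round_trip_seconds: float) -> float:
    """Distance to the target from the measured round-trip time of a pulse."""
    return C * round_trip_seconds / 2.0

# A pulse that echoes back after 1 microsecond corresponds to roughly 150 m.
print(round(tof_range(1e-6), 1))  # → 149.9
```

Real lidar processing additionally uses the returned amplitude and spectrum, as the paragraph above notes; this snippet shows only the geometric part.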
Optionally, the three-dimensional point cloud data of the driving road are a set of points in a given coordinate system obtained by the laser radar sensor, including the three-dimensional coordinates X, Y, and Z together with color, classification value, intensity value, time, and so on. For example, three-dimensional point cloud data can be created from images taken by a binocular camera together with the camera's intrinsic parameters.
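The per-point layout just described can be sketched as a simple array, one row per point (the values below are illustrative, not from the patent):

```python
import numpy as np

# Each row: x, y, z coordinates, reflection intensity, and a timestamp.
point_cloud = np.array([
    # x,    y,    z,   intensity, time
    [12.4, -3.1, 0.8,  0.55,      0.00],
    [12.6, -3.0, 0.9,  0.61,      0.01],
    [30.2,  7.7, 1.5,  0.12,      0.01],
], dtype=np.float32)

xyz = point_cloud[:, :3]       # three-dimensional coordinates
intensity = point_cloud[:, 3]  # reflection intensity
print(point_cloud.shape)       # three points, five attributes each
```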
Optionally, the collecting of the three-dimensional point cloud data of the driving road may be collecting information such as a state of an autonomous vehicle, traffic flow information, road conditions, traffic signs, and the like.
Step S104: inputting the three-dimensional point cloud data of the driving road into the optimization model of the 3D point cloud target detection model and identifying at least one type of target information within the area of the driving road.
In the technical scheme provided in step S104 of the present invention, the three-dimensional point cloud data of the driving road are input into the optimization model of the 3D point cloud target detection model, and at least one type of target information within the area of the driving road is identified, wherein the optimization model of the 3D point cloud target detection model runs on the low-power-consumption embedded platform.
Optionally, the optimization model of the 3D point cloud target detection model may be a model built according to system requirements that runs on the low-power-consumption embedded platform; for example, the three-dimensional point cloud data of the driving road are input, and the optimized result is obtained through data processing.
Alternatively, the target information may be information such as a state of the autonomous vehicle, traffic flow information, road conditions, a traffic sign, and the like, or information such as a pedestrian, a vehicle, a building, and the like encountered during the driving of the autonomous vehicle, which is not specifically limited herein.
In steps S102 to S104, three-dimensional point cloud data of the driving road are obtained while the autonomous vehicle is driving, wherein the autonomous vehicle is equipped with a low-power-consumption embedded platform and the three-dimensional point cloud data of the driving road are collected by a laser radar sensor; the three-dimensional point cloud data of the driving road are input into the optimization model of the 3D point cloud target detection model, and at least one type of target information within the area of the driving road is identified; the optimization model of the 3D point cloud target detection model runs on the low-power-consumption embedded platform. That is to say, based on the low-power-consumption embedded platform, the three-dimensional point cloud data of the driving road are collected by the installed laser radar sensor while the autonomous vehicle is driving and are input into the optimization model of the 3D point cloud target detection model on the low-power-consumption embedded platform, so that at least one type of target information within the area of the driving road is identified. This achieves the technical effect of expanding the range of application of 3D target detection models and solves the technical problem that 3D target detection models have a narrow range of application.
The above-described method of this embodiment is further described below.
As an optional implementation manner, the method further includes: acquiring an original model of a 3D point cloud target detection model; obtaining a backbone network model by cutting an original model; preprocessing a backbone network model by combining the light-weight point cloud characteristics to generate a target model; and generating the total error of the target model based on the detection error of the original model and the detection error of the target model.
In this embodiment, the original model is used to guide the training of other models and is also referred to as the teacher model. Its inputs may include the three-dimensional coordinates X, Y, and Z as well as intensity values, time, and so on, and it is constructed by combining 3D sparse convolutional coding, a heavyweight backbone network, and multiple sets of detection heads. The 3D sparse convolutional coding is used to reconstruct accurate information codes from sparse, irregular data. Each detection head consists of a classification branch and a regression branch: the classification branch represents the category of the detected information, such as person, vehicle, or building, and the regression branch represents features of the detected information, such as length, width, height, depth, and orientation angle.
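The two-branch detection head described above can be sketched as follows. This is a hypothetical data layout for illustration only; the names and fields are mine, not the patent's actual structures.

```python
def make_detection(category: str, x, y, z, length, width, height, yaw):
    """Illustrative detection combining a classification and a regression branch."""
    return {
        "classification": category,           # e.g. person, vehicle, building
        "regression": {
            "center": (x, y, z),              # center point of the box
            "size": (length, width, height),  # box dimensions
            "orientation_angle": yaw,         # heading of the box
        },
    }

det = make_detection("vehicle", 12.4, -3.1, 0.8, 4.5, 1.8, 1.5, 0.2)
print(det["classification"], det["regression"]["size"])
```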
Optionally, the backbone network model is obtained by cutting the original model. Cutting simplifies the model and thus speeds up its execution; a pruning algorithm can clip off connections with small weights to simplify the model, and the backbone network model may be a model that retains only the key points. For example, by retaining the key points of the original input and cutting out unimportant connections, a network model that keeps only the key points is obtained; shrinking the model in this way improves the efficiency of the algorithm and shortens its running time.
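Clipping off small-weight connections can be sketched as magnitude-based pruning. This is a minimal sketch of that general technique, not the patent's actual pruning algorithm:

```python
import numpy as np

def prune_weights(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0     # clip small connections
    return pruned

w = np.array([[0.01, -0.8], [0.5, -0.02]])
print(prune_weights(w, 0.5))  # the two smallest-magnitude weights become zero
```

In practice the pruned weights would be removed from the network structure (not merely zeroed) so the backbone actually shrinks.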
Optionally, this embodiment preprocesses the backbone network model in combination with lightweight point cloud features to generate the target model. The preprocessing may combine bird's-eye-view point cloud grids, and the target model is formed from the cut backbone network model and the detection heads. The target model is also called the 3D point cloud target detection model.
As an alternative embodiment, the method further includes: cutting the heavyweight backbone network part of the original model with a pruning algorithm to obtain the backbone network model, wherein the backbone network model serves as the lightweight backbone network of the target model.
Alternatively, the pruning algorithm may process the graph so that only the appropriate region is retained; for example, for a rendered point, if the point lies outside the boundary, the point is culled.
As an optional embodiment, the original model is subjected to multiple rounds of iterative training based on a preset loss function, and when the model precision of the original model after the multiple rounds of iterative training reaches the target precision value, the optimized original model is generated, wherein the loss function is based on a stochastic gradient descent algorithm.
In this embodiment, a loss function may be set for the original model, and the model is solved and evaluated by minimizing this loss function. The original model is trained with stochastic gradient descent: multiple rounds of iterative training are performed based on the preset loss function, and when the model precision of the trained original model reaches the target precision value, the optimized original model is generated, which includes a higher-precision 3D point cloud target detection student model and a higher-precision 3D point cloud target detection teacher model.
Alternatively, the target precision may be a specific precision value supplied as input, or the model precision of the original model may be considered to have reached the target precision value when it no longer increases after multiple rounds of iterative training.
For example, in a machine learning algorithm, a loss function is constructed for the original model and then minimized with a gradient-descent-based optimization algorithm; when the model precision of the original model stops rising during training, the model precision of the trained original model has reached the target precision value, and the optimized original model is generated.
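The loop just described can be sketched on a toy one-parameter loss. This is a minimal illustration of gradient descent with a stop-when-no-improvement criterion, not the patent's actual training procedure:

```python
def train(lr=0.1, max_rounds=1000, tol=1e-9):
    """Minimize the toy loss L(w) = (w - 2)^2 by gradient descent."""
    w = 5.0                          # initial parameter
    prev_loss = float("inf")
    for _ in range(max_rounds):
        loss = (w - 2.0) ** 2
        if prev_loss - loss < tol:   # loss has stopped improving: stop training
            break
        grad = 2.0 * (w - 2.0)       # dL/dw
        w -= lr * grad               # gradient-descent update
        prev_loss = loss
    return w

print(round(train(), 3))  # converges near the minimizer w = 2
```

Stochastic gradient descent differs only in that each update uses the gradient on a random mini-batch of training data rather than the full loss.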
As an alternative embodiment, before inputting the three-dimensional point cloud data of the driving road into the optimization model of the 3D point cloud target detection model, the method further comprises: after performing multiple rounds of iterative training on the original model, if the total error of the target model has fallen to the target value, verifying the model precision of the original model; if the model precision reaches the target precision value, determining that the precision of the target model has also reached the target precision value; and performing fixed-point compression on the target model that reaches the target precision value to generate the optimization model of the 3D point cloud target detection model.
In this embodiment, after multiple rounds of iterative training of the original model, if the total error of the target model has fallen to the target value, the model precision of the original model is verified, and if the model precision reaches the target precision value, the target model that reaches the target precision value is fixed-point compressed to generate the optimization model of the 3D point cloud target detection model.
In this embodiment, the total error of the target model may also be referred to as a cross-entropy loss function, where the target value to which the total error falls may be a specific value input in advance, or may be taken to be the value at which the total error no longer decreases after multiple rounds of iterative training.
Optionally, the optimization model of the 3D point cloud target detection model may be the finally output 3D point cloud target detection model, fixed-point compressed with a quantization-aware training method, that can run on the low-power-consumption embedded end.
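Fixed-point compression can be sketched with simple symmetric int8 quantization. This is a minimal sketch of the general idea, assuming per-tensor symmetric scaling; the patent's quantization-aware training is more involved:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Map float weights to int8; return the quantized values and the scale."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the fixed-point representation."""
    return q.astype(np.float32) * scale

w = np.array([0.9, -0.35, 0.02, -1.27], dtype=np.float32)
q, s = quantize_int8(w)
print(np.max(np.abs(dequantize(q, s) - w)))  # round-trip error below one step
```

Quantization-aware training simulates this rounding during training so the model learns weights that remain accurate after compression.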
As an alternative embodiment, generating the total error of the target model based on the detection error of the original model and the detection error of the target model includes: obtaining the detection error L_teacher of the original model, where L_teacher = L_reg + λ·L_cls, L_reg is the error between the detection-frame regression values and the labeling information, L_cls is the error between the detection-frame classification values and the labeling information, and λ characterizes the weight of the detection branch of the original model; obtaining the detection error L_student of the target model, where L_student = L_reg + λ1·L_cls and λ1 characterizes the weight of the target model branch; passing the classification feature maps of the original model and the target model through an activation layer respectively and calculating the root mean square error L_hm; using the size and center-point detection heads of the original model and the target model to calculate the error absolute values L_wlh and L_xyz of the two models; generating the distillation error between the two models, L_kd = L_hm + λ2·L_xyz + λ3·L_wlh, based on the root mean square error and the error absolute values L_wlh and L_xyz; and, during the multiple rounds of iterative training of the original model, synchronously performing multiple rounds of iterative training on the target model based on the distillation error to obtain the total error of the target model, wherein, when the total error of the target model falls to the target value and the model precision reaches the target precision value, the precision of the target model is determined to have also reached the target precision value.
In this embodiment, the original model detection error L is obtainedteacherWherein L isteacher=Lreg+λLclsLreg is the error of the regression value of the detection frame and the labeled information, Llcs is the error of the classification value of the detection frame and the labeled information, and lambda represents the weight of the detection branch of the original model.
In this embodiment, the detection error L_student of the target model is obtained, wherein L_student = L_reg + λ_1·L_cls, L_reg is the error between the detection-frame regression values and the labeling information, L_cls is the error between the detection-frame classification values and the labeling information, and λ_1 characterizes the weight of the target-model branch.
In this embodiment, the classification feature maps of the original model and the target model are each passed through an activation layer, and the root mean square error, denoted L_hm, is calculated; the size and center-point detection heads of the original model and the target model are calculated to obtain the error absolute values L_wlh and L_xyz of the two models; based on the error absolute values L_wlh and L_xyz of the original model and the target model, the distillation error between the two models is obtained as L_kd = L_hm + λ_2·L_xyz + λ_3·L_wlh.
In this embodiment, in the process of performing multiple rounds of iterative training on the original model, multiple rounds of iterative training are synchronously performed on the target model based on the distillation error to obtain the total error of the target model, L_total = L_student + λ_4·L_kd, wherein λ_4 characterizes the weight of the distillation branch between the two models.
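The error terms of this embodiment can be sketched numerically. The sketch below is a minimal illustration, not the patent's implementation: it assumes a sigmoid activation layer, and the weights lam2, lam3, lam4 (standing for λ_2, λ_3, λ_4, whose numeric values the text does not specify) are placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def distillation_error(t_cls, s_cls, t_wlh, s_wlh, t_xyz, s_xyz,
                       lam2=1.0, lam3=1.0):
    """Sketch of L_kd = L_hm + lam2*L_xyz + lam3*L_wlh.

    t_* / s_* are teacher / student head outputs for the same batch.
    """
    # Classification feature maps pass through an activation layer
    # (sigmoid assumed here), then a root mean square error L_hm.
    l_hm = np.sqrt(np.mean((sigmoid(t_cls) - sigmoid(s_cls)) ** 2))
    # Size (w, l, h) and center-point (x, y, z) heads: mean absolute error.
    l_wlh = np.mean(np.abs(t_wlh - s_wlh))
    l_xyz = np.mean(np.abs(t_xyz - s_xyz))
    return l_hm + lam2 * l_xyz + lam3 * l_wlh

def student_total_error(l_student, l_kd, lam4=1.0):
    # L_total = L_student + lam4 * L_kd
    return l_student + lam4 * l_kd
```

When the teacher and student outputs coincide, the distillation error vanishes and the total error reduces to the student's own detection error, which matches the intent of the formulation.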
Optionally, in this embodiment, when the total error of the target model falls to the target value, the model accuracy of the target model is verified; if the model accuracy reaches the target accuracy value, a 3D point cloud target detection student model with higher accuracy is obtained.
In this embodiment, three-dimensional point cloud data of a driving road are acquired during the driving process of an automatic driving vehicle, wherein the automatic driving vehicle is provided with a low-power-consumption embedded platform, and the three-dimensional point cloud data of the driving road are acquired through an installed laser radar sensor; the three-dimensional point cloud data of the driving road are input into an optimization model of a 3D point cloud target detection model, and at least one type of target information located in the area range of the driving road is identified; the optimization model of the 3D point cloud target detection model is a model running on the low-power-consumption embedded platform. That is to say, based on the low-power-consumption embedded platform, the method acquires the three-dimensional point cloud data of the driving road through the installed laser radar sensor during the driving process of the automatic driving vehicle and inputs the data into the optimization model of the 3D point cloud target detection model on the platform, so as to identify at least one type of target information located in the area range of the driving road, thereby achieving the technical effect of expanding the range of use of the 3D target detection model and solving the technical problem that the range of use of the 3D target detection model is small.
Example 2
The technical solutions of the embodiments of the present invention will be illustrated below with reference to preferred embodiments.
The artificial intelligence chip is one of the technical cores of the artificial intelligence era and determines the software and hardware infrastructure and the development ecology of the automatic driving computing platform. To ensure that an automatic driving vehicle can make correct judgments in different scenarios and meet the requirements of the vehicle decision system, the surrounding environment information, including but not limited to the state of the vehicle, traffic flow information, road conditions and traffic signs, needs to be acquired and recognized dynamically in real time. To meet this environment perception requirement, automatic driving vehicles are often equipped with a laser radar sensor to obtain real-time three-dimensional point cloud data of the road; the point cloud data can be fed into a 3D point cloud target detection algorithm, and a deep-learning-based 3D point cloud target detection algorithm realizes an end-to-end perception function, thereby obtaining various kinds of target information about the surrounding environment and providing accurate information for the environment perception of the automatic driving vehicle.
In automatic driving computing, general-purpose chip solutions such as graphics processors still occupy a mainstream position. However, although graphics processors are highly versatile, fast and efficient, they also have high power consumption and are therefore poorly suited to automatic driving vehicles; for this reason, artificial intelligence processors created specifically for automatic driving have emerged. Although an artificial intelligence processor created for automatic driving can achieve high computing power, high performance and low power consumption, such a processor has a complex design structure and places high demands on model portability. Even though many high-precision 3D target detection models have appeared in current research, they are limited by factors such as computing power, portability and precision and cannot be used at scale on automatic driving vehicles, so the technical problem exists that the range of use of the 3D target detection model is small.
Meanwhile, 3D point cloud target detection algorithms running on automatic driving vehicles have the following disadvantages: (1) complex structure, heavy computation and high latency, so that only a high-computing-power platform can meet the required detection frame rate, which increases cost and power consumption; (2) poor portability, since some operators depend on specific hardware and can only be deployed on one or a few dedicated platforms; (3) floating-point precision computation is adopted to meet the accuracy requirement, so embedded platforms specially designed for low-precision computation cannot exert their full computing capacity, which further increases detection latency.
In one related technology, a voxel feature encoding (VFE) algorithm based on an embedded FPGA platform can be realized, implementing the preprocessing algorithm of 3D point cloud target detection on an embedded platform; however, the algorithm is limited to the embedded FPGA platform, and the problem that the range of use of the 3D target detection model is small still exists.
In another related technology, a point cloud feature map is input into a 3D target detection model, the 3D target detection model is trained, the trained 3D target detection model is obtained, and target detection is realized based on the trained 3D target detection model.
To overcome the above problems, this embodiment designs a 3D point cloud target detection method that can run on a low-power-consumption embedded platform while achieving high precision and low latency. A high-computing-power, high-precision teacher model is cut down to obtain a lightweight student-model backbone network, which is combined with lightweight point cloud feature preprocessing to realize a highly portable, low-computing-power student model; by distilling the teacher model, the precision of the student model reaches the same level as that of the teacher model; a fixed-point model is obtained by a quantization-aware training method; finally, a 3D point cloud target detection model with low latency and high precision can run on a low-power-consumption embedded terminal, which increases the running speed, keeps high accuracy, achieves high portability across various neural network processing chips, and solves the problem that the range of use of the 3D target detection model is small.
In this embodiment, the detection method of the 3D point cloud target consists of four parts. (1) Teacher model training: construct and train a high-precision 3D target detection teacher model; the teacher model realizes high-performance radar perception and, although it cannot be directly migrated to an embedded platform because of computing power and operator limitations, it serves as a sample model that reaches higher precision through training. (2) Cutting the teacher model into a student model: cut the teacher model obtained in the first part so that the resulting lightweight model meets the computing power requirement of the embedded platform. (3) Student model training: migrate the feature extraction capability of the teacher model into the student model by means of knowledge distillation transfer learning, so that the student model absorbs the detection capability of the teacher model; the teacher model guides the learning, improves the generalization of the student network and accelerates convergence, finally yielding a student model that also reaches higher precision. (4) Fixed-point quantization of the student model: to further compress the student model, quantize it into a fixed-point model with almost no precision loss; specifically, the continuous- or discrete-valued floating-point parameters (weights or tensors) in the network are linearly mapped to fixed-point approximate discrete values that replace the original floating-point data, while the inputs and outputs remain floating point, so as to reduce the model size, reduce the memory consumption of the model and accelerate model inference. The method is further described below.
Step 1, constructing a 3D point cloud target detection teacher model by combining 3D sparse convolution coding, a heavyweight trunk network and a plurality of groups of detection heads;
Step 2, setting a loss function for the teacher model, wherein L_reg is the error between the detection-frame regression values and the labeling information, L_cls is the error between the detection-frame classification values and the labeling information, and the total error is L_teacher = L_reg + λ·L_cls; training the teacher model by a stochastic gradient descent method, and stopping training when, after multiple rounds of iterative training, the total error has fallen to a certain degree; by verifying the precision of the teacher model, obtaining a high-precision 3D point cloud target detection teacher model when the precision meets the requirement;
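The stop criterion in step 2 — iterate stochastic gradient descent until the total error has fallen to a certain degree — can be sketched generically. In the sketch below, the quadratic toy loss merely stands in for L_teacher = L_reg + λ·L_cls; the function names, learning rate and tolerance are illustrative assumptions, not values from the patent.

```python
import numpy as np

def sgd_train(loss_grad, w0, lr=0.1, tol=1e-3, max_rounds=1000):
    """Minimal gradient-descent loop: stop once the total error has
    fallen below `tol` (the "certain degree" in step 2)."""
    w = np.asarray(w0, dtype=float)
    loss = float("inf")
    for _ in range(max_rounds):
        loss, grad = loss_grad(w)   # total error and its gradient
        if loss < tol:
            break                   # error fell far enough: stop training
        w = w - lr * grad           # one descent update
    return w, loss

def quad_loss(w):
    # Toy quadratic loss ||w||^2 standing in for L_teacher.
    return float(w @ w), 2.0 * w
```

In the actual method, each `loss_grad` evaluation would be a forward/backward pass over a mini-batch, and training would be followed by the precision verification described above.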
Step 3, cutting the heavyweight trunk network part in the teacher model by using a pruning algorithm to obtain a lightweight trunk network in the student model;
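Step 3 does not fix a particular pruning criterion. A common choice is magnitude-based channel pruning, sketched below under the assumption of L1-norm filter scoring; the function name and `keep_ratio` parameter are illustrative, not from the patent.

```python
import numpy as np

def prune_channels(conv_w, keep_ratio=0.5):
    """Magnitude-based channel pruning sketch.

    conv_w: convolution weights of shape (out_channels, in_channels, k, k).
    Returns the weights restricted to the strongest output channels,
    plus the indices of the kept channels (needed to slice the next layer).
    """
    out_ch = conv_w.shape[0]
    n_keep = max(1, int(round(out_ch * keep_ratio)))
    # Score each output channel by the L1 norm of its filter.
    scores = np.abs(conv_w).reshape(out_ch, -1).sum(axis=1)
    # Keep the top-scoring channels, in their original order.
    keep = np.sort(np.argsort(scores)[::-1][:n_keep])
    return conv_w[keep], keep
```

Applying this layer by layer to the heavyweight trunk network yields a narrower, lightweight trunk whose computation fits the embedded platform's budget.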
Step 4, constructing a 3D point cloud target detection student model by combining bird's-eye-view point cloud grid preprocessing with the cut lightweight trunk network and the detection heads;
step 5, and the teacher in step 2The same model is used, and the detection error of the student model is Lstudent=Lreg1Lcls. And inputting the data of the same batch to respectively obtain the output of the teacher model and the output of the student model. Respectively passing the classification characteristic graphs output by the two models through an activation layer, calculating the root mean square error, and recording as Lhm. Calculating the error of the absolute value of the error of the size and the detection head of the central point output by the two models to respectively obtain Lwlh,Lxyz. So that the distillation error is Lkd=Lhm2Lxyz3Lwlh. Thereby obtaining the total error L of the student modelTotal error=Lstudent4Lkd
Step 6, training the student model by a stochastic gradient descent method; after multiple rounds of iterative training, when the overall error of the student model has fallen to a certain degree, verifying the precision of the student model; when the precision meets the requirement, obtaining a 3D point cloud target detection student model that also reaches higher precision;
Step 7, performing fixed-point compression on the student model by a quantization-aware training method, and finally outputting a 3D point cloud target detection model that can run on a low-power-consumption embedded terminal.
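The fixed-point step — linearly mapping floating-point parameters to fixed-point discrete values while inputs and outputs stay floating point — can be sketched with a symmetric int8 mapping. The 127-level range is an assumption for illustration; the patent does not specify a bit width or quantization scheme.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric linear mapping of float weights to int8: x ≈ scale * q."""
    max_abs = float(np.abs(x).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Inputs and outputs remain floating point; only the stored
    # parameters are fixed-point.
    return q.astype(np.float32) * scale
```

The per-element quantization error of such a mapping is bounded by half a quantization step, which is why the compression causes almost no precision loss when combined with quantization-aware training.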
In this embodiment, three-dimensional point cloud data of a driving road are acquired during the driving process of an automatic driving vehicle, wherein the automatic driving vehicle is provided with a low-power-consumption embedded platform, and the three-dimensional point cloud data of the driving road are acquired through an installed laser radar sensor; the three-dimensional point cloud data of the driving road are input into an optimization model of a 3D point cloud target detection model, and at least one type of target information located in the area range of the driving road is identified; the optimization model of the 3D point cloud target detection model is a model running on the low-power-consumption embedded platform. That is to say, based on the low-power-consumption embedded platform, the method acquires the three-dimensional point cloud data of the driving road through the installed laser radar sensor during the driving process of the automatic driving vehicle and inputs the data into the optimization model of the 3D point cloud target detection model on the platform, so as to identify at least one type of target information located in the area range of the driving road, thereby achieving the technical effect of expanding the range of use of the 3D target detection model and solving the technical problem that the range of use of the 3D target detection model is small.
Example 3
According to the embodiment of the invention, the invention further provides a detection apparatus for the 3D point cloud target. It should be noted that the detection apparatus for a 3D point cloud target can be used to execute the detection method for a 3D point cloud target in embodiment 1.
Fig. 2 is a schematic diagram of a detection apparatus for a 3D point cloud target according to an embodiment of the present invention. As shown in fig. 2, the apparatus 200 for detecting a 3D point cloud target may include: a first acquisition module 201 and a recognition module 202.
The first acquisition module 201 is used for acquiring three-dimensional point cloud data of a driving road during the driving process of an automatic driving vehicle, wherein the automatic driving vehicle is provided with a low-power-consumption embedded platform, and the three-dimensional point cloud data of the driving road are acquired through an installed laser radar sensor.
The identification module 202 is used for inputting the three-dimensional point cloud data of the driving road into an optimization model of the 3D point cloud target detection model and identifying at least one type of target information in the area range of the driving road; the optimization model of the 3D point cloud target detection model is a model running on a low-power-consumption embedded platform.
Optionally, the second obtaining module is configured to obtain an original model of the 3D point cloud target detection model.
Optionally, the cutting module is configured to obtain the backbone network model by cutting the original model.
Optionally, the preprocessing module is configured to combine the lightweight point cloud feature to preprocess the backbone network model, so as to generate the target model.
Optionally, the generating module is configured to generate an overall error of the target model based on the detection error of the original model and the detection error of the target model.
Optionally, the cropping module comprises: and the sub-cutting module is used for cutting the heavy-weight trunk network part in the original model by adopting a pruning algorithm to obtain a trunk network model, wherein the trunk network model is a light-weight trunk network in the target model.
Optionally, the training module is configured to perform multiple rounds of iterative training on the original model based on a preset loss function, and to generate the optimized original model when the model precision of the original model after the multiple rounds of iterative training reaches a target precision value, wherein the loss function is a function based on a stochastic gradient descent algorithm.
Optionally, the generating module includes a first generating unit configured to verify the model precision of the original model if, after multiple rounds of iterative training of the original model, the total error of the target model has fallen to the target value; if the model precision reaches the target precision value, to determine that the precision of the target model also reaches the target precision value; and to perform fixed-point compression on the target model that reaches the target precision value to generate the optimization model of the 3D point cloud target detection model.
Optionally, the second obtaining module includes a first obtaining unit configured to obtain the detection error L_teacher of the original model, wherein L_teacher = L_reg + λ·L_cls, L_reg is the error between the detection-frame regression values and the labeling information, L_cls is the error between the detection-frame classification values and the labeling information, and λ characterizes the weight of the detection branch of the original model.
Optionally, the second obtaining module includes a second obtaining unit configured to obtain the detection error L_student of the target model, wherein L_student = L_reg + λ_1·L_cls, L_reg is the error between the detection-frame regression values and the labeling information, L_cls is the error between the detection-frame classification values and the labeling information, and λ_1 characterizes the weight of the target-model branch.
Optionally, the generating module includes a second generating unit configured to pass the classification feature maps of the original model and the target model each through an activation layer and calculate the root mean square error L_hm; to calculate the size and center-point detection heads of the original model and the target model to obtain the error absolute values L_wlh and L_xyz of the two models; to generate the distillation error between the two models, L_kd = L_hm + λ_2·L_xyz + λ_3·L_wlh, based on the root mean square error and the error absolute values L_wlh and L_xyz; and, in the process of performing multiple rounds of iterative training on the original model, to synchronously perform multiple rounds of iterative training on the target model based on the distillation error to obtain the total error of the target model, wherein, when the total error of the target model falls to the target value and the model precision reaches the target precision value, the precision of the target model is determined to reach the target precision value.
In the embodiment of the invention, three-dimensional point cloud data of a driving road are obtained during the driving process of an automatic driving vehicle, wherein the automatic driving vehicle is provided with a low-power-consumption embedded platform, and the three-dimensional point cloud data of the driving road are acquired through an installed laser radar sensor; the three-dimensional point cloud data of the driving road are input into an optimization model of a 3D point cloud target detection model, and at least one type of target information located in the area range of the driving road is identified; the optimization model of the 3D point cloud target detection model is a model running on the low-power-consumption embedded platform. That is to say, based on the low-power-consumption embedded platform, the method acquires the three-dimensional point cloud data of the driving road through the installed laser radar sensor during the driving process of the automatic driving vehicle and inputs the data into the optimization model of the 3D point cloud target detection model on the platform, so as to identify at least one type of target information located in the area range of the driving road, thereby achieving the technical effect of expanding the range of use of the 3D target detection model and solving the technical problem that the range of use of the 3D target detection model is small.
Example 4
According to an embodiment of the present invention, there is also provided a computer-readable storage medium including a stored program, wherein the program performs the method for detecting a 3D point cloud target described in embodiment 1.
Example 5
According to an embodiment of the present invention, there is also provided a processor, configured to execute a program, where the program executes the method for detecting a 3D point cloud target described in embodiment 1 during execution.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and improvements can be made without departing from the principle of the present invention, and these modifications and improvements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A detection method of a 3D point cloud target is applied to an automatic driving vehicle and comprises the following steps:
acquiring three-dimensional point cloud data of a driving road in the driving process of an automatic driving vehicle, wherein the automatic driving vehicle is provided with a low-power-consumption embedded platform, and the three-dimensional point cloud data of the driving road is acquired through a laser radar sensor;
inputting the three-dimensional point cloud data of the driving road into an optimization model of a 3D point cloud target detection model, and identifying at least one type of target information located in the area range of the driving road;
and the optimization model of the 3D point cloud target detection model is a model running on the low-power-consumption embedded platform.
2. The method of claim 1, further comprising:
acquiring an original model of a 3D point cloud target detection model;
obtaining a backbone network model by cutting the original model;
preprocessing the backbone network model by combining the light-weighted point cloud characteristics to generate a target model;
and generating an overall error of the target model based on the detection error of the original model and the detection error of the target model.
3. The method of claim 2, wherein a pruning algorithm is used to prune the heavyweight trunk network portion in the original model to obtain the trunk network model, wherein the trunk network model is a lightweight trunk network in the target model.
4. The method according to claim 2, wherein the original model is subjected to multiple rounds of iterative training based on a preset loss function, and when the model precision of the original model after the multiple rounds of iterative training reaches a target precision value, the optimized original model is generated, wherein the loss function is a function based on a stochastic gradient descent algorithm.
5. The method according to any one of claims 2 to 4, wherein before inputting the three-dimensional point cloud data of the travel road into the optimized model of the 3D point cloud target detection model, the method further comprises:
after multiple rounds of iterative training are carried out on the original model, if the total error of the target model is reduced to a target value, the model precision of the original model is verified;
if the model precision reaches a target precision value, determining that the precision of the target model also reaches the target precision value;
and performing fixed point compression on the target model which reaches the target precision value to generate an optimization model of the 3D point cloud target detection model.
6. The method of claim 5, wherein generating an overall error of the target model based on the detection error of the original model and the detection error of the target model comprises:
obtaining the detection error L_teacher of the original model, wherein L_teacher = L_reg + λ·L_cls, L_reg is the error between the detection-frame regression values and the labeling information, L_cls is the error between the detection-frame classification values and the labeling information, and λ characterizes the weight of the detection branch of the original model;
obtaining a detection error L of the target modelstudentWherein, said Lstudent=Lreg1Lcls,LregFor errors in the detection of box regresses and label information, LclsFor detecting errors in the classification values and the label information of the frames, lambda1Weights characterizing the target model branches;
passing the classification feature maps of the original model and the target model each through an activation layer, and calculating the root mean square error L_hm;
calculating the size and center-point detection heads of the original model and the target model to obtain the error absolute values L_wlh and L_xyz of the two models;
generating the distillation error between the two models, L_kd = L_hm + λ_2·L_xyz + λ_3·L_wlh, based on the root mean square error and the error absolute values L_wlh and L_xyz of the original model and the target model;
and in the process of performing multiple rounds of iterative training on the original model, synchronously performing multiple rounds of iterative training on the target model based on the distillation error to obtain the total error of the target model, wherein, when the total error of the target model falls to the target value and the model precision reaches the target precision value, the precision of the target model is determined to also reach the target precision value.
7. A detection device of a 3D point cloud target is applied to an automatic driving vehicle and comprises:
the system comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring three-dimensional point cloud data of a driving road in the driving process of an automatic driving vehicle, the automatic driving vehicle is provided with a low-power-consumption embedded platform, and the three-dimensional point cloud data of the driving road are acquired through an installed laser radar sensor;
the identification module is used for inputting the three-dimensional point cloud data of the driving road into the optimization model of the 3D point cloud target detection model and identifying at least one type of target information located in the area range of the driving road;
and the optimization model of the 3D point cloud target detection model is a model running on the low-power-consumption embedded platform.
8. The apparatus of claim 7, further comprising:
the second acquisition module is used for acquiring an original model of the 3D point cloud target detection model;
the cutting module is used for obtaining a backbone network model by cutting the original model;
the preprocessing module is used for preprocessing the backbone network model by combining light-weighted point cloud characteristics to generate a target model;
and the generating module is used for generating the total error of the target model based on the detection error of the original model and the detection error of the target model.
9. The apparatus of claim 8, wherein the cropping module comprises: and the sub-cutting module is used for cutting the heavy-weight trunk network part in the original model by adopting a pruning algorithm to obtain the trunk network model, wherein the trunk network model is a light-weight trunk network in the target model.
10. The apparatus of claim 8, further comprising: a training module configured to perform multiple rounds of iterative training on the original model based on a preset loss function, and to generate the optimized original model when the model precision of the original model after the multiple rounds of iterative training reaches a target precision value, wherein the loss function is a function based on a stochastic gradient descent algorithm.
CN202111679534.8A 2021-12-31 2021-12-31 Method and device for detecting 3D point cloud target Pending CN114359868A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111679534.8A CN114359868A (en) 2021-12-31 2021-12-31 Method and device for detecting 3D point cloud target

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111679534.8A CN114359868A (en) 2021-12-31 2021-12-31 Method and device for detecting 3D point cloud target

Publications (1)

Publication Number Publication Date
CN114359868A true CN114359868A (en) 2022-04-15

Family

ID=81105056

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111679534.8A Pending CN114359868A (en) 2021-12-31 2021-12-31 Method and device for detecting 3D point cloud target

Country Status (1)

Country Link
CN (1) CN114359868A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115019060A (en) * 2022-07-12 2022-09-06 北京百度网讯科技有限公司 Target recognition method, and training method and device of target recognition model
CN115082690A (en) * 2022-07-12 2022-09-20 北京百度网讯科技有限公司 Target recognition method, target recognition model training method and device


Similar Documents

Publication Publication Date Title
CN110059608B (en) Object detection method and device, electronic equipment and storage medium
JP7430277B2 (en) Obstacle detection method and apparatus, computer device, and computer program
CN111832655B (en) Multi-scale three-dimensional target detection method based on characteristic pyramid network
CN110111345B (en) Attention network-based 3D point cloud segmentation method
CN110796714B (en) Map construction method, device, terminal and computer readable storage medium
CN108470174B (en) Obstacle segmentation method and device, computer equipment and readable medium
CN114359868A (en) Method and device for detecting 3D point cloud target
CN110879994A (en) Three-dimensional visual inspection detection method, system and device based on shape attention mechanism
EP3686798A1 (en) Learning method and learning device for objet detector based on cnn
CN113160062B (en) Infrared image target detection method, device, equipment and storage medium
CN113378760A (en) Training target detection model and method and device for detecting target
CN112419512B (en) Air three-dimensional model repairing system and method based on semantic information
CN112990086A (en) Remote sensing image building detection method and device and computer readable storage medium
JP2023527489A (en) MODEL GENERATION METHOD, OBJECT DETECTION METHOD, APPARATUS, DEVICE, AND STORAGE MEDIUM
EP4174792A1 (en) Method for scene understanding and semantic analysis of objects
CN109829469A (en) A kind of vehicle checking method based on deep learning
CN115861601A (en) Multi-sensor fusion sensing method and device
CN107292039B (en) UUV bank patrolling profile construction method based on wavelet clustering
CN114882316A (en) Target detection model training method, target detection method and device
CN114943870A (en) Training method and device of line feature extraction model and point cloud matching method and device
CN117496477B (en) Point cloud target detection method and device
CN110909656A (en) Pedestrian detection method and system with integration of radar and camera
CN116740668B (en) Three-dimensional object detection method, three-dimensional object detection device, computer equipment and storage medium
CN110516564A (en) Pavement detection method and apparatus
CN114037836A (en) Method for applying artificial intelligence recognition technology to three-dimensional power transmission and transformation engineering measurement and calculation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination