CN113642620A

CN113642620A - Model training and obstacle detection method and device

Info

Publication number: CN113642620A
Application number: CN202110868394.2A
Authority: CN
Inventors: 朱滨
Original assignee: Beijing Sankuai Online Technology Co Ltd
Current assignee: Beijing Sankuai Online Technology Co Ltd
Priority date: 2021-07-30
Filing date: 2021-07-30
Publication date: 2021-11-12
Anticipated expiration: 2041-07-30
Also published as: CN113642620B

Abstract

The specification discloses a model training and obstacle detection method and device. Each frame of historically collected laser point cloud is first obtained as a training sample, and each training sample is labeled according to the obstacle information in that frame of laser point cloud. A plurality of candidate detection frames of each obstacle in each frame of laser point cloud, together with the quality score of each candidate detection frame, are then determined through the network layers of an obstacle detection model to be trained. Next, the candidate detection frames are clustered according to their quality scores to determine a plurality of high-quality detection frames. Finally, model parameters in the obstacle detection model are adjusted with the goal of minimizing the difference between the high-quality detection frames of each obstacle in each training sample and the position marking frame of that obstacle. Because the candidate detection frames are clustered on the basis of their quality scores, the criterion for judging detection-frame quality is adjusted dynamically, and the generalization of the model to different scenes is improved.

Description

Model training and obstacle detection method and device

Technical Field

The application relates to the technical field of machine learning, in particular to a model training and obstacle detection method and device.

Background

In the driving process of the unmanned equipment, the obstacle information of surrounding obstacles needs to be detected in real time so as to carry out obstacle avoidance driving.

When the obstacle detection is carried out, laser point clouds in the surrounding environment can be collected through laser radar equipment, the collected laser point clouds are input into a pre-trained obstacle detection model, and the position detection frame of each obstacle contained in the frame of laser point clouds is determined.

At present, when training an obstacle detection model, a plurality of frames of historically collected laser point clouds can be obtained, and a position marking frame of an obstacle in each frame of laser point cloud is determined. And then, aiming at each frame of laser point cloud, inputting the frame of laser point cloud into a first network layer of an obstacle detection model to be trained, and determining the type of each obstacle and a plurality of candidate detection frames in the frame of laser point cloud.

Then, for each obstacle in the frame of laser point cloud, the intersection ratio between the position marking frame of the obstacle and each of the candidate detection frames of the obstacle is determined, and each candidate detection frame whose intersection ratio is larger than a preset threshold value is determined as a high-quality detection frame. Finally, model parameters of each network layer in the obstacle detection model are adjusted with the goal of minimizing the position deviation between each high-quality detection frame and the position marking frame of the obstacle.

However, the laser point clouds serving as training samples come from a variety of scenes, and when quality screening is performed on the basis of the intersection ratio of the detection frames, the same threshold is applied to laser point clouds collected in different scenes. As a result, the model generalizes poorly to different scenes and its output is inaccurate. For example, if the same quality screening criterion is applied to a scene with densely distributed obstacles and a scene with sparsely distributed obstacles, the distributions of high-quality detection frames in the two scenes differ greatly.

Disclosure of Invention

The embodiment of the specification provides a model training and obstacle detection method and device, which are used for partially solving the problems in the prior art.

The embodiment of the specification adopts the following technical scheme:

the model training method provided by the specification comprises the following steps:

acquiring each frame of historically acquired laser point cloud as a training sample, and marking a position marking frame of each obstacle in each training sample according to obstacle information of each obstacle contained in each frame of laser point cloud;

aiming at each training sample, inputting the laser point cloud in the training sample into a first network layer of an obstacle detection model to be trained, extracting features, and determining the point cloud features of the laser point cloud;

inputting the determined point cloud characteristics of the laser point cloud into a second network layer of the obstacle detection model to be trained, and determining each candidate detection frame of each obstacle in the laser point cloud;

aiming at each candidate detection frame of each obstacle in the laser point cloud, inputting the candidate detection frame and the point cloud characteristics of the laser point cloud into a third network layer of an obstacle detection model to be trained, and determining the quality score of the candidate detection frame;

clustering the candidate detection frames of each obstacle according to the quality score of each candidate detection frame of each obstacle in the laser point cloud, and determining a plurality of high-quality detection frames of each obstacle in the laser point cloud;

determining a loss function according to the difference between each high-quality detection frame of each obstacle in each training sample and the position marking frame of each obstacle in each training sample, and adjusting the model parameters of each network layer in the obstacle detection model by taking the minimum loss function as a target.

Optionally, the obstacle detection model to be trained further includes a fourth network layer;

the method further comprises the following steps:

and inputting the determined point cloud characteristics of the laser point cloud into a fourth network layer of the obstacle detection model to be trained, and determining the type of each obstacle in the laser point cloud.

Optionally, determining the quality score of the candidate detection box specifically includes:

determining a quality score of the position of the candidate detection box output by the third network layer;

and determining the quality score of the candidate detection frame according to the quality score of the type of the obstacle and the quality score of the position of the candidate detection frame.

Optionally, the method further comprises:

determining the intersection ratio between each candidate detection frame of each obstacle in the laser point cloud and the position marking frame of the obstacle as the marking score of the candidate detection frame;

and adjusting the model parameters in the third network layer with the aim of minimizing the difference between the quality score output by the third network layer and the labeling score of the candidate detection frame.

Optionally, before determining the quality score of the candidate detection box, the method further includes:

inputting the candidate detection frame and the point cloud characteristics of the laser point cloud into a fifth network layer of the obstacle detection model to be trained, and determining the corresponding prediction deviation distribution of the candidate detection frame;

and determining the prediction deviation corresponding to the candidate detection frame according to the prediction deviation distribution corresponding to the candidate detection frame, and adjusting the position of the candidate detection frame according to the determined prediction deviation.

The present specification provides a method for detecting an obstacle, including:

acquiring laser point cloud of the surrounding environment;

inputting the acquired laser point cloud into a first network layer of a pre-trained obstacle detection model, performing feature extraction, and determining the point cloud feature of the laser point cloud;

inputting the determined point cloud characteristics of the laser point cloud into a second network layer of a pre-trained obstacle detection model, and determining each candidate detection frame of each obstacle in the laser point cloud;

aiming at each candidate detection frame of each obstacle in the laser point cloud, inputting the candidate detection frame and the point cloud characteristics of the laser point cloud into a third network layer of a pre-trained obstacle detection model, and determining the quality score of the candidate detection frame;

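clustering the candidate detection frames of each obstacle according to the quality score of each candidate detection frame of each obstacle in the laser point cloud, and determining a plurality of high-quality detection frames of each obstacle in the laser point cloud;

and determining a prediction detection frame of each obstacle according to the determined high-quality detection frames of each obstacle and the quality scores of the high-quality detection frames.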

Optionally, the method further comprises:

aiming at each high-quality detection frame of each determined obstacle, inputting point cloud features in the high-quality detection frame into a sixth network layer of the obstacle detection model, and determining the type of the obstacle corresponding to the high-quality detection frame;

and updating the quality score of the high-quality detection frame according to the quality score of the high-quality detection frame corresponding to the type of the obstacle and the quality score of the position of the high-quality detection frame output by the third network layer.

Optionally, determining a predicted detection frame of each obstacle according to each determined high-quality detection frame of each obstacle and the quality score thereof, specifically including:

aiming at each high-quality detection frame of each obstacle in the laser point cloud, determining the intersection ratio of the high-quality detection frame and other high-quality detection frames of the obstacle according to the position relationship of the high-quality detection frame of the obstacle and the other high-quality detection frames of the obstacle;

and determining the predicted detection frame of the obstacle according to the intersection ratio and the quality score among the high-quality detection frames of the obstacle.
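As a non-limiting illustration of the selection step above, the following sketch keeps the high-quality detection frame with the highest quality score and discards frames that overlap an already kept frame too strongly. This non-maximum-suppression style pass is an assumption made for illustration; the specification states only that the predicted detection frame is determined according to the intersection ratio and the quality score. The function name, the `iou` callable and the 0.5 threshold are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

B = TypeVar("B")  # any frame representation accepted by the supplied iou callable


def select_predicted_frames(boxes: Sequence[B],
                            scores: Sequence[float],
                            iou: Callable[[B, B], float],
                            iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept frames, highest quality score first.

    Assumed rule: a frame is kept only if its intersection ratio with every
    frame kept so far does not exceed iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept
```

The `iou` callable is passed in so that the same pass can be used with whatever frame representation and intersection-ratio computation an embodiment adopts.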

This specification provides a model training device, comprising:

the acquisition module acquires each frame of historically acquired laser point cloud as a training sample, and marks a position marking frame of each obstacle in each training sample according to the obstacle information of each obstacle contained in each frame of laser point cloud;

the feature extraction module is used for inputting, for each training sample, the laser point cloud in the training sample into a first network layer of an obstacle detection model to be trained, performing feature extraction, and determining the point cloud characteristics of the laser point cloud;

the detection module is used for inputting the determined point cloud characteristics of the laser point cloud into a second network layer of the obstacle detection model to be trained and determining each candidate detection frame of each obstacle in the laser point cloud;

the scoring module is used for inputting the candidate detection frame and the point cloud characteristics of the laser point cloud into a third network layer of the obstacle detection model to be trained aiming at each candidate detection frame of each obstacle in the laser point cloud, and determining the quality score of the candidate detection frame;

the classification module is used for clustering the candidate detection frames of each obstacle according to the quality score of the candidate detection frames of each obstacle in the laser point cloud, and determining a plurality of high-quality detection frames of each obstacle in the laser point cloud;

and the parameter adjusting module is used for determining a loss function according to the difference between each high-quality detection frame of each obstacle in each training sample and the position marking frame of each obstacle in each training sample, and adjusting the model parameters of each network layer in the obstacle detection model by taking the minimum loss function as a target.

This specification provides an obstacle detection device, including:

the acquisition module acquires laser point cloud of the surrounding environment;

the feature extraction module is used for inputting the acquired laser point cloud into a first network layer of a pre-trained obstacle detection model, performing feature extraction, and determining the point cloud characteristics of the laser point cloud;

the detection module is used for inputting the determined point cloud characteristics of the laser point cloud into a second network layer of a pre-trained obstacle detection model and determining each candidate detection frame of each obstacle in the laser point cloud;

the scoring module is used for inputting the candidate detection frame and the point cloud characteristics of the laser point cloud into a third network layer of a pre-trained obstacle detection model aiming at each candidate detection frame of each obstacle in the laser point cloud, and determining the quality score of the candidate detection frame;

the classification module is used for clustering the candidate detection frames of each obstacle according to the quality scores of the candidate detection frames of each obstacle in the laser point cloud, and determining a plurality of high-quality detection frames of each obstacle in the laser point cloud;

and the output module is used for determining the prediction detection frame of each obstacle according to the determined high-quality detection frames and the quality scores of the high-quality detection frames of each obstacle.

The present specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above-described model training or obstacle detection method.

The present specification provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the processor implements the above model training or obstacle detection method.

The embodiment of the specification adopts at least one technical scheme which can achieve the following beneficial effects:

In this specification, each frame of historically collected laser point cloud may be obtained as a training sample, and each training sample is labeled according to the obstacle information in that frame of laser point cloud. A plurality of candidate detection frames of each obstacle in each frame of laser point cloud, together with the quality score of each candidate detection frame, are then determined through the network layers of an obstacle detection model to be trained. Next, the candidate detection frames are clustered according to their quality scores to determine a plurality of high-quality detection frames. Finally, model parameters in the obstacle detection model are adjusted with the goal of minimizing the difference between the high-quality detection frames of each obstacle in each training sample and the position marking frame of that obstacle. Because the candidate detection frames are clustered on the basis of their quality scores, the criterion for judging detection-frame quality is adjusted dynamically, and the generalization of the model to different scenes is improved.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:

FIG. 1 is a flow chart of a model training method provided by an embodiment of the present disclosure;

FIG. 2 is a schematic diagram of an obstacle location marker provided in an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of a position labeling frame and a candidate detecting frame of an obstacle according to an embodiment of the present disclosure;

FIG. 4 is a schematic diagram illustrating an intersection ratio between a position labeling box and a candidate detection box according to an embodiment of the present disclosure;

FIG. 5 is a flowchart of an obstacle detection method provided in an embodiment of the present disclosure;

FIG. 6 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure;

FIG. 7 is a schematic structural diagram of an obstacle detection device provided in an embodiment of the present disclosure;

FIG. 8 is a schematic view of an electronic device implementing a model training or obstacle detection method according to an embodiment of the present disclosure.

Detailed Description

In order to make the objects, technical solutions and advantages of the present disclosure more apparent, the technical solutions of the present disclosure will be clearly and completely described below with reference to the specific embodiments of the present disclosure and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person skilled in the art without making any inventive step based on the embodiments in the description belong to the protection scope of the present application.

The present specification provides a model training method for training an obstacle detection model to detect obstacles in the surrounding environment in real time during the driving of an unmanned device. The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.

Fig. 1 is a schematic flow chart of a model training method provided in an embodiment of the present specification, which may specifically include the following steps:

S100: Acquiring each frame of historically acquired laser point cloud as a training sample, and marking a position marking frame of each obstacle in each training sample according to the obstacle information of each obstacle contained in each frame of laser point cloud.

According to the model training method provided by the specification, when the obstacle detection model is trained, the laser point cloud collected in the history driving process of the unmanned equipment can be obtained firstly as a training sample, so that subsequent model training can be carried out.

Specifically, each frame of laser point cloud including the surrounding environment collected by the unmanned equipment in the historical driving process can be obtained first, and each obtained frame of laser point cloud is used as a training sample. And then, for each frame of laser point cloud, marking the frame of laser point cloud according to the obstacle information of each obstacle in the frame of laser point cloud.

The obstacle information at least comprises information such as the position of an obstacle and the size of the obstacle, and when the frame of laser point cloud is marked, all the obstacles contained in the frame of laser point cloud can be marked in a position marking frame mode.

As shown in FIG. 2, which exemplarily shows one frame of historically collected laser point cloud, the left side of the figure is the point cloud data corresponding to a vehicle and the right side is the point cloud data corresponding to a tree. The positions of the vehicle and the tree, as obstacles, can be marked in the frame of laser point cloud with rectangular frames as shown in the figure. It should be noted that FIG. 2 is illustrated in a two-dimensional plane only by way of example; the actually acquired laser point cloud is in three-dimensional space, and the position marking frame of each obstacle is likewise a three-dimensional structure.

In addition, the obstacle information may further include types of obstacles, and type labeling may be performed on each obstacle in the frame of laser point cloud according to the type of each obstacle in the frame of laser point cloud, so that when obstacle detection is performed, the type of the obstacle may be detected based on the obstacle detection model.

S102: For each training sample, inputting the laser point cloud in the training sample into a first network layer of the obstacle detection model to be trained, performing feature extraction, and determining the point cloud characteristics of the laser point cloud.

S104: Inputting the determined point cloud characteristics of the laser point cloud into a second network layer of the obstacle detection model to be trained, and determining each candidate detection frame of each obstacle in the laser point cloud.

After the training samples and their labels are obtained in step S100, model training can be performed based on the training samples and their labels.

Specifically, for each training sample, the laser point clouds in the training sample are input into a first network layer of the obstacle detection model to be trained, feature extraction is performed, and the point cloud features of the frame of laser point clouds are determined. The point cloud feature includes feature information of multiple dimensions such as the type, material, and color of each laser point. And then, inputting the point cloud characteristics of the frame of laser point cloud into a second network layer of the obstacle detection model to be trained, and determining a plurality of candidate detection frames of each obstacle contained in the frame of laser point cloud.

And each obstacle corresponds to a plurality of candidate detection frames for roughly determining the initial position of the obstacle. As shown in fig. 3, the vehicle graphs connected by the dotted lines in fig. 3 represent point cloud data corresponding to the vehicle in the training sample, the solid line boxes represent position labeling boxes of the vehicle, and the dotted line boxes represent a plurality of candidate detection boxes of the vehicle.

Further, the obstacle detection model also comprises a fourth network layer for predicting the obstacle type. Therefore, after the point cloud features of the frame of laser point cloud are extracted, they can be input into the fourth network layer of the obstacle detection model to be trained, and the type of each obstacle contained in the frame of laser point cloud is determined.

S106: For each candidate detection frame of each obstacle in the laser point cloud, inputting the candidate detection frame and the point cloud characteristics of the laser point cloud into a third network layer of the obstacle detection model to be trained, and determining the quality score of the candidate detection frame.

After a plurality of candidate detection frames of each obstacle are determined, quality evaluation can be carried out on each candidate detection frame, so that the detection frame with the highest quality, namely the most accurate position, is screened out and used as the detection result of the obstacle.

Specifically, for each candidate detection frame of each obstacle in the frame of laser point cloud, the candidate detection frame and the point cloud characteristics of the frame of laser point cloud are input into a third network layer of the obstacle detection model to be trained, and the quality score of the position of the candidate detection frame is determined. The candidate detection frame may be represented by coordinates of a center position of the candidate detection frame and parameters of a length, a width, and a height of the candidate detection frame.

Further, the quality score of a candidate detection frame of the obstacle represents the Intersection over Union (IOU) between the candidate detection frame and the position marking frame of the obstacle. The closer this IOU is to 1, the closer the candidate detection frame is to the position marking frame, the more accurate the position of the candidate detection frame, and the higher the quality score of its position.

Therefore, in the model training stage, the IOU between the candidate detection frame of the obstacle and the position marking frame of the obstacle can be determined as the marking score of the candidate detection frame for each candidate detection frame of each obstacle in the frame of laser point cloud, and the model parameters in the third network layer are adjusted with the goal of minimizing the difference between the quality score output by the third network layer and the marking score of the candidate detection frame.

FIG. 4 exemplarily shows the IOU between a candidate detection frame and the position marking frame of an obstacle, where rectangular frame A is the candidate detection frame of the obstacle and rectangular frame B is the position marking frame of the obstacle. The shaded area in the left diagram is the intersection A ∩ B of the candidate detection frame and the position marking frame, and the shaded area in the right diagram is their union A ∪ B. The IOU between the candidate detection frame and the position marking frame of the obstacle is (A ∩ B)/(A ∪ B).
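As a non-limiting illustration, the IOU between a candidate detection frame and a position marking frame may be computed as sketched below. The sketch assumes axis-aligned three-dimensional frames given as center coordinates followed by length, width and height; orientation is ignored, which is a simplification. The names `Box`, `box_volume` and `iou_3d` are hypothetical and do not appear in this specification.

```python
from typing import Tuple

# Assumed representation: (center x, center y, center z, length, width, height)
Box = Tuple[float, float, float, float, float, float]


def box_volume(box: Box) -> float:
    """Volume of an axis-aligned frame."""
    return box[3] * box[4] * box[5]


def iou_3d(a: Box, b: Box) -> float:
    """Volume of the intersection of a and b divided by the volume of their union."""
    intersection = 1.0
    for axis in range(3):
        low = max(a[axis] - a[3 + axis] / 2, b[axis] - b[3 + axis] / 2)
        high = min(a[axis] + a[3 + axis] / 2, b[axis] + b[3 + axis] / 2)
        if high <= low:
            return 0.0  # no overlap along this axis, so no intersection at all
        intersection *= high - low
    union = box_volume(a) + box_volume(b) - intersection
    return intersection / union


label_frame = (0.0, 0.0, 0.0, 4.0, 2.0, 2.0)
candidate_frame = (1.0, 0.0, 0.0, 4.0, 2.0, 2.0)
print(iou_3d(candidate_frame, label_frame))  # intersection 12, union 20
```

In the training stage, a value computed in this way would serve as the marking score against which the output of the third network layer is compared.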

Furthermore, since the laser point cloud includes different types of obstacles, the quality score of the candidate detection frame of each obstacle can be determined comprehensively from the quality score of the type of the obstacle and the quality score of the position of the candidate detection frame of the obstacle. For example, assuming that the probability that the type of the obstacle is a vehicle is 0.8, and that the quality score of the position of the candidate detection frame of the obstacle is 0.8, the quality score of the candidate detection frame is 0.8 × 0.8 = 0.64.
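The arithmetic of the example above can be sketched as follows. The product is the combination used in that example; the specification does not state that it is the only possible combination, and the function name is hypothetical.

```python
def combined_quality_score(type_score: float, position_score: float) -> float:
    """Combine the type quality score and the position quality score by multiplication."""
    return type_score * position_score


# Type probability 0.8 and position quality score 0.8, as in the example above.
print(round(combined_quality_score(0.8, 0.8), 2))
```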

In order to make the position quality evaluation of each candidate detection frame more accurate, a preliminary position correction may be performed on each candidate detection frame before the quality score of its position is determined, and the quality evaluation may then be performed on the preliminarily corrected candidate detection frame. Specifically, the candidate detection frame and the point cloud characteristics of the frame of laser point cloud are input into a fifth network layer of the obstacle detection model to be trained, and the prediction deviation distribution corresponding to the candidate detection frame is determined. The prediction deviation corresponding to the candidate detection frame is then determined as the position deviation with the maximum probability in that distribution, and the position of the candidate detection frame is corrected according to the determined prediction deviation.

When the fifth network layer is trained, the actual deviation distribution between the candidate detection frame and the position marking frame of the obstacle can be determined, and the model parameters in the fifth network layer are adjusted by taking the minimum difference between the actual deviation distribution and the predicted deviation distribution as a learning target.

For example, assuming that the actual deviation corresponding to the candidate detection frame is 1 m and that the deviation range is divided into five intervals with a unit length of 0.2 m, the actual deviation distribution is determined to be [0,0,0,0,1], which indicates that the probability that the positioning deviation of the candidate detection frame falls in the fifth interval is 1 and the probability that it falls in any other interval is 0. Suppose the prediction deviation distribution corresponding to the candidate detection frame is [0.1,0.2,0.1,0.5,0.1], which indicates that the probability of the predicted deviation falling in each of the first, third and fifth intervals is 0.1, the probability for the second interval is 0.2, and the probability for the fourth interval is 0.5. The prediction deviation of the candidate detection frame may then be determined from the most probable positioning deviation, that is, the positioning deviation corresponding to the fourth interval.
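As a non-limiting illustration, the example above can be reproduced with the following sketch. Two points are assumptions made for illustration only: interval k is taken to cover deviations greater than (k-1) × 0.2 m and up to k × 0.2 m, and the difference between the actual and predicted distributions is measured by cross-entropy, which the specification does not prescribe. All function names are hypothetical.

```python
import math
from typing import List


def deviation_to_distribution(deviation_m: float, bin_width_m: float = 0.2,
                              num_bins: int = 5) -> List[float]:
    """One-hot actual deviation distribution over fixed-width intervals."""
    # Assumed convention: interval k (1-based) covers ((k-1)*w, k*w].
    index = math.ceil(deviation_m / bin_width_m - 1e-9) - 1
    index = max(0, min(num_bins - 1, index))  # clamp out-of-range deviations
    return [1.0 if i == index else 0.0 for i in range(num_bins)]


def most_probable_interval(distribution: List[float]) -> int:
    """1-based index of the interval with the highest probability."""
    return max(range(len(distribution)), key=lambda i: distribution[i]) + 1


def cross_entropy(actual: List[float], predicted: List[float],
                  eps: float = 1e-12) -> float:
    """Assumed difference measure between the actual and predicted distributions."""
    return -sum(a * math.log(p + eps) for a, p in zip(actual, predicted))


actual = deviation_to_distribution(1.0)        # [0.0, 0.0, 0.0, 0.0, 1.0]
predicted = [0.1, 0.2, 0.1, 0.5, 0.1]
print(most_probable_interval(predicted))       # 4, i.e. the fourth interval
print(cross_entropy(actual, predicted))
```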

S108: Clustering the candidate detection frames of each obstacle according to the quality score of each candidate detection frame of each obstacle in the laser point cloud, and determining a plurality of high-quality detection frames of each obstacle in the laser point cloud.

After the quality scores of the candidate detection frames of the obstacle are determined in the specification, the candidate detection frames can be preliminarily screened, and part of high-quality detection frames are selected.

The quality evaluation standards of detection frames differ between scenes. For example, the evaluation standard in a dense obstacle scene is looser than that in a sparse obstacle scene, and the evaluation standard for detection frames in rainy weather is looser than that in sunny weather. Therefore, in order to adapt to various scenes, the judgment threshold for detection-frame quality can be dynamically adjusted by clustering in this specification.

Specifically, the candidate detection frames can be clustered according to the quality scores of the candidate detection frames of each obstacle in the frame of laser point cloud, a plurality of candidate detection frames with similar quality scores are determined, and a candidate detection frame with a higher quality score is determined from each cluster and used as a high-quality detection frame of each obstacle.

When clustering is carried out on each candidate detection frame, clustering can be carried out by adopting a clustering algorithm such as k-means and the like. Of course, a Gaussian Mixture Model (GMM) can also be used for clustering. The present specification does not limit which clustering method is used, as long as candidate detection frames with high quality and low quality can be distinguished. Since classification based on a clustering algorithm is a mature prior art, the description is omitted here.
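As a non-limiting illustration, the clustering step may be sketched with a one-dimensional k-means using two clusters, where the candidate detection frames in the higher-scoring cluster are treated as high-quality detection frames. The choice of two clusters, the initialization from the minimum and maximum scores, and the function name are assumptions made for illustration.

```python
from typing import List


def select_high_quality(scores: List[float], iterations: int = 20) -> List[int]:
    """Indices of candidate frames that fall in the higher-scoring of two clusters."""
    if not scores:
        return []
    low, high = min(scores), max(scores)  # initial cluster centers
    if low == high:
        return list(range(len(scores)))   # all scores equal: nothing to separate
    for _ in range(iterations):
        high_members = [s for s in scores if abs(s - high) <= abs(s - low)]
        low_members = [s for s in scores if abs(s - high) > abs(s - low)]
        new_low = sum(low_members) / len(low_members)
        new_high = sum(high_members) / len(high_members)
        if (new_low, new_high) == (low, high):
            break                          # centers stopped moving
        low, high = new_low, new_high
    return [i for i, s in enumerate(scores) if abs(s - high) <= abs(s - low)]


dense_scene = [0.35, 0.40, 0.42, 0.70, 0.75]
sparse_scene = [0.60, 0.65, 0.90, 0.92, 0.95]
print(select_high_quality(dense_scene))   # [3, 4]
print(select_high_quality(sparse_scene))  # [2, 3, 4]
```

In this sketch the boundary between the two clusters lies near 0.56 for the first list and near 0.77 for the second, so the effective screening threshold follows the score distribution of each scene instead of being fixed in advance.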

S110: determining a loss function according to the difference between each high-quality detection frame of each obstacle in each training sample and the position marking frame of each obstacle in each training sample, and adjusting the model parameters of each network layer in the obstacle detection model by taking the minimum loss function as a target.

After a plurality of high-quality detection frames have been preliminarily screened out, the model parameters of the obstacle detection model can be adjusted based on the position deviation between each high-quality detection frame of an obstacle and the position marking frame of that obstacle.

Specifically, for each high-quality detection frame of each obstacle in the frame of laser point cloud, a loss function is determined according to the position deviation between the high-quality detection frame and the position marking frame of the obstacle. The model parameters of each network layer in the obstacle detection model are then adjusted with the goal of minimizing the loss function, that is, minimizing the deviation between the high-quality detection frame and the position marking frame.

Further, since the position of each candidate detection frame has already been preliminarily corrected in step S106, once the high-quality detection frames have been screened out, the model parameters of each network layer can be adjusted with the goal of minimizing the actual deviation between the high-quality detection frames of each obstacle and the position marking frame of that obstacle.
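The loss in step S110 can be pictured with a short sketch. The specification only states that the loss is determined from the position deviation between the high-quality detection frames and the position marking frame; the smooth L1 form, the `beta` parameter, the averaging over frames, and the representation of a frame as a flat sequence of numbers are all assumptions made here for illustration.

```python
from typing import Sequence


def smooth_l1(pred: Sequence[float], target: Sequence[float], beta: float = 1.0) -> float:
    """Smooth L1 distance between two frames given as equal-length parameter lists."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for large deviations.
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total


def detection_loss(quality_frames: Sequence[Sequence[float]],
                   marking_frame: Sequence[float]) -> float:
    """Average deviation of an obstacle's high-quality frames from its marking frame."""
    if not quality_frames:
        return 0.0
    return sum(smooth_l1(f, marking_frame) for f in quality_frames) / len(quality_frames)


# Hypothetical frames as (x, y, length, width); only high-quality frames contribute.
marking = (10.0, 5.0, 4.0, 2.0)
frames = [(10.0, 5.0, 4.0, 2.0), (10.5, 5.0, 4.0, 4.0)]
loss = detection_loss(frames, marking)
```

Because low-quality candidate frames were removed by the clustering step, they do not enter this sum, so the parameter update is driven only by the frames judged reliable for the current scene.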

With the model training method shown in fig. 1, training samples and their labels are determined from each frame of historically collected laser point cloud and the obstacle information of each obstacle in it. The point cloud features of each frame are then extracted by the first network layer of the obstacle detection model to be trained, and the candidate detection frames of each obstacle in each frame are determined by the second network layer. Next, each candidate detection frame of each obstacle, together with the point cloud features of the frame, is input into the third network layer to determine the quality score of that candidate detection frame, and the candidate detection frames are clustered according to their quality scores to determine a plurality of high-quality detection frames. Finally, a loss function is determined according to the difference between the high-quality detection frames of each obstacle in each training sample and the position marking frame of that obstacle, and the model parameters of each network layer in the obstacle detection model are adjusted with the goal of minimizing the loss function. Because the candidate detection frames are clustered on the basis of their quality scores, the criterion for judging detection frame quality is adjusted dynamically, which improves the generalization of the model to different scenes.

In addition, in this specification, since the type of each obstacle in the frame of laser point cloud is identified based on the point cloud data of the whole frame of laser point cloud in step S102, in order to further improve the accuracy of obstacle type identification, accurate type identification may be performed based on the point cloud data in the high-quality detection frame of the obstacle.

Specifically, for each high-quality detection frame of each obstacle in the frame of laser point cloud, the point cloud features within the high-quality detection frame are input into a sixth network layer of the obstacle detection model to be trained, and the type of the obstacle corresponding to the high-quality detection frame is determined. After the model has been deployed online, the quality score of each high-quality detection frame can be updated based on the quality score of the re-identified obstacle type, and the final detection result of the obstacle is determined according to the updated quality scores of the high-quality detection frames.

In this specification, both the fourth network layer and the sixth network layer of the obstacle detection model are used to identify the type of an obstacle. The fourth network layer determines the obstacle type of each obstacle contained in the whole frame of laser point cloud based on the point cloud features of the whole frame. The sixth network layer identifies only the type of the obstacle inside a high-quality detection frame, based on the point cloud features within that frame. Therefore, the obstacle type identified by the sixth network layer is more accurate than that identified by the fourth network layer.

When the training samples are labeled, the type of each obstacle in each training sample may also be labeled. In the model training stage, the model parameters of the fourth network layer can be adjusted according to the labeled type of each obstacle in each frame of laser point cloud and the output of the fourth network layer, and the model parameters of the sixth network layer can be adjusted according to the labeled type of each obstacle in each frame of laser point cloud and the output of the sixth network layer.

Based on the model training method shown in fig. 1, the present specification correspondingly provides an obstacle detection method, which can obtain an obstacle detection model based on the model training method, and detect an obstacle in the surrounding environment during the driving process of the unmanned device.

The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.

Fig. 5 is a schematic flow chart of an obstacle detection method provided in an embodiment of the present disclosure, which may specifically include the following steps:

S200: acquiring a laser point cloud of the surrounding environment.

S202: inputting the acquired laser point cloud into a first network layer of a pre-trained obstacle detection model, performing feature extraction, and determining the point cloud features of the laser point cloud.

S204: inputting the determined point cloud features of the laser point cloud into a second network layer of the pre-trained obstacle detection model, and determining each candidate detection frame of each obstacle in the laser point cloud.

When detecting obstacles in the surrounding environment of the unmanned device, the current frame of laser point cloud of the surrounding environment may be collected and input into the first network layer of the pre-trained obstacle detection model to extract the point cloud features of that frame. The point cloud features of the frame are then input into the second network layer of the pre-trained obstacle detection model to determine a plurality of candidate detection frames for each obstacle in the frame of laser point cloud.

S206: for each candidate detection frame of each obstacle in the laser point cloud, inputting the candidate detection frame and the point cloud features of the laser point cloud into a third network layer of the pre-trained obstacle detection model, and determining the quality score of the candidate detection frame.

S208: clustering the candidate detection frames of each obstacle according to the quality score of each candidate detection frame of each obstacle in the laser point cloud, and determining a plurality of high-quality detection frames of each obstacle in the laser point cloud.

S210: determining a predicted detection frame of each obstacle according to the determined high-quality detection frames of each obstacle and the quality scores of the high-quality detection frames.

After the candidate detection frames of each obstacle in the frame of laser point cloud have been obtained, then for each detected obstacle and for each of its candidate detection frames, the candidate detection frame and the point cloud features of the frame of laser point cloud are input into the third network layer of the pre-trained obstacle detection model to determine the quality score of the candidate detection frame.

The candidate detection frames of the obstacle are then clustered according to their quality scores, so that candidate detection frames with similar quality scores are grouped together, and the candidate detection frames with higher quality scores are determined from the resulting clusters and used as high-quality detection frames.

Finally, for each obstacle, the final detection result of the obstacle, that is, the predicted detection frame of the obstacle, is determined from the high-quality detection frames of the obstacle by a Non-Maximum Suppression (NMS) algorithm based on the quality scores of those high-quality detection frames. The obstacle detection model used here is obtained by training with the model training method shown in fig. 1.

Further, when the predicted detection frame is selected with the NMS algorithm, the high-quality detection frames of the obstacle can first be sorted from high to low by quality score. Then, taking each high-quality detection frame in turn in that order, the intersection over union (IoU) between it and each of the other high-quality detection frames of the obstacle is determined from their positional relationship, and any other high-quality detection frame whose IoU with it exceeds a preset threshold is eliminated. Finally, the predicted detection frame is determined from the de-duplicated high-quality detection frames according to their quality scores. The preset threshold can be set as required.
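The NMS screening described above can be sketched as follows. For brevity the sketch assumes axis-aligned rectangles in the bird's-eye view, given as `(x1, y1, x2, y2)`; the detection frames in the specification may be rotated or three-dimensional, in which case only the `iou` function would change. The function names and the default threshold of 0.5 are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed axis-aligned


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the frames kept after non-maximum suppression."""
    # Visit frames from the highest quality score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        # Keep a frame only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


boxes = [(0.0, 0.0, 2.0, 2.0), (0.1, 0.0, 2.1, 2.0), (5.0, 5.0, 6.0, 6.0)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In this example the second frame overlaps the first almost completely and has a lower quality score, so it is suppressed, while the third frame does not overlap and is kept.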

Furthermore, since the type of each obstacle in the frame of laser point cloud is identified based on the point cloud data of the whole frame of laser point cloud in step S202, in order to further improve the accuracy of obstacle type identification, accurate type identification can be performed based on the point cloud data in the high-quality detection frame of the obstacle.

Specifically, for each high-quality detection frame of each obstacle in the frame of laser point cloud, the point cloud features within the high-quality detection frame are input into a sixth network layer of the pre-trained obstacle detection model, and the type of the obstacle corresponding to the high-quality detection frame is determined. The quality score of the high-quality detection frame is then updated according to the quality score of the obstacle type corresponding to the high-quality detection frame and the quality score of the position of the high-quality detection frame output by the third network layer.
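The score update can be illustrated with a very small sketch. The specification states only that the updated quality score depends on both the type quality score and the position quality score; multiplying the two, as done below, is an assumption chosen for simplicity, and the function names are hypothetical.

```python
from typing import List, Sequence


def fuse_scores(position_score: float, type_score: float) -> float:
    """Combine the two scores; the product is an assumed rule, not the patented one."""
    return position_score * type_score


def update_scores(position_scores: Sequence[float],
                  type_scores: Sequence[float]) -> List[float]:
    """Updated quality score for each high-quality detection frame."""
    return [fuse_scores(p, t) for p, t in zip(position_scores, type_scores)]


# A frame with a good position score but an uncertain type can drop in rank.
updated = update_scores([0.9, 0.8], [0.5, 1.0])
best = max(range(len(updated)), key=lambda i: updated[i])
```

With such a rule, the ordering used by the subsequent NMS step reflects both how well a frame is positioned and how confidently the obstacle inside it is classified.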

Based on the model training method shown in fig. 1, an embodiment of the present specification further provides a schematic structural diagram of a model training apparatus, as shown in fig. 6.

Fig. 6 is a schematic structural diagram of a model training apparatus provided in an embodiment of the present disclosure, including:

an acquisition module 300, configured to acquire each frame of historically collected laser point cloud as a training sample, and to mark the position marking frame of each obstacle in each training sample according to the obstacle information of each obstacle contained in each frame of laser point cloud;

a feature extraction module 302, configured to, for each training sample, input the laser point cloud in the training sample into a first network layer of an obstacle detection model to be trained, perform feature extraction, and determine the point cloud features of the laser point cloud;

a detection module 304, configured to input the determined point cloud features of the laser point cloud into a second network layer of the obstacle detection model to be trained, and determine each candidate detection frame of each obstacle in the laser point cloud;

a scoring module 306, configured to, for each candidate detection frame of each obstacle in the laser point cloud, input the candidate detection frame and the point cloud features of the laser point cloud into a third network layer of the obstacle detection model to be trained, and determine the quality score of the candidate detection frame;

a classification module 308, configured to cluster the candidate detection frames of each obstacle according to the quality scores of the candidate detection frames of each obstacle in the laser point cloud, and determine a plurality of high-quality detection frames of each obstacle in the laser point cloud;

and a parameter adjusting module 310, configured to determine a loss function according to the difference between each high-quality detection frame of each obstacle in each training sample and the position marking frame of that obstacle, and to adjust the model parameters of each network layer in the obstacle detection model with the goal of minimizing the loss function.

Optionally, the obstacle detection model to be trained further includes a fourth network layer, and the detection module 304 is further configured to input the determined point cloud features of the laser point cloud into the fourth network layer of the obstacle detection model to be trained, and determine the type of each obstacle in the laser point cloud.

Optionally, the scoring module 306 is specifically configured to determine a quality score of the position of the candidate detection box output by the third network layer, and determine the quality score of the candidate detection box according to the quality score of the type of the obstacle and the quality score of the position of the candidate detection box.

Optionally, the scoring module 306 is further configured to, for each candidate detection frame of each obstacle in the laser point cloud, determine the intersection over union (IoU) between the candidate detection frame of the obstacle and the position marking frame of the obstacle as the labeling score of the candidate detection frame, and adjust the model parameters of the third network layer with the goal of minimizing the difference between the quality score output by the third network layer and the labeling score of the candidate detection frame.

Optionally, the scoring module 306 is further configured to input the candidate detection frame and the point cloud feature of the laser point cloud into a fifth network layer of the obstacle detection model to be trained, determine a prediction deviation distribution corresponding to the candidate detection frame, determine a prediction deviation corresponding to the candidate detection frame according to the prediction deviation distribution corresponding to the candidate detection frame, and adjust the position of the candidate detection frame according to the determined prediction deviation.

Based on the obstacle detection method shown in fig. 5, an embodiment of the present specification further provides a schematic structural diagram of an obstacle detection apparatus, as shown in fig. 7.

Fig. 7 is a schematic structural diagram of an obstacle detection device provided in an embodiment of the present specification, including:

an acquisition module 400, configured to acquire a laser point cloud of the surrounding environment;

a feature extraction module 402, configured to input the acquired laser point cloud into a first network layer of a pre-trained obstacle detection model, perform feature extraction, and determine the point cloud features of the laser point cloud;

a detection module 404, configured to input the determined point cloud features of the laser point cloud into a second network layer of the pre-trained obstacle detection model, and determine each candidate detection frame of each obstacle in the laser point cloud;

a scoring module 406, configured to, for each candidate detection frame of each obstacle in the laser point cloud, input the candidate detection frame and the point cloud features of the laser point cloud into a third network layer of the pre-trained obstacle detection model, and determine the quality score of the candidate detection frame;

a classification module 408, configured to cluster the candidate detection frames of each obstacle according to the quality scores of the candidate detection frames of each obstacle in the laser point cloud, and determine a plurality of high-quality detection frames of each obstacle in the laser point cloud;

and an output module 410, configured to determine the predicted detection frame of each obstacle according to the determined high-quality detection frames of each obstacle and the quality scores of the high-quality detection frames.

Optionally, the scoring module 406 is specifically configured to, for each high-quality detection frame of each determined obstacle, input the point cloud feature in the high-quality detection frame into a sixth network layer of the obstacle detection model, determine the type of the obstacle corresponding to the high-quality detection frame, and update the quality score of the high-quality detection frame according to the quality score of the high-quality detection frame corresponding to the type of the obstacle and the quality score of the position of the high-quality detection frame output by the third network layer.

Optionally, the output module 410 is specifically configured to, for each high-quality detection frame of each obstacle in the laser point cloud, determine the intersection over union (IoU) between the high-quality detection frame and the other high-quality detection frames of the obstacle according to their positional relationship, and determine the predicted detection frame of the obstacle according to the IoU values between the high-quality detection frames of the obstacle and their quality scores.

Embodiments of the present description also provide a computer-readable storage medium, where the storage medium stores a computer program, and the computer program can be used to execute the model training method provided in fig. 1 or the obstacle detection method shown in fig. 5.

Based on the model training method shown in fig. 1 or the obstacle detection method shown in fig. 5, an embodiment of the present specification further provides a schematic structural diagram of an electronic device, shown in fig. 8. As shown in fig. 8, at the hardware level, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, and may also include hardware required for other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it to implement the model training method shown in fig. 1 or the obstacle detection method shown in fig. 5.

Of course, besides the software implementation, the present specification does not exclude other implementations, such as logic devices or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may be hardware or logic devices.

In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (e.g., an improvement in circuit structures such as diodes, transistors, and switches) or an improvement in software (an improvement in a method flow). However, as technology has advanced, many of today's method flow improvements can be regarded as direct improvements in hardware circuit structure. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Thus, it cannot be said that an improvement in a method flow cannot be realized by hardware physical modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer programs a digital system onto a single PLD without requiring a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Furthermore, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must likewise be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present. It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained merely by lightly programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.

The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer-readable program code, the same functionality can be achieved by logically programming the method steps so that the controller takes the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing the various functions may also be regarded as structures within the hardware component. Indeed, the means for performing the functions may be regarded both as software modules implementing the method and as structures within the hardware component.

The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

For convenience of description, the above devices are described as being divided into various units by function, each described separately. Of course, when implementing this specification, the functions of the units may be implemented in one or more pieces of software and/or hardware.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

The embodiments in this specification are described in a progressive manner; for parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the others. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively brief, and for relevant details reference may be made to the corresponding parts of the method embodiment.

The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.

Claims

1. A method of model training, comprising:

2. The method of claim 1, wherein the obstacle detection model to be trained further comprises a fourth network layer;

the method further comprises the following steps:

3. The method of claim 2, wherein determining the quality score of the candidate detection box comprises:

4. The method of claim 1, wherein the method further comprises:

5. The method of claim 1, wherein prior to determining the quality score for the candidate detection box, the method further comprises:

6. An obstacle detection method, comprising:

acquiring laser point cloud of the surrounding environment;

7. The method of claim 6, wherein determining the quality score of the candidate detection box comprises:

8. The method of claim 6, wherein determining the predicted detection frame for each obstacle based on the determined quality detection frames and their quality scores for each obstacle comprises:

9. A model training apparatus, comprising:

10. An obstacle detection device, comprising:

11. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of claims 1 to 5 or 6 to 8.

12. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the method of any of claims 1 to 5 or 6 to 8.