CN113688760A - Automatic driving data identification method and device, computer equipment and storage medium - Google Patents

Automatic driving data identification method and device, computer equipment and storage medium

Info

Publication number
CN113688760A
CN113688760A
Authority
CN
China
Prior art keywords
data
network
difference
detection result
automatic driving
Prior art date
Legal status
Pending
Application number
CN202111009568.6A
Other languages
Chinese (zh)
Inventor
夏润
刘传秀
韩旭
Current Assignee
Guangzhou Weride Technology Co Ltd
Original Assignee
Guangzhou Weride Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Weride Technology Co Ltd
Priority to CN202111009568.6A
Publication of CN113688760A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Traffic Control Systems (AREA)

Abstract

The embodiment of the invention provides a data identification method and device for automatic driving, computer equipment and a storage medium. The method comprises: acquiring raw data collected during automatic driving; respectively determining a student network and a teacher network suitable for automatic driving; calculating the difference between the predictions of the student network and the teacher network on the raw data as a distribution difference; and identifying, according to the distribution difference, raw data belonging to long-tail data for automatic driving as target data. Because the long-tail data is mined automatically from the difference in generalization capability between the student network and the teacher network on long-tail data, the accuracy of mining can be ensured while manual mining is avoided, which greatly improves mining efficiency and greatly reduces cost.

Description

Automatic driving data identification method and device, computer equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of automatic driving, in particular to a data identification method and device for automatic driving, computer equipment and a storage medium.
Background
To ensure accurate decisions during automatic driving, the vehicle senses its own state and the surrounding environment and acquires a large amount of data, such as the vehicle's speed and attitude, traffic flow information, road conditions, traffic signs, and the like.
For decision making, a basic assumption is usually made that this data approximately obeys a uniform distribution. In a real road-test environment, however, the environmental data is often unbalanced and long-tail data exists, that is, environmental data that occurs with low probability and is difficult to collect.
For example, a traffic light may be damaged or shielded by a floating obstacle, or weather such as cloudy, rainy, or foggy conditions may affect the acquisition of image data and point cloud data, and the data acquired in such situations is long-tail data.
If long-tail data is missing, automatic driving decisions in some scenarios become problematic and may cause accidents.
However, long-tail data is scarce: in automatic driving, less than 1 hour of long-tail data may be acquired for every 1000 hours of driving, so manually mining long-tail data is inefficient and costly.
Disclosure of Invention
The embodiment of the invention provides a data identification method and device for automatic driving, computer equipment and a storage medium, and aims to solve the problems of low efficiency and high cost of manual mining of long-tail data of automatic driving.
In a first aspect, an embodiment of the present invention provides an automatic driving data identification method, including:
acquiring original data acquired during automatic driving;
respectively determining a student network and a teacher network suitable for the automatic driving;
calculating a difference between the student network and the teacher network for predicting the original data as a distribution difference;
and identifying the original data belonging to the long tail data for the automatic driving according to the distribution difference as target data.
In a second aspect, an embodiment of the present invention further provides an automatic driving data recognition apparatus, including:
an original data acquisition module, which is used for acquiring original data collected during automatic driving;
the network determining module is used for respectively determining a student network and a teacher network which are suitable for automatic driving;
a distribution difference calculation module for calculating a difference between the student network and the teacher network, which predicts the original data, as a distribution difference;
and the original data identification module is used for identifying the original data which belong to the long tail data for the automatic driving according to the distribution difference and taking the original data as target data.
In a third aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:
one or more processors;
a memory for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the data recognition method for automatic driving according to the first aspect.
In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the data recognition method for automatic driving according to the first aspect.
In this embodiment, raw data collected during automatic driving is acquired, a student network and a teacher network suitable for automatic driving are respectively determined, the difference between the predictions of the student network and the teacher network on the raw data is calculated as a distribution difference, and the raw data belonging to long-tail data for automatic driving is identified according to the distribution difference as target data. The long-tail data is thus mined automatically from the difference in generalization capability between the student network and the teacher network on long-tail data, so the accuracy of mining can be ensured while manual mining is avoided, which greatly improves mining efficiency and greatly reduces cost.
Drawings
Fig. 1 is a flowchart of an automatic driving data recognition method according to an embodiment of the present invention;
fig. 2A is a schematic structural diagram of a vehicle according to an embodiment of the present invention;
fig. 2B is an exemplary diagram of a detection box according to an embodiment of the present invention;
FIG. 3 is a flowchart of a data recognition method for automatic driving according to a second embodiment of the present invention;
fig. 4 is a schematic structural diagram of an automatic driving data identification device according to a third embodiment of the present invention;
fig. 5 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of an automatic driving data recognition method according to an embodiment of the present invention, where the embodiment is applicable to a case where a teacher network and a student network are used to mine long-tailed data of automatic driving, and the method may be executed by an automatic driving data recognition device, where the automatic driving data recognition device may be implemented by software and/or hardware, and may be configured in a computer device, such as a server, a personal computer, or the like, and specifically includes the following steps:
step 101, acquiring raw data collected during automatic driving.
The vehicle in this embodiment may support automatic driving. So-called automatic driving refers to the vehicle's ability to sense the environment, plan a path, and autonomously implement vehicle control, that is, to simulate human driving by controlling the vehicle electronically.
Depending on the degree to which the vehicle handles the driving task, automated driving can be classified into L0 No Automation, L1 Driver Assistance, L2 Partial Automation, L3 Conditional Automation, L4 High Automation, and L5 Full Automation.
The automatically driven vehicle in this embodiment may refer to a vehicle satisfying any one of L1-L5. In L1-L3 the system only provides assistance, while from L4 onward driving is handed over to the system; therefore, the automatically driven vehicle is preferably a vehicle satisfying L4 or L5.
As shown in fig. 2A, the vehicle 200 may include a driving control apparatus 201, a vehicle body bus 202, an ECU (electronic control unit) 203, an ECU204, an ECU205, a sensor 206, a sensor 207, a sensor 208, an actuator 209, an actuator 210, and an actuator 211.
A driving control device (also referred to as an in-vehicle brain) 201 is responsible for the overall intelligent control of the vehicle 200. The driving control device 201 may be a separately provided controller, such as a CPU, a heterogeneous processor (e.g., GPU, TPU, NPU), a Programmable Logic Controller (PLC), a single-chip microcomputer, or an industrial controller; it may also consist of other electronic devices that have input/output ports and an operation control function, or be a computer device installed with a vehicle driving control application. The driving control device can analyze and process the data sent by each ECU and/or each sensor received from the vehicle body bus 202, make a corresponding decision, and send an instruction corresponding to the decision to the vehicle body bus.
The body bus 202 may be a bus for connecting the driving control apparatus 201, the ECU203, the ECU204, the ECU205, the sensor 206, the sensor 207, the sensor 208, and other not-shown apparatuses of the vehicle 200. Since the high performance and reliability of a CAN (controller area network) bus are widely recognized, a vehicle body bus commonly used in a motor vehicle is a CAN bus. Of course, it is understood that the body bus may be other types of buses.
The vehicle body bus 202 may send the instruction sent by the driving control device 201 to the ECU203, the ECU204, and the ECU205, and the ECU203, the ECU204, and the ECU205 may further analyze and process the instruction and send the instruction to the corresponding execution device for execution.
Sensors 206, 207, 208 include, but are not limited to, lidar, cameras, and the like.
It should be understood that the numbers of the vehicle, the driving control apparatus, the body bus, the ECU, the actuators, and the sensors in fig. 2A are merely illustrative. There may be any number of vehicles, driving control devices, body buses, ECUs, and sensors, as desired for implementation.
When the vehicle is automatically driven, the sensors can be used for collecting data, and the data is recorded as raw data, and the raw data is generally data which is not labeled.
Further, these raw data may include the state of the vehicle itself, such as speed, attitude, and the like, and may also include information of the external environment, such as image data, point cloud data, audio data, and the like.
Step 102, respectively determining a student network and a teacher network suitable for automatic driving.
The teacher network and the student network are models used in transfer learning (Transfer Learning), in which the capability of one model is transferred to another model. The teacher network is usually a more complex model with very good performance and generalization capability; it can serve as a soft target to guide a simpler student network to learn, so that the simpler student network, with fewer parameters and less computation, can achieve performance similar to the teacher network. This is a form of model compression.
In this embodiment, a model suitable for automatic driving is set as the student network on one hand, and another model suitable for automatic driving is set as the teacher network on the other hand, where the structures of the student network and the teacher network are independent and different from each other. The structures of the student network and the teacher network are not limited to manually designed neural networks; they may also be neural networks optimized by model quantization, neural networks searched by NAS (Neural Architecture Search) for the latency characteristics of target hardware, and the like, which is not limited in this embodiment.
In a specific implementation, one or more evaluation indexes used for evaluating the model in a specified dimension may be determined, and may be attributes of the structure of the model itself, or may be performances of the model, such as delay, accuracy, recall, volume, and the like.
The student network and the teacher network suitable for automatic driving are trained respectively for the same learning objective so as to distinguish the student network from the teacher network under evaluation indexes.
Among them, the learning target is one of functions of automatic driving, for example, detecting traffic lights, detecting lane lines, detecting vehicles, detecting pedestrians, and the like.
That is, the learning target of the student network is the same as the learning target of the teacher network, and the evaluation index of the student network is significantly different from the evaluation index of the teacher network.
It should be noted that models with different structures perform differently in different scenarios. To comprehensively improve the performance of the student network, a plurality of (two or more) teacher networks can be set for the same student network, where the structures of the teacher networks are independent and different from each other; teacher networks with different structures have different preferences for long-tail data, and richer long-tail data can be mined by exploiting these preference differences.
Illustratively, the evaluation indexes include delay and accuracy. In this example, a data set is determined, where the data set includes sample data labeled with tags, so that the student network and the teacher network can be trained in a supervised manner.
For a specified learning objective, a model suitable for automatic driving is trained using the data set as the student network.
For the same learning objective, another model suitable for automatic driving is trained using the same data set as the teacher network; training the student network and the teacher network on the same data set provides a uniform evaluation standard, so the two can be distinguished more accurately under the evaluation indexes.
The student network is generally a lightweight model while the teacher network is generally a large model; that is, the structure of the student network is simpler than that of the teacher network, so the delay of the student network is lower than that of the teacher network, and the student network can be deployed in a vehicle to assist decision making in real time during automatic driving. At the same time, the student network's learning and generalization capability on long-tail data is weaker, whereas the teacher network's is stronger; in addition, the accuracy of the student network is lower than that of the teacher network, and improving its learning and generalization capability on long-tail data depends on the teacher network.
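As an illustrative, non-limiting sketch of this setup (not taken from the patent): the PyTorch classes below, `StudentNet`, `TeacherNet`, and the `train_supervised` helper, are hypothetical placeholders showing how a lightweight student and a larger teacher might be trained on the same labeled data set for the same learning objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StudentNet(nn.Module):
    """Hypothetical lightweight student: low delay, deployable on-vehicle,
    weaker learning/generalization capability on long-tail data."""
    def __init__(self, num_classes=4):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.head(self.backbone(x))

class TeacherNet(nn.Module):
    """Hypothetical large teacher: higher accuracy and delay,
    stronger learning/generalization capability on long-tail data."""
    def __init__(self, num_classes=4):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(256, num_classes)

    def forward(self, x):
        return self.head(self.backbone(x))

def train_supervised(model, loader, epochs=10, lr=1e-3):
    """Supervised training on the same labeled data set for the same
    learning objective (e.g., traffic light classification)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            loss = F.cross_entropy(model(images), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```

Both networks would be trained with the same data loader built from the labeled data set, so their evaluation indexes (delay, accuracy) can be compared under a uniform standard; the only intended contrast is structural.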
Of course, the above-mentioned manner of training the student network and the teacher network is only an example, and when the embodiment of the present invention is implemented, other manners of training the student network and the teacher network may be set according to actual situations, for example, for a model for prescreening, a model with a more complex structure and a higher recall rate may be set as the teacher network, a model with a simpler structure and a lower recall rate may be set as the student network, and the like, which is not limited in this embodiment of the present invention. In addition, besides the above manner of training the student network and the teacher network, those skilled in the art may also adopt other manners of training the student network and the teacher network according to actual needs, and the embodiment of the present invention is not limited thereto.
And 103, calculating the difference of the original data between the student network and the teacher network as a distribution difference.
In this embodiment, the student network and the teacher network may be applied to an automatic driving scenario, the student network and the teacher network may be used to predict the original data according to the learning target, and calculate the difference between the two in prediction as a distribution difference, where the predicted distribution difference may reflect the difference between the student network and the teacher network in performance to a certain extent, especially the difference between the learning ability and the generalization ability of the long-tail data.
It should be noted that, in order to improve mining accuracy, a whole piece of raw data may be divided into multiple smaller pieces, for example evenly, by count, or at random, and the smaller pieces of raw data are then used to compute the distribution difference between the predictions of the student network and the teacher network.
In a specific implementation, in one aspect, raw data is input into a student network for processing to output a first detection result. On the other hand, the original data is input into the teacher network for processing so as to output a second detection result.
Considering that the performance (e.g., accuracy) of the teacher network is generally better than that of the student network, the second detection result is taken as the reference, that is, the second detection result is assumed to be accurate by default; the first detection result is compared with the second detection result, and the difference of the first detection result relative to the second detection result is calculated as the distribution difference.
Since models can be divided into classification and regression models, and the data structures of the first detection result and the second detection result differ accordingly, the way of calculating the distribution difference of the first detection result relative to the second detection result also differs; examples include KL divergence (relative entropy), JS (Jensen-Shannon) divergence, cross entropy, Wasserstein distance (earth mover's distance), and the like.
For classification problems in automatic driving, the first detection result includes a first probability for each of a plurality of classes suitable for automatic driving, and the second detection result includes a second probability for each of those classes. For the same learning objective, the first detection result is symmetric to the second detection result, that is, the classes in the first detection result are the same as the classes in the second detection result, so the distribution difference can be calculated using KL divergence and the like.
For example, for the problem of traffic light classification, the data structures of the first detection result and the second detection result are as follows (left: traffic light state; right: probability (or confidence)):
Red light on: xx%
Green light on: xx%
Yellow light on: xx%
Light off (black): xx%
Taking KL divergence as an example, for each class, the logarithm of the ratio between the second probability and the first probability is calculated, the product of the second probability and that logarithm is calculated as the difference of the class, and the sum of the differences corresponding to all classes is set as the difference of the first detection result relative to the second detection result, that is, the distribution difference, which is expressed as follows:
D_{KL}(p \| q) = \sum_{i=1}^{N} p(x_i) \log \frac{p(x_i)}{q(x_i)}
where p denotes the reference distribution (i.e., the second probabilities), q denotes the predicted distribution (i.e., the first probabilities), D_{KL}(p \| q) is the divergence value, that is, the distribution difference, x_i denotes the i-th of the N classes (N is a positive integer), q(x_i) is the first probability, and p(x_i) is the second probability.
For the KL divergence, the closer the distribution q is to p, the smaller the divergence value (distribution difference); conversely, the farther the distribution q is from p, the larger the divergence value (distribution difference). In addition, because the logarithmic function is convex, the KL divergence is non-negative.
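As a minimal sketch of the per-sample calculation above (the four traffic-light classes and the small epsilon for numerical stability are illustrative implementation choices, not details from the patent):

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) = sum_i p(x_i) * log(p(x_i) / q(x_i)).

    p: second probabilities (teacher network, used as the reference distribution)
    q: first probabilities  (student network, the predicted distribution)
    eps guards against log(0) and is an implementation choice."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Example with four assumed traffic-light classes: red, green, yellow, off.
teacher_probs = [0.90, 0.05, 0.03, 0.02]   # second detection result
student_probs = [0.40, 0.30, 0.20, 0.10]   # first detection result
distribution_difference = kl_divergence(teacher_probs, student_probs)
print(distribution_difference)  # a large value marks candidate long-tail data
```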
For detection problems (including regression and classification) in automatic driving, the first detection result includes a plurality of first detection frames suitable for automatic driving and, for each first detection frame, a first probability for each of a plurality of associated classes; the second detection result includes a plurality of second detection frames suitable for automatic driving and, for each second detection frame, a second probability for each of a plurality of associated classes. For the same learning objective, the first detection result is symmetric to the second detection result, that is, the number of classes in the first detection result equals the number of classes in the second detection result, so the distribution difference can be calculated using KL divergence and the like.
The first detection frame and the second detection frame are detection frames, and can be used for detecting a target object in automatic driving, such as a traffic light, a traffic sign, a vehicle, a pedestrian, an obstacle, and the like.
For example, for the problem of traffic light detection, the data structures of the first detection result and the second detection result are as follows (each detection frame gives a position and, for each class, a probability (or confidence)):
Detection frame 1:
Position: x, y, w, h
Category information:
Red light on: xx%
Green light on: xx%
Yellow light on: xx%
Light off (black): xx%
……
Detection frame n:
Position: x, y, w, h
Category information:
Red light on: xx%
Green light on: xx%
Yellow light on: xx%
Light off (black): xx%
The first detection frames and the second detection frames are matched, for example using a vertex-based mean square error or a detection-frame-based IoU (Intersection over Union), to form pairing frames.
The KL divergence may then be used to calculate the difference of each pairing frame. Specifically, for each class in each pairing frame, the logarithm of the ratio between the second probability and the first probability is calculated, the product of the second probability and the logarithm is calculated as the difference of the class, and the sum of the differences corresponding to all classes is calculated as the difference of the pairing frame.
The sum of the differences corresponding to all pairing frames is then set as the difference of the first detection result relative to the second detection result, i.e., the distribution difference.
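A sketch, under assumed data structures, of how pairing frames could be formed with a detection-frame-based IoU and their per-class KL differences accumulated into a distribution difference; the greedy matching strategy and the dictionary layout are assumptions rather than the patent's prescribed scheme, and `kl_divergence` repeats the helper from the previous sketch so the block is self-contained.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Per-class difference, same formula as in the previous sketch."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def box_iou(a, b):
    """Axis-aligned IoU of two boxes given as (x, y, w, h)."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def detection_distribution_difference(first_result, second_result, iou_thresh=0.5):
    """first_result / second_result: lists of {"box": (x, y, w, h), "probs": [...]}.
    Greedily pairs each second (teacher) detection frame with the best-overlapping
    first (student) frame and sums the KL divergence of the paired distributions."""
    total, used = 0.0, set()
    for second in second_result:
        best_i, best_iou = None, iou_thresh
        for i, first in enumerate(first_result):
            if i in used:
                continue
            iou = box_iou(first["box"], second["box"])
            if iou > best_iou:
                best_i, best_iou = i, iou
        if best_i is not None:
            used.add(best_i)
            total += kl_divergence(second["probs"], first_result[best_i]["probs"])
    return total
```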
For detection problems in automatic driving, the raw data includes at least point cloud data, which can detect target objects such as traffic lights, pedestrians, and vehicles either on its own or in cooperation with other data (such as image data). As shown in fig. 2B, the first detection result then includes a plurality of first detection frames representing target objects, and the second detection result includes a plurality of second detection frames representing target objects.
In a specific implementation, two-dimensional spatial indexes with an R-tree (RTree) structure may be established for each frame of the first detection result and each frame of the second detection result. Once the spatial indexes are built, all detection results (first detection results and second detection results) around an arbitrary location can be quickly retrieved by querying the coordinates (x, y) of that location. When there are many teacher networks and the detection area is large, the R-tree greatly accelerates retrieval of the detection results and reduces the computation cost of analysis and matching.
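A sketch of building such a two-dimensional spatial index with the Python `rtree` package (the choice of library is an assumption; the patent does not name one) and retrieving the detection frames around a query coordinate:

```python
from rtree import index  # pip install rtree (wraps libspatialindex)

def build_box_index(boxes):
    """boxes: list of (x, y, w, h) detection frames for one frame of data.
    Returns a 2D R-tree over their extents."""
    idx = index.Index()
    for i, (x, y, w, h) in enumerate(boxes):
        idx.insert(i, (x, y, x + w, y + h))  # (minx, miny, maxx, maxy)
    return idx

def boxes_around(idx, x, y):
    """IDs of all indexed detection frames whose extent contains the point (x, y)."""
    return list(idx.intersection((x, y, x, y)))
```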
As shown in fig. 2B, in the spatial index, a first detection frame and a second detection frame representing the same target object are searched for and used as a pairing frame. In general, the student network also has a certain detection accuracy, so the first detection frame does not deviate severely from the second detection frame; if a first detection frame and a second detection frame overlap, they can be determined as a pairing frame. Of course, if a first detection frame is isolated and does not overlap any second detection frame, it can be paired with a second detection frame representing an empty frame; likewise, if a second detection frame is isolated and does not overlap any first detection frame, it can be paired with a first detection frame representing an empty frame.
As shown in fig. 2B, an overlap ratio is calculated for the pairing frame in the spatial index. The overlap ratio is the ratio between a first number, namely the number of point cloud data points in the overlap region, and a total number, where the total number is obtained by subtracting the first number from the sum of a second number, the number of point cloud data points in the first detection frame, and a third number, the number of point cloud data points in the second detection frame; the overlap region is the region where the first detection frame and the second detection frame overlap. The overlap ratio can then be expressed as:
IOU=intersection_points(A,B)/(points(A)+points(B)-intersection_points(A,B))
where IOU is the overlap ratio, intersection_points(A, B) is the first number, points(A) is the second number, and points(B) is the third number.
If the student network fails to detect the target object, the overlap ratio IOU is defined to be 0.
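A sketch of the point-cloud-based overlap ratio defined by the formula above, assuming each detection frame can report the set of point-cloud point indices it contains (how those sets are obtained is left out):

```python
def point_iou(points_a, points_b):
    """Overlap ratio of a pairing frame:
    IOU = intersection_points(A, B) / (points(A) + points(B) - intersection_points(A, B))

    points_a: set of point indices inside the first (student) detection frame
    points_b: set of point indices inside the second (teacher) detection frame
    Returns 0 when either frame is empty, e.g. the student missed the target object."""
    if not points_a or not points_b:
        return 0.0
    inter = len(points_a & points_b)
    return inter / (len(points_a) + len(points_b) - inter)
```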
For the raw data collected by the autonomous vehicle, the raw data may be divided into time periods of a preset length; for example, if the length of the time period is set to 5 seconds, each time period includes 50 frames of raw data. Once the raw data of one time period is received, the distribution difference MeanPointIOU can be calculated and the long-tail data can be filtered.
Specifically, for a preset time period, the ratio between the sum of the overlap ratios of all target objects in all raw data within the time period and the total number of those target objects is calculated; this ratio represents the distribution difference for a single teacher network.
The root mean square of these ratios over all teacher networks can then be calculated as the distribution difference over the time period.
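A sketch of this per-time-segment aggregation, with illustrative variable names: the per-teacher MeanPointIOU is the mean overlap ratio over all target objects in the segment, and the values from several teacher networks are combined with a root mean square.

```python
import math

def mean_point_iou(overlap_ratios):
    """MeanPointIOU for one teacher network over one time period:
    sum of the per-object overlap ratios divided by the number of target objects."""
    return sum(overlap_ratios) / len(overlap_ratios) if overlap_ratios else 0.0

def segment_distribution_difference(per_teacher_ratios):
    """per_teacher_ratios: one list of per-object overlap ratios per teacher network.
    Returns the root mean square of the per-teacher MeanPointIOU values, used here
    as the distribution difference for the time period."""
    means = [mean_point_iou(r) for r in per_teacher_ratios]
    return math.sqrt(sum(m * m for m in means) / len(means))
```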
Of course, the manner of calculating the distribution difference is only an example, and when the embodiment of the present invention is implemented, other manners of calculating the distribution difference may be set according to actual situations, which is not limited in the embodiment of the present invention. In addition, besides the above manner of calculating the distribution difference, a person skilled in the art may also adopt other manners of calculating the distribution difference according to actual needs, and the embodiment of the present invention is not limited thereto.
And 104, identifying original data belonging to long tail data for automatic driving according to the distribution difference, and using the original data as target data.
For the same original data, the performance difference between the student network and the teacher network can be identified through the distribution difference, so that the original data belonging to the long tail data for the automatic driving is mined and recorded as target data.
If the distribution difference indicates that the performance of the student network is close to that of the teacher network, it can be determined that the raw data is more likely to be common data rather than long-tail data.
If the distribution difference indicates that the performance of the student network is significantly worse than that of the teacher network, it can be determined that the raw data is more likely to be long-tail data.
In one example, if the number of teacher networks is one, the distribution difference is compared with a preset threshold, and if the distribution difference is greater than the threshold, the distribution difference is large, and it may be determined that the original data belongs to long-tail data for automatic driving as target data.
In another example, if the number of the teacher networks is multiple, the multiple distribution differences are respectively compared with a preset threshold, and if any distribution difference is greater than the threshold, it is determined that the original data belongs to the long-tail data for the automatic driving as the target data.
In yet another example, if the number of teacher networks is plural, an average value between the plural distribution differences is calculated as an average difference, the average difference is compared with a preset threshold, and if the average difference is greater than the threshold, it is determined that the raw data belongs to the long-tail data for the automatic driving as the target data.
Further, the threshold in the above examples may be a fixed threshold, such as an empirical value, or a dynamic threshold; for example, if about 1% of the raw data is planned to be retained, a percentile of the distribution differences over the most recent time segments (e.g., the most recent 1000 segments) may be used as the threshold, and so on.
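A sketch of the threshold logic in the three examples above (single teacher, any-teacher-exceeds, and averaged difference), plus one possible form of the dynamic percentile threshold; the 1% retention fraction is the example value from the text, everything else is an assumption.

```python
def is_long_tail_single(diff, threshold):
    """One teacher network: flag the raw data when the distribution difference
    exceeds the threshold."""
    return diff > threshold

def is_long_tail_any(diffs, threshold):
    """Multiple teacher networks: flag when any per-teacher difference exceeds it."""
    return any(d > threshold for d in diffs)

def is_long_tail_mean(diffs, threshold):
    """Multiple teacher networks: flag when the average difference exceeds it."""
    return sum(diffs) / len(diffs) > threshold

def dynamic_threshold(recent_diffs, keep_fraction=0.01):
    """Dynamic threshold over the most recent time segments (e.g., the last 1000):
    pick the cut-off so that roughly keep_fraction of segments exceed it."""
    ranked = sorted(recent_diffs)
    cut = min(max(int(len(ranked) * (1.0 - keep_fraction)), 0), len(ranked) - 1)
    return ranked[cut]
```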
Of course, the above manner of mining the long-tail data is only an example, and when the embodiment of the present invention is implemented, other manners of mining the long-tail data may be set according to actual situations, which is not limited in this embodiment of the present invention. In addition, besides the above-mentioned manner of mining long-tail data, a person skilled in the art may also adopt other manners of mining long-tail data according to actual needs, and the embodiment of the present invention is also not limited thereto.
In this embodiment, raw data collected during automatic driving is acquired, a student network and a teacher network suitable for automatic driving are respectively determined, the difference between the predictions of the student network and the teacher network on the raw data is calculated as a distribution difference, and the raw data belonging to long-tail data for automatic driving is identified according to the distribution difference as target data. The long-tail data is thus mined automatically from the difference in generalization capability between the student network and the teacher network on long-tail data, so the accuracy of mining can be ensured while manual mining is avoided, which greatly improves mining efficiency and greatly reduces cost.
Example two
Fig. 3 is a flowchart of an automatic driving data recognition method according to a second embodiment of the present invention, where the second embodiment is based on the foregoing embodiment, and further includes an operation of training a student network using long-tailed data, where the method specifically includes the following steps:
step 301, raw data collected during automatic driving is acquired.
Step 302, respectively determining a student network and a teacher network suitable for automatic driving.
Step 303, calculating the difference of the raw data between the student network and the teacher network as the distribution difference.
And step 304, identifying original data belonging to long tail data for automatic driving according to the distribution difference as target data.
And step 305, labeling the target data with a label.
In this embodiment, the target data belonging to the long-tail data may be labeled according to the learning objective of the student network; for example, for traffic light detection, the labels are the detection frames and the categories of the signal lights.
Generally, to ensure the accuracy of the sample, the label may be manually labeled, that is, manually inputting the label for the target data through the client.
Of course, in consideration of the higher accuracy of the output of the teacher network (i.e., the second detection result), in order to increase the labeling speed, the output of the teacher network (i.e., the second detection result) may be used as a label, and so on, which is not limited in this embodiment.
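A sketch of that auto-labeling route, assuming a classification-style teacher such as the hypothetical PyTorch model sketched earlier; the softmax/argmax hard pseudo-labels are an implementation choice, not a detail from the patent.

```python
import torch

def pseudo_label_with_teacher(teacher, target_data_loader):
    """Use the teacher network's output (second detection result) as labels for
    the mined long-tail target data, instead of manual labeling."""
    teacher.eval()
    labeled = []
    with torch.no_grad():
        for data in target_data_loader:
            probs = torch.softmax(teacher(data), dim=-1)
            labels = probs.argmax(dim=-1)       # hard pseudo-labels per sample
            labeled.append((data, labels))
    return labeled
```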
And step 306, training the student network according to the target data and the labels.
Considering that the student network has a simpler structure and low delay and is suitable for being deployed on a vehicle and applied to automatic driving in real time, the student network can be trained, under the supervision of the labels, at least with the target data belonging to the long-tail data as sample data. This improves the student network's learning capability on long-tail data and its generalization capability, thereby improving the accuracy of automatic driving decisions and ensuring the safety of automatic driving.
In one training mode, the student network may be retrained.
In this training mode, the target data, as sample data, together with its labels, can be added to a preset training set, where the training set includes labeled sample data and was used to train the student network in advance.
Training a new student network adapted for automatic driving using the updated data set for a specified learning objective, wherein the new student network has the same structure as the original student network.
In another training approach, fine-tuning (fine-tuning) of the student network may be performed.
In the training mode, the student network can be used as a source network, and the weight of the source network is initialized to a new student network suitable for automatic driving, wherein the structure of the new student network is the same as that of the original student network.
The target data, as sample data, and its labels are set as a new training set, and the new student network is trained on this new training set for the specified learning objective.
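A sketch of this fine-tuning route, assuming PyTorch and a source student network such as the hypothetical `StudentNet` sketched earlier: the new student copies the source network's weights and is then trained on the newly labeled long-tail target data.

```python
import copy
import torch
import torch.nn.functional as F

def finetune_student(source_student, longtail_loader, epochs=5, lr=1e-4):
    """Initialize a new student network from the source network's weights (same
    structure) and fine-tune it on the labeled long-tail target data."""
    new_student = copy.deepcopy(source_student)   # same structure and weights
    opt = torch.optim.Adam(new_student.parameters(), lr=lr)
    new_student.train()
    for _ in range(epochs):
        for data, labels in longtail_loader:
            loss = F.cross_entropy(new_student(data), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return new_student
```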
Of course, the above manner of training the student network is only an example, and when the embodiment of the present invention is implemented, other manners of training the student network may be set according to actual situations, for example, the student network is trained by using a continuous learning framework, which is not limited in this embodiment of the present invention. In addition, besides the above manner of training the student network, a person skilled in the art may also adopt other manners of training the student network according to actual needs, and the embodiment of the present invention is not limited thereto.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
EXAMPLE III
Fig. 4 is a block diagram of a structure of an automatic driving data identification apparatus according to a third embodiment of the present invention, which may specifically include the following modules:
a raw data acquisition module 401, configured to acquire raw data acquired during automatic driving;
a network determining module 402, configured to determine a student network and a teacher network suitable for the automatic driving respectively;
a distribution difference calculation module 403, configured to calculate a difference between the student network and the teacher network, which predicts the raw data, as a distribution difference;
a raw data identification module 404, configured to identify, as target data, the raw data that belongs to long tail data for the automatic driving according to the distribution difference.
In one embodiment of the present invention, the network determining module 402 comprises:
the evaluation index determining module is used for determining one or more evaluation indexes;
and the network training module is used for respectively training the student network and the teacher network which are suitable for automatic driving aiming at the same learning target so as to distinguish the student network from the teacher network under the evaluation index.
In one embodiment of the present invention, the evaluation index includes delay, accuracy;
the network training module comprises:
the data set determining module is used for determining a data set, wherein the data set comprises sample data marked with labels;
a student network training module for training a student network suitable for said autonomous driving using said data set for a specified learning objective;
a teacher network training module for training a teacher network suitable for said autonomous driving using said data set for said learning objective;
wherein the delay of the student network is lower than the delay of the teacher network, and the accuracy of the student network is lower than the accuracy of the teacher network.
In one embodiment of the present invention, the distribution difference calculation module 403 includes:
the first detection result output module is used for inputting the original data into the student network for processing so as to output a first detection result;
the second detection result output module is used for inputting the original data into the teacher network for processing so as to output a second detection result;
and the relative difference calculating module is used for calculating the difference of the first detection result relative to the second detection result by taking the second detection result as a reference, and taking the difference as a distribution difference.
In one embodiment of the present invention, the first detection result comprises a first probability for each of a plurality of classes, and the second detection result comprises a second probability for each of the classes;
the relative difference calculation module includes:
a logarithm calculation module for calculating, for each of the classes, a logarithm of a ratio between the second probability and the first probability;
a category difference calculation module for calculating a product between the second probability and the logarithm as a difference of the category;
a category difference sum calculating module, configured to set a sum value of the differences corresponding to all the categories as a difference of the first detection result with respect to the second detection result, as a distribution difference.
In another embodiment of the present invention, the first detection result comprises a plurality of first detection boxes and, for each first detection box, a first probability for each of a plurality of associated categories, and the second detection result comprises a plurality of second detection boxes and, for each second detection box, a second probability for each of a plurality of associated categories;
the relative difference calculation module includes:
the matching frame matching module is used for matching the first detection frame with the second detection frame to serve as a matching frame;
a logarithm calculation module to calculate a logarithm of a ratio between the second probability and the first probability for each of the categories in each of the pair boxes;
a category difference calculation module for calculating a product between the second probability and the logarithm as a difference of the category;
a frame difference calculation module, configured to calculate a sum of the differences corresponding to all the categories as a difference of the paired frames;
a frame difference sum calculating module configured to set a sum value of the differences corresponding to all the paired frames as a difference of the first detection result with respect to the second detection result as a distribution difference.
In yet another embodiment of the present invention, the relative difference calculation module includes:
the spatial index establishing module is used for establishing a spatial index of an R tree structure for the first detection result and the second detection result;
the matching frame searching module is used for searching a first detection frame and a second detection frame which represent the same target object in the spatial index to be used as matching frames;
an overlap ratio calculation module, configured to calculate an overlap ratio for the pairing frame in the spatial index, where the overlap ratio is a ratio between a first number of point cloud data points in an overlap region and a total number, the total number is obtained by subtracting the first number from the sum of a second number of point cloud data points in the first detection frame and a third number of point cloud data points in the second detection frame, and the overlap region is the region where the first detection frame and the second detection frame overlap;
the ratio calculation module is used for calculating the ratio between the sum of all the overlapping rates and the total number of all the target objects in a preset time period;
and the square average calculation module is used for calculating the square average of the ratio as the distribution difference in the time period.
In one embodiment of the present invention, the raw data identification module 404 comprises:
the first difference comparison module is used for comparing the distribution difference with a preset threshold value if the number of the teacher networks is one;
the first long-tail data determining module is used for determining that the original data belongs to the long-tail data for the automatic driving as target data if the distribution difference is larger than the threshold;
alternatively,
the second difference comparison module is used for comparing the distribution differences with a preset threshold value if the number of the teacher networks is multiple;
a second long-tail data determination module, configured to determine that the original data belongs to long-tail data for the automatic driving as target data if any of the distribution differences is greater than the threshold;
alternatively,
an average difference calculation module, configured to calculate an average value between the plurality of distribution differences as an average difference if the number of the teacher networks is multiple;
the third difference comparison module is used for comparing the average difference with a preset threshold value;
and a third long-tail data determining module, configured to determine that the original data belongs to long-tail data for the automatic driving as target data if the average difference is greater than the threshold.
In one embodiment of the present invention, further comprising:
the label marking module is used for marking a label on the target data;
and the student network updating module is used for training the student network according to the target data and the label.
In one embodiment of the invention, the student network update module comprises:
a training set updating module, configured to update the target data and the label to a preset training set as sample data, where the training set is used for training the student network in advance;
a student network retraining module for training a new student network adapted for said autonomous driving using said updated data set for a specified learning objective;
alternatively,
a weight initialization module for initializing the weight of the source network to a new student network suitable for the automatic driving, with the student network as the source network;
a training set setting module, configured to set the target data as sample data and the label as a new training set;
and the student network fine-tuning module is used for training a new student network by using the new data set aiming at a specified learning target.
The automatic driving data identification device provided by the embodiment of the invention can execute the automatic driving data identification method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 5 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention. FIG. 5 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in FIG. 5 is only an example and should not bring any limitations to the functionality or scope of use of embodiments of the present invention.
As shown in FIG. 5, computer device 12 is in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 5, and commonly referred to as a "hard drive"). Although not shown in FIG. 5, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, computer device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing, such as implementing the data recognition method for automatic driving provided by the embodiment of the present invention, by running a program stored in the system memory 28.
EXAMPLE five
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the above-mentioned data identification method for automatic driving, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.
A computer readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (13)

1. A data identification method for automatic driving, comprising:
acquiring original data acquired during automatic driving;
respectively determining a student network and a teacher network suitable for the automatic driving;
calculating a difference between predictions of the student network and the teacher network on the original data, as a distribution difference;
and identifying, according to the distribution difference, original data that belongs to long-tail data for the automatic driving, as target data.
2. The method of claim 1, wherein the respectively determining a student network and a teacher network suitable for the automatic driving comprises:
determining one or more evaluation indexes;
and respectively training a student network and a teacher network suitable for the automatic driving for the same learning objective, so as to distinguish the student network and the teacher network under the evaluation indexes.
3. The method of claim 2, wherein the evaluation indexes include latency and accuracy;
the respectively training a student network and a teacher network suitable for the automatic driving for the same learning objective, so as to distinguish the student network and the teacher network under the evaluation indexes, comprises:
determining a data set, wherein the data set comprises sample data with labels;
training a student network adapted for the autonomous driving using the data set for a specified learning objective;
training a teacher network adapted for the autonomous driving using the data set for the learning objective;
wherein the delay of the student network is lower than the delay of the teacher network, and the accuracy of the student network is lower than the accuracy of the teacher network.
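For illustration only (not part of the claims), the following is a minimal sketch of the setup described in claim 3: a lightweight student network and a larger-capacity teacher network built for the same learning objective, where the smaller student has lower latency and typically lower accuracy. The layer sizes, the use of PyTorch, and the function name build_networks are assumptions made for this example.

```python
import torch.nn as nn

def build_networks(num_features: int, num_classes: int):
    """Toy student/teacher pair for the same learning objective.

    The student is deliberately small (lower latency, usually lower
    accuracy); the teacher is larger (higher latency, higher accuracy).
    Layer widths here are arbitrary illustrative choices.
    """
    student = nn.Sequential(
        nn.Linear(num_features, 64), nn.ReLU(),
        nn.Linear(64, num_classes),
    )
    teacher = nn.Sequential(
        nn.Linear(num_features, 512), nn.ReLU(),
        nn.Linear(512, 512), nn.ReLU(),
        nn.Linear(512, num_classes),
    )
    return student, teacher
```

Both networks would then be trained on the same labeled data set, so that only the evaluation indexes (latency, accuracy) distinguish them.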
4. The method of claim 1, wherein the calculating a difference between predictions of the student network and the teacher network on the original data as a distribution difference comprises:
inputting the original data into the student network for processing so as to output a first detection result;
inputting the original data into the teacher network for processing so as to output a second detection result;
calculating, with the second detection result as a reference, a difference of the first detection result relative to the second detection result, as the distribution difference.
5. The method of claim 4, wherein the first detection result comprises a plurality of categories and a first probability associated with each of the categories, and the second detection result comprises a plurality of the categories and a second probability associated with each of the categories;
the calculating, with the second detection result as a reference, a difference of the first detection result with respect to the second detection result as a distribution difference includes:
for each of the categories, calculating a logarithm of a ratio between the second probability and the first probability;
calculating a product between the second probability and the logarithm as a difference of the category;
setting a sum of the differences corresponding to all the categories as the difference of the first detection result relative to the second detection result, namely the distribution difference.
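As a concrete reading of claim 5 (illustrative only, not part of the claims), the per-sample distribution difference is the sum over categories of the second probability times the logarithm of the ratio of second to first probability, i.e. a KL-style divergence with the teacher prediction as reference. The function name distribution_difference, the NumPy dependency, and the epsilon smoothing are assumptions of this sketch.

```python
import numpy as np

def distribution_difference(student_probs, teacher_probs, eps=1e-12):
    """Sum over categories of p_teacher * log(p_teacher / p_student),
    taking the teacher (second detection result) as the reference."""
    q = np.asarray(student_probs, dtype=np.float64) + eps  # first probabilities
    p = np.asarray(teacher_probs, dtype=np.float64) + eps  # second probabilities
    return float(np.sum(p * np.log(p / q)))

# A frame where the teacher is confident about a category the student misses
# yields a large difference, flagging a long-tail candidate.
print(distribution_difference([0.1, 0.1, 0.8], [0.7, 0.2, 0.1]))
```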
6. The method of claim 4, wherein the first detection result comprises a plurality of first detection boxes, a plurality of categories associated with each of the first detection boxes, and a first probability associated with each of the categories, and the second detection result comprises a plurality of second detection boxes, a plurality of categories associated with each of the second detection boxes, and a second probability associated with each of the categories;
the calculating, with the second detection result as a reference, a difference of the first detection result with respect to the second detection result as a distribution difference includes:
matching each first detection box with a second detection box to form a pairing box;
calculating, for each of the categories in each of the pairing boxes, a logarithm of a ratio between the second probability and the first probability;
calculating a product between the second probability and the logarithm as a difference of the category;
calculating a sum of the differences corresponding to all the categories as the difference of the pairing box;
setting a sum of the differences corresponding to all the pairing boxes as the difference of the first detection result relative to the second detection result, namely the distribution difference.
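For claim 6, an illustrative sketch (not part of the claims) that accumulates the same per-category quantity over every matched pair of detection boxes; the input format, a list of (student_probs, teacher_probs) pairs already matched into pairing boxes, and the function name are assumptions.

```python
import numpy as np

def paired_box_difference(paired_boxes, eps=1e-12):
    """Distribution difference for detection results: for each matched
    (first, second) detection-box pair, sum p_teacher * log(p_teacher / p_student)
    over the categories, then sum these per-pair differences over all pairs."""
    total = 0.0
    for student_probs, teacher_probs in paired_boxes:
        q = np.asarray(student_probs, dtype=np.float64) + eps  # first probabilities
        p = np.asarray(teacher_probs, dtype=np.float64) + eps  # second probabilities
        total += float(np.sum(p * np.log(p / q)))
    return total
```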
7. The method according to claim 4, wherein the calculating, with the second detection result as a reference, a difference of the first detection result with respect to the second detection result as a distribution difference comprises:
establishing a spatial index with an R-tree structure for the first detection result and the second detection result;
searching, in the spatial index, for a first detection box and a second detection box that represent the same target object, to be used as a pairing box;
calculating an overlap ratio for the pairing boxes in the spatial index, wherein the overlap ratio is a ratio between a first number of point cloud data in an overlap region and a total number; the total number is obtained by subtracting the first number from a sum of a second number of point cloud data in a first non-overlap region and a third number of point cloud data in a second non-overlap region; the overlap region is a region where the first detection box and the second detection box overlap, the first non-overlap region is a region of the first detection box that does not overlap with the second detection box, and the second non-overlap region is a region of the second detection box that does not overlap with the first detection box;
calculating a ratio of a sum of all the overlap ratios to a total number of all the target objects within a preset time period;
calculating a squared average of the ratios as the distribution difference over the time period.
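Claim 7's point-cloud overlap ratio and its aggregation over a time period can be sketched as below (illustrative only, not part of the claims). The sketch follows the claim's wording literally; how detection boxes are matched via the R-tree spatial index is assumed to happen elsewhere, and the interpretation of the "squared average" as a mean of squared per-frame ratios is an assumption.

```python
def overlap_ratio(n_overlap, n_first_non_overlap, n_second_non_overlap):
    """Ratio between the number of points in the overlap region and the
    total number, where the total is (points in the first non-overlap
    region + points in the second non-overlap region) minus the points
    in the overlap region, per the claim's wording."""
    total = n_first_non_overlap + n_second_non_overlap - n_overlap
    return n_overlap / total if total > 0 else 0.0

def period_distribution_difference(ratios_per_frame, targets_per_frame):
    """For each frame in the time period, divide the sum of its overlap
    ratios by the number of target objects, then take the squared average
    of these per-frame values as the distribution difference."""
    per_frame = [
        (sum(rs) / n) if n else 0.0
        for rs, n in zip(ratios_per_frame, targets_per_frame)
    ]
    return sum(v * v for v in per_frame) / len(per_frame) if per_frame else 0.0
```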
8. The method according to claim 1, wherein the identifying, according to the distribution difference, original data that belongs to long-tail data for the automatic driving as target data comprises:
if the number of the teacher networks is one, comparing the distribution difference with a preset threshold value;
if the distribution difference is larger than the threshold value, determining that the original data belongs to long-tail data for the automatic driving and using the long-tail data as target data;
or,
if the number of the teacher networks is multiple, comparing the distribution differences with a preset threshold value respectively;
if any distribution difference is larger than the threshold value, determining that the original data belongs to long-tail data for the automatic driving and using the long-tail data as target data;
or,
if the number of the teacher networks is multiple, calculating an average value among the distribution differences as an average difference;
comparing the average difference with a preset threshold value;
and if the average difference is larger than the threshold value, determining that the original data belongs to long-tail data for the automatic driving and using the long-tail data as target data.
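The decision logic of claim 8, thresholding the distribution difference for one or several teacher networks, could look like the following sketch (illustrative only, not part of the claims); the function name and the strategy parameter are assumptions.

```python
def is_long_tail(distribution_differences, threshold, strategy="any"):
    """Decide whether a piece of original data is long-tail data.

    `distribution_differences` holds one value per teacher network.
    With a single teacher (or strategy="any"), the data is long-tail when
    any difference exceeds the threshold; with strategy="mean", the
    average difference is compared against the threshold instead.
    """
    diffs = list(distribution_differences)
    if strategy == "mean":
        return sum(diffs) / len(diffs) > threshold
    return any(d > threshold for d in diffs)

# Single teacher:               is_long_tail([1.29], threshold=0.5)        -> True
# Several teachers, averaged:   is_long_tail([0.2, 0.9], 0.5, "mean")      -> True
```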
9. The method according to any one of claims 1-8, further comprising:
labeling the target data with a label;
and training the student network according to the target data and the labels.
10. The method of claim 9, wherein the training the student network according to the target data and the labels comprises:
updating the target data, serving as sample data, and the label into a preset training set, wherein the training set was previously used for training the student network;
training a new student network suitable for the automatic driving using the updated training set for a specified learning objective;
or,
taking the student network as a source network, and initializing a new student network suitable for the automatic driving with the weights of the source network;
taking the target data, serving as sample data, and the label as a new training set;
training the new student network using the new training set for a specified learning objective.
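Claim 10 describes two retraining routes: merging the newly labeled target data into the original training set and training a new student, or initializing a new student from the current student's weights and training it on the new data alone. A minimal PyTorch sketch of both routes follows (illustrative only, not part of the claims); the batch size, optimizer, loss, and the assumption of classification-style labels are choices made for this example, and reusing the current student's weights in the merge route is a simplification of the sketch rather than a requirement of the claim.

```python
import copy
import torch
from torch.utils.data import ConcatDataset, DataLoader

def retrain_student(student, train_set, target_set, mode="merge",
                    epochs=10, lr=1e-3):
    """mode="merge": train on the original training set plus the newly
    labeled target data; mode="finetune": train on the target data alone,
    starting from the current student's weights (used as the source network)."""
    new_student = copy.deepcopy(student)  # weights initialized from the source network
    data = ConcatDataset([train_set, target_set]) if mode == "merge" else target_set
    loader = DataLoader(data, batch_size=32, shuffle=True)
    optimizer = torch.optim.Adam(new_student.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()  # assumed classification-style learning objective
    new_student.train()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(new_student(x), y)
            loss.backward()
            optimizer.step()
    return new_student
```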
11. A data identification apparatus for automatic driving, comprising:
an original data acquisition module, configured to acquire original data acquired during automatic driving;
a network determining module, configured to respectively determine a student network and a teacher network suitable for the automatic driving;
a distribution difference calculation module, configured to calculate a difference between predictions of the student network and the teacher network on the original data, as a distribution difference;
and an original data identification module, configured to identify, according to the distribution difference, original data that belongs to long-tail data for the automatic driving, as target data.
12. A computer device, characterized in that the computer device comprises:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data identification method for automatic driving according to any one of claims 1-10.
13. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the data identification method for automatic driving according to any one of claims 1 to 10.
CN202111009568.6A 2021-08-31 2021-08-31 Automatic driving data identification method and device, computer equipment and storage medium Pending CN113688760A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111009568.6A CN113688760A (en) 2021-08-31 2021-08-31 Automatic driving data identification method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113688760A true CN113688760A (en) 2021-11-23

Family

ID=78584252

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111009568.6A Pending CN113688760A (en) 2021-08-31 2021-08-31 Automatic driving data identification method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113688760A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102019213059A1 (en) * 2019-08-29 2021-03-04 Osram Gmbh Procedure and data processing system
CN111160474A (en) * 2019-12-30 2020-05-15 合肥工业大学 Image identification method based on deep course learning
CN111639524A (en) * 2020-04-20 2020-09-08 中山大学 Automatic driving image semantic segmentation optimization method
CN112801298A (en) * 2021-01-20 2021-05-14 北京百度网讯科技有限公司 Abnormal sample detection method, device, equipment and storage medium
CN113159073A (en) * 2021-04-23 2021-07-23 上海芯翌智能科技有限公司 Knowledge distillation method and device, storage medium and terminal

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114529768A (en) * 2022-02-18 2022-05-24 阿波罗智联(北京)科技有限公司 Method and device for determining object class, electronic equipment and storage medium
CN115830399A (en) * 2022-12-30 2023-03-21 广州沃芽科技有限公司 Classification model training method, apparatus, device, storage medium, and program product
CN115830399B (en) * 2022-12-30 2023-09-12 广州沃芽科技有限公司 Classification model training method, device, equipment, storage medium and program product
CN117208019A (en) * 2023-11-08 2023-12-12 北京理工大学前沿技术研究院 Longitudinal decision method and system under perceived occlusion based on value distribution reinforcement learning
CN117208019B (en) * 2023-11-08 2024-04-05 北京理工大学前沿技术研究院 Longitudinal decision method and system under perceived occlusion based on value distribution reinforcement learning

Similar Documents

Publication Publication Date Title
CN109598066B (en) Effect evaluation method, apparatus, device and storage medium for prediction module
US11531894B1 (en) Neural networks for object detection
US10803328B1 (en) Semantic and instance segmentation
US11037005B2 (en) Method and apparatus for identifying traffic light
CN113688760A (en) Automatic driving data identification method and device, computer equipment and storage medium
US11783568B2 (en) Object classification using extra-regional context
US20210213961A1 (en) Driving scene understanding
US10928828B2 (en) Detecting unfamiliar signs
CN109606384B (en) Vehicle control method, device, equipment and storage medium
US11095741B2 (en) Value-based transmission in an autonomous vehicle
CN115053237A (en) Vehicle intent prediction neural network
CN115675520A (en) Unmanned driving implementation method and device, computer equipment and storage medium
EP3674972A1 (en) Methods and systems for generating training data for neural network
CN109635868B (en) Method and device for determining obstacle type, electronic device and storage medium
CN114475656A (en) Travel track prediction method, travel track prediction device, electronic device, and storage medium
CN116686028A (en) Driving assistance method and related equipment
CN114693722B (en) Vehicle driving behavior detection method, detection device and detection equipment
CN113780480B (en) Method for constructing multi-target detection and category identification model based on YOLOv5
CN113298044B (en) Obstacle detection method, system, device and storage medium based on positioning compensation
CN114972731A (en) Traffic light detection and identification method and device, moving tool and storage medium
CN114155504A (en) Visual recognition vehicle method and device for automatic driving, travel device and medium
CN115195772A (en) Apparatus and method for predicting trajectory of surrounding vehicle
Yin et al. Towards perspective-free pavement distress detection via deep learning
US20230024799A1 (en) Method, system and computer program product for the automated locating of a vehicle
CN113963027B (en) Uncertainty detection model training method and device, and uncertainty detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination