WO2023045935A1 - Automatic iteration method for a target detection model, device, and storage medium - Google Patents

Automatic iteration method for a target detection model, device, and storage medium

Info

Publication number
WO2023045935A1
Authority
WO
WIPO (PCT)
Prior art keywords
target detection
vehicle
target
model
data
Prior art date
Application number
PCT/CN2022/120032
Other languages
English (en)
French (fr)
Inventor
张放
徐成
赵勍
刘涛
夏洋
李晓飞
王肖
张德兆
霍舒豪
Original Assignee
北京智行者科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 北京智行者科技股份有限公司
Publication of WO2023045935A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/10 Geometric CAD
    • G06F30/15 Vehicle, aircraft or watercraft design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/02 CAD in a network environment, e.g. collaborative CAD or distributed simulation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Definitions

  • the present application relates to the technical field of automatic driving, and in particular to an automatic iteration method, device and storage medium of a target detection model.
  • Autonomous driving long-tail scenarios refer to sudden, low-probability, and unpredictable scenarios, such as intersections with traffic signal failures, drunk-driving vehicles, and balloons in the middle of the road. How to deal with long-tail scenarios has always been a difficult problem in the industry, and has gradually become the key to restricting the development of autonomous driving. To solve these problems, the autonomous driving system needs to accumulate a large amount of data and continuously optimize the model.
  • The traditional model iteration and verification method adopts model iteration driven by functional testing: data collection is driven by requirements and problems, the data are then manually analyzed and labeled to design an optimization plan, finally forming a serial iterative process of labeling, development, and testing.
  • This method is effective for software function development: limited manpower can solve limited problems and realize functions within a specific range.
  • However, with the traditional model iteration and verification method it is difficult for autonomous driving to truly reach deployment, that is, for the entire industry to achieve safe operation at all times and under all working conditions.
  • First, the traditional problem-driven approach relies on a serial development model to optimize the model, so the development and testing cycle is long and cannot be carried out in parallel.
  • Second, manually labeling data takes a long time and labeling efficiency is low. Third, most tests verify the model through manually constructed typical scenarios or random tests, so the coverage of actual running scenarios is low.
  • The above aspects show that the problem-driven approach can no longer meet the need to solve a large number of problems in real scenarios, cannot automatically solve most of the problems, and cannot efficiently achieve the goal of autonomous driving.
  • the embodiments of the present application aim to solve at least one of the above technical problems.
  • In one aspect, an embodiment of the present application provides an automatic iteration method for a target detection model, including: using cloud computing resources to label, through a data-driven model, the data valuable for improving the performance of the vehicle-side target detection model, and using the labeling results to train the vehicle-side target detection model;
  • In another aspect, an embodiment of the present application provides an automatic iteration method for a vehicle-side target detection model, including: obtaining target detection results through inference of the vehicle-side target detection model; collecting, in cooperation with cloud computing resources, data valuable for improving the performance of the vehicle-side target detection model according to the target detection results; and, in cooperation with cloud computing resources, iterating the vehicle-side target detection model in use into a trained vehicle-side target detection model, wherein the trained vehicle-side target detection model is obtained by the cloud computing resources labeling, through a data-driven model, the data valuable for improving the performance of the vehicle-side target detection model, and training the vehicle-side target detection model with the labeling results.
  • the embodiment of the present application provides an automatic iteration method for a cloud object detection model, including:
  • the embodiment of the present application provides a vehicle-end execution device, including:
  • the vehicle-side calculation module is equipped with a vehicle-side target detection model, and the target detection result is obtained through inference of the vehicle-side target detection model;
  • the vehicle-side acquisition module is used to cooperate with the cloud execution device to collect, according to the target detection results, data valuable for improving the performance of the vehicle-side target detection model;
  • the cloud execution device labels the data valuable for improving the performance of the target detection model and trains the vehicle-side target detection model using the labeling results;
  • the vehicle-side computing module is also used to cooperate with the cloud execution device to iterate the configured vehicle-side target detection model into the trained vehicle-side target detection model.
  • the embodiment of the present application provides a cloud execution device, including:
  • the cloud acquisition module is used to cooperate with the vehicle-side execution device to collect data valuable for improving the performance of the vehicle-side target detection model, according to the target detection results obtained by the vehicle-side execution device through inference of the vehicle-side target detection model;
  • the automatic labeling module is used to label, through a data-driven model, the data valuable for improving the performance of the vehicle-side target detection model;
  • the training module is used to train the vehicle-side target detection model using the labeling results;
  • the iteration module is used to cooperate with the vehicle-side execution device to iterate the vehicle-side target detection model being used by the vehicle-side computing resources into the trained vehicle-side target detection model.
  • An embodiment of the present application provides an electronic device, including: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the steps of any one of the aforementioned automatic iteration methods for a vehicle-side target detection model.
  • the embodiment of the present application provides an automatic driving vehicle, including the aforementioned electronic device.
  • the embodiment of the present application provides a storage medium, on which a computer program is stored, and when the program is executed by a processor, the steps of the aforementioned automatic iterative method for the vehicle-end object detection model are implemented.
  • the embodiment of the present application provides an electronic device, including: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor , the instructions are executed by the at least one processor, so that the at least one processor can execute the steps of the aforementioned cloud-based object detection model automation iterative method.
  • an embodiment of the present invention provides a storage medium, on which a computer program is stored, and when the program is executed by a processor, the steps of the aforementioned cloud-based object detection model automation iterative method are implemented.
  • The automatic iteration method for a target detection model adopts a vehicle-side inference / cloud training mode: a multi-task, lightweight vehicle-side target detection model is deployed on the vehicle side, data valuable for improving the performance of the vehicle-side target detection model is collected automatically and in a targeted manner based on the target detection results, and the powerful computing power and data storage capacity of the cloud are then used to automatically complete, in real time, a series of operations such as labeling data set generation, model training, and model iteration. The vehicle-side inference / cloud training mode makes full use of the resource advantages of the cloud and improves the iteration efficiency of the target detection model on the autonomous driving vehicle.
  • the method provided by this application automatically collects valuable data for improving the performance of the vehicle-side target detection model in an environment with limited vehicle-to-cloud communication resources.
  • The automatic collection process is not only efficient but also covers rare, abnormal, and sudden long-tail scenarios, shields duplicate data and junk data, and ensures the validity, diversity, and integrity of the collected data, providing a sufficient, high-quality, diverse, effective, and reliable data basis for automatically completing model training and model iteration on the cloud.
  • the method provided by this application uses a single-task, deep-level data-driven model to automatically complete data labeling to obtain a labeling dataset.
  • This way of automatically generating the labeling data set greatly reduces manual labeling work, which clearly helps solve the problem of time-consuming and slow model iteration caused by low labeling efficiency.
  • Fig. 1 is a system architecture diagram of an automatic iteration system for a target detection model provided by an embodiment of the present application;
  • Fig. 2 is a schematic flowchart of an automatic iteration method for a target detection model provided by an embodiment of the present application;
  • Fig. 3 is a schematic diagram of a target detection result output by the vehicle-side target detection model based on image target detection;
  • Fig. 4 is an example of data valuable for the performance improvement of the vehicle-side target detection model collected by the vehicle-side acquisition module;
  • Fig. 5 is an example of the element types contained in a scene;
  • Fig. 6 is an example of the vehicle-side acquisition module collecting data valuable for the performance improvement of the vehicle-side target detection model;
  • Fig. 7 is an example of the target type library used by the data-driven model;
  • Fig. 8 is a schematic structural diagram of an autonomous vehicle;
  • Fig. 9 is a schematic structural diagram of a vehicle computing system;
  • Fig. 10 is a possible example of the autonomous vehicle and the vehicle-side execution device;
  • Fig. 11 is a schematic structural diagram of a cloud execution device;
  • Fig. 12 is another schematic flowchart of the automatic iteration method for a target detection model provided by an embodiment of the present application;
  • Fig. 13 is a schematic flowchart of the labeling process of the data-driven model;
  • Fig. 14 is a schematic flowchart of performing a consistency check on the target detection result, the local detection result, and the global detection result, and determining the target label according to the check result;
  • Fig. 15 is a schematic flowchart of training the vehicle-side target detection model using the labeling result.
  • Fig. 1 is a system architecture diagram of the automatic iteration system for a target detection model provided by an embodiment of the present application. The system includes: a vehicle-side execution device 10, a vehicle-side data storage unit 20, a cloud execution device 30, and a cloud database 40.
  • the vehicle-side execution device 10 includes: a vehicle-side collection module 10A, and a vehicle-side calculation module 10B.
  • the vehicle-side computing module 10B is configured with a vehicle-side model.
  • The vehicle-side execution device can be applied to a self-driving vehicle, wherein the self-driving vehicle is provided with at least one sensor, such as vehicle-mounted radar (e.g., millimeter-wave radar, infrared radar, laser radar, Doppler radar), light sensors, rainfall sensors, visual sensors (e.g., cameras, driving recorders), vehicle attitude sensors (e.g., gyroscopes), speed sensors (e.g., Doppler radar), inertial measurement units (IMUs), and the like.
  • The vehicle-side acquisition module has a data acquisition function and sends the collected data to a host computer for analysis and processing. It can be used to collect the analog or digital signals produced by the various sensors installed on the self-driving vehicle, to collect the results obtained by the vehicle-side computing module through inference of the vehicle-side model, and to collect vehicle status data, map data, driver operation data, and the like.
  • The vehicle-side acquisition module has a built-in data acquisition card (a computer expansion card that implements the data acquisition function), which can collect and send data through buses such as USB, PXI, PCI, PCI Express, FireWire (1394), PCMCIA, ISA, Compact Flash, RS-485, RS-232, Ethernet, and various wireless networks.
  • The vehicle-side acquisition module also has a data processing function, specifically cooperating with the cloud acquisition module to extract, from the collected data, data valuable for improving the performance of the vehicle-side model.
  • The vehicle-side data storage unit has a data storage function, and can be used to store the signals collected by the aforementioned sensors, the results of vehicle-side model inference, vehicle status data, map data, and driver operation data, as well as operating systems, applications, and the like.
  • The vehicle-side data storage unit can be realized using embedded multimedia cards (eMMC), single-level cell flash memory (SLC NAND), universal flash storage (UFS), solid-state drives (SSD), and the like.
  • the vehicle-end data storage unit may be set in the vehicle-end execution device, or may be an external device other than the vehicle-end execution device.
  • the car-side model has reasoning functions and can be used to implement functions such as target detection, behavior prediction, and decision-making planning for autonomous vehicles.
  • The vehicle-side model may be a neural-network-type model or a non-neural-network-type model; the embodiments of the present application take a neural-network-type vehicle-side model as an example.
  • the "vehicle-side object detection model” 10C referred to in the embodiment of the present application is the vehicle-end model that realizes the object detection function.
  • the "target detection result” referred to in the embodiment of the present application is the result of inference of the vehicle-end target detection model.
  • The vehicle-side computing module obtains sensor data, vehicle status data, map data, driver operation data, etc., uses these data as input to the vehicle-side model, and performs inference with the vehicle-side model to realize functions such as target detection, behavior prediction, and decision planning for the autonomous vehicle.
  • the cloud execution device 30 includes: a cloud collection module 30A, an automatic labeling module 30B, a training module 30C, and an iteration module 30D.
  • the cloud execution device can be implemented by a cloud server.
  • the data transmission between the vehicle-side execution device and the cloud execution device is realized through a communication interface, which can use vehicle wireless communication technology V2X, vehicle Ethernet, 3G/4G/5G mobile communication technology, etc. for communication.
  • the cloud acquisition module has the function of data acquisition, and sends the collected data to the host computer for analysis and processing. There is a data transmission relationship between the cloud acquisition module and the vehicle-end acquisition module, and the cloud-end acquisition module obtains data from the vehicle-end acquisition module according to requirements.
  • the cloud acquisition module has a built-in data acquisition card, which can collect and send data through buses such as USB, PXI, PCI, PCI Express, FireWire (1394), PCMCIA, ISA, Compact Flash, 485, 232, Ethernet, and various wireless networks.
  • the cloud acquisition module also has a data processing function, specifically to cooperate with the vehicle-end acquisition module to collect valuable data for improving the performance of the vehicle-end model.
  • the cloud database has a data storage function, which can be realized by using cloud storage technology, cloud database technology, etc.
  • the automatic labeling module has the function of data processing, which can realize the function of data labeling.
  • the training module uses the labeling results obtained by the automatic labeling module to train the car-end model.
  • the iterative module uses the vehicle-side model trained by the training module to iteratively update the vehicle-side model being used by the vehicle-side execution device.
  • vehicle-side computing resources include but are not limited to vehicle-side execution devices, vehicle-side data storage units, and may also include other computing resources set up on self-driving vehicles.
  • cloud computing resources include but are not limited to cloud execution devices and cloud databases, and may also include other resources based on cloud computing technologies.
  • FIG. 2 is a schematic flowchart of an automatic iteration method for a target detection model provided in an embodiment of the present application.
  • the automatic iteration method for a target detection model provided in an embodiment of the present application may include:
  • The vehicle-side computing module inputs sensor data, vehicle status data, map data, driver operation data, etc. into the vehicle-side target detection model, and the vehicle-side target detection model performs inference based on its algorithm logic to realize the target detection function of the self-driving vehicle and obtain the target detection result.
  • the target detection result is stored in the vehicle-end data storage unit and acquired by the vehicle-end acquisition module.
  • the vehicle-side acquisition module can directly obtain the target detection result from the vehicle-side computing module, or can obtain the target detection result from the vehicle-side data storage unit.
  • the vehicle-end target detection model can achieve the purpose of detecting (recognizing) targets through image recognition technology.
  • the target detection result may include: an image-based target detection frame, target type, confidence level, and the like.
  • Figure 3 shows a target detection result output by the vehicle-side target detection model based on image target detection, where each white rectangle is a target detection frame, the words red, green, car, and sign next to the white rectangles are the target types, and the number next to each white rectangle is the confidence.
  • the vehicle-side target detection model can achieve the purpose of detecting (recognizing) targets by clustering the laser point cloud.
  • the target detection result may include: the target detection frame based on the laser point cloud, the target type, the confidence level, and the like.
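For concreteness, here is a minimal sketch of the per-target detection record implied by the text (detection frame, target type, confidence), written in Python; the field names and the source field are illustrative assumptions rather than the patent's data format.

```python
# Minimal sketch of the per-target detection record described above;
# field names are illustrative assumptions, not the patent's format.

from dataclasses import dataclass
from typing import Tuple

@dataclass
class Detection:
    bbox: Tuple[float, float, float, float]  # detection frame (x1, y1, x2, y2)
    target_type: str                         # e.g. "car", "sign", "red", "green"
    confidence: float                        # inference confidence in [0, 1]
    source: str                              # "image" or "lidar" point cloud

# e.g. one of the boxes illustrated in Fig. 3:
det = Detection(bbox=(120.0, 80.0, 260.0, 190.0),
                target_type="car", confidence=0.92, source="image")
```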
  • the vehicle-side object detection model can adopt a multi-task and lightweight network structure.
  • multi-task means that the network structure has the characteristic of sharing parameters among multiple tasks;
  • lightweight means that the network structure has the characteristics of satisfying computing efficiency and capability under the condition of limited storage space and power consumption constraints.
  • For an image-based model, multi-task means that the feature information of the image can be reused and the results required by multiple tasks can be obtained through one model inference, such as simultaneous detection of pedestrians, vehicles, and signal lights; lightweight means that the parameter order of magnitude can adapt to the limited computing power of the vehicle side and satisfy vehicle-side inference efficiency.
  • For a point-cloud-based model, multi-task means that the feature information of the point cloud can be reused and the results required by multiple tasks can be obtained through one model inference, such as simultaneous detection of pedestrians, vehicle types, and the dynamic and static properties of obstacles; lightweight likewise adapts to the limited computing power of the vehicle side and satisfies vehicle-side inference efficiency.
  • vehicle-end target detection model can also adopt a network structure with multi-dimensional characteristics, which can help to mine the internal relationship between multiple targets.
  • The vehicle-side acquisition module cooperates with the cloud acquisition module to collect, according to the target detection results, data valuable for improving the performance of the vehicle-side target detection model.
  • The model-training iteration method commonly used in the field of autonomous driving uses all results of model inference in subsequent model training: whether the inference effect on a scene is good enough or not, the inference results are uniformly used for training. This kind of undifferentiated training cannot achieve the training purpose quickly and in a targeted manner; it can adapt to common typical scenarios, but for rare, sudden, and abnormal long-tail scenarios this model iteration method is difficult to adapt.
  • In contrast, this application collects data valuable for improving the performance of the vehicle-side target detection model, and then uses these data to train and iterate the vehicle-side target detection model. In this way, valuable data can be extracted in a targeted manner according to the training purpose, so as to achieve the training goal quickly and effectively.
  • Data valuable for improving the performance of the vehicle-side target detection model include not only the target detection results themselves, but also spatio-temporal synchronization information such as environmental data, map data, vehicle body status data, and driver operation data, which, combined with the target detection results, can fully reflect the scene in which the autonomous vehicle is located and are therefore more meaningful for training the model.
  • environmental data may include: static environment (fixed obstacles, building facilities, traffic facilities, roads), dynamic environment (dynamic traffic lights, traffic police), communication environment (signal strength, signal delay time, electromagnetic interference strength), traffic participants (pedestrians, motor vehicles, non-motor vehicles, animals), meteorological environment (temperature, humidity, light conditions, weather conditions), etc.;
  • environmental data may also include data collected by sensors such as visual sensors, lidar, millimeter-wave radar, and ultrasonic radar, for example images and laser point clouds;
  • map data may include: high-precision maps, traffic control information, navigation information, etc.;
  • vehicle state data may include: basic attributes of the vehicle (such as body weight, geometric dimensions, basic performance), vehicle position (coordinates, lane position), motion state (lateral motion state, longitudinal motion state), human-computer interaction (entertainment, driving tasks), etc.;
  • driver operation data may include: whether the driver takes over the vehicle, the driver's specific actions, etc.
  • Collecting data valuable for improving the performance of the vehicle-side target detection model can include the following situations.
  • In the first case, the purpose of training is to make the vehicle-side target detection model cover (adapt to) as many scenes as possible.
  • The vehicle-side acquisition module uses the inference results of the vehicle-side target detection model and their spatio-temporal synchronization information to construct a scene and uploads it to the cloud.
  • If the scene is missing from the cloud scene library, the target detection results and their spatio-temporal synchronization information are collected as valuable data for improving the performance of the vehicle-side target detection model.
  • A scene refers to the overall dynamic description of the comprehensive interaction process, within a certain temporal and spatial range, between the autonomous vehicle and other elements of the driving environment such as other vehicles, roads, traffic facilities, and weather conditions.
  • As an organic combination of the driving scenario and the driving environment, a scene includes not only the various entity elements but also the actions performed by the entities and the connection relationships between them.
  • Fig. 5 shows an embodiment of element types included in a scene.
  • The cloud acquisition module compares the scene uploaded by the vehicle-side acquisition module with the scene library. If the scene does not exist in the scene library, the vehicle-side target detection model cannot yet cover (adapt to) this scene, and the scene needs to be added to the scene library. The cloud acquisition module then issues a command; after receiving the command, the vehicle-side acquisition module collects the target detection results corresponding to this scene and their spatio-temporal synchronization information as valuable data for improving the performance of the vehicle-side target detection model.
  • When the cloud acquisition module compares the scene uploaded by the vehicle-side acquisition module with the scene library, the scene is counted as missing from the scene library if, for example, the scene library does not cover the category corresponding to the scene.
  • For example, the road types in the scene library cover three categories (urban roads, expressways, and park roads), while the scene category uploaded by the vehicle-side acquisition module is rural roads; at this point it can be determined that the scene is missing from the scene library.
  • The vehicle-side acquisition module can also encode the scene and upload the scene code to the cloud.
  • In addition to storing the scene library in the cloud database, the cloud acquisition module can store the code library corresponding to the scene library (containing the scene code corresponding to each scene in the scene library). The cloud acquisition module compares the scene code uploaded by the vehicle-side acquisition module with the code library; when the scene code is absent, it determines that the vehicle-side target detection model cannot cover (adapt to) this scene and that the scene needs to be added to the scene library. The cloud acquisition module then issues a command, and after receiving the command the vehicle-side acquisition module collects the corresponding target detection results and their spatio-temporal synchronization information as valuable data for improving the performance of the vehicle-side target detection model.
  • The vehicle-side acquisition module should encode the scene according to a predetermined coding rule.
  • For example, the predetermined coding rule can encode according to scene elements, with the scene elements encoded in the order of their parent node elements; the number after # indicates the order of the current element among its parent node's child elements:
  • For example, if the scene contains a lateral motion state, then from left to right the code corresponding to the vehicle itself is 1, the code corresponding to the motion state is 3, and the code corresponding to the lateral motion state is 1, so the scene code contains the number 131;
  • a full scene code contains the corresponding data groups (for example, (232, 131)).
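The coding rule is only sketched in the text, but the idea of concatenating per-level element indices and checking the resulting code against a cloud-side code library can be illustrated as follows; the element tree and its indices here are assumptions for illustration, not the hierarchy of Fig. 5.

```python
# Minimal sketch of the scene-encoding idea described above. Each scene
# element is identified by its 1-based position among the children of
# its parent node, and the positions are concatenated into a code.

ELEMENT_TREE = {
    "ego_vehicle": 1,     # assumed: "vehicle itself" is child #1
    "motion_state": 3,    # assumed: child #3 of ego_vehicle
    "lateral_motion": 1,  # assumed: child #1 of motion_state
}

def encode_scene(element_path):
    """Concatenate per-level child indices, e.g. ['ego_vehicle',
    'motion_state', 'lateral_motion'] -> '131'."""
    return "".join(str(ELEMENT_TREE[name]) for name in element_path)

def scene_is_missing(scene_codes, cloud_code_library):
    """Cloud-side check: any code absent from the library marks the
    scene as one the vehicle-side model does not yet cover."""
    return any(code not in cloud_code_library for code in scene_codes)

# Usage: a scene containing a lateral-motion state encodes to '131';
# if the cloud code library lacks it, the vehicle is told to collect.
codes = [encode_scene(["ego_vehicle", "motion_state", "lateral_motion"])]
print(scene_is_missing(codes, cloud_code_library={"232"}))  # True
```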
  • In the second case, the purpose of training is to make the vehicle-side target detection model cover (adapt to) rare, sudden, and abnormal long-tail scenarios.
  • When the vehicle-side acquisition module detects that the target detection result and/or its spatio-temporal synchronization information does not belong to a conventional scene, it collects the target detection result and its spatio-temporal synchronization information as valuable data for improving the performance of the vehicle-side target detection model.
  • The conventional scene here refers to the ubiquitous, common traffic scenes of the physical world, such as vehicles running normally on the road, or traffic lights, traffic signs, lane lines, road shoulders, and other conventional traffic facilities appearing along the road.
  • The opposite is the long-tail scene: rare, sudden, abnormal traffic scenes that rarely or almost never appear in the physical world, for example vehicles driving on the sky/flower beds/buildings, or paint/buildings/large floating objects (such as balloons) suddenly appearing in the road.
  • Long-tail scenarios often mean a high risk factor and complex operation and processing; the various information corresponding to a long-tail scene is therefore valuable data for improving the performance of the vehicle-side target detection model.
  • When the target detection result and/or its spatio-temporal synchronization information does not belong to a conventional scene, the self-driving vehicle is in a rare, sudden, or abnormal long-tail scene, and the inference result and its spatio-temporal synchronization information need to be collected as valuable data for improving the performance of the vehicle-side target detection model.
  • For example, the inference result of the point-cloud-based target detection model is that, from a certain frame onward, the target vehicle drives on a building on the side of the road, and this situation lasts for multiple frames, while before this frame the detection result had always been that the target vehicle was driving on the road.
  • This inference result (the vehicle driving on a building) does not belong to a conventional scene (the vehicle driving on the road). It may be an inference error of the point-cloud-based target detection model, a lidar failure, or even the target vehicle actually driving on a flower bed.
  • These abnormal or rare scenes are all long-tail scenes that the vehicle-side target detection model needs to cover, so the inference results and their spatio-temporal synchronization information at this time need to be collected as valuable data for improving the performance of the vehicle-side target detection model and used for subsequent model training.
  • In another example, the vehicle-side target detection model performs inference on an intersection image collected by a vehicle-mounted camera, and the target detection result contains only the target detection frame, target type, and confidence corresponding to a single traffic light panel.
  • The inference result of the vehicle-side target detection model does not match the expected value, which shows that the inference result is abnormal: the inference effect of the vehicle-side target detection model on the current scene may not be good enough, and the model needs to be trained to adapt to the current scene. The target detection result and its spatio-temporal synchronization information are therefore collected as valuable data for improving the performance of the vehicle-side target detection model and used for subsequent model training.
  • In yet another example, the image-based target detection result shows that an obstacle is a dynamic obstacle, while the target detection results based on the laser point cloud and the millimeter-wave point cloud both show that the obstacle is a static obstacle. The results obtained by these three types of algorithm logic are checked for consistency.
  • The result of the check is that the consistency of the three is poor (the image-based result is a dynamic obstacle, while the laser-point-cloud-based and millimeter-wave-based results are static obstacles) and does not reach the predetermined lower consistency limit (for example, that the three must be completely consistent).
  • The inference result of the image-based target detection model may not be accurate enough, or the inference results of the target detection models based on the laser point cloud and/or the millimeter-wave point cloud may not be accurate enough. Either way, among the target detection models of these three types of algorithm logic, at least one model's inference effect on the current scene is not good enough and its inference ability in the current scene needs to be improved. The inference results and their spatio-temporal synchronization information at this time are therefore collected as valuable data for improving the performance of the vehicle-side target detection model and used for subsequent model training.
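The cross-sensor consistency trigger in this example can be sketched as follows, assuming simple "dynamic"/"static" labels per sensor and full agreement as the predetermined consistency lower limit; both are illustrative choices rather than the patent's exact rule.

```python
# Minimal sketch of the cross-sensor consistency trigger described above.

def should_collect(image_label, lidar_label, radar_label):
    """Return True when the per-sensor motion labels fail the
    predetermined consistency lower limit (here: full agreement),
    i.e. the sample is worth uploading for retraining."""
    return len({image_label, lidar_label, radar_label}) > 1

# The disagreement from the example above triggers collection:
print(should_collect("dynamic", "static", "static"))  # True -> collect
```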
  • In the third case, the purpose of training is to make the vehicle-side target detection model continue to perform well in scenes where its inference effect is already very good.
  • For example, the vehicle-side target detection model performs inference on an intersection image collected by the vehicle-mounted camera, and the target detection result contains 3 traffic light panels. The matching degree with the expected value has reached a good level (for example, a predetermined matching threshold), which shows that the inference result of the vehicle-side target detection model for the current scene is very good. The vehicle-side target detection model needs to keep this good inference ability, so the target detection results and their spatio-temporal synchronization information at this time also need to be collected as valuable data for improving the performance of the vehicle-side target detection model for subsequent model training.
  • In another example, the image-based target detection result, the laser-point-cloud-based target detection result, and the millimeter-wave-point-cloud-based target detection result all show that an obstacle is a static obstacle. The results obtained by these three types of algorithm logic are checked for consistency, and the check shows that the three are completely consistent (all static obstacles), reaching the predetermined upper consistency limit.
  • This shows that the inference effect of the target detection models based on these three types of algorithm logic is very good. The models need to keep this good inference ability, so the inference results and their spatio-temporal synchronization information at this time also need to be collected as valuable data for improving the performance of the vehicle-side target detection model for subsequent model training.
  • The embodiments provided by this application realize automatic collection of data beneficial to the performance improvement of the vehicle-side target detection model through the cooperation of vehicle-side computing resources and cloud computing resources.
  • This data collection method is not only fast but also targeted; even when communication resources between the vehicle and the cloud are limited, useful data can be collected efficiently, providing an effective and reliable data basis for the subsequent training of the vehicle-side target detection model.
  • The automatic labeling module uses the data-driven model to label the valuable data.
  • Cloud computing resources not only have powerful computing capability but also face low real-time requirements; therefore, for the same target, detection in the cloud with a data-driven model can obtain more accurate results. These results can be used as labeled data to train the vehicle-side target detection model, achieving the purpose of training the model and improving its inference ability (making the inferred target detection results more accurate).
  • the "data-driven model” referred to in the embodiment of the present application refers to a data-driven model, such as a deep learning model, a traditional machine learning model, and the like.
  • the data-driven model is a traditional machine learning model, which can use support vector machine algorithm (SVM), Adaboost algorithm, logistic regression algorithm, hidden Markov algorithm, K nearest neighbor algorithm (KNN), three-layer artificial Any traditional machine learning algorithm such as neural network algorithm, Bayesian algorithm, decision tree algorithm, etc.
  • SVM support vector machine algorithm
  • Adaboost algorithm logistic regression algorithm
  • hidden Markov algorithm hidden Markov algorithm
  • KNN K nearest neighbor algorithm
  • three-layer artificial Any traditional machine learning algorithm such as neural network algorithm, Bayesian algorithm, decision tree algorithm, etc.
  • the above-mentioned traditional machine learning model (such as SVM or Adaboost) is calculated based on the artificially defined Histogram of Oriented Gradient (HOG) feature, which helps to achieve the purpose of labeling valuable data.
  • HOG Histogram of Oriented Gradient
  • The target type library used by the data-driven model should cover all target types that the vehicle-side target detection model focuses on, and should cover as many other target objects needing attention as possible.
  • Fig. 7 shows an embodiment of the object type library used by the data-driven model.
  • In some embodiments, the data-driven model is set as multiple deep learning models with a single-task, deep-level feature network structure.
  • Single-task means that a single model is used to perform only a single task; the models are independent of each other and do not share parameters. Using the single-task feature can maximize the recall and recognition accuracy for individual targets. Deep-level means that the model has multiple hidden layers, which can abstract the input features at multiple levels and better separate different types of data; using the deep features of the model improves target recall and recognition accuracy for individual targets under complex road conditions.
  • Multiple single-task, deep-level data-driven models can be set up according to the target types that need to be perceived in specific scenarios, and used respectively to detect pedestrians, motor vehicles, bicycles, traffic signs, traffic lights, crosswalks, etc.
  • The vehicle-side target detection model detects targets based on images/laser point clouds: it uses the images collected by the vehicle camera / the laser point clouds collected by the vehicle lidar to detect targets, and the output target detection results include, for each target, information such as the target detection frame, the target category, and the confidence. The vehicle-side acquisition module then judges whether the target detection result and its spatio-temporal synchronization information are data conducive to improving the vehicle-side target detection model.
  • If so, the target detection result and its spatio-temporal synchronization information are uploaded to the cloud, where the spatio-temporal synchronization information includes the image carrying the target detection result. Then, for the same target, the data-driven model detects the image carrying the target detection result in the valuable data and outputs a label, completing the annotation.
  • For the same target, the labels output by the data-driven model likewise include information such as the target detection frame and the target category.
  • The vehicle-side target detection model obtains multiple candidate target detection results for the same target and generally selects the one with the highest confidence as the final target detection result for output. To ensure the value of the data uploaded to the cloud, two confidence thresholds are set in the vehicle-side target detection model (a high confidence threshold α and a low confidence threshold β, with α > β): when the confidence of a candidate target detection result is greater than the high confidence threshold α, the target detection result is output as the final target detection result; when the confidence of a candidate target detection result is greater than the low confidence threshold β, the target detection result and its spatio-temporal synchronization information are collected as valuable data and uploaded to the cloud.
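The dual-threshold rule can be sketched as below; the numeric values of the high and low thresholds are illustrative assumptions (the text only requires α > β).

```python
# Minimal sketch of the dual-confidence-threshold routing described above.

ALPHA = 0.8  # assumed high confidence threshold (alpha)
BETA = 0.4   # assumed low confidence threshold (beta), beta < alpha

def route_detection(candidate):
    """Decide what to do with one candidate detection result:
    output it as a final detection, upload it as valuable data, or both."""
    output = candidate["confidence"] > ALPHA  # final detection output
    upload = candidate["confidence"] > BETA   # collected for the cloud
    return output, upload

# A low-confidence but plausible detection (0.55) is not output on the
# vehicle, yet is still uploaded so the cloud can label and learn from it.
print(route_detection({"confidence": 0.55}))  # (False, True)
```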
  • In some embodiments, a step 203' may also be included: combining the spatio-temporal synchronization information of a target detection result, when it is determined that the target detection result is an obviously wrong detection, the target detection result is deleted from the valuable data set.
  • For example, a target detection result shows a traffic light at an intersection, but the high-precision map data shows that there is no traffic light at that intersection; the target detection result is an obviously wrong detection and needs to be deleted from the valuable data set.
  • As another example, the target detection result inferred from a certain frame of image shows a charging pile at an intersection, but the target detection results inferred from multiple consecutive frames before and after this frame all show no charging pile at the intersection; with high probability this target detection result is also an obviously wrong detection and needs to be deleted from the valuable data set.
  • The above step of screening out wrong target detection results can be performed by the vehicle-side acquisition module on the vehicle side, which uploads a simplified valuable data set after screening and thereby saves communication resources between the cloud and the vehicle.
  • Alternatively, the above step of screening out wrong target detection results can be performed by the cloud acquisition module in the cloud, which can use the abundant computing resources in the cloud to obtain more accurate screening results while saving computing resources on the vehicle.
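A minimal sketch of this screening step (step 203'), assuming boolean map lookups and simple per-frame class sets; the real check would fuse the high-precision map and multi-frame history in a richer way.

```python
# Minimal sketch of the false-detection screening (step 203') described above.

def obviously_wrong(detected_class, map_has_class, neighbor_frames):
    """A detection contradicted by the HD map, or absent from all
    surrounding frames, is treated as an obvious false detection."""
    contradicts_map = not map_has_class
    absent_in_context = all(detected_class not in frame
                            for frame in neighbor_frames)
    return contradicts_map or absent_in_context

# The 'charging pile' seen in one frame but in none of its neighbors
# is dropped from the valuable data set before upload.
frames = [{"car"}, {"car"}, {"car", "pedestrian"}]
print(obviously_wrong("charging_pile", map_has_class=True,
                      neighbor_frames=frames))  # True -> delete
```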
  • In order to ensure the effectiveness of model training, improving the inference ability of the vehicle-side target detection model can be considered from two aspects: reducing false detections and reducing missed detections.
  • When the data-driven model labels the valuable data, it can perform local detection and global detection on the images or laser point clouds carrying target detection results, and then determine and output the target label according to the consistency of the target detection result, the local detection result, and the global detection result.
  • the target detection result is the inference result of the vehicle-side target detection model
  • the local detection result and the global detection result are the results of the data-driven model inference.
  • If the consistency level of the three detection results is high, the inference of the vehicle-side target detection model is consistent with that of the data-driven model, indicating that the vehicle-side target detection model already has a good inference effect on the current scene; improving the inference ability of the vehicle-side target detection model for such scenes can be regarded as a non-key training goal.
  • If the consistency level of the three detection results is not high enough, or is low, the inference of the vehicle-side target detection model is inconsistent with that of the data-driven model, indicating that the inference effect of the vehicle-side target detection model on the current scene is not good enough; improving the inference ability of the vehicle-side target detection model for such scenes must be taken as a key training goal.
  • To this end, a label type of difficulty level can be introduced into the final target label, where a greater difficulty level indicates that the vehicle-side target detection model has a better inference effect on the current scene; conversely, a smaller difficulty level indicates that the inference effect of the vehicle-side target detection model on the current scene is worse, and that subsequent model training needs to focus on strengthening the model's inference ability for such scenes.
  • Specifically, in the entire frame of image or entire frame of laser point cloud, the data-driven model takes the first target detection frame bbox1 as the center, expands it outward by a preset range to obtain a local detection area, detects the target within the local detection area, and outputs the local detection result: the second target category class2, the second target detection frame bbox2, and the second confidence score2; it likewise detects the target over the whole frame (global detection) and outputs the global detection result: the third target category class3, the third target detection frame bbox3, and the third confidence score3;
  • The data-driven model calculates the degree of overlap of the first target detection frame bbox1, the second target detection frame bbox2, and the third target detection frame bbox3 using intersection over union (IoU); when the calculated overlap reaches a predetermined overlap threshold, the subsequent operations proceed, otherwise the next labeling process is entered;
  • Specifically, the data-driven model can use IoU to calculate the overlap degree of any two (pairwise combinations) of the first target detection frame bbox1, the second target detection frame bbox2, and the third target detection frame bbox3, and continue the subsequent operation only when all the calculation results reach the predetermined overlap threshold;
  • the data-driven model compares the consistency of the first target category class1, the second target category class2, and the third target category class3:
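The pairwise-IoU overlap check and the category-consistency comparison can be sketched as follows; the 0.5 overlap threshold is an illustrative assumption.

```python
# Minimal sketch of the pairwise-IoU and category-consistency check
# described above.

from itertools import combinations

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def boxes_agree(bbox1, bbox2, bbox3, threshold=0.5):
    """All pairwise IoUs must reach the predetermined overlap threshold."""
    return all(iou(p, q) >= threshold
               for p, q in combinations([bbox1, bbox2, bbox3], 2))

def classes_agree(class1, class2, class3):
    """Category consistency of vehicle-side, local and global results."""
    return class1 == class2 == class3

# Usage: only when boxes_agree(...) holds does the labeling pipeline go
# on to compare class1/class2/class3 and emit the target label.
print(boxes_agree((0, 0, 10, 10), (1, 1, 11, 11), (0, 0, 10, 11)))  # True
```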
  • The data-driven model can also determine the image or laser point cloud carrying the target detection result, local detection result, and global detection result as difficult data and output it for manual labeling; the manually labeled target detection frame, target category, and difficulty level are then used, together with the target labels determined by the aforementioned data-driven model, as the labeling result to train the vehicle-side target detection model.
  • The difficulty level of such data may be marked as level zero, either manually or automatically by a computing device.
  • The automatic labeling module assembles all the target labels determined by the data-driven model for labeling the valuable data into a labeling data set (the labeling result) and sends it to the training module; the training module then uses the labeling data set (the labeling result) to train the vehicle-side target detection model.
  • Specifically, this step may be: the training module uses the image or laser point cloud carrying the target label as the labeling result to train the vehicle-side target detection model, including: S51, the training module modifies the parameters of the vehicle-side target detection model according to the fourth target category class4 and the fourth target detection frame bbox4; S52, the training module modifies the weight parameters of the loss function of the vehicle-side target detection model according to the difficulty level, where the lower the difficulty level, the larger the weight parameter of the modified loss function. This modification can improve the generalization ability of the vehicle-side target detection model on difficult data sets.
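Step S52 can be sketched as a difficulty-weighted loss, assuming PyTorch and an inverse linear mapping from difficulty level to weight; the exact weighting function is not specified in the text.

```python
# Minimal sketch of the difficulty-weighted loss of step S52.

import torch
import torch.nn.functional as F

def weighted_detection_loss(pred_logits, target_classes, difficulty,
                            max_level=10):
    """Per-sample cross-entropy scaled so that a LOWER difficulty level
    (worse vehicle-side inference on that scene) gets a LARGER weight."""
    per_sample = F.cross_entropy(pred_logits, target_classes,
                                 reduction="none")
    weights = (max_level + 1 - difficulty).float() / max_level
    return (weights * per_sample).mean()

# Usage: samples the vehicle-side model handled poorly (difficulty 1)
# dominate the gradient relative to easy ones (difficulty 10).
logits = torch.randn(4, 5)
targets = torch.tensor([0, 2, 1, 4])
difficulty = torch.tensor([1, 10, 5, 1])
print(weighted_detection_loss(logits, targets, difficulty))
```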
  • The iteration module determines the model parameters corresponding to the vehicle-side target detection model trained by the training module and sends the model parameters to the vehicle-side computing module, and the vehicle-side computing module iterates the vehicle-side target detection model in use into the trained vehicle-side target detection model.
  • In some embodiments, the iteration module can first test the trained vehicle-side target detection model; only when the test result meets the iteration requirement (indicating that the inference ability of the trained vehicle-side target detection model is significantly better than that of the vehicle-side target detection model in use) are the model parameters sent to the vehicle-side computing module, completing the iteration of the vehicle-side target detection model.
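The iteration gate can be sketched as follows; the metric (mAP) and margin are illustrative assumptions, since the text only requires that the trained model be significantly better than the model in use.

```python
# Minimal sketch of the deployment gate described above.

def should_deploy(trained_map, in_use_map, margin=0.02):
    """Ship the newly trained model only if it beats the model in use
    by a significant margin on the held-out test set."""
    return trained_map >= in_use_map + margin

if should_deploy(trained_map=0.71, in_use_map=0.66):
    # send the new parameters to the vehicle-side computing module
    print("iterate vehicle-side model to the trained version")
```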
  • Similarly, the training module can also optimize the data-driven model: the data-driven model is trained to improve its inference ability in those aspects where it is inferior to the vehicle-side target detection model. The iteration module then sends the model parameters corresponding to the trained data-driven model to the automatic labeling module, and the data-driven model in use is iterated into the trained data-driven model.
  • In some embodiments, the iteration module can also test the trained data-driven model; only when the test result meets the iteration requirement (indicating that the inference ability of the trained data-driven model is significantly better than that of the data-driven model in use) are the model parameters sent to the automatic labeling module, completing the iteration of the data-driven model.
  • The automatic iteration method for a target detection model provided by the embodiments of the present application first automatically collects data valuable for improving the performance of the vehicle-side target detection model, then uses the data-driven model to automatically label the valuable data, and finally uses the labeling result to train the vehicle-side target detection model and complete the iteration.
  • The method adopts the vehicle-side inference / cloud training mode: a multi-task, lightweight vehicle-side target detection model is deployed on the vehicle side, data valuable for improving the performance of the vehicle-side target detection model is collected automatically and in a targeted manner based on the target detection results, and the powerful computing power and data storage capacity of the cloud are then used to automatically complete, in real time, a series of operations such as labeling data set generation, model training, and model iteration. This vehicle-side inference / cloud training mode takes full advantage of the resources of the cloud and improves the iteration efficiency of the target detection model on the autonomous driving vehicle side.
  • The method automatically collects data valuable for improving the performance of the vehicle-side target detection model in an environment with limited vehicle-cloud communication resources. The automatic collection process is not only efficient but also covers rare, abnormal, and sudden long-tail scenarios, shields duplicate data and junk data, and ensures the validity, diversity, and integrity of the collected data, providing a sufficient, high-quality, diverse, effective, and reliable data basis for automatically completing model training and model iteration on the cloud.
  • The method uses single-task, deep-level data-driven models to automatically complete data labeling and obtain the labeling data set. Automatically generating the labeling data set greatly reduces manual labeling work, which clearly helps solve the problem of time-consuming and slow model iteration caused by low labeling efficiency.
  • Fig. 8 shows the structure of an automatic driving vehicle ADV provided according to an embodiment of the present application.
  • the autonomous vehicle ADV includes a power system V-110, a sensor system V-120, an actuation system V-130, a peripheral system V-140, and a vehicle computing system V-150.
  • In different embodiments, the autonomous vehicle ADV may include more, fewer, or different units, and each unit may include more, fewer, or different components.
  • The units and components shown in Figure 8 can also be combined or divided in any manner.
  • The power system V-110 can be configured to provide motive power to the vehicle.
  • Powertrain V-110 includes one or more of engine V-111, energy source V112, transmission V113, and wheels V114.
  • Engine V-111 can be any combination of internal combustion, electric motor, steam and Stirling engines, as well as other motors and engines.
  • powertrain V-110 may include multiple types of engines and/or motors.
  • a gas-electric hybrid may include a gasoline engine and an electric motor.
  • The energy source V112 may be any energy source that powers the engine V-111 in whole or in part.
  • Engine V-111 may be configured to convert energy source V112 into mechanical energy.
  • Energy sources V112 may include gasoline, diesel, propane, other compressed gas based fuels, ethanol, solar panels, batteries, and other sources of electrical power.
  • Energy source V112 may additionally or alternatively include any combination of fuel tanks, batteries, capacitors, and/or flywheels. In some possible designs, the energy source V112 can also provide energy for other units of the vehicle.
  • the transmission V113 may be configured to send mechanical power from the engine V-111 to the wheels V114.
  • the transmission V113 may include gearboxes, clutches, differentials, drive shafts, and/or other components.
  • the transmission V113 includes a drive shaft
  • the drive shaft may include one or more axles configured to couple to the wheels V114.
  • The wheels V114 can be configured in any form, including single-wheel, double-wheel, three-wheel, four-wheel, and six-wheel forms; other configurations, such as eight or more wheels, are also possible. In any event, the wheels V114 may be configured to rotate differentially relative to the other wheels V114. In some possible designs, the wheels V114 may comprise at least one wheel fixedly attached to the transmission V113 and at least one tire coupled to a rim of the wheel that may come into contact with the road surface. The wheels V114 may comprise any combination of metal and rubber, or other combinations of materials.
  • Powertrain V-110 may additionally or alternatively include other components in addition to the aforementioned components.
  • Sensor system V-120 may include external sensor V-121 and internal sensor V-122.
  • Exterior sensors V-121 may include a plurality of sensors configured to sense information about the environment in which the vehicle is located, and one or more actuators V1216 configured to modify the position and/or orientation of the sensors.
  • the external sensors V-121 may include one or more of a position sensor V1217, an inertial sensor V1211, an object sensor V1212, an image sensor V1213.
  • The position sensor V1217 can be any sensor that estimates the geographical location of the vehicle, for example, a global positioning system (GPS) positioning device, a carrier-phase differential (RTK) positioning device, a BeiDou satellite positioning system device, a GLONASS positioning system device, a Galileo positioning system device, or a global navigation satellite system (GNSS) positioning device.
  • The position sensor V1217 may include a transceiver that estimates the position of the vehicle relative to the earth.
  • the inertial sensor V1211 may be any combination of sensors configured to sense a change in position and orientation of the vehicle from inertial acceleration, such as an inertial measurement unit IMU.
  • Inertial sensors V1211 may include accelerometers and gyroscopes in some possible designs.
  • the object sensor V1212 can be any sensor that uses radio signals or laser signals to sense objects in the vehicle's environment, such as radar, laser range finder, lidar. In some possible designs, in addition to sensing objects, radar and lidar may additionally sense the speed and/or direction of travel of the object. In some possible designs, object sensor V1212 may include an emitter that emits a radio or laser signal and a detector that detects the radio or laser signal.
  • the image sensor V1213 may include any camera (such as a still camera, a video camera, etc.) for capturing images of the environment in which the vehicle is located.
  • The external sensor V-121 may also include other sensors, for example, any sensor for detecting the distance of an object, such as a sonar V1214 or an ultrasonic sensor V-1216.
  • the interior sensor V- 122 may include a plurality of sensors configured to detect information corresponding to a driving state of the vehicle.
  • interior sensors V-122 may include one or more of vehicle speed sensor V-1221, acceleration sensor V-1222, and yaw rate sensor V-1223.
  • the vehicle speed sensor V-1221 may be any sensor that detects the speed of the vehicle.
  • the acceleration sensor V-1222 may be any sensor that detects the acceleration of the vehicle.
  • the yaw rate sensor V-1223 may be any sensor that detects the yaw rate (rotational angular velocity) of the vehicle around the vertical axis of the center of gravity, for example, a gyro sensor.
  • In some possible designs, to detect driving operation information, the internal sensor V-122 may also include one or more of an accelerator pedal sensor V-1224, a brake pedal sensor V-1225, and a steering wheel sensor V-1226.
  • the accelerator pedal sensor V-1224 may be any sensor that detects the amount of depression of the accelerator pedal, and the accelerator pedal sensor V-1224 is provided, for example, on the shaft portion of the accelerator pedal of the vehicle.
  • the brake pedal sensor V-1225 may be any sensor that detects the depression amount of the brake pedal, and the brake pedal sensor V-1225 is provided, for example, on the shaft portion of the brake pedal.
  • the brake pedal sensor V-1225 can also detect the operating force of the brake pedal (depressing force on the brake pedal, pressure of the master cylinder, etc.).
  • The steering wheel sensor V-1226 can be any sensor that detects the rotation state of the steering wheel; the detected value of the rotation state is, for example, the steering torque or the steering angle. The steering wheel sensor V-1226 is, for example, installed on the steering shaft of the vehicle.
  • interior sensors V- 122 may also include other sensors, such as sensors that monitor various components inside the vehicle (eg, oxygen monitor, fuel gauge, engine oil temperature gauge, etc.).
  • Sensor system V-120 may be implemented as a plurality of sensor assemblies, each sensor assembly configured to be mounted at a corresponding location on the vehicle (e.g., top, bottom, front, rear, left, right, etc.).
  • Actuation system V- 130 may be configured to control the driving behavior of the vehicle.
  • the actuation system V- 130 may include one or more of a steering module V- 131 , a throttle module V- 132 , and a braking module V- 133 .
  • the steering module V-131 may be any combination of devices that controls the steering torque (or steering torque) of the vehicle.
  • the throttle module V-132 can be any combination of devices that can control the operating speed of the engine V-111 and control the speed of the vehicle by adjusting the air supply to the engine (throttle opening).
  • the braking module V-133 may be any combination of devices that slows the vehicle, for example, the braking module V-133 may use friction to slow the wheels V114.
  • Peripherals system V- 140 may be configured to enable the vehicle to interact with external sensors V- 121 , other vehicles, external computing devices, and/or a user.
  • peripheral system V-140 may include one or more of a wireless communication device V-141, a wired communication interface V-142, a touch screen display V-143, a microphone V-144, and a speaker V-145.
  • Wireless communication device V-141 may be configured to connect, directly or wirelessly, to one or more devices included in the powertrain system V-110, sensor system V-120, actuation system V-130, peripheral system V-140, and vehicle computing system V-150, and to connect directly or wirelessly to one or more of other vehicles, the central control system, and entities in the hub service area.
  • The wireless communication device V-141 may include an antenna and a chipset communicating based on wireless communication technology, where the wireless communication technology may include Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Time-Division Code Division Multiple Access (TD-SCDMA), Long Term Evolution (LTE), Bluetooth (BT), Global Navigation Satellite System (GNSS), Frequency Modulation (FM), Near Field Communication (NFC), and infrared (IR) technology.
  • GNSS may include the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the BeiDou Navigation Satellite System (BDS), the Quasi-Zenith Satellite System (QZSS), and/or Satellite Based Augmentation Systems (SBAS).
  • Wired communication interface V-142 may be configured to directly connect one or more devices included in the powertrain system V-110, sensor system V-120, actuation system V-130, peripheral system V-140, and vehicle computing system V-150, and to directly connect one or more of other vehicles, the central control system, and entities in the hub service area.
  • The wired communication interface V-142 can include an Inter-Integrated Circuit (I2C) interface, an Inter-Integrated Circuit Sound (I2S) interface, a Pulse Code Modulation (PCM) interface, a Universal Asynchronous Receiver/Transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a General-Purpose Input/Output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, etc.
  • the touchscreen display V-143 can be used by the user to enter commands into the vehicle.
  • the touchscreen display V- 143 may be configured to sense the position and/or movement of a user's finger through capacitive sensing, resistive sensing, or surface acoustic wave processing.
  • the touchscreen display V-143 is capable of sensing finger movement in a direction parallel or coplanar to the touchscreen surface, perpendicular to the touchscreen surface, or both, and is also capable of sensing the level of pressure applied to the touchscreen surface.
  • Touchscreen display V-143 may be formed from one or more translucent or transparent insulating layers and one or more translucent or transparent conductive layers.
  • the touchscreen display V-143 can also be configured in other forms.
  • Microphone V- 144 may be configured to receive sound signals (eg, voice commands or other audio input) and convert the sound signals to electrical signals.
  • Speaker V-145 can be configured to output audio.
  • Peripherals system V- 140 may further or alternatively include other components.
  • Vehicle computing system V-150 may include a processor V-151 and a data storage device V-152.
  • Processor V-151 may be configured to execute instructions stored in data storage device V-152 to perform various functions, including but not limited to the functions corresponding to the positioning fusion module V-1501, perception module V-1502, driving state determination module V-1503, navigation module V-1504, decision-making module V-1505, driving control module V-1506, and task receiving module V-1507 described below.
  • The processor V-151 can include a combination of one or more of general-purpose processors (such as a CPU or GPU), special-purpose processors (such as an application-specific integrated circuit (ASIC)), field-programmable gate arrays (FPGA), digital signal processors (DSP), integrated circuits, microcontrollers, and the like. Where the processor V-151 includes a plurality of processors, these processors can work individually or in combination.
  • Data storage V- 152 may include one or more volatile computer-readable storage media and/or one or more non-volatile computer-readable storage media, such as optical, magnetic, and/or organic storage media.
  • the data storage device V-152 may include read only memory (ROM), random access memory (RAM), flash memory, electrically programmable memory (EPROM), electrically programmable and erasable memory (EEPROM), embedded multimedia card (eMMC), hard drive, or any combination of volatile or non-volatile media, etc.
  • the data storage V-152 may be wholly or partially integrated with the processor V-151.
  • The data storage device V-152 may be configured to store instructions executable by the processor V-151 to perform various functions, including but not limited to the functions corresponding to the positioning fusion module V-1501, perception module V-1502, driving state determination module V-1503, navigation module V-1504, decision-making module V-1505, driving control module V-1506, and task receiving module V-1507 described below.
  • The positioning fusion module V-1501 can be configured to receive environment data, position data, or other types of data sensed by the sensor system V-120, and to obtain fused environment data and vehicle position data by performing timestamp alignment, fusion calculation, and other processing on these data.
  • the positioning fusion module V-1501 may include, for example, Kalman filter, Bayesian network, and algorithms for realizing other functions.
  • The perception module V-1502 may be configured to receive the fused environment data calculated by the positioning fusion module V-1501 and perform computer vision processing on it to identify objects and/or features in the vehicle's environment, including, for example, lane markings, pedestrians, other vehicles, traffic signals, and infrastructure.
  • the perception module V-1502 can use object recognition algorithm, structure from motion (SFM) algorithm, video tracking or other computer vision technology. In some possible designs, the perception module V-1502 may be further configured to map the environment, track objects, estimate the speed of objects, and the like.
  • the driving state determination module V-1503 identifies the driving state of the vehicle based on the data obtained by the internal sensor V-122 in the sensor system V-120, including vehicle speed, acceleration or yaw rate, for example.
  • the task receiving module V-1507 can be configured to receive the task, analyze the loading and unloading address, cargo category, loading and unloading time and other information contained in the task, and send this information to the navigation module V-1504.
  • the navigation module V-1504 can be configured as any unit that determines the driving route of the vehicle.
  • the navigation module V-1504 can be further configured to dynamically update the driving route during the operation of the vehicle.
  • The navigation module V-1504 can be configured to determine the driving route for the vehicle based on the processing results from the positioning fusion module V-1501, the positioning sensor, the object sensor V1212, and the task receiving module V-1507, together with one or more pieces of pre-stored high-precision map data.
  • The decision-making module V-1505 can be configured to generate waypoint information for the vehicle based on the driving route calculated by the navigation module V-1504, the vehicle position data calculated by the positioning fusion module V-1501, and the objects and/or features in the vehicle's environment identified by the perception module V-1502, where the waypoints in the waypoint information are the track points the vehicle moves through along the driving route.
  • the travel control module V-1506 can be configured to receive the waypoint information generated by the decision module V-1505, and control the actuation system V-130 according to the waypoint information, so that the vehicle travels according to the waypoint information.
  • Data storage device V-152 may also be configured to store other instructions, including instructions for sending data to one or more of the power system V-110, sensor system V-120, actuation system V-130, and/or peripherals system V-140, receiving data from them, interacting with them, and/or controlling them. For example, the data storage device V-152 may store instructions for controlling the operation of the transmission V113 to improve fuel efficiency, instructions for controlling the image sensor V1213 to capture images of the environment, instructions for generating a three-dimensional image of the vehicle's environment from the data sensed by the object sensor V1212, and instructions for recognizing the electrical signals converted by the microphone V-144 as voice commands.
  • In addition to storing instructions, the data storage device V-152 can also be configured to store various information, such as image processing parameters, training data, high-definition maps, and route information. While the vehicle operates in automatic, semi-automatic, or manual mode, this information can be used by one or more of the powertrain system V-110, sensor system V-120, actuation system V-130, peripheral equipment system V-140, and vehicle computing system V-150.
  • Vehicle computing system V-150 may be communicatively coupled to one or more of the powertrain system V-110, sensor system V-120, actuation system V-130, and peripheral system V-140 via a system bus, a network, and/or another connection mechanism.
  • The vehicle computing system V-150 can connect to the wireless communication device V-141 in the peripheral equipment system V-140 directly through a data line or wirelessly through wireless communication technology, and then connect wirelessly to the hub service area and/or the central control system through the wireless communication device V-141.
  • Vehicle computing system V- 150 may also be a plurality of computing devices that distribute control of individual components or individual systems of the vehicle.
  • Vehicle computing system V- 150 may additionally or alternatively include other components.
  • FIG. 8 presents a functional block diagram of the self-driving vehicle 100, and the vehicle computing system V-150 in the self-driving vehicle 100 is introduced below.
  • FIG. 9 is a schematic structural diagram of a vehicle computing system V-150 provided by an embodiment of the present application.
  • the vehicle computing system V-150 includes a processor E-100 coupled to a system bus E-000.
  • Processor E-100 may be any conventional processor, including a Reduced Instruction Set Computing (RISC) processor, a Complex Instruction Set Computing (CISC) processor, or a combination thereof.
  • processor E-100 may be a dedicated device such as an Application Specific Integrated Circuit (ASIC).
  • the processor E-100 may be one or more processors, wherein each processor may include one or more processor cores.
  • the system memory E-900 is coupled with the system bus E-000.
  • the data running in the system memory E-900 may include the operating system E-901 and application programs E-904 of the vehicle computing system V-150.
  • the operating system E-901 includes a shell (Shell) E-902 and a kernel (kernel) E-903.
  • the shell E-902 is an interface between the user and the kernel E-903 of the operating system, and is the outermost layer of the operating system.
  • Shell E-902 manages the interaction between the user and the operating system, waits for user input, interprets user input to the operating system, and processes various operating system output results.
  • The kernel E-903 consists of those parts of the operating system E-901 that manage memory, files, peripherals, and system resources. Interacting directly with the hardware, the operating system kernel usually runs processes, provides inter-process communication, and provides CPU time-slice management, interrupts, memory management, I/O management, and so on.
  • The application program E-904 includes automatic-driving-related programs E-905, such as a program for managing the interaction between the automatic driving vehicle 100 and obstacles on the road, a program for controlling the driving route or speed of the automatic driving device, and a program for controlling the interaction between the automatic driving vehicle 100 and other automatic driving devices on the road.
  • Application E-904 also exists on the system of the software deployment server. When the application E-904 needs to be executed, the vehicle computing system V-150 can download it from the software deployment server.
  • System bus E-000 is coupled to I/O bus E-300 via bus bridge E-200.
  • the I/O bus E-300 is coupled with the I/O interface E-400.
  • The I/O interface E-400 connects a USB interface E-500 and communicates with various I/O devices, such as input devices, media disks, transceivers, cameras, and sensors.
  • the input device is such as keyboard, mouse, touch screen, etc.
  • the media disk is such as CD-ROM, multimedia interface, etc.
  • the transceiver is used to send and/or receive radio communication signals
  • The camera is used to capture static scenery and dynamic digital video images; the sensors may be the various sensors included in the sensing system in Figure 8, used to detect the environment around the vehicle computing system V-150 and provide the sensed information to the vehicle computing system V-150.
  • the hard disk drive E-800 is coupled to the system bus E-000 via a hard disk drive interface.
  • the display adapter E-700 is coupled with the system bus E-000 to drive the display.
  • the vehicle computing system V-150 can communicate with the software deployment server through the network interface E-600.
  • the network interface E-600 is a hardware network interface, such as a network card.
  • the network may be an external network such as the Internet, an internal network such as Ethernet or a virtual private network (VPN), or a wireless network such as a WiFi network or a cellular network.
  • Vehicle computing system V-150 may include an on-board execution device that may include one or more first processors, one or more first memories, and The computer instructions to run.
  • the first processor executes the functions corresponding to the on-vehicle execution device in various embodiments provided by the present application.
  • The first processor can be configured as one or more general-purpose processors (such as a CPU or GPU), one or more special-purpose processors (such as an ASIC), one or more field-programmable gate arrays (FPGA), one or more digital signal processors (DSP), one or more integrated circuits, and/or one or more microcontrollers in the processor V-151.
  • The first memory may be configured as one or more read-only memories (ROM), one or more random-access memories (RAM), one or more flash memories, one or more electrically programmable memories (EPROM), one or more electrically programmable and erasable memories (EEPROM), one or more embedded multimedia cards (eMMC), and/or one or more hard drives in the data storage device V-152.
  • the functions corresponding to the vehicle-mounted execution device can be implemented as a computer program product, and when the computer program product is run on a computer, the functions corresponding to the vehicle-mounted execution device are realized.
  • the computer program product for realizing the corresponding function may be stored in the first memory.
  • FIG. 10 shows a possible example of an automatic driving vehicle 100 and an on-vehicle execution device 50: the automatic driving vehicle 100 is configured with the on-vehicle execution device 50, which includes a first processor 50A, a first memory 50B, and computer instructions stored on the first memory and executable on the first processor.
  • When running the computer instructions in the first memory, the first processor executes the method corresponding to the following steps: S91, obtain the target detection result through inference by the vehicle-side target detection model; S92, according to the target detection result, collect data valuable for improving the performance of the vehicle-side target detection model; S93, iterate the in-use vehicle-side target detection model into the trained vehicle-side target detection model; wherein the trained vehicle-side target detection model is obtained by the cloud execution device labeling, through a data-driven model, the data valuable for improving the performance of the vehicle-side target detection model and training the vehicle-side target detection model with the labeling results. A minimal sketch of this loop follows.
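For illustration only, the following Python sketch outlines how the S91-S93 loop could be organized on the first processor. The helper objects (sensors, uploader, model_store) and the is_valuable criterion are assumptions for the sketch and are not defined by the patent.

    # Hypothetical sketch of steps S91-S93; all helper objects are assumed.
    def is_valuable(detections, frame):
        # placeholder criterion: collect low-confidence results for the cloud
        return any(d["score"] < 0.8 for d in detections)

    def vehicle_side_loop(model, sensors, uploader, model_store):
        while True:
            frame = sensors.read()
            detections = model.infer(frame)            # S91: on-vehicle inference
            if is_valuable(detections, frame):         # S92: targeted collection
                uploader.send({"frame": frame, "detections": detections})
            params = model_store.poll_trained_model()  # S93: model iteration
            if params is not None:
                model.load_parameters(params)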
  • the specific implementation form of the on-vehicle execution device 50 may also be other electronic devices with similar memory and processor architectures.
  • the embodiment of the present application also provides a cloud execution device.
  • The cloud execution device 60 may include one or more second processors 60A, one or more second memories 60B, and computer instructions stored on the second memory and executable on the second processor.
  • When running the computer instructions in the second memory, the second processor performs the functions corresponding to the cloud execution device in the various embodiments provided by the present application.
  • the second processor can be configured as one or more general-purpose processors (such as CPU, GPU), one or more special-purpose processors (such as ASIC), one or more field programmable gate arrays (FPGA), one or multiple digital signal processors (DSPs), one or more integrated circuits, and/or, one or more microcontrollers, and the like.
  • the second memory may be configured as one or more read only memories (ROM), one or more random access memories (RAM), one or more flash memories, one or more electrically programmable memories (EPROM), One or more electrically programmable and erasable memories (EEPROM), one or more embedded multimedia cards (eMMC), and/or, one or more hard drives, etc.
  • the functions corresponding to the cloud execution device can be realized as a computer program product, and when the computer program product is run on a computer, the functions corresponding to the cloud execution device are realized.
  • the computer program product for realizing the corresponding function may be stored in the second memory.
  • FIG. 11 shows a possible example of a cloud execution device 60, including a second processor 60A, a second memory 60B, and computer instructions stored in the second memory and executable on the second processor.
  • When the second processor runs the computer instructions in the second memory, the method corresponding to the following steps is executed: S101, according to the target detection result, collect data valuable for improving the performance of the vehicle-side target detection model; S102, label, through the data-driven model, the data valuable for improving the performance of the vehicle-side target detection model, and use the labeling results to train the vehicle-side target detection model; S103, iterate the vehicle-side target detection model being used by the on-vehicle execution device into the trained vehicle-side model.
  • the specific implementation form of the cloud execution device 60 may also be other electronic devices with similar memory and processor architectures.
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • The division of the modules or units is only a logical function division; in actual implementation there may be other division methods.
  • Multiple units or components can be combined or integrated into another device, or some features may be omitted or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • A unit described as a separate component may or may not be physically separated, and a component displayed as a unit may be one physical unit or multiple physical units; that is, it may be located in one place or distributed across multiple different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • If the integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
  • The technical solution of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Abstract

A target detection model automatic iteration method, device, and storage medium. The method includes: using vehicle-side computing resources to obtain a target detection result through inference by a vehicle-side target detection model; according to the target detection result, collecting data valuable for improving the performance of the vehicle-side target detection model; using cloud computing resources to label, through a data-driven model, the data valuable for improving the performance of the vehicle-side target detection model, and training the vehicle-side target detection model with the labeling results; and iterating the vehicle-side target detection model being used by the vehicle-side computing resources into the trained vehicle-side target detection model. This application adopts a vehicle-side inference, cloud-side training mode, automatically collects valuable data in a targeted way, and completes data labeling automatically, making full use of the cloud's resource advantages and improving the iteration efficiency of the autonomous-driving vehicle-side target detection model.

Description

According to statistics, there are more than one billion vehicles in the world, yet on average an accident occurs every 30 seconds, which shows that traffic accidents are low-frequency events. To put autonomous driving into practice, these low-frequency problems must be solved, reaching at least the safe-driving level of today's human drivers, or even comprehensively surpassing human drivers.
Based on the above figures, fully testing the safety of an autonomous driving system requires at least hundreds of millions of kilometers of road testing, which means tens of thousands or even hundreds of thousands of vehicles running around the clock for hundreds of days. At the same time, the rate at which valid problem data is produced during testing is very low. These factors make model iteration and verification increasingly expensive.
Summary of the Invention
The traditional model iteration and verification approach drives model iteration by functional testing: on the development side, data collection is driven by requirements and problems, and the data is then manually analyzed and labeled to design an optimization plan; on the testing side, scenarios are built manually for testing, or random tests are run on real vehicles, finally forming a serial iteration flow of labeling, development, and testing. This approach is effective for software function development, where limited manpower solves a limited set of problems and implements functions within a specific scope.
However, the traditional model iteration and verification approach can hardly bring autonomous driving to real deployment and let the whole industry operate safely at all times and under all working conditions. First, the traditional problem-driven approach optimizes the model through a serial development mode, so development and testing cycles are long and cannot proceed in parallel. Second, manual data labeling is time-consuming and inefficient. Third, models are mostly verified through manually built typical scenarios or random tests, giving low coverage of real operating scenarios. These aspects show that the problem-driven approach can no longer meet the need to solve the massive number of problems arising in real scenarios, cannot automatically solve the vast majority of those problems, and cannot efficiently achieve the goal of deploying autonomous driving.
Against this background, developing a method for rapid model optimization and verification that effectively addresses practical problems such as long model iteration cycles and low verification efficiency has become a pressing technical problem in this field.
The embodiments of the present application aim to solve at least one of the above technical problems.
In a first aspect, an embodiment of the present application provides a target detection model automatic iteration method, including:
using vehicle-side computing resources to obtain a target detection result through inference by a vehicle-side target detection model;
collecting, according to the target detection result, data valuable for improving the performance of the vehicle-side target detection model;
using cloud computing resources to label, through a data-driven model, the data valuable for improving the performance of the vehicle-side target detection model, and training the vehicle-side target detection model with the labeling results;
iterating the vehicle-side target detection model being used by the vehicle-side computing resources into the trained vehicle-side target detection model.
In a second aspect, an embodiment of the present application provides a vehicle-side target detection model automatic iteration method, including:
obtaining a target detection result through inference by a vehicle-side target detection model;
in cooperation with cloud computing resources, collecting, according to the target detection result, data valuable for improving the performance of the vehicle-side target detection model;
in cooperation with cloud computing resources, iterating the vehicle-side target detection model in use into a trained vehicle-side target detection model, where the trained vehicle-side target detection model is a model obtained by the cloud computing resources labeling, through a data-driven model, the data valuable for improving the performance of the vehicle-side target detection model and training the vehicle-side target detection model with the labeling results.
In a third aspect, an embodiment of the present application provides a cloud-side target detection model automatic iteration method, including:
in cooperation with vehicle-side computing resources, collecting, according to the target detection result obtained by the vehicle-side computing resources through inference by the vehicle-side target detection model, data valuable for improving the performance of the vehicle-side target detection model;
labeling, through a data-driven model, the data valuable for improving the performance of the vehicle-side target detection model, and training the vehicle-side target detection model with the labeling results;
in cooperation with the vehicle-side computing resources, iterating the vehicle-side target detection model being used by the vehicle-side computing resources into the trained vehicle-side target detection model.
In a fourth aspect, an embodiment of the present application provides a vehicle-side execution device, including:
a vehicle-side computing module, configured with a vehicle-side target detection model, which obtains a target detection result through inference by the vehicle-side target detection model;
a vehicle-side collection module, configured to cooperate with a cloud execution device to collect, according to the target detection result, data valuable for improving the performance of the vehicle-side target detection model, where the cloud execution device labels, through a data-driven model, the data valuable for improving the performance of the vehicle-side target detection model and trains the vehicle-side target detection model with the labeling results;
the vehicle-side computing module is further configured to cooperate with the cloud execution device to iterate the configured vehicle-side target detection model into the trained vehicle-side target detection model.
In a fifth aspect, an embodiment of the present application provides a cloud execution device, including:
a cloud collection module, configured to cooperate with a vehicle-side execution device to collect, according to the target detection result obtained by the vehicle-side execution device through inference by the vehicle-side target detection model, data valuable for improving the performance of the vehicle-side target detection model;
an automatic labeling module, configured to label, through a data-driven model, the data valuable for improving the performance of the vehicle-side target detection model;
a training module, configured to train the vehicle-side target detection model with the labeling results;
an iteration module, configured to cooperate with the vehicle-side execution device to iterate the vehicle-side target detection model being used by the vehicle-side computing resources into the trained vehicle-side target detection model.
In a sixth aspect, an embodiment of the present application provides an electronic device, including at least one processor and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the steps of any one of the foregoing vehicle-side target detection model automatic iteration methods.
In a seventh aspect, an embodiment of the present application provides an autonomous vehicle including the foregoing electronic device.
In an eighth aspect, an embodiment of the present application provides a storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the foregoing vehicle-side target detection model automatic iteration method are implemented.
In a ninth aspect, an embodiment of the present application provides an electronic device, including at least one processor and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the steps of the foregoing cloud-side target detection model automatic iteration method.
In a tenth aspect, an embodiment of the present invention provides a storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the foregoing cloud-side target detection model automatic iteration method are implemented.
The target detection model automatic iteration method provided by this application adopts a vehicle-side inference, cloud-side training mode: a multi-task, lightweight vehicle-side target detection model is deployed on the vehicle, data valuable for improving the performance of the vehicle-side target detection model is collected automatically and purposefully based on the target detection results, and the cloud's powerful computing and data storage capabilities are then used to complete, in real time and automatically, a series of operations such as labeled dataset generation, model training, and model iteration. This vehicle-side inference, cloud-side training mode makes full use of the cloud's resource advantages and improves the iteration efficiency of the autonomous-driving vehicle-side target detection model.
In an environment with limited vehicle-to-cloud communication resources, the method provided by this application automatically collects data valuable for improving the performance of the vehicle-side target detection model. This automatic collection process is not only efficient but also covers rare, abnormal, and sudden long-tail scenarios, shields duplicate and junk data, and guarantees the validity, diversity, and integrity of the collected data, providing a sufficient, high-quality, diverse, effective, and reliable data basis for automatically completing model training and model iteration in the cloud.
The method provided by this application uses single-task, deep data-driven models to complete data labeling automatically and obtain a labeled dataset. This way of automatically generating the labeled dataset greatly reduces manual labeling work and clearly helps to solve the problem of long, slow model iteration caused by low labeling efficiency.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of this application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a system architecture diagram of the target detection model automatic iteration system provided by an embodiment of this application;
Fig. 2 is a schematic flowchart of the target detection model automatic iteration method provided by an embodiment of this application;
Fig. 3 is a schematic diagram of a target detection result output by a vehicle-side target detection model that detects targets from images;
Fig. 4 is an example of the vehicle-side collection module collecting data valuable for improving the performance of the vehicle-side target detection model;
Fig. 5 is an example of the element types contained in a scenario;
Fig. 6 is an example of the vehicle-side collection module collecting data valuable for improving the performance of the vehicle-side target detection model;
Fig. 7 is an example of the target type library used by the data-driven model;
Fig. 8 is a schematic structural diagram of an autonomous vehicle;
Fig. 9 is a schematic structural diagram of a vehicle computing system;
Fig. 10 is a possible example of an autonomous vehicle and an on-vehicle execution device;
Fig. 11 is a schematic structural diagram of a cloud execution device;
Fig. 12 is another schematic flowchart of the target detection model automatic iteration method provided by an embodiment of this application;
Fig. 13 is a schematic flowchart of the labeling process of the data-driven model;
Fig. 14 is a schematic flowchart of performing a consistency check on the target detection result, the local detection result, and the global detection result and determining the target label according to the check result;
Fig. 15 is a schematic flowchart of training the vehicle-side target detection model with the labeling results.
Detailed Description
The terms "first", "second", and so on in the specification, claims, and drawings of this application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that terms used in this way are interchangeable where appropriate; this is merely the way the embodiments of this application distinguish objects of the same attribute when describing them. In addition, the terms "including" and "having" and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, system, product, or device containing a series of units is not necessarily limited to those units but may include other units that are not clearly listed or that are inherent to the process, method, product, or device.
The embodiments of this application are described below with reference to the drawings. A person of ordinary skill in the art will appreciate that, as technology develops and new scenarios emerge, the technical solutions provided by the embodiments of this application are equally applicable to similar technical problems.
Before the target detection model automatic iteration method provided by the embodiments of this application is described in detail, the target detection model automatic iteration system provided by the embodiments is introduced with reference to Fig. 1. Please refer to Fig. 1, a system architecture diagram of the target detection model automatic iteration system provided by an embodiment of this application. In Fig. 1, the target detection model automatic iteration system 300 includes: a vehicle-side execution device 10, a vehicle-side data storage unit 20, a cloud execution device 30, and a cloud database 40.
The vehicle-side execution device 10 includes a vehicle-side collection module 10A and a vehicle-side computing module 10B. A vehicle-side model is configured in the vehicle-side computing module 10B.
The vehicle-side execution device can be applied in an autonomous vehicle equipped with at least one kind of sensor, such as on-board radar (e.g., millimeter-wave radar, infrared radar, lidar, Doppler radar), light sensors, rain sensors, vision sensors (e.g., cameras, driving recorders), vehicle attitude sensors (e.g., gyroscopes), speed sensors (e.g., Doppler radar), and inertial measurement units (IMU).
The vehicle-side collection module has a data collection function and sends the collected data to an upper computer for analysis and processing. It can collect the analog or digital signals produced by the various sensors on the autonomous vehicle, the inference results obtained by the vehicle-side computing module through the vehicle-side model, as well as vehicle state data, map data, and driver operation data. The vehicle-side collection module has a built-in data acquisition card (a computer expansion card implementing the data collection function) and can collect and send data over buses such as USB, PXI, PCI, PCI Express, FireWire (1394), PCMCIA, ISA, Compact Flash, 485, 232, Ethernet, and various wireless networks.
The vehicle-side collection module also has a data processing function; specifically, in cooperation with the cloud collection module, it extracts from the collected data the data valuable for improving the performance of the vehicle-side model.
The vehicle-side data storage unit has a data storage function and can store the signals collected by the aforementioned sensors, the inference results of the vehicle-side model, vehicle state data, map data, and driver operation data, as well as the operating system and application programs. In one possible design, the vehicle-side data storage unit can be implemented with an embedded multimedia card (eMMC), single-level-cell flash (SLC NAND), universal flash storage (UFS), a solid-state drive (SSD), and the like. In one possible design, the vehicle-side data storage unit may be arranged inside the vehicle-side execution device or be an external device outside it.
The vehicle-side model has an inference function and can implement target detection, behavior prediction, decision planning, and other functions for the autonomous vehicle. In one possible design, the vehicle-side model may be a neural-network model or a non-neural-network model; the embodiments of this application only take a neural-network vehicle-side model as an example. The "vehicle-side target detection model" 10C in the embodiments of this application is the vehicle-side model that implements the target detection function, and the "target detection result" is the inference result of the vehicle-side target detection model.
The vehicle-side computing module obtains sensor data, vehicle state data, map data, driver operation data, and the like, feeds these data into the vehicle-side model as input, and performs inference with the vehicle-side model to implement target detection, behavior prediction, decision planning, and other functions of the autonomous vehicle.
The cloud execution device 30 includes: a cloud collection module 30A, an automatic labeling module 30B, a training module 30C, and an iteration module 30D.
The cloud execution device can be implemented by a cloud server. Data transmission between the vehicle-side execution device and the cloud execution device is realized through a communication interface, which may communicate using vehicle wireless communication technology (V2X), automotive Ethernet, 3G/4G/5G mobile communication technology, and the like.
The cloud collection module has a data collection function and sends the collected data to an upper computer for analysis and processing. There is a data transmission relationship between the cloud collection module and the vehicle-side collection module, and the cloud collection module obtains data from the vehicle-side collection module on demand. The cloud collection module has a built-in data acquisition card and can collect and send data over buses such as USB, PXI, PCI, PCI Express, FireWire (1394), PCMCIA, ISA, Compact Flash, 485, 232, Ethernet, and various wireless networks.
The cloud collection module also has a data processing function; specifically, in cooperation with the vehicle-side collection module, it collects the data valuable for improving the performance of the vehicle-side model.
The cloud database has a data storage function and can be implemented with cloud storage technology, cloud database technology, and the like.
The automatic labeling module has a data processing function and can implement data labeling.
The training module trains the vehicle-side model with the labeling results produced by the automatic labeling module.
The iteration module uses the vehicle-side model trained by the training module to iteratively update the vehicle-side model being used by the vehicle-side execution device.
The "vehicle-side computing resources" in the embodiments of this application include, but are not limited to, the vehicle-side execution device and the vehicle-side data storage unit, and may also include other computing resources arranged on the autonomous vehicle. The "cloud computing resources" in the embodiments of this application include, but are not limited to, the cloud execution device and the cloud database, and may also include other resources based on cloud computing technology.
With the above description in mind, the specific implementation flow of the target detection model automatic iteration method provided by the embodiments of this application is described below.
Specifically, please refer to Fig. 2, a schematic flowchart of the target detection model automatic iteration method provided by an embodiment of this application. The method may include:
201: Use vehicle-side computing resources to obtain a target detection result through inference by the vehicle-side target detection model.
Specifically, the vehicle-side computing module feeds sensor data, vehicle state data, map data, driver operation data, and the like into the vehicle-side target detection model, which then performs inference based on the model's algorithmic logic to implement the target detection function of the autonomous vehicle and produce the target detection result.
Afterwards, the target detection result is stored in the vehicle-side data storage unit and obtained by the vehicle-side collection module. In one possible design, the vehicle-side collection module may obtain the target detection result directly from the vehicle-side computing module or from the vehicle-side data storage unit.
In one possible design, the vehicle-side target detection model can detect (recognize) targets through image recognition technology. Correspondingly, the target detection result may include image-based target detection boxes, target types, confidences, and so on. Fig. 3 shows a target detection result output by a vehicle-side target detection model that detects targets from images, where the white rectangles are target detection boxes, the words Red, Green, Car, and Sign next to the white rectangles are target types, and the numbers next to the white rectangles are confidences.
In another possible design, the vehicle-side target detection model can detect (recognize) targets by clustering laser point clouds. Correspondingly, the target detection result may include point-cloud-based target detection boxes, target types, confidences, and so on.
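As an illustration only, one way to hold such a result in code is the following minimal Python container; the field names and the source tag are assumptions for the sketch, not defined by the patent.

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass
    class TargetDetection:
        # one inferred target; field names are illustrative assumptions
        bbox: Tuple[float, float, float, float]  # target detection box corners
        target_type: str                         # e.g. "Red", "Green", "Car", "Sign"
        confidence: float                        # confidence in [0, 1]
        source: str                              # "image" or "laser_point_cloud"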
Considering that vehicle-side computing resources are costly, limited in computing power, and required to infer quickly, the vehicle-side target detection model can adopt a network structure with multi-task, lightweight characteristics. Here, multi-task means the network structure shares parameters and tasks, and lightweight means the network structure satisfies the required computing efficiency and capability under limited storage space and power consumption.
For example:
When the vehicle-side target detection model is a vision-based target detection model, multi-task means the feature information of an image can be reused so that one model inference yields the results needed by multiple tasks, for example detecting pedestrians, vehicles, and signal lights at the same time; lightweight allows the model to fit the vehicle's limited computing power and satisfy vehicle-side inference efficiency.
When the vehicle-side target detection model is a target detection model based on laser point clouds, multi-task means the feature information of the point cloud can be reused so that one model inference yields the results needed by multiple tasks, for example detecting pedestrians, vehicle categories, and the dynamic or static attributes of obstacles at the same time; lightweight allows the model to fit the vehicle's limited computing power and satisfy vehicle-side inference efficiency.
In addition, the vehicle-side target detection model can adopt a network structure with multi-dimensional characteristics, which helps to mine the internal relations among multiple targets; a minimal sketch of a shared-backbone, multi-head detector follows.
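For illustration only, the following PyTorch sketch shows one possible shape of a multi-task, lightweight detector in which several task heads reuse one shared feature map from a single forward pass; the layer sizes and task names are assumptions, not the patent's concrete architecture.

    import torch
    import torch.nn as nn

    class MultiTaskDetector(nn.Module):
        """Minimal sketch: one shared lightweight backbone and several task
        heads that reuse the same features in a single inference."""
        def __init__(self, feat_dim=64):
            super().__init__()
            self.backbone = nn.Sequential(              # shared, lightweight
                nn.Conv2d(3, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(feat_dim, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            )
            self.heads = nn.ModuleDict({                 # one head per task
                "pedestrian": nn.Conv2d(feat_dim, 6, 1),  # 4 box + obj + cls
                "vehicle":    nn.Conv2d(feat_dim, 6, 1),
                "signal":     nn.Conv2d(feat_dim, 6, 1),
            })

        def forward(self, x):
            feat = self.backbone(x)                      # features computed once
            return {name: head(feat) for name, head in self.heads.items()}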
202: According to the target detection result, collect data valuable for improving the performance of the vehicle-side target detection model.
Specifically, the vehicle-side collection module cooperates with the cloud collection module to collect, according to the target detection result, data valuable for improving the performance of the vehicle-side target detection model.
The model training and iteration approach currently common in the autonomous driving field feeds all model inference results into subsequent model training. This approach does not distinguish how well the model infers a given scenario (its learning effect): whether or not the model already infers a scenario well enough, the inference results are uniformly reused for training. Such unfocused training cannot achieve the training purpose quickly and purposefully. It can cope with common typical scenarios, but it adapts poorly to rare, sudden, and abnormal long-tail scenarios.
To overcome the defects of the conventional training and iteration approach, this application purposefully collects data valuable for improving the performance of the vehicle-side target detection model and then uses that data to train and iterate the model. In this way, valuable data can be extracted purposefully according to the training purpose, so the training goal is reached quickly and effectively.
Specifically, the data valuable for improving the performance of the vehicle-side target detection model includes not only the target detection result itself but also spatiotemporally synchronized information such as environment data, map data, vehicle body state data, and driver operation data that are synchronized with the target detection result in time and space. Combined with the target detection result, this information can comprehensively reflect the scenario the autonomous vehicle is in and is more meaningful for training the model.
The environment data may include: static environment (fixed obstacles, buildings, traffic facilities, roads), dynamic environment (dynamic traffic lights, traffic police), communication environment (signal strength, signal delay, electromagnetic interference strength), traffic participants (pedestrians, motor vehicles, non-motor vehicles, animals), meteorological environment (temperature, humidity, lighting conditions, weather), and so on;
In addition, the environment data may include data collected by sensors such as vision sensors, lidar, millimeter-wave radar, and ultrasonic radar, for example images and laser point clouds.
The map data may include high-precision maps, traffic control information, navigation information, and so on;
The vehicle state data may include basic vehicle attributes (e.g., body weight, geometric dimensions, basic performance), vehicle position (coordinates, lane position), motion state (lateral motion state, longitudinal motion state), human-machine interaction (entertainment, driving tasks), and so on;
The driver operation data may include whether the driver takes over the vehicle, the driver's specific actions, and so on.
Depending on the training purpose, collecting data valuable for improving the performance of the vehicle-side target detection model may cover the following cases:
(1) The training purpose is to make the vehicle-side target detection model cover (adapt to) as many scenarios as possible.
As shown in Fig. 4, the vehicle-side collection module builds a scenario from the inference result of the vehicle-side target detection model and its spatiotemporally synchronized information and uploads it to the cloud. When the cloud collection module determines that the existing scenario library lacks the uploaded scenario, the target detection result and its spatiotemporally synchronized information are collected as data valuable for improving the performance of the vehicle-side target detection model.
In the field of autonomous driving testing, a scenario is the overall dynamic description of the comprehensive interaction, within a certain range of time and space, between the autonomous vehicle and elements such as other vehicles, roads, traffic facilities, and meteorological conditions in the driving environment. It is the organic combination of the autonomous vehicle's driving situation and its driving environment, covering all kinds of entity elements as well as the actions the entities perform and the connections among them. Fig. 5 shows an embodiment of the element types a scenario contains.
Specifically, the cloud database stores a scenario library containing the various scenarios the vehicle-side target detection model covers (adapts to). If the cloud collection module compares the scenario uploaded by the vehicle-side collection module with the scenarios already in the library and finds that the library does not contain it, the vehicle-side target detection model cannot yet cover (adapt to) this scenario, and the scenario needs to be added to the library. In this case, the cloud collection module issues a command, and on receiving it the vehicle-side collection module collects the target detection result corresponding to this scenario and its spatiotemporally synchronized information as data valuable for improving the performance of the vehicle-side target detection model.
In one possible design, when the cloud collection module compares the uploaded scenario with the scenario library, either of the following two situations counts as the library lacking the scenario:
(1.1) The existing scenario library lacks the category corresponding to the scenario.
This directly indicates that the library does not yet cover the category of the scenario. For example, the road-type branch of the library covers three categories, urban road, expressway, and campus road, while the category of the uploaded scenario is rural road; in this case the library is determined to lack the scenario;
(1.2) The category corresponding to the scenario exists in the library, but the amount of data under that category has not reached a predetermined quantity.
This indicates that although the library already covers the scenario, the amount of data for it is still small, while model training requires a sufficient amount of data; the library is therefore still considered to lack the scenario, and the target detection result corresponding to the scenario and its spatiotemporally synchronized information need to be uploaded to the cloud collection module as data valuable for improving the performance of the vehicle-side target detection model. For example, the rural-road category under the road types in the library has only 10 records, far from the goal of training the model effectively, so when another rural-road scenario is uploaded it still needs to be recorded into the library.
Considering that a scenario carries a large amount of information, uploading it in full not only wastes communication resources but also reduces collection efficiency, and in fact not every scenario is valuable for improving the performance of the vehicle-side target detection model (the library may already contain it). In this case, to save communication resources and speed up data collection, the vehicle-side collection module can encode the scenario and upload the scenario code to the cloud. Besides the scenario library, the cloud database can also store the code library corresponding to the scenario library (containing the scenario code of every scenario in the library). The cloud collection module compares the uploaded scenario code with the code library; when it determines the code library does not contain the code, it determines that the vehicle-side target detection model cannot yet cover (adapt to) the scenario and the scenario needs to be added to the library. The cloud collection module then issues a command, and on receiving it the vehicle-side collection module collects the corresponding target detection result and its spatiotemporally synchronized information as data valuable for improving the performance of the vehicle-side target detection model.
Specifically, the vehicle-side collection module should encode the scenario according to a predetermined encoding rule. In one possible design, the predetermined encoding rule may encode by scenario elements.
For example, for the scenario shown in Fig. 5, the scenario elements are encoded according to their order within their parent-node element; for each concrete element, the number after # indicates the order of the current element within its parent-node element:
If the scenario contains a pedestrian, then reading from left to right, the code for the external-environment element is 2, the code for traffic participants is 3, and the code for pedestrian is 2, so the scenario code contains the number 232;
If the scenario contains a lateral motion state, then reading from left to right, the code for the vehicle-itself element is 1, the code for motion state is 3, and the code for lateral motion state is 1, so the scenario code contains the number 131;
If the scenario contains both a pedestrian and a lateral motion state, the scenario code contains the tuple (232, 131) accordingly, as illustrated by the sketch below.
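For illustration only, the following Python sketch reproduces this encoding rule and the cloud-side code-library check; the element tree, its names, and the required data amount are assumptions, since the patent does not fix them.

    # Hypothetical element tree: each parent lists its children in order.
    ELEMENT_TREE = {
        "scene": ["vehicle", "external_environment"],
        "external_environment": ["static_environment", "communication",
                                 "traffic_participants"],
        "traffic_participants": ["motor_vehicle", "pedestrian", "animal"],
        "vehicle": ["basic_attributes", "position", "motion_state"],
        "motion_state": ["lateral", "longitudinal"],
    }

    def encode(path):
        """Concatenate the 1-based rank of each element under its parent,
        e.g. scene -> external_environment -> traffic_participants ->
        pedestrian yields '232'."""
        return "".join(str(ELEMENT_TREE[parent].index(child) + 1)
                       for parent, child in zip(path, path[1:]))

    pedestrian = encode(["scene", "external_environment",
                         "traffic_participants", "pedestrian"])        # '232'
    lateral = encode(["scene", "vehicle", "motion_state", "lateral"])  # '131'
    scenario_code = (pedestrian, lateral)  # uploaded instead of the full scene

    def library_lacks_scene(code, code_library, required_amount=100):
        # the cloud treats the scene as missing when its code is absent or
        # its stored data amount has not reached the predetermined quantity
        return code_library.get(code, 0) < required_amount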
(2) The training purpose is to make the vehicle-side target detection model cover (adapt to) rare, sudden, abnormal, and other long-tail scenarios.
As shown in Fig. 6, when the vehicle-side collection module detects that the target detection result and/or the spatiotemporally synchronized information does not belong to a regular scenario, it collects the target detection result and its spatiotemporally synchronized information as data valuable for improving the performance of the vehicle-side target detection model.
A regular scenario here is a common traffic scenario widely found in the physical world, for example a vehicle driving normally on a road with regular traffic facilities such as traffic lights, traffic signs, lane lines, and road shoulders. The opposite is the long-tail scenario: a rare, sudden, abnormal traffic scenario that seldom or almost never appears in the physical world, for example a vehicle driving in the sky, on a flower bed, or on a building, or a wild animal, a building, or a large floating object (such as a balloon) suddenly appearing on the road. For an autonomous vehicle, long-tail scenarios usually mean a high risk factor and complex handling. To cope with them, the inference capability of the vehicle-side target detection model in long-tail scenarios must be improved; accordingly, the various information corresponding to long-tail scenarios is data valuable for improving the performance of the vehicle-side target detection model.
When it is detected that the target detection result and/or the spatiotemporally synchronized information does not belong to a regular scenario, the autonomous vehicle is in a rare, sudden, or abnormal long-tail scenario, and the inference result at that moment and its spatiotemporally synchronized information need to be collected as data valuable for improving the performance of the vehicle-side target detection model.
In one possible design, the inference result of a point-cloud-based target detection model is that, starting from a certain frame, a target vehicle is driving on a building beside the road, and this persists for multiple frames, while before this frame the monitoring result had always been that the target vehicle was driving on the road. This inference result (a vehicle driving on a building) does not belong to a regular scenario (a vehicle driving on a road). It may mean the point-cloud-based model inferred incorrectly, or the lidar failed, or even that the target vehicle really is driving on a flower bed, and so on. These abnormal or rare scenarios are long-tail scenarios the vehicle-side target detection model needs to cover, so the inference result at that moment and its spatiotemporally synchronized information need to be collected as data valuable for improving the model's performance and used for subsequent model training.
In another possible design, while the vehicle is driving, a group of elephants suddenly tries to cross the road, or a balloon floats onto the road, or a house stands in the middle of the road (for example a holdout building). These sudden situations encountered while driving are recorded as spatiotemporally synchronized information (environment information) and judged not to belong to regular scenarios but to rare, abnormal scenarios (long-tail scenarios) the vehicle-side target detection model needs to cover, so the inference result at that moment and its spatiotemporally synchronized information need to be collected as data valuable for improving the model's performance and used for subsequent model training.
(3) The training purpose is to give the vehicle-side target detection model better inference capability in scenarios where its own inference is not good enough.
Specifically, as shown in Fig. 6, the following two situations both indicate that the inference of the vehicle-side target detection model is not good enough and its inference capability in the corresponding scenario needs to be improved:
(3.1) The target detection result does not match the expected value.
In one possible design, the vehicle-side target detection model infers on an image of an intersection captured by the on-board camera, and its target detection result contains the detection box, target type, and confidence for only one traffic-light panel, whereas according to the high-precision map there are actually three traffic-light panels at the intersection. Taking the high-precision map record as the expected value, the inference result does not sufficiently match it. This indicates an abnormal inference result: possibly the model does not infer the current scenario well enough, and its adaptability to the scenario still needs training. The target detection result at that moment and its spatiotemporally synchronized information therefore need to be collected as data valuable for improving the model's performance and used for subsequent model training.
(3.2) A consistency check on target detection results obtained with different algorithmic logics fails to reach a predetermined consistency lower bound.
In one possible design, the image-based target detection result shows an obstacle is dynamic, while the results based on the laser point cloud and the millimeter-wave point cloud both show it is static. A consistency check on the results of these three algorithmic logics finds their consistency is poor (dynamic from the image, static from the laser point cloud and millimeter-wave radar) and does not reach the predetermined consistency lower bound (for example, all three are required to agree completely). The image-based model's inference may be inaccurate, or the models based on the laser and/or millimeter-wave point clouds may be inaccurate; at least one of the three target detection models does not infer the current scenario well enough, and its inference capability there needs improvement, so the inference result at that moment and its spatiotemporally synchronized information need to be collected as data valuable for improving the model's performance and used for subsequent model training; a minimal sketch of such a check follows.
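For illustration only, a minimal Python sketch of the cross-logic consistency check; treating the bounds as agreement ratios and requiring full agreement are assumptions for the sketch.

    def consistency_check(branch_results, lower_bound=1.0, upper_bound=1.0):
        """branch_results maps each algorithmic logic to its verdict, e.g.
        {"image": "dynamic", "laser": "static", "millimeter_wave": "static"}."""
        verdicts = list(branch_results.values())
        majority = max(verdicts.count(v) for v in set(verdicts))
        agreement = majority / len(verdicts)
        below_lower = agreement < lower_bound   # case (3.2): collect for training
        at_upper = agreement >= upper_bound     # case (4.2) below: also collect
        return below_lower, at_upper

    below, upper = consistency_check(
        {"image": "dynamic", "laser": "static", "millimeter_wave": "static"})
    # here below is True: the three logics disagree, so the inference result
    # and its spatiotemporally synchronized information would be collected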
(4) The training purpose is to keep the vehicle-side target detection model performing well in scenarios where its own inference is already very good.
Specifically, the following two situations both indicate that the inference of the vehicle-side target detection model is very good and this inference capability needs to be maintained:
(4.1) The target detection result matches the expected value and the degree of matching reaches a predetermined matching threshold.
In this situation the scenario library should cover more scenarios where the vehicle-side model infers very well. For example, the vehicle-side target detection model infers on an intersection image captured by the on-board camera, and its result contains the detection boxes, target types, and confidences for three traffic-light panels, while according to the high-precision map the intersection also has three traffic-light panels. Taking the map record as the expected value, the match between the inference result and the expected value reaches a good level (for example, the predetermined matching threshold), showing that the model infers the current scenario very well and should keep this good inference capability. The target detection result at that moment and its spatiotemporally synchronized information therefore also need to be collected as data valuable for improving the model's performance and used for subsequent model training.
(4.2) A consistency check on target detection results obtained with different algorithmic logics reaches a predetermined consistency upper bound.
In one possible design, the target detection results based on the image, the laser point cloud, and the millimeter-wave point cloud all show the obstacle is static. A consistency check on the results of these three algorithmic logics shows they agree completely (all static), reaching the predetermined consistency upper bound. This indicates the target detection models based on these three algorithmic logics infer very well and should keep this good inference capability, so the inference result at that moment and its spatiotemporally synchronized information also need to be collected as data valuable for improving the model's performance and used for subsequent model training.
In the embodiments provided by this application, the cooperation of the vehicle-side computing resources and the cloud computing resources automatically collects data that helps improve the performance of the vehicle-side target detection model. This collection is not only fast but also targeted; with limited vehicle-to-cloud communication resources it collects useful data more efficiently, providing an effective and reliable data basis for subsequently training the vehicle-side target detection model.
203: Use cloud computing resources to label the valuable data (the data valuable for improving the performance of the vehicle-side target detection model) through a data-driven model, and train the vehicle-side target detection model with the labeling results.
In this step, the automatic labeling module uses the data-driven model to label the valuable data. Compared with the vehicle-side computing resources, which must infer detection results quickly on limited computing power, the cloud computing resources not only have strong computing power but also face looser real-time requirements. Detecting the same targets in the cloud with the data-driven model therefore yields more accurate results, which can be used as labeled data to train the vehicle-side target detection model, achieving the purpose of training the model and improving its inference capability (making the inferred target detection results more accurate).
The "data-driven model" in the embodiments of this application is a model driven by data, for example a deep-learning model or a traditional machine-learning model.
In one possible design, the data-driven model is a traditional machine-learning model and may adopt any traditional machine-learning algorithm, such as support vector machines (SVM), Adaboost, logistic regression, hidden Markov models, k-nearest neighbors (KNN), three-layer artificial neural networks, Bayesian algorithms, or decision trees.
In one possible design, the above traditional machine-learning model (such as SVM or Adaboost) computes on manually defined histogram-of-oriented-gradients (HOG) features, which helps achieve the purpose of labeling the valuable data.
To improve labeling efficiency and ensure effective model training, the target type library used by the data-driven model should cover all the target types the vehicle-side target detection model cares about and as many other target objects of interest as possible. Fig. 7 shows an embodiment of the target type library used by the data-driven model.
To improve labeling efficiency and ensure effective model training, in one possible design the data-driven model is set up as multiple deep-learning models with single-task, deep network structures. Single-task means each model performs only one task, and the models are independent of each other and share no parameters; the single-task characteristic maximizes the recall and recognition precision of individual targets. Deep means the model has multiple hidden layers and can abstract input features at multiple levels to better separate different types of data; the deep characteristic improves the recall and recognition precision of individual targets under complex road conditions. For example, several single-task, deep data-driven models can be set up according to the target types a concrete scenario needs to perceive, detecting pedestrians, motor vehicles, non-motor vehicles, traffic signs, traffic lights, crosswalk lines, and so on, respectively.
In the example where the target detection model detects targets from images or laser point clouds, the model detects targets in the images or point clouds collected by the on-board camera or on-board lidar and outputs a target detection result including the detection box, target category, confidence, and other information for each target. The vehicle-side collection module then judges whether the target detection result and its spatiotemporally synchronized information are data that help improve the vehicle-side target detection model. If they count as valuable data, they are uploaded to the cloud, with the synchronized information including the image carrying the target detection result. Then, for the same targets, the data-driven model detects the image carrying the target detection result in the valuable data and outputs labels, completing the labeling. The labels output by the data-driven model also include the detection box, target category, and other information for the same targets.
In actual processing, the vehicle-side target detection model obtains multiple candidate target detection results for the same target and generally outputs the one with the highest confidence as the final target detection result. To ensure the valuable data uploaded to the cloud is sufficient and diverse, in one possible design two confidence thresholds are set in the vehicle-side target detection model (a high confidence threshold α and a low confidence threshold β, with α > β): when a candidate's confidence exceeds α, the candidate is output as the final target detection result, and when a candidate's confidence exceeds β, the candidate and its spatiotemporally synchronized information are collected as valuable data and uploaded to the cloud, as in the sketch below.
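For illustration only, a minimal Python sketch of the two-threshold routing; the threshold values and the dictionary layout are assumptions, since the patent only requires α > β.

    ALPHA = 0.8  # high-confidence threshold; the value is an assumption
    BETA = 0.3   # low-confidence threshold; the value is an assumption

    def route_candidates(candidates):
        """candidates: list of dicts with a 'score' field (assumed layout)."""
        final_results, valuable_data = [], []
        for det in candidates:
            if det["score"] > ALPHA:
                final_results.append(det)    # output as final detection result
            if det["score"] > BETA:
                valuable_data.append(det)    # upload with synchronized info
        return final_results, valuable_data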
In actual processing, limited by the inference capability of the vehicle-side target detection model, candidate target detection results may contain obvious errors. If such erroneous candidates are collected as valuable data, they degrade the quality of the valuable dataset and reduce the effectiveness of model training. To solve this problem, in one possible design, as shown in Fig. 12, before the valuable data is labeled the method may further include step 203': combining the spatiotemporally synchronized information of the target detection result, and deleting the target detection result from the valuable dataset when it is determined to be an obviously erroneous detection result.
For example, the target detection result shows a traffic light at an intersection, while the high-precision map data shows there is no traffic light at that intersection; in this case the target detection result is an obviously erroneous detection result and needs to be deleted from the valuable dataset.
For another example, the target detection result inferred from one frame shows a charging pile at an intersection, while the results inferred from the consecutive frames before and after it all show no charging pile; the result is then most likely an obviously erroneous detection result as well and needs to be deleted from the valuable dataset.
In one possible design, the above step of screening out erroneous target detection results can be executed on the vehicle by the vehicle-side collection module, which uploads the reduced valuable dataset after screening, saving cloud-vehicle communication resources.
In another possible design, the screening step can be executed in the cloud by the cloud collection module, which not only exploits the cloud's rich computing resources for more accurate screening but also saves vehicle-side computing resources.
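For illustration only, a minimal Python sketch of step 203', combining the map-based and temporal checks from the two examples above; all field names (junction_id, track_id, and so on) are assumptions.

    def filter_obvious_errors(valuable_set, hd_map_lights, track_history, window=5):
        """hd_map_lights maps a junction id to the number of traffic lights the
        HD map records there; track_history maps a track id to the classes
        seen in neighboring frames (both layouts are assumptions)."""
        kept = []
        for sample in valuable_set:
            det = sample["detection"]
            # map-based check: a detected traffic light must exist in the map
            if det["target_type"] == "traffic_light" and \
                    hd_map_lights.get(sample["junction_id"], 0) == 0:
                continue
            # temporal check: drop a class absent from neighboring frames
            seen = track_history.get(sample["track_id"], [])
            if len(seen) >= window and det["target_type"] not in seen:
                continue
            kept.append(sample)
        return kept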
To ensure effective model training, the inference capability of the vehicle-side target detection model can be improved from two directions: reducing false detections and reducing missed detections. To this end, in one possible design, when labeling the valuable data the data-driven model can perform local detection and global detection on the image or laser point cloud carrying the target detection result, and then determine and output the target label according to the consistency among the target detection result, the local detection result, and the global detection result.
In the above design, the target detection result is inferred by the vehicle-side target detection model, while the local and global detection results are inferred by the data-driven model. When the three detection results are highly consistent, the vehicle-side model's inference agrees well with the data-driven model's inference, and the probability that the three results are close to the true situation of the target is high; the result with the highest confidence among them can be used as the target label for subsequent model training.
In the above design, when the consistency of the three is not high enough or is low, the vehicle-side model's inference does not agree well enough with the data-driven model's inference; in that case, which detection result to use as the target label for subsequent training must be decided case by case.
In the above design, when the three detection results are highly consistent, the vehicle-side model's inference agrees well with the data-driven model's inference, indicating the vehicle-side model infers the current scenario well; in subsequent training, improving its inference capability for such scenarios can be a non-key training goal. When the consistency is not high enough or is low, the vehicle-side model's inference does not agree well enough with the data-driven model's inference, indicating it does not infer the current scenario well enough; in subsequent training, improving its inference capability for such scenarios needs to be a key training goal.
In one possible design, to improve the inference capability of the vehicle-side target detection model in a targeted way, a label type called the hard-example (difficulty) level can be introduced into the finally determined target label. A larger difficulty level means the vehicle-side model infers the current scenario better; conversely, a smaller difficulty level means it infers the scenario worse, and subsequent training needs to focus on strengthening the model's inference capability for such scenarios.
The labeling process of the data-driven model is introduced below through a concrete embodiment. As shown in Fig. 13, it can be implemented to include:
S1: Input the image or laser point cloud carrying the target detection result into the data-driven model, where the target detection result obtained by the vehicle-side target detection model from the image or laser point cloud includes: a first target category class1, a first target detection box bbox1, and a first confidence score1;
S2: Perform local detection on the image or laser point cloud carrying the target detection result through the data-driven model. In the full frame carrying the first target detection box bbox1, the data-driven model expands a preset range outward around bbox1 as the center to obtain a local detection region, detects targets in that region, and outputs the local detection result: a second target category class2, a second target detection box bbox2, and a second confidence score2;
S3: Perform global detection on the image or laser point cloud carrying the target detection result through the data-driven model. The data-driven model detects targets in the full frame carrying bbox1 and outputs the global detection result: a third target category class3, a third target detection box bbox3, and a third confidence score3;
S4: Perform a consistency check on the target detection result, the local detection result, and the global detection result, and determine the target label according to the check result. As shown in Fig. 14, this further includes:
S41: The data-driven model computes the overlap of bbox1, bbox2, and bbox3 with the intersection-over-union (IoU) algorithm. When the computed overlap reaches a predetermined overlap threshold, the subsequent operations continue; otherwise, the process moves to the next labeling flow;
In this step, the data-driven model can compute the overlap of any two (pairwise combinations) of bbox1, bbox2, and bbox3 with the IoU algorithm and continue the subsequent operations only when all the computed results reach the predetermined overlap threshold;
S42: The data-driven model compares the consistency of class1, class2, and class3:
S42A: If class2 is consistent with class3 and class1 is inconsistent with class2, the fourth target category class4 is determined as class2 or class3, the fourth target detection box bbox4 is determined as the detection box corresponding to the larger of score2 and score3, and the difficulty level is determined as level one;
S42B: If class1 is consistent with class2 and class2 is inconsistent with class3, class4 is determined as class1 or class2, bbox4 is determined as the detection box corresponding to the larger of score1 and score2, and the difficulty level is determined as level two;
S42C: If class1 is consistent with class3 and class2 is inconsistent with class3, class4 is determined as class1 or class3, bbox4 is determined as the detection box corresponding to the larger of score1 and score3, and the difficulty level is determined as level two;
S42D: If class1, class2, and class3 are all consistent, class4 is determined as class1, class2, or class3, bbox4 is determined as the detection box corresponding to the largest of score1, score2, and score3, and the difficulty level is determined as level three.
In addition, S42E: if class1, class2, and class3 are all mutually inconsistent, the inference results of both the vehicle-side target detection model and the data-driven model may be insufficiently accurate. Considering this situation, in one possible design the data-driven model can also output the image or laser point cloud carrying the target detection result, the local detection result, and the global detection result as hard-example data. The hard-example dataset is then labeled manually, and the manually labeled detection boxes, target categories, and difficulty levels are used together with the target labels determined by the data-driven model as labeling results to train the vehicle-side target detection model. For the hard-example dataset, the difficulty level can be labeled manually as level zero or labeled automatically as level zero by a computing device. The whole S41-S42 decision procedure is sketched below.
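As an illustration only, the following Python sketch implements the S41-S42 decision procedure just described; the Detection container and the IoU threshold value are assumptions, while the case logic (S42A-S42E) follows the steps listed above.

    from dataclasses import dataclass

    @dataclass
    class Detection:
        cls: str      # target category (class1/class2/class3)
        bbox: tuple   # (x1, y1, x2, y2)
        score: float  # confidence (score1/score2/score3)

    def iou(a, b):
        # intersection over union of two axis-aligned boxes
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        union = area_a + area_b - inter
        return inter / union if union > 0 else 0.0

    def fuse_labels(d1, d2, d3, iou_threshold=0.5):
        """d1: vehicle-side result; d2: local detection; d3: global detection.
        Returns (class4, bbox4, difficulty_level), or None both for boxes that
        fail the S41 overlap check and for S42E hard cases."""
        # S41: every pairwise IoU must reach the predetermined threshold
        if any(iou(a.bbox, b.bbox) < iou_threshold
               for a, b in ((d1, d2), (d1, d3), (d2, d3))):
            return None
        higher = lambda a, b: a if a.score >= b.score else b
        if d2.cls == d3.cls and d1.cls != d2.cls:        # S42A -> level one
            win = higher(d2, d3)
            return win.cls, win.bbox, 1
        if d1.cls == d2.cls and d2.cls != d3.cls:        # S42B -> level two
            win = higher(d1, d2)
            return win.cls, win.bbox, 2
        if d1.cls == d3.cls and d2.cls != d3.cls:        # S42C -> level two
            win = higher(d1, d3)
            return win.cls, win.bbox, 2
        if d1.cls == d2.cls == d3.cls:                   # S42D -> level three
            win = higher(higher(d1, d2), d3)
            return win.cls, win.bbox, 3
        return None                                      # S42E: manual labeling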
Using the powerful computing and data storage capabilities of cloud computing resources, a data-driven model whose target detection capability far exceeds that of the vehicle-side target detection model is set up in the cloud and used to recognize specific targets in the valuable data (such as pedestrians, motor vehicles, non-motor vehicles, traffic signs, traffic lights, and crosswalk lines). Because the data-driven model's recognition capability is far higher than the vehicle-side model's, its recognition results can serve as labels to train the vehicle-side target detection model. This automatic labeling saves a large amount of manual labeling work, significantly improves labeling efficiency, and speeds up model iteration.
The automatic labeling module builds all the target labels determined by labeling the valuable data with the data-driven model into a labeled dataset (the labeling results) and sends it to the training module, which then trains the vehicle-side target detection model with the labeled dataset (the labeling results). As shown in Fig. 15, this step may specifically be: the training module trains the vehicle-side target detection model with the images or laser point clouds carrying target labels as the labeling results; specifically, it includes S51, the training module modifies the parameters of the vehicle-side target detection model according to the fourth target category class4 and the fourth target detection box bbox4, and S52, it modifies the weight parameters of the model's loss function according to the difficulty level, where a lower difficulty level gives a larger loss-function weight. This modification improves the vehicle-side model's generalization over the hard-example dataset; a minimal weighting sketch follows.
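For illustration only, a minimal Python sketch of step S52; the patent only requires that lower difficulty levels receive larger loss weights, so the concrete mapping below is an assumption.

    def loss_weight(difficulty_level):
        # lower difficulty level -> larger loss weight, so hard scenes
        # contribute more to training; the mapping values are assumed
        weights = {0: 4.0, 1: 3.0, 2: 2.0, 3: 1.0}
        return weights.get(difficulty_level, 1.0)

    def weighted_detection_loss(per_sample_losses, difficulty_levels):
        # scale each labeled sample's loss by its difficulty-level weight
        return sum(loss * loss_weight(level)
                   for loss, level in zip(per_sample_losses, difficulty_levels))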
204: Iterate the vehicle-side target detection model being used by the vehicle-side computing resources into the trained vehicle-side target detection model.
Specifically, the iteration module determines the model parameters of the vehicle-side target detection model trained by the training module and sends them down to the vehicle-side computing module, which uses the delivered parameters to iterate the vehicle-side target detection model in use into the trained one.
In one possible design, the iteration module can test the trained vehicle-side target detection model and deliver the model parameters to the vehicle-side computing module to complete the iteration only when the test result meets the iteration requirement (showing the trained model's inference capability is clearly better than that of the model in use), as sketched below.
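For illustration only, a minimal Python sketch of this iteration gate; the metric (for example mAP) and the improvement margin are assumptions, since the patent only requires the trained model to be clearly better.

    def maybe_roll_out(trained_params, new_metric, in_use_metric, margin=0.02):
        """Deliver parameters only when the trained model clearly outperforms
        the in-use model on the test; metric and margin are assumed."""
        if new_metric - in_use_metric > margin:
            return trained_params   # sent down to the vehicle-side computing module
        return None                 # keep the in-use model unchanged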
In actual processing, the vehicle-side target detection model may infer better than the data-driven model in some respects. Considering this situation, in one possible design the training module can train the data-driven model with the optimization results to improve the data-driven model's inference capability in the respects where it is weaker than the vehicle-side target detection model; the iteration module then sends the trained data-driven model's parameters to the automatic labeling module and iterates the data-driven model in use into the trained data-driven model.
In one possible design, the iteration module can also test the trained data-driven model and deliver the model parameters to the automatic labeling module to complete the data-driven model iteration only when the test result meets the iteration requirement (showing the trained data-driven model's inference capability is clearly better than that of the data-driven model in use).
The target detection model automatic iteration method provided by the embodiments of this application first automatically collects data valuable for improving the performance of the vehicle-side target detection model, then automatically labels the valuable data with the data-driven model, and finally trains the vehicle-side target detection model with the labeling results and completes the iteration:
(1) The method adopts a vehicle-side inference, cloud-side training mode: a multi-task, lightweight vehicle-side target detection model is deployed on the vehicle, data valuable for improving its performance is collected automatically and purposefully based on the target detection results, and the cloud's powerful computing and data storage capabilities are then used to complete, in real time and automatically, a series of operations such as labeled dataset generation, model training, and model iteration; this vehicle-side inference, cloud-side training mode makes full use of the cloud's resource advantages and improves the iteration efficiency of the autonomous-driving vehicle-side target detection model;
(2) In an environment with limited vehicle-to-cloud communication resources, the method automatically collects data valuable for improving the performance of the vehicle-side target detection model; this automatic collection process is not only efficient but also covers rare, abnormal, and sudden long-tail scenarios, shields duplicate and junk data, and guarantees the validity, diversity, and integrity of the collected data, providing a sufficient, high-quality, diverse, effective, and reliable data basis for automatically completing model training and model iteration in the cloud;
(3) The method uses single-task, deep data-driven models to complete data labeling automatically and obtain a labeled dataset; this way of automatically generating the labeled dataset greatly reduces manual labeling work and clearly helps to solve the problem of long, slow model iteration caused by low labeling efficiency.
图8示出根据本申请实施例提供的一种自动驾驶车辆ADV的结构。自动驾驶车辆ADV包括动力系统V-110、传感器系统V-120、致动系统V-130、外围设备系统V-140、车辆计算系统V-150。在一些可能的设计中,自动驾驶车辆ADV车辆可以包括更多、更少或不同的单元,并且每个单元可以包括更多、更少或不同的组件。在一些可 能的设计中,图8所示单元和组件还可以以任意的数量进行组合或划分。
动力系统V-110可以被配置为为车辆提供运动动力。动力系统V-110包括引擎V-111、能量源V112、变速器V113和车轮V114中的一个或多个。
引擎V-111可以是内燃机、电机马达、蒸汽机和斯特林引擎的任何组合,也可以是其它马达和引擎。在一些可能的设计中,动力系统V-110可以包括多种类型的引擎和/或马达。例如,气电混合动力车可以包括汽油引擎和电动马达。
能量源V112可以是全部或部分地为引擎V-111提供动力的能量源V112。引擎V-111可以被配置为将能量源V112转换成机械能。能量源V112可以包括汽油、柴油、丙烷、其它基于压缩气体的燃料、乙醇、太阳能电池板、电池和其它电力源。能量源V112可以附加或可替换地包括燃料箱、电池、电容器和/或飞轮的任何组合。在一些可能的设计中,能量源V112也可以为车辆的其它单元提供能量。
变速器V113可以被配置为将机械动力从引擎V-111发送到车轮V114。为此,变速器V113可以包括变速箱、离合器、差速器、驱动轴和/或其它原件。在变速器V113包括驱动轴的实施例中,驱动轴可以包括被配置为耦合到车轮V114的一个或多个轮轴。
车轮V114可以被配置为任何形式,包括单轮、双轮、三轮、四轮、六轮等形式。其它车轮V114形式也是可能的,例如包括八个或更多车轮的形式。在任何情况下,车轮V114可以被配置为相对于其他车轮V114差速地旋转。在一些可能的设计中,车轮V114可以包括固定地附接到变速器V113的至少一个车轮,以及可以与路面面接触的、耦合到车辆的轮辋的至少一个轮胎。车轮V114可以包括金属和橡胶的任何组合,或者其它材料的组合。
动力系统V-110可以附加或可替换地包括除了前述组件之外的其它组件。
传感器系统V-120可以包括外部传感器V-121和内部传感器V-122。
外部传感器V-121可以包括被配置为感测车辆所处环境的信息的多个传感器,以及被配置为修改传感器的位置和/或方向的一个或多个致动器V1216。例如,外部传感器V-121可以包括位置传感器V1217、惯性传感器V1211、物体传感器V1212、图像传感器V1213中的一个或多个。
位置传感器V1217可以是估计车辆的地理位置的任何传感器,例如,全球定位系统GPS定位设备、载波相位差分RTK定位设备、北斗卫星定位系统定位设备、GLONASS定位系统定位设备、Galileo定位系统定位设备、全球导航卫星系统GNSS定位设备。位置传感器V1217可以包括估计车辆相对于地球的位置的收发器。
惯性传感器V1211可以是被配置为根据惯性加速度来感测车辆的位置和方向改变的任何传感器组合,例如惯性测量单元IMU。在一些可能的设计中,惯性传感器V1211可以包括加速计和陀螺仪。
物体传感器V1212可以是使用无线电信号或激光信号来感测车辆所处环境中的物体的任何传感器,例如雷达、激光测距仪、激光雷达。在一些可能的设计中,除了感测物体之外,雷达和激光雷达还可以附加地感测物体的速度和/或行驶方向。在一些可能的设计中,物体传感器V1212可以包括发射无线电信号或激光信号的发射器以及检测无线电信号或激光信号的检测器。
图像传感器V1213可以包括任何相机(例如静态相机、视频相机等),用于拍摄车辆所处环境的图像。
此外,外部传感器V-121还可以包括其它的传感器,例如用于检测物体距离的任何传感器,例如,声呐V1214、超声波传感器V-1216等。
内部传感器V-122可以包括被配置为检测与车辆的行驶状态相应的信息的多个传感器。例如,内部传感器V-122可以包括车速传感器V-1221、加速度传感器V-1222以及横摆率传感器V-1223中的一个或多个。
车速传感器V-1221可以是检测车辆的速度的任何传感器。
加速度传感器V-1222可以是检测车辆的加速度的任何传感器。
横摆率传感器V-1223可以是检测车辆绕重心的铅垂轴的横摆率(旋转角速度)的任何传感器,例如,陀螺仪传感器。
在一些可能的设计中,为检测驾驶操作信息,内部传感器V-122还可以包括加速器踏板传感器V-1224、制动器踏板传感器V-1225以及方向盘传感器V-1226中的一个或多个。
加速器踏板传感器V-1224可以是检测加速器踏板的踩踏量的任何传感器,加速器踏板传感器V-1224例如设置于车辆的加速器踏板的轴部分。
制动器踏板传感器V-1225可以是检测制动器踏板的踩踏量的任何传感器,制动器踏板传感器V-1225例如设置于制动器踏板的轴部分。制动器踏板传感器V-1225也可以检测制动器踏板的操作力(对制动器踏板的踏力、主缸的压力等)。
方向盘传感器V-1226可以是检测方向盘的旋转状态的任何传感器,旋转状态的检测值例如是操舵转矩或舵角,方向盘传感器V-1226例如设置于车辆的转向轴。
此外,内部传感器V-122还可以包括其它的传感器,例如监测车辆内部各个组件的传感器(例如氧气监测器、燃油表、引擎油温度计等)。
在一些示例中,传感器系统V-120可以实施为多个传感器组合,每个传感器组合被配置为安装在车辆的相应位置上(例如,顶部、底部、前侧、后侧、左侧、右侧等)。
致动系统V-130可以被配置为控制车辆的驾驶行为。致动系统V-130可以包括转向模块V-131、节气门模块V-132、制动模块V-133中的一个或多个。
转向模块V-131可以是控制车辆的转向转矩(或操舵转矩)的任何设备组合。
节气门模块V-132可以是通过调整发动机的空气供给量(节气门开度)来达到控制引擎V-111的操作速度和控制车辆的速度的任何设备组合。
制动模块V-133可以是使车辆减速的任何设备组合,例如,制动模块V-133可以利用摩擦力来使车轮V114减速。
外围设备系统V-140可以被配置为使车辆与外部传感器V-121、其它车辆、外部计算设备和/或用户进行交互。例如,外围设备系统V-140可以包括无线通信装置V-141、有线通信接口V-142、触屏显示器V-143、麦克风V-144和扬声器V-145中的一个或多个。
无线通信装置V-141可以被配置为直接地或无线地连接到动力系统V-110、传感器系统V-120、致动系统V-130、外围设备系统V-140和车辆计算系统V-150包括的一个或多个设备,以及直接地或无线地连接其它车辆、中控系统、枢纽服务区中的实体中的一种或多种。无线通信装置V-141可以包括基于无线通信技术通信的天线和芯片组,其中,无线通信技术可以包括全球移动通讯系统(Global System for Mobile Communications,GSM),通用分组无线服务(General Packet Radio Service,GPRS),码分多址接入(Code Division Multiple Access,CDMA),宽带码分多址(Wideband Code Division Multiple Access,WCDMA),时分码分多址(Time-Division Code Division Multiple Access,TD-SCDMA),长期演进(Long Term Evolution,LTE),蓝牙(Blue Tooth,BT),全球导航卫星系统(Global Navigation Satellite System,GNSS),调频(Frequency Modulation,FM),近距离无线通信技术(Near Field Communication,NFC), 红外技术(Infrared,IR)。GNSS可以包括全球卫星定位系统(Global Positioning System,GPS),全球导航卫星系统(Global Navigation Satellite System,GLONASS),北斗卫星导航系统(Beidou Navigation Satellite System,BDS),准天顶卫星系统(Quasi-zenith Satellite System,QZSS)和/或星基增强系统(Satellite Based Augmentation Systems,SBAS)。
有线通信接口V-142可以被配置为直接地连接动力系统V-110、传感器系统V-120、致动系统V-130、外围设备系统V-140和车辆计算系统V-150包括的一个或多个设备,以及直接地连接其它车辆、中控系统、枢纽服务区中的实体中的一种或多种。有线通信接口V-142可以包括集成电路(Inter-Integrated Circuit,I2C)接口,集成电路内置音频(Inter-Integrated Circuit Sound,I2S)接口,脉冲编码调制(Pulse Code Modulation,PCM)接口,通用异步收发传输器(Universal Asynchronous Receiver/Transmitter,UART)接口,移动产业处理器接口(Mobile Industry Processor Interface,MIPI),通用输入输出(General-Purpose Input/Output,GPIO)接口,用户标识模块(Subscriber Identity Module,SIM)接口,和/或通用串行总线(Universal Serial Bus,USB)接口等。
触屏显示器V-143可以被用户用来向车辆输入命令。触屏显示器V-143可以被配置为通过电容感测、电阻感测或表面声波处理来感测用户手指的位置和、或位置的移动。触屏显示器V-143能够感测在平行或共面与触摸屏表面的方向、垂直与触摸屏表面的方向或者两个方向上的手指移动,并且还能够感测施加到触摸屏表面的压力水平。触屏显示器V-143可以由一个或多个半透明或透明的绝缘层和一个或多个半透明或透明的导电层形成。触屏显示器V-143也可以被配置为其它形式。
麦克风V-144可以被配置为用于接收声音信号(例如,语音命令或其它音频输入)并将声音信号转换为电信号。
扬声器V-145可以被配置为输出音频。
外围设备系统V-140可以进一步或可替换地包括其他组件。
车辆计算系统V-150可以包括处理器V-151和数据存储装置V-152。
处理器V-151可以被配置为用于运行存储于数据存储装置V-152中的指令以执行各种功能,这些功能包括但不限于如下所述的定位融合模块V-1501、感知模块V-1502、行驶状态确定模块V-1503、导航模块V-1504、决策模块V-1505、行驶控制模块V-1506、任务接收模块V-1507对应的功能。处理器V-151可以包括通用处理器(例如CPU、GPU)、专用处理器(例如专用集成电路(Application-specific integrated circuit,ASIC))、现场可编程门阵列(FPGA)、数字信号处理器(DSP)、集成电路、微控制器等一个或多个的组合。在处理器V-151包括多个处理器V-151的情况下,这些处理器V-151能够单独或组合地工作。
数据存储装置V-152可以包括一个或多个易失性计算机可读存储介质和/或一个或多个非易失性计算机可读存储介质,诸如光学、磁性和/或有机存储介质。数据存储装置V-152可以包括只读存储器(ROM)、随机存取存储器(RAM)、闪速存储器、电可编程存储器(EPROM)、电可编程和可擦除存储器(EEPROM)、嵌入式多媒体卡(eMMC)、硬盘驱动器或任何易失性或非易失性介质等中的一个或多个的组合。数据存储装置V-152可以整体或部分地与处理器V-151集成。数据存储装置V-152可以被配置为存储可由处理器V-151运行以执行各种功能的指令,其中,这些功能包括但不限于如下所述的定位融合模块V-1501、感知模块V-1502、行驶状态确定模块V-1503、导航模块V-1504、决策模块V-1505、行驶控制模块V-1506、任务接收模块V-1507对应的功能。
定位融合模块V-1501可以被配置为接收来自传感器系统V-120感测到的环境数据、位置数据或其他类型的数据,通过对这些数据进行时间戳对齐、融合计算等处理,得到融合后的环境数据和车辆位置数据。定位融合模块V-1501可以包括例如卡尔曼滤波器、贝叶斯网络,以及实现其它功能的算法。
感知模块V-1502可以被配置为接收定位融合模块V-1501计算的融合后的环境数据,并对其进行计算机视觉处理以识别车辆所处环境中的物体和/或特征,该物体和/或特征包括例如车道线、行人、其他车辆、交通信号、基础交通设施等。感知模块V-1502可以使用物体识别算法、运动中恢复结构(Structure from Motion,SFM)算法、视频跟踪或其它计算机视觉技术。在一些可能的设计中,感知模块V-1502可以进一步地配置为对环境进行地图绘制、跟踪物体、估计物体的速度等。
行驶状态确定模块V-1503基于传感器系统V-120中的内部传感器V-122得到的数据识别车辆的行驶状态,例如包括车速、加速度或者横摆率。
任务接收模块V-1507可以被配置为接收任务,解析任务包含的装卸货地址、货物品类、装卸货时间等信息,并将这些信息发送给导航模块V-1504。
导航模块V-1504可以被配置为确定车辆的驾驶路径的任何单元。导航模块V-1504可以进一步地被配置为在车辆的操作时动态地更新驾驶路径。在一些可能的设计中,导航模块V-1504可以被配置为根据来自定位融合模块V-1501、定位传感器、物体传感器V1212、任务接收模块V-1507的处理结果和一个或多个预存的高精地图数据,为车辆确定行驶路径。
决策模块V-1505可以被配置为基于导航模块V-1504计算出的行驶路径、定位融合模块V-1501计算得到的车辆位置数据、以及感知模块V-1502识别出的车辆所处环境中的物体和/或特征,生成车辆的路点信息,路点信息中的路点是在行驶路径中车辆前进的轨迹点。
行驶控制模块V-1506可以被配置为接收决策模块V-1505产生的路点信息,并根据路点信息控制致动系统V-130,以使得车辆按照路点信息行驶。
数据存储装置V-152还可以被配置为存储其他的指令,包括将数据发送到动力系统V-110、传感器系统V-120、致动系统V-130和/或外围设备系统V-140中的一个或多个,从其中接收数据,与其交互,和/或对其进行控制的指令。数据存储装置V-152还可以被配置为存储其他的指令。例如,数据存储装置V-152可以存储用于控制变速器V113的操作以改善燃料效率的指令,可以存储用于控制图像传感器V1213拍摄环境图像的指令,可以存储用于根据物体传感器V1212感测的数据生成车辆所处环境的三维图像的指令,以及,可以存储用于将麦克风V-144转换得到的电信号识别成语音命令的指令。
数据存储装置V-152还可以被配置为存储其他的指令。除存储指令之外,数据存储装置V-152还可以被配置为存储多种信息,例如图像处理参数、训练数据、高精地图、路径信息等。在车辆以自动模式、半自动模式、手动模式运行的期间,这些信息可以被动力系统V-110、传感器系统V-120、致动系统V-130和外围设备系统V-140、车辆计算系统V-150中的一个或多个所使用。
车辆计算系统V-150可以通过系统总线、网络和/或其它连接机制通信连接到动力系统V-110、传感器系统V-120、致动系统V-130和外围设备系统V-140中的一个或多个。
车辆计算系统V-150可以通过数据线直接地或通过无线通信技术无线地连接外围设备系统V-140中的无线通信装置V-141,然后通过无线通信装置V-141无线地连接枢纽服务区和/或中控系统。
车辆计算系统V-150也可以是多个计算装置,这些计算装置分布式地控制车辆的个别组件或者个别系统。
车辆计算系统V-150可以附加地或可替换地包括其它的组件。
图8介绍了自动驾驶车辆100的功能框图,下面介绍自动驾驶车辆100中的车辆计算系统V-150。图9为本申请实施例提供的一种车辆计算系统V-150的结构示意图。
如图9所示,车辆计算系统V-150包括处理器E-100,处理器E-100和系统总线E-000耦合。处理器E-100可以是任何传统处理器,包括精简指令集计算(RISC)处理器、复杂指令集计算(CISC)处理器或上述的组合。可选的,处理器E-100可以是诸如专用集成电路(ASIC)的专用装置。处理器E-100可以是一个或者多个处理器,其中,每个处理器都可以包括一个或多个处理器核。
系统内存E-900和系统总线E-000耦合。运行在系统内存E-900的数据可以包括车辆计算系统V-150的操作系统E-901和应用程序E-904。
操作系统E-901包括壳(Shell)E-902和内核(kernel)E-903。壳E-902是介于用户和操作系统之内核E-903间的一种接口,是操作系统最外面的一层。壳E-902管理用户与操作系统之间的交互,等待用户的输入,向操作系统解释使用者的输入,并且处理各种各样的操作系统的输出结果。
内核E-903由操作系统E-901中用于管理存储器、文件、外设和系统资源的那些部分组成。直接与硬件交互,操作系统内核通常运行进程,并提供进程间的通信,提供CPU时间片管理、中断、内存管理、I/O管理等等。
应用程序E-904包括自动驾驶相关程序E-905,例如管理自动驾驶车辆100和路上障碍物交互的程序,控制自动驾驶装置的行车路线或者速度的程序,控制自动驾驶车辆100和路上其他自动驾驶装置交互的程序。应用程序E-904也存在于软件部署服务器的系统上。当需要执行应用程序E-904时,车辆计算系统V-150可以从软件部署服务器下载应用程序E-904。
系统总线E-000通过总线桥E-200和I/O总线E-300耦合。I/O总线E-300与I/O接口E-400耦合。I/O接口E-400连接有USB接口E-500和多种I/O设备进行通信,这些I/O设备例如是输入设备、媒体盘、收发器、摄像头、传感器等。其中,输入设备例如是键盘、鼠标、触摸屏等;媒体盘例如是CD-ROM、多媒体接口等;收发器用于发送和/或接受无线电通信信号;摄像头用于捕捉景田和动态数字视频图像;传感器可以是图8中传感系统包含的各类传感器,用于探测车辆计算系统V-150周围的环境,并将所感测的信息提供给车辆计算系统V-150。
硬盘驱动器E-800通过硬盘驱动器接口和系统总线E-000耦合。
显示适配器E-700与系统总线E-000耦合,用以驱动显示器。
车辆计算系统V-150可以通过网络接口E-600和软件部署服务器通信。网络接口E-600是硬件网络接口,例如网卡。网络可以是外部网络,例如因特网,也可以是内部网络,例如以太网或者虚拟私人网络(VPN),还可以是无线网络,例如WiFi网络,蜂窝网络等。
车辆计算系统V-150可以包括车载执行设备,该车载执行设备可以包括一个或多个第一处理器、一个或多个第一存储器、以及存储在第一存储器上并可在第一处理器上运行的计算机指令。当第一处理器在运行第一存储器中的计算机指令时,执行本申请提供的各种实施例中车载执行设备对应的功能。其中,第一处理器可以被配置为处理器V-151中的一个或多个通用处理器(例如CPU、GPU),一个或多个专用处理器(例如ASIC),一个或多个现场可编程门阵列(FPGA),一个或多个数字信号处理器(DSP),一个或多个集成电路,和/或,一个或多个微控制器等。第一存储器可以被配置为数据存储装置V-152中的一个或多个只读存储器(ROM),一个或多个随机存取存储器(RAM),一个或多个闪速存储器,一个或多个电可编程存储器(EPROM),一个或多个电可编程和可擦除存储器(EEPROM),一个或多个嵌入式多媒体卡(eMMC),和/或,一个或多个硬盘驱动器等。车载执行设备对应的功能可以实现为一种计算机程序产品,当该计算机程序产品在计算机上运行时,实现车载执行设备对应的功能。在一种可能的示例中,实现该对应功能的计算机程序产品可以存储在第一存储器中。
Figure 10 shows a possible example of the autonomous vehicle 100 and the on-board execution device 50. As shown in Figure 10, the autonomous vehicle 100 is provided with the on-board execution device 50, which includes a first processor 50A, a first memory 50B, and computer instructions stored in the first memory and executable on the first processor. When the first processor runs the computer instructions in the first memory, it performs the method corresponding to the following steps: S91, obtaining a target detection result by inference through the vehicle-side target detection model; S92, collecting, according to the target detection result, data valuable for improving the performance of the vehicle-side target detection model; S93, iterating the vehicle-side target detection model currently in use into the trained vehicle-side target detection model, wherein the trained vehicle-side target detection model is a model obtained by the cloud execution device annotating said data valuable for improving the performance of the vehicle-side target detection model through a data-driven model and training the vehicle-side target detection model with the annotation result. In some possible designs, the on-board execution device 50 may also be implemented as another electronic device with a similar memory and processor architecture.
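Read together, S91–S93 form an on-vehicle loop of inference, value-based collection, and model hot-swap. The following is a schematic sketch only; the callables and their interfaces are illustrative assumptions, not part of this application:

    def vehicle_side_loop(model, frames, is_valuable, upload, check_for_new_model):
        """Schematic of S91-S93; every callable here is a hypothetical interface."""
        for frame in frames:
            detections = model.infer(frame)        # S91: on-vehicle inference
            if is_valuable(frame, detections):     # S92: keep only data worth uploading
                upload(frame, detections)
            new_model = check_for_new_model()      # S93: swap in the retrained model
            if new_model is not None:
                model = new_model
        return model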
Based on the same inventive concept, an embodiment of this application further provides a cloud execution device. As shown in Figure 11, the cloud execution device 60 may include one or more second processors 60A, one or more second memories 60B, and computer instructions stored in the second memory and executable on the second processor. When the second processor runs the computer instructions in the second memory, it performs the functions corresponding to the cloud execution device in the various embodiments provided by this application. The second processor may be configured as one or more general-purpose processors (e.g., CPU, GPU), one or more special-purpose processors (e.g., ASIC), one or more field-programmable gate arrays (FPGA), one or more digital signal processors (DSP), one or more integrated circuits, and/or one or more microcontrollers. The second memory may be configured as one or more read-only memories (ROM), one or more random access memories (RAM), one or more flash memories, one or more EPROMs, one or more EEPROMs, one or more embedded multimedia cards (eMMC), and/or one or more hard disk drives. The functions corresponding to the cloud execution device may be implemented as a computer program product which, when run on a computer, implements those functions. In one possible example, the computer program product implementing the corresponding functions may be stored in the second memory.
Figure 11 shows a possible example of the cloud execution device 60, including a second processor 60A, a second memory 60B, and computer instructions stored in the second memory and executable on the second processor. When the second processor runs the computer instructions in the second memory, it performs the method corresponding to the following steps: S101, collecting, according to the target detection result, data valuable for improving the performance of the vehicle-side target detection model; S102, annotating said data valuable for improving the performance of the vehicle-side target detection model through a data-driven model, and training the vehicle-side target detection model with the annotation result; S103, iterating the vehicle-side target detection model in use by the on-board execution device into the trained vehicle-side model. In some possible designs, the cloud execution device 60 may also be implemented as another electronic device with a similar memory and processor architecture.
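Correspondingly, S101–S103 are the cloud half of the loop: take the collected data, auto-label it with the data-driven model, retrain the vehicle-side model, and push the result back to vehicles. A hedged sketch with hypothetical interfaces:

    def cloud_side_loop(collected, labeler, trainer, deploy):
        """Schematic of S101-S103; `collected` is the output of S101, and
        labeler/trainer/deploy are hypothetical cloud-side interfaces."""
        labels = [labeler.annotate(sample) for sample in collected]  # S102: auto-label
        new_model = trainer.train(collected, labels)                 # S102: retrain
        deploy(new_model)                                            # S103: push to vehicles
        return new_model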
From the description of the above embodiments, those skilled in the art can clearly understand that, for convenience and brevity of description, only the division into the above functional modules is used as an example; in practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the apparatus can be divided into different functional modules to complete all or part of the functions described above.

In the several embodiments provided by this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another apparatus, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.

Units described as separate components may or may not be physically separate; a component shown as a unit may be one physical unit or multiple physical units, that is, it may be located in one place or distributed across multiple different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. Based on this understanding, the technical solution of the embodiments of this application — in essence, the part contributing to the prior art, or all or part of the technical solution — may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The above is merely a specific implementation of this application, but the scope of protection of this application is not limited thereto; any variation or replacement within the technical scope disclosed by this application shall be covered by the scope of protection of this application. Therefore, the scope of protection of this application shall be subject to the scope of protection of the claims.

Claims (45)

  1. An automated iteration method for a target detection model, characterized by comprising:
    obtaining a target detection result by inference through a vehicle-side target detection model using vehicle-side computing resources;
    collecting, according to the target detection result, data valuable for improving the performance of the vehicle-side target detection model;
    using cloud computing resources, annotating said data valuable for improving the performance of the vehicle-side target detection model through a data-driven model, and training the vehicle-side target detection model with the annotation result;
    iterating the vehicle-side target detection model currently in use by the vehicle-side computing resources into the trained vehicle-side target detection model.
  2. The method according to claim 1, characterized in that the vehicle-side target detection model is a neural network model with a multi-task, lightweight network structure.
  3. The method according to claim 1, characterized in that the data-driven model is a deep learning model with a single-task, deep-feature network structure.
  4. The method according to claim 1, characterized in that collecting, according to the target detection result, data valuable for improving the performance of the vehicle-side target detection model comprises:
    determining and collecting the data valuable for improving the performance of the vehicle-side target detection model according to the target detection result and its spatio-temporal synchronization information;
    wherein the spatio-temporal synchronization information includes one or more of environment data, map data, vehicle body state data, and driver operation data that are synchronized with the target detection result in time and space.
  5. The method according to claim 4, characterized in that determining and collecting the data valuable for improving the performance of the vehicle-side target detection model according to the target detection result and its spatio-temporal synchronization information comprises:
    constructing a scene using the target detection result and its spatio-temporal synchronization information;
    when it is determined that the scene is missing from the existing scene library, collecting the target detection result and its spatio-temporal synchronization information as the data valuable for improving the performance of the vehicle-side target detection model.
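In one possible example, claim 5's "missing from the scene library" test — assuming scenes can be reduced to hashable descriptors, a design the claim leaves open — reduces to a set-membership check:

    def should_collect(scene_descriptor, scene_library):
        """Collect only if this scene is not yet in the library.
        scene_descriptor is assumed hashable (e.g. a tuple of discretized
        scene features); the real descriptor design is not specified here."""
        if scene_descriptor in scene_library:
            return False
        scene_library.add(scene_descriptor)  # remember it so later duplicates are skipped
        return True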
  6. The method according to claim 5, characterized in that determining and collecting the data valuable for improving the performance of the vehicle-side target detection model according to the target detection result and its spatio-temporal synchronization information comprises:
    when an anomaly is detected in the target detection result and/or the spatio-temporal synchronization information, collecting the target detection result and its spatio-temporal synchronization information as the data valuable for improving the performance of the vehicle-side target detection model.
  7. The method according to claim 6, characterized in that detecting an anomaly in the target detection result and/or the spatio-temporal synchronization information includes at least one of the following:
    determining that the target detection result and/or the spatio-temporal synchronization information does not belong to a regular scene;
    performing a consistency check on target detection results obtained based on different algorithm logics and determining that the check result does not reach a predetermined consistency lower bound;
    determining that the target detection result does not match an expected value.
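The three triggers of claim 7 could be combined as in the sketch below; the threshold value and the use of box overlap as the cross-algorithm consistency metric are illustrative assumptions:

    def detection_is_anomalous(result_a, result_b, expected, box_overlap,
                               consistency_floor=0.5, is_regular_scene=True):
        """Hypothetical realization of claim 7's triggers. result_a/result_b
        come from different algorithm logics; box_overlap is a callable
        returning an overlap score; expected may be None when unknown."""
        # Trigger 1: the scene is not a regular one.
        if not is_regular_scene:
            return True
        # Trigger 2: cross-algorithm consistency below the preset lower bound.
        if box_overlap(result_a["bbox"], result_b["bbox"]) < consistency_floor:
            return True
        # Trigger 3: the result does not match the expected value.
        if expected is not None and result_a["class"] != expected["class"]:
            return True
        return False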
  8. The method according to claim 4, characterized in that determining and collecting the data valuable for improving the performance of the vehicle-side target detection model according to the target detection result and its spatio-temporal synchronization information comprises:
    when it is determined that the target detection result matches the expected value and the degree of matching reaches a predetermined matching threshold, or when a consistency check is performed on target detection results obtained based on different algorithm logics and the check result is determined to reach a predetermined consistency upper bound,
    collecting the target detection result and its spatio-temporal synchronization information as the data valuable for improving the performance of the vehicle-side target detection model.
  9. The method according to any one of claims 5-8, characterized in that the vehicle-side target detection model detects targets using images or laser point clouds.
  10. The method according to claim 9, characterized in that annotating, using cloud computing resources, said data valuable for improving the performance of the vehicle-side target detection model through the data-driven model comprises:
    using the cloud computing resources, annotating, through the data-driven model, the images or laser point clouds carrying target detection results in said data valuable for improving the performance of the vehicle-side target detection model.
  11. The method according to claim 10, characterized in that, before annotating, using the cloud computing resources and through the data-driven model, the images or laser point clouds carrying target detection results in said data valuable for improving the performance of the vehicle-side target detection model, the method further comprises:
    when a target detection result is determined, in combination with its spatio-temporal synchronization information, to be an erroneous detection result, deleting that target detection result from said data valuable for improving the performance of the vehicle-side target detection model.
  12. The method according to claim 10, characterized in that annotating, through the data-driven model, the images or laser point clouds carrying target detection results in said data valuable for improving the performance of the vehicle-side target detection model comprises:
    inputting the images or laser point clouds carrying target detection results into the data-driven model;
    performing, through the data-driven model, local detection and global detection on the images or laser point clouds carrying target detection results, performing a consistency check on the target detection result, the local detection result, and the global detection result, and determining a target label according to the check result.
  13. The method according to claim 12, characterized in that the target detection result includes a first target class class1, a first target detection box bbox1, and a first confidence score1 obtained by the vehicle-side target detection model detecting a target using an image or laser point cloud;
    then, performing local detection through the data-driven model on the image or laser point cloud containing the target detection result comprises:
    through the data-driven model, in the whole-frame image or whole-frame laser point cloud carrying the first target detection box bbox1, expanding outward by a preset range with the first target detection box bbox1 as the center to obtain a local detection region, detecting targets in the local detection region, and outputting a local detection result, the local detection result including a second target class class2, a second target detection box bbox2, and a second confidence score2;
    and performing global detection through the data-driven model on the image or laser point cloud containing the target detection result comprises:
    detecting targets, through the data-driven model, in the whole-frame image or whole-frame laser point cloud carrying the first target detection box bbox1, and outputting a global detection result, the global detection result including a third target class class3, a third target detection box bbox3, and a third confidence score3.
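The local detection region of claim 13 — bbox1 grown outward by a preset margin and clipped to the frame — can be sketched directly; the margin value stands in for the preset range the claim leaves open:

    def local_region(bbox1, margin, frame_w, frame_h):
        """Expand bbox1 = (x1, y1, x2, y2) by `margin` pixels on every side,
        clipped to the frame bounds; the margin value is an assumed preset."""
        x1, y1, x2, y2 = bbox1
        return (max(0, x1 - margin), max(0, y1 - margin),
                min(frame_w, x2 + margin), min(frame_h, y2 + margin))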
  14. The method according to claim 13, characterized in that the target label includes a fourth target class class4 and a fourth target detection box bbox4;
    performing a consistency check on the target detection result, the local detection result, and the global detection result through the data-driven model, and determining the target label according to the check result, comprises:
    computing the degree of overlap of the first target detection box bbox1, the second target detection box bbox2, and the third target detection box bbox3 using an intersection-over-union algorithm;
    when the degree of overlap reaches a predetermined overlap threshold, comparing the consistency of the first target class class1, the second target class class2, and the third target class class3;
    if the second target class class2 is consistent with the third target class class3 and the first target class class1 is inconsistent with the second target class class2, determining the fourth target class class4 as the second target class class2 or the third target class class3, and determining the fourth target detection box bbox4 as the target detection box corresponding to the larger of the second confidence score2 and the third confidence score3;
    if the first target class class1 is consistent with the second target class class2 and the second target class class2 is inconsistent with the third target class class3, determining the fourth target class class4 as the first target class class1 or the second target class class2, and determining the fourth target detection box bbox4 as the target detection box corresponding to the larger of the first confidence score1 and the second confidence score2;
    if the first target class class1 is consistent with the third target class class3 and the second target class class2 is inconsistent with the third target class class3, determining the fourth target class class4 as the first target class class1 or the third target class class3, and determining the fourth target detection box bbox4 as the target detection box corresponding to the larger of the first confidence score1 and the third confidence score3;
    if the first target class class1, the second target class class2, and the third target class class3 are all consistent, determining the fourth target class class4 as the first target class class1, the second target class class2, or the third target class class3, and determining the fourth target detection box bbox4 as the target detection box corresponding to the largest of the first confidence score1, the second confidence score2, and the third confidence score3.
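In code, claim 14's check amounts to: if the three boxes overlap enough, take a class on which at least two detections agree and the box of the highest-confidence agreeing detection. The sketch below uses the pairwise minimum IoU as the three-way overlap, which is one possible reading — the claim does not fix that choice, and the threshold is an assumption:

    def iou(a, b):
        """Intersection over union of two boxes (x1, y1, x2, y2)."""
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        union = area_a + area_b - inter
        return inter / union if union > 0 else 0.0

    def fuse_label(d1, d2, d3, iou_thresh=0.5):
        """d1/d2/d3 = (class, bbox, score) from the vehicle-side model, the
        local detection, and the global detection; returns (class4, bbox4)
        per claim 14, or None when the overlap or agreement test fails."""
        overlap = min(iou(d1[1], d2[1]), iou(d1[1], d3[1]), iou(d2[1], d3[1]))
        if overlap < iou_thresh:
            return None
        dets = [d1, d2, d3]
        if d1[0] == d2[0] == d3[0]:
            winner = max(dets, key=lambda d: d[2])   # all agree: highest confidence
            return winner[0], winner[1]
        for i in range(3):
            for j in range(i + 1, 3):
                if dets[i][0] == dets[j][0]:         # exactly one agreeing pair
                    winner = dets[i] if dets[i][2] >= dets[j][2] else dets[j]
                    return winner[0], winner[1]
        return None  # all three disagree: the manual-labeling path of claim 16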
  15. The method according to claim 14, characterized in that the target label further includes a hard-example level; then performing a consistency check on the target detection result, the local detection result, and the global detection result through the data-driven model, and determining the target label according to the check result, further comprises:
    if the second target class class2 is consistent with the third target class class3 and the first target class class1 is inconsistent with the second target class class2, determining the hard-example level as level one;
    if the first target class class1 is consistent with the second target class class2 and the second target class class2 is inconsistent with the third target class class3, determining the hard-example level as level two;
    if the first target class class1 is consistent with the third target class class3 and the second target class class2 is inconsistent with the third target class class3, determining the hard-example level as level two;
    if the first target class class1, the second target class class2, and the third target class class3 are all consistent, determining the hard-example level as level three.
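Claim 15's hard-example level is a direct function of which classes agree; a near-literal transcription:

    def hard_example_level(c1, c2, c3):
        """Hard-example level per claim 15 (lower means harder); returns None
        when all three classes differ, i.e. claim 16's manually labeled case."""
        if c1 == c2 == c3:
            return 3                  # full agreement: easiest
        if c2 == c3 and c1 != c2:
            return 1                  # only the vehicle-side model disagrees
        if (c1 == c2 and c2 != c3) or (c1 == c3 and c2 != c3):
            return 2
        return None                   # no agreement at all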
  16. The method according to claim 15, characterized in that
    performing a consistency check on the target detection result, the local detection result, and the global detection result through the data-driven model, and determining the target label according to the check result, further comprises: if the first target class class1, the second target class class2, and the third target class class3 are all mutually inconsistent, determining the images or laser point clouds carrying the target detection result, the local detection result, and the global detection result as a hard-example dataset and outputting it;
    then the automated iteration method for a target detection model further comprises: receiving target detection boxes, target classes, and hard-example levels manually annotated on the hard-example dataset, wherein the hard-example level is level zero;
    and training the vehicle-side target detection model with the annotation result comprises: combining the manually annotated target detection boxes, target classes, and hard-example levels with the target labels determined by the data-driven model as the annotation result to train the vehicle-side target detection model.
  17. The method according to claim 15, characterized in that training the vehicle-side target detection model with the annotation result comprises:
    training the vehicle-side target detection model with the images or laser point clouds carrying the target labels as the annotation result;
    wherein the process of training the vehicle-side target detection model comprises:
    modifying the parameters of the vehicle-side target detection model according to the fourth target class class4 and the fourth target detection box bbox4;
    modifying the weight parameter of the loss function of the vehicle-side target detection model according to the hard-example level, wherein the lower the hard-example level, the larger the weight parameter.
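Claim 17 only fixes the direction of the weighting (lower level, larger weight), not the mapping itself; one hedged choice is a doubling per level:

    def loss_weight(level, base=1.0):
        """Map hard-example level (0 = manually labeled hard case ... 3 = easy)
        to a loss weight; the 2x-per-level scaling is an illustrative choice."""
        return base * (2.0 ** (3 - level))

    # e.g. level 0 -> 8.0, level 1 -> 4.0, level 2 -> 2.0, level 3 -> 1.0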
  18. A vehicle-side automated iteration method for a target detection model, characterized by comprising:
    obtaining a target detection result by inference through a vehicle-side target detection model;
    in cooperation with cloud computing resources, collecting, according to the target detection result, data valuable for improving the performance of the vehicle-side target detection model;
    in cooperation with the cloud computing resources, iterating the vehicle-side target detection model in use into the trained vehicle-side target detection model; wherein the trained vehicle-side target detection model is a model obtained by the cloud computing resources annotating said data valuable for improving the performance of the vehicle-side target detection model through a data-driven model and training the vehicle-side target detection model with the annotation result.
  19. The method according to claim 18, characterized in that the vehicle-side target detection model is a neural network model with a multi-task, lightweight network structure.
  20. The method according to claim 18, characterized in that collecting, in cooperation with the cloud computing resources and according to the target detection result, data valuable for improving the performance of the vehicle-side target detection model comprises:
    in cooperation with the cloud computing resources, determining and collecting the data valuable for improving the performance of the vehicle-side target detection model according to the target detection result and its spatio-temporal synchronization information;
    wherein the spatio-temporal synchronization information includes one or more of environment data, map data, vehicle body state data, and driver operation data that are synchronized with the target detection result in time and space.
  21. The method according to claim 20, characterized in that determining and collecting, in cooperation with the cloud computing resources, the data valuable for improving the performance of the vehicle-side target detection model according to the target detection result and its spatio-temporal synchronization information comprises:
    constructing a scene using the target detection result and its spatio-temporal synchronization information, and uploading the scene to the cloud computing resources;
    upon receiving a command issued by the cloud computing resources, collecting the target detection result and its spatio-temporal synchronization information as the data valuable for improving the performance of the vehicle-side target detection model and uploading them to the cloud computing resources, wherein the cloud computing resources issue the command upon determining that the scene is missing from the existing scene library.
  22. The method according to claim 20, characterized in that determining and collecting, in cooperation with the cloud computing resources, the data valuable for improving the performance of the vehicle-side target detection model according to the target detection result and its spatio-temporal synchronization information comprises:
    when an anomaly is detected in the target detection result and/or the spatio-temporal synchronization information, collecting the target detection result and its spatio-temporal synchronization information as the data valuable for improving the performance of the vehicle-side target detection model and uploading them to the cloud computing resources.
  23. The method according to claim 22, characterized in that detecting an anomaly in the target detection result and/or the spatio-temporal synchronization information includes at least one of the following:
    determining that the target detection result and/or the spatio-temporal synchronization information does not belong to a regular scene;
    performing a consistency check on target detection results obtained based on different algorithm logics and determining that the check result does not reach a predetermined consistency lower bound;
    determining that the target detection result does not match an expected value.
  24. The method according to claim 20, characterized in that determining and collecting, in cooperation with the cloud computing resources, the data valuable for improving the performance of the vehicle-side target detection model according to the target detection result and its spatio-temporal synchronization information comprises:
    when it is determined that the target detection result matches the expected value and the degree of matching reaches a predetermined matching threshold, or when a consistency check is performed on target detection results obtained based on different algorithm logics and the check result is determined to reach a predetermined consistency upper bound,
    collecting the target detection result and its spatio-temporal synchronization information as the data valuable for improving the performance of the vehicle-side target detection model and uploading them to the cloud computing resources.
  25. The method according to claim 18, characterized in that the vehicle-side target detection model detects targets using images or laser point clouds.
  26. A cloud-side automated iteration method for a target detection model, characterized by comprising:
    in cooperation with vehicle-side computing resources, collecting, according to the target detection result obtained by the vehicle-side computing resources through inference with the vehicle-side target detection model, data valuable for improving the performance of the vehicle-side target detection model;
    annotating said data valuable for improving the performance of the vehicle-side target detection model through a data-driven model, and training the vehicle-side target detection model with the annotation result;
    in cooperation with the vehicle-side computing resources, iterating the vehicle-side target detection model in use by the vehicle-side computing resources into the trained vehicle-side target detection model.
  27. The method according to claim 26, characterized in that the data-driven model is a deep learning model with a single-task, deep-feature network structure.
  28. The method according to claim 26, characterized in that collecting, in cooperation with the vehicle-side computing resources and according to the target detection result obtained by the vehicle-side computing resources through inference with the vehicle-side target detection model, data valuable for improving the performance of the vehicle-side target detection model comprises:
    in cooperation with the vehicle-side computing resources, determining and collecting the data valuable for improving the performance of the vehicle-side target detection model according to the target detection result and its spatio-temporal synchronization information;
    wherein the spatio-temporal synchronization information includes one or more of environment data, map data, vehicle body state data, and driver operation data that are synchronized with the target detection result in time and space.
  29. The method according to claim 28, characterized in that determining and collecting, in cooperation with the vehicle-side computing resources, the data valuable for improving the performance of the vehicle-side target detection model according to the target detection result and its spatio-temporal synchronization information comprises:
    receiving the scene uploaded by the vehicle-side computing resources, wherein the vehicle-side computing resources construct the scene using the target detection result and its spatio-temporal synchronization information;
    upon determining that the scene is missing from the existing scene library, commanding the vehicle-side computing resources to collect and upload the target detection result and its spatio-temporal synchronization information as the data valuable for improving the performance of the vehicle-side target detection model.
  30. The method according to claim 28, characterized in that determining and collecting, in cooperation with the vehicle-side computing resources, the data valuable for improving the performance of the vehicle-side target detection model according to the target detection result and its spatio-temporal synchronization information comprises:
    receiving the target detection result and its spatio-temporal synchronization information uploaded by the vehicle-side computing resources, wherein, when the vehicle-side computing resources detect an anomaly in the target detection result and/or the spatio-temporal synchronization information, they collect and upload the target detection result and its spatio-temporal synchronization information as the data valuable for improving the performance of the vehicle-side target detection model.
  31. The method according to claim 28, characterized in that determining and collecting, in cooperation with the vehicle-side computing resources, the data valuable for improving the performance of the vehicle-side target detection model according to the target detection result and its spatio-temporal synchronization information comprises:
    receiving the target detection result and its spatio-temporal synchronization information uploaded by the vehicle-side computing resources, wherein the vehicle-side computing resources upload the target detection result and its spatio-temporal synchronization information as said data valuable for improving the performance of the vehicle-side target detection model upon determining that the target detection result matches the expected value and the degree of matching reaches a predetermined matching threshold, or upon performing a consistency check on target detection results obtained based on different algorithm logics and determining that the check result reaches a predetermined consistency upper bound.
  32. The method according to claim 26, characterized in that the vehicle-side target detection model detects targets using images or laser point clouds; then annotating said data valuable for improving the performance of the vehicle-side target detection model through the data-driven model comprises:
    annotating, through the data-driven model, the images or laser point clouds carrying target detection results in said data valuable for improving the performance of the vehicle-side target detection model.
  33. The method according to claim 32, characterized in that annotating, through the data-driven model, the images or laser point clouds carrying target detection results in said data valuable for improving the performance of the vehicle-side target detection model comprises:
    inputting the images or laser point clouds carrying target detection results into the data-driven model;
    performing, through the data-driven model, local detection and global detection on the images or laser point clouds carrying target detection results, performing a consistency check on the target detection result, the local detection result, and the global detection result, and determining a target label according to the check result.
  34. The method according to claim 33, characterized in that the target detection result includes a first target class class1, a first target detection box bbox1, and a first confidence score1 obtained by the vehicle-side target detection model detecting a target using an image or laser point cloud;
    then, performing local detection through the data-driven model on the image or laser point cloud containing the target detection result comprises:
    through the data-driven model, in the whole-frame image or whole-frame laser point cloud carrying the first target detection box bbox1, expanding outward by a preset range with the first target detection box bbox1 as the center to obtain a local detection region, detecting targets in the local detection region, and outputting a local detection result, the local detection result including a second target class class2, a second target detection box bbox2, and a second confidence score2;
    and performing global detection through the data-driven model on the image or laser point cloud containing the target detection result comprises:
    detecting targets, through the data-driven model, in the whole-frame image or whole-frame laser point cloud carrying the first target detection box bbox1, and outputting a global detection result, the global detection result including a third target class class3, a third target detection box bbox3, and a third confidence score3.
  35. The method according to claim 34, characterized in that the target label includes a fourth target class class4 and a fourth target detection box bbox4;
    performing a consistency check on the target detection result, the local detection result, and the global detection result through the data-driven model, and determining the target label according to the check result, comprises:
    computing the degree of overlap of the first target detection box bbox1, the second target detection box bbox2, and the third target detection box bbox3 using an intersection-over-union algorithm;
    when the degree of overlap reaches a predetermined overlap threshold, comparing the consistency of the first target class class1, the second target class class2, and the third target class class3;
    if the second target class class2 is consistent with the third target class class3 and the first target class class1 is inconsistent with the second target class class2, determining the fourth target class class4 as the second target class class2 or the third target class class3, and determining the fourth target detection box bbox4 as the target detection box corresponding to the larger of the second confidence score2 and the third confidence score3;
    if the first target class class1 is consistent with the second target class class2 and the second target class class2 is inconsistent with the third target class class3, determining the fourth target class class4 as the first target class class1 or the second target class class2, and determining the fourth target detection box bbox4 as the target detection box corresponding to the larger of the first confidence score1 and the second confidence score2;
    if the first target class class1 is consistent with the third target class class3 and the second target class class2 is inconsistent with the third target class class3, determining the fourth target class class4 as the first target class class1 or the third target class class3, and determining the fourth target detection box bbox4 as the target detection box corresponding to the larger of the first confidence score1 and the third confidence score3;
    if the first target class class1, the second target class class2, and the third target class class3 are all consistent, determining the fourth target class class4 as the first target class class1, the second target class class2, or the third target class class3, and determining the fourth target detection box bbox4 as the target detection box corresponding to the largest of the first confidence score1, the second confidence score2, and the third confidence score3.
  36. The method according to claim 35, characterized in that the target label further includes a hard-example level; then performing a consistency check on the target detection result, the local detection result, and the global detection result through the data-driven model, and determining the target label according to the check result, further comprises:
    if the second target class class2 is consistent with the third target class class3 and the first target class class1 is inconsistent with the second target class class2, determining the hard-example level as level one;
    if the first target class class1 is consistent with the second target class class2 and the second target class class2 is inconsistent with the third target class class3, determining the hard-example level as level two;
    if the first target class class1 is consistent with the third target class class3 and the second target class class2 is inconsistent with the third target class class3, determining the hard-example level as level two;
    if the first target class class1, the second target class class2, and the third target class class3 are all consistent, determining the hard-example level as level three.
  37. The method according to claim 36, characterized in that
    performing a consistency check on the target detection result, the local detection result, and the global detection result through the data-driven model, and determining the target label according to the check result, further comprises: if the first target class class1, the second target class class2, and the third target class class3 are all mutually inconsistent, determining the images or laser point clouds carrying the target detection result, the local detection result, and the global detection result as a hard-example dataset and outputting it;
    then the cloud-side automated iteration method for a target detection model further comprises: receiving target detection boxes, target classes, and hard-example levels manually annotated on the hard-example dataset, wherein the hard-example level is level zero;
    and training the vehicle-side target detection model with the annotation result comprises: combining the manually annotated target detection boxes, target classes, and hard-example levels with the target labels determined by the data-driven model as the annotation result to train the vehicle-side target detection model.
  38. The method according to claim 37, characterized in that training the vehicle-side target detection model with the annotation result comprises:
    training the vehicle-side target detection model with the images or laser point clouds carrying the target labels as the annotation result;
    wherein the process of training the vehicle-side target detection model comprises:
    modifying the parameters of the vehicle-side target detection model according to the fourth target class class4 and the fourth target detection box bbox4;
    modifying the weight parameter of the loss function of the vehicle-side target detection model according to the hard-example level, wherein the lower the hard-example level, the larger the weight parameter.
  39. A vehicle-side execution device, characterized by comprising:
    a vehicle-side computing module configured with a vehicle-side target detection model, obtaining a target detection result by inference through the vehicle-side target detection model;
    a vehicle-side collection module configured to cooperate with a cloud execution device to collect, according to the target detection result, data valuable for improving the performance of the vehicle-side target detection model; wherein the cloud execution device annotates said data valuable for improving the performance of the vehicle-side target detection model through a data-driven model and trains the vehicle-side target detection model with the annotation result;
    the vehicle-side computing module being further configured to cooperate with the cloud execution device to iterate the configured vehicle-side target detection model into the trained vehicle-side target detection model.
  40. A cloud execution device, characterized by comprising:
    a cloud collection module configured to cooperate with a vehicle-side execution device to collect, according to the target detection result obtained by the vehicle-side execution device through inference with the vehicle-side target detection model, data valuable for improving the performance of the vehicle-side target detection model;
    an automatic annotation module configured to annotate said data valuable for improving the performance of the vehicle-side target detection model through a data-driven model;
    a training module configured to train the vehicle-side target detection model with the annotation result;
    an iteration module configured to cooperate with the vehicle-side execution device to iterate the vehicle-side target detection model in use by the vehicle-side computing resources into the trained vehicle-side target detection model.
  41. An electronic device, characterized by comprising: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the steps of the method according to any one of claims 18-25.
  42. An autonomous vehicle, characterized by comprising the electronic device according to claim 41.
  43. A storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, the steps of the method according to any one of claims 18-25 are implemented.
  44. An electronic device, characterized by comprising: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the steps of the method according to any one of claims 26-38.
  45. A storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, the steps of the method according to any one of claims 26-38 are implemented.
PCT/CN2022/120032 2021-09-22 2022-09-20 Automated iteration method for target detection model, device, and storage medium WO2023045935A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111108053.1A 2021-09-22 Automated iteration method for target detection model, device, and storage medium
CN202111108053.1 2021-09-22

Publications (1)

Publication Number Publication Date
WO2023045935A1 true WO2023045935A1 (zh) 2023-03-30

Family

ID=79462387

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/120032 WO2023045935A1 (zh) 2021-09-22 2022-09-20 Automated iteration method for target detection model, device, and storage medium

Country Status (2)

Country Link
CN (1) CN113962141A (zh)
WO (1) WO2023045935A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116503695A (zh) * 2023-06-29 2023-07-28 天津所托瑞安汽车科技有限公司 Training method for target detection model, target detection method, and device
CN116665025A (zh) * 2023-07-31 2023-08-29 福思(杭州)智能科技有限公司 Data closed-loop method and system
CN116680752A (zh) * 2023-05-23 2023-09-01 杭州水立科技有限公司 Water conservancy project safety monitoring method and system based on data processing
CN116681123A (zh) * 2023-07-31 2023-09-01 福思(杭州)智能科技有限公司 Perception model training method and apparatus, computer device, and storage medium
CN116977810A (zh) * 2023-09-25 2023-10-31 之江实验室 Multi-modal post-fusion long-tail category detection method and system

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113962141A (zh) 2021-09-22 2022-01-21 北京智行者科技有限公司 Automated iteration method for target detection model, device, and storage medium
CN114418021B (zh) * 2022-01-25 2024-03-26 腾讯科技(深圳)有限公司 Model optimization method and apparatus, and computer program product
CN116150221B (zh) * 2022-10-09 2023-07-14 浙江博观瑞思科技有限公司 Information interaction method and system serving enterprise e-commerce operation management
CN117894015A (zh) * 2024-03-15 2024-04-16 浙江华是科技股份有限公司 Point cloud annotation data selection method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190354817A1 (en) * 2018-05-18 2019-11-21 Google Llc Learning Data Augmentation Strategies for Object Detection
CN112347818A (zh) * 2019-08-08 2021-02-09 初速度(苏州)科技有限公司 Hard sample image screening method and apparatus for video target detection model
CN112347817A (zh) * 2019-08-08 2021-02-09 初速度(苏州)科技有限公司 Video target detection and tracking method and apparatus
CN112733666A (zh) * 2020-12-31 2021-04-30 湖北亿咖通科技有限公司 Hard-example image collection and model training method, device, and storage medium
CN113962141A (zh) * 2021-09-22 2022-01-21 北京智行者科技有限公司 Automated iteration method for target detection model, device, and storage medium


Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116680752A (zh) * 2023-05-23 2023-09-01 杭州水立科技有限公司 Water conservancy project safety monitoring method and system based on data processing
CN116680752B (zh) * 2023-05-23 2024-03-19 杭州水立科技有限公司 Water conservancy project safety monitoring method and system based on data processing
CN116503695A (zh) * 2023-06-29 2023-07-28 天津所托瑞安汽车科技有限公司 Training method for target detection model, target detection method, and device
CN116503695B (zh) * 2023-06-29 2023-10-03 天津所托瑞安汽车科技有限公司 Training method for target detection model, target detection method, and device
CN116665025A (zh) * 2023-07-31 2023-08-29 福思(杭州)智能科技有限公司 Data closed-loop method and system
CN116681123A (zh) * 2023-07-31 2023-09-01 福思(杭州)智能科技有限公司 Perception model training method and apparatus, computer device, and storage medium
CN116665025B (zh) * 2023-07-31 2023-11-14 福思(杭州)智能科技有限公司 Data closed-loop method and system
CN116681123B (zh) * 2023-07-31 2023-11-14 福思(杭州)智能科技有限公司 Perception model training method and apparatus, computer device, and storage medium
CN116977810A (zh) * 2023-09-25 2023-10-31 之江实验室 Multi-modal post-fusion long-tail category detection method and system
CN116977810B (zh) * 2023-09-25 2024-01-09 之江实验室 Multi-modal post-fusion long-tail category detection method and system

Also Published As

Publication number Publication date
CN113962141A (zh) 2022-01-21

Similar Documents

Publication Publication Date Title
WO2023045935A1 (zh) 2023-03-30 Automated iteration method for target detection model, device, and storage medium
US10762360B2 (en) Automatically detecting unmapped drivable road surfaces for autonomous vehicles
US11774966B2 (en) Generating testing instances for autonomous vehicles
US11630458B2 (en) Labeling autonomous vehicle data
CN109211575B (zh) Unmanned vehicle and field test method and apparatus therefor, and readable medium
US11256263B2 (en) Generating targeted training instances for autonomous vehicles
WO2023045936A1 (zh) 2023-03-30 Automated model iteration method, device, and storage medium
US20210403034A1 (en) Systems and Methods for Optimizing Trajectory Planner Based on Human Driving Behaviors
WO2021196052A1 (zh) 2021-10-07 Driving data collection method and apparatus
US20220198107A1 (en) Simulations for evaluating driving behaviors of autonomous vehicles
US11754719B2 (en) Object detection based on three-dimensional distance measurement sensor point cloud data
CN113498529B (zh) Target tracking method and apparatus
WO2020123105A1 (en) Detecting spurious objects for autonomous vehicles
CN114880842A (zh) 2022-08-09 Automated iteration method for trajectory prediction model, electronic device, and storage medium
US20230048680A1 (en) Method and apparatus for passing through barrier gate crossbar by vehicle
CN112810603B (zh) Positioning method and related product
CN113859265A (zh) 2021-12-31 Reminder method and device during driving
CN114255275A (zh) 2022-03-29 Map construction method and computing device
US20230294736A1 (en) Offline Tracking System for Autonomous Vehicle Control Systems
US20230192121A1 (en) Class-aware depth data clustering
CN113741384B (zh) Method and apparatus for testing an autonomous driving system
US20230399008A1 (en) Multistatic radar point cloud formation using a sensor waveform encoding schema
US20230243952A1 (en) Unified radar perception architecture
US20230194692A1 (en) Radar tracking association with velocity matching by leveraging kinematics priors
US11907050B1 (en) Automated event analysis

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22871981

Country of ref document: EP

Kind code of ref document: A1