WO2021143231A1 - Training method for a target detection model, and data labeling method and apparatus - Google Patents

Training method for a target detection model, and data labeling method and apparatus

Info

Publication number
WO2021143231A1
Authority
WO
WIPO (PCT)
Prior art keywords
detection model
labeled
labeling
data
target detection
Prior art date
Application number
PCT/CN2020/121370
Other languages
English (en)
French (fr)
Inventor
江浩
马贤忠
胡皓瑜
董维山
Original Assignee
初速度(苏州)科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 初速度(苏州)科技有限公司
Priority to DE112020003158.6T (DE112020003158T5)
Publication of WO2021143231A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Definitions

  • The invention relates to the technical field of autonomous driving, and in particular to a training method for a target detection model and a method and apparatus for labeling data.
  • In the field of autonomous driving, the perception module takes data from multiple sensors and high-precision map information as input and, after a series of computations and processing, accurately perceives the environment around the autonomous vehicle.
  • Mainstream autonomous-driving perception algorithms are currently based on deep learning, and the training of deep-learning target detection models still relies on large-scale manually annotated data. Obtaining more annotated data at lower cost is therefore an urgent problem to be solved.
  • The loss function of a deep-learning target detection model generally includes a classification part and a regression part.
  • The regression part generally uses losses of the form L1, L2, or Smooth L1 on the difference between the predicted and ground-truth values of physical quantities such as position, size, and orientation angle, as well as losses of the form IoU (Intersection over Union), GIoU (Generalized Intersection over Union), or DIoU (Distance-IoU) between the predicted box and the ground-truth box.
  • The embodiments of the invention disclose a training method for a target detection model and a method and apparatus for labeling data, which effectively reduce the time labelers spend modifying auxiliary boxes, improve the labeling efficiency of continuous frame data, and reduce labeling costs.
  • In a first aspect, an embodiment of the present invention discloses a training method for a target detection model, the method including: acquiring sample data labeled with the target category and target position of a preset object; inputting the sample data into an initial detection model to obtain a predicted position of the preset object; and comparing the target position with the predicted position, adjusting the parameters of the initial detection model according to the comparison result, and taking the detection model at the point where the value of the regression part of the loss function converges as the target detection model.
  • The loss function of the target detection model includes a classification part and a regression part.
  • The value of the regression part is a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error.
  • The weight of each term is w raised to the k-th power, where w is a hyperparameter and k is the rank of the term after the normalized errors are sorted.
  • The normalized error is obtained by taking the absolute value of the difference between the predicted position and the target position and normalizing it with respect to the target position.
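  • Written out under the definitions above, the regression part admits the following compact form (a reconstruction from the prose; the direction of the sort and the starting rank are not fixed by the text):

```latex
% e_i: normalized error of the i-th regressed component
% e_{(k)}: the error of rank k after sorting, n components in total
\[
e_i = \frac{\lvert p_i - t_i \rvert}{\lvert t_i \rvert}, \qquad
L_{\mathrm{reg}} = \sum_{k=1}^{n} w^{k} \, e_{(k)}
\]
% With an ascending sort and 0 < w < 1, the smallest errors receive the
% largest weights, which pushes most components to near-zero error while
% tolerating a few larger deviations, matching the stated labeling goal.
```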
  • In a second aspect, an embodiment of the present invention further provides a method for labeling continuous frame data, applied to the cloud, the method including:
  • acquiring a labeling task and reading continuous frame data, the labeling task including the category and position of the object to be labeled;
  • performing target detection on each frame of the read continuous frame data based on a preset target detection model and according to the labeling task, and taking the category and position of the object to be labeled obtained in each frame as the detection result;
  • establishing, according to the detection result and the timing information between the frames, an association relationship between instances of the same object to be labeled across the frames, where the association relationship serves as the pre-labeling result of the continuous frame data and is used for correction at the labeling terminal.
  • The preset target detection model establishes the association between the object to be labeled and its category and position in each frame of data.
  • During training of the preset target detection model, the value of the regression part of its loss function is: a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  • Optionally, the method further includes:
  • correcting the detection result based on a machine learning method so that the same object to be labeled has the same size, where the machine learning method includes a Kalman filter algorithm.
  • Optionally, the labeling task further includes an output file format.
  • Correspondingly, the method further includes:
  • generating an extensible pre-labeling file from the pre-labeling result according to the output file format, and sending the pre-labeling file and the continuous frame data to the labeling terminal.
  • Optionally, the continuous frame data are images or lidar point clouds.
  • In a third aspect, an embodiment of the present invention further provides a method for labeling continuous frame data, applied to the labeling terminal, the method including: acquiring the pre-labeling result of continuous frame data sent by the cloud; and, if a correction instruction for the pre-labeling result is received, correcting the labeling result according to the correction instruction and taking the corrected labeling result as the target labeling result of the continuous frame data.
  • The pre-labeling result is the association relationship between instances of the same object to be labeled across frames, established by the cloud after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame based on the preset target detection model and according to the labeling task, together with the timing information between the frames. The detection result includes the category and position of the object to be labeled, and the preset target detection model is generated by the training method for a target detection model according to claim 1.
  • In a fourth aspect, an embodiment of the present invention further discloses a training apparatus for a target detection model, the apparatus including:
  • a sample data acquisition module, configured to acquire sample data labeled with the target category and target position of a preset object to be labeled;
  • a predicted position determination module, configured to input the sample data into an initial detection model to obtain a predicted position of the preset object;
  • a target detection model determination module, configured to compare the target position with the predicted position, adjust the parameters of the initial detection model according to the comparison result, and take the detection model at the point where the value of the regression part of the loss function converges as the target detection model.
  • The loss function of the target detection model includes a classification part and a regression part.
  • The value of the regression part is a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error.
  • The weight of each term is w raised to the k-th power, where w is a hyperparameter and k is the rank of the term after the normalized errors are sorted.
  • The normalized error is obtained by taking the absolute value of the difference between the predicted position and the target position and normalizing it with respect to the target position.
  • In a fifth aspect, an embodiment of the present invention further provides an apparatus for labeling continuous frame data, applied to the cloud, the apparatus including:
  • a continuous frame data acquisition module, configured to acquire a labeling task and read continuous frame data, the labeling task including the category and position of the object to be labeled;
  • a detection result determination module, configured to perform target detection on each frame of the read continuous frame data based on a preset target detection model and according to the labeling task, and to take the category and position of the object to be labeled obtained in each frame as the detection result;
  • an association relationship establishment module, configured to establish, according to the detection result and the timing information between the frames, an association relationship between instances of the same object to be labeled across the frames, where the association relationship serves as the pre-labeling result of the continuous frame data and is used for correction at the labeling terminal.
  • The preset target detection model establishes the association between the object to be labeled and its category and position in each frame of data.
  • During training of the preset target detection model, the value of the regression part of its loss function is: a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  • Optionally, the apparatus further includes:
  • a correction module, configured to correct the detection result based on a machine learning method so that the same object to be labeled has the same size, where the machine learning method includes a Kalman filter algorithm.
  • Optionally, the labeling task further includes an output file format.
  • Correspondingly, the apparatus further includes:
  • a file generation module, configured to generate an extensible pre-labeling file from the pre-labeling result according to the output file format, and to send the pre-labeling file and the continuous frame data to the labeling terminal.
  • In a sixth aspect, an embodiment of the present invention further provides an apparatus for labeling continuous frame data, applied to the labeling terminal, the apparatus including:
  • a pre-labeling result acquisition module, configured to acquire the pre-labeling result of continuous frame data sent by the cloud;
  • a correction module, configured to, if a correction instruction for the pre-labeling result is received, correct the labeling result according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data.
  • The pre-labeling result is the association relationship between instances of the same object to be labeled across frames, established by the cloud after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame based on the preset target detection model and according to the labeling task, together with the timing information between the frames. The detection result includes the category and position of the object to be labeled, and the preset target detection model is generated by the training method for a target detection model provided by any embodiment of the present invention.
  • In a seventh aspect, an embodiment of the present invention further provides a device, including:
  • a memory storing executable program code; and
  • a processor coupled with the memory;
  • wherein the processor calls the executable program code stored in the memory to execute some or all of the steps of the training method for a target detection model provided by any embodiment of the present invention.
  • In an eighth aspect, an embodiment of the present invention further provides a cloud server, including:
  • a memory storing executable program code; and
  • a processor coupled with the memory;
  • wherein the processor calls the executable program code stored in the memory to execute some or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
  • In a ninth aspect, an embodiment of the present invention further provides a labeling terminal, including:
  • a memory storing executable program code; and
  • a processor coupled with the memory;
  • wherein the processor calls the executable program code stored in the memory to execute some or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
  • In a tenth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the training method for a target detection model provided by any embodiment of the present invention.
  • In an eleventh aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
  • In a twelfth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
  • In a thirteenth aspect, an embodiment of the present invention further provides a computer program product that, when run on a computer, causes the computer to execute some or all of the steps of the training method for a target detection model provided by any embodiment of the present invention.
  • In a fourteenth aspect, an embodiment of the present invention further provides a computer program product that, when run on a computer, causes the computer to execute some or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
  • In a fifteenth aspect, an embodiment of the present invention further provides a computer program product that, when run on a computer, causes the computer to execute some or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
  • With the technical solution provided by the embodiments, sample data labeled with the target category and target position of a preset object are acquired and input into the initial detection model, yielding the predicted position of the preset object.
  • The target position is compared with the predicted position, the parameters of the initial detection model are adjusted according to the comparison result, and the detection model at the point where the value of the regression part of the loss function converges is taken as the target detection model.
  • The loss function of the target detection model includes a classification part and a regression part.
  • Unlike a conventional target detection model, the value of the regression part here is a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  • By adjusting the weights of the different terms of the loss function in this way, only a few terms in the loss result show some deviation while the others are close to 0, rather than every term deviating; as a result, during the labeling stage of continuous frame data the number of adjustments and the time the labeler spends on the auxiliary boxes are reduced, and labeling efficiency is improved.
  • The inventive points of the present invention include:
  • 1. The target detection model establishes the association between the object to be labeled and its category and position in each frame of data. The loss function used during training of the model is a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting. This reduces the number of adjustments and the time the labeler spends on the auxiliary boxes and improves labeling efficiency, which is one of the inventive points of the present invention.
  • 2. On the basis of the prior art, before continuous frame data are labeled at the labeling terminal, auxiliary labeling steps such as target detection on single-frame data and association of continuous frame data are added in the cloud. The pre-labeling result obtained after this auxiliary labeling serves as the basis for subsequent review by labeling personnel, who can adjust and correct it through the labeling terminal; this solves the problem of low manual labeling efficiency in the prior art and is one of the inventive points of the present invention.
  • 3. The embodiments of the present invention adopt a labeling mode in which the cloud and the labeling terminal cooperate with each other, which effectively improves labeling efficiency and reduces labeling cost, and is one of the inventive points of the present invention.
  • FIG. 1 is a schematic flowchart of a training method for a target detection model provided by an embodiment of the present invention;
  • FIG. 2 is a schematic flowchart of a method for labeling continuous frame data applied to the cloud provided by an embodiment of the present invention;
  • FIG. 3 is a schematic flowchart of a method for labeling continuous frame data applied to the labeling terminal provided by an embodiment of the present invention;
  • FIG. 4 is a schematic structural diagram of a training apparatus for a target detection model provided by an embodiment of the present invention;
  • FIG. 5 is a schematic structural diagram of an apparatus for labeling continuous frame data applied to the cloud provided by an embodiment of the present invention;
  • FIG. 6 is a schematic structural diagram of an apparatus for labeling continuous frame data applied to the labeling terminal provided by an embodiment of the present invention;
  • FIG. 7 is a schematic structural diagram of a device provided by an embodiment of the present invention.
  • Referring to FIG. 1, FIG. 1 is a schematic flowchart of a training method for a target detection model provided by an embodiment of the present invention.
  • The target detection model is mainly applied in the cloud for the auxiliary labeling of continuous frame data.
  • The method can be executed by a training apparatus for the target detection model, and the apparatus can be implemented by software and/or hardware, which is not limited in the embodiments of the present invention.
  • The method provided in this embodiment specifically includes the following steps. First, sample data labeled with the target category and target position of a preset object are acquired.
  • The sample data are sample images used to train the target detection model.
  • The training in the embodiments of the present application is supervised training, so all sample data used must carry corresponding labels; that is, every preset object in the sample data needs a corresponding target category and target position label.
  • Next, the sample data are input into the initial detection model to obtain the predicted position of the preset object. The initial detection model may be a deep neural network model, for example PointRCNN (Regions with Convolution Neural Network, a region-based convolutional neural network for raw point clouds).
  • For example, the position of the object to be labeled can be delimited by a cuboid auxiliary box.
  • The specific position of the cuboid can be described by the coordinates (x, y, z) of its center, its length, width, and height (w, h, d), and its orientation angle θ; that is, the position regressed by the target detection model consists of the seven variables x, y, z, w, h, d, and θ. These variables can be represented in the form of an auxiliary box.
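  • As an illustration of this parameterization, the seven regressed quantities can be grouped in a small container; this is a sketch, and the type and field names are not from the patent:

```python
from dataclasses import dataclass

@dataclass
class AuxiliaryBox:
    """Hypothetical holder for the seven regressed cuboid parameters."""
    x: float      # center coordinate
    y: float
    z: float
    w: float      # length
    h: float      # width
    d: float      # height
    theta: float  # orientation (heading) angle
```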
  • Then, the target position is compared with the predicted position, the parameters of the initial detection model are adjusted according to the comparison result, and the detection model at the point where the value of the regression part of the loss function converges is taken as the target detection model. The target detection model to be trained in this embodiment mainly recognizes the category and position of a preset object: whether the preset object's category is one that the labeling task requires can be determined by classification, and the position of the preset object can be determined by regression.
  • Correspondingly, the loss function used while training the target detection model generally also includes a classification part and a regression part. The value of the regression part used here is: a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the normalized error is the absolute value of the difference between the predicted position and the target position, normalized with respect to the target position. The weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting. The reasons for this design are as follows.
  • In the prior art, the regression part of a target detection model generally uses losses of the form L1, L2, or Smooth L1 on the differences between the predicted and ground-truth values of physical quantities such as position (x, y, z), size (w, h, d), and orientation angle (θ), as well as losses of the form IoU (Intersection over Union), GIoU (Generalized Intersection over Union), or DIoU between the predicted box and the ground-truth box; all of these losses can drive the model's predictions as close as possible to the ground truth.
  • However, the loss functions currently used generally only consider the positional accuracy of the predicted box relative to the ground-truth box, and do not consider the specific requirement during labeling, namely reducing as much as possible the number of times the labeler has to modify the auxiliary boxes.
  • By contrast, the loss function used while training the target detection model provided in this embodiment adjusts the weights of its different terms so that only a few terms in the loss result show some deviation while the others are close to zero, rather than every term deviating. This design reduces the number of adjustments and the time the labeler spends on the auxiliary boxes and improves labeling efficiency.
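  • A minimal sketch of such a rank-weighted loss, assuming an ascending sort, ranks starting at 1, and 0 < w < 1 (none of which the text fixes explicitly), might look as follows:

```python
import torch

def sorted_weighted_regression_loss(pred, target, w=0.5, eps=1e-6):
    """Rank-weighted regression loss sketched from the description above.

    pred, target: tensors of shape (..., 7) holding (x, y, z, w, h, d, theta).
    Each component's error is normalized by the target value, the errors are
    sorted, and the term of rank k is weighted by w**k. With an ascending
    sort and w < 1, most components are driven toward zero error while a few
    larger deviations are tolerated, which is the behavior the text describes.
    """
    # Normalized error: |pred - target| / |target| (eps avoids division by 0)
    norm_err = (pred - target).abs() / (target.abs() + eps)
    # Sort the errors along the component axis (ascending; an assumption)
    sorted_err, _ = torch.sort(norm_err, dim=-1)
    # Weight the term of rank k by w**k, k = 1 .. n
    k = torch.arange(1, sorted_err.shape[-1] + 1, dtype=sorted_err.dtype)
    return (sorted_err * w ** k).sum(dim=-1).mean()
```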
  • With the technical solution provided by this embodiment, sample data labeled with the target category and target position of a preset object are acquired and input into the initial detection model, yielding the predicted position of the preset object.
  • The target position is compared with the predicted position, the parameters of the initial detection model are adjusted according to the comparison result, and the detection model at the point where the value of the regression part of the loss function converges is taken as the target detection model.
  • The loss function of the target detection model includes a classification part and a regression part.
  • Unlike a conventional target detection model, the value of the regression part in this implementation is a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  • By adjusting the weights of the different terms of the loss function, only a few terms in the loss result show some deviation while the others are close to 0, rather than every term deviating; the labeling stage of continuous frame data therefore requires fewer adjustments and less time from the labeler on the auxiliary boxes, improving labeling efficiency.
  • Referring to FIG. 2, FIG. 2 is a schematic flowchart of a method for labeling continuous frame data applied to the cloud provided by an embodiment of the present invention. This embodiment is an optimization on the basis of the above embodiment. As shown in FIG. 2, the method includes the following steps. First, a labeling task is acquired and continuous frame data are read; the labeling task includes the category and position of the object to be labeled.
  • The labeling task is the prior information of the labeling process, including the objects to be labeled (such as vehicles, pedestrians, etc.), the categories of the objects to be labeled (such as tricycles, buses, or cars), preset sizes, and the output file format of the labeling file.
  • The labeling task can be set by labeling personnel modifying the parameters of the cloud model according to actual needs, or it can be sent from the labeling terminal to the cloud by labeling personnel. Since the cloud is not limited by computing resources, the cloud's deep learning algorithms can be used to pre-label continuous frame data, reducing the workload of subsequent manual labeling and improving work efficiency.
  • In this embodiment, the continuous frame data are a sequence of data items of the same type, ordered in time at equal intervals, and may be images or 3D lidar point clouds.
  • For 3D lidar point clouds in particular, labeling with existing labeling techniques is slow and costly.
  • The labeling system provided in this embodiment can serve as an auxiliary labeling step for 3D lidar point clouds: since the cloud is not limited by computing resources, pre-labeling in the cloud reduces the workload of manual labelers, reduces labeling costs, and improves labeling efficiency.
  • Second, target detection is performed on each frame of the read continuous frame data based on a preset target detection model and according to the labeling task, and the category and position of the object to be labeled obtained in each frame are taken as the detection result. The cloud performs this detection using the preset target detection model, which establishes the association between the object to be labeled and its category and position in each frame of data; through this model, the category and position of the object to be labeled can be obtained.
  • For the training process of the preset target detection model, refer to the content of the foregoing embodiment, which is not repeated here.
  • The preset target detection model can be PointRCNN (Regions with Convolution Neural Network, a region-based convolutional neural network for raw point clouds), or the output results of multiple models can be fused; this embodiment does not specifically limit this.
  • In this embodiment, the position of the object to be labeled can be delimited by a cuboid auxiliary box. The specific position of the cuboid can be described by the coordinates (x, y, z) of its center, its length, width, and height (w, h, d), and its orientation angle θ; that is, the position of the object to be labeled regressed by the preset target detection model consists of the seven variables x, y, z, w, h, d, and θ. These variables can be represented in the form of an auxiliary box.
  • Third, according to the detection result and the timing information between the frames, an association relationship is established between instances of the same object to be labeled across the frames. After the cloud obtains the category and position of the object to be labeled from the preset target detection model, it can establish this association from the detection result and the timing information; the same object to be labeled can be represented by the same number in every frame. Establishing the association mainly amounts to tracking the same object to be labeled.
  • For example, if vehicle 1 appears in the current frame, it is necessary to determine whether vehicle 1 can still be detected in the next frame; if it can, the connection between vehicle 1 in the current frame and vehicle 1 in the next frame can be established according to the timing information.
  • The specific association can be carried out with a machine learning method, such as the Kalman filter algorithm.
  • In addition, based on the timing information, the same object to be labeled should keep the same length, width, and height, and its position and orientation should change continuously; a machine learning method such as the Kalman filter algorithm can therefore be used to verify and correct the single-frame results.
  • For example, objects to be labeled that were missed in some frames of the continuous data can be filled in, and spurious detections in single-frame results can likewise be deleted; in this way the objects to be labeled are tracked across the continuous frame data.
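  • The text names the Kalman filter but gives no formulas; as a simplified stand-in, a greedy nearest-center matcher illustrates the cross-frame association (all names and the distance threshold are hypothetical):

```python
import numpy as np

def associate_frames(prev_objects, curr_detections, max_dist=2.0):
    """Greedily match tracked objects to detections in the next frame.

    prev_objects: dict {track_id: (x, y, z)} of centers from the previous
    frame. curr_detections: list of (x, y, z) centers detected in the
    current frame. Returns {track_id: detection_index}; unmatched tracks
    (e.g. occluded objects) are simply dropped in this sketch.
    """
    assignments, used = {}, set()
    for tid, center in prev_objects.items():
        # Distance to every unused detection; used ones are masked out
        dists = [np.linalg.norm(np.asarray(det) - np.asarray(center))
                 if i not in used else np.inf
                 for i, det in enumerate(curr_detections)]
        if dists and min(dists) <= max_dist:
            j = int(np.argmin(dists))
            assignments[tid] = j
            used.add(j)
    return assignments
```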
  • After the association relationship is determined, it can serve as the pre-labeling result of the continuous frame data.
  • The cloud then generates an extensible pre-labeling file from the pre-labeling result according to the output file format in the labeling task, and sends the pre-labeling file and the continuous frame data to the labeling terminal so that labeling personnel can make corrections there.
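  • The patent does not pin down the output file format; purely as a hedged illustration, an extensible pre-labeling file could be emitted as JSON along these lines (every field name here is an assumption):

```python
import json

def write_prelabel_file(path, frames):
    """Write tracked detections as a JSON pre-labeling file (format assumed).

    frames: a list with one entry per frame, each a list of dicts such as
        {"track_id": 3, "category": "car", "box": [x, y, z, w, h, d, theta]}
    Reusing the same track_id across frames encodes the association
    relationship; the top-level dict leaves room for later extension.
    """
    payload = {"version": 1, "frames": frames}
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f, ensure_ascii=False, indent=2)
```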
  • After receiving the continuous frame data and the corresponding pre-labeling file sent by the cloud, the labeling terminal can correct the labeling file according to correction instructions and take the corrected labeling result as the target labeling result of the continuous frame data.
  • For example, the labeling terminal can add a function button for correcting the pre-labeling file; when the button is triggered, the pre-labeling file can be corrected.
  • For instance, the vehicle orientation detected by the cloud's preset target detection model may not be accurate, so a one-click function that flips the orientation by 180° can be added at the labeling terminal, making it convenient for labeling personnel to check and modify.
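  • Such a one-click correction amounts to adding 180° to the heading and wrapping the result back into the usual range; a sketch, not the patent's implementation:

```python
import math

def flip_heading(theta):
    """Rotate a heading angle by 180 degrees and wrap it into (-pi, pi]."""
    theta += math.pi
    while theta > math.pi:   # wrap back into (-pi, pi]
        theta -= 2.0 * math.pi
    return theta
```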
  • With the technical solution provided by this embodiment, target detection is performed on single-frame data and the detection results are associated according to the timing information between the frames, so the pre-labeling result of the continuous frame data can be obtained.
  • Subsequent manual labelers only need to check for omissions on the basis of the pre-labeling result through the labeling terminal. Because the cloud's preset target detection model is trained by adjusting the weights of different terms of the loss function so that only a few terms in the loss result show some deviation while the others are close to 0, rather than every term deviating, the labeler needs fewer adjustments and less time when modifying the auxiliary boxes.
  • In short, by adopting a labeling mode in which the cloud and the labeling terminal cooperate, the technical solution provided by this embodiment can effectively reduce the workload of manual labelers, reduce labeling costs, and improve labeling speed and accuracy.
  • Referring to FIG. 3, FIG. 3 is a schematic flowchart of a method for labeling continuous frame data applied to the labeling terminal provided by an embodiment of the present invention.
  • The method can be executed by an apparatus for labeling continuous frame data, which can be implemented by software and/or hardware and can generally be integrated into a labeling terminal.
  • As shown in FIG. 3, the method provided in this embodiment specifically includes: acquiring the pre-labeling result of continuous frame data sent by the cloud; and, if a correction instruction for the pre-labeling result is received, correcting the labeling result according to the correction instruction and taking the corrected labeling result as the target labeling result of the continuous frame data.
  • In this embodiment, auxiliary function buttons can be added at the labeling terminal, such as a one-click 180° rotation of a vehicle's orientation, to make manual labeling convenient.
  • The pre-labeling result is the association relationship between instances of the same object to be labeled across frames, established by the cloud after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame based on the preset target detection model and according to the labeling task, together with the timing information between the frames. The detection result includes the category and position of the object to be labeled, and the preset target detection model is generated by the training method for a target detection model provided in Embodiment 1 of the present invention.
  • The regression part of the loss function used while training the preset target detection model is: a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  • This design ensures that only a few terms in the loss result show some deviation while the others are close to 0, rather than every term deviating, so that during manual labeling the labeler needs fewer adjustments and less time on the auxiliary boxes, improving labeling efficiency.
  • In this embodiment, the pre-labeling file sent from the cloud serves as the basis for correction at the labeling terminal; on this basis, the labeler can further check the pre-labeling file for omissions. Adopting a labeling mode in which the cloud's pre-labeling and the labeling terminal cooperate effectively improves labeling efficiency and reduces labeling costs.
  • Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a training apparatus for a target detection model provided by an embodiment of the present invention.
  • As shown in FIG. 4, the apparatus includes a sample data acquisition module 410, a predicted position determination module 420, and a target detection model determination module 430, wherein:
  • the sample data acquisition module 410 is configured to acquire sample data labeled with the target category and target position of a preset object to be labeled;
  • the predicted position determination module 420 is configured to input the sample data into the initial detection model to obtain the predicted position of the preset object;
  • the target detection model determination module 430 is configured to compare the target position with the predicted position, adjust the parameters of the initial detection model according to the comparison result, and take the detection model at the point where the value of the regression part of the loss function converges as the target detection model.
  • The loss function of the target detection model includes a classification part and a regression part.
  • The value of the regression part is a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error.
  • The weight of each term is w raised to the k-th power, where w is a hyperparameter and k is the rank of the term after the normalized errors are sorted.
  • The normalized error is obtained by taking the absolute value of the difference between the predicted position and the target position and normalizing it with respect to the target position.
  • The training apparatus for a target detection model provided by the embodiment of the present invention can execute the training method for a target detection model provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.
  • For technical details not described exhaustively in the above embodiment, refer to the training method for a target detection model provided by any embodiment of the present invention.
  • Referring to FIG. 5, FIG. 5 is a schematic structural diagram of an apparatus for labeling continuous frame data applied to the cloud provided by an embodiment of the present invention. As shown in FIG. 5, the apparatus includes a continuous frame data acquisition module 510, a detection result determination module 520, and an association relationship establishment module 530, wherein:
  • the continuous frame data acquisition module 510 is configured to acquire a labeling task and read continuous frame data, where the labeling task includes the category and position of the object to be labeled;
  • the detection result determination module 520 is configured to perform target detection on each frame of the read continuous frame data based on a preset target detection model and according to the labeling task, and to take the category and position of the object to be labeled obtained in each frame as the detection result;
  • the association relationship establishment module 530 is configured to establish, according to the detection result and the timing information between the frames, an association relationship between instances of the same object to be labeled across the frames, where the association relationship serves as the pre-labeling result of the continuous frame data and is used for correction at the labeling terminal.
  • The preset target detection model establishes the association between the object to be labeled and its category and position in each frame of data.
  • During training of the preset target detection model, the value of the regression part of its loss function is: a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  • Optionally, the apparatus further includes:
  • a correction module, configured to correct the detection result based on a machine learning method so that the same object to be labeled has the same size, where the machine learning method includes a Kalman filter algorithm.
  • Optionally, the labeling task further includes an output file format.
  • Correspondingly, the apparatus further includes:
  • a file generation module, configured to generate an extensible pre-labeling file from the pre-labeling result according to the output file format, and to send the pre-labeling file and the continuous frame data to the labeling terminal.
  • The apparatus for labeling continuous frame data provided by the embodiment of the present invention can execute the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.
  • For technical details not described exhaustively in the above embodiment, refer to the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
  • Referring to FIG. 6, FIG. 6 is a schematic structural diagram of an apparatus for labeling continuous frame data applied to the labeling terminal provided by an embodiment of the present invention. As shown in FIG. 6, the apparatus includes a pre-labeling result acquisition module 610 and a correction module 620, wherein:
  • the pre-labeling result acquisition module 610 is configured to acquire the pre-labeling result of continuous frame data sent by the cloud;
  • the correction module 620 is configured to, if a correction instruction for the pre-labeling result is received, correct the labeling result according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data.
  • The pre-labeling result is the association relationship between instances of the same object to be labeled across frames, established by the cloud after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame based on the preset target detection model and according to the labeling task, together with the timing information between the frames. The detection result includes the category and position of the object to be labeled, and the preset target detection model is generated by the training method for a target detection model provided by any embodiment of the present invention.
  • The apparatus for labeling continuous frame data provided by the embodiment of the present invention can execute the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.
  • For technical details not described exhaustively in the above embodiment, refer to the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
  • Referring to FIG. 7, FIG. 7 is a schematic structural diagram of a device provided by an embodiment of the present invention.
  • As shown in FIG. 7, the device may include:
  • a memory 701 storing executable program code; and
  • a processor 702 coupled to the memory 701;
  • wherein the processor 702 calls the executable program code stored in the memory 701 to execute the training method for a target detection model provided by any embodiment of the present invention.
  • The embodiment of the present invention further provides another cloud server, including a memory storing executable program code and a processor coupled with the memory, wherein the processor calls the executable program code stored in the memory to execute the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
  • The embodiment of the present invention further provides another labeling terminal, including a memory storing executable program code and a processor coupled with the memory, wherein the processor calls the executable program code stored in the memory to execute the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
  • The embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the training method for a target detection model provided by any embodiment of the present invention.
  • The embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
  • The embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
  • The embodiment of the present invention further provides a computer program product that, when run on a computer, causes the computer to execute some or all of the steps of the training method for a target detection model provided by any embodiment of the present invention.
  • The embodiment of the present invention further provides a computer program product that, when run on a computer, causes the computer to execute some or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
  • The embodiment of the present invention further provides a computer program product that, when run on a computer, causes the computer to execute some or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
  • In the embodiments provided by the present invention, it should be understood that "B corresponding to A" means that B is associated with A and that B can be determined according to A.
  • However, it should also be understood that determining B according to A does not mean that B is determined only according to A; B may also be determined according to A and/or other information.
  • the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-accessible memory. Based on this understanding, the essence of the technical solution of the present invention, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a memory and includes several requests to cause a computer device (which may be a personal computer, a server, or a network device, etc., and specifically a processor in the computer device) to execute some or all of the steps of the methods in the embodiments of the present invention.
  • Those of ordinary skill in the art can understand that all or part of the steps in the various methods of the above embodiments can be completed by a program instructing the relevant hardware; the program can be stored in a computer-readable storage medium.
  • The storage medium includes read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), programmable read-only memory (Programmable Read-only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), one-time programmable read-only memory (One-time Programmable Read-Only Memory, OTPROM), electrically erasable programmable read-only memory (Electrically-Erasable Programmable Read-Only Memory, EEPROM), compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM) or other optical disk storage, magnetic disk storage, tape storage, or any other computer-readable medium that can be used to carry or store data.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present invention disclose a training method for a target detection model, and a data labeling method and apparatus. The method includes: acquiring sample data labeled with the target category and target position of a preset object; inputting the sample data into an initial detection model to obtain a predicted position of the preset object; and comparing the target position with the predicted position, adjusting the parameters of the initial detection model according to the comparison result, and taking the detection model at the point where the value of the regression part of the loss function converges as the target detection model. The loss function of the target detection model includes a classification part and a regression part; the value of the regression part is a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting. This solution effectively reduces the time labelers spend modifying auxiliary boxes and improves the labeling efficiency of continuous frame data.

Description

Training method for a target detection model, and data labeling method and apparatus

Technical Field

The present invention relates to the technical field of autonomous driving, and in particular to a training method for a target detection model and a method and apparatus for labeling data.

Background

In the field of autonomous driving, the perception module takes data from multiple sensors and high-precision map information as input and, after a series of computations and processing, accurately perceives the environment around the autonomous vehicle. Mainstream autonomous-driving perception algorithms currently use deep learning, and the training of deep-learning target detection models still relies on large-scale manually annotated data, so obtaining more annotated data at lower cost is an urgent problem to be solved.

At present, the loss function of a deep-learning target detection model generally includes a classification part and a regression part. The regression part generally uses losses in the form of L1, L2, or Smooth L1 on the differences between the predicted and ground-truth values of physical quantities such as position, size, and orientation angle, as well as losses in the form of IoU (Intersection over Union), GIoU, or DIoU between the predicted box and the ground-truth box; all of these loss functions can drive the model's predictions as close as possible to the ground truth. However, the loss functions currently used only consider the positional accuracy of the predicted box relative to the ground-truth box, without considering the specific requirement of assisted labeling, namely reducing as much as possible the number of times the labeler has to modify the auxiliary boxes.
Summary of the Invention

The embodiments of the present invention disclose a training method for a target detection model and a method and apparatus for labeling data, which effectively reduce the time labelers spend modifying auxiliary boxes, improve the labeling efficiency of continuous frame data, and reduce labeling costs.

In a first aspect, an embodiment of the present invention discloses a training method for a target detection model, the method including:

acquiring sample data labeled with the target category and target position of a preset object;

inputting the sample data into an initial detection model to obtain a predicted position of the preset object;

comparing the target position with the predicted position, adjusting the parameters of the initial detection model according to the comparison result, and taking the detection model at the point where the value of the regression part of the loss function converges as the target detection model;

wherein the loss function of the target detection model includes a classification part and a regression part, and the value of the regression part is a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting.

Optionally, the normalized error is obtained by taking the absolute value of the difference between the predicted position and the target position and normalizing it with respect to the target position.
In a second aspect, an embodiment of the present invention further provides a method for labeling continuous frame data, applied to the cloud, the method including:

acquiring a labeling task and reading continuous frame data, the labeling task including the category and position of the object to be labeled;

performing target detection on each frame of the read continuous frame data based on a preset target detection model and according to the labeling task, and taking the category and position of the object to be labeled obtained in each frame as the detection result;

establishing, according to the detection result and the timing information between the frames, an association relationship between instances of the same object to be labeled across the frames, where the association relationship serves as the pre-labeling result of the continuous frame data and is used for correction at the labeling terminal;

wherein the preset target detection model establishes the association between the object to be labeled and its category and position in each frame of data, and during training of the preset target detection model the value of the regression part of its loss function is: a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting.

Optionally, the method further includes:

correcting the detection result based on a machine learning method so that the same object to be labeled has the same size, where the machine learning method includes a Kalman filter algorithm.

Optionally, the labeling task further includes an output file format;

correspondingly, the method further includes:

generating an extensible pre-labeling file from the pre-labeling result according to the output file format, and sending the pre-labeling file and the continuous frame data to the labeling terminal.

Optionally, the continuous frame data are images or lidar point clouds.
In a third aspect, an embodiment of the present invention further provides a method for labeling continuous frame data, applied to the labeling terminal, the method including:

acquiring the pre-labeling result of continuous frame data sent by the cloud;

if a correction instruction for the pre-labeling result is received, correcting the labeling result according to the correction instruction, and taking the corrected labeling result as the target labeling result of the continuous frame data;

wherein the pre-labeling result is the association relationship between instances of the same object to be labeled across frames, established by the cloud after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame based on the preset target detection model and according to the labeling task, together with the timing information between the frames; the detection result includes the category and position of the object to be labeled, and the preset target detection model is generated by the training method for a target detection model according to claim 1.
In a fourth aspect, an embodiment of the present invention further discloses a training apparatus for a target detection model, the apparatus including:

a sample data acquisition module, configured to acquire sample data labeled with the target category and target position of a preset object to be labeled;

a predicted position determination module, configured to input the sample data into an initial detection model to obtain a predicted position of the preset object;

a target detection model determination module, configured to compare the target position with the predicted position, adjust the parameters of the initial detection model according to the comparison result, and take the detection model at the point where the value of the regression part of the loss function converges as the target detection model;

wherein the loss function of the target detection model includes a classification part and a regression part, and the value of the regression part is a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting.

Optionally, the normalized error is obtained by taking the absolute value of the difference between the predicted position and the target position and normalizing it with respect to the target position.
In a fifth aspect, an embodiment of the present invention further provides an apparatus for labeling continuous frame data, applied to the cloud, the apparatus including:

a continuous frame data acquisition module, configured to acquire a labeling task and read continuous frame data, the labeling task including the category and position of the object to be labeled;

a detection result determination module, configured to perform target detection on each frame of the read continuous frame data based on a preset target detection model and according to the labeling task, and to take the category and position of the object to be labeled obtained in each frame as the detection result;

an association relationship establishment module, configured to establish, according to the detection result and the timing information between the frames, an association relationship between instances of the same object to be labeled across the frames, where the association relationship serves as the pre-labeling result of the continuous frame data and is used for correction at the labeling terminal;

wherein the preset target detection model establishes the association between the object to be labeled and its category and position in each frame of data, and during training of the preset target detection model the value of the regression part of its loss function is: a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting.

Optionally, the apparatus further includes:

a correction module, configured to correct the detection result based on a machine learning method so that the same object to be labeled has the same size, where the machine learning method includes a Kalman filter algorithm.

Optionally, the labeling task further includes an output file format;

correspondingly, the apparatus further includes:

a file generation module, configured to generate an extensible pre-labeling file from the pre-labeling result according to the output file format, and to send the pre-labeling file and the continuous frame data to the labeling terminal.
In a sixth aspect, an embodiment of the present invention further provides an apparatus for labeling continuous frame data, applied to the labeling terminal, the apparatus including:

a pre-labeling result acquisition module, configured to acquire the pre-labeling result of continuous frame data sent by the cloud;

a correction module, configured to, if a correction instruction for the pre-labeling result is received, correct the labeling result according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data;

wherein the pre-labeling result is the association relationship between instances of the same object to be labeled across frames, established by the cloud after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame based on the preset target detection model and according to the labeling task, together with the timing information between the frames; the detection result includes the category and position of the object to be labeled, and the preset target detection model is generated by the training method for a target detection model provided by any embodiment of the present invention.
In a seventh aspect, an embodiment of the present invention further provides a device, including:

a memory storing executable program code; and

a processor coupled with the memory;

wherein the processor calls the executable program code stored in the memory to execute some or all of the steps of the training method for a target detection model provided by any embodiment of the present invention.

In an eighth aspect, an embodiment of the present invention further provides a cloud server, including:

a memory storing executable program code; and

a processor coupled with the memory;

wherein the processor calls the executable program code stored in the memory to execute some or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.

In a ninth aspect, an embodiment of the present invention further provides a labeling terminal, including:

a memory storing executable program code; and

a processor coupled with the memory;

wherein the processor calls the executable program code stored in the memory to execute some or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.

In a tenth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the training method for a target detection model provided by any embodiment of the present invention.

In an eleventh aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.

In a twelfth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.

In a thirteenth aspect, an embodiment of the present invention further provides a computer program product that, when run on a computer, causes the computer to execute some or all of the steps of the training method for a target detection model provided by any embodiment of the present invention.

In a fourteenth aspect, an embodiment of the present invention further provides a computer program product that, when run on a computer, causes the computer to execute some or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.

In a fifteenth aspect, an embodiment of the present invention further provides a computer program product that, when run on a computer, causes the computer to execute some or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
With the technical solution provided by this embodiment, sample data labeled with the target category and target position of a preset object are acquired and input into an initial detection model to obtain the predicted position of the preset object. The target position is compared with the predicted position, the parameters of the initial detection model are adjusted according to the comparison result, and the detection model at the point where the value of the regression part of the loss function converges is taken as the target detection model. The loss function of this target detection model includes a classification part and a regression part. Unlike a conventional target detection model, the value of the regression part in this implementation is a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting. With this design, the weights of the different terms of the loss function can be adjusted so that only a few terms in the loss result show some deviation while the others are close to 0, rather than every term deviating; during the labeling stage of continuous frame data this reduces the number of adjustments and the time the labeler spends on the auxiliary boxes and improves labeling efficiency.
The inventive points of the present invention include:

1. The target detection model establishes the association between the object to be labeled and its category and position in each frame of data. The loss function used during training of the model is a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting. This design reduces the number of adjustments and the time labeling personnel spend on the auxiliary boxes and improves labeling efficiency, and is one of the inventive points of the present invention.

2. On the basis of the prior art, before continuous frame data are labeled at the labeling terminal, the technical solution of the embodiments of the present invention adds auxiliary labeling steps in the cloud, such as target detection on single-frame data and association of continuous frame data. The pre-labeling result obtained after this auxiliary labeling in the cloud can serve as the basis for subsequent review by labeling personnel, who can adjust and correct it through the labeling terminal. This solves the problem of low manual labeling efficiency in the prior art and is one of the inventive points of the present invention.

3. Some auxiliary function buttons are added at the labeling terminal, through which labeling personnel can trigger correction instructions, making it convenient for them to adjust the pre-labeling file. The embodiments of the present invention adopt a labeling mode in which the cloud and the labeling terminal cooperate with each other, which effectively improves labeling efficiency and reduces labeling cost, and is one of the inventive points of the present invention.
Brief Description of the Drawings

To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.

FIG. 1 is a schematic flowchart of a training method for a target detection model provided by an embodiment of the present invention;

FIG. 2 is a schematic flowchart of a method for labeling continuous frame data applied to the cloud provided by an embodiment of the present invention;

FIG. 3 is a schematic flowchart of a method for labeling continuous frame data applied to the labeling terminal provided by an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a training apparatus for a target detection model provided by an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an apparatus for labeling continuous frame data applied to the cloud provided by an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of an apparatus for labeling continuous frame data applied to the labeling terminal provided by an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of a device provided by an embodiment of the present invention.
Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

It should be noted that the terms "include" and "have" and any variations thereof in the embodiments and drawings of the present invention are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device.
Embodiment 1

Referring to FIG. 1, FIG. 1 is a schematic flowchart of a training method for a target detection model provided by an embodiment of the present invention. The target detection model is mainly applied in the cloud for the auxiliary labeling of continuous frame data. The method can be executed by a training apparatus for the target detection model, and the apparatus can be implemented by software and/or hardware, which is not limited in the embodiments of the present invention. As shown in FIG. 1, the method provided in this embodiment specifically includes:

110. Acquire sample data labeled with the target category and target position of a preset object.

The sample data are sample images used to train the target detection model. The training in the embodiments of the present application is supervised training, so all sample data used must carry corresponding labels; that is, every preset object in the sample data needs a corresponding target category and target position label.

120. Input the sample data into an initial detection model to obtain the predicted position of the preset object.

The initial detection model may be a deep neural network model, for example PointRCNN (Regions with Convolution Neural Network, a region-based convolutional neural network for raw point clouds).

Exemplarily, the position of the object to be labeled can be delimited by a cuboid auxiliary box. The specific position of the cuboid can be described by the coordinates (x, y, z) of its center, its length, width, and height (w, h, d), and its orientation angle θ; that is, the position regressed by the target detection model consists of the seven variables x, y, z, w, h, d, and θ. These variables can be represented in the form of an auxiliary box.

130. Compare the target position with the predicted position, adjust the parameters of the initial detection model according to the comparison result, and take the detection model at the point where the value of the regression part of the loss function converges as the target detection model.

It should be noted that the target detection model to be trained in this embodiment mainly recognizes the category and position of a preset object. Whether the preset object's category is one of the objects that the labeling task requires can be determined by classification, and the position of the preset object can be determined by regression. Correspondingly, the loss function used while training the target detection model generally also includes a classification part and a regression part, where the value of the regression part is: a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the normalized error is the absolute value of the difference between the predicted position and the target position, normalized with respect to the target position; the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting. The reasons for this design are as follows.

In the prior art, the regression part of a target detection model generally uses losses in the form of L1, L2, or Smooth L1 on the differences between the predicted and ground-truth values of physical quantities such as position (x, y, z), size (w, h, d), and orientation angle (θ), as well as losses in the form of IoU (Intersection over Union), GIoU (Generalized Intersection over Union), or DIoU between the predicted box and the ground-truth box; all of these loss functions can drive the model's predictions as close as possible to the ground truth. However, the loss functions currently used generally only consider the positional accuracy of the predicted box relative to the ground-truth box, without considering the specific requirement during labeling, namely reducing as much as possible the number of times labeling personnel have to modify the auxiliary boxes. By contrast, the loss function used while training the target detection model provided in this embodiment can adjust the weights of its different terms so that only a few terms in the loss result show some deviation while the others are close to 0, rather than every term deviating. This design reduces the number of adjustments and the time the labeler spends on the auxiliary boxes and improves labeling efficiency.
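As a worked illustration, under the same assumptions as the sketches earlier in this document (ascending sort, ranks starting at 1, w = 0.5), suppose three components have normalized errors 0.40, 0.05, and 0.01:

```latex
% sorted ascending: (0.01, 0.05, 0.40)
\[
L_{\mathrm{reg}} = 0.5^{1}(0.01) + 0.5^{2}(0.05) + 0.5^{3}(0.40)
                 = 0.005 + 0.0125 + 0.05 = 0.0675
\]
% The largest deviation carries the smallest weight, so training pressure
% concentrates on driving the already-small errors to zero.
```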
With the technical solution provided by this embodiment, sample data labeled with the target category and target position of a preset object are acquired and input into the initial detection model, yielding the predicted position of the preset object. The target position is compared with the predicted position, the parameters of the initial detection model are adjusted according to the comparison result, and the detection model at the point where the value of the regression part of the loss function converges is taken as the target detection model. The loss function of this target detection model includes a classification part and a regression part. Unlike a conventional target detection model, the value of the regression part in this implementation is a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting. With this design, the weights of the different terms of the loss function can be adjusted so that only a few terms in the loss result show some deviation while the others are close to 0, rather than every term deviating, which during the labeling stage of continuous frame data reduces the number of adjustments and the time the labeler spends on the auxiliary boxes and improves labeling efficiency.
Embodiment 2

Referring to FIG. 2, FIG. 2 is a schematic flowchart of a method for labeling continuous frame data applied to the cloud provided by an embodiment of the present invention. This embodiment is an optimization on the basis of the above embodiment. As shown in FIG. 2, the method includes:

210. Acquire a labeling task and read continuous frame data, the labeling task including the category and position of the object to be labeled.

The labeling task is the prior information of the labeling process and includes the objects to be labeled (for example vehicles, pedestrians, etc.), the categories of the objects to be labeled (for example tricycles, buses, or cars), preset sizes, the output file format of the labeling file, and so on. The labeling task can be set by labeling personnel modifying the parameters of the cloud model according to actual needs, or it can be sent from the labeling terminal to the cloud by labeling personnel. Since the cloud is not limited by computing resources, the cloud's deep learning algorithms can be used to pre-label the continuous frame data, reducing the workload of subsequent manual labeling and improving work efficiency.

In this embodiment, the continuous frame data are a sequence of data items of the same type, ordered in time at equal intervals, and may be images or 3D lidar point clouds, for example. For 3D lidar point clouds in particular, labeling with existing labeling techniques is slow and costly. The labeling system provided in this embodiment can serve as an auxiliary labeling step for 3D lidar point clouds: since the cloud is not limited by computing resources, pre-labeling in the cloud reduces the labeling workload of manual labelers, reduces labeling costs, and improves labeling efficiency.

220. Based on a preset target detection model and according to the labeling task, perform target detection on each frame of the read continuous frame data, and take the category and position of the object to be labeled obtained in each frame as the detection result.

Exemplarily, the cloud performs target detection on each frame of the continuous frame data using a preset target detection model that establishes the association between the object to be labeled and its category and position in each frame of data. Through the preset target detection model, the category and position of the object to be labeled can be obtained.

Exemplarily, for the training process of the preset target detection model, refer to the content of the above embodiment, which is not repeated here. The preset target detection model can be PointRCNN (Regions with Convolution Neural Network, a region-based convolutional neural network for raw point clouds), or the output results of multiple models can be fused; this embodiment does not specifically limit this. In this embodiment, the position of the object to be labeled can be delimited by a cuboid auxiliary box, whose specific position can be described by the coordinates (x, y, z) of its center, its length, width, and height (w, h, d), and its orientation angle θ; that is, the position of the object to be labeled regressed by the preset target detection model consists of the seven variables x, y, z, w, h, d, and θ. These variables can be represented in the form of an auxiliary box.

230. According to the detection result and the timing information between the frames, establish an association relationship between instances of the same object to be labeled across the frames, where the association relationship serves as the pre-labeling result of the continuous frame data and is used for correction at the labeling terminal.

After the cloud obtains the category and position of the object to be labeled based on the preset target detection model, it can establish the association between instances of the same object to be labeled across the frames according to the detection result and the timing information between the frames. The same object to be labeled can be represented by the same number in every frame. Establishing this association mainly amounts to tracking the same object to be labeled. For example, if vehicle 1 appears in the current frame, it is necessary to determine whether vehicle 1 can still be detected in the next frame; if it can, the connection between vehicle 1 in the current frame and vehicle 1 in the next frame can be established according to the timing information. The specific association can be carried out with a machine learning method, for example the Kalman filter algorithm.

In addition, according to the timing information, the same object to be labeled should have the same length, width, and height, and its position and orientation should change continuously; a machine learning method such as the Kalman filter algorithm can therefore be used to verify and correct the single-frame results. For example, objects to be labeled that were missed in the continuous frame data can be filled in: if vehicle 2 exists in the frames before and after, but is not detected in some middle frame, this method indicates that vehicle 2 was missed in the single-frame detection. Likewise, this method can be used to delete spurious detections from single-frame results. With the above implementation, the objects to be labeled in the continuous frame data can be tracked.
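The patent names the Kalman filter for this verification and correction step but gives no formulas; as one hedged illustration of the size-consistency correction only, a median-based stand-in shows the intent (the function and its layout are assumptions):

```python
import numpy as np

def harmonize_sizes(track_boxes):
    """Give one tracked object a single, consistent size across frames.

    track_boxes: list of 7-parameter boxes [x, y, z, w, h, d, theta] for
    the same object in consecutive frames. Replaces each frame's (w, h, d)
    with the per-track median, a simple substitute for the Kalman-filter
    correction mentioned in the text.
    """
    boxes = np.asarray(track_boxes, dtype=float)
    boxes[:, 3:6] = np.median(boxes[:, 3:6], axis=0)
    return boxes
```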
In this embodiment, after the association relationship is determined, it can serve as the pre-labeling result of the continuous frame data; the cloud generates an extensible pre-labeling file from the pre-labeling result according to the output file format in the labeling task, and sends the pre-labeling file and the continuous frame data to the labeling terminal so that labeling personnel can make corrections there.

After receiving the continuous frame data and the corresponding pre-labeling file sent by the cloud, the labeling terminal can correct the labeling file according to correction instructions and take the corrected labeling result as the target labeling result of the continuous frame data.

Exemplarily, a function button for correcting the pre-labeling file can be added at the labeling terminal; when the button is triggered, the pre-labeling file can be corrected. For example, for vehicle detection, the vehicle orientation detected by the cloud's preset target detection model is not necessarily accurate, so a one-click function that changes the orientation by 180° can be added at the labeling terminal, making it convenient for labeling personnel to check and modify.

With the technical solution provided by this embodiment, target detection is performed on single-frame data and the detection results are associated according to the timing information between the frames, yielding the pre-labeling result of the continuous frame data. Subsequent manual labelers only need to check for omissions on the basis of the pre-labeling result through the labeling terminal. Because the cloud's preset target detection model is trained by adjusting the weights of different terms of the loss function so that only a few terms in the loss result show some deviation while the others are close to 0, rather than every term deviating, labeling personnel need fewer adjustments and less time when modifying the detection results of the preset target detection model at the labeling terminal, that is, the auxiliary boxes of the objects to be labeled; this improves labeling efficiency. In addition, the function buttons provided at the labeling terminal make the labelers' modifications convenient, which also improves the labeling efficiency of continuous frame data to a certain extent. In short, by adopting a labeling mode in which the cloud and the labeling terminal cooperate, the technical solution provided by this embodiment can effectively reduce the labeling workload of manual labelers, reduce labeling costs, and improve labeling speed and accuracy.
Embodiment 3

Referring to FIG. 3, FIG. 3 is a schematic flowchart of a method for labeling continuous frame data applied to the labeling terminal provided by an embodiment of the present invention. The method can be executed by an apparatus for labeling continuous frame data; the apparatus can be implemented by software and/or hardware and can generally be integrated into a labeling terminal. As shown in FIG. 3, the method provided in this embodiment specifically includes:

310. Acquire the pre-labeling result of continuous frame data sent by the cloud.

320. If a correction instruction for the pre-labeling result is received, correct the labeling result according to the correction instruction, and take the corrected labeling result as the target labeling result of the continuous frame data.

In this embodiment, some auxiliary function buttons can be added at the labeling terminal, such as a one-click 180° rotation of a vehicle's orientation, to make manual labeling convenient.

The pre-labeling result is the association relationship between instances of the same object to be labeled across frames, established by the cloud after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame based on the preset target detection model and according to the labeling task, together with the timing information between the frames. The detection result includes the category and position of the object to be labeled, and the preset target detection model is generated by the training method for a target detection model provided in Embodiment 1 of the present invention. The loss function of the regression part used while training the preset target detection model is: a weighted sum of the position components of the object to be labeled after they are sorted by the magnitude of the normalized error, where the weight of the normalized error is w raised to the k-th power, w is a hyperparameter, and k is the rank of the normalized error after sorting. This design ensures that only a few terms in the loss result show some deviation while the others are close to 0, rather than every term deviating, so that during manual labeling the labeler needs fewer adjustments and less time on the auxiliary boxes, improving labeling efficiency.

In this embodiment, the pre-labeling file sent by the cloud serves as the basis for correction at the labeling terminal; on this basis, labeling personnel can further check the pre-labeling file for omissions. Adopting a labeling mode in which the cloud's pre-labeling and the labeling terminal cooperate with each other can effectively improve labeling efficiency and reduce labeling costs.
Embodiment 4
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a training apparatus for a target detection model according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes a sample data obtaining module 410, a predicted position determining module 420 and a target detection model determining module 430, where:
the sample data obtaining module 410 is configured to obtain sample data labeled with the target category and target position of a preset object to be labeled;
the predicted position determining module 420 is configured to input the sample data into an initial detection model to obtain the predicted position of the preset object;
the target detection model determining module 430 is configured to compare the target position with the predicted position, adjust the parameters of the initial detection model according to the comparison result, and take the detection model at the point where the value of the regression part of the loss function converges as the target detection model;
where the loss function of the target detection model includes a classification part and a regression part, the value of the regression part being the weighted sum of the position errors of the object to be labeled after sorting them by the magnitude of the normalized error, where the weight of a normalized error is w raised to the power k, w is a hyperparameter, and k is the rank of the normalized error after sorting.
Optionally, the normalized error is obtained by taking the absolute value of the difference between the predicted position and the target position and normalizing it with respect to the target position.
The training apparatus for a target detection model provided by this embodiment of the present invention can execute the training method for a target detection model provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method. For technical details not exhaustively described above, reference may be made to the training method for a target detection model provided by any embodiment of the present invention.
Embodiment 5
Referring to FIG. 5, FIG. 5 is a schematic structural diagram of an apparatus for labeling continuous-frame data applied to the cloud according to an embodiment of the present invention. As shown in FIG. 5, the apparatus includes a continuous-frame data obtaining module 510, a detection result determining module 520 and an association establishing module 530, where:
the continuous-frame data obtaining module 510 is configured to obtain a labeling task and read continuous-frame data, the labeling task including the category and position of the objects to be labeled;
the detection result determining module 520 is configured to perform, based on a preset target detection model and according to the labeling task, target detection on each frame of the read continuous-frame data, and take the resulting category and position of the objects to be labeled in each frame as the detection result;
the association establishing module 530 is configured to establish, according to the detection result and the temporal information between frames, the association of the same object to be labeled across frames, where the association serves as the pre-labeling result of the continuous-frame data and is used for correction at the labeling end;
where the preset target detection model establishes the association between an object to be labeled and its category and position in each frame, and the value of the regression part of the loss function used in training the preset target detection model is the weighted sum of the position errors of the object to be labeled after sorting them by the magnitude of the normalized error, where the weight of a normalized error is w raised to the power k, w is a hyperparameter, and k is the rank of the normalized error after sorting.
Optionally, the apparatus further includes:
a correction module configured to correct the detection result based on a machine learning method so that the same object to be labeled has the same size, where the machine learning method includes the Kalman filtering algorithm.
Optionally, the labeling task further includes an output file format;
correspondingly, the apparatus further includes:
a file generating module configured to generate an extensible pre-annotation file from the pre-labeling result according to the output file format, and send the pre-annotation file and the continuous-frame data to the labeling end.
The apparatus for labeling continuous-frame data provided by this embodiment of the present invention can execute the method for labeling continuous-frame data applied to the cloud provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method. For technical details not exhaustively described above, reference may be made to the method for labeling continuous-frame data applied to the cloud provided by any embodiment of the present invention.
Embodiment 6
Referring to FIG. 6, FIG. 6 is a schematic structural diagram of an apparatus for labeling continuous-frame data applied to the labeling end according to an embodiment of the present invention. As shown in FIG. 6, the apparatus includes a pre-labeling result obtaining module 610 and a correction module 620, where:
the pre-labeling result obtaining module 610 is configured to obtain the pre-labeling result of continuous-frame data sent by the cloud;
the correction module 620 is configured to, if a correction instruction for the pre-labeling result is received, correct the labeling result according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous-frame data;
where the pre-labeling result is the association, established by the cloud after reading the continuous-frame data, of the same object to be labeled across frames, based on the detection result obtained by performing target detection on the objects in each frame according to the labeling task with a preset target detection model, together with the temporal information between frames; the detection result includes the category and position of the objects to be labeled, and the preset target detection model is generated according to the training method for a target detection model provided by any embodiment of the present invention.
The apparatus for labeling continuous-frame data provided by this embodiment of the present invention can execute the method for labeling continuous-frame data applied to the labeling end provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method. For technical details not exhaustively described above, reference may be made to the method for labeling continuous-frame data applied to the labeling end provided by any embodiment of the present invention.
Embodiment 7
Referring to FIG. 7, FIG. 7 is a schematic structural diagram of a device according to an embodiment of the present invention. As shown in FIG. 7, the device may include:
a memory 701 storing executable program code; and
a processor 702 coupled to the memory 701;
where the processor 702 calls the executable program code stored in the memory 701 to execute the training method for a target detection model provided by any embodiment of the present invention.
An embodiment of the present invention further provides a cloud server, including a memory storing executable program code and a processor coupled to the memory, where the processor calls the executable program code stored in the memory to execute the method for labeling continuous-frame data applied to the cloud provided by any embodiment of the present invention.
An embodiment of the present invention further provides a labeling terminal, including a memory storing executable program code and a processor coupled to the memory, where the processor calls the executable program code stored in the memory to execute the method for labeling continuous-frame data applied to the labeling end provided by any embodiment of the present invention.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the training method for a target detection model provided by any embodiment of the present invention.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the method for labeling continuous-frame data applied to the cloud provided by any embodiment of the present invention.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the method for labeling continuous-frame data applied to the labeling end provided by any embodiment of the present invention.
An embodiment of the present invention further provides a computer program product which, when run on a computer, causes the computer to execute some or all of the steps of the training method for a target detection model provided by any embodiment of the present invention.
An embodiment of the present invention further provides a computer program product which, when run on a computer, causes the computer to execute some or all of the steps of the method for labeling continuous-frame data applied to the cloud provided by any embodiment of the present invention.
An embodiment of the present invention further provides a computer program product which, when run on a computer, causes the computer to execute some or all of the steps of the method for labeling continuous-frame data applied to the labeling end provided by any embodiment of the present invention.
In the various embodiments of the present invention, it should be understood that the sequence numbers of the above processes do not imply a necessary order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
In the embodiments provided by the present invention, it should be understood that "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood, however, that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing unit, may exist physically as separate units, or two or more units may be integrated in one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-accessible memory. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several requests for causing a computer device (which may be a personal computer, a server or a network device, and specifically may be a processor in the computer device) to execute some or all of the steps of the above methods of the various embodiments of the present invention.
Those of ordinary skill in the art will understand that all or part of the steps in the various methods of the above embodiments may be completed by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and the storage medium includes a read-only memory (ROM), a random access memory (RAM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), a one-time programmable read-only memory (OTPROM), an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.
The training method for a target detection model and the method and apparatus for labeling data disclosed by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. At the same time, for those of ordinary skill in the art, there will be changes in the specific implementation and scope of application based on the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

  1. A training method for a target detection model, characterized by comprising:
    obtaining sample data labeled with the target category and target position of a preset object;
    inputting the sample data into an initial detection model to obtain the predicted position of the preset object;
    comparing the target position with the predicted position, adjusting the parameters of the initial detection model according to the comparison result, and taking the detection model at the point where the value of the regression part of the loss function converges as the target detection model;
    wherein the loss function of the target detection model comprises a classification part and a regression part, the value of the regression part being the weighted sum of the position errors of the object to be labeled after sorting them by the magnitude of the normalized error, wherein the weight of a normalized error is w raised to the power k, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  2. The method according to claim 1, characterized in that the normalized error is obtained by taking the absolute value of the difference between the predicted position and the target position and normalizing it with respect to the target position.
  3. A method for labeling continuous-frame data, applied to the cloud, characterized by comprising:
    obtaining a labeling task and reading continuous-frame data, the labeling task including the category and position of the objects to be labeled;
    performing, based on a preset target detection model and according to the labeling task, target detection on each frame of the read continuous-frame data, and taking the resulting category and position of the objects to be labeled in each frame as the detection result;
    establishing, according to the detection result and the temporal information between frames, the association of the same object to be labeled across frames, wherein the association serves as the pre-labeling result of the continuous-frame data and is used for correction at the labeling end;
    wherein the preset target detection model establishes the association between an object to be labeled and its category and position in each frame, and the value of the regression part of the loss function used in training the preset target detection model is the weighted sum of the position errors of the object to be labeled after sorting them by the magnitude of the normalized error, wherein the weight of a normalized error is w raised to the power k, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  4. The method according to claim 3, characterized in that the method further comprises:
    correcting the detection result based on a machine learning method so that the same object to be labeled has the same size, wherein the machine learning method includes the Kalman filtering algorithm.
  5. The method according to claim 3, characterized in that the labeling task further includes an output file format;
    correspondingly, the method further comprises:
    generating an extensible pre-annotation file from the pre-labeling result according to the output file format, and sending the pre-annotation file and the continuous-frame data to the labeling end.
  6. The method according to any one of claims 3 to 5, characterized in that the continuous-frame data is images or lidar point clouds.
  7. A method for labeling continuous-frame data, applied to the labeling end, characterized by comprising:
    obtaining the pre-labeling result of continuous-frame data sent by the cloud;
    if a correction instruction for the pre-labeling result is received, correcting the labeling result according to the correction instruction, and taking the corrected labeling result as the target labeling result of the continuous-frame data;
    wherein the pre-labeling result is the association, established by the cloud after reading the continuous-frame data, of the same object to be labeled across frames, based on the detection result obtained by performing target detection on the objects in each frame according to the labeling task with a preset target detection model, together with the temporal information between frames; wherein the detection result includes the category and position of the objects to be labeled, and the preset target detection model is generated according to the training method for a target detection model of claim 1.
  8. A training apparatus for a target detection model, characterized by comprising:
    a sample data obtaining module configured to obtain sample data labeled with the target category and target position of a preset object to be labeled;
    a predicted position determining module configured to input the sample data into an initial detection model to obtain the predicted position of the preset object;
    a target detection model determining module configured to compare the target position with the predicted position, adjust the parameters of the initial detection model according to the comparison result, and take the detection model at the point where the value of the regression part of the loss function converges as the target detection model;
    wherein the loss function of the target detection model comprises a classification part and a regression part, the value of the regression part being the weighted sum of the position errors of the object to be labeled after sorting them by the magnitude of the normalized error, wherein the weight of a normalized error is w raised to the power k, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  9. An apparatus for labeling continuous-frame data, applied to the cloud, characterized by comprising:
    a continuous-frame data obtaining module configured to obtain a labeling task and read continuous-frame data, the labeling task including the category and position of the objects to be labeled;
    a detection result determining module configured to perform, based on a preset target detection model and according to the labeling task, target detection on each frame of the read continuous-frame data, and take the resulting category and position of the objects to be labeled in each frame as the detection result;
    an association establishing module configured to establish, according to the detection result and the temporal information between frames, the association of the same object to be labeled across frames, wherein the association serves as the pre-labeling result of the continuous-frame data and is used for correction at the labeling end;
    wherein the preset target detection model establishes the association between an object to be labeled and its category and position in each frame, and the value of the regression part of the loss function used in training the preset target detection model is the weighted sum of the position errors of the object to be labeled after sorting them by the magnitude of the normalized error, wherein the weight of a normalized error is w raised to the power k, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  10. An apparatus for labeling continuous-frame data, applied to the labeling end, characterized by comprising:
    a pre-labeling result obtaining module configured to obtain the pre-labeling result of continuous-frame data sent by the cloud;
    a correction module configured to, if a correction instruction for the pre-labeling result is received, correct the labeling result according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous-frame data;
    wherein the pre-labeling result is the association, established by the cloud after reading the continuous-frame data, of the same object to be labeled across frames, based on the detection result obtained by performing target detection on the objects in each frame according to the labeling task with a preset target detection model, together with the temporal information between frames; wherein the detection result includes the category and position of the objects to be labeled, and the preset target detection model is generated according to the training method for a target detection model of claim 1.
PCT/CN2020/121370 2020-01-17 2020-10-16 Training method for target detection model, and data labeling method and apparatus WO2021143231A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
DE112020003158.6T DE112020003158T5 (de) 2020-01-17 2020-10-16 Trainingsverfahren für ein Zielerfassungsmodell, Verfahren und Vorrichtung zur Kennzeichnung der Daten

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010051741.8A CN113139559B (zh) 2020-01-17 2020-01-17 Training method for target detection model, and data labeling method and apparatus
CN202010051741.8 2020-01-17

Publications (1)

Publication Number Publication Date
WO2021143231A1 true WO2021143231A1 (zh) 2021-07-22

Family

ID=76808467

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/121370 WO2021143231A1 (zh) 2020-01-17 2020-10-16 Training method for target detection model, and data labeling method and apparatus

Country Status (3)

Country Link
CN (1) CN113139559B (zh)
DE (1) DE112020003158T5 (zh)
WO (1) WO2021143231A1 (zh)


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723616A (zh) 2021-11-30 Semi-automatic multi-sensor information labeling method, system and storage medium
CN115294505B (zh) 2023-06-20 Risk object detection and model training method and apparatus, and electronic device
CN116665025B (zh) 2023-11-14 Data closed-loop method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190130230A1 (en) * 2017-10-26 2019-05-02 Samsung Sds Co., Ltd. Machine learning-based object detection method and apparatus
CN109961107A (zh) 2019-07-02 Training method and apparatus for target detection model, electronic device, and storage medium
US20190354817A1 (en) 2019-11-21 Learning Data Augmentation Strategies for Object Detection
CN110598764A (zh) 2019-12-20 Training method and apparatus for target detection model, and electronic device
CN110633717A (zh) 2019-12-31 Training method and apparatus for target detection model

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107180220B (zh) 2023-10-31 Danger prediction method
CN107229904B (zh) 2020-11-24 Deep-learning-based target detection and recognition method
JP6550442B2 (ja) 2019-07-24 Tracking device and tracking program
CN109784190A (zh) 2019-05-21 Deep-learning-based method for detecting and extracting key targets in autonomous driving scenes

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113627568A (zh) 2021-11-09 Supplementary labeling method, apparatus and device, and readable storage medium
CN115329722A (zh) 2022-11-11 Automatic element processing system and method for ground-object labeling in remote sensing images
CN115329722B (zh) 2023-01-24 Automatic element processing system and method for ground-object labeling in remote sensing images
CN115687334A (zh) 2023-02-03 Data quality inspection method, apparatus, device and storage medium
CN116912603A (zh) 2023-10-20 Pre-labeling screening method and related apparatus, device and medium
CN116912603B (zh) 2023-12-15 Pre-labeling screening method and related apparatus, device and medium

Also Published As

Publication number Publication date
CN113139559B (zh) 2022-06-24
DE112020003158T5 (de) 2022-03-17
CN113139559A (zh) 2021-07-20

Similar Documents

Publication Publication Date Title
WO2021143231A1 (zh) Training method for target detection model, and data labeling method and apparatus
WO2021143230A1 (zh) Labeling system, method and apparatus for continuous-frame data
WO2022179261A1 (zh) 3D-matching-based object grasping method and apparatus, and computing device
US10878372B2 (en) Method, system and device for association of commodities and price tags
US20200167568A1 (en) Image processing method, device, and storage medium
US20230351618A1 (en) System and method for detecting moving target based on multi-frame point cloud
US11216919B2 (en) Image processing method, apparatus, and computer-readable recording medium
CN105678322A (zh) Sample labeling method and apparatus
CN110059637B (zh) Face alignment detection method and apparatus
JP2017146710A (ja) Transport plan generation device and transport plan generation method
WO2022142744A1 (zh) Loop closure detection method, apparatus and device, and computer-readable storage medium
CN108133235A (zh) Pedestrian detection method based on multi-scale neural network feature maps
CN110796066A (zh) Lane line group construction method and apparatus
CN115082523A (zh) Vision-based intelligent robot guidance system and method
US20220114813A1 (en) Detecting obstacle
CN113762049B (zh) Content recognition method and apparatus, storage medium and terminal device
US20230419509A1 (en) Production line monitoring method and monitoring system thereof
CN111985471A (zh) License plate locating method and apparatus, and storage medium
CN112633187A (zh) Robot automatic transport method and system based on image analysis, and storage medium
US20210114204A1 (en) Mobile robot device for correcting position by fusing image sensor and plurality of geomagnetic sensors, and control method
WO2022247628A1 (zh) Data labeling method and related product
CN115631374A (zh) Control operation method, training method for control detection model, apparatus and device
CN115319739A (zh) Vision-based method for grasping workpieces with a robotic arm
CN114428878A (zh) Trademark image retrieval method and system
CN110619354A (zh) Image recognition system and method for unmanned vending cabinet

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20913563

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 20913563

Country of ref document: EP

Kind code of ref document: A1