WO2021143230A1 - System, method and device for labeling continuous frame data - Google Patents

System, method and device for labeling continuous frame data

Info

Publication number
WO2021143230A1
Authority
WO
WIPO (PCT)
Prior art keywords
labeling
labeled
data
result
continuous frame
Prior art date
Application number
PCT/CN2020/121362
Other languages
English (en)
French (fr)
Inventor
马贤忠
胡皓瑜
江浩
董维山
范一磊
Original Assignee
初速度(苏州)科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 初速度(苏州)科技有限公司
Priority to DE112020003085.7T (publication DE112020003085T5)
Publication of WO2021143230A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G06F16/538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/587 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/94 Hardware or software architectures specially adapted for image or video understanding
    • G06V10/945 User interactive design; Environments; Toolboxes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations

Definitions

  • the present invention relates to the technical field of autonomous driving, and in particular to a system, method and device for labeling continuous frame data.
  • the perception module takes data from a variety of sensors, together with high-precision map information, as input and, after a series of calculations and processing, accurately perceives the environment around the autonomous vehicle.
  • autonomous driving perception algorithms currently rely mainly on deep learning, which requires large labeled data sets to train models. The ability to generate large amounts of labeled data faster and more efficiently is therefore key to autonomous driving perception.
  • the embodiment of the present invention discloses a continuous frame data labeling system, method and device, which greatly shortens the manual time for continuous frame data labeling, improves the labeling efficiency of continuous frame data, and reduces the labeling cost.
  • an embodiment of the present invention discloses a continuous frame data labeling system.
  • the system includes: a cloud and a labeling terminal; wherein,
  • the cloud is configured to: obtain a labeling task, where the labeling task includes the category and location of the objects to be labeled and an output file format;
  • the cloud reads continuous frame data and performs target detection on each frame of data in the continuous frame data according to the labeling task, taking the obtained category and position of the object to be labeled in each frame of data as the detection result;
  • the cloud establishes an association relationship between the same object to be labeled in each frame of data according to the detection result and the timing information between each frame of data, and the association relationship is the pre-labeling result of the continuous frame data;
  • the cloud generates an expandable pre-labeled file from the pre-labeling result according to the output file format, and sends the pre-labeled file and the continuous frame data to the labeling terminal;
  • the labeling terminal is configured to: receive the continuous frame data and the corresponding pre-labeled file sent by the cloud and, after receiving a correction instruction for the pre-labeled file, correct the labeled file according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data.
  • the embodiment of the present invention also provides a method for labeling continuous frame data, which is applied to the cloud, and the method includes:
  • obtaining a labeling task, where the labeling task includes the category and location of the object to be labeled;
  • reading continuous frame data and performing target detection on each frame of data in the continuous frame data according to the labeling task, taking the obtained category and position of the object to be labeled in each frame of data as the detection result;
  • establishing an association relationship between the same object to be labeled in each frame of data according to the detection result and the timing information between each frame of data, where the association relationship serves as the pre-labeling result of the continuous frame data and is used for correction at the labeling terminal according to a correction instruction; the labeling result corrected at the labeling terminal is the target labeling result of the continuous frame data.
  • optionally, the method further includes:
  • based on a machine learning method, the detection result is corrected so that the same object to be labeled has the same size, wherein the machine learning method includes a Kalman filter algorithm.
  • optionally, the labeling task also includes an output file format;
  • correspondingly, the method further includes: generating an expandable pre-labeled file from the pre-labeling result according to the output file format, and sending the pre-labeled file and the continuous frame data to the labeling terminal for annotators to make corrections there.
  • the continuous frame data is a picture or a 3D lidar point cloud.
  • the embodiment of the present invention also discloses a device for labeling continuous frame data, which is applied to the cloud, and the device includes:
  • a labeling task acquisition module configured to acquire a labeling task, where the labeling task includes the category and position of the object to be labelled;
  • the target detection module is configured to read continuous frame data and perform target detection on each frame of data in the continuous frame data according to the labeling task, taking the obtained category and position of the object to be labeled in each frame of data as the detection result;
  • the association module is configured to establish an association relationship between the same object to be labeled in each frame of data according to the detection result and the timing information between each frame of data, wherein the association relationship serves as the pre-labeling result of the continuous frame data and is used for correction at the labeling terminal according to a correction instruction; the labeling result corrected at the labeling terminal is the target labeling result of the continuous frame data.
  • the device further includes:
  • the correction module is configured to correct the detection result based on a machine learning method so that the same object to be labeled has the same size, wherein the machine learning method includes a Kalman filter algorithm.
  • the labeling task also includes an output file format
  • the device further includes:
  • a file generating module configured to generate an expandable pre-labeled file from the pre-labeling result according to the output file format, and to send the pre-labeled file and the continuous frame data to the labeling terminal for annotators to make corrections there.
  • the continuous frame data is a picture or a 3D lidar point cloud.
  • the embodiment of the present invention also discloses a method for labeling continuous frame data, which is applied to the labeling terminal, and the method includes:
  • obtaining the pre-labeling result of continuous frame data sent by the cloud; if a correction instruction for the pre-labeling result is received, correcting the labeling result according to the correction instruction and taking the corrected labeling result as the target labeling result of the continuous frame data;
  • the pre-labeling result is the association relationship between the same object to be labeled in each frame of data that the cloud establishes, after reading the continuous frame data, from the detection result obtained by performing target detection on the objects to be labeled in each frame according to the labeling task and from the timing information between each frame of data; the detection result includes the category and location of the object to be labeled;
  • the labeling task includes the category and location of the object to be labeled.
  • an embodiment of the present invention also provides a device for labeling continuous frame data, which is applied to the labeling terminal, and the device includes:
  • a pre-labeling result acquisition module, configured to obtain the pre-labeling result of continuous frame data sent by the cloud;
  • a target labeling result generation module, configured to, if a correction instruction for the pre-labeling result is received, correct the labeling result according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data;
  • the pre-labeling result is the association relationship between the same object to be labeled in each frame of data that the cloud establishes, after reading the continuous frame data, from the detection result obtained by performing target detection on the objects to be labeled in each frame according to the labeling task and from the timing information between each frame of data; the detection result includes the category and location of the object to be labeled;
  • the labeling task includes the category and location of the object to be labeled.
  • an embodiment of the present invention also provides a cloud server, including:
  • a memory storing executable program codes
  • a processor coupled with the memory
  • the processor calls the executable program code stored in the memory to execute part or all of the steps of the continuous frame data labeling method applied to the cloud provided by any embodiment of the present invention.
  • an embodiment of the present invention also provides a labeling terminal, including:
  • a memory storing executable program codes
  • a processor coupled with the memory
  • the processor calls the executable program code stored in the memory to execute part or all of the steps of the continuous frame data labeling method applied to the labeling terminal provided by any embodiment of the present invention.
  • an embodiment of the present invention also provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing part or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
  • an embodiment of the present invention also provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing part or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
  • the embodiments of the present invention also provide a computer program product which, when run on a computer, causes the computer to execute part or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
  • the embodiments of the present invention also provide a computer program product which, when run on a computer, causes the computer to execute part or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
  • in the technical solution provided by this embodiment, target detection is performed on single frames of data and the detection results are associated according to the timing information between each frame of data, yielding the pre-labeling result of the continuous frame data. Subsequent human annotators only need to check for omissions on the basis of the pre-labeling result through the labeling terminal. In addition, function buttons provided on the labeling terminal make annotators' modifications convenient, which also improves the labeling efficiency of continuous frame data to a certain extent. In summary, by adopting a labeling mode in which the cloud and the labeling terminal cooperate, the technical solution provided by this embodiment can effectively reduce the workload of human annotators, lower labeling costs, and improve labeling speed and accuracy.
  • the invention points of the present invention include:
  • on the basis of the prior art, before continuous frame data is labeled at the labeling terminal, the technical solution of the embodiment of the present invention adds auxiliary labeling steps in the cloud, such as target detection on single-frame data and association of continuous frame data.
  • the pre-labeling results obtained after the cloud's auxiliary labeling can serve as the basis for subsequent review by annotators.
  • annotators can make adjustments and corrections through the labeling terminal, which solves the problem of low manual labeling efficiency in the prior art.
  • the embodiment of the present invention adopts a labeling mode in which the cloud and the labeling terminal cooperate with each other, which effectively improves labeling efficiency and reduces labeling costs; this is one of the invention points of the present invention.
  • the cloud uses a preset target detection model when performing target detection on single-frame data.
  • the preset target detection model establishes an association relationship between the object to be labeled and its category and position in each frame of data.
  • the loss function used when training the model is a weighted sum of the positions of the objects to be labeled sorted by the magnitude of the normalized error, where the weight of a normalized error is w raised to the power k, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  • FIG. 1 is a schematic structural diagram of a continuous frame data labeling system provided by an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of a method for labeling continuous frame data applied to the cloud according to an embodiment of the present invention
  • FIG. 3 is a schematic flowchart of a method for labeling continuous frame data applied to the labeling terminal according to an embodiment of the present invention
  • FIG. 4 is a schematic structural diagram of an apparatus for labeling continuous frame data applied to the cloud according to an embodiment of the present invention
  • FIG. 5 is a schematic structural diagram of a continuous frame data labeling device applied to the labeling terminal according to an embodiment of the present invention
  • Fig. 6 is a schematic structural diagram of a cloud server provided by an embodiment of the present invention.
  • FIG. 1 is a schematic structural diagram of a continuous frame data labeling system provided by an embodiment of the present invention.
  • the system can be applied to autonomous driving; through it, large amounts of labeled data can be generated faster and more efficiently for model training.
  • the continuous frame data labeling system provided by this embodiment specifically includes: a cloud 110 and a labeling terminal 120; wherein,
  • the cloud 110 is configured to: obtain a labeling task, which includes the category and location of the objects to be labeled and an output file format;
  • the labeling task is the prior information for the labeling process and includes the objects to be labeled (such as vehicles and pedestrians), the categories of the objects to be labeled (such as tricycle, bus, or car), preset sizes, and the output file format of the labeling file.
  • the labeling task can be set by annotators modifying the parameters of the cloud model according to actual needs, or it can be sent from the labeling terminal to the cloud by the annotators. Since the cloud is not limited by computing resources, the cloud's deep learning algorithms can be used to pre-label continuous frame data, reducing the workload of subsequent manual labeling and improving work efficiency.
  • the specific labeling process in the cloud is as follows:
  • the cloud 110 reads the continuous frame data and performs target detection on each frame of the continuous frame data according to the labeling task, taking the obtained category and position of the object to be labeled in each frame of data as the detection result.
  • the cloud 110 establishes an association relationship between the same object to be labeled in each frame of data according to the detection result and the timing information between each frame of data.
  • the association relationship is the pre-labeling result of the continuous frame data; the cloud 110 generates an expandable pre-labeled file from the pre-labeling result according to the output file format, and sends the pre-labeled file and the continuous frame data to the labeling terminal 120;
  • the continuous frame data is a sequence of several data items of the same type, equally spaced and ordered in time, and may be pictures or 3D lidar point clouds.
  • for 3D lidar point clouds in particular, labeling with existing labeling techniques is slow and costly.
  • the labeling system provided in this embodiment can serve as an auxiliary labeling step for 3D lidar point clouds. Since the cloud is not limited by computing resources, pre-labeling in the cloud reduces the workload of human annotators, lowers labeling costs, and improves labeling efficiency.
  • the cloud performs target detection on each frame of the continuous frame data, which can be achieved with a preset target detection model that establishes the relationship between the object to be labeled and its category and position in each frame of data. Through the preset target detection model, the category and position of the object to be labeled can be obtained.
  • the preset target detection model can be PointRCNN (Regions with Convolution Neural Network, a region-based convolutional neural network for raw point clouds), or the output results of multiple models can be fused.
  • the position of the object to be labeled can be calibrated with a cuboid auxiliary box.
  • the specific position information of the cuboid can be expressed by the coordinates (x, y, z) of the cuboid's center, the length, width and height of the cuboid (w, h, d), and the orientation angle θ of the cuboid; that is, the position of the object to be labeled regressed by the preset target detection model consists of the seven variables x, y, z, w, h, d, and θ. These variables can be represented in the form of auxiliary boxes.
  • the preset target detection model mainly recognizes the category and position of the object to be labeled. Whether an object belongs to a category that the labeling task requires can be determined by classification, and the position of the object to be labeled can be determined by regression.
  • correspondingly, the loss function used when training the preset target detection model generally also includes a classification part and a regression part.
  • the regression part of the loss function used when training the preset target detection model is: a weighted sum of the positions of the objects to be labeled sorted by the magnitude of the normalized error, where the weight is w raised to the power k, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  • in the prior art, the regression part of a target detection model generally uses loss functions such as L1, L2, or Smooth L1 on the differences between the predicted and ground-truth values of physical quantities such as position (x, y, z), size (w, h, d), and orientation angle (θ), or loss functions based on the overlap between the predicted and ground-truth boxes, such as IoU (Intersection over Union), GIoU (Generalized Intersection over Union), and DIoU. These losses only consider the accuracy of the predicted and ground-truth box positions and ignore a specific need of annotation, namely reducing the number of times annotators have to modify the auxiliary boxes.
  • by adjusting the weights of the different terms of the loss function, the loss function used when training the preset target detection model provided in this embodiment makes only a few terms in its result deviate somewhat while the other terms stay close to 0, rather than every term deviating. This setting reduces the number of adjustments and the time annotators spend on the auxiliary boxes, and improves labeling efficiency.
  • after the cloud 110 obtains the category and position of the object to be labeled from the preset target detection model, it can establish an association relationship between the same object to be labeled in each frame of data according to the detection result and the timing information between each frame of data. The same object to be labeled can be represented by the same number in each frame of data. Establishing the association relationship between the same object across frames mainly amounts to tracking that object.
  • for example, if vehicle 1 appears in the current frame of data, it is necessary to determine whether vehicle 1 can still be detected in the next frame of data; if it can, the connection between vehicle 1 in the current frame and vehicle 1 in the next frame can be established according to the timing information.
  • the association itself can be performed with a machine learning method, such as the Kalman filter algorithm.
  • in addition, since the same object to be labeled should keep the same length, width, and height, and its position and orientation should change continuously, machine learning methods such as the Kalman filter algorithm can be used to verify and correct the single-frame results; for example, objects to be labeled that were missed in some frames of the continuous data can be filled in, and false detections can be deleted.
  • after the association relationship is determined, it can be used as the pre-labeling result of the continuous frame data.
  • the cloud 110 generates an expandable pre-labeled file from the pre-labeling result according to the output file format in the labeling task, and sends the pre-labeled file and the continuous frame data to the labeling terminal 120 for annotators to make corrections at the labeling terminal 120.
  • the labeling terminal 120 is configured to receive the continuous frame data and the corresponding pre-labeled file sent by the cloud 110 and, after receiving a correction instruction for the pre-labeled file, to correct the labeled file according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data.
  • a function button for correcting the pre-labeled file can be added at the labeling terminal.
  • when the function button is triggered, the pre-labeled file can be corrected.
  • for example, the vehicle orientation detected by the cloud's preset target detection model may not be accurate, so a one-key function that flips the orientation by 180° can be added at the labeling terminal so that annotators can check and modify it.
  • because the preset target detection model in the cloud is trained with the weights of the different loss terms adjusted so that only a few terms of the loss deviate somewhat while the others stay close to 0, rather than every term deviating, annotators who modify the model's detection results at the labeling terminal, that is, the auxiliary boxes of the objects to be labeled, need fewer adjustments and less time, which improves labeling efficiency.
  • in the technical solution provided by this embodiment, target detection is performed on single frames of data and the detection results are associated according to the timing information between each frame of data, yielding the pre-labeling result of the continuous frame data. Subsequent human annotators only need to check for omissions on the basis of the pre-labeling result through the labeling terminal.
  • the function buttons provided at the labeling terminal make annotators' modifications convenient, which also improves the labeling efficiency of continuous frame data to a certain extent. That is, by adopting a labeling mode in which the cloud and the labeling terminal cooperate, the technical solution provided by this embodiment can effectively reduce the workload of human annotators, lower labeling costs, and improve labeling speed and accuracy.
  • FIG. 2 is a schematic flowchart of a method for labeling continuous frame data applied to the cloud according to an embodiment of the present invention.
  • the method in this embodiment can be executed by a device for labeling continuous frame data, which can be implemented in software and/or hardware and can generally be integrated in cloud servers such as Alibaba Cloud or Baidu Cloud; the embodiment of the present invention does not limit this.
  • the method provided in this embodiment specifically includes:
  • the labeling task includes the category and location of the object to be labeled.
  • for the specific target detection method, refer to the description of the foregoing embodiment; it is not repeated here.
  • the target detection and association of continuous frame data in the cloud is an auxiliary labeling step performed before the continuous frame data is labeled at the labeling terminal.
  • the algorithm of this auxiliary labeling step runs in the cloud and is not limited by computing resources.
  • the pre-labeling results obtained after the cloud's auxiliary labeling can serve as the basis for subsequent review by annotators, who can make adjustments on that basis. This setting reduces annotators' workload and improves labeling efficiency and accuracy.
  • FIG. 3 is a schematic flowchart of a method for labeling continuous frame data applied to the labeling terminal according to an embodiment of the present invention.
  • the method can be executed by a labeling device for continuous frame data, which can be implemented by software and/or hardware, and generally can be integrated in a labeling terminal.
  • the method provided in this embodiment specifically includes:
  • the pre-labeling result is the association relationship between the same object to be labeled in each frame of data that the cloud establishes, after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame according to the labeling task and from the timing information between each frame of data.
  • the detection result includes the category and location of the object to be labeled.
  • auxiliary function buttons can be added at the labeling terminal, such as one-key rotation of a vehicle's orientation by 180°, to facilitate manual labeling.
  • the regression part of the loss function used when training the preset target detection model that the cloud uses for target detection on single-frame data is: a weighted sum of the positions of the objects to be labeled sorted by the magnitude of the normalized error, where the weight of a normalized error is w raised to the power k, w is a hyperparameter, and k is the rank of the normalized error after sorting.
  • this setting makes only a few terms in the result of the loss function deviate somewhat while the other terms stay close to 0, rather than every term deviating, so that during manual annotation the number of adjustments and the time annotators spend on the auxiliary boxes are reduced, and labeling efficiency improves.
  • the pre-labeled file sent by the cloud serves as the basis for corrections at the labeling terminal; on this basis, annotators can further check the pre-labeled file for omissions.
  • FIG. 4 is a schematic structural diagram of an apparatus for labeling continuous frame data applied to the cloud according to an embodiment of the present invention.
  • the device includes: an annotation task acquisition module 410, a target detection module 420, and an association module 430; among them,
  • the labeling task acquisition module 410 is configured to acquire a labeling task, where the labeling task includes the category and location of the object to be labelled;
  • the target detection module 420 is configured to read continuous frame data and perform target detection on each frame of data in the continuous frame data according to the labeling task, taking the obtained category and position of the object to be labeled in each frame of data as the detection result;
  • the association module 430 is configured to establish an association relationship between the same object to be labeled in each frame of data according to the detection result and the timing information between each frame of data, wherein the association relationship serves as the pre-labeling result of the continuous frame data and is used for correction at the labeling terminal according to a correction instruction; the labeling result corrected at the labeling terminal is the target labeling result of the continuous frame data.
  • the device further includes:
  • the correction module is configured to correct the detection result based on a machine learning method so that the same object to be labeled has the same size, wherein the machine learning method includes a Kalman filter algorithm.
  • the labeling task also includes an output file format
  • the device further includes:
  • a file generating module configured to generate an expandable pre-labeled file according to the output file format from the pre-labeled result, and send the pre-labeled file and the continuous frame data to the labeling terminal for labeling The personnel make corrections at the marked end.
  • the continuous frame data is a picture or a 3D lidar point cloud.
  • the device for labeling continuous frame data provided by the embodiment of the present invention can execute the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention, and has corresponding functional modules and beneficial effects for the execution method.
  • for technical details not described in detail in the above embodiment, refer to the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a device for labeling continuous frame data applied to the labeling terminal according to an embodiment of the present invention. As shown in FIG. 5, the device includes: a pre-labeling result acquisition module 510 and a target labeling result generation module 520; wherein,
  • the pre-labeling result acquisition module 510 is configured to obtain the pre-labeling result of continuous frame data sent by the cloud;
  • the target labeling result generation module 520 is configured to, if a correction instruction for the pre-labeling result is received, correct the labeling result according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data;
  • the pre-labeling result is the association relationship between the same object to be labeled in each frame of data that the cloud establishes, after reading the continuous frame data, from the detection result obtained by performing target detection on the objects to be labeled in each frame according to the labeling task and from the timing information between each frame of data; the detection result includes the category and location of the object to be labeled;
  • the labeling task includes the category and location of the object to be labeled.
  • the device for labeling continuous frame data provided by the embodiment of the present invention can execute the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention, and has corresponding functional modules and beneficial effects for the execution method.
  • for technical details not described in detail in the above embodiment, refer to the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a cloud server according to an embodiment of the present invention.
  • the cloud server may include:
  • a memory 701 storing executable program codes
  • a processor 702 coupled to the memory 701;
  • the processor 702 calls the executable program code stored in the memory 701 to execute the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
  • the embodiment of the present invention also provides another labeling terminal, including a memory storing executable program code and a processor coupled with the memory; the processor calls the executable program code stored in the memory to execute the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
  • the embodiment of the present invention discloses a computer-readable storage medium that stores a computer program, where the computer program causes a computer to execute the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
  • the embodiment of the present invention also discloses a computer-readable storage medium that stores a computer program, wherein the computer program causes the computer to execute the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
  • the embodiment of the present invention discloses a computer program product, wherein when the computer program product runs on a computer, the computer is caused to execute part or all of the steps of the continuous frame data labeling method applied to the cloud provided by any embodiment of the present invention.
  • the embodiment of the present invention also discloses a computer program product, wherein when the computer program product runs on a computer, the computer is caused to execute part or all of the steps of the continuous frame data labeling method applied to the labeling terminal provided by any embodiment of the present invention .
  • "B corresponding to A" means that B is associated with A, and B can be determined according to A.
  • however, determining B according to A does not mean that B is determined based on A alone; B can also be determined based on A and/or other information.
  • the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the aforementioned integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-accessible memory.
  • based on this understanding, the essence of the technical solution of the present invention, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a memory and includes several requests to cause a computer device (which may be a personal computer, a server, or a network device, and specifically a processor in the computer device) to execute part or all of the steps of the above methods of the various embodiments of the present invention.
  • the program can be stored in a computer-readable storage medium.
  • the storage medium includes read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), programmable read-only memory (Programmable Read-only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), one-time programmable read-only memory (One-time Programmable Read-Only Memory, OTPROM), electrically erasable programmable read-only memory (Electronically-Erasable Programmable Read-Only Memory, EEPROM), compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM) or other optical disk storage, magnetic disk storage, tape storage, or any other computer-readable medium that can be used to carry or store data.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the present invention discloses a system, method and device for labeling continuous frame data. The system includes a cloud and a labeling terminal. The cloud reads continuous frame data and, according to a labeling task, performs target detection on each frame of the continuous frame data to obtain a detection result for the objects to be labeled in each frame; according to the detection result and the timing information between frames, it establishes an association relationship between the same object to be labeled across frames as the pre-labeling result; it generates an expandable pre-labeled file from the pre-labeling result and sends the pre-labeled file and the continuous frame data to the labeling terminal. The labeling terminal receives the continuous frame data and the corresponding pre-labeled file sent by the cloud and, after receiving a correction instruction for the pre-labeled file, corrects the labeled file according to the correction instruction to obtain the target labeling result. By adopting this scheme, the manual time required for labeling continuous frame data is shortened, the labeling efficiency of continuous frame data is improved, and labeling costs are reduced.

Description

System, method and device for labeling continuous frame data
Technical Field
The present invention relates to the technical field of autonomous driving, and in particular to a system, method and device for labeling continuous frame data.
Background
In the field of autonomous driving, the perception module takes the data of a variety of sensors, together with high-precision map information, as input and, after a series of calculations and processing, accurately perceives the environment around the autonomous vehicle. Autonomous driving perception algorithms currently rely mainly on deep learning, which needs large labeled data sets to train models, so being able to generate large amounts of labeled data faster and more efficiently is key to autonomous driving perception.
At present, most labeled data, including 2D images and 3D lidar point cloud data, is labeled manually, which is a very slow and inefficient process. It requires a person to sit in front of a computer screen, operate a labeling tool, and mark objects one by one, which is extremely labor-intensive. Lidar data, because of the complexity and sparsity of its form, is particularly easy to mislabel or miss, and such errors may even harm neural network training.
Summary of the Invention
The embodiments of the present invention disclose a system, method and device for labeling continuous frame data, which greatly shorten the manual time required for labeling continuous frame data, improve the labeling efficiency of continuous frame data, and reduce labeling costs.
In a first aspect, an embodiment of the present invention discloses a system for labeling continuous frame data, the system including a cloud and a labeling terminal, wherein:
the cloud is configured to obtain a labeling task, the labeling task including the category and position of the objects to be labeled and an output file format;
the cloud reads continuous frame data and, according to the labeling task, performs target detection on each frame of the continuous frame data, taking the obtained category and position of the objects to be labeled in each frame as the detection result;
the cloud establishes, according to the detection result and the timing information between frames, an association relationship between the same object to be labeled in each frame of data, the association relationship being the pre-labeling result of the continuous frame data;
the cloud generates an expandable pre-labeled file from the pre-labeling result according to the output file format, and sends the pre-labeled file and the continuous frame data to the labeling terminal;
the labeling terminal is configured to receive the continuous frame data and the corresponding pre-labeled file sent by the cloud and, after receiving a correction instruction for the pre-labeled file, to correct the labeled file according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data.
In a second aspect, an embodiment of the present invention also provides a method for labeling continuous frame data, applied to the cloud, the method including:
obtaining a labeling task, the labeling task including the category and position of the objects to be labeled;
reading continuous frame data and, according to the labeling task, performing target detection on each frame of the continuous frame data, taking the obtained category and position of the objects to be labeled in each frame as the detection result;
establishing, according to the detection result and the timing information between frames, an association relationship between the same object to be labeled in each frame of data, the association relationship serving as the pre-labeling result of the continuous frame data and being used for correction at the labeling terminal according to correction instructions, the labeling result corrected at the labeling terminal being the target labeling result of the continuous frame data.
Optionally, the method further includes:
correcting the detection result based on a machine learning method so that the same object to be labeled has the same size, the machine learning method including a Kalman filter algorithm.
Optionally, the labeling task also includes an output file format;
correspondingly, the method further includes:
generating an expandable pre-labeled file from the pre-labeling result according to the output file format, and sending the pre-labeled file and the continuous frame data to the labeling terminal for annotators to make corrections there.
Optionally, the continuous frame data is pictures or 3D lidar point clouds.
In a third aspect, an embodiment of the present invention also discloses a device for labeling continuous frame data, applied to the cloud, the device including:
a labeling task acquisition module configured to obtain a labeling task, the labeling task including the category and position of the objects to be labeled;
a target detection module configured to read continuous frame data and, according to the labeling task, perform target detection on each frame of the continuous frame data, taking the obtained category and position of the objects to be labeled in each frame as the detection result;
an association module configured to establish, according to the detection result and the timing information between frames, an association relationship between the same object to be labeled in each frame of data, the association relationship serving as the pre-labeling result of the continuous frame data and being used for correction at the labeling terminal according to correction instructions, the labeling result corrected at the labeling terminal being the target labeling result of the continuous frame data.
Optionally, the device further includes:
a correction module configured to correct the detection result based on a machine learning method so that the same object to be labeled has the same size, the machine learning method including a Kalman filter algorithm.
Optionally, the labeling task also includes an output file format;
correspondingly, the device further includes:
a file generating module configured to generate an expandable pre-labeled file from the pre-labeling result according to the output file format, and to send the pre-labeled file and the continuous frame data to the labeling terminal for annotators to make corrections there.
Optionally, the continuous frame data is pictures or 3D lidar point clouds.
In a fourth aspect, an embodiment of the present invention also discloses a method for labeling continuous frame data, applied to the labeling terminal, the method including:
obtaining the pre-labeling result of continuous frame data sent by the cloud;
if a correction instruction for the pre-labeling result is received, correcting the labeling result according to the correction instruction, and taking the corrected labeling result as the target labeling result of the continuous frame data;
wherein the pre-labeling result is the association relationship between the same object to be labeled in each frame of data that the cloud establishes, after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame according to the labeling task and from the timing information between frames; the detection result includes the category and position of the objects to be labeled;
wherein the labeling task includes the category and position of the objects to be labeled.
In a fifth aspect, an embodiment of the present invention also provides a device for labeling continuous frame data, applied to the labeling terminal, the device including:
a pre-labeling result acquisition module configured to obtain the pre-labeling result of continuous frame data sent by the cloud;
a target labeling result generation module configured to, if a correction instruction for the pre-labeling result is received, correct the labeling result according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data;
wherein the pre-labeling result is the association relationship between the same object to be labeled in each frame of data that the cloud establishes, after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame according to the labeling task and from the timing information between frames; the detection result includes the category and position of the objects to be labeled;
wherein the labeling task includes the category and position of the objects to be labeled.
In a sixth aspect, an embodiment of the present invention also provides a cloud server, including:
a memory storing executable program code;
a processor coupled with the memory;
the processor calls the executable program code stored in the memory to execute part or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
In a seventh aspect, an embodiment of the present invention also provides a labeling terminal, including:
a memory storing executable program code;
a processor coupled with the memory;
the processor calls the executable program code stored in the memory to execute part or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
In an eighth aspect, an embodiment of the present invention also provides a computer-readable storage medium storing a computer program, the computer program including instructions for executing part or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
In a ninth aspect, an embodiment of the present invention also provides a computer-readable storage medium storing a computer program, the computer program including instructions for executing part or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
In a tenth aspect, an embodiment of the present invention also provides a computer program product which, when run on a computer, causes the computer to execute part or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
In an eleventh aspect, an embodiment of the present invention also provides a computer program product which, when run on a computer, causes the computer to execute part or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
In the technical solution provided by this embodiment, target detection is performed on single frames of data and the detection results are associated according to the timing information between frames, yielding the pre-labeling result of the continuous frame data. Subsequent human annotators only need to check for omissions on the basis of the pre-labeling result through the labeling terminal. In addition, function buttons provided on the labeling terminal make annotators' modifications convenient, which also improves the labeling efficiency of continuous frame data to a certain extent. In summary, by adopting a labeling mode in which the cloud and the labeling terminal cooperate, the technical solution provided by this embodiment can effectively reduce the workload of human annotators, lower labeling costs, and improve labeling speed and accuracy.
The invention points of the present invention include:
1. On the basis of the prior art, before continuous frame data is labeled at the labeling terminal, the technical solution of the embodiments of the present invention adds auxiliary labeling steps in the cloud, such as target detection on single-frame data and association of continuous frame data. The pre-labeling result obtained after the cloud's auxiliary labeling can serve as the basis for subsequent review by annotators, who can adjust and correct it through the labeling terminal. This solves the problem of low manual labeling efficiency in the prior art and is one of the invention points of the present invention.
2. Auxiliary function buttons are added at the labeling terminal, through which annotators can trigger correction instructions, making it convenient for them to adjust the pre-labeled file. The embodiments of the present invention adopt a labeling mode in which the cloud and the labeling terminal cooperate with each other, which effectively improves labeling efficiency and reduces labeling costs, and is one of the invention points of the present invention.
3. When performing target detection on single-frame data, the cloud uses a preset target detection model that establishes the association between an object to be labeled and its category and position in each frame of data. The loss function used when training this model is a weighted sum of the positions of the objects to be labeled sorted by the magnitude of the normalized error, where the weight of a normalized error is w raised to the power k, w is a hyperparameter, and k is the rank of the error after sorting. This setting reduces the number of adjustments and the time annotators spend on the auxiliary boxes and improves labeling efficiency, and is one of the invention points of the present invention.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from them without creative work.
Fig. 1 is a schematic structural diagram of a system for labeling continuous frame data provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a method for labeling continuous frame data applied to the cloud provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a method for labeling continuous frame data applied to the labeling terminal provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a device for labeling continuous frame data applied to the cloud provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a device for labeling continuous frame data applied to the labeling terminal provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a cloud server provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present invention.
It should be noted that the terms "including" and "having" and any variations thereof in the embodiments and drawings of the present invention are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product or device.
Embodiment 1
Referring to Fig. 1, Fig. 1 is a schematic structural diagram of a system for labeling continuous frame data provided by an embodiment of the present invention. The system can be applied to autonomous driving; through it, large amounts of labeled data can be generated faster and more efficiently for model training. As shown in Fig. 1, the system for labeling continuous frame data provided by this embodiment specifically includes a cloud 110 and a labeling terminal 120, wherein:
the cloud 110 is configured to obtain a labeling task, the labeling task including the category and position of the objects to be labeled and an output file format.
The labeling task is the prior information for the labeling process and includes the objects to be labeled (for example vehicles and pedestrians), the categories of the objects to be labeled (for example tricycle, bus or car), preset sizes, and the output file format of the labeling file. The labeling task can be set by annotators modifying the parameters of the cloud model according to actual needs, or it can be sent to the cloud from the labeling terminal by the annotators. Since the cloud is not limited by computing resources, the cloud's deep learning algorithms can be used to pre-label the continuous frame data, reducing the workload of subsequent manual labeling and improving work efficiency.
Specifically, the labeling process in the cloud is as follows: the cloud 110 reads the continuous frame data and, according to the labeling task, performs target detection on each frame of the continuous frame data, taking the obtained category and position of the objects to be labeled in each frame as the detection result. The cloud 110 establishes, according to the detection result and the timing information between frames, an association relationship between the same object to be labeled in each frame of data; this association relationship is the pre-labeling result of the continuous frame data. The cloud 110 generates an expandable pre-labeled file from the pre-labeling result according to the output file format, and sends the pre-labeled file and the continuous frame data to the labeling terminal 120.
In this embodiment, the continuous frame data is a sequence of several data items of the same type, equally spaced and ordered in time, and may be pictures or 3D lidar point clouds. For 3D lidar point clouds in particular, labeling with existing labeling techniques is slow and costly. The labeling system provided by this embodiment can serve as an auxiliary labeling step for 3D lidar point clouds. Since the cloud is not limited by computing resources, pre-labeling in the cloud reduces the workload of human annotators, lowers labeling costs, and improves labeling efficiency.
Exemplarily, the target detection the cloud performs on each frame of the continuous frame data can be implemented with a preset target detection model that establishes the association between an object to be labeled and its category and position in each frame of data. Through the preset target detection model, the category and position of the objects to be labeled can be obtained.
Exemplarily, the preset target detection model can be PointRCNN (Regions with Convolution Neural Network, a region-based convolutional neural network for raw point clouds), or the outputs of several models can be fused; this embodiment does not specifically limit it. In this embodiment, the position of an object to be labeled can be calibrated with a cuboid auxiliary box. The specific position of this cuboid can be expressed by the coordinates (x, y, z) of its center, its length, width and height (w, h, d), and its orientation angle θ; that is, the position of the object to be labeled regressed by the preset target detection model consists of the seven variables x, y, z, w, h, d and θ. These variables can be represented in the form of an auxiliary box.
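A minimal sketch of this seven-variable auxiliary box as a data structure (the class and field names are our own shorthand; only the seven variables themselves come from the text above):

```python
from dataclasses import dataclass

@dataclass
class AuxiliaryBox:
    """Cuboid auxiliary box regressed by the preset target detection model."""
    x: float      # center coordinates of the cuboid
    y: float
    z: float
    w: float      # length, width and height of the cuboid
    h: float
    d: float
    theta: float  # orientation angle of the cuboid
```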
It should be noted that the preset target detection model provided by this embodiment mainly recognizes the category and position of the objects to be labeled. Whether an object belongs to a category that the labeling task requires can be determined by classification, and the position of the object to be labeled can be determined by regression. Correspondingly, the loss function used when training the preset target detection model generally also includes a classification part and a regression part. The regression part of the loss function used when training the preset target detection model is a weighted sum of the positions of the objects to be labeled sorted by the magnitude of the normalized error, where the weight of a normalized error is w raised to the power k, w is a hyperparameter, and k is the rank of the normalized error after sorting. The reason for this setting is as follows:
In the prior art, the regression part of a target detection model generally uses loss functions such as L1, L2 or Smooth L1 on the differences between the predicted and ground-truth values of physical quantities such as position (x, y, z), size (w, h, d) and orientation angle (θ), or loss functions based on the overlap between the predicted box and the ground-truth box, such as IoU (Intersection over Union), GIoU (Generalized Intersection over Union) and DIoU. All of these can make the model's predictions approach the ground truth. However, they generally only consider the accuracy of the predicted and ground-truth box positions and ignore a specific need of annotation: reducing, as far as possible, the number of times annotators have to modify the auxiliary boxes.
By adjusting the weights of the different terms of the loss function, the loss function used when training the preset target detection model provided by this embodiment makes only a few terms in its result deviate somewhat while the other terms stay close to 0, rather than every term deviating a little. This setting reduces the number of adjustments and the time annotators spend on the auxiliary boxes and improves labeling efficiency.
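For illustration, one way this rank-weighted regression loss might be realized is sketched below. The text above fixes only the w^k weighting over errors sorted by magnitude; the ascending sort direction (chosen so that, with w < 1, already-small errors carry the most weight and are driven to zero, concentrating the residual deviation in a few variables) and the per-variable normalization constants are assumptions:

```python
import torch

def rank_weighted_regression_loss(pred, target, scale, w=0.5):
    """Sketch of the regression loss: a weighted sum of normalized errors,
    with the error of rank k weighted by w ** k (w is a hyperparameter)."""
    # Normalized per-variable errors for one box (x, y, z, w, h, d, theta).
    err = torch.abs(pred - target) / scale
    # Sort by magnitude; rank k = 0 is the smallest normalized error.
    sorted_err, _ = torch.sort(err)
    k = torch.arange(sorted_err.numel(), dtype=sorted_err.dtype)
    weights = w ** k  # geometric rank weights
    # With w < 1 the small errors dominate the loss, so the optimizer pushes
    # most variables to (near) zero error and tolerates a few outliers,
    # which is the behavior the paragraph above describes.
    return torch.sum(weights * sorted_err)
```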
After the cloud 110 has obtained the category and position of the objects to be labeled from the preset target detection model, it can establish, according to the detection result and the timing information between frames, an association relationship between the same object to be labeled in each frame of data. The same object to be labeled can be represented by the same number in every frame. Establishing the association between the same object across frames mainly amounts to tracking that object. For example, if vehicle 1 appears in the current frame, it must be determined whether vehicle 1 can still be detected in the next frame; if it can, the connection between vehicle 1 in the current frame and vehicle 1 in the next frame can be established according to the timing information. The association itself can be performed with a machine learning method, for example the Kalman filter algorithm.
In addition, according to the timing information, the same object to be labeled should have the same length, width and height, and its position and orientation should change continuously. Machine learning methods such as the Kalman filter algorithm can therefore be used to verify and correct the single-frame results. For example, objects missed in the continuous frame data can be filled in: if vehicle 2 is present in the frames before and after but is not detected in some middle frame, this method shows that vehicle 2 was missed in single-frame detection. Likewise, false detections in the single-frame results can be deleted with this method. In this way the objects to be labeled in the continuous frame data can be tracked.
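As a rough illustration of the association step, the sketch below matches each tracked object's predicted center to the nearest detection in the current frame. The text above names only "machine learning methods such as the Kalman filter algorithm", so the nearest-neighbor matching and the distance gate here are assumptions; unmatched tracks hint at missed detections to fill in, and unmatched detections at false detections to delete:

```python
import numpy as np

def associate_frame(tracks, detections, gate=2.0):
    """Match track ids to detections so the same object keeps the same number.

    tracks: dict mapping object id -> predicted center (x, y, z) as np.ndarray
            (e.g. predicted by a Kalman filter from previous frames).
    detections: list of detected centers (np.ndarray) in the current frame.
    Returns a dict mapping object id -> index of the matched detection.
    """
    matches, used = {}, set()
    for tid, predicted in tracks.items():
        # Distance to every not-yet-claimed detection.
        dists = [np.linalg.norm(predicted - det) if i not in used else np.inf
                 for i, det in enumerate(detections)]
        if dists:
            best = int(np.argmin(dists))
            if dists[best] < gate:  # accept only plausible, nearby matches
                matches[tid] = best
                used.add(best)
    return matches
```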
In this embodiment, once the association relationship has been determined, it can serve as the pre-labeling result of the continuous frame data. The cloud 110 generates an expandable pre-labeled file from the pre-labeling result according to the output file format in the labeling task, and sends the pre-labeled file and the continuous frame data to the labeling terminal 120 for annotators to make corrections at the labeling terminal 120.
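A minimal sketch of generating such an expandable pre-labeled file; JSON and the exact field layout are assumptions, since the output file format is left to the labeling task:

```python
import json

def write_prelabel_file(frames, path):
    """Write the pre-labeling result as a JSON file.

    frames: list over frames; each entry is a list of
            (object_id, category, (x, y, z, w, h, d, theta)) tuples,
            where the same object id recurs across frames.
    """
    doc = {
        "version": 1,
        "frames": [
            {
                "frame_index": i,
                "objects": [
                    {"id": oid, "category": cat,
                     "box": [x, y, z, w, h, d, theta]}
                    for oid, cat, (x, y, z, w, h, d, theta) in objs
                ],
            }
            for i, objs in enumerate(frames)
        ],
        "extensions": {},  # spare room for extra fields keeps the file expandable
    }
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(doc, fp, ensure_ascii=False, indent=2)
```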
The labeling terminal 120 is configured to receive the continuous frame data and the corresponding pre-labeled file sent by the cloud 110 and, after receiving a correction instruction for the pre-labeled file, to correct the labeled file according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data.
Exemplarily, a function button for correcting the pre-labeled file can be added at the labeling terminal; when the button is triggered, the pre-labeled file can be corrected. For example, for vehicle detection, the vehicle orientation detected by the cloud's preset target detection model is not necessarily accurate, so a one-key function that flips the orientation by 180° can be added at the labeling terminal, making it easy for annotators to check and fix it.
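Such a one-key correction can be as small as the following sketch (a hypothetical handler, not part of the disclosure):

```python
import math

def flip_orientation(theta: float) -> float:
    """Flip a pre-labeled box's orientation angle by 180 degrees,
    as a labeling-terminal function button might do."""
    return (theta + math.pi) % (2.0 * math.pi)
```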
In addition, because the preset target detection model in the cloud is trained with the weights of the different loss terms adjusted so that only a few terms of the loss deviate somewhat while the other terms stay close to 0, rather than every term deviating, annotators who modify the model's detection results at the labeling terminal, that is, the auxiliary boxes of the objects to be labeled, need fewer adjustments and less time, which improves labeling efficiency.
In the technical solution provided by this embodiment, target detection is performed on single frames of data and the detection results are associated according to the timing information between frames, yielding the pre-labeling result of the continuous frame data. Subsequent human annotators only need to check for omissions on the basis of the pre-labeling result through the labeling terminal. In addition, the function buttons provided at the labeling terminal make annotators' modifications convenient, which also improves the labeling efficiency of continuous frame data to a certain extent. That is, by adopting a labeling mode in which the cloud and the labeling terminal cooperate, the technical solution provided by this embodiment can effectively reduce the workload of human annotators, lower labeling costs, and improve labeling speed and accuracy.
Embodiment 2
Referring to Fig. 2, Fig. 2 is a schematic flowchart of a method for labeling continuous frame data applied to the cloud provided by an embodiment of the present invention. The method of this embodiment can be executed by a device for labeling continuous frame data, which can be implemented in software and/or hardware and can generally be integrated in a cloud server such as Alibaba Cloud or Baidu Cloud; the embodiment of the present invention does not limit this. As shown in Fig. 2, the method provided by this embodiment specifically includes:
210. Obtain a labeling task.
The labeling task includes the category and position of the objects to be labeled.
220. Read continuous frame data and, according to the labeling task, perform target detection on each frame of the continuous frame data, taking the obtained category and position of the objects to be labeled in each frame as the detection result.
For the specific target detection method, refer to the description of the foregoing embodiment; it is not repeated here.
230. Establish, according to the detection result and the timing information between frames, an association relationship between the same object to be labeled in each frame of data, where the association relationship serves as the pre-labeling result of the continuous frame data and is used for correction at the labeling terminal according to correction instructions.
In this embodiment, the target detection and association of continuous frame data in the cloud is an auxiliary labeling step performed before the continuous frame data is labeled at the labeling terminal. The algorithm of this auxiliary step runs in the cloud and is not limited by computing resources. The pre-labeling result obtained after the cloud's auxiliary labeling can serve as the basis for subsequent review by annotators, who can make adjustments on that basis. This setting reduces annotators' workload and improves labeling efficiency and accuracy.
Embodiment 3
Referring to Fig. 3, Fig. 3 is a schematic flowchart of a method for labeling continuous frame data applied to the labeling terminal provided by an embodiment of the present invention. The method can be executed by a device for labeling continuous frame data, which can be implemented in software and/or hardware and can generally be integrated in a labeling terminal. As shown in Fig. 3, the method provided by this embodiment specifically includes:
310. Obtain the pre-labeling result of continuous frame data sent by the cloud.
320. If a correction instruction for the pre-labeling result is received, correct the labeling result according to the correction instruction, and take the corrected labeling result as the target labeling result of the continuous frame data.
The pre-labeling result is the association relationship between the same object to be labeled in each frame of data that the cloud establishes, after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame according to the labeling task and from the timing information between frames. The detection result includes the category and position of the objects to be labeled.
In this embodiment, auxiliary function buttons can be added at the labeling terminal, for example one-key rotation of a vehicle's orientation by 180°, to facilitate manual labeling.
In addition, the regression part of the loss function used when training the preset target detection model that the cloud uses for target detection on single-frame data is a weighted sum of the positions of the objects to be labeled sorted by the magnitude of the normalized error, where the weight of a normalized error is w raised to the power k, w is a hyperparameter, and k is the rank of the normalized error after sorting. This setting makes only a few terms in the result of the loss function deviate somewhat while the other terms stay close to 0, rather than every term deviating, so that during manual annotation the number of adjustments and the time annotators spend on the auxiliary boxes are reduced and labeling efficiency improves.
In this embodiment, the pre-labeled file sent by the cloud serves as the basis for corrections at the labeling terminal; on this basis, annotators can further check the pre-labeled file for omissions. Adopting a labeling mode in which the cloud's pre-labeling and the labeling terminal cooperate with each other can effectively improve labeling efficiency and reduce labeling costs.
Embodiment 4
Referring to Fig. 4, Fig. 4 is a schematic structural diagram of a device for labeling continuous frame data applied to the cloud provided by an embodiment of the present invention. As shown in Fig. 4, the device includes a labeling task acquisition module 410, a target detection module 420 and an association module 430, wherein:
the labeling task acquisition module 410 is configured to obtain a labeling task, the labeling task including the category and position of the objects to be labeled;
the target detection module 420 is configured to read continuous frame data and, according to the labeling task, perform target detection on each frame of the continuous frame data, taking the obtained category and position of the objects to be labeled in each frame as the detection result;
the association module 430 is configured to establish, according to the detection result and the timing information between frames, an association relationship between the same object to be labeled in each frame of data, the association relationship serving as the pre-labeling result of the continuous frame data and being used for correction at the labeling terminal according to correction instructions, the labeling result corrected at the labeling terminal being the target labeling result of the continuous frame data.
Optionally, the device further includes:
a correction module configured to correct the detection result based on a machine learning method so that the same object to be labeled has the same size, the machine learning method including a Kalman filter algorithm.
Optionally, the labeling task also includes an output file format;
correspondingly, the device further includes:
a file generating module configured to generate an expandable pre-labeled file from the pre-labeling result according to the output file format, and to send the pre-labeled file and the continuous frame data to the labeling terminal for annotators to make corrections there.
Optionally, the continuous frame data is pictures or 3D lidar point clouds.
The device for labeling continuous frame data provided by the embodiment of the present invention can execute the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention, and has the corresponding functional modules and beneficial effects for executing the method. For technical details not described in detail in the above embodiment, refer to the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
Embodiment 5
Referring to Fig. 5, Fig. 5 is a schematic structural diagram of a device for labeling continuous frame data applied to the labeling terminal provided by an embodiment of the present invention. As shown in Fig. 5, the device includes a pre-labeling result acquisition module 510 and a target labeling result generation module 520, wherein:
the pre-labeling result acquisition module 510 is configured to obtain the pre-labeling result of continuous frame data sent by the cloud;
the target labeling result generation module 520 is configured to, if a correction instruction for the pre-labeling result is received, correct the labeling result according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data;
the pre-labeling result is the association relationship between the same object to be labeled in each frame of data that the cloud establishes, after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame according to the labeling task and from the timing information between frames; the detection result includes the category and position of the objects to be labeled;
the labeling task includes the category and position of the objects to be labeled.
The device for labeling continuous frame data provided by the embodiment of the present invention can execute the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention, and has the corresponding functional modules and beneficial effects for executing the method. For technical details not described in detail in the above embodiment, refer to the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
Embodiment 6
Referring to Fig. 6, Fig. 6 is a schematic structural diagram of a cloud server provided by an embodiment of the present invention. As shown in Fig. 6, the cloud server may include:
a memory 701 storing executable program code;
a processor 702 coupled with the memory 701;
the processor 702 calls the executable program code stored in the memory 701 to execute the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
An embodiment of the present invention also provides another labeling terminal, including a memory storing executable program code and a processor coupled with the memory, where the processor calls the executable program code stored in the memory to execute the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
An embodiment of the present invention discloses a computer-readable storage medium storing a computer program, where the computer program causes a computer to execute the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
An embodiment of the present invention also discloses a computer-readable storage medium storing a computer program, where the computer program causes a computer to execute the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
An embodiment of the present invention discloses a computer program product which, when run on a computer, causes the computer to execute part or all of the steps of the method for labeling continuous frame data applied to the cloud provided by any embodiment of the present invention.
An embodiment of the present invention also discloses a computer program product which, when run on a computer, causes the computer to execute part or all of the steps of the method for labeling continuous frame data applied to the labeling terminal provided by any embodiment of the present invention.
In the various embodiments of the present invention, it should be understood that the sequence numbers of the above processes do not imply a necessary order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
In the embodiments provided by the present invention, it should be understood that "B corresponding to A" means that B is associated with A and that B can be determined according to A. It should also be understood, however, that determining B according to A does not mean determining B only according to A; B can also be determined according to A and/or other information.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-accessible memory. Based on this understanding, the essence of the technical solution of the present invention, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a memory and includes several requests to cause a computer device (which may be a personal computer, a server or a network device, and specifically a processor in the computer device) to execute part or all of the steps of the above methods of the various embodiments of the present invention.
Those of ordinary skill in the art can understand that all or part of the steps in the various methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and the storage medium includes read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), programmable read-only memory (Programmable Read-only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), one-time programmable read-only memory (One-time Programmable Read-Only Memory, OTPROM), electrically erasable programmable read-only memory (Electrically-Erasable Programmable Read-Only Memory, EEPROM), compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM) or other optical disk storage, magnetic disk storage, tape storage, or any other computer-readable medium that can be used to carry or store data.
The system, method and device for labeling continuous frame data disclosed by the embodiments of the present invention have been introduced in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementation and scope of application according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

  1. A system for labeling continuous frame data, characterized by including a cloud and a labeling terminal, wherein:
    the cloud is configured to obtain a labeling task, the labeling task including the category and position of the objects to be labeled and an output file format;
    the cloud reads continuous frame data and, according to the labeling task, performs target detection on each frame of the continuous frame data, taking the obtained category and position of the objects to be labeled in each frame as the detection result;
    the cloud establishes, according to the detection result and the timing information between frames, an association relationship between the same object to be labeled in each frame of data, the association relationship being the pre-labeling result of the continuous frame data;
    the cloud generates an expandable pre-labeled file from the pre-labeling result according to the output file format, and sends the pre-labeled file and the continuous frame data to the labeling terminal;
    the labeling terminal is configured to receive the continuous frame data and the corresponding pre-labeled file sent by the cloud and, after receiving a correction instruction for the pre-labeled file, to correct the labeled file according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data.
  2. A method for labeling continuous frame data, applied to the cloud, characterized by including:
    obtaining a labeling task, the labeling task including the category and position of the objects to be labeled;
    reading continuous frame data and, according to the labeling task, performing target detection on each frame of the continuous frame data, taking the obtained category and position of the objects to be labeled in each frame as the detection result;
    establishing, according to the detection result and the timing information between frames, an association relationship between the same object to be labeled in each frame of data, the association relationship serving as the pre-labeling result of the continuous frame data and being used for correction at the labeling terminal according to correction instructions, the labeling result corrected at the labeling terminal being the target labeling result of the continuous frame data.
  3. The method according to claim 2, characterized in that the method further includes:
    correcting the detection result based on a machine learning method so that the same object to be labeled has the same size, the machine learning method including a Kalman filter algorithm.
  4. The method according to claim 2, characterized in that the labeling task also includes an output file format;
    correspondingly, the method further includes:
    generating an expandable pre-labeled file from the pre-labeling result according to the output file format, and sending the pre-labeled file and the continuous frame data to the labeling terminal for annotators to make corrections at the labeling terminal.
  5. The method according to any one of claims 2-4, characterized in that the continuous frame data is pictures or 3D lidar point clouds.
  6. A device for labeling continuous frame data, applied to the cloud, characterized by including:
    a labeling task acquisition module configured to obtain a labeling task, the labeling task including the category and position of the objects to be labeled;
    a target detection module configured to read continuous frame data and, according to the labeling task, perform target detection on each frame of the continuous frame data, taking the obtained category and position of the objects to be labeled in each frame as the detection result;
    an association module configured to establish, according to the detection result and the timing information between frames, an association relationship between the same object to be labeled in each frame of data, the association relationship serving as the pre-labeling result of the continuous frame data and being used for correction at the labeling terminal according to correction instructions, the labeling result corrected at the labeling terminal being the target labeling result of the continuous frame data.
  7. The device according to claim 6, characterized in that the device further includes:
    a correction module configured to correct the detection result based on a machine learning method so that the same object to be labeled has the same size, the machine learning method including a Kalman filter algorithm.
  8. The device according to claim 6, characterized in that the labeling task also includes an output file format;
    correspondingly, the device further includes:
    a file generating module configured to generate an expandable pre-labeled file from the pre-labeling result according to the output file format, and to send the pre-labeled file and the continuous frame data to the labeling terminal for annotators to make corrections at the labeling terminal.
  9. A method for labeling continuous frame data, applied to the labeling terminal, characterized by including:
    obtaining the pre-labeling result of continuous frame data sent by the cloud;
    if a correction instruction for the pre-labeling result is received, correcting the labeling result according to the correction instruction, and taking the corrected labeling result as the target labeling result of the continuous frame data;
    wherein the pre-labeling result is the association relationship between the same object to be labeled in each frame of data that the cloud establishes, after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame according to the labeling task and from the timing information between frames; the detection result includes the category and position of the objects to be labeled;
    wherein the labeling task includes the category and position of the objects to be labeled.
  10. A device for labeling continuous frame data, applied to the labeling terminal, characterized by including:
    a pre-labeling result acquisition module configured to obtain the pre-labeling result of continuous frame data sent by the cloud;
    a target labeling result generation module configured to, if a correction instruction for the pre-labeling result is received, correct the labeling result according to the correction instruction and take the corrected labeling result as the target labeling result of the continuous frame data;
    wherein the pre-labeling result is the association relationship between the same object to be labeled in each frame of data that the cloud establishes, after reading the continuous frame data, from the detection results obtained by performing target detection on the objects to be labeled in each frame according to the labeling task and from the timing information between frames; the detection result includes the category and position of the objects to be labeled;
    wherein the labeling task includes the category and position of the objects to be labeled.
PCT/CN2020/121362 2020-01-15 2020-10-16 System, method and device for labeling continuous frame data WO2021143230A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
DE112020003085.7T 2020-01-15 2020-10-16 System, method and device for labeling data in consecutive frames

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010041206.4 2020-01-15
CN202010041206.4A 2020-01-15 2020-10-16 System, method and device for labeling continuous frame data

Publications (1)

Publication Number Publication Date
WO2021143230A1 (zh)

Family

ID=76771378

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/121362 WO2021143230A1 (zh) 2020-01-15 2020-10-16 一种连续帧数据的标注系统、方法和装置

Country Status (3)

Country Link
CN (1) CN113127666B (zh)
DE (1) DE112020003085T5 (zh)
WO (1) WO2021143230A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114067091A (zh) Multi-source data labeling method and system, electronic device and storage medium
CN114827242A (zh) Flow control frame correction method, apparatus, device and medium
CN116665177A (zh) Data processing method and apparatus, electronic apparatus and storage medium
CN116681123A (zh) Perception model training method and apparatus, computer device and storage medium
CN117784162A (zh) Target annotation data acquisition method, target tracking method, intelligent device and medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114882211A (zh) Automatic time-series data labeling method and apparatus, electronic device, medium and product

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160210704A1 (en) * 2015-01-20 2016-07-21 Grace Fang Methods and systems for tagging data in a network
CN108830466A (zh) Cloud-platform-based image content semantic annotation system and method
US20190108677A1 (en) * 2017-10-05 2019-04-11 Danaë Blondel System and Method for Object Recognition
CN109949439A (zh) Driving scene information labeling method and apparatus, electronic device and medium
CN110674295A (zh) Deep-learning-based data annotation system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106385640B (zh) Video annotation method and apparatus
CN108491774B (zh) Method and apparatus for tracking and labeling multiple targets in a video
CN108986134B (zh) Semi-automatic video target annotation method based on correlation filter tracking
CN109145836B (zh) Ship target video detection method based on deep learning networks and Kalman filtering
CN110084895B (zh) Method and device for labeling point cloud data
CN110288629B (zh) Automatic target detection labeling method and apparatus based on moving object detection

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160210704A1 (en) * 2015-01-20 2016-07-21 Grace Fang Methods and systems for tagging data in a network
US20190108677A1 (en) * 2017-10-05 2019-04-11 Danaë Blondel System and Method for Object Recognition
CN108830466A (zh) Cloud-platform-based image content semantic annotation system and method
CN109949439A (zh) Driving scene information labeling method and apparatus, electronic device and medium
CN110674295A (zh) Deep-learning-based data annotation system

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114067091A (zh) Multi-source data labeling method and system, electronic device and storage medium
CN114827242A (zh) Flow control frame correction method, apparatus, device and medium
CN114827242B (zh) Flow control frame correction method, apparatus, device and medium
CN116665177A (zh) Data processing method and apparatus, electronic apparatus and storage medium
CN116681123A (zh) Perception model training method and apparatus, computer device and storage medium
CN116665177B (zh) Data processing method and apparatus, electronic apparatus and storage medium
CN116681123B (zh) Perception model training method and apparatus, computer device and storage medium
CN117784162A (zh) Target annotation data acquisition method, target tracking method, intelligent device and medium
CN117784162B (zh) Target annotation data acquisition method, target tracking method, intelligent device and medium

Also Published As

Publication number Publication date
CN113127666A (zh) 2021-07-16
DE112020003085T5 (de) 2022-04-07
CN113127666B (zh) 2022-06-24

Similar Documents

Publication Publication Date Title
WO2021143230A1 (zh) System, method and device for labeling continuous frame data
WO2021143231A1 (zh) Training method for a target detection model, and data labeling method and device
US20210023720A1 (en) Method for detecting grasping position of robot in grasping object
US10878372B2 (en) Method, system and device for association of commodities and price tags
US20150266182A1 (en) Method And An Apparatus For Automatically Generating A Collision Free Return Program For Returning A Robot From A Stop Position To A Predefined Restart Position
US20200167568A1 (en) Image processing method, device, and storage medium
US11928594B2 (en) Systems and methods for creating training data
US20200286212A1 (en) Image processing method, apparatus, and computer-readable recording medium
US11972578B2 (en) Method and system for object tracking using online training
CN110059637B (zh) Face alignment detection method and apparatus
JP2017146710A (ja) Conveyance plan generating device and conveyance plan generating method
CN111581968A (zh) Training method, recognition method, system, device and medium for a spoken language understanding model
WO2022142744A1 (zh) Loop closure detection method, apparatus and device, and computer-readable storage medium
US20240062415A1 (en) Terminal device localization method and related device therefor
CN111368927A (zh) Labeling result processing method, apparatus, device and storage medium
CN111368860B (zh) Relocalization method and terminal device
US11631191B2 (en) Location discovery
US11481577B2 (en) Machine learning (ML) quality assurance for data curation
US12020444B2 (en) Production line monitoring method and monitoring system thereof
CN116626700A (zh) Robot localization method and apparatus, electronic device and storage medium
CN112148817A (zh) Panorama-based SLAM optimization method, apparatus and system
US20230071291A1 (en) System and method for a precise semantic segmentation
WO2022133776A1 (zh) Point cloud labeling method and apparatus, computer device and storage medium
CN114415698A (zh) Robot, and robot localization method, apparatus and computer device
CN114359376A (zh) Package localization method and apparatus, electronic device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20914531

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 20914531

Country of ref document: EP

Kind code of ref document: A1