CN115221356A - Data labeling method and device, electronic equipment and storage medium - Google Patents

Data labeling method and device, electronic equipment and storage medium

Info

Publication number
CN115221356A
CN115221356A (application CN202210912190.9A)
Authority
CN
China
Prior art keywords
point cloud, cloud data, marked, data, frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210912190.9A
Other languages
Chinese (zh)
Inventor
郑尧成
丁进超
李怡康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sensetime Lingang Intelligent Technology Co Ltd filed Critical Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority to CN202210912190.9A priority Critical patent/CN115221356A/en
Publication of CN115221356A publication Critical patent/CN115221356A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F 16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/7867 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F 16/73 Querying
    • G06F 16/732 Query formulation
    • G06F 16/7328 Query by example, e.g. a complete video frame or video sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/907 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/908 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/907 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/909 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Image Generation (AREA)

Abstract

The disclosure provides a data annotation method and device, an electronic device and a storage medium, wherein the method includes: acquiring a point cloud data set; taking each frame of point cloud data included in the point cloud data set as point cloud data to be marked, and, for each object to be marked in the point cloud data to be marked, determining reference point cloud data from the point cloud data in the point cloud data set that includes the object to be marked, based on the state attribute of the object to be marked; wherein the state attributes include: a moving state and/or a stationary state; and determining target marking data of the object to be marked in the point cloud data to be marked based on the reference marking data of the object to be marked in the reference point cloud data.

Description

Data labeling method and device, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of automatic driving, in particular to a data annotation method, a data annotation device, electronic equipment and a storage medium.
Background
Automatic driving technology relies on a complete perception system, which obtains information about the road environment around the autonomous vehicle so that the vehicle can be ensured to travel safely on the road.
Generally, labeled point cloud data is used in the training process of the perception system. As the performance and requirements of the perception system continue to rise, the amount of labeled point cloud data needed for training also increases. A method capable of generating labeled point cloud data quickly and efficiently is therefore particularly important.
Disclosure of Invention
In view of the above, the present disclosure provides at least a data annotation method, an apparatus, an electronic device and a storage medium.
In a first aspect, the present disclosure provides a data annotation method, including:
acquiring a point cloud data set;
taking each frame of point cloud data included in the point cloud data set as point cloud data to be marked, and determining reference point cloud data from the point cloud data including the object to be marked in the point cloud data set according to the state attribute of each object to be marked in the point cloud data to be marked; wherein the state attributes include: a moving state and/or a stationary state;
and determining target marking data of the object to be marked in the point cloud data to be marked based on the reference marking data of the object to be marked in the reference point cloud data.
In the method, considering that the state attribute of the object to be marked can comprise a motion state and/or a static state, for the object to be marked with different state attributes, based on the state attribute of the object to be marked, determining reference point cloud data from point cloud data including the object to be marked in a point cloud data set; furthermore, target marking data of the object to be marked in the point cloud data to be marked can be determined based on the reference marking data of the object to be marked in the reference point cloud data, automatic marking of the point cloud data to be marked is achieved, the problems of complexity and time consumption caused by manual marking are solved, and data marking efficiency is improved.
For example, the benchmark annotation data can be determined as target annotation data of the object to be annotated in the point cloud data to be annotated, so that the efficiency of data annotation is improved. Or, the reference marking data can be adjusted in response to the adjustment operation, so that target marking data of the object to be marked in the point cloud data to be marked is obtained, and the data marking efficiency is improved on the basis of ensuring the data marking accuracy.
In one possible embodiment, each frame of point cloud data included in the point cloud data set is determined by the following steps:
acquiring video data matched with the point cloud data set;
respectively carrying out frame extraction processing on the point cloud data set and the video data to obtain point cloud data of each frame included in the point cloud data set and each video frame included in the video data;
determining a video frame matched with the point cloud data based on first timestamp information corresponding to each frame of the point cloud data and second timestamp information corresponding to each video frame;
detecting each frame of point cloud data, and determining preset marking data corresponding to each object to be marked included in the point cloud data;
and taking the point cloud data associated with the preset marking data and the video frame as point cloud data of each frame included in the point cloud data set.
In the embodiment of the disclosure, the point cloud data associated with the preset annotation data and the video frame is determined as each frame of point cloud data included in the point cloud data set, and the information associated with the point cloud data is relatively rich and diverse, so that the target annotation data corresponding to each object to be annotated included in each frame of point cloud data can be relatively accurately determined in the following process.
In a possible embodiment, after the video frame matching the point cloud data is determined based on the first timestamp information corresponding to each frame of the point cloud data and the second timestamp information corresponding to each video frame, the method further includes:
determining a timestamp difference value corresponding to each frame of point cloud data based on first timestamp information corresponding to each frame of point cloud data and second timestamp information corresponding to the video frame matched with each frame of point cloud data;
screening out the point cloud data with the timestamp difference value larger than the set difference value threshold from the point cloud data of each frame to obtain the screened point cloud data of each frame;
the detecting each frame of the point cloud data and determining the preset labeling data corresponding to each object to be labeled included in the point cloud data comprises the following steps:
and detecting each frame of screened point cloud data, and determining preset marking data corresponding to each object to be marked included in each frame of screened point cloud data.
In consideration of the fact that the subsequent data labeling process of the point cloud data is assisted by using the video frame associated with the point cloud data, the object information included in the video frame associated with the point cloud data and the object information included in the point cloud data should be matched. Based on the method, the time stamp difference value corresponding to each frame of point cloud data can be determined, when the time stamp difference value is larger than the difference threshold value, the deviation between the object information included in the point cloud data and the object information included in the associated video frame is determined to be large, so that the point cloud data of which the time stamp difference value is larger than the set difference threshold value can be screened out from each frame of point cloud data, and the point cloud data of each frame after being screened are obtained, so that the accuracy of point cloud data labeling is improved.
In a possible implementation manner, the detecting each frame of point cloud data and determining preset labeling data corresponding to each object to be labeled included in the point cloud data includes:
detecting each frame of point cloud data, and generating initial labeling data corresponding to each object to be labeled included in the point cloud data, wherein the initial labeling data includes at least one of the following: category data, pose data, and size data;
and adjusting the initial marking data based on the video frame matched with the point cloud data to generate preset marking data corresponding to each object to be marked included in the point cloud data.
Here, each frame of point cloud data is detected to generate initial labeling data corresponding to each object to be labeled included in the point cloud data; then, based on the video frame matched with the point cloud data, the initial labeling data is adjusted, so that the preset labeling data corresponding to each object to be labeled included in the point cloud data is generated more accurately.
In a possible implementation manner, after generating the initial annotation data corresponding to each object to be annotated included in the point cloud data, the method further includes:
displaying the point cloud data, the video frames matched with the point cloud data and each lane line in the target map matched with the point cloud data;
and responding to the received instruction of adjusting the initial marking data, and generating preset marking data corresponding to each object to be marked included in the point cloud data based on the adjustment of the initial marking data.
Here, after the initial annotation data corresponding to each object to be annotated included in the point cloud data is generated, each lane line in the target map matched with the point cloud data, the video frame matched with the point cloud data, and the point cloud data can also be displayed; and adjusting the initial annotation data based on the relative position information between the object to be annotated and the lane line in the point cloud data indicated by the annotation interface and the video frame matched with the point cloud data, and more accurately generating the preset annotation data corresponding to each object to be annotated included in the point cloud data.
In one possible embodiment, the determining, based on the state attribute of the object to be labeled, reference point cloud data from point cloud data including the object to be labeled in the point cloud data set includes:
and under the condition that the state attribute of the object to be marked indicates a motion state, determining a preset number of point cloud data which are positioned in front of the point cloud data to be marked and adjacent to the point cloud data to be marked as reference point cloud data from the point cloud data including the object to be marked in the point cloud data set.
Here, under the condition that the state attribute of the object to be marked indicates a motion state, from point cloud data including the object to be marked in the point cloud data set, determining a preset number of point cloud data which are positioned in front of the point cloud data to be marked and are adjacent to the point cloud data to be marked as reference point cloud data; therefore, target marking data of the object to be marked in the point cloud data to be marked can be accurately determined on the basis of the preset number of point cloud data.
In one possible embodiment, the determining, based on the state attribute of the object to be labeled, reference point cloud data from point cloud data including the object to be labeled in the point cloud data set includes:
and under the condition that the state attribute of the object to be marked indicates a static state, determining point cloud data, of which the size of an object detection frame of the object to be marked is in a set size range, from the point cloud data of the object to be marked in the point cloud data set to be the reference point cloud data.
Here, when the state attribute of the object to be marked indicates a static state, point cloud data in which the size of the object detection frame of the object to be marked falls within a set size range is determined, from the point cloud data in the point cloud data set that includes the object to be marked, as the reference point cloud data; the reference marking data of the object to be marked in such reference point cloud data is therefore relatively accurate.
In a possible implementation manner, in a case that the state attribute of the object to be labeled indicates a static state, the determining, based on the benchmark annotation data of the object to be labeled in the benchmark point cloud data, target annotation data of the object to be labeled in the point cloud data to be labeled includes:
converting the reference marking data of the object to be marked in the reference point cloud data into a world coordinate system based on the calibration parameters of the acquisition equipment when the reference point cloud data is acquired, to obtain first converted marking data corresponding to the object to be marked; wherein the reference marking data is located in the vehicle body center coordinate system when the reference point cloud data is collected;
and based on the calibration parameters of the acquisition equipment when the point cloud data to be marked are acquired, converting the first converted marking data corresponding to the object to be marked to a vehicle body central coordinate system when the point cloud data to be marked are acquired, and obtaining target marking data of the object to be marked in the point cloud data to be marked.
Under the condition that the state attribute of the object to be marked indicates a static state, data such as the position and posture of the object to be marked in the world coordinate system are fixed, so the reference marking data of the object to be marked in the reference point cloud data can be converted into the world coordinate system based on the calibration parameters of the acquisition equipment when the reference point cloud data was acquired, to obtain the first converted marking data corresponding to the object to be marked; then, based on the calibration parameters of the acquisition equipment when the point cloud data to be marked was acquired, the first converted marking data corresponding to the object to be marked is converted into the vehicle body center coordinate system in effect when the point cloud data to be marked was acquired, so that the target marking data of the object to be marked in each frame of point cloud data to be marked can be obtained quickly, improving the marking efficiency for objects to be marked in a static state.
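A sketch of these two conversions for a static object is given below, assuming the calibration parameters of each frame are available as a 4x4 body-centre-to-world pose matrix and the annotation pose as a 4x4 matrix in the corresponding vehicle body center coordinate system (numpy; the function and variable names are illustrative, not from the patent):

```python
import numpy as np

def transfer_static_annotation(box_in_ref_body, ref_body_to_world, tgt_body_to_world):
    """Transfer a static object's annotated pose (4x4 homogeneous matrix, expressed
    in the reference frame's vehicle body center coordinate system) into the vehicle
    body center coordinate system of the frame to be annotated, via world coordinates."""
    box_in_world = ref_body_to_world @ box_in_ref_body       # reference body frame -> world
    world_to_tgt_body = np.linalg.inv(tgt_body_to_world)
    return world_to_tgt_body @ box_in_world                  # world -> target body frame
```

Size and category data are unchanged by this transfer; only the pose part of the annotation needs to be re-expressed.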
In a possible implementation manner, in a case that the state attribute of the object to be labeled indicates a motion state and the reference point cloud data is a preset number, the determining, based on the reference labeling data of the object to be labeled in the reference point cloud data, target labeling data of the object to be labeled in the point cloud data to be labeled includes:
for each frame of the reference point cloud data, converting the reference marking data of the object to be marked included in the reference point cloud data into a world coordinate system based on the calibration parameters of the acquisition equipment when the reference point cloud data is acquired, and obtaining second converted marking data corresponding to the object to be marked included in the reference point cloud data;
determining the predicted marking data of the object to be marked in the point cloud data to be marked in the world coordinate system based on the second converted marking data corresponding to the object to be marked in a preset number of pieces of reference point cloud data;
and converting the predicted marking data to a vehicle body center coordinate system corresponding to the point cloud data to be marked by using the calibration parameters of the acquisition equipment when the point cloud data to be marked are acquired, so as to obtain target marking data of the object to be marked in the point cloud data to be marked.
It is considered that the position data of the object in motion in the continuous frame point cloud data has continuity. Therefore, the predicted marking data of the object to be marked in the point cloud data to be marked in the world coordinate system can be determined by using the second converted marking data corresponding to the object to be marked included in the at least one frame of reference point cloud data; and then, by utilizing calibration parameters of acquisition equipment when the point cloud data to be labeled is acquired, the predicted labeling data is converted into a vehicle body center coordinate system corresponding to the acquired point cloud data to be labeled, and the target labeling data of the object to be labeled in the point cloud data to be labeled can be obtained quickly and accurately.
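One simple way to realise the prediction step is sketched below, assuming each reference frame provides the object's centre position in the world coordinate system at a known timestamp; a constant-velocity extrapolation is used here purely as an illustrative predictor, since the patent does not prescribe a particular prediction model (names are illustrative):

```python
import numpy as np

def predict_world_position(ref_times, ref_positions, target_time):
    """Extrapolate the moving object's world-frame position at `target_time`
    from its positions in the preceding reference frames.

    ref_times: (N,) timestamps of the reference frames, in acquisition order
    ref_positions: (N, 3) world-frame centre positions in those frames
    """
    t = np.asarray(ref_times, dtype=float)
    p = np.asarray(ref_positions, dtype=float)
    if len(t) == 1:
        return p[0]
    # constant-velocity model fitted to the two most recent reference frames
    velocity = (p[-1] - p[-2]) / (t[-1] - t[-2])
    return p[-1] + velocity * (target_time - t[-1])
```

The predicted world-frame data is then converted into the vehicle body center coordinate system of the point cloud data to be marked, in the same way as in the static case above.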
In a possible embodiment, after the determining the target annotation data of the object to be annotated in the point cloud data to be annotated, the method further includes:
determining track information corresponding to each object to be marked included in the point cloud data set; the track information comprises position data of the object to be marked on each frame of point cloud data to be marked;
for each object to be marked, converting each position data in the track information corresponding to the object to be marked into a world coordinate system to obtain each first position data of the object to be marked;
smoothing each first position data to obtain each second position data of the object to be marked;
converting each second position data into a vehicle body center coordinate system of the point cloud data to be marked in the frame where the object to be marked is located to obtain third position data of the object to be marked in the point cloud data to be marked in the frame where the object to be marked is located;
and updating the target marking data of the object to be marked in the point cloud data to be marked of the frame where the object to be marked is located by utilizing the third position data to obtain updated target marking data corresponding to each frame of point cloud data to be marked.
And converting each position data in the track information corresponding to the object to be marked into a world coordinate system to obtain each first position data of the object to be marked. Considering that the first position data of the object to be labeled have relevance, in order to improve the accuracy of the position data and reduce burrs, smoothing each first position data to obtain second position data of the object to be labeled; and then converting each second position data to a vehicle body center coordinate system of the to-be-marked point cloud data of the frame where the to-be-marked object is located, and obtaining third position data of the to-be-marked object after smoothing processing in the to-be-marked point cloud data of the frame where the to-be-marked object is located. Furthermore, the third position data is utilized to update the target marking data of the object to be marked in the point cloud data to be marked of the frame where the object to be marked is located, so that the updated target marking data corresponding to each frame of point cloud data to be marked can be obtained more accurately.
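The smoothing step can be sketched, for example, with a simple moving-average filter over the world-frame positions; the patent does not fix the choice of smoothing filter, so this is only one possible realisation (numpy; names are illustrative):

```python
import numpy as np

def smooth_trajectory(world_positions, window=5):
    """Moving-average smoothing of an (N, 3) array of first position data
    (world-frame positions), one row per frame in which the object appears."""
    p = np.asarray(world_positions, dtype=float)
    smoothed = np.empty_like(p)
    half = window // 2
    for i in range(len(p)):
        lo, hi = max(0, i - half), min(len(p), i + half + 1)
        smoothed[i] = p[lo:hi].mean(axis=0)   # average over the local window
    return smoothed
```

The smoothed rows correspond to the second position data, which are then converted back frame by frame into the respective vehicle body center coordinate systems to obtain the third position data.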
In a possible embodiment, in the case that the multiple frames of point cloud data to be labeled include an object to be labeled whose state attribute indicates a motion state, the method further includes:
converting the target marking data of the moving object to be marked, in each frame of point cloud data to be marked in which the object appears, into a world coordinate system, to obtain fourth position data corresponding to the object to be marked in the multiple frames of point cloud data to be marked;
and aiming at each frame of point cloud data to be marked in the multiple frames of point cloud data to be marked, determining the speed information of the object to be marked in the point cloud data to be marked based on the fourth position data of the object to be marked in the point cloud data to be marked and the fourth position data of the object to be marked in historical point cloud data to be marked before the point cloud data to be marked.
Here, when the multiple frames of point cloud data to be labeled include an object to be labeled whose state attribute indicates a motion state, the speed information of the object to be labeled can be determined more accurately, so as to more comprehensively determine the information of the object to be labeled in the point cloud data to be labeled.
In a possible embodiment, the method further comprises:
and determining the acceleration information of the object to be marked in each frame of point cloud data to be marked based on the speed information of the object to be marked in the plurality of frames of point cloud data to be marked.
The acceleration information of the object to be marked in each frame of point cloud data to be marked can be accurately determined based on the speed information of the object to be marked in the plurality of frames of point cloud data to be marked, so that the information corresponding to the object to be marked is relatively comprehensive.
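A sketch of the velocity and acceleration computation by finite differences over the fourth position data, assuming per-frame timestamps and world-frame positions are available (numpy; names are illustrative):

```python
import numpy as np

def velocities_and_accelerations(times, world_positions):
    """Estimate per-frame velocity and acceleration of a moving object from
    its world-frame positions (fourth position data) by finite differences.

    times: (N,) frame timestamps; world_positions: (N, 3) positions.
    The first frame reuses the first available estimate."""
    t = np.asarray(times, dtype=float)
    p = np.asarray(world_positions, dtype=float)
    vel = np.zeros_like(p)
    acc = np.zeros_like(p)
    if len(p) > 1:
        dt = (t[1:] - t[:-1])[:, None]
        vel[1:] = (p[1:] - p[:-1]) / dt       # current vs. historical position
        vel[0] = vel[1]
        acc[1:] = (vel[1:] - vel[:-1]) / dt   # change of velocity between frames
        acc[0] = acc[1]
    return vel, acc
```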
In a possible embodiment, after the determining the target annotation data of the object to be annotated in the point cloud data to be annotated, the method further includes:
training a neural network to be trained by using the multi-frame point cloud data to be labeled, which includes the target labeling data, to obtain a target neural network; and/or,
and testing the neural network to be tested by using the multi-frame point cloud data to be marked comprising the target marking data to obtain a test result of the neural network to be tested.
In the embodiment of the disclosure, because the labeling process of the target labeling data corresponding to the point cloud data to be labeled is relatively efficient, the neural network to be trained can be trained relatively efficiently by using the multi-frame point cloud data to be labeled including the target labeling data, so as to obtain the target neural network, and the training efficiency of the target neural network is improved. And/or by utilizing the multi-frame point cloud data to be marked comprising the target marking data, the neural network to be tested can be tested more efficiently, the test result of the neural network to be tested is obtained, and the test efficiency of the neural network to be tested is improved.
The following description of the effects of the apparatus, the electronic device, and the like refers to the description of the above method, and is not repeated here.
In a second aspect, the present disclosure provides a data annotation device, including:
the acquisition module is used for acquiring a point cloud data set;
the first determining module is used for taking each frame of point cloud data included in the point cloud data set as point cloud data to be marked, and, for each object to be marked in the point cloud data to be marked, determining reference point cloud data from the point cloud data including the object to be marked in the point cloud data set based on the state attribute of the object to be marked; wherein the state attribute includes: a moving state and/or a stationary state;
and the second determination module is used for determining target marking data of the object to be marked in the point cloud data to be marked based on the benchmark marking data of the object to be marked in the benchmark point cloud data.
In a third aspect, the present disclosure provides an electronic device comprising: a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor, the processor and the memory communicate with each other via the bus when the electronic device is running, and the machine-readable instructions, when executed by the processor, perform the steps of the data annotation method according to the first aspect or any one of the embodiments.
In a fourth aspect, the present disclosure provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the data annotation method according to the first aspect or any one of the embodiments.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for use in the embodiments are briefly described below. The drawings, which are incorporated in and form a part of the specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It is appreciated that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope, as those skilled in the art will be able to derive additional related drawings from them without inventive effort.
Fig. 1 is a schematic flow chart illustrating a data annotation method provided by an embodiment of the present disclosure;
fig. 2 is a schematic interface diagram of a labeling tool in a data labeling method provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram illustrating an architecture of a data annotation device provided in an embodiment of the present disclosure;
fig. 4 shows a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the disclosure, provided in the accompanying drawings, is not intended to limit the scope of the disclosure, as claimed, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
The autonomous driving technology acquires information of a road environment around an autonomous vehicle through a perception system. With the development of the automatic driving technology, the performance and the requirements of the perception system are continuously improved, and the data volume of the marked point cloud data used in the training process is increased. Generally, the category, pose, size and the like of each object in each frame of point cloud data can be labeled manually, but the data labeling method is complex in operation, large in workload and high in time consumption.
In order to alleviate the above problem, embodiments of the present disclosure provide a data annotation method and apparatus, an electronic device, and a storage medium.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
For the convenience of understanding the embodiments of the present disclosure, a detailed description will be given to a data annotation method disclosed in the embodiments of the present disclosure. The execution subject of the data annotation method provided by the embodiment of the present disclosure is generally a computer device with certain computing capability, and the computer device includes: a terminal device, or a server or other processing devices, where the terminal device may be a User Equipment (UE), a mobile device, a User terminal, or a terminal. In some possible implementations, the data annotation process can be implemented by a processor calling computer readable instructions stored in a memory.
Referring to fig. 1, a schematic flow chart of a data annotation method provided in the embodiment of the present disclosure is shown, where the method includes: S101-S103, specifically:
s101, a point cloud data set is obtained.
S102, taking each frame of point cloud data included in the point cloud data set as point cloud data to be marked, and, for each object to be marked in the point cloud data to be marked, determining reference point cloud data from the point cloud data including the object to be marked in the point cloud data set based on the state attribute of the object to be marked; wherein the state attributes include: a moving state and/or a stationary state.
S103, determining target marking data of the object to be marked in the point cloud data to be marked based on the reference marking data of the object to be marked in the reference point cloud data.
In the method, considering that the state attribute of the object to be marked can comprise a motion state and/or a static state, aiming at the object to be marked with different state attributes, based on the state attribute of the object to be marked, determining reference point cloud data from point cloud data comprising the object to be marked in a point cloud data set; furthermore, target marking data of the object to be marked in the point cloud data to be marked can be determined based on the reference marking data of the object to be marked in the reference point cloud data, automatic marking of the point cloud data to be marked is achieved, the problems of complexity and time consumption caused by manual marking are solved, and data marking efficiency is improved.
For example, the benchmark annotation data can be determined as target annotation data of the object to be annotated in the point cloud data to be annotated, so that the efficiency of data annotation is improved. Or, the reference marking data can be adjusted in response to the adjustment operation, so that target marking data of the object to be marked in the point cloud data to be marked is obtained, and the data marking efficiency is improved on the basis of ensuring the data marking accuracy.
S101 to S103 will be specifically described below.
For S101:
during implementation, in the driving process of the driving device, the radar equipment arranged on the driving device can be used for collecting road scene information around the driving device to obtain a point cloud data set. When the point cloud data set is collected, road scene information around the driving device can be collected through an image collecting sensor arranged on the driving device, and video data matched with the point cloud data set is obtained. And meanwhile, a positioning file corresponding to the point cloud data set can be obtained. For example, the traveling device may be a vehicle, a non-motor vehicle, a robot, or the like. The positioning file comprises pose information of the driving device when each frame of point cloud data in the point cloud data set is collected.
In one possible embodiment, each frame of point cloud data included in the point cloud data set is determined by the following steps:
step A1, video data matched with the point cloud data set is obtained.
And A2, respectively carrying out frame extraction processing on the point cloud data set and the video data to obtain each frame of point cloud data included in the point cloud data set and each video frame included in the video data.
And A3, determining a video frame matched with the point cloud data based on the first time stamp information corresponding to each frame of point cloud data and the second time stamp information corresponding to each video frame.
And A4, detecting each frame of point cloud data, and determining preset marking data corresponding to each object to be marked included in the point cloud data.
And step A5, using the point cloud data associated with the preset marking data and the video frame as point cloud data of each frame included in the point cloud data set.
During implementation, video data matched with the point cloud data set is obtained, and frame extraction processing is performed on the point cloud data set and the video data respectively by using a marking tool such as OpenCV, so that each frame of point cloud data included in the point cloud data set and each video frame included in the video data are obtained. Meanwhile, when the point cloud data sets and the video data are subjected to frame extraction processing, first timestamp information corresponding to each frame of point cloud data and second timestamp information corresponding to each video frame can be obtained.
Or, when the point cloud data set and the video data are acquired, a first time file corresponding to the point cloud data set and/or a second time file corresponding to the video data may be acquired. The first time file comprises a mapping relation between first acquisition time and point cloud data when the point cloud data are acquired, and then first timestamp information corresponding to each frame of point cloud data can be determined according to the first acquisition time. The second time file comprises a mapping relation between second acquisition time and video frames when the video frames are acquired, and second timestamp information corresponding to each video frame can be further determined according to the second acquisition time.
Further, the point cloud data may be aligned with the video frames based on the timestamp information, i.e., the video frames matching the point cloud data may be determined based on the first timestamp information corresponding to each frame of point cloud data and the second timestamp information corresponding to each video frame. For example, for each frame of point cloud data 1, a video frame with the minimum deviation between the second timestamp information and the first timestamp information 1 corresponding to the point cloud data 1 is determined from each video frame, and the video frame is determined as a video frame matched with the point cloud data 1.
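As an illustration of this nearest-timestamp alignment, the sketch below pairs each point cloud frame with the video frame whose timestamp deviates least from it, assuming timestamps in seconds sorted in ascending order (function and variable names are illustrative, not from the patent):

```python
from bisect import bisect_left

def match_frames(pc_timestamps, video_timestamps):
    """For each first timestamp (point cloud), return the index of the video frame
    whose second timestamp has the smallest deviation from it."""
    matches = []
    for t in pc_timestamps:
        i = bisect_left(video_timestamps, t)
        # the nearest video timestamp is one of the two neighbours of the insertion point
        candidates = [j for j in (i - 1, i) if 0 <= j < len(video_timestamps)]
        matches.append(min(candidates, key=lambda j: abs(video_timestamps[j] - t)))
    return matches

# Example: point cloud at 10 Hz, camera at roughly 30 Hz
pc_ts = [0.00, 0.10, 0.20]
video_ts = [0.00, 0.033, 0.066, 0.099, 0.132, 0.165, 0.198]
print(match_frames(pc_ts, video_ts))  # [0, 3, 6]
```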
Detecting each frame of point cloud data, and determining preset marking data corresponding to each object to be marked included in the point cloud data; for example, the trained perceptual model may be used to detect point cloud data, so as to obtain preset labeling data corresponding to each object to be labeled included in the point cloud data. The objects to be marked can include, but are not limited to, pedestrians, trucks, cars, bicycles, cone barrels, traffic lights, etc.; the preset labeling data can comprise category data, pose data, size data and the like; the pose data includes position data and pose data.
Here, the preset labeling data corresponding to the object to be labeled is data in a vehicle body center coordinate system corresponding to the point cloud data. The vehicle body center coordinate system is a three-dimensional coordinate system which is constructed by taking the center point of the running device as an origin; the world coordinate system is a three-dimensional coordinate system constructed by taking a preset place in a real scene as an origin. The positions of the running devices in the world coordinate system are different when the running devices collect different point cloud data, namely the central points of the running devices are different, so that the vehicle body central coordinate systems corresponding to different point cloud data are different.
And finally, taking the point cloud data associated with the preset labeling data and the video frame as the point cloud data of each frame included in the point cloud data set.
During implementation, storage format conversion may be performed on the preset annotation data to obtain first converted preset annotation data, where the storage format of the first converted preset annotation data is a storage format that can be identified by the annotation tool. For example, the storage format of the preset annotation data may be a rosbag storage format, which is converted into a storage format that can be recognized by the annotation tool, such as a json storage format or a text storage format. And/or, data format conversion may be performed on the preset labeling data to obtain second converted preset labeling data, where the data format of the second converted preset labeling data is a data format which can be identified by the labeling tool. For example, the data format of the preset annotation data may be: the category data of the object to be marked, the vertex coordinates of each vertex on the detection frame of the object to be marked, and the angle information of the object to be marked; this is converted into a data format which can be identified by the annotation tool, such as the category data of the object to be marked, the coordinates of the center point of the detection frame of the object to be marked, the size information of the detection frame of the object to be marked, and the angle information of the object to be marked.
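The data-format conversion mentioned above can be sketched as follows, converting a detection frame given as eight vertex coordinates into a center point / size / angle representation; it assumes numpy and an illustrative vertex ordering (bottom face first, edge 0 to 1 along the box length), not the exact format of the patent's annotation tool:

```python
import numpy as np

def vertices_to_center_size_yaw(vertices):
    """Convert an (8, 3) array of detection-frame vertices into
    (center point, [length, width, height], heading angle).

    Assumed ordering: indices 0-3 are the bottom face in order, 4-7 the top
    face, with edge 0->1 along the box length and edge 1->2 along the width."""
    v = np.asarray(vertices, dtype=float)
    center = v.mean(axis=0)
    length = np.linalg.norm(v[1, :2] - v[0, :2])
    width = np.linalg.norm(v[2, :2] - v[1, :2])
    height = v[:, 2].max() - v[:, 2].min()
    yaw = np.arctan2(v[1, 1] - v[0, 1], v[1, 0] - v[0, 0])  # heading in the x-y plane
    return center, np.array([length, width, height]), yaw
```

The category data is carried over unchanged; only the geometric part of the preset annotation data changes representation.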
In the embodiment of the disclosure, the point cloud data associated with the preset annotation data and the video frame is determined as each frame of point cloud data included in the point cloud data set, and the information associated with the point cloud data is relatively abundant and diverse, so that the target annotation data corresponding to each object to be annotated included in each frame of point cloud data can be determined relatively accurately in the following process.
In a possible implementation manner, after determining, in step A3, video frames matched with the point cloud data based on the first timestamp information corresponding to each frame of the point cloud data and the second timestamp information corresponding to each video frame, the method further includes: determining a timestamp difference value corresponding to each frame of point cloud data based on first timestamp information corresponding to each frame of point cloud data and second timestamp information corresponding to a video frame matched with each frame of point cloud data; and screening out the point cloud data with the timestamp difference value larger than the set difference threshold value from the point cloud data of each frame to obtain the screened point cloud data of each frame.
In implementation, after the video frame matched with the point cloud data is determined, for each frame of point cloud data, a timestamp difference corresponding to the frame of point cloud data may be determined based on the first timestamp information of the frame of point cloud data and the second timestamp information of the video frame matched with the frame of point cloud data. When the timestamp difference is larger than a set difference threshold value, screening out the frame point cloud data; otherwise, the frame point cloud data is reserved. And then point cloud data after screening of each frame can be obtained. The difference threshold may be set according to actual requirements, for example, the difference threshold may be 0.05 second, 0.08 second, and the like.
Illustratively, after frame extraction is performed on a point cloud data set, the resulting frames comprise point cloud data 1, point cloud data 2 and point cloud data 3; the timestamp difference corresponding to point cloud data 1 is 0.01 second, that corresponding to point cloud data 2 is 0.03 second, and that corresponding to point cloud data 3 is 0.07 second. If the difference threshold is 0.05 second, point cloud data 3 is screened out from the frames, and the screened point cloud data comprises point cloud data 1 and point cloud data 2.
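A minimal sketch of this screening step, assuming each point cloud frame carries its first timestamp and has already been paired with the second timestamp of its matched video frame (the names and the 0.05-second threshold below are illustrative):

```python
def filter_by_timestamp_gap(pc_frames, matched_video_ts, threshold=0.05):
    """Keep only point cloud frames whose matched video frame is within
    `threshold` seconds; pc_frames is a list of (timestamp, frame_data) pairs."""
    kept = []
    for (pc_ts, frame_data), vid_ts in zip(pc_frames, matched_video_ts):
        if abs(pc_ts - vid_ts) <= threshold:
            kept.append((pc_ts, frame_data))
    return kept

# Example matching the text above: gaps of 0.01 s, 0.03 s and 0.07 s with a 0.05 s threshold
frames = [(100.00, "point cloud data 1"), (100.10, "point cloud data 2"), (100.20, "point cloud data 3")]
video_ts = [100.01, 100.13, 100.27]
print(filter_by_timestamp_gap(frames, video_ts))  # point cloud data 3 is screened out
```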
And detecting each frame of screened point cloud data, and determining preset marking data corresponding to each object to be marked included in the screened point cloud data.
In consideration of the fact that the subsequent data labeling process of the point cloud data is assisted by using the video frame associated with the point cloud data, the object information included in the video frame associated with the point cloud data and the object information included in the point cloud data should be matched. Based on the method, the timestamp difference corresponding to each frame of point cloud data can be determined, and when the timestamp difference is larger than the difference threshold, the difference between the object information included in the point cloud data and the object information included in the associated video frame is determined to be large, so that the point cloud data with the timestamp difference larger than the set difference threshold can be screened out from each frame of point cloud data to obtain the screened point cloud data of each frame, and the accuracy of point cloud data labeling is improved.
In a possible implementation manner, in the step A4, detecting each frame of point cloud data, and determining preset labeling data corresponding to each object to be labeled included in the point cloud data may include:
step A41, detecting each frame of point cloud data, and generating initial labeling data corresponding to each object to be labeled, wherein the point cloud data comprises at least one of the following data: category data, pose data, and size data.
Step A42, based on the video frame matched with the point cloud data, adjusting the initial annotation data to generate preset annotation data corresponding to each object to be annotated included in the point cloud data.
During implementation, each frame of point cloud data can be input into the trained perception model for detection, and initial labeling data corresponding to each object to be labeled included in the point cloud data are generated. Wherein the initial annotation data comprises at least one of: category data, pose data, and size data; furthermore, the initial labeling data can be adjusted to generate preset labeling data corresponding to each object to be labeled included in the point cloud data, so as to obtain more accurate preset labeling data.
Specifically, the category data included in the initial annotation data may be adjusted according to the video frame matched with the point cloud data, for example, if the video frame indicates that the category of the object 1 to be annotated is a car, but the category of the object 1 to be annotated in the initial annotation data is a medium-sized car, the category data of the object 1 to be annotated in the initial annotation data may be adjusted to be a car.
Alternatively, the annotator can adjust the pose data and the size data of the object to be marked in the point cloud data based on the video frame matched with the point cloud data, and the preset marking data corresponding to each object to be marked included in the point cloud data is generated in response to the triggered adjustment operation.
Or predicting three-dimensional detection frame information of the object to be marked based on the two-dimensional detection frame information of the object to be marked in the video frame matched with the point cloud data; and then, according to the predicted three-dimensional detection frame information, adjusting the size data, the pose data and the like of the objects to be marked in the point cloud data to generate preset marking data corresponding to each object to be marked included in the point cloud data.
In consideration of the situation that the object to be marked in the point cloud data is partially shielded or the point cloud data corresponding to the object to be marked is sparse, the accuracy of the initial marking data of the object to be marked is low, for example, the size information, the position information and the like of the object to be marked may be incomplete. Therefore, in order to alleviate the above problem, the initial annotation data of the object to be annotated may be adjusted by using the video frame matched with the point cloud data, so as to obtain the preset annotation data of the object to be annotated.
In this way, each frame of point cloud data is detected to generate the initial labeling data corresponding to each object to be labeled included in the point cloud data; then, based on the video frame matched with the point cloud data, the initial labeling data is adjusted, so that the preset labeling data corresponding to each object to be labeled included in the point cloud data is generated more accurately.
In another possible embodiment, after the step a41 of generating the initial annotation data corresponding to each object to be annotated included in the point cloud data, the method further includes:
and B1, displaying the point cloud data, the video frames matched with the point cloud data and all lane lines in the target map matched with the point cloud data.
And B2, responding to the received instruction for adjusting the initial marking data, and generating preset marking data corresponding to each object to be marked included in the point cloud data based on the adjustment of the initial marking data.
During implementation, the initial marking data can be adjusted by using a target map matched with the point cloud data, and preset marking data corresponding to each object to be marked included in the point cloud data is generated.
Specifically, the marking interface can be controlled to display each lane line in the target map matched with the point cloud data, the video frame matched with the point cloud data and the point cloud data; and adjusting the initial labeling data based on the relative position information between the objects to be labeled and the lane lines in the point cloud data indicated by the labeling interface and the video frames matched with the point cloud data to generate preset labeling data corresponding to each object to be labeled included in the point cloud data. The target map may be a local map corresponding to the point cloud data determined from the entire map corresponding to the point cloud data set.
After the control labeling interface displays each lane line and point cloud data in the target map matched with the point cloud data, the labeling interface indicates relative position information between an object to be labeled and the lane line in the point cloud data, for example, the longitudinal distance between the vehicle 1 and the zebra crossing is 5 meters. Therefore, the initial marking data of the object to be marked is adjusted according to the relative position information between the object to be marked and the lane line indicated by the video frame and the relative position information between the object to be marked and the lane line in the point cloud data indicated by the marking interface, and the preset marking data corresponding to the object to be marked is generated.
The target map corresponding to the point cloud data may be determined as follows: first, determine the pose information of the driving device in the world coordinate system when the frame of point cloud data was collected; then, taking the position indicated by the pose information as the origin and a preset distance as the radius, determine the target area range; and determine the local map corresponding to the target area range in the whole map as the target map corresponding to the frame of point cloud data. The preset distance may be determined according to actual requirements; for example, the preset distance may be 80 meters, 100 meters, or the like.
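As an illustration, determining the target area range and cropping the map can be sketched as follows, assuming the whole map's lane lines are available as a set of x-y points and the pose information gives a 2D world position (the point-set map representation and the names are illustrative):

```python
import numpy as np

def crop_target_map(lane_line_points, ego_position, radius=100.0):
    """Return the lane-line points of the whole map that lie within `radius`
    metres of the driving device's position when the frame was collected.

    lane_line_points: (N, 2) x-y map points; ego_position: (2,) world position."""
    pts = np.asarray(lane_line_points, dtype=float)
    dists = np.linalg.norm(pts - np.asarray(ego_position, dtype=float), axis=1)
    return pts[dists <= radius]
```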
Referring to the labeling interface shown in fig. 2, the labeling interface displays the lane lines in the target map matched with the point cloud data, and the video frames displayed in the camera display area. Based on the relative position information between the object to be marked and the lane line in the point cloud data indicated by the marking interface and the video frame matched with the point cloud data, the pose data included in the initial marking data is adjusted to generate preset marking data corresponding to each object to be marked included in the point cloud data, namely, the vehicle and the label displayed on the marking interface shown in fig. 2.
Or, the initial annotation data may be adjusted by using a target map and a video frame matched with the point cloud data, so as to generate preset annotation data corresponding to each object to be annotated included in the point cloud data.
Firstly, based on a video frame matched with the point cloud data, adjusting the category data included in the initial annotation data and adjusting the pose data included in the initial annotation data; then, the marking interface can be controlled to display each lane line and point cloud data in the target map matched with the point cloud data; and adjusting the initial marking data based on the relative position information between the objects to be marked and the lane lines in the point cloud data indicated by the marking interface and the video frames matched with the point cloud data to generate preset marking data corresponding to each object to be marked included in the point cloud data.
After the initial marking data corresponding to each object to be marked included in the point cloud data are generated, all lane lines in a target map matched with the point cloud data, video frames matched with the point cloud data and the point cloud data can be displayed; and adjusting the initial annotation data based on the relative position information between the object to be annotated and the lane line in the point cloud data indicated by the annotation interface and the video frame matched with the point cloud data, and more accurately generating the preset annotation data corresponding to each object to be annotated included in the point cloud data.
For S102:
because the state attribute of the object to be marked comprises a static state and/or a motion state, the reference point cloud data can be determined from the point cloud data including the object to be marked in the point cloud data set according to the state attribute of the object to be marked aiming at each object to be marked in the point cloud data to be marked; for example, for an object to be labeled in a static state, a frame of point cloud data may be randomly selected from the point cloud data including the object to be labeled as reference point cloud data. For another example, for an object to be labeled in a motion state, each determined point cloud data including the object to be labeled may be determined as reference point cloud data.
The objects to be labeled in a static state can include, but are not limited to, a cone, a traffic light, a static car, etc., and the objects to be labeled in a moving state can include, but is not limited to, a running car, a walking pedestrian, etc.; and the determined reference point cloud data is data under a vehicle body center coordinate system corresponding to the point cloud data to be marked.
In one possible embodiment, determining, from point cloud data including an object to be labeled in a point cloud data set based on a state attribute of the object to be labeled, reference point cloud data includes: under the condition that the state attribute of the object to be marked indicates a motion state, determining a preset number of point cloud data which are positioned in front of the point cloud data to be marked and are adjacent to the point cloud data to be marked as reference point cloud data from the point cloud data including the object to be marked in the point cloud data set.
It is considered that the data of the motion track, the size and the like of an object to be marked in a motion state have continuity and relevance, namely the marking data of the object to be marked in the current frame point cloud data are closely related to the marking data of the object to be marked in the point cloud data before the current frame point cloud data and in the point cloud data after the current frame point cloud data.
Therefore, if the state attribute of the object to be marked indicates a motion state, a preset number of point cloud data which are positioned in front of the point cloud data to be marked and adjacent to the point cloud data to be marked can be determined as reference point cloud data from the point cloud data including the object to be marked in the point cloud data set; the preset number can be set according to actual requirements and can be 3, 5 and the like. Illustratively, the preset number is 5, the object 1 to be marked is an automobile, the point cloud data including the object 1 to be marked includes point cloud data 1 to point cloud data 10, and if the point cloud data to be marked is point cloud data 8, the point cloud data 3 to point cloud data 7 can be determined as reference point cloud data.
In specific implementation, from the point cloud data including the object to be marked in the point cloud data set, a first preset number of point cloud data which are positioned in front of the point cloud data to be marked and adjacent to the point cloud data to be marked and a second preset number of point cloud data which are positioned behind the point cloud data to be marked and adjacent to the point cloud data to be marked can also be determined as reference point cloud data; illustratively, the object 1 to be marked is an automobile, the point cloud data including the object 1 to be marked comprises point cloud data 1 to point cloud data 10, and if the point cloud data to be marked is point cloud data 5, point cloud data 3, point cloud data 4, point cloud data 6 and point cloud data 7 can be determined as reference point cloud data.
Or, a second preset number of point cloud data which are adjacent to the point cloud data to be marked and positioned behind the point cloud data to be marked can be determined as reference point cloud data from the point cloud data including the object to be marked in the point cloud data set; illustratively, the object 1 to be marked is an automobile, the point cloud data including the object 1 to be marked includes point cloud data 1 to point cloud data 10, and if the point cloud data to be marked is point cloud data 5, point cloud data 6 to point cloud data 8 can be determined as reference point cloud data.
Here, when the state attribute of the object to be marked indicates a motion state, a preset number of point cloud data which are located in front of the point cloud data to be marked and adjacent to the point cloud data to be marked are determined as reference point cloud data from the point cloud data including the object to be marked in the point cloud data set; therefore, the target marking data of the object to be marked in the point cloud data to be marked can be accurately determined on the basis of the preset number of point cloud data.
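For illustration only, the following Python sketch shows one possible way to implement this frame-selection step; the function name, the index-based representation of the frames, and the default of 5 preceding frames are assumptions made for the example rather than part of the method itself.

def select_reference_frames_for_moving_object(containing_indices, target_index,
                                              n_before=5, n_after=0):
    # containing_indices: indices of the frames that contain the object to be marked
    # target_index: index of the frame of point cloud data to be marked
    ordered = sorted(containing_indices)
    before = [i for i in ordered if i < target_index][-n_before:] if n_before else []
    after = [i for i in ordered if i > target_index][:n_after] if n_after else []
    return before + after

# Example from the text: frames 1 to 10 contain the object, the frame to be
# marked is frame 8, and the five preceding adjacent frames are selected.
print(select_reference_frames_for_moving_object(range(1, 11), 8))  # [3, 4, 5, 6, 7]

With n_before=2 and n_after=2 the same helper reproduces the second example above (frames 3, 4, 6 and 7 around frame 5).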
In one possible embodiment, determining reference point cloud data from point cloud data including an object to be marked in a point cloud data set based on a state attribute of the object to be marked includes: under the condition that the state attribute of the object to be marked indicates a static state, determining point cloud data, of which the size of an object detection frame of the object to be marked is in a set size range, from point cloud data of the object to be marked in the point cloud data set as reference point cloud data.
During implementation, if the state attribute of the object to be marked indicates a static state, because the data of the position, the posture and the like of the object to be marked in the static state under the world coordinate system are fixed, the point cloud data in which the size of the object detection frame of the object to be marked is in the set size range can be determined from the point cloud data including the object to be marked in the point cloud data set; from the point cloud data meeting the set size range, one frame of point cloud data can be selected as the reference point cloud data, or multiple frames of point cloud data can be determined as reference point cloud data.
Considering that when the distance between the object to be marked and the driving device is far, the size of the object to be marked in the point cloud data is small, and the error of the marking data of the object to be marked may be large; and when the object to be marked is close to the driving device, the information of the object to be marked in the point cloud data may be incomplete, so that the marking data of the object to be marked may be inaccurate; based on this, two-dimensional detection frame information of the object to be marked in each video frame can be determined, and then the reference point cloud data can be determined from the multi-frame point cloud data to be marked including the object to be marked according to the two-dimensional position, the two-dimensional size, the size proportion and the like indicated by the two-dimensional detection frame information in the video frame, and the point cloud data associated with the video frame.
For example, a video frame with a two-dimensional size within a set size range may be selected, and point cloud data associated with the video frame may be determined as reference point cloud data. And/or selecting a video frame with the two-dimensional position within the set area range, and determining point cloud data associated with the video frame as reference point cloud data. And/or selecting a video frame with the size proportion within the set proportion range, and determining point cloud data associated with the video frame as reference point cloud data.
Or, the reference point cloud data can be determined according to the distance between the object to be marked and the driving device indicated by the marking data. For example, point cloud data whose distance is within the set distance range is selected as reference point cloud data.
Or determining at least one frame of candidate point cloud data from the multiple frames of point cloud data according to the distance between the object to be marked and the driving device indicated by the marking data. And determining reference point cloud data from at least one frame of candidate point cloud data according to the video frame matched with each frame of candidate point cloud data respectively. Or determining at least one frame of candidate point cloud data from the multiple frames of point cloud data according to the video frame matched with each frame of point cloud data. And determining reference point cloud data from at least one frame of candidate point cloud data based on the distance between the object to be marked and the driving device indicated by the marking data.
When the point cloud data to be labeled comprises a plurality of objects to be labeled in a static state, corresponding reference point cloud data can be determined for each object to be labeled. The multiple objects to be marked can correspond to the same reference point cloud data or to different reference point cloud data.
Here, under the condition that the state attribute of the object to be marked indicates a static state, point cloud data of which the size of the object detection frame of the object to be marked is in the set size range is determined, from the point cloud data including the object to be marked in the point cloud data set, as the reference point cloud data; in this way, the reference marking data of the object to be marked in the reference point cloud data are relatively accurate.
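Purely as an illustration of the static-state case, the sketch below filters the frames containing the object by the size of its detection frame and by its distance to the driving device, and keeps one frame as reference point cloud data; the field names box_size and distance, and the concrete thresholds, are assumptions made only for the example.

def select_reference_frame_for_static_object(frames, size_range=(1.0, 50.0),
                                             distance_range=(5.0, 40.0)):
    # frames: list of dicts describing the object in each frame that contains it
    candidates = [f for f in frames
                  if size_range[0] <= f["box_size"] <= size_range[1]
                  and distance_range[0] <= f["distance"] <= distance_range[1]]
    # one frame (or several frames) of the candidates may be kept as reference data
    return candidates[:1]

frames = [{"frame": 3, "box_size": 0.8, "distance": 60.0},   # too small and too far
          {"frame": 4, "box_size": 7.5, "distance": 20.0},   # within both ranges
          {"frame": 5, "box_size": 8.0, "distance": 3.0}]    # too close, likely truncated
print(select_reference_frame_for_static_object(frames))      # keeps frame 4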
For S103:
after the benchmark marking data of the object to be marked in the benchmark point cloud data is obtained, the target marking data of the object to be marked in the point cloud data to be marked can be determined based on the benchmark marking data. For example, when the reference annotation data includes size data, the average size may be obtained by averaging or weighted averaging the size data indicated by the reference annotation data of the object to be annotated in each frame of reference point cloud data. And then taking the average size as size data of the object to be marked in target marking data in the point cloud data to be marked.
For another example, when the reference annotation data includes position data, the position data indicated by the reference annotation data of the object to be annotated in each frame of reference point cloud data may be converted into a world coordinate system, so as to obtain converted position data of the object to be annotated in each frame of reference point cloud data; predicting the predicted position of the object to be marked in the point cloud data to be marked according to the converted position data of the object to be marked in each frame of reference point cloud data; and taking the predicted position as position data of the object to be marked in target marking data in the point cloud data to be marked.
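As a small illustration of the averaging of the size data, assuming the per-frame length/width/height values are held in numpy arrays, the following sketch computes either a plain or a weighted average size; the weighting scheme is an assumption for the example.

import numpy as np

def average_size(reference_sizes, weights=None):
    # reference_sizes: (N, 3) array of length/width/height per reference frame
    sizes = np.asarray(reference_sizes, dtype=float)
    if weights is None:
        return sizes.mean(axis=0)
    w = np.asarray(weights, dtype=float)
    return (sizes * w[:, None]).sum(axis=0) / w.sum()

sizes = [[4.5, 1.8, 1.5], [4.6, 1.8, 1.5], [4.4, 1.9, 1.4]]
print(average_size(sizes))                     # plain average
print(average_size(sizes, weights=[1, 2, 3]))  # later reference frames weighted higher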
Or the reference marking data can be adjusted based on the video frame matched with the point cloud data to be marked to obtain adjusted reference marking data; and determining the adjusted reference marking data as target marking data of the object to be marked in the point cloud data to be marked.
Because the object to be marked in the point cloud data to be marked may be occluded or in a motion state, the determined target marking data of the object to be marked in the point cloud data to be marked may have an error. Therefore, in order to reduce the error, the target labeling data of the object to be labeled in the point cloud data to be labeled can be smoothed.
In one possible embodiment, in a case that the state attribute of the object to be labeled indicates a static state, determining target labeling data of the object to be labeled in the point cloud data to be labeled based on the reference labeling data of the object to be labeled in the point cloud data to be labeled, includes:
step C1, converting the reference marking data of the object to be marked in the reference point cloud data into a world coordinate system based on the calibration parameters of the acquisition equipment when the reference point cloud data is acquired, and obtaining first converted marking data corresponding to the object to be marked; the reference marking data are positioned under a vehicle body center coordinate system when the reference point cloud data are collected.
And step C2, converting the first converted marking data corresponding to the object to be marked to a vehicle body center coordinate system when the point cloud data to be marked is collected based on the calibration parameters of the collection equipment when the point cloud data to be marked is collected, and obtaining target marking data of the object to be marked in the point cloud data to be marked.
In implementation, calibration parameters of the acquisition device when acquiring the reference point cloud data may be determined first, for example, the calibration parameters may be determined from the generated calibration file. The calibration file comprises calibration parameters of acquisition equipment when each frame of point cloud data is acquired; the calibration parameters can be used for representing the conversion relation between the world coordinate system and the vehicle body center coordinate system corresponding to the point cloud data. The calibration parameters of the acquisition equipment when each frame of point cloud data is acquired can be generated according to the pose data of the running device when each frame of point cloud data is acquired.
Then, based on the calibration parameters of the acquisition equipment when the reference point cloud data is acquired, the reference marking data of the object to be marked in the reference point cloud data can be converted into a world coordinate system to obtain first converted marking data corresponding to the object to be marked; the reference annotation data can be the preset annotation data, and the reference annotation data is located under the vehicle body center coordinate system when the reference point cloud data is collected. Then, based on the calibration parameters of the acquisition equipment when the point cloud data to be marked is acquired, the first converted marking data corresponding to the object to be marked is converted to the vehicle body center coordinate system when the point cloud data to be marked is acquired, so as to obtain target marking data of the object to be marked in the point cloud data to be marked.
Under the condition that the state attribute of the object to be marked indicates a static state, the data of the position, the posture and the like of the object to be marked in the world coordinate system are fixed, so the reference marking data of the object to be marked in the reference point cloud data can be converted into the world coordinate system based on the calibration parameters of the acquisition equipment when the reference point cloud data is acquired, and the first converted marking data corresponding to the object to be marked is obtained; then, based on the calibration parameters of the acquisition equipment when the point cloud data to be marked is acquired, the first converted marking data corresponding to the object to be marked is converted to the vehicle body center coordinate system when the point cloud data to be marked is acquired, so that the target marking data of the object to be marked in each frame of point cloud data to be marked can be quickly obtained, and the marking efficiency of the object to be marked in a static state is improved.
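A minimal sketch of steps C1 and C2 for the position data follows, assuming the calibration parameters take the form of 4x4 body-centre-to-world homogeneous transforms; that representation is an assumption for the example, since the concrete form of the calibration file is not specified here.

import numpy as np

def transfer_static_annotation(pos_ref_body, T_ref_body_to_world, T_tgt_body_to_world):
    # pos_ref_body: (3,) position in the body-centre frame of the reference point cloud data
    p = np.append(np.asarray(pos_ref_body, dtype=float), 1.0)
    p_world = T_ref_body_to_world @ p                         # step C1: into the world coordinate system
    p_target = np.linalg.inv(T_tgt_body_to_world) @ p_world   # step C2: into the target body-centre frame
    return p_target[:3]

# Toy calibrations: the vehicle moves 2 m forward between the two frames, so a
# static object appears 2 m further back in the later body-centre frame.
T_ref = np.eye(4)
T_tgt = np.eye(4)
T_tgt[0, 3] = 2.0
print(transfer_static_annotation([10.0, 0.0, 0.0], T_ref, T_tgt))  # [8. 0. 0.]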
In a possible implementation manner, when the state attribute of the object to be marked indicates a motion state and the reference point cloud data is a preset number, determining target marking data of the object to be marked in the point cloud data to be marked based on the reference marking data of the object to be marked in the reference point cloud data includes:
and D1, converting the reference marking data of the object to be marked included in the reference point cloud data to a world coordinate system based on the calibration parameters of the acquisition equipment when the reference point cloud data is acquired, and obtaining second converted marking data corresponding to the object to be marked included in the reference point cloud data.
And D2, determining the prediction marking data of the object to be marked in the point cloud data to be marked in the world coordinate system based on the marking data corresponding to the object to be marked in the preset number of reference point cloud data after second conversion.
And D3, converting the predicted marking data to a vehicle body center coordinate system corresponding to the collected point cloud data to be marked by using the calibration parameters of the collection equipment when the point cloud data to be marked is collected, so as to obtain target marking data of the object to be marked in the point cloud data to be marked.
In step D1, for each frame of reference point cloud data, based on calibration parameters of the acquisition device when the frame of reference point cloud data is acquired, reference annotation data of an object to be annotated included in the frame of reference point cloud data is converted into a world coordinate system, so as to obtain second converted annotation data corresponding to the object to be annotated included in the frame of reference point cloud data. For example, the position data in the reference annotation data may be converted into a world coordinate system to obtain second converted position data, that is, the second converted annotation data includes the second converted position data. And the reference marking data are target marking data and are positioned under a vehicle body center coordinate system corresponding to the frame of reference point cloud data.
Illustratively, if the point cloud data set includes point cloud data 1 to point cloud data 30, the point cloud data to be annotated is point cloud data 8, and the reference point cloud data includes point cloud data 5 to point cloud data 7, then for the point cloud data 5, the reference annotation data of the object to be annotated included in the point cloud data 5 may be converted to the world coordinate system based on the calibration parameters of the acquisition device when the point cloud data 5 is acquired, so as to obtain second converted annotation data corresponding to the object to be annotated included in the point cloud data 5; based on this, second converted annotation data corresponding to the object to be annotated in the point cloud data 5 to the point cloud data 7 can be obtained respectively.
In step D2, the predicted labeling data of the object to be labeled in the point cloud data to be labeled in the world coordinate system may be determined based on the second converted labeling data corresponding to the object to be labeled in the preset number of pieces of reference point cloud data. For example, the instantaneous speed of the object to be marked is determined according to the second converted position data corresponding to the object to be marked in the preset number of reference point cloud data and the time interval between the adjacent frame of reference point cloud data; and then, according to the instantaneous speed, predicting to obtain the predicted position of the object to be marked in the point cloud data to be marked in the world coordinate system. The prediction annotation data comprises the predicted position.
Illustratively, if the point cloud data set includes point cloud data 1 to point cloud data 30, the point cloud data to be labeled is point cloud data 8, and the reference point cloud data includes point cloud data 5 to point cloud data 7, a position difference value 1 between the second converted position data of the object to be labeled in the point cloud data 5 and the second converted position data of the object to be labeled in the point cloud data 6, and a position difference value 2 between the second converted position data of the object to be labeled in the point cloud data 6 and the second converted position data of the object to be labeled in the point cloud data 7 may be determined. The predicted position of the object to be marked in the point cloud data 8 to be marked in the world coordinate system is then predicted by using the position difference value 1 and the position difference value 2.
When the predicted marking data includes size data, the size data respectively indicated by the second converted marking data corresponding to the object to be marked in the preset number of reference point cloud data may be averaged or weighted averaged, so as to determine the predicted size data of the object to be marked in the point cloud data to be marked in the world coordinate system.
In the step D3, since the predicted labeling data is data in a world coordinate system, the predicted labeling data is converted to a vehicle body central coordinate system corresponding to the collected point cloud data to be labeled by using calibration parameters of a collection device when the point cloud data to be labeled is collected, so as to obtain target labeling data of the object to be labeled in the point cloud data to be labeled.
It is considered that the position data of the object in motion state in the continuous frame point cloud data has continuity. Therefore, the predicted marking data of the object to be marked in the point cloud data to be marked in the world coordinate system can be determined by using the second converted marking data corresponding to the object to be marked which is included in at least one frame of reference point cloud data; and then, by utilizing calibration parameters of acquisition equipment when the point cloud data to be labeled is acquired, the predicted labeling data is converted into a vehicle body center coordinate system corresponding to the acquired point cloud data to be labeled, and the target labeling data of the object to be labeled in the point cloud data to be labeled can be obtained quickly and accurately.
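The following sketch strings steps D1 to D3 together for the position data, again modelling each calibration parameter as a 4x4 body-centre-to-world transform and extrapolating with the mean frame-to-frame displacement; the constant-velocity extrapolation and the matrix representation are assumptions made only for the example.

import numpy as np

def to_world(pos_body, T_body_to_world):
    p = np.append(np.asarray(pos_body, dtype=float), 1.0)
    return (T_body_to_world @ p)[:3]

def predict_target_position(ref_positions_body, ref_calibs, target_calib):
    # Step D1: reference positions into the world coordinate system
    world = np.stack([to_world(p, T) for p, T in zip(ref_positions_body, ref_calibs)])
    # Step D2: average frame-to-frame displacement, extrapolated one step forward
    predicted_world = world[-1] + np.diff(world, axis=0).mean(axis=0)
    # Step D3: back into the body-centre frame of the point cloud data to be marked
    p = np.append(predicted_world, 1.0)
    return (np.linalg.inv(target_calib) @ p)[:3]

# Toy data: the object advances 1 m per frame along x, identical calibrations.
calib = np.eye(4)
refs = [[5.0, 0.0, 0.0], [6.0, 0.0, 0.0], [7.0, 0.0, 0.0]]
print(predict_target_position(refs, [calib] * 3, calib))  # [8. 0. 0.]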
In one possible embodiment, after determining target annotation data of an object to be annotated in point cloud data to be annotated, the method further includes:
Step E1, determining track information corresponding to each object to be marked in the point cloud data set; the track information comprises position data of the object to be marked on each frame of point cloud data to be marked.
And E2, converting each position data in the track information corresponding to the object to be marked into a world coordinate system aiming at each object to be marked to obtain each first position data of the object to be marked.
And E3, smoothing each first position data to obtain each second position data of the object to be marked.
And E4, converting each second position data into a vehicle body center coordinate system of the point cloud data to be marked in the frame where the object to be marked is located, and obtaining third position data of the object to be marked in the point cloud data to be marked in the frame where the object to be marked is located.
And E5, updating the target marking data of the object to be marked in the point cloud data to be marked of the frame where the object to be marked is located by using the third position data to obtain updated target marking data corresponding to each frame of point cloud data to be marked.
During implementation, track information corresponding to each object to be marked included in the point cloud data set can be determined. The object to be marked comprises an object to be marked in a static state and/or an object to be marked in a motion state; the track information comprises position data of the object to be marked on each frame of point cloud data to be marked.
For example, for each object to be marked, determining point cloud data one to be marked of each frame including the object to be marked; and determining the track information of the object to be marked according to the position data (the position data is data in a corresponding vehicle body center coordinate system) indicated by the target marking data of the object to be marked in each frame of point cloud data I to be marked.
And for each object to be marked, converting each position data in the track information corresponding to the object to be marked into a world coordinate system by using the corresponding calibration parameter to obtain each first position data of the object to be marked. And smoothing each first position data to obtain each second position data of the object to be marked.
For example, a neighborhood value-taking algorithm may be used to perform smoothing processing on each first position data at least once, the number of times of smoothing processing and a first neighborhood value used in each smoothing processing may be determined according to actual requirements, for example, smoothing processing may be performed 2 times, the first neighborhood value used in the first smoothing processing may be 7, and the first neighborhood value used in the second smoothing processing may be 3.
Illustratively, the track information corresponding to the pedestrian 1 included in the point cloud data set is determined, and the track information includes the position data of the pedestrian 1 in the point cloud data 10 to be marked to the point cloud data 20 to be marked. Based on the calibration parameters respectively corresponding to the point cloud data 10 to be marked to the point cloud data 20 to be marked, the position data of the pedestrian 1 in the point cloud data 10 to be marked to the point cloud data 20 to be marked are respectively converted into the world coordinate system to obtain the first position data 10 to the first position data 20 of the pedestrian 1.
Smoothing is performed on the first position data 10 to the first position data 20, respectively, to obtain each second position data of the pedestrian 1. For example, 2 smoothing processes are performed on the first position data 15, and the first neighborhood value used in the 1st smoothing process may be 7. The process of the 1st smoothing process comprises the following steps: averaging the first position data 15, the 3 first position data before the first position data 15, and the 3 first position data after the first position data 15, that is, averaging the first position data 12 to the first position data 18, to obtain intermediate position data corresponding to the point cloud data 15 to be marked. The first neighborhood value used for the 2nd smoothing may be 3. The process of the 2nd smoothing process includes: averaging the intermediate position data corresponding to the point cloud data 15 to be marked, the first position data 14 and the first position data 16 to obtain second position data 15 of the pedestrian 1.
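The two-pass neighbourhood smoothing described above can be sketched as follows; shrinking the window near the ends of the track is an assumption, since the text does not state how the boundary frames are handled.

import numpy as np

def neighbourhood_smooth(positions, window):
    # positions: (N, 3) first position data of one object in the world coordinate system
    positions = np.asarray(positions, dtype=float)
    half = window // 2
    out = np.empty_like(positions)
    for i in range(len(positions)):
        lo, hi = max(0, i - half), min(len(positions), i + half + 1)
        out[i] = positions[lo:hi].mean(axis=0)
    return out

track = np.asarray([[i + 0.05 * (-1) ** i, 0.0, 0.0] for i in range(11)])  # jittery track
second_position_data = neighbourhood_smooth(neighbourhood_smooth(track, 7), 3)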
And then converting each second position data into a vehicle body center coordinate system of the point cloud data to be marked where the pedestrian 1 is located to obtain third position data of the pedestrian 1 in the frame point cloud data to be marked where the pedestrian is located. And then, updating the target marking data of the pedestrian 1 in the to-be-marked point cloud data of the frame where the pedestrian is located by using the third position data to obtain updated target marking data corresponding to each frame of the to-be-marked point cloud data. Namely, the position data after the smoothing processing is used for replacing the position data in the target labeling data, and the updated target labeling data is obtained.
And converting each position data in the track information corresponding to the object to be marked into a world coordinate system to obtain each first position data of the object to be marked. Considering that the first position data of the object to be labeled have relevance, in order to improve the accuracy of the position data and reduce burrs, smoothing each first position data to obtain second position data of the object to be labeled; and then converting each second position data to a vehicle body center coordinate system of the to-be-marked point cloud data of the frame where the to-be-marked object is located, and obtaining third position data of the to-be-marked object after smoothing processing in the to-be-marked point cloud data of the frame where the to-be-marked object is located. Furthermore, the third position data is utilized to update the target marking data of the object to be marked in the point cloud data to be marked of the frame where the object to be marked is located, so that the updated target marking data corresponding to each frame of point cloud data to be marked can be obtained more accurately.
Considering the speed information of the object to be marked, the speed information of the object to be marked in a static state in the point cloud data to be marked can be directly determined to be 0, and the speed information of the object to be marked in a moving state needs to be further determined. Therefore, in the case that the multiple frames of point cloud data to be labeled include an object to be labeled whose state attribute indicates a motion state, the method further includes:
step F1, converting target marking data of an object to be marked in a moving state in the frame point cloud data to be marked of the object to be marked in the moving state to a world coordinate system to obtain fourth position data corresponding to the object to be marked in the multiple frames of point cloud data to be marked;
and F2, aiming at each frame of point cloud data to be marked in the multi-frame point cloud data to be marked, and determining the speed information of the object to be marked in the point cloud data to be marked based on the fourth position data of the object to be marked in the point cloud data to be marked and the fourth position data of the object to be marked in the historical point cloud data to be marked positioned in front of the point cloud data to be marked.
During implementation, for an object to be marked in a moving state, target marking data of the object to be marked in the frame point cloud data to be marked where the object to be marked is located can be converted into a world coordinate system by using corresponding calibration parameters, and fourth position data corresponding to the object to be marked in multiple frames of point cloud data to be marked is obtained.
For each frame of point cloud data to be marked in the multiple frames of point cloud data to be marked, determining the position deviation of an object to be marked based on fourth position data of the object to be marked in the point cloud data to be marked and fourth position data of the object to be marked in historical point cloud data to be marked positioned in front of the point cloud data to be marked; and determining the speed information of the object to be marked in the point cloud data to be marked according to the position deviation and the time deviation between the point cloud data to be marked and the historical point cloud data to be marked.
The historical point cloud data to be annotated can be selected as required, for example, the first frame of point cloud data to be annotated before the point cloud data to be annotated can be selected as the historical point cloud data to be annotated, and the tenth frame of point cloud data to be annotated before the point cloud data to be annotated can also be selected as the historical point cloud data to be annotated.
For example, under the condition that multiple frames of point cloud data to be labeled include the automobile 1 and the automobile 1 is in a motion state, if the point cloud data 5 to be labeled to the point cloud data 20 to be labeled include the automobile 1, the position data included in the target labeling data of the automobile 1 in the point cloud data 5 to be labeled to the point cloud data 20 to be labeled can be converted into the world coordinate system, so as to obtain fourth position data of the automobile 1 in the point cloud data 5 to be labeled to the point cloud data 20 to be labeled.
For the point cloud data to be labeled 10, a position deviation between the fourth position data 10 and the fourth position data 9 is determined based on the fourth position data 10 of the automobile 1 in the point cloud data to be labeled 10 and the fourth position data 9 of the automobile 1 in the point cloud data to be labeled 9 (historical point cloud data to be labeled). And the time deviation between two adjacent frames of point cloud data can be obtained according to the frame rate of the point cloud data. And obtaining the speed information of the automobile 1 in the point cloud data 10 to be marked based on the position deviation and the time deviation.
During implementation, after the speed information of the object to be marked in the point cloud data to be marked is obtained, smoothing can be performed on each speed information of the object to be marked to obtain smoothed speed information of the object to be marked in each frame of point cloud data to be marked. The number of times of performing the smoothing processing on the speed information and the second neighborhood value used by each smoothing processing may be determined according to actual requirements, for example, the number of times of performing the smoothing processing may be 1 or the like, and the second neighborhood value may be 9 or the like.
For example, after the speed information of the automobile 1 in the point cloud data 5 to 20 to be marked is obtained, when the speed information 15 of the automobile 1 in the point cloud data 15 to be marked is smoothed, the second neighborhood value used may be 9, that is, the speed information 11 to the speed information 19 of the automobile 1 in the point cloud data 11 to be marked to the point cloud data 19 to be marked are averaged to obtain the smoothed speed information of the automobile 1 in the point cloud data 15 to be marked.
Here, when the multiple frames of point cloud data to be labeled include an object to be labeled whose state attribute indicates a motion state, the speed information of the object to be labeled can be determined more accurately, so as to more comprehensively determine the information of the object to be labeled in the point cloud data to be labeled.
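A small sketch of steps F1 and F2 plus the speed smoothing follows, assuming (for the example only) a 10 Hz point cloud frame rate and the immediately preceding frame as the historical frame to be compared against.

import numpy as np

def per_frame_speed(world_positions, frame_rate=10.0, lag=1):
    # world_positions: (N, 3) fourth position data of one moving object, in frame order
    pos = np.asarray(world_positions, dtype=float)
    dt = lag / frame_rate
    speeds = np.zeros(len(pos))
    speeds[lag:] = np.linalg.norm(pos[lag:] - pos[:-lag], axis=1) / dt
    speeds[:lag] = speeds[lag]          # no history for the first frame(s)
    return speeds

def smooth_speed(speeds, window=9):
    half = window // 2
    return np.array([speeds[max(0, i - half): i + half + 1].mean()
                     for i in range(len(speeds))])

positions = [[0.1 * i, 0.0, 0.0] for i in range(16)]          # roughly 1 m/s along x
print(smooth_speed(per_frame_speed(positions)))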
Considering that the object in the motion state still has acceleration information, the acceleration information of the object to be marked in each frame of point cloud data to be marked can be determined based on the speed information of the object to be marked in the frames of point cloud data to be marked.
During implementation, after the speed information of the object to be marked in the point cloud data to be marked is obtained, for each frame of point cloud data to be marked in multiple frames of point cloud data to be marked, the acceleration information of the object to be marked in the frame of point cloud data to be marked is determined based on the speed information of the object to be marked in the frame of point cloud data to be marked and the speed information of the object to be marked in the historical point cloud data to be marked positioned in front of the frame of point cloud data to be marked. For example, the speed information of the same object to be marked in two frames of point cloud data can be subtracted to obtain a speed difference value; and obtaining acceleration information according to the speed difference and the time deviation between the two frames of point cloud data.
The acceleration information of the object to be marked in each frame of point cloud data to be marked can be accurately determined based on the speed information of the object to be marked in the plurality of frames of point cloud data to be marked, so that the information corresponding to the object to be marked is relatively comprehensive.
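The acceleration follows in the same way from the speed difference between adjacent frames, again with an assumed 10 Hz frame rate for the example.

import numpy as np

def per_frame_acceleration(speeds, frame_rate=10.0, lag=1):
    speeds = np.asarray(speeds, dtype=float)
    dt = lag / frame_rate
    acc = np.zeros(len(speeds))
    acc[lag:] = (speeds[lag:] - speeds[:-lag]) / dt
    return acc

print(per_frame_acceleration([1.0, 1.2, 1.4, 1.6]))  # roughly [0, 2, 2, 2] m/s^2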
In one possible implementation, after determining target annotation data of an object to be annotated in point cloud data to be annotated, the method further includes: training a neural network to be trained by using multi-frame point cloud data to be marked, which comprises target marking data, to obtain a target neural network; and/or testing the neural network to be tested by using the multi-frame point cloud data to be marked comprising the target marking data to obtain the test result of the neural network to be tested.
In implementation, after the target labeling data corresponding to the point cloud data to be labeled is determined, the multi-frame point cloud data to be labeled including the target labeling data can be input to the neural network to be trained, and the neural network to be trained is trained to obtain the target neural network.
Or, inputting multi-frame point cloud data to be marked, including target marking data, to the neural network to be tested, and testing the neural network to be tested to obtain a test result of the neural network to be tested.
Or, training a neural network to be trained by using multi-frame point cloud data to be labeled including target labeling data to obtain a target neural network; and testing the target neural network by using at least part of point cloud data in the multi-frame point cloud data to be marked comprising the target marking data to obtain a test result of the target neural network.
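As a hedged illustration of the training and testing use of the labeled frames, the dataset wrapper, network interface and loss below are placeholders standing in for whichever detection model is actually trained; none of them is specified by the method.

import torch
from torch.utils.data import DataLoader, Dataset

class LabeledPointCloudDataset(Dataset):
    # frames: list of point cloud tensors; labels: matching target annotation tensors
    def __init__(self, frames, labels):
        self.frames, self.labels = frames, labels
    def __len__(self):
        return len(self.frames)
    def __getitem__(self, idx):
        return self.frames[idx], self.labels[idx]

def train(model, dataset, epochs=10, lr=1e-3):
    loader = DataLoader(dataset, batch_size=4, shuffle=True)
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.SmoothL1Loss()           # placeholder box-regression loss
    for _ in range(epochs):
        for points, target in loader:
            optimiser.zero_grad()
            loss = loss_fn(model(points), target)
            loss.backward()
            optimiser.step()
    return model

def test(model, dataset):
    loader = DataLoader(dataset, batch_size=4)
    with torch.no_grad():
        losses = [torch.nn.functional.smooth_l1_loss(model(p), t) for p, t in loader]
    return torch.stack(losses).mean().item()    # simple placeholder test metric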
In the embodiment of the disclosure, since the labeling process of the target labeling data corresponding to the point cloud data to be labeled is relatively efficient, the neural network to be trained can be trained more efficiently by using the multi-frame point cloud data to be labeled including the target labeling data, so as to obtain the target neural network, and the training efficiency of the target neural network is improved. And/or by utilizing the multi-frame point cloud data to be marked comprising the target marking data, the neural network to be tested can be tested more efficiently, the test result of the neural network to be tested is obtained, and the test efficiency of the neural network to be tested is improved.
It will be understood by those skilled in the art that in the method of the present invention, the order of writing the steps does not imply a strict order of execution and any limitations on the implementation, and the specific order of execution of the steps should be determined by their function and possible inherent logic.
Based on the same concept, an embodiment of the present disclosure further provides a data annotation device, as shown in fig. 3, which is an architecture schematic diagram of the data annotation device provided in the embodiment of the present disclosure, and includes an obtaining module 301, a first determining module 302, and a second determining module 303, specifically:
an obtaining module 301, configured to obtain a point cloud data set;
a first determining module 302, configured to use each frame of point cloud data included in the point cloud data set as point cloud data to be annotated, and determine, for each object to be annotated in the point cloud data to be annotated, reference point cloud data from the point cloud data including the object to be annotated in the point cloud data set based on a state attribute of the object to be annotated; wherein the state attribute comprises: a moving state and/or a stationary state;
a second determining module 303, configured to determine, based on the benchmark annotation data of the object to be annotated in the benchmark point cloud data, target annotation data of the object to be annotated in the point cloud data to be annotated.
In a possible embodiment, the apparatus further comprises: a third determining module 304, wherein the third determining module 304 is configured to determine each frame of point cloud data included in the point cloud data set according to the following steps:
acquiring video data matched with the point cloud data set;
respectively carrying out frame extraction processing on the point cloud data set and the video data to obtain point cloud data of each frame included in the point cloud data set and video frames included in the video data;
determining a video frame matched with the point cloud data based on first timestamp information corresponding to each frame of point cloud data and second timestamp information corresponding to each video frame;
detecting each frame of point cloud data, and determining preset marking data corresponding to each object to be marked included in the point cloud data;
and taking the point cloud data associated with the preset marking data and the video frame as point cloud data of each frame included in the point cloud data set.
In a possible implementation, the third determining module 304, after determining the video frames matching the point cloud data based on the first timestamp information corresponding to the point cloud data of each frame and the second timestamp information corresponding to each video frame, is further configured to:
determining a timestamp difference value corresponding to each frame of point cloud data based on first timestamp information corresponding to each frame of point cloud data and second timestamp information corresponding to the video frame matched with each frame of point cloud data;
screening out the point cloud data with the timestamp difference value larger than the set difference threshold value from the point cloud data of each frame to obtain the screened point cloud data of each frame;
the third determining module 304, when detecting each frame of point cloud data and determining preset labeling data corresponding to each object to be labeled included in the point cloud data, is configured to:
and detecting each frame of screened point cloud data, and determining preset marking data corresponding to each object to be marked included in each frame of screened point cloud data.
In a possible implementation manner, the third determining module 304, when detecting each frame of point cloud data and determining preset labeling data corresponding to each object to be labeled included in the point cloud data, is configured to:
detecting each frame of point cloud data, and generating initial labeling data corresponding to each object to be labeled, wherein the point cloud data comprises at least one of the following data: category data, pose data, and size data;
and adjusting the initial labeling data based on the video frame matched with the point cloud data to generate preset labeling data corresponding to each object to be labeled included in the point cloud data.
In a possible implementation manner, after generating the initial annotation data corresponding to each object to be annotated included in the point cloud data, the third determining module 304 is further configured to:
displaying the point cloud data, video frames matched with the point cloud data and each lane line in a target map matched with the point cloud data;
and responding to the received instruction of adjusting the initial labeling data, and generating preset labeling data corresponding to each object to be labeled included in the point cloud data based on the adjustment of the initial labeling data.
In one possible implementation, the first determining module 302, when determining, based on the state attribute of the object to be labeled, reference point cloud data from point cloud data including the object to be labeled in the point cloud data set, is configured to:
and under the condition that the state attribute of the object to be marked indicates a motion state, determining a preset number of point cloud data which are positioned in front of the point cloud data to be marked and adjacent to the point cloud data to be marked as reference point cloud data from the point cloud data including the object to be marked in the point cloud data set.
In a possible implementation manner, the first determining module 302, when determining reference point cloud data from the point cloud data including the object to be marked in the point cloud data set based on the state attribute of the object to be marked, is configured to:
and under the condition that the state attribute of the object to be marked indicates a static state, determining point cloud data, of which the size of an object detection frame of the object to be marked is in a set size range, from the point cloud data including the object to be marked in the point cloud data set, as the reference point cloud data.
In a possible implementation manner, in a case that the state attribute of the object to be annotated indicates a static state, the second determining module 303, when determining target annotation data of the object to be annotated in the point cloud data to be annotated based on benchmark annotation data of the object to be annotated in the point cloud data to be annotated, is configured to:
converting the reference marking data of the object to be marked in the reference point cloud data into a world coordinate system based on the calibration parameters of the acquisition equipment when the reference point cloud data is acquired, and obtaining first converted marking data corresponding to the object to be marked; the datum marking data are positioned under a vehicle body center coordinate system when the datum point cloud data are collected;
and converting the first converted marking data corresponding to the object to be marked to a vehicle body central coordinate system when the point cloud data to be marked is collected based on the calibration parameters of the collection equipment when the point cloud data to be marked is collected, so as to obtain target marking data of the object to be marked in the point cloud data to be marked.
In a possible implementation manner, in a case that the state attribute of the object to be labeled indicates a motion state and the reference point cloud data is a preset number, the second determining module 303, when determining target labeling data of the object to be labeled in the point cloud data to be labeled based on the reference labeling data of the object to be labeled in the reference point cloud data, is configured to:
for each frame of the reference point cloud data, converting the reference marking data of the object to be marked included in the reference point cloud data into a world coordinate system based on the calibration parameters of the acquisition equipment when the reference point cloud data is acquired, and obtaining second converted marking data corresponding to the object to be marked included in the reference point cloud data;
determining the predicted marking data of the object to be marked in the point cloud data to be marked in the world coordinate system based on the second converted marking data corresponding to the object to be marked in a preset number of pieces of reference point cloud data;
and converting the predicted marking data to a vehicle body center coordinate system corresponding to the collected point cloud data to be marked by using the calibration parameters of the collecting equipment when the point cloud data to be marked is collected to obtain target marking data of the object to be marked in the point cloud data to be marked.
In a possible embodiment, the apparatus further comprises: an updating module 305, wherein after the determining of the target annotation data of the object to be annotated in the point cloud data to be annotated, the updating module 305 is configured to:
determining track information corresponding to each object to be marked included in the point cloud data set; the track information comprises position data of the object to be marked on each frame of point cloud data to be marked;
for each object to be marked, converting each position data in the track information corresponding to the object to be marked into a world coordinate system to obtain each first position data of the object to be marked;
smoothing each first position data to obtain each second position data of the object to be marked;
converting each second position data into a vehicle body center coordinate system of the point cloud data to be marked in the frame where the object to be marked is located to obtain third position data of the object to be marked in the point cloud data to be marked in the frame where the object to be marked is located;
and updating the target marking data of the object to be marked in the point cloud data to be marked of the frame where the object to be marked is located by using the third position data to obtain updated target marking data corresponding to each frame of point cloud data to be marked.
In a possible embodiment, the apparatus further comprises: a fourth determining module 306, configured to, in a case that the multiple frames of point cloud data to be labeled include an object to be labeled whose state attribute indicates a motion state, the fourth determining module 306 is configured to:
converting target marking data of the moving object to be marked in the frame point cloud data to be marked of the moving object to be marked into a world coordinate system to obtain fourth position data corresponding to the object to be marked in the plurality of frames of point cloud data to be marked;
and aiming at each frame of point cloud data to be marked in the multiple frames of point cloud data to be marked, determining the speed information of the object to be marked in the point cloud data to be marked based on the fourth position data of the object to be marked in the point cloud data to be marked and the fourth position data of the object to be marked in the historical point cloud data to be marked before the point cloud data to be marked.
In a possible embodiment, the apparatus further comprises: a fifth determining module 307, wherein the fifth determining module 307 is configured to:
and determining the acceleration information of the object to be marked in each frame of point cloud data to be marked based on the speed information of the object to be marked in the plurality of frames of point cloud data to be marked.
In a possible embodiment, the apparatus further comprises: a training module 308 and/or a testing module 309.
The training module 308 is configured to train a neural network to be trained by using multiple frames of point cloud data to be labeled including the target labeling data, so as to obtain a target neural network;
the testing module 309 is configured to test the neural network to be tested by using the multiple frames of point cloud data to be labeled including the target labeling data, so as to obtain a test result of the neural network to be tested.
In some embodiments, the functions of the apparatus provided in the embodiments of the present disclosure, or the modules included therein, may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments; for brevity, no further description is provided here.
Based on the same technical concept, the embodiment of the disclosure also provides an electronic device. Referring to fig. 4, a schematic structural diagram of an electronic device 400 provided in the embodiment of the present disclosure includes a processor 401, a memory 402, and a bus 403. The memory 402 is used for storing execution instructions and includes a memory 4021 and an external memory 4022; the memory 4021 is also referred to as an internal memory, and is configured to temporarily store operation data in the processor 401 and data exchanged with the external memory 4022 such as a hard disk, the processor 401 exchanges data with the external memory 4022 through the memory 4021, and when the electronic device 400 operates, the processor 401 communicates with the memory 402 through the bus 403, so that the processor 401 executes the following instructions:
acquiring a point cloud data set;
taking each frame of point cloud data included in the point cloud data set as point cloud data to be marked, and determining reference point cloud data from the point cloud data including the object to be marked in the point cloud data set according to the state attribute of each object to be marked in the point cloud data to be marked; wherein the state attribute comprises: a moving state and/or a stationary state;
and determining target marking data of the object to be marked in the point cloud data to be marked based on the reference marking data of the object to be marked in the point cloud data to be marked.
The specific processing flow of the processor 401 may refer to the description of the above method embodiment, and is not described herein again.
In addition, the embodiment of the present disclosure further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the data annotation method described in the above method embodiment. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure further provide a computer program product, where the computer program product carries a program code, and instructions included in the program code may be used to execute the steps of the data labeling method in the foregoing method embodiments, which may be specifically referred to in the foregoing method embodiments and are not described herein again.
The computer program product may be implemented by hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium, and in another alternative embodiment, the computer program product is embodied in a software product, such as a Software Development Kit (SDK) or the like.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in software functional units and sold or used as a stand-alone product, may be stored in a non-transitory computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
The above are only specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present disclosure, and shall be covered by the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (16)

1. A method for annotating data, comprising:
acquiring a point cloud data set;
taking each frame of point cloud data included in the point cloud data set as point cloud data to be marked, and determining reference point cloud data from the point cloud data including the object to be marked in the point cloud data set according to the state attribute of each object to be marked in the point cloud data to be marked; wherein the state attribute comprises: a moving state and/or a stationary state;
and determining target marking data of the object to be marked in the point cloud data to be marked based on the reference marking data of the object to be marked in the point cloud data to be marked.
2. The method of claim 1, wherein each frame of point cloud data included in the set of point cloud data is determined by:
acquiring video data matched with the point cloud data set;
respectively carrying out frame extraction processing on the point cloud data set and the video data to obtain point cloud data of each frame included in the point cloud data set and each video frame included in the video data;
determining a video frame matched with the point cloud data based on first timestamp information corresponding to each frame of the point cloud data and second timestamp information corresponding to each video frame;
detecting each frame of point cloud data, and determining preset marking data corresponding to each object to be marked included in the point cloud data;
and taking the point cloud data associated with the preset marking data and the video frame as point cloud data of each frame included in the point cloud data set.
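One plausible way to realize the timestamp-matching step of claim 2, as a sketch only; the nearest-timestamp rule and the numpy-based implementation are assumptions:

```python
import numpy as np

def match_frames_by_timestamp(pc_timestamps, video_timestamps):
    """For each point cloud frame (first timestamp information), return the index of the
    video frame (second timestamp information) whose timestamp is closest.
    Assumes there are at least two video frames."""
    pc_ts = np.asarray(pc_timestamps, dtype=np.float64)
    vid_ts = np.asarray(video_timestamps, dtype=np.float64)
    order = np.argsort(vid_ts)                           # sort video timestamps once
    pos = np.searchsorted(vid_ts[order], pc_ts)
    pos = np.clip(pos, 1, len(vid_ts) - 1)
    left, right = order[pos - 1], order[pos]             # candidate neighbours on each side
    use_right = np.abs(vid_ts[right] - pc_ts) < np.abs(vid_ts[left] - pc_ts)
    return np.where(use_right, right, left)
```

The returned indices give, for every frame of point cloud data, the matched video frame used by the later steps.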
3. The method of claim 2, further comprising, after determining the video frame matched with each frame of point cloud data based on the first timestamp information corresponding to each frame of point cloud data and the second timestamp information corresponding to each video frame:
determining a timestamp difference value corresponding to each frame of point cloud data based on first timestamp information corresponding to each frame of point cloud data and second timestamp information corresponding to the video frame matched with each frame of point cloud data;
screening out, from the frames of point cloud data, the point cloud data whose timestamp difference value is larger than a set difference threshold, to obtain the screened frames of point cloud data;
the detecting each frame of point cloud data and determining the preset marking data corresponding to each object to be marked included in the point cloud data comprises the following steps:
and detecting each frame of screened point cloud data, and determining preset marking data corresponding to each object to be marked included in each frame of screened point cloud data.
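A sketch of the screening step in claim 3, under the assumption that a frame is simply dropped when its matched video frame is too far away in time:

```python
def screen_by_timestamp_gap(pc_timestamps, matched_video_timestamps, max_gap_s):
    """Keep only frames whose gap to the matched video frame does not exceed the set
    difference threshold; returns the indices of the retained (screened) frames."""
    return [
        i for i, (t_pc, t_vid) in enumerate(zip(pc_timestamps, matched_video_timestamps))
        if abs(t_pc - t_vid) <= max_gap_s
    ]
```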
4. The method according to claim 2 or 3, wherein the detecting each frame of the point cloud data and determining the preset labeling data corresponding to each object to be labeled included in the point cloud data comprises:
detecting each frame of point cloud data, and generating initial labeling data corresponding to each object to be labeled included in the point cloud data, wherein the initial labeling data comprises at least one of the following: category data, pose data, and size data;
and adjusting the initial labeling data based on the video frame matched with the point cloud data to generate preset labeling data corresponding to each object to be labeled included in the point cloud data.
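To make the data involved in claim 4 concrete, a sketch of what the initial labeling data could look like; the detector interface and the field layout are purely hypothetical:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class InitialLabel:
    """Initial labeling data for one object: category data, pose data and size data."""
    category: str
    pose: np.ndarray    # e.g. (x, y, z, yaw) in the vehicle body center coordinate system
    size: np.ndarray    # e.g. (length, width, height)

def detect_frame(point_cloud, detector):
    """Run a (hypothetical) 3D detector on one frame and wrap its raw output."""
    return [
        InitialLabel(category=d["class"],
                     pose=np.asarray(d["pose"], dtype=np.float64),
                     size=np.asarray(d["size"], dtype=np.float64))
        for d in detector(point_cloud)
    ]
```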
5. The method of claim 4, after generating the initial labeling data corresponding to each object to be labeled included in the point cloud data, further comprising:
displaying the point cloud data, the video frames matched with the point cloud data and each lane line in the target map matched with the point cloud data;
and responding to the received instruction of adjusting the initial labeling data, and generating preset labeling data corresponding to each object to be labeled included in the point cloud data based on the adjustment of the initial labeling data.
6. The method according to any one of claims 1 to 5, wherein the determining of the reference point cloud data from the point cloud data including the object to be labeled in the point cloud data set based on the state attribute of the object to be labeled comprises:
and under the condition that the state attribute of the object to be marked indicates a motion state, determining, from the point cloud data in the point cloud data set that includes the object to be marked, a preset number of frames of point cloud data that precede and are adjacent to the point cloud data to be marked as the reference point cloud data.
7. The method according to any one of claims 1 to 6, wherein the determining reference point cloud data from the point cloud data including the object to be labeled in the point cloud data set based on the state attribute of the object to be labeled comprises:
and under the condition that the state attribute of the object to be marked indicates a static state, determining point cloud data, of which the size of an object detection frame of the object to be marked is in a set size range, from the point cloud data of the object to be marked in the point cloud data set as reference point cloud data.
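A combined sketch of the two selection rules in claims 6 and 7; the frame record fields and the use of box volume for the size check are assumptions:

```python
import numpy as np

def select_reference_frames(frames_with_object, target_index, state, preset_count, size_range):
    """frames_with_object: records of the frames that contain the object, each with an
    'index' and a 'box_size'; size_range: (min_volume, max_volume) for the static case."""
    if state == "moving":
        # Claim 6: the preset number of frames immediately preceding the frame to be labeled.
        preceding = [f for f in frames_with_object if f["index"] < target_index]
        return preceding[-preset_count:]
    # Claim 7: frames whose detection box size for the object lies in the set size range.
    lo, hi = size_range
    return [f for f in frames_with_object if lo <= np.prod(f["box_size"]) <= hi]
```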
8. The method according to any one of claims 1 to 7, wherein, in a case that the state attribute of the object to be labeled indicates a static state, the determining target labeling data of the object to be labeled in the point cloud data to be labeled based on the reference labeling data of the object to be labeled in the reference point cloud data comprises:
converting the reference marking data of the object to be marked in the reference point cloud data into a world coordinate system based on the calibration parameters of the acquisition equipment when the reference point cloud data is acquired, to obtain first converted marking data corresponding to the object to be marked; wherein the reference marking data is located in a vehicle body center coordinate system at the time the reference point cloud data is collected;
and based on the calibration parameters of the acquisition equipment when the point cloud data to be marked are acquired, converting the first converted marking data corresponding to the object to be marked to a vehicle body central coordinate system when the point cloud data to be marked are acquired, and obtaining target marking data of the object to be marked in the point cloud data to be marked.
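A sketch of the static-object case in claim 8, representing the calibration parameters of the acquisition equipment as 4x4 body-to-world transforms (an assumption about their form):

```python
import numpy as np

def transfer_static_label(ref_label_xyz, T_world_from_ref_body, T_world_from_target_body):
    """Body frame of the reference point cloud -> world frame (first converted data)
    -> body frame of the point cloud to be labeled (target labeling data)."""
    p_ref = np.append(np.asarray(ref_label_xyz, dtype=np.float64), 1.0)   # homogeneous point
    p_world = T_world_from_ref_body @ p_ref
    p_target = np.linalg.inv(T_world_from_target_body) @ p_world
    return p_target[:3]
```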
9. The method according to any one of claims 1 to 7, wherein, in a case that the state attribute of the object to be labeled indicates a motion state and the reference point cloud data comprises a preset number of frames, the determining target labeling data of the object to be labeled in the point cloud data to be labeled based on the reference labeling data of the object to be labeled in the reference point cloud data comprises:
for each frame of the reference point cloud data, converting the reference marking data of the object to be marked included in the reference point cloud data into a world coordinate system based on the calibration parameters of the acquisition equipment when the reference point cloud data is acquired, and obtaining second converted marking data corresponding to the object to be marked included in the reference point cloud data;
determining the predicted marking data of the object to be marked in the point cloud data to be marked in the world coordinate system based on the second converted marking data corresponding to the object to be marked in a preset number of pieces of reference point cloud data;
and converting the predicted marking data to a vehicle body center coordinate system corresponding to the collected point cloud data to be marked by using the calibration parameters of the collecting equipment when the point cloud data to be marked is collected to obtain target marking data of the object to be marked in the point cloud data to be marked.
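A sketch of the moving-object case in claim 9; the constant-velocity least-squares predictor is only one possible choice and is an assumption here:

```python
import numpy as np

def predict_moving_label(ref_positions_world, ref_times, target_time, T_world_from_target_body):
    """Fit position ~ p0 + v*t to the reference positions in the world frame (the second
    converted data), predict the position at the target frame's time, then convert it into
    the body coordinate system of the point cloud to be labeled."""
    P = np.asarray(ref_positions_world, dtype=np.float64)   # shape (N, 3), N >= 2
    t = np.asarray(ref_times, dtype=np.float64)
    A = np.stack([np.ones_like(t), t], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, P, rcond=None)          # rows: [p0; v]
    p_pred_world = coeffs[0] + coeffs[1] * target_time      # predicted marking data (world)
    p_h = np.append(p_pred_world, 1.0)
    return (np.linalg.inv(T_world_from_target_body) @ p_h)[:3]
```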
10. The method according to any one of claims 1 to 9, wherein after the determining the target annotation data of the object to be annotated in the point cloud data to be annotated, the method further comprises:
determining track information corresponding to each object to be marked included in the point cloud data set; wherein the track information comprises position data of the object to be marked in each frame of point cloud data to be marked;
for each object to be marked, converting each position data in the track information corresponding to the object to be marked into a world coordinate system to obtain each first position data of the object to be marked;
smoothing each first position data to obtain each second position data of the object to be marked;
converting each second position data into the vehicle body center coordinate system of the frame of point cloud data to be marked in which the object to be marked is located, to obtain third position data of the object to be marked in that frame of point cloud data to be marked;
and updating the target marking data of the object to be marked in the point cloud data to be marked of the frame where the object to be marked is located by using the third position data to obtain updated target marking data corresponding to each frame of point cloud data to be marked.
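A sketch of the smoothing step in claim 10; the moving-average filter and window size are assumptions, and any smoother over the world-frame trajectory would fit the same pattern:

```python
import numpy as np

def smooth_world_trajectory(first_position_data, window=5):
    """Moving-average smoothing of an object's per-frame world positions, producing the
    second position data; each smoothed point would then be converted back into the body
    frame of its own frame of point cloud data (as in the claim 8 sketch)."""
    P = np.asarray(first_position_data, dtype=np.float64)   # shape (N, 3)
    half = window // 2
    out = np.empty_like(P)
    for i in range(len(P)):
        lo, hi = max(0, i - half), min(len(P), i + half + 1)
        out[i] = P[lo:hi].mean(axis=0)
    return out
```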
11. The method according to any one of claims 1 to 10, wherein in a case that the plurality of frames of point cloud data to be labeled include an object to be labeled whose state attribute indicates a motion state, the method further comprises:
converting the target marking data of the object to be marked in the moving state, in each frame of point cloud data to be marked that includes the object to be marked in the moving state, into a world coordinate system, to obtain fourth position data corresponding to the object to be marked in the multiple frames of point cloud data to be marked;
and, for each frame of point cloud data to be marked among the multiple frames of point cloud data to be marked, determining speed information of the object to be marked in the point cloud data to be marked based on the fourth position data of the object to be marked in the point cloud data to be marked and the fourth position data of the object to be marked in historical point cloud data to be marked preceding the point cloud data to be marked.
12. The method of claim 11, further comprising:
and determining the acceleration information of the object to be marked in each frame of point cloud data to be marked based on the speed information of the object to be marked in the plurality of frames of point cloud data to be marked.
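A sketch covering claims 11 and 12 together, computing speed and acceleration by finite differences of the fourth position data; the time base and array layout are assumptions:

```python
import numpy as np

def speed_and_acceleration(fourth_position_data, timestamps):
    """Per-frame velocity from consecutive world positions, then per-frame acceleration
    from consecutive velocities."""
    P = np.asarray(fourth_position_data, dtype=np.float64)   # shape (N, 3)
    t = np.asarray(timestamps, dtype=np.float64)              # shape (N,)
    dt = np.diff(t)[:, None]
    v = np.diff(P, axis=0) / dt                                # velocity for frames 1..N-1
    a = np.diff(v, axis=0) / dt[1:]                            # acceleration for frames 2..N-1
    return v, a
```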
13. The method according to any one of claims 1 to 12, wherein after the determining the target annotation data of the object to be annotated in the point cloud data to be annotated, the method further comprises:
training a neural network to be trained by using the multi-frame point cloud data to be labeled, which comprises the target labeling data, to obtain a target neural network; and/or,
and testing the neural network to be tested by using the multi-frame point cloud data to be marked comprising the target marking data to obtain a test result of the neural network to be tested.
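Finally, a sketch of how the labeled frames might feed a training loop as in claim 13; the PyTorch-style model, loss function, and data iterator are assumptions and not part of the disclosure:

```python
import torch

def train_on_labeled_frames(model, labeled_frames, loss_fn, epochs=10, lr=1e-3):
    """labeled_frames yields (point cloud tensor, target labeling data) pairs."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for points, targets in labeled_frames:
            optimizer.zero_grad()
            loss = loss_fn(model(points), targets)   # compare predictions to target labels
            loss.backward()
            optimizer.step()
    return model
```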
14. A data annotation device, comprising:
the acquisition module is used for acquiring a point cloud data set;
the first determining module is used for taking each frame of point cloud data included in the point cloud data set as point cloud data to be marked, and, for each object to be marked in the point cloud data to be marked, determining reference point cloud data from the point cloud data in the point cloud data set that includes the object to be marked, based on the state attribute of the object to be marked; wherein the state attribute comprises: a moving state and/or a stationary state;
and the second determining module is used for determining target marking data of the object to be marked in the point cloud data to be marked based on reference marking data of the object to be marked in the reference point cloud data.
15. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is operating, the machine-readable instructions when executed by the processor performing the steps of the data annotation method of any one of claims 1 to 13.
16. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the data annotation method according to any one of claims 1 to 13.
CN202210912190.9A, filed 2022-07-29 (priority date 2022-07-29): Data labeling method and device, electronic equipment and storage medium. Legal status: Pending.

Priority Applications (1)

Application Number: CN202210912190.9A; Priority Date / Filing Date: 2022-07-29; Title: Data labeling method and device, electronic equipment and storage medium

Publications (1)

Publication Number: CN115221356A; Publication Date: 2022-10-21

Family ID: 83616301

Family Applications (1)

Application Number: CN202210912190.9A; Priority Date / Filing Date: 2022-07-29; Title: Data labeling method and device, electronic equipment and storage medium

Country Status (1)

Country: CN; Publication: CN115221356A


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination