CN114638947A - Data labeling method and device, electronic equipment and storage medium


Info

Publication number
CN114638947A
Authority
CN
China
Prior art keywords
target object
foreground
depth map
data point
ratio
Prior art date
Legal status
Pending
Application number
CN202210195145.6A
Other languages
Chinese (zh)
Inventor
王煜城
赵元
刘兰个川
Current Assignee
Guangzhou Xiaopeng Autopilot Technology Co Ltd
Original Assignee
Guangzhou Xiaopeng Autopilot Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Xiaopeng Autopilot Technology Co Ltd
Priority to CN202210195145.6A
Publication of CN114638947A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T 3/053
    • G06T 3/00 Geometric image transformation in the plane of the image
    • G06T 3/40 Scaling the whole image or part thereof
    • G06T 3/4053 Super resolution, i.e. output image resolution higher than sensor resolution
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10032 Satellite or aerial image; Remote sensing
    • G06T 2207/10044 Radar image
    • G06T 2219/00 Indexing scheme for manipulating 3D models or images for computer graphics
    • G06T 2219/004 Annotating, labelling
    • G06T 2219/20 Indexing scheme for editing of 3D models
    • G06T 2219/2016 Rotation, translation, scaling

Abstract

The embodiment of the application discloses a data labeling method, a data labeling device, electronic equipment and a storage medium, wherein the method comprises the following steps: identifying foreground data points corresponding to a target object and background data points irrelevant to the target object from environmental point cloud data collected by a radar sensor; respectively projecting the foreground data point and the background data point to an imaging plane of a visual sensor to obtain a first projection result corresponding to the foreground data point and a second projection result corresponding to the background data point; determining the non-truncation ratio of the target object according to the first projection result, and determining the non-occlusion ratio of the target object according to the first projection result and the second projection result; and carrying out visibility marking on the target object according to the non-truncation ratio and the non-occlusion ratio of the target object. By implementing the embodiment of the application, visibility marking can be performed quickly and accurately on the collected obstacle data.

Description

Data labeling method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data labeling method and apparatus, an electronic device, and a storage medium.
Background
Three-dimensional (3D) visual perception is one of the important perception capabilities in the field of autonomous driving. With the continuous optimization of the performance of autonomous driving processing models, the demand for labeling 3D data is growing rapidly. In order to meet this rapidly growing data labeling demand, the current mainstream labeling method adopts the detection result of the laser radar as a pre-labeling result, and the pre-labeling result is then manually checked and adjusted. However, since the installation position of the laser radar generally cannot coincide with the installation position of the camera, the range that the laser radar can sense is not consistent with the field of view of the camera, which leads to the problem of obstacle visibility.
In order to solve the visibility problem of the obstacle, conventional methods mostly define visibility standards by rule-based means. However, in practice it is found that defining visibility standards by rule-based means often leads to problems such as inaccurate visibility judgment and low labeling efficiency.
Disclosure of Invention
The embodiment of the application discloses a data labeling method, a data labeling device, electronic equipment and a storage medium, which can quickly and accurately label the visibility of collected obstacle data.
The embodiment of the application discloses a data annotation method, which comprises the following steps:
identifying foreground data points corresponding to a target object and background data points irrelevant to the target object from environmental point cloud data collected by a radar sensor;
respectively projecting the foreground data point and the background data point to an imaging plane of a visual sensor to obtain a first projection result corresponding to the foreground data point and a second projection result corresponding to the background data point;
determining the non-truncation ratio of the target object according to the first projection result, and determining the non-occlusion ratio of the target object according to the first projection result and the second projection result;
and carrying out visibility marking on the target object according to the non-truncation ratio and the non-occlusion ratio of the target object.
In one embodiment, the carrying out visibility marking on the target object according to the non-truncation ratio and the non-occlusion ratio of the target object includes:
calculating the visibility score of the target object according to the product of the non-truncation ratio and the non-occlusion ratio of the target object;
and if the visibility score of the target object is greater than or equal to a preset threshold value, marking the visibility of the target object as true.
In one embodiment, the determining the non-occlusion rate of the target object according to the first projection result and the second projection result includes:
constructing a first foreground depth map of the target object according to the image coordinates, on the imaging plane, of each foreground data point included in the first projection result and the depth value of each foreground data point corresponding to the target object;
performing resolution compression operation on the first foreground depth map to obtain a second foreground depth map;
constructing a first background depth map according to the image coordinates of each background data point on the imaging plane and the depth value of each background data point included in the second projection result;
performing resolution compression operation on the first background depth map to obtain a second background depth map;
and calculating the shielding rate of the target object according to the second foreground depth map and the second background depth map, and calculating the non-shielding rate of the target object according to the shielding rate of the target object.
In one embodiment, the performing a resolution compression operation on the first foreground depth map to obtain a second foreground depth map includes:
determining a two-dimensional external frame corresponding to the target object according to the image coordinates of each foreground data point in the first foreground depth map;
calculating a scale transformation coefficient corresponding to resolution compression operation according to the preset row number and column number of the low-resolution depth map and the height and width of the two-dimensional external frame;
and carrying out resolution compression operation on the first foreground depth map according to the scale transformation coefficient to obtain a second foreground depth map.
In one embodiment, the first foreground depth map has at least two data points which are mapped to the same first data point in the second foreground depth map after being subjected to a resolution compression operation; the depth value of the first data point in the second foreground depth map is the minimum of the depth values of the at least two data points in the first foreground depth map.
In one embodiment, the calculating the occlusion rate of the target object according to the second foreground depth map and the second background depth map includes:
comparing corresponding depth values of the foreground data point and the background data point with the same image coordinate in the second foreground depth map and the second background depth map respectively, and counting the number of target data points of which the difference value of the depth values in the second foreground depth map exceeds a depth threshold;
determining a ratio between the number of the target data points and the total number of foreground data points as an occlusion rate of the target object.
In one embodiment, the determining a non-truncation ratio of the target object from the first projection result includes:
counting the number of foreground data points projected to the outside of the visual angle range of the visual sensor according to the first projection result;
determining a ratio between the number of foreground data points projected outside the viewing angle range and the total number of foreground data points as a truncation ratio of the target object;
and determining the non-truncation ratio of the target object according to the truncation ratio of the target object.
The embodiment of the application discloses a data annotation device, includes:
the identification module is used for identifying foreground data points corresponding to a target object and background data points irrelevant to the target object from environmental point cloud data collected by a radar sensor;
the projection module is used for projecting the foreground data point and the background data point to an imaging plane of a visual sensor respectively to obtain a first projection result corresponding to the foreground data point and a second projection result corresponding to the background data point;
the first determining module is used for determining the non-truncation ratio of the target object according to the first projection result and determining the non-occlusion ratio of the target object according to the first projection result and the second projection result;
and the second determination module is used for carrying out visibility marking on the target object according to the non-truncation ratio and the non-occlusion ratio of the target object.
The embodiment of the application discloses an electronic device, which comprises a memory and a processor, wherein a computer program is stored in the memory, and when the computer program is executed by the processor, the processor is enabled to realize any one of the data annotation methods disclosed by the embodiment of the application.
The embodiment of the application discloses a computer readable storage medium, on which a computer program is stored, and the computer program is executed by a processor to realize any one of the data annotation methods disclosed in the embodiment of the application.
Compared with the related art, the embodiment of the application has the following beneficial effects:
after the radar sensor collects the environmental point cloud data, a first projection result corresponding to the target object and a second projection result irrelevant to the target object are obtained through identification and projection of the environmental point cloud data, so that the non-truncation ratio of the target object can be calculated according to the first projection result, and the non-occlusion ratio of the target object can be calculated in combination with the second projection result. The visibility of the target object can be calculated from its non-truncation ratio and non-occlusion ratio, so that complicated rule-based design can be reduced, and visibility marking of the target object can be carried out quickly and accurately simply by calculating the non-truncation ratio and the non-occlusion ratio of the target object.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
Fig. 1 is an exemplary diagram of an application scenario of a data annotation method disclosed in an embodiment of the present application;
FIG. 2 is a schematic diagram illustrating a method flow of a data annotation method according to an embodiment;
FIG. 3 is a schematic method flow diagram of another data annotation method according to an exemplary embodiment;
FIG. 4 is an exemplary illustration of a projection result of an embodiment of the disclosed projection of ambient point cloud data onto an imaging plane;
FIG. 5 is a schematic structural diagram of a data annotation device according to an embodiment;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It is to be noted that the terms "comprises" and "comprising" and any variations thereof in the examples and figures of the present application are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
The embodiment of the application discloses a data labeling method and device, electronic equipment and a storage medium, which can quickly and accurately label the visibility of collected obstacle data. These are described in detail below.
Referring to fig. 1, fig. 1 is a diagram illustrating an example of acquiring environmental point cloud data according to an embodiment. As shown in fig. 1, a radar sensor and a vision sensor may be provided on the vehicle 10.
The radar sensor may include, but is not limited to, a laser radar, a millimeter wave radar, an ultrasonic radar, and the like. The vision sensor may include: a visible light camera, a fisheye camera, a panoramic camera, and the like, and the details are not limited.
The radar sensor may continuously collect environmental point cloud data in the environment in which the vehicle 10 is located. As shown in fig. 1, the environmental point cloud data collected by the radar sensor may include, but is not limited to, point cloud data corresponding to the vehicle 20, point cloud data corresponding to the tree 30, point cloud data corresponding to the building 40, and the like.
Different objects in the environment of the vehicle 10 may be selected as the target object for visibility labeling according to different labeling objectives. For example, if visibility marking is required for obstacles that affect the driving of the vehicle 10, the vehicle 20 may be used as the target object.
Based on the application scenario as shown in fig. 1. Referring to fig. 2, fig. 2 is a schematic method flow diagram of a data annotation method according to an embodiment, and the method is applicable to various Electronic devices, such as a vehicle-mounted computer of a vehicle, a vehicle-mounted device such as an Electronic Control Unit (ECU), or a computing device such as a personal computer, and is not limited specifically. As shown in fig. 2, the method may include the steps of:
210. foreground data points corresponding to the target object and background data points irrelevant to the target object are identified from the environmental point cloud data collected by the radar sensor.
The electronic equipment can process the environmental point cloud data collected by the radar sensor through target detection methods such as statistic-based classification identification, fuzzy classification identification, neural network-based classification identification and support vector machine-based classification identification, so that the foreground data points corresponding to the target object are identified. That is, foreground data points belonging to the target object are identified.
After identifying the foreground data points, the electronic device may identify remaining data points of the collected ambient point cloud data other than the foreground data points as background data points that are unrelated to the target object.
In some embodiments, to reduce the amount of computation, the electronic device may also perform thinning processing on the remaining data points in the environmental point cloud data except for the foreground data point, remove some data points from the remaining data points according to a rule, and identify the remaining data points as background data points unrelated to the target object.
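As an illustration of the identification and thinning described above, the following Python sketch (the application itself prescribes no implementation language) splits a point cloud into foreground and background points and randomly thins the background. The numpy arrays, the function name split_foreground_background and the random keep-ratio rule are assumptions for illustration; the embodiment only requires that some data points be removed from the remaining points according to a rule.

```python
import numpy as np

def split_foreground_background(points, foreground_mask, keep_ratio=0.5, seed=0):
    """Split a radar point cloud into foreground/background and thin the background.

    points: (N, 3) array of x, y, z coordinates from the radar sensor.
    foreground_mask: (N,) boolean array, True for points detected as the target object.
    keep_ratio: fraction of background points to keep (illustrative thinning rule).
    """
    foreground = points[foreground_mask]
    background = points[~foreground_mask]

    # Thinning: randomly keep a subset of the background points to reduce computation.
    rng = np.random.default_rng(seed)
    keep = rng.random(len(background)) < keep_ratio
    return foreground, background[keep]
```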
220. And projecting the foreground data points and the background data points to an imaging plane of the vision sensor to obtain a first projection result corresponding to the foreground data points and a second projection result corresponding to the background data points.
The size of the imaging plane of the vision sensor is related to the field of view (FOV) of the vision sensor, which may refer to the range that can be covered by the lens of the vision sensor, i.e., the range of viewing angles. Therefore, when the foreground data point and the background data point are respectively projected onto the imaging plane of the vision sensor, the data points within the angle of view can be projected onto the imaging plane, and the data points outside the angle of view generally cannot be projected onto the imaging plane.
Therefore, the first projection result obtained by projecting the foreground data point may include one or more of the following parameters: a number of foreground data points falling within a viewing angle range of the vision sensor, an image coordinate of each foreground data point falling within the viewing angle range, a number of foreground data points outside the viewing angle range of the vision sensor.
Accordingly, the second projection result obtained by projecting the background data points may include one or more of the following parameters: the number of background data points falling within the viewing angle range of the vision sensor, the image coordinates of each background data point falling within the viewing angle range, and the number of background data points outside the viewing angle range of the vision sensor.
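For illustration, the projection in step 220 can be sketched with a standard pinhole camera model as below. The intrinsic matrix K, the radar-to-camera rotation R and translation t, and the helper name project_to_image are assumptions; the embodiment does not specify a particular projection model or calibration format.

```python
import numpy as np

def project_to_image(points_xyz, K, R, t, width, height):
    """Project radar points onto the vision sensor's imaging plane (pinhole model).

    points_xyz: (N, 3) points in the radar frame.
    K: (3, 3) camera intrinsics; R, t: assumed radar-to-camera rotation and translation.
    Returns pixel coordinates, depth values, and a mask of points inside the view.
    """
    cam = points_xyz @ R.T + t             # radar frame -> camera frame
    depth = cam[:, 2]
    in_front = depth > 1e-6                # only points in front of the camera can project

    uvw = cam @ K.T
    uv = uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-6, None)
    inside = (
        in_front
        & (uv[:, 0] >= 0) & (uv[:, 0] < width)
        & (uv[:, 1] >= 0) & (uv[:, 1] < height)
    )
    return uv, depth, inside
```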
230. And determining the non-truncation ratio of the target object according to the first projection result, and determining the non-occlusion ratio of the target object according to the first projection result and the second projection result.
The non-truncation ratio of the target object may be used to indicate the probability that the foreground data points corresponding to the target object fall within the viewing angle range of the vision sensor; the more foreground data points fall within the viewing angle range, the higher the non-truncation ratio of the target object.
In some embodiments, the first projection result may include a number of foreground data points falling within the viewing angle range, and the electronic device may calculate the non-truncation ratio of the target object directly from the number of foreground data points falling within the viewing angle range and the total number of foreground data points included in the first projection result.
In other embodiments, the first projection result may include the number of foreground data points falling outside the viewing angle range, and the electronic device may also calculate the truncation ratio of the target object according to the number of foreground data points falling outside the viewing angle range and the total number of foreground data points, and then calculate the non-truncation ratio of the target object according to the truncation ratio of the target object. The fewer foreground data points fall outside the viewing angle range, the lower the truncation ratio of the target object and the higher its non-truncation ratio.
Since the number of foreground data points falling outside the viewing angle range is usually relatively small, calculating the truncation ratio first and then deriving the non-truncation ratio from it can reduce the amount of computation needed to count foreground data points and improve the calculation speed.
For example, the truncation ratio of the target object may be calculated by the following formula:
crop_ratio = I_crop / I_all; formula (1)
wherein crop_ratio may represent the truncation ratio of the target object, I_crop may represent the number of foreground data points projected outside the viewing angle range, and I_all may represent the total number of foreground data points belonging to the target object.
Thus, the non-truncation ratio of the target object may be expressed as (1-crop _ ratio).
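A minimal sketch of this computation, assuming the projection step returns a boolean mask marking which foreground data points fall inside the viewing angle range (the function name and its input format are illustrative):

```python
import numpy as np

def truncation_ratios(inside_mask):
    """Compute crop_ratio (formula (1)) and the non-truncation ratio (1 - crop_ratio).

    inside_mask: boolean numpy array over all foreground data points of the target
    object, True when the point projects inside the camera's viewing angle range.
    """
    total = len(inside_mask)
    if total == 0:
        return 0.0, 0.0                              # degenerate case: no foreground points
    n_outside = int(np.count_nonzero(~inside_mask))  # I_crop: points outside the view
    crop_ratio = n_outside / total                   # I_all: all foreground points
    return crop_ratio, 1.0 - crop_ratio
```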
In addition, the non-occlusion rate of the target object can be used to indicate the probability that the foreground data points corresponding to the target object are not occluded by background data points unrelated to the target object. A foreground data point being occluded by a background data point means that the physical distance between the foreground data point and the background data point is too small for the vehicle to pass between the two points.
In some embodiments, the electronic device may count the number of target data points occluded by the background data points in the foreground data points according to the image coordinates of each foreground data point included in the first projection result and the image coordinates of each background data point included in the second projection result, calculate an occlusion rate of the target object according to the number of target data points and the total number of foreground data points, and further calculate a non-occlusion rate of the target object according to the occlusion rate of the target object.
Optionally, if the image coordinates of a certain foreground data point and another background data point are the same, the foreground data point may be counted as an occluded target data point.
In still other embodiments, the electronic device may further calculate an occlusion rate of the target object by counting a number of target data points occluded by the background data points in the foreground data points according to the image coordinates of each foreground data point included in the first projection result, the image coordinates of each background data point included in the second projection result, and the depth value of each foreground data point and each background data point, so as to calculate a non-occlusion rate of the target object according to the occlusion rate of the target object.
It should be noted that the depth value of each foreground data point and each background data point may be measured by a radar sensor and included in the environmental point cloud data.
240. And carrying out visibility marking on the target object according to the non-truncation ratio and the non-occlusion ratio of the target object.
In the embodiment of the present application, the visibility of the target object may be in positive correlation with the non-truncation ratio and the non-occlusion ratio of the target object, respectively. The higher the non-truncation ratio of the target object, the higher the visibility; the higher the non-occlusion rate of the target object, the higher the visibility.
In some embodiments, the electronic device may set threshold values corresponding to the non-truncation rate and the non-occlusion rate, respectively, and perform visibility labeling on the target object according to the threshold values corresponding to the non-truncation rate and the non-occlusion rate, respectively.
In still other embodiments, the electronic device may also calculate a visibility score of the target object according to the non-truncation ratio and the non-occlusion ratio of the target object, and perform visibility labeling on the target object according to the visibility score. A preset threshold corresponding to the visibility score can be preset, and if the calculated visibility score is greater than or equal to the preset threshold, the visibility of the target object can be marked as true; otherwise, if the calculated visibility score is less than the preset threshold, the visibility of the target object may be marked as false.
For example, the visibility label of the target object can be expressed by referring to the following formula:
visibility = true, if (1 - crop_ratio) × (1 - occ_ratio) ≥ Threshold; otherwise false; formula (2)
wherein visibility may represent the visibility label of the target object, (1 - crop_ratio) may represent the non-truncation ratio of the target object, (1 - occ_ratio) may represent the non-occlusion rate of the target object, and Threshold may represent the preset threshold.
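The labeling rule above can be sketched as follows; the default threshold of 0.5 is an assumption for illustration, since the embodiment only requires some preset threshold.

```python
def label_visibility(crop_ratio, occ_ratio, threshold=0.5):
    """Visibility score = (1 - crop_ratio) * (1 - occ_ratio) as in formula (2);
    the label is True when the score reaches the preset threshold."""
    score = (1.0 - crop_ratio) * (1.0 - occ_ratio)
    return score, score >= threshold
```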
That is to say, in the embodiment of the application, the visibility of the target object can be automatically evaluated according to its non-occlusion rate and non-truncation rate, and visibility marking can be performed quickly and accurately on the collected obstacle data. Furthermore, the visibility of the target object can be defined as the product of the non-occlusion rate and the non-truncation rate, so that the binary classification problem of whether the target object is visible is converted into a regression problem on this product, which allows artificial intelligence models such as machine learning and deep learning models to perform subsequent data processing operations on the visibility of the target object more flexibly.
Because the radar sensor may have a certain sensing blind area, the data points corresponding to objects close to the radar sensor in the environmental point cloud data are sparse, which easily leads to misjudgment of the occlusion relationship.
In some possible embodiments, the electronic device may adopt a voxelization and other pseudo point cloud manner to complete sparse environmental point cloud data, and then process the complete environmental point cloud data through any one of the aforementioned data labeling methods, so as to reduce erroneous judgment of a shielding relationship, improve calculation accuracy of a target object shielding rate and a non-shielding rate, and improve accuracy of visibility labeling.
In other possible embodiments, the electronic device may also perform low-resolution mapping on the point cloud data, so that the sparse point cloud data can also effectively express the occlusion relationship. Compared with a voxel pseudo point cloud filling mode, the low-resolution mapping mode is low in calculation cost and high in calculation speed, and can avoid occlusion relation misjudgment caused by voxel filling errors.
Referring to fig. 3, fig. 3 is a flowchart illustrating another method for data annotation according to an embodiment, which can be applied to the electronic device. As shown in fig. 3, the data annotation method may include the following steps:
310. foreground data points corresponding to the target object and background data points irrelevant to the target object are identified from the environmental point cloud data collected by the radar sensor.
320. And projecting the foreground data point and the background data point to an imaging plane of the vision sensor to obtain a first projection result corresponding to the foreground data point and a second projection result corresponding to the background data point.
330. And determining the non-truncation ratio of the target object according to the first projection result.
For the detailed implementation of steps 310 to 330, reference may be made to the foregoing embodiments, which will not be repeated here.
340. And constructing a first foreground depth map of the target object according to the image coordinates of each foreground data point on the imaging plane, which are included in the first projection result, and the depth value of each foreground data point corresponding to the target object.
Illustratively, the first foreground depth map may be represented as follows:
depth_map = {(u_i, v_i, d_i) | i = 1, 2, ..., n}; formula (3)
wherein depth_map may represent the first foreground depth map, u_i and v_i may represent the image coordinates of the i-th foreground data point on the imaging plane, d_i may represent the depth value of the i-th foreground data point, and n may represent the total number of foreground data points.
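A possible sketch of assembling the first foreground depth map in sparse form, assuming the image coordinates and depth values come from a projection routine such as the one sketched for step 220 (rounding to integer pixel coordinates is an illustrative choice):

```python
import numpy as np

def build_depth_points(uv, depth, inside_mask):
    """Assemble the sparse (u_i, v_i, d_i) triples of the first foreground depth map
    from the foreground data points that fall inside the imaging plane."""
    uv_in = np.rint(uv[inside_mask]).astype(int)   # pixel coordinates on the imaging plane
    d_in = depth[inside_mask]
    return np.column_stack([uv_in, d_in])          # shape (n, 3): columns u, v, d
```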
350. And carrying out resolution compression operation on the first foreground depth map to obtain a second foreground depth map.
In the embodiment of the present application, the image size of the compressed low-resolution depth map may be preset, that is, the number of rows and columns of the low-resolution depth map may be preset. The resolution compression operation can be used for compressing the first foreground depth map with larger image size into a low-resolution depth map with preset row number and preset column number.
As an alternative embodiment, step 350 may include the steps of:
3510. and determining a two-dimensional external frame corresponding to the target object according to the image coordinates of each foreground data point in the first foreground depth map.
The two-dimensional bounding box is used for indicating the position of the target object in the first foreground depth map and can be represented by the image coordinates of the foreground data points at the outermost periphery.
For example, the two-dimensional bounding box corresponding to the target object may be represented as follows:
box_2d = [u_min, v_min, u_max, v_max]; formula (4)
wherein u_min may represent the minimum abscissa of the foreground data points, u_max may represent the maximum abscissa of the foreground data points, v_min may represent the minimum ordinate of the foreground data points, and v_max may represent the maximum ordinate of the foreground data points.
3520. And calculating a scale transformation coefficient corresponding to resolution compression operation according to the preset number of rows and columns of the low-resolution depth map and the height and width of the two-dimensional external frame.
The scale transformation coefficients may be used to guide the resolution compression operation, and may include a horizontal transformation coefficient and a vertical transformation coefficient. The horizontal transformation coefficient can be calculated from the preset number of columns of the low-resolution depth map and the width of the two-dimensional bounding box; the vertical transformation coefficient can be calculated from the preset number of rows of the low-resolution depth map and the height of the two-dimensional bounding box.
For example, the calculation manner of the horizontal transform coefficient and the vertical transform coefficient may be represented by the following formula:
scale_u = N / (u_max - u_min); formula (5)
scale_v = M / (v_max - v_min); formula (6)
wherein scale_u may represent the horizontal transformation coefficient, N may represent the preset number of columns of the low-resolution depth map, and (u_max - u_min) may represent the width of the two-dimensional bounding box; scale_v may represent the vertical transformation coefficient, M may represent the preset number of rows of the low-resolution depth map, and (v_max - v_min) may represent the height of the two-dimensional bounding box.
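Steps 3510 and 3520 can be sketched together as below, assuming the foreground depth map is held as an (n, 3) array of (u, v, d) rows; the small guard against a degenerate bounding box is an addition for numerical safety, not part of formulas (5) and (6).

```python
import numpy as np

def scale_coefficients(depth_points, n_cols, n_rows):
    """Two-dimensional bounding box of the foreground points and the horizontal and
    vertical scale transformation coefficients of formulas (5) and (6).

    depth_points: (n, 3) array whose rows are (u, v, d).
    n_cols, n_rows: preset size (N columns, M rows) of the low-resolution depth map.
    """
    u, v = depth_points[:, 0], depth_points[:, 1]
    u_min, u_max = u.min(), u.max()
    v_min, v_max = v.min(), v.max()
    scale_u = n_cols / max(u_max - u_min, 1e-6)   # guard against a degenerate box
    scale_v = n_rows / max(v_max - v_min, 1e-6)
    return (u_min, v_min, u_max, v_max), scale_u, scale_v
```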
3530. And carrying out resolution compression operation on the first foreground depth map according to the scale transformation coefficient to obtain a second foreground depth map.
In an embodiment of the present application, performing a resolution compression operation on the first foreground depth map according to the scaling coefficient may include: and adjusting the image position of the foreground data point in the first foreground depth map by using the scale conversion coefficient, and determining the corresponding depth value of each converted foreground data point.
It should be noted that there may be two or more foreground data points in the first foreground depth map that are mapped to the same first data point in the second foreground depth map after the resolution compression operation. In this case, if the depth values of the multiple foreground data points mapped to the same first data point differ, the final depth value of that first data point in the second foreground depth map needs to be determined from the different depth values of the multiple foreground data points. Optionally, the minimum of the depth values corresponding to the multiple foreground data points may be taken as the depth value of the first data point in the second foreground depth map.
Illustratively, the second foreground depth map may be represented as follows:
depth'_map = {(u_j, v_j, d_j)}; formula (7)
wherein depth'_map may represent the second foreground depth map, u_j and v_j may represent the image coordinates of the j-th foreground data point in the second foreground depth map, and d_j may represent the depth value of the j-th foreground data point in the second foreground depth map.
Based on the foregoing formula (7), performing a resolution compression operation on the first foreground depth map according to the scale transformation coefficient can be represented by the following formula:
u_j = (u_i - u_min) × scale_u + 0.5; formula (8)
v_j = (v_i - v_min) × scale_v + 0.5; formula (9)
d_j = min(d_I); formula (10)
It should be noted that the j-th foreground data point in the second foreground depth map is obtained after the resolution compression operation is performed on the i-th foreground data point in the first foreground depth map. d_I may represent the depth values, in the first foreground depth map, of all the foreground data points mapped to the j-th foreground data point in the second foreground depth map, and min(·) indicates taking the minimum value. In addition, the 0.5 in formula (8) and formula (9) may be a parameter set by developers based on experience, and may be set to other values according to actual business requirements, which is not specifically limited.
360. And constructing a first background depth map according to the image coordinates of each background data point on the imaging plane and the depth value of each background data point included by the second projection result.
370. And carrying out resolution compression operation on the first background depth map to obtain a second background depth map.
In this embodiment of the present application, in the implementation manner that the first background depth map is constructed in step 360 and the resolution compression operation is performed on the first background depth map in step 370, reference may be made to the implementation manner that the first foreground depth map is constructed in step 340 and the resolution compression operation is performed on the first foreground depth map in step 350, which is not described in detail below.
It should be noted that a plurality of background data points may also correspond to the same background object. Thus, if the electronic device needs to calculate a two-dimensional bounding box in step 370, a two-dimensional bounding box for the background object can be calculated.
380. And calculating the shielding rate of the target object according to the second foreground depth map and the second background depth map, and calculating the non-shielding rate of the target object according to the shielding rate of the target object.
In this embodiment of the application, the electronic device may compare depth values respectively corresponding to a foreground data point and a background data point of the same image coordinate in the second foreground depth map and the second background depth map to calculate a difference value between the depth values of the foreground data point and the background data point of the same image coordinate.
Depending on the definition of occlusion, a depth threshold may be preset, for example the depth threshold may be set to 1 meter.
The electronic device may count the number of target data points in the second foreground depth map for which the difference in depth values exceeds the depth threshold, for example, data points for which the difference in depth values exceeds 1 meter are target data points. After counting the number of target data points, the ratio between the number of target data points and the total number of foreground data points can be calculated as the occlusion rate of the target object.
For example, the occlusion rate of the target object may be calculated with reference to the following formula:
occ_ratio = I_occ / I_all; formula (11)
wherein occ_ratio may represent the occlusion rate of the target object, I_occ may represent the number of target data points, and I_all may represent the total number of foreground data points belonging to the target object.
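A hedged sketch of this statistic: instead of comparing the two compressed maps cell by cell, this variant looks up each foreground data point's cell in the compressed background depth map, which keeps the denominator equal to the total number of foreground data points as in formula (11). The per-point lookup, the comparison direction (a foreground point counts as occluded when the background return at the same cell is closer by more than the threshold), and the 1-meter default taken from the example above are interpretations for illustration.

```python
import numpy as np

def occlusion_ratios(fg_points, bg_low_res, n_cols, n_rows,
                     u_min, v_min, scale_u, scale_v, depth_threshold=1.0):
    """Per-point occlusion test against a compressed background depth map.

    fg_points: (n, 3) foreground (u, v, d) triples on the imaging plane.
    bg_low_res: low-resolution background depth map built over the same grid.
    Returns occ_ratio = I_occ / I_all and the non-occlusion rate 1 - occ_ratio.
    """
    u_j = ((fg_points[:, 0] - u_min) * scale_u + 0.5).astype(int)
    v_j = ((fg_points[:, 1] - v_min) * scale_v + 0.5).astype(int)
    inside = (u_j >= 0) & (u_j < n_cols) & (v_j >= 0) & (v_j < n_rows)

    occluded = np.zeros(len(fg_points), dtype=bool)
    bg_depth = bg_low_res[v_j[inside], u_j[inside]]
    occluded[inside] = np.isfinite(bg_depth) & \
        (fg_points[inside, 2] - bg_depth > depth_threshold)

    occ_ratio = float(occluded.sum()) / max(len(fg_points), 1)
    return occ_ratio, 1.0 - occ_ratio
```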
390. And calculating the visibility score of the target object according to the product of the non-truncation ratio and the non-occlusion ratio of the target object, and performing visibility marking on the target object according to the visibility score of the target object.
And if the visibility score of the target object is greater than or equal to the preset threshold value, marking the visibility of the target object as true. If the visibility score of the target object is less than the preset threshold, the visibility of the target object may be marked as false.
For example, to more clearly illustrate the aforementioned concepts of the truncation ratio, non-truncation ratio, occlusion ratio, and non-occlusion ratio of the target object, please refer to fig. 4, where fig. 4 is an exemplary diagram of a projection result of the environmental point cloud data projected onto the imaging plane according to an embodiment.
As shown in fig. 4, a depth map 410 of the target object may indicate the image coordinates of the respective foreground data points corresponding to the target object after projection onto the imaging plane. The viewing angle range 420 of the vision sensor does not completely coincide with the depth map 410 of the target object, and the foreground data points in the depth map 410 of the target object that are outside the viewing angle range 420 are truncated foreground data points 410a, i.e., I_crop in the aforementioned formula (1).
Meanwhile, among the plurality of foreground data points included in the depth map 410 of the target object, there are data points 410b that are occluded by background data points, corresponding to I_occ in the aforementioned formula (11).
Therefore, in the foregoing embodiment, the electronic device can perform fast and accurate labeling on the visibility of the target object through calculation of the non-occlusion rate and the non-truncation rate, and can further reduce the calculation cost of the occlusion rate by introducing the low-resolution depth map, and improve the calculation accuracy of the occlusion rate, thereby further improving the speed and accuracy of the visibility labeling.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a data annotation device according to an embodiment, where the data annotation device can be applied to any one of the electronic devices. As shown in fig. 5, the data annotation device 500 may include: an identification module 510, a projection module 520, a first determination module 530, and a second determination module 540.
An identifying module 510, configured to identify a foreground data point corresponding to a target object and a background data point unrelated to the target object from environmental point cloud data collected by a radar sensor;
the projection module 520 is configured to project the foreground data point and the background data point to an imaging plane of the visual sensor, respectively, to obtain a first projection result corresponding to the foreground data point and a second projection result corresponding to the background data point;
a first determining module 530, configured to determine a non-truncation ratio of the target object according to the first projection result, and determine a non-occlusion ratio of the target object according to the first projection result and the second projection result;
and the second determining module 540 is configured to perform visibility labeling on the target object according to the non-truncation ratio and the non-occlusion ratio of the target object.
In one embodiment, the second determining module 540 is further configured to calculate the visibility score of the target object according to a product of the non-truncation ratio and the non-occlusion ratio of the target object; and if the visibility score of the target object is greater than or equal to the preset threshold, marking the visibility of the target object as true.
In one embodiment, the first determining module 530 may include: the device comprises a construction unit, a compression unit and a calculation unit.
A building unit, configured to build a first foreground depth map of the target object according to the image coordinates of the foreground data points included in the first projection result on the imaging plane and the depth values of the foreground data points corresponding to the target object;
the compression unit can be used for carrying out resolution compression operation on the first foreground depth map to obtain a second foreground depth map;
the construction unit is further used for constructing a first background depth map according to the image coordinates of each background data point on the imaging plane and the depth value of each background data point included in the second projection result;
the compression unit can also be used for carrying out resolution compression operation on the first background depth map to obtain a second background depth map;
and the calculating unit can be used for calculating the shielding rate of the target object according to the second foreground depth map and the second background depth map and calculating the non-shielding rate of the target object according to the shielding rate of the target object.
In one embodiment, the compression unit may be configured to determine a two-dimensional bounding box corresponding to the target object according to image coordinates of each foreground data point in the first foreground depth map; calculating a scale transformation coefficient corresponding to resolution compression operation according to the preset number of rows and columns of the low-resolution depth map and the height and width of the two-dimensional external frame; and carrying out resolution compression operation on the first foreground depth map according to the scale transformation coefficient to obtain a second foreground depth map.
Optionally, the first foreground depth map has at least two data points, which are mapped to the same first data point in the second foreground depth map after the resolution compression operation; the depth value of the first data point in the second foreground depth map is the minimum of the depth values of the at least two data points in the first foreground depth map.
In one embodiment, the computing unit may be further configured to compare corresponding depth values of the foreground data point and the background data point of the same image coordinate in the second foreground depth map and the second background depth map, respectively, and count the number of target data points in the second foreground depth map whose difference between the depth values exceeds the depth threshold; and determining the ratio of the number of the target data points to the total number of the foreground data points as the occlusion rate of the target object.
In one embodiment, the first determining module 530 may be further configured to count the number of foreground data points projected to the visual sensor outside the viewing angle range according to the first projection result; determining the ratio of the number of the foreground data points projected to be out of the view angle range to the total number of the foreground data points as the truncation ratio of the target object; and determining the non-truncation ratio of the target object according to the truncation ratio of the target object.
By implementing the data labeling device, the visibility of the target object can be labeled quickly and accurately through calculation of the non-occlusion rate and the non-truncation rate, and the visibility can be defined as the product of the non-occlusion rate and the non-truncation rate, which helps convert the binary classification problem of whether the target object is visible into a regression problem on this product, so that subsequent processing models can be supported more flexibly for further processing. In addition, the calculation cost of the occlusion rate can be reduced by introducing a low-resolution depth map, and the calculation accuracy of the occlusion rate is improved, so that the speed and the accuracy of the visibility annotation are further improved.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device according to an embodiment. As shown in fig. 6, the electronic device may include:
a memory 610 storing executable program code;
a processor 620 coupled to the memory 610;
the processor 620 calls the executable program code stored in the memory 610 to execute any one of the data annotation methods disclosed in the embodiments of the present application.
It should be noted that the electronic device shown in fig. 6 may further include components, which are not shown, such as a power supply, an input key, a camera, a speaker, a screen, an RF circuit, a Wi-Fi module, a bluetooth module, and a sensor, which are not described in detail in this embodiment.
The embodiment of the application discloses a computer-readable storage medium which stores a computer program, wherein when the computer program is processed by a processor, the processor realizes any one of the data annotation methods disclosed in the embodiment of the application.
Embodiments of the present application disclose a computer program product comprising a non-transitory computer readable storage medium storing a computer program, and the computer program is operable to cause a computer to perform any one of the data annotation methods disclosed in the embodiments of the present application.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. Those skilled in the art should also appreciate that the embodiments described in this specification are all alternative embodiments and that the acts and modules involved are not necessarily required for this application.
In various embodiments of the present application, it should be understood that the size of the serial number of each process described above does not mean that the execution sequence is necessarily sequential, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated units, if implemented as software functional units and sold or used as a stand-alone product, may be stored in a computer accessible memory. Based on such understanding, the technical solution of the present application, which is a part of or contributes to the prior art in essence, or all or part of the technical solution, may be embodied in the form of a software product, stored in a memory, including several requests for causing a computer device (which may be a personal computer, a server, a network device, or the like, and may specifically be a processor in the computer device) to execute part or all of the steps of the above-described method of the embodiments of the present application.
It will be understood by those skilled in the art that all or part of the steps in the methods of the embodiments described above may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium. The storage medium includes Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, magnetic disk memory, tape memory, or any other computer-readable medium that can be used to carry or store data.
The data annotation method, device, electronic device and storage medium disclosed in the embodiments of the present application are described in detail above, and a specific example is applied in the present application to explain the principle and the implementation of the present application. Meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A method for annotating data, the method comprising:
identifying foreground data points corresponding to a target object and background data points irrelevant to the target object from environmental point cloud data collected by a radar sensor;
respectively projecting the foreground data point and the background data point to an imaging plane of a visual sensor to obtain a first projection result corresponding to the foreground data point and a second projection result corresponding to the background data point;
determining the non-truncation ratio of the target object according to the first projection result, and determining the non-occlusion ratio of the target object according to the first projection result and the second projection result;
and carrying out visibility marking on the target object according to the non-truncation ratio and the non-occlusion ratio of the target object.
2. The method of claim 1, wherein the carrying out visibility marking on the target object according to the non-truncation ratio and the non-occlusion ratio of the target object comprises:
calculating the visibility score of the target object according to the product of the non-truncation ratio and the non-occlusion ratio of the target object;
and if the visibility score of the target object is greater than or equal to a preset threshold value, marking the visibility of the target object as true.
3. The method according to claim 1 or 2, wherein the determining the non-occlusion rate of the target object according to the first projection result and the second projection result comprises:
constructing a first foreground depth map of the target object according to the image coordinates, on the imaging plane, of each foreground data point included in the first projection result and the depth value of each foreground data point corresponding to the target object;
performing resolution compression operation on the first foreground depth map to obtain a second foreground depth map;
constructing a first background depth map according to the image coordinates of each background data point on the imaging plane and the depth value of each background data point included in the second projection result;
performing resolution compression operation on the first background depth map to obtain a second background depth map;
and calculating the shielding rate of the target object according to the second foreground depth map and the second background depth map, and calculating the non-shielding rate of the target object according to the shielding rate of the target object.
4. The method of claim 3, wherein performing a resolution compression operation on the first foreground depth map to obtain a second foreground depth map comprises:
determining a two-dimensional external frame corresponding to the target object according to the image coordinates of each foreground data point in the first foreground depth map;
calculating a scale transformation coefficient corresponding to resolution compression operation according to the preset number of rows and columns of the low-resolution depth map and the height and width of the two-dimensional external frame;
and carrying out resolution compression operation on the first foreground depth map according to the scale transformation coefficient to obtain a second foreground depth map.
5. The method of claim 3, wherein the first foreground depth map has at least two data points that are mapped to a same first data point in the second foreground depth map after a resolution compression operation; the depth value of the first data point in the second foreground depth map is the minimum of the depth values of the at least two data points in the first foreground depth map.
6. The method of claim 3, wherein the calculating the occlusion ratio of the target object according to the second foreground depth map and the second background depth map comprises:
for foreground data points and background data points having the same image coordinates, comparing the corresponding depth values in the second foreground depth map and the second background depth map respectively, and counting the number of target data points for which the difference between the depth value in the second foreground depth map and the corresponding depth value in the second background depth map exceeds a depth threshold;
and determining the ratio between the number of the target data points and the total number of foreground data points as the occlusion ratio of the target object.
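A sketch of the occlusion count in claim 6, comparing the two compressed depth maps cell by cell; a cell is counted when the background depth is closer than the foreground depth by more than `depth_thresh` (the 0.5 value is an assumption), and the denominator follows the claim's total number of foreground data points.

```python
import numpy as np

def occlusion_ratio(low_fg, low_bg, total_fg_points, depth_thresh=0.5):
    """Cell-level comparison of the compressed foreground/background depth maps."""
    both = np.isfinite(low_fg) & np.isfinite(low_bg)    # same cell occupied by both
    occluded = both & (low_fg - low_bg > depth_thresh)  # background lies in front
    return occluded.sum() / max(total_fg_points, 1)

# non_occlusion = 1.0 - occlusion_ratio(low_fg, low_bg, len(fg_uv))
```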
7. The method of any one of claims 1-6, wherein the determining the non-truncation ratio of the target object according to the first projection result comprises:
counting, according to the first projection result, the number of foreground data points projected outside the field of view of the visual sensor;
determining the ratio between the number of foreground data points projected outside the field of view and the total number of foreground data points as the truncation ratio of the target object;
and determining the non-truncation ratio of the target object according to the truncation ratio of the target object.
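A sketch of the truncation and non-truncation ratios in claim 7; points projected behind the camera are also treated as outside the field of view, which is an assumption of this sketch.

```python
import numpy as np

def non_truncation_ratio(fg_uv, fg_depth, img_w, img_h):
    """Share of foreground points whose projection stays inside the image."""
    outside = ((fg_uv[:, 0] < 0) | (fg_uv[:, 0] >= img_w) |
               (fg_uv[:, 1] < 0) | (fg_uv[:, 1] >= img_h) |
               (fg_depth <= 0))                       # behind the camera
    truncation = outside.sum() / max(len(fg_uv), 1)
    return 1.0 - truncation
```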
8. A data annotation device, comprising:
the identification module is used for identifying foreground data points corresponding to a target object and background data points irrelevant to the target object from environmental point cloud data collected by a radar sensor;
the projection module is used for projecting the foreground data points and the background data points respectively onto an imaging plane of a visual sensor to obtain a first projection result corresponding to the foreground data points and a second projection result corresponding to the background data points;
the first determining module is used for determining a non-truncation ratio of the target object according to the first projection result and determining a non-occlusion ratio of the target object according to the first projection result and the second projection result;
and the second determining module is used for performing visibility marking on the target object according to the non-truncation ratio and the non-occlusion ratio of the target object.
9. An electronic device, comprising a memory and a processor, wherein a computer program is stored in the memory, and wherein the computer program, when executed by the processor, causes the processor to carry out the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the method according to any one of claims 1 to 7.
CN202210195145.6A 2022-02-28 2022-02-28 Data labeling method and device, electronic equipment and storage medium Pending CN114638947A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210195145.6A CN114638947A (en) 2022-02-28 2022-02-28 Data labeling method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210195145.6A CN114638947A (en) 2022-02-28 2022-02-28 Data labeling method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114638947A true CN114638947A (en) 2022-06-17

Family

ID=81948147

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210195145.6A Pending CN114638947A (en) 2022-02-28 2022-02-28 Data labeling method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114638947A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117237924A (en) * 2023-11-13 2023-12-15 深圳元戎启行科技有限公司 Obstacle visibility analysis method and device, intelligent terminal and storage medium
CN117237924B (en) * 2023-11-13 2024-03-29 深圳元戎启行科技有限公司 Obstacle visibility analysis method and device, intelligent terminal and storage medium

Similar Documents

Publication Publication Date Title
WO2022083402A1 (en) Obstacle detection method and apparatus, computer device, and storage medium
CN109087510B (en) Traffic monitoring method and device
CN110263652B (en) Laser point cloud data identification method and device
EP3581890A2 (en) Method and device for positioning
CN113902897B (en) Training of target detection model, target detection method, device, equipment and medium
CN109918977B (en) Method, device and equipment for determining idle parking space
WO2022217630A1 (en) Vehicle speed determination method and apparatus, device, and medium
CN112771575A (en) Distance determination method, movable platform and computer readable storage medium
CN112683228A (en) Monocular camera ranging method and device
CN113255444A (en) Training method of image recognition model, image recognition method and device
CN116245937A (en) Method and device for predicting stacking height of goods stack, equipment and storage medium
CN114919584A (en) Motor vehicle fixed point target distance measuring method and device and computer readable storage medium
CN114638947A (en) Data labeling method and device, electronic equipment and storage medium
CN113838125A (en) Target position determining method and device, electronic equipment and storage medium
CN113256709A (en) Target detection method, target detection device, computer equipment and storage medium
CN115147333A (en) Target detection method and device
CN112902911A (en) Monocular camera-based distance measurement method, device, equipment and storage medium
CN112477868B (en) Collision time calculation method and device, readable storage medium and computer equipment
CN115240168A (en) Perception result obtaining method and device, computer equipment and storage medium
CN114842443A (en) Target object identification and distance measurement method, device and equipment based on machine vision and storage medium
CN113869440A (en) Image processing method, apparatus, device, medium, and program product
CN111383268A (en) Vehicle distance state acquisition method and device, computer equipment and storage medium
CN112348876A (en) Method and device for acquiring space coordinates of signboards
CN111251994B (en) Method and system for detecting objects around vehicle
CN116295237A (en) Monocular camera ranging method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination