CN116416601A - Labeling method, device, medium and program product applied to automatic driving simulation

Info

Publication number: CN116416601A
Application number: CN202310349881.7A
Authority: CN (China)
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 钟娟, 张峻川, 方涛, 许雪娟, 童立平, 余恒
Applicant and current assignee: Hefei Furui Microelectronics Co., Ltd.

Classifications

    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/584 Recognition of vehicle lights or traffic lights
    • G06V20/647 Three-dimensional objects by matching two-dimensional images to three-dimensional objects
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations
    • G06F30/20 Design optimisation, verification or simulation
    • Y02T10/40 Engine management systems

Abstract

An embodiment of the present application provides a labeling method applied to automatic driving simulation, comprising the following steps: acquiring a homogeneous radar point cloud; acquiring the instance numbers of all points in the homogeneous radar point cloud, and generating a visible object number list according to the instance numbers of the points; generating a matching list according to the homogeneous radar point cloud and the visible object number list, wherein the matching list comprises a plurality of indexes and a plurality of contents in one-to-one correspondence, at least one index is an instance number in the visible object number list, and the contents comprise points in the homogeneous radar point cloud; acquiring camera information, which comprises the internal and external parameters of the virtual camera outputting the image to be annotated and the position information of the virtual camera in three-dimensional space; and marking a two-dimensional bounding box of the detected object on the camera plane of the virtual camera according to the matching list and the camera information.

Description

Labeling method, device, medium and program product applied to automatic driving simulation
Technical Field
The present disclosure relates to the field of autopilot technology, and in particular, to the field of autopilot simulation technology, and more particularly, to a labeling method, apparatus, electronic device, storage medium, and program product applied to autopilot simulation.
Background
Developing and verifying autonomous driving algorithms requires large amounts of data for training and testing, such as sensor data including visual images and lidar point clouds, and this data must be matched with ground-truth values such as the two-dimensional and three-dimensional bounding boxes of the detected objects. Collecting such data with real vehicles and labeling the ground truth manually is extremely costly. Automatic driving simulation software has therefore attracted increasing attention: it can rapidly generate virtual sensor data in batches and, through development, label ground-truth values automatically.
In existing automatic driving simulation software, a user can conveniently obtain the real-time position of an object in three-dimensional space. However, obtaining the two-dimensional bounding box of an object in an image captured by a virtual camera in the simulation requires secondary development. In existing automatic labeling methods for two-dimensional bounding boxes, the vertices of an object's three-dimensional bounding box are projected directly onto the two-dimensional camera plane to obtain the two-dimensional bounding box; part of the resulting labels are incorrect, and for objects such as vehicles the labeled two-dimensional bounding boxes are neither accurate nor tight.
The above information disclosed in this section is only for understanding the background of the technical idea of the present application, and thus, the above information may contain information that does not constitute the prior art.
Disclosure of Invention
In view of the foregoing, the present application provides a labeling method, apparatus, electronic device, storage medium, and program product applied to automatic driving simulation.
According to a first aspect of the present application, there is provided a labeling method applied to automated driving simulation, the method comprising:
obtaining a homogeneous radar point cloud, each point in the homogeneous radar point cloud containing the following information of the detected object: the instance number of the detected object, the class number of the detected object and the three-dimensional spatial position information of the point, wherein the class numbers of all points in the homogeneous radar point cloud are the same;
obtaining the instance numbers of all points in the homogeneous radar point cloud, and generating a visible object number list according to the instance numbers of the points;
generating a matching list according to the homogeneous radar point cloud and the visible object number list, wherein the matching list comprises a plurality of indexes and a plurality of contents in one-to-one correspondence, at least one index among the plurality of indexes is an instance number in the visible object number list, and the plurality of contents comprise points in the homogeneous radar point cloud;
acquiring camera information, wherein the camera information comprises the internal parameters and external parameters of a virtual camera outputting an image to be annotated and the position information of the virtual camera in three-dimensional space; and
marking a two-dimensional bounding box of the detected object on the camera plane of the virtual camera according to the matching list and the camera information,
wherein the marking of the two-dimensional bounding box of the detected object on the camera plane of the virtual camera according to the matching list and the camera information specifically comprises:
projecting the points in the contents corresponding to the same index in the matching list onto the two-dimensional camera plane of the virtual camera to obtain a plurality of projection points located on the camera plane; and
determining a two-dimensional bounding box surrounding the plurality of projection points, the two-dimensional bounding box surrounding the plurality of projection points being the two-dimensional bounding box of the detected object on the two-dimensional camera plane of the virtual camera.
According to some exemplary embodiments, the acquiring of the homogeneous radar point cloud comprises:
acquiring an initial radar point cloud, each point in the initial radar point cloud containing the following information of the detected object: the instance number of the detected object, the class number of the detected object, and the three-dimensional spatial position information of the point; and
selecting points with the same class number from the initial radar point cloud to form the homogeneous radar point cloud.
According to some exemplary embodiments, the method further comprises: acquiring a background static object list, wherein the background static object list comprises the three-dimensional spatial position information and the three-dimensional bounding box vertex position information of each static object in the simulation environment.
According to some exemplary embodiments, the selecting of points with the same class number from the initial radar point cloud to form the homogeneous radar point cloud specifically comprises: selecting points whose class number corresponds to vehicles from the initial radar point cloud to form a first homogeneous radar point cloud.
According to some exemplary embodiments, the generating a matching list according to the homogeneous radar point cloud and the visible object number list specifically includes:
matching the list of visible object numbers with the list of background static objects to form a first matching result,
the first matching result comprises a first sub-list and a second sub-list;
the first sub-list comprises a plurality of indexes and a plurality of contents, the plurality of indexes in the first sub-list comprise instance numbers of running vehicles recorded in the visible object number list, the instance numbers of the running vehicles are not equal to a first specified value, wherein the first specified value represents the instance numbers of parked vehicles, and the plurality of contents in the first sub-list are all empty;
The second sub-list comprises a plurality of indexes and a plurality of contents, the plurality of indexes in the second sub-list comprise parked vehicles in the background static object list, and the plurality of contents in the second sub-list are all empty.
According to some exemplary embodiments, the generating a matching list according to the homogeneous radar point cloud and the visible object number list specifically further includes:
screening out the points whose instance number is not equal to the first specified value from the first homogeneous radar point cloud to form a point cloud of running vehicles; and
storing each point in the point cloud of running vehicles into the corresponding content in the first sub-list according to the correspondence between each index in the first sub-list and the instance numbers of the points in the point cloud of running vehicles, to obtain an updated first sub-list.
According to some exemplary embodiments, the generating a matching list according to the homogeneous radar point cloud and the visible object number list specifically further includes:
screening out the points whose instance number is equal to the first specified value from the first homogeneous radar point cloud to form a point cloud of parked vehicles;
traversing the background static object list for each point in the point cloud of parked vehicles;
when the i-th point in the point cloud of parked vehicles is enclosed in the three-dimensional bounding box of the j-th static object in the background static object list, determining a correspondence between the i-th point and the j-th static object, wherein 1 ≤ i ≤ N and 1 ≤ j ≤ M, N being the number of points in the point cloud of parked vehicles and M being the number of static objects in the background static object list; and
storing each point in the point cloud of parked vehicles into the corresponding content in the second sub-list according to the correspondence, to obtain an updated second sub-list,
wherein the matching list comprises the updated first sub-list and the updated second sub-list.
According to some exemplary embodiments, the selecting of points with the same class number from the initial radar point cloud to form the homogeneous radar point cloud specifically comprises: selecting points whose class number corresponds to pedestrians from the initial radar point cloud to form a second homogeneous radar point cloud.
According to some exemplary embodiments, the generating a matching list according to the homogeneous radar point cloud and the visible object number list specifically includes:
forming a first matching result according to the visible object number list,
wherein the first matching result comprises a third sub-list; and
the third sub-list comprises a plurality of indexes and a plurality of contents, the indexes in the third sub-list being the instance numbers of pedestrians recorded in the visible object number list, and the contents in the third sub-list are all empty.
According to some exemplary embodiments, the generating a matching list according to the homogeneous radar point cloud and the visible object number list specifically further includes:
storing each point in the second homogeneous radar point cloud into the corresponding content in the third sub-list according to the correspondence between each index in the third sub-list and the instance numbers of the points in the second homogeneous radar point cloud, to obtain an updated third sub-list.
According to some exemplary embodiments, the method further comprises: acquiring a pedestrian controllable foreground object list, wherein the pedestrian controllable foreground object list comprises the instance number, three-dimensional spatial position information and three-dimensional bounding box vertex position information of each pedestrian in the foreground of the simulation environment.
According to some exemplary embodiments, the method further comprises:
comparing each index in the third sub-list with the instance numbers of the pedestrians in the pedestrian controllable foreground object list; and
when an index in the third sub-list is not in the pedestrian controllable foreground object list, removing the index and its corresponding content from the third sub-list to update the third sub-list, and determining the updated third sub-list as a fourth sub-list,
wherein the matching list comprises the fourth sub-list.
According to some exemplary embodiments, the method further comprises:
projecting the three-dimensional bounding box vertex position information of each background static object in the background static object list onto the two-dimensional camera plane to obtain a two-dimensional bounding box list of the background static objects;
obtaining an instance segmentation map, wherein the virtual camera outputting the instance segmentation map has the same configuration as the virtual camera outputting the image to be annotated, and each pixel in the instance segmentation map comprises the class number of the object to which the pixel belongs;
for each two-dimensional bounding box in the two-dimensional bounding box list of the background static objects: determining, according to the class numbers of the pixels in the instance segmentation map, the number of pixels within the two-dimensional bounding box that belong to the same background static object; calculating the ratio of this number of pixels to the total number of pixels in the two-dimensional bounding box; retaining the two-dimensional bounding box if the ratio exceeds a predetermined threshold; and discarding the two-dimensional bounding box if the ratio does not exceed the predetermined threshold; and
taking the resulting two-dimensional bounding box list as the labeling result of the two-dimensional bounding boxes of the background static objects on the camera plane of the virtual camera.
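As an illustration of this visibility check, the following sketch retains a projected two-dimensional bounding box only when a sufficient fraction of its pixels carries the object's class number. It is a minimal sketch: the names, the box format (u_min, v_min, u_max, v_max) in pixel coordinates, the HxW array seg_map of class numbers, and the threshold value are all assumptions for illustration, not values fixed by the present application.

    import numpy as np

    def filter_boxes_by_visibility(boxes, class_ids, seg_map, threshold=0.3):
        # Keep a projected 2D box only if the fraction of its pixels whose
        # class number matches the object's class exceeds the threshold.
        kept = []
        for (u0, v0, u1, v1), cid in zip(boxes, class_ids):
            patch = seg_map[int(v0):int(v1) + 1, int(u0):int(u1) + 1]
            if patch.size == 0:
                continue  # box fell entirely outside the image
            ratio = np.count_nonzero(patch == cid) / patch.size
            if ratio > threshold:
                kept.append((u0, v0, u1, v1))  # mostly visible: retain
        return kept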
According to some exemplary embodiments, the static objects in the background static object list include at least one of traffic lights, traffic signs, guardrails, and trees.
According to a second aspect of the present application, there is provided an annotation device for use in automated driving simulation, the device comprising:
a homogeneous radar point cloud acquisition module, configured to: obtain a homogeneous radar point cloud, each point in the homogeneous radar point cloud containing the following information of the detected object: the instance number of the detected object, the class number of the detected object and the three-dimensional spatial position information of the point, wherein the class numbers of all points in the homogeneous radar point cloud are the same;
a visible object number list generation module, configured to: obtain the instance numbers of all points in the homogeneous radar point cloud, and generate a visible object number list according to the instance numbers of the points;
a matching list generation module, configured to: generate a matching list according to the homogeneous radar point cloud and the visible object number list, wherein the matching list comprises a plurality of indexes and a plurality of contents in one-to-one correspondence, at least one index among the plurality of indexes is an instance number in the visible object number list, and the plurality of contents comprise points in the homogeneous radar point cloud;
a camera information acquisition module, configured to: acquire camera information, wherein the camera information comprises the internal parameters and external parameters of a virtual camera outputting an image to be annotated and the position information of the virtual camera in three-dimensional space; and
a two-dimensional bounding box labeling module, configured to: mark a two-dimensional bounding box of the detected object on the camera plane of the virtual camera according to the matching list and the camera information,
wherein the marking of the two-dimensional bounding box of the detected object on the camera plane of the virtual camera according to the matching list and the camera information specifically comprises:
projecting the points in the contents corresponding to the same index in the matching list onto the two-dimensional camera plane of the virtual camera to obtain a plurality of projection points located on the camera plane; and
determining a two-dimensional bounding box surrounding the plurality of projection points, the two-dimensional bounding box surrounding the plurality of projection points being the two-dimensional bounding box of the detected object on the two-dimensional camera plane of the virtual camera.
According to a third aspect of the present application, there is provided an electronic device comprising: one or more processors; and a storage device for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method as described above.
According to a fourth aspect of the present application there is provided a computer readable storage medium having stored thereon executable instructions which when executed by a processor cause the processor to perform a method as described above.
According to a fifth aspect of the present application, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method as described above.
Drawings
The foregoing and other objects, features and advantages of the application will be more apparent from the following description of embodiments of the application with reference to the accompanying drawings in which:
FIG. 1 schematically illustrates a scenario diagram of an autopilot simulation according to an embodiment of the present application;
FIG. 2 is a flowchart of an annotation method applied to autopilot simulation according to some exemplary embodiments of the present application;
FIG. 3 is a flowchart of a labeling method for labeling a vehicle according to some example embodiments of the present application;
FIG. 4 is an exemplary flowchart of the step of generating a matching list in the method shown in FIG. 3;
FIG. 5 is a flowchart of a labeling method for labeling pedestrians, according to some example embodiments of the present application;
FIG. 6 is an exemplary flowchart of the step of generating a matching list in the method shown in FIG. 5;
FIG. 7 is an exemplary flowchart of the steps of labeling a two-dimensional bounding box of a pedestrian in the method shown in FIG. 5;
FIG. 8 is a flowchart of a labeling method for labeling static objects such as traffic lights, according to some example embodiments of the present application;
FIG. 9 schematically illustrates a block diagram of an annotation device applied to autopilot simulation according to an embodiment of the present application; and
FIG. 10 schematically shows a block diagram of an electronic device adapted to implement the labeling method according to an embodiment of the present application.
Detailed Description
Hereinafter, embodiments of the present application will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present application. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present application. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present application.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where an expression like "at least one of A, B and C" is used, it should generally be interpreted according to the meaning commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" includes, but is not limited to, a system having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together).
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one kind of information from another. For example, without departing from the scope of the present application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. The word "if" as used herein may be interpreted as "when", "upon" or "in response to determining", depending on the context.
First, technical terms appearing herein are explained as follows:
and (3) point cloud: and expressing mass point cloud data of the target space distribution and the target surface characteristics under the same space reference system. For example, the point cloud data may be acquired by a sensor such as a laser radar, millimeter wave radar, or the like.
Lei Dadian cloud: and the point cloud data is acquired through sensors such as laser radar, millimeter wave radar and the like.
Automatic driving simulation: the application scene of the automatic driving is digitally restored in a mathematical modeling mode, a system model which is as close to the real world as possible is established, and analysis and research are carried out through simulation test, so that the aim of testing and verifying an automatic driving system and algorithm is fulfilled. To realize the autopilot simulation, autopilot simulation software is gradually developed. An autopilot simulation software mainly comprises the following parts: static scene construction: editing capability of static scenes such as roads and surrounding infrastructure thereof, and capability of constructing or generating road environments of large-scale virtual scenes according to a real road network or a high-precision map; dynamic scene simulation: it is possible to simulate the actions of various traffic participants and generate traffic flows similar to real traffic scenes, or to reproduce the current scenes from real data; sensor model: the sensor module is provided with or supports the automatic driving sensor modules such as a camera, a laser radar, a millimeter wave radar, a GPS, an ultrasonic radar, an IMU and the like, and supports various output modes of the sensor module; vehicle dynamics model: on-board or supporting editable vehicle dynamics models; docking an ADAS with a rich interface of an autopilot system. Various types of automatic driving simulation software or platforms in the market at present comprise the parts and have the characteristics. For example, CARLA is an autopilot simulation software or platform for development, training and validation of autopilot systems. Calla relies on the illusion engine for development, using a server and multi-client architecture. The CARLA provides simple automatic behavior simulation of vehicles and pedestrians, also provides a whole set of Python interfaces, can control vehicles or signal lamps and the like in a scene, is used for conveniently carrying out joint simulation with an automatic driving system, and completes decision-making system and end-to-end reinforcement learning training.
Three-dimensional bounding box: a cuboid that can completely enclose an object in three-dimensional space.
Two-dimensional bounding box: a rectangle that can completely enclose an object in a two-dimensional plane.
Instance number: a unique number that identifies a single object in the simulation environment.
Class number: a unique number that identifies a class of objects in the simulation environment.
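By way of illustration, the following is a minimal sketch of obtaining a semantic radar point cloud through CARLA's Python interface. The blueprint name sensor.lidar.ray_cast_semantic and the detection fields point, object_idx and object_tag follow CARLA's published API, but the host, port, sensor attributes and mounting transform are illustrative assumptions, so the snippet should be read as a sketch rather than a verified recipe:

    import carla

    # Connect to a running CARLA server (default port 2000).
    client = carla.Client("localhost", 2000)
    world = client.get_world()

    # A semantic lidar reports, for every point, the hit object's instance
    # index and semantic tag in addition to the 3D hit position.
    bp = world.get_blueprint_library().find("sensor.lidar.ray_cast_semantic")
    bp.set_attribute("range", "100")
    lidar = world.spawn_actor(bp, carla.Transform(carla.Location(z=2.0)))

    def on_scan(measurement):
        for det in measurement:
            # det.point: 3D position; det.object_idx: instance number;
            # det.object_tag: class (semantic) number of the detected object
            print(det.point, det.object_idx, det.object_tag)

    lidar.listen(on_scan)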
FIG. 1 schematically illustrates a scenario of automatic driving simulation according to an embodiment of the present application. Illustratively, the image shown in FIG. 1 may be one frame of a simulated scene generated by automatic driving simulation software, for example CARLA. In the example shown in FIG. 1, a three-dimensional annotation box 100 for a vehicle is schematically shown.
In automatic driving simulation software such as CARLA, each frame of the provided simulation may include the following information:
(1) Radar point cloud: each point in the radar point cloud contains the following information of the detected object: the instance number of the detected object, the class number of the detected object, and the three-dimensional spatial position information of the point;
(2) Instance segmentation map: the virtual camera outputting the instance segmentation map must have the same configuration as the virtual camera outputting the image to be annotated (identical configuration meaning, for example, that the internal and external parameters of the cameras are identical), and each pixel of the instance segmentation map comprises the class number of the object to which the pixel belongs;
(3) Camera information: the internal parameters and external parameters of the virtual camera, and the position information of the camera in three-dimensional space;
(4) Controllable foreground object list: a list of all objects of a certain class (e.g. all running vehicles, or all pedestrians) in the current simulation environment; each item in the list comprises the instance number of an object, the position information of the object in three-dimensional space, and the vertex position information of the object's three-dimensional bounding box;
(5) Background static object list: a list of all static objects of a certain class (e.g. all parked vehicles, traffic lights, or traffic signs) in the current simulation environment; each item in the list comprises the position information of the object in three-dimensional space and the vertex position information of the object's three-dimensional bounding box.
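For concreteness, this per-frame information might be held in structures like the following. This is a hypothetical representation whose names are invented for illustration, not the simulator's actual types:

    from dataclasses import dataclass, field
    import numpy as np

    @dataclass
    class RadarPoint:
        instance_id: int   # instance number of the detected object
        class_id: int      # class number of the detected object
        xyz: tuple         # 3D position of the point

    @dataclass
    class SceneObject:
        position: tuple        # position in 3D space
        bbox_vertices: list    # the 8 vertices of the 3D bounding box
        instance_id: int = -1  # carried by controllable foreground objects

    @dataclass
    class Frame:
        point_cloud: list         # list of RadarPoint
        instance_seg: np.ndarray  # HxW map of class numbers
        camera_K: np.ndarray      # 3x3 intrinsic matrix
        camera_pose: np.ndarray   # 4x4 camera pose in the world frame
        foreground_objects: list = field(default_factory=list)
        background_objects: list = field(default_factory=list)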
In the actual development and testing process, a two-dimensional bounding box of an object in an image captured by a virtual camera in simulation needs to be obtained.
For example, the documentation bundled with the CARLA software mentions a method for labeling the two-dimensional bounding boxes of vehicles and traffic lights: obtain the list of three-dimensional bounding boxes of the detected objects, project the three-dimensional bounding box vertices onto the two-dimensional camera plane, and obtain the list of two-dimensional bounding boxes. However, the inventors found that this method does not consider the occlusion relationships among objects, so the labeling results of the two-dimensional bounding boxes are inaccurate.
For another example, some open-source projects built on CARLA use the points of the radar point cloud and the three-dimensional bounding boxes in the controllable foreground object list to filter out the controllable foreground objects that are not occluded, and then project the three-dimensional bounding box vertices of those objects onto the two-dimensional camera plane to obtain the two-dimensional bounding boxes. However, the inventors found that such projects label only running vehicles and do not cover parked vehicles; moreover, because the two-dimensional bounding box is still obtained directly by projecting the object's three-dimensional bounding box vertices onto the two-dimensional camera plane, the box tends to be inaccurate and loose for vehicles.
On this basis, an embodiment of the present application provides a labeling method applied to automatic driving simulation, the method comprising: obtaining a homogeneous radar point cloud, each point of which contains the following information of the detected object: the instance number of the detected object, the class number of the detected object and the three-dimensional spatial position information of the point, wherein the class numbers of all points in the homogeneous radar point cloud are the same; obtaining the instance numbers of all points in the homogeneous radar point cloud, and generating a visible object number list according to the instance numbers of the points; generating a matching list according to the homogeneous radar point cloud and the visible object number list, wherein the matching list comprises a plurality of indexes and a plurality of contents in one-to-one correspondence, at least one of the indexes is an instance number in the visible object number list, and the contents comprise points in the homogeneous radar point cloud; acquiring camera information, wherein the camera information comprises the internal parameters and external parameters of a virtual camera outputting an image to be annotated and the position information of the virtual camera in three-dimensional space; and marking a two-dimensional bounding box of the detected object on the camera plane of the virtual camera according to the matching list and the camera information, which specifically comprises: projecting the points in the contents corresponding to the same index in the matching list onto the two-dimensional camera plane of the virtual camera to obtain a plurality of projection points located on the camera plane; and determining a two-dimensional bounding box surrounding the plurality of projection points, which is the two-dimensional bounding box of the detected object on the two-dimensional camera plane of the virtual camera. In the labeling method provided by the embodiments of the present application, the two-dimensional bounding box is obtained by enclosing the two-dimensional projections of the points of the segmented radar point cloud rather than the two-dimensional projections of three-dimensional bounding box vertices, so the labeled two-dimensional bounding box is tighter and more accurate.
In exemplary embodiments of the present application, the labeling method can label multiple classes of objects in the simulation, for example vehicles, pedestrians, traffic lights and traffic signs.
Fig. 2 is a flowchart of a labeling method applied to autopilot simulation, which may include steps S210-S250, according to some example embodiments of the present application.
In the embodiments of the present application, unless otherwise specifically stated, the order of execution of the steps included in the method is not limited to the order in which they are described herein, and the steps may be executed in parallel, or may be executed in any other order, without conflict.
In step S210, a homogeneous radar point cloud is acquired; each point in the homogeneous radar point cloud contains the following information of the detected object: the instance number of the detected object, the class number of the detected object and the three-dimensional spatial position information of the point, and the class numbers of all points in the homogeneous radar point cloud are the same.
In an embodiment of the present application, acquiring the homogeneous radar point cloud may include: acquiring an initial radar point cloud, each point of which contains the following information of the detected object: the instance number of the detected object, the class number of the detected object, and the three-dimensional spatial position information of the point; and selecting points with the same class number from the initial radar point cloud to form the homogeneous radar point cloud.
In some exemplary embodiments, selecting points with the same class number from the initial radar point cloud to form the homogeneous radar point cloud may specifically include: selecting points whose class number corresponds to vehicles from the initial radar point cloud to form a first homogeneous radar point cloud.
Alternatively or additionally, selecting points with the same class number from the initial radar point cloud to form the homogeneous radar point cloud may specifically include: selecting points whose class number corresponds to pedestrians from the initial radar point cloud to form a second homogeneous radar point cloud.
That is, in embodiments of the present application, the homogeneous radar point cloud may be the radar point cloud of all vehicles in the simulation, of all pedestrians, of all static objects such as traffic lights or traffic signs, and so on. The selection sketch below illustrates this step.
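As an illustration of this selection, filtering the initial point cloud by class number might look as follows, reusing the hypothetical RadarPoint structure sketched earlier; the tag values are assumptions, since actual class numbers are simulator-specific:

    # Illustrative class numbers; the actual values are simulator-specific.
    VEHICLE_TAG = 10
    PEDESTRIAN_TAG = 4

    def homogeneous_cloud(initial_cloud, class_id):
        # Keep only the points whose class number equals class_id.
        return [p for p in initial_cloud if p.class_id == class_id]

    # first_homogeneous = homogeneous_cloud(frame.point_cloud, VEHICLE_TAG)
    # second_homogeneous = homogeneous_cloud(frame.point_cloud, PEDESTRIAN_TAG)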
In step S220, the instance numbers of the points in the similar radar point cloud are obtained, and a visible object number list is generated according to the instance numbers of the points.
For example, in the case where the radar point clouds of the same kind are those of all vehicles in the simulation, the visible object number list includes instance numbers of respective points in the radar point clouds of all vehicles in the simulation.
In the case where the homogeneous class radar point cloud is a radar point cloud of all pedestrians in the simulation, the visible object number list includes instance numbers of respective points in the radar point cloud of all pedestrians in the simulation.
In the case where the homogeneous radar point cloud is a radar point cloud of all traffic lights and traffic signs in the simulation, the list of visible object numbers includes instance numbers of each point in the radar point cloud of all traffic lights and traffic signs in the simulation.
In embodiments of the present application, the objects detected by the points of the radar point cloud already reflect the occlusion relationships among objects; therefore, the instances corresponding to the instance numbers in the visible object number list already account for occlusion, i.e., the occlusion problem is resolved at this stage. A sketch of this step follows.
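Under the same assumed structures (the helper name is invented), the visible object number list is simply the set of distinct instance numbers appearing in the homogeneous cloud, so fully occluded objects never enter it:

    def visible_object_numbers(cloud):
        # Distinct instance numbers of all points; only objects actually
        # hit by at least one ray appear here.
        return sorted({p.instance_id for p in cloud})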
In step S230, a matching list is generated according to the similar radar point cloud and the visible object number list, where the matching list includes a plurality of indexes and a plurality of contents, the plurality of contents and the plurality of indexes are respectively in one-to-one correspondence, at least one index in the plurality of indexes includes an instance number in the visible object number list, and the plurality of contents includes a point in the similar radar point cloud.
That is, the matching list includes a plurality of items, each item comprising an index and a content in one-to-one correspondence, where the index is an instance number in the visible object number list and the content is the set of points in the radar point cloud belonging to that instance; in the matching list, instance numbers are thus matched with points of the radar point cloud.
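In code, such a matching list is naturally a dictionary keyed by instance number; a minimal sketch under the assumptions above:

    from collections import defaultdict

    def build_matching_list(cloud, visible_ids):
        # Map each visible instance number (index) to the list of points
        # detected on that object (content).
        visible = set(visible_ids)
        matching = defaultdict(list)
        for p in cloud:
            if p.instance_id in visible:
                matching[p.instance_id].append(p)
        return matching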
In step S240, camera information is acquired; the camera information includes the internal parameters and external parameters of the virtual camera outputting the image to be annotated and the position information of the virtual camera in three-dimensional space.
For example, the two-dimensional bounding boxes may be annotated in the output image of a virtual camera; this camera may be referred to as the virtual camera outputting the image to be annotated.
It should be understood that the camera's internal parameters (intrinsics) are parameters related to the characteristics of the camera itself, such as its focal length and pixel size; the external parameters (extrinsics) describe the positional relationship between the camera and the radar; and the "position information of the camera in three-dimensional space" refers to the camera's parameters in the world coordinate system, such as its position and rotation.
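For example, for a pinhole virtual camera the intrinsic matrix can be assembled from the image size and the horizontal field of view, a formulation commonly used with simulators such as CARLA; a sketch:

    import math
    import numpy as np

    def intrinsic_matrix(width, height, fov_deg):
        # 3x3 pinhole intrinsics from image size and horizontal FOV.
        f = width / (2.0 * math.tan(math.radians(fov_deg) / 2.0))
        return np.array([[f,   0.0, width / 2.0],
                         [0.0, f,   height / 2.0],
                         [0.0, 0.0, 1.0]])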
In step S250, a two-dimensional bounding box of the detected object on the camera plane of the virtual camera is marked according to the matching list and the camera information.
In an embodiment of the present application, marking the two-dimensional bounding box of the detected object on the camera plane of the virtual camera according to the matching list and the camera information may specifically include: projecting the points in the content corresponding to the same index in the matching list onto the two-dimensional camera plane of the virtual camera to obtain a plurality of projection points located on the camera plane; and determining a two-dimensional bounding box surrounding the plurality of projection points, this box being the two-dimensional bounding box of the detected object on the two-dimensional camera plane of the virtual camera.
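A minimal sketch of this projection-and-enclosure step under the assumptions above: the points are transformed by a 4x4 world-to-camera extrinsic matrix and projected through the intrinsic matrix K, and the axis-aligned rectangle enclosing the projections is the label. The camera frame is assumed to have its optical axis along +z.

    import numpy as np

    def bbox_from_points(points, K, world_to_cam):
        # Project one object's radar points and enclose them in a 2D box.
        us, vs = [], []
        for p in points:
            pw = np.array([p.xyz[0], p.xyz[1], p.xyz[2], 1.0])
            pc = world_to_cam @ pw   # world frame -> camera frame
            if pc[2] <= 0:           # behind the image plane: skip
                continue
            uvw = K @ pc[:3]
            us.append(uvw[0] / uvw[2])
            vs.append(uvw[1] / uvw[2])
        if not us:
            return None              # object not visible in this camera
        return (min(us), min(vs), max(us), max(vs))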
In some exemplary embodiments, the method may further include: obtaining a background static object list for a certain class of background objects, wherein the background static object list comprises the three-dimensional spatial position information and the three-dimensional bounding box vertex position information of each static object in the simulation environment.
In some exemplary embodiments, the method may further include: obtaining a controllable foreground object list for a certain class of foreground objects; for example, the pedestrian controllable foreground object list comprises the instance number, three-dimensional spatial position information and three-dimensional bounding box vertex position information of each pedestrian in the foreground of the simulation environment.
In the following, embodiments of the present application are described in further detail taking four classes of objects as examples: vehicles, pedestrians, traffic lights and traffic signs.
It should be noted that, in the simulation scenarios provided by the embodiments of the present application, vehicles include running vehicles and parked vehicles, where a running vehicle is a foreground dynamic object and a parked vehicle is a background static object; a pedestrian is a foreground dynamic object; and traffic lights and traffic signs are background static objects. These four classes are representative of foreground dynamic objects and background static objects, and embodiments of the present application are not limited to them.
FIG. 3 is a flowchart of a labeling method for labeling a vehicle, which may include steps S310-S340, according to some example embodiments of the present application.
In step S310, an initial radar point cloud is acquired; each point in the initial radar point cloud contains the following information of the detected object: the instance number of the detected object, the class number of the detected object, and the three-dimensional spatial position information of the point.
In some exemplary embodiments, the points in the initial radar point cloud correspond to objects of multiple classes in the simulated scene, e.g. vehicles, pedestrians, traffic lights, traffic signs, trees and guardrails in the simulation environment.
In step S320, the initial radar point cloud is processed. For example, processing the initial radar point cloud includes: selecting points with the same class number from the initial radar point cloud to form a homogeneous radar point cloud.
In this embodiment, selecting points with the same class number from the initial radar point cloud to form the homogeneous radar point cloud may specifically include: selecting points whose class number corresponds to vehicles from the initial radar point cloud to form a first homogeneous radar point cloud. That is, the points in the first homogeneous radar point cloud correspond to all vehicles in the simulation.
In step S330, an instance number of each point in the first homogeneous radar point cloud is acquired, and a visible object number list is generated according to the instance numbers of each point.
For example, in this embodiment, the list of visible object numbers includes instance numbers for each point in the radar point clouds of all vehicles in the simulation.
In step S340, a matching list is generated according to the first homogeneous radar point cloud and the visible object number list.
FIG. 4 is an exemplary flowchart of the step of generating a matching list in the method shown in FIG. 3. Referring to FIGS. 3 and 4 in combination, the step of generating a matching list may include sub-steps S3410 to S3470.
In sub-step S3410, the visible object number list and the background static object list are matched to form a first matching result.
In this embodiment, the first matching result includes a first sub-list and a second sub-list. The first sub-list is a list of running vehicles, and the second sub-list is a list of parked vehicles.
The first sub-list comprises a plurality of indexes and a plurality of contents, the plurality of indexes in the first sub-list comprise instance numbers of running vehicles recorded in the visible object number list, the instance numbers of the running vehicles are not equal to a first specified value, the first specified value represents the instance numbers of parked vehicles, and the plurality of contents in the first sub-list are all empty.
The second sub-list comprises a plurality of indexes and a plurality of contents, the plurality of indexes in the second sub-list comprise parked vehicles in the background static object list, and the plurality of contents in the second sub-list are all empty.
For example, taking the CARLA software as an example, the instance number of a running vehicle is a non-zero value and the instance number of a parked vehicle is 0. In this case, the indexes of the first sub-list are all the non-zero instance numbers in the vehicle visible object number list, and each content of the first sub-list is an empty list; the index of each item in the second sub-list is a vehicle in the background static object list, and each content of the second sub-list is an empty list.
The following sub-steps S3420 and S3430 may be performed for the first sub-list, i.e., the sub-list of running vehicles.
In sub-step S3420, the points whose instance number is not equal to the first specified value are screened out of the first homogeneous radar point cloud to form the point cloud of running vehicles, where the first specified value denotes the instance number of parked vehicles.
For example, the first specified value may be 0; embodiments of the present application do not particularly limit the first specified value.
In sub-step S3430, each point in the point cloud of running vehicles is stored into the corresponding content in the first sub-list according to the correspondence between each index in the first sub-list and the instance numbers of the points in the point cloud of running vehicles, to obtain an updated first sub-list.
For example, in the updated first sub-list, the indexes are all the non-zero instance numbers in the vehicle visible object number list, and the content of each item is the set of points of the running vehicle corresponding to that instance number.
The following sub-steps S3440 to S3470 may be performed for the second sub-list, i.e., for the sub-list of parked vehicles.
In sub-step S3440, the points whose instance number is equal to the first specified value are screened out of the first homogeneous radar point cloud to form the point cloud of parked vehicles.
For example, taking the CARLA software as an example, points with an instance number equal to 0 may be screened out of the radar point cloud of all vehicles, thereby forming the point cloud of all parked vehicles.
Since the points of all parked vehicles share the same instance number (e.g., all 0 in the simulation), it cannot be determined from the point cloud alone which parked vehicle each point belongs to. In an embodiment of the present application, the parked vehicle corresponding to each point in the parked-vehicle point cloud is therefore determined by traversing the background static object list.
In sub-step S3450, the background static object list is traversed for each point in the point cloud of parked vehicles.
In sub-step S3460, when the i-th point in the point cloud of parked vehicles is enclosed within the three-dimensional bounding box of the j-th static object in the background static object list, the correspondence between the i-th point and the j-th static object is determined, where 1 ≤ i ≤ N and 1 ≤ j ≤ M, N being the number of points in the point cloud of parked vehicles and M being the number of static objects in the background static object list.
In sub-step S3470, each point in the point cloud of parked vehicles is stored into the corresponding content in the second sub-list according to the correspondence, to obtain an updated second sub-list.
For example, in the updated second sub-list, the index of each item is a parked vehicle in the background static object list, and the content of each item is the set of points belonging to that parked vehicle. A sketch of this matching follows.
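A sketch of this traversal and containment test follows. For brevity the test is simplified to the axis-aligned box spanned by the eight vertices, which is an assumption for illustration; an oriented box would be tested against its own local axes.

    def point_in_bbox(xyz, bbox_vertices):
        # Containment in the axis-aligned box spanned by the 8 vertices.
        xs, ys, zs = zip(*bbox_vertices)
        x, y, z = xyz
        return (min(xs) <= x <= max(xs) and
                min(ys) <= y <= max(ys) and
                min(zs) <= z <= max(zs))

    def match_parked_points(parked_cloud, static_objects):
        # Sub-steps S3450 to S3470: assign each parked-vehicle point to
        # the static object whose 3D bounding box encloses it.
        second_sublist = {j: [] for j in range(len(static_objects))}
        for p in parked_cloud:
            for j, obj in enumerate(static_objects):
                if point_in_bbox(p.xyz, obj.bbox_vertices):
                    second_sublist[j].append(p)
                    break
        return second_sublist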
In an embodiment of the present application, the matching list includes an updated first sub-list and an updated second sub-list.
Referring back to FIG. 2, after the matching list is formed, step S250 of marking a two-dimensional bounding box of the detected object on the camera plane of the virtual camera according to the matching list and the camera information may be performed.
Specifically, a plurality of points in the content with the same index in the matching list are projected onto a two-dimensional camera plane of the virtual camera, so as to obtain a plurality of projection points positioned on the camera plane; and determining a two-dimensional bounding box surrounding the plurality of projection points, the two-dimensional bounding box surrounding the plurality of projection points being a two-dimensional bounding box of the detected object in a two-dimensional camera plane of the virtual camera.
For example, the points in the content of each item of the updated first sub-list of the matching list may be projected, using the camera information, onto the two-dimensional camera plane of the virtual camera to obtain a plurality of projection points located on the camera plane, and the two-dimensional bounding box surrounding those projection points is then determined; this box is the two-dimensional bounding box of the running vehicle on the two-dimensional camera plane of the virtual camera. In this way, the two-dimensional bounding boxes of running vehicles in the simulation environment are labeled.
For another example, the points in the content of each item of the updated second sub-list of the matching list may be projected, using the camera information, onto the two-dimensional camera plane of the virtual camera to obtain a plurality of projection points located on the camera plane, and the two-dimensional bounding box surrounding those projection points is then determined; this box is the two-dimensional bounding box of the parked vehicle on the two-dimensional camera plane of the virtual camera. In this way, the two-dimensional bounding boxes of parked vehicles in the simulation environment are labeled.
In embodiments of the present application, the two-dimensional bounding box of a vehicle is obtained by enclosing the two-dimensional projections of the vehicle's radar points rather than the two-dimensional projections of the vertices of the vehicle's three-dimensional bounding box, which makes the generated two-dimensional bounding box tighter and more accurate.
Further, in the embodiments of the present application, running vehicles and parked vehicles are labeled simultaneously during vehicle labeling, which makes the labeling results more complete.
Fig. 5 is a flowchart of a labeling method for labeling pedestrians, which may include steps S510-S540, according to some example embodiments of the present application.
In step S510, an initial radar point cloud is acquired; each point in the initial radar point cloud contains the following information of the detected object: the instance number of the detected object, the class number of the detected object, and the three-dimensional spatial position information of the point.
In step S520, the initial radar point cloud is processed. For example, processing the initial radar point cloud includes: selecting points with the same class number from the initial radar point cloud to form a homogeneous radar point cloud.
In this embodiment, selecting points with the same class number from the initial radar point cloud to form the homogeneous radar point cloud may specifically include: selecting points whose class number corresponds to pedestrians from the initial radar point cloud to form a second homogeneous radar point cloud. That is, the points in the second homogeneous radar point cloud correspond to all pedestrians in the simulation.
In step S530, the instance numbers of the points in the second homogeneous radar point cloud are acquired, and a visible object number list is generated according to the instance numbers of the points.
For example, in this embodiment, the visible object number list includes the instance numbers of the points in the radar point cloud of all pedestrians in the simulation.
In step S540, a matching list is generated according to the second homogeneous radar point cloud and the visible object number list.
FIG. 6 is an exemplary flowchart of the step of generating a matching list in the method shown in FIG. 5. Referring to FIGS. 5 and 6 in combination, the step of generating a matching list may include sub-steps S5410 to S5420.
In sub-step S5410, a first matching result is formed from said list of visible object numbers, wherein said first matching result comprises a third sub-list; the third sub-list includes a plurality of indexes and a plurality of contents, the plurality of indexes in the third sub-list include instance numbers of pedestrians recorded in the visible object number list, and the plurality of contents in the third sub-list are all empty.
For example, taking the CARLA software as an example, all pedestrian instance numbers are non-zero values; in this case, the items in the third sub-list are indexed by all the non-zero instance numbers in the pedestrian visible object number list, and the content of each item is an empty list.
In sub-step S5420, according to the correspondence between each index in the third sub-list and the instance number of each point in the second similar radar point cloud, each point in the second similar radar point cloud is stored under the corresponding content in the third sub-list, so as to obtain an updated third sub-list.

For example, in the updated third sub-list, the indexes are all the non-zero instance numbers in the pedestrian visible object number list, and the content of each item is the set of points of the pedestrian corresponding to that instance number.
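Sub-steps S5410 and S5420 together can be sketched as a dictionary keyed by instance number, reusing the structured-array assumption above; the data structure is illustrative, not the literal one of the embodiments:

```python
import numpy as np

def build_third_sublist(pedestrian_cloud):
    """S5410: create one empty content list per non-zero instance number.
    S5420: store each pedestrian point under the content of its own
    instance number."""
    ids = pedestrian_cloud["instance_id"]
    third = {int(i): [] for i in np.unique(ids) if i != 0}   # S5410
    for point in pedestrian_cloud:                           # S5420
        iid = int(point["instance_id"])
        if iid in third:
            third[iid].append(point["xyz"])
    return third
```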
In this embodiment, the matching list may include an updated third sub-list.
Referring back to fig. 2, after the matching list is formed, step S250 of marking a two-dimensional bounding box of the detected object (e.g., pedestrian) on the camera plane of the virtual camera according to the matching list and the camera information may be performed.
Optionally, Fig. 7 is an exemplary flowchart of the steps of labeling a two-dimensional bounding box of a pedestrian in the method shown in fig. 5.
Referring to fig. 2, 5 and 7 in combination, in some exemplary embodiments of the present application, the method may further include steps S710 to S730.
In step S710, a controllable foreground object list of pedestrians is acquired, where the controllable foreground object list of pedestrians includes an instance number, three-dimensional spatial position information, and three-dimensional bounding box vertex position information of each pedestrian in a foreground of a simulation environment.
In step S720, each index in the third sub-list is compared with the instance number of each pedestrian in the controllable foreground object list of the pedestrian.
In step S730, when an index in the third sub-list is not in the controllable foreground object list of pedestrians, the index and its corresponding content are removed from the third sub-list to update the third sub-list, and the updated third sub-list is determined as a fourth sub-list.
In this embodiment, the matching list comprises the fourth sub-list.
In simulation software such as CARLA, the category number of a point in the radar point cloud corresponding to the driver of a two-wheeled vehicle is that of a pedestrian, while the instance number of the driver is not in the controllable foreground object list of pedestrians but in the controllable foreground object list of vehicles. The inventors have found that in this case it is reasonable to label the two-wheeled vehicle and its driver as a whole, i.e., to label the whole as a vehicle. In the embodiment of the application, by executing steps S710 to S730, the driver of the two-wheeled vehicle can be removed from the pedestrians, so that the labeling result is more accurate and reasonable.
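A sketch of steps S710 to S730 under the same assumptions, where pedestrian_foreground_ids stands for the set of instance numbers read from the pedestrian controllable foreground object list (a hypothetical variable name):

```python
def build_fourth_sublist(third_sublist, pedestrian_foreground_ids):
    """Drop every index absent from the pedestrian controllable foreground
    object list (e.g. two-wheeled-vehicle drivers, whose instance numbers
    appear in the vehicle list instead); the result is the fourth sub-list."""
    return {iid: pts for iid, pts in third_sublist.items()
            if iid in pedestrian_foreground_ids}
```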
Referring back to fig. 2, after the matching list (e.g., the fourth sub-list) is formed, step S250 of labeling a two-dimensional bounding box of the detected object (e.g., a pedestrian) on the camera plane of the virtual camera according to the fourth sub-list and the camera information may be performed.
Specifically, a plurality of points in the content with the same index in the fourth sub-list are projected onto a two-dimensional camera plane of the virtual camera, so as to obtain a plurality of projection points positioned on the camera plane; and determining a two-dimensional bounding box surrounding the plurality of projection points, the two-dimensional bounding box surrounding the plurality of projection points being a two-dimensional bounding box of the pedestrian in a two-dimensional camera plane of the virtual camera.
For example, a plurality of points in the content of each item in the updated fourth sub-list in the matching list may be projected onto a two-dimensional camera plane of the virtual camera in combination with camera information to obtain a plurality of projected points located on the camera plane, and then a two-dimensional bounding box surrounding the plurality of projected points is determined, the two-dimensional bounding box surrounding the plurality of projected points being a two-dimensional bounding box of the pedestrian in the two-dimensional camera plane of the virtual camera. Thus, the marking of the two-dimensional bounding box of the pedestrian in the simulation environment is realized.
In embodiments of the present application, the two-dimensional bounding box of the pedestrian is derived from the two-dimensional projected points of the pedestrian's radar point cloud, rather than from the two-dimensional projections of the vertices of the pedestrian's three-dimensional bounding box, which makes the generated two-dimensional bounding box more compact and accurate.
Fig. 8 is a flowchart of a labeling method for labeling static objects such as traffic lights according to some exemplary embodiments of the present application, which may include steps S810-S840. The embodiment of the present application describes the labeling process by taking a traffic light as an example, but the embodiments of the present application are not limited thereto; the labeling method provided herein may also be applied to other types of static objects such as traffic signs, trees, and guardrails.
In step S810, the three-dimensional bounding box vertex position information of each item of the background static object in the background static object list is projected onto a two-dimensional camera plane, and a two-dimensional bounding box list of the background static object is obtained. For example, the background static object list may be a background static object list of a traffic light.
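Under the assumptions of the earlier projection sketch, step S810 amounts to projecting each static object's eight three-dimensional bounding box vertices; the names below (project_points_to_bbox, background_vertices) refer to that sketch and are illustrative:

```python
def project_static_objects(background_vertices, world_to_cam, K, img_w, img_h):
    """Step S810 sketch: each element of background_vertices is an (8, 3)
    array of world-space 3D bounding box vertex positions taken from the
    background static object list. Reuses project_points_to_bbox defined in
    the earlier projection sketch."""
    boxes = [project_points_to_bbox(v, world_to_cam, K, img_w, img_h)
             for v in background_vertices]
    return [b for b in boxes if b is not None]  # drop objects behind the camera
```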
In step S820, an instance segmentation map is obtained, where the virtual camera outputting the instance segmentation map is configured identically to the virtual camera outputting the image to be annotated, and each pixel in the instance segmentation map includes the type number of the object to which the pixel belongs, for example, the type number of a traffic light.
It should be noted that "the same configuration" herein may mean that the internal parameters and the external parameters of the two virtual cameras are the same.
In step S830, for each two-dimensional bounding box in the two-dimensional bounding box list of the background static object: determining, according to the class numbers corresponding to the pixels in the instance segmentation map, the number of pixels in the two-dimensional bounding box that belong to the same background static object; calculating the ratio of this number of pixels to the total number of pixels in the two-dimensional bounding box; retaining the two-dimensional bounding box if the ratio exceeds a predetermined threshold; and discarding the two-dimensional bounding box if the ratio does not exceed the predetermined threshold. In this way, the two-dimensional bounding box list is updated.

Note that, when the ratio does not exceed the predetermined threshold, the traffic light is considered to be occluded, and the two-dimensional bounding box of the traffic light therefore needs to be discarded. In the embodiment of the application, the occlusion of static objects such as traffic lights and traffic signs is judged by using the instance segmentation map, so that the labeling result is more stable and accurate.
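Step S830 can be sketched as the following filter, where seg_map is the instance segmentation map viewed as a two-dimensional array of class numbers, and the threshold value is an assumed tuning parameter, not one fixed by the embodiments:

```python
import numpy as np

def filter_occluded_boxes(bboxes, seg_map, class_id, threshold=0.5):
    """Keep a static object's projected 2D box only when the fraction of
    pixels inside it that still carry the object's class number exceeds
    the threshold; otherwise the object is treated as occluded."""
    kept = []
    for (u_min, v_min, u_max, v_max) in bboxes:
        patch = seg_map[int(v_min):int(v_max) + 1, int(u_min):int(u_max) + 1]
        if patch.size == 0:
            continue                  # box degenerated or outside the image
        ratio = np.count_nonzero(patch == class_id) / patch.size
        if ratio > threshold:
            kept.append((u_min, v_min, u_max, v_max))
    return kept
```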
In step S840, the updated two-dimensional bounding box list is used as a labeling result of the two-dimensional bounding box of the background static object on the camera plane of the virtual camera.
In the labeling method provided by the embodiment of the application, dynamic objects such as vehicles and pedestrians as well as static objects such as traffic lights and traffic signs can be labeled accurately and compactly. That is, the labeling method provided by the embodiment of the application has high compatibility, produces stable and accurate labeling results, and can be extended to the labeling of various objects in automatic driving simulation.
Based on the labeling method, the embodiment of the application also provides a labeling device applied to automatic driving simulation. The device will be described in detail below in connection with fig. 9.
Fig. 9 schematically shows a block diagram of a labeling device applied to automated driving simulation according to an embodiment of the present application.
As shown in fig. 9, the labeling device applied to the autopilot simulation may include a homogeneous radar point cloud acquisition module 910, a visible object number list generation module 920, a matching list generation module 930, a camera information acquisition module 940, and a two-dimensional bounding box labeling module 950.
The homogeneous radar point cloud acquisition module 910 is configured to: obtain a homogeneous radar point cloud, each point in the homogeneous radar point cloud containing the following information of the detected object: the instance number of the detected object, the kind number of the detected object, and the three-dimensional spatial position information of the point, wherein the kind numbers of all the points in the homogeneous radar point cloud are the same.

In some exemplary embodiments, the homogeneous radar point cloud acquisition module 910 may be configured to perform operations S210, S310, S320, S510, and S520 described above, which will not be repeated here.
The visible object number list generating module 920 is configured to: obtain the instance numbers of all the points in the homogeneous radar point cloud, and generate a visible object number list according to the instance numbers of the points.

In some exemplary embodiments, the visible object number list generating module 920 may be configured to perform operations S220, S330, and S530 described above, which will not be repeated here.
The matching list generating module 930 is configured to: generate a matching list according to the homogeneous radar point cloud and the visible object number list, wherein the matching list comprises a plurality of indexes and a plurality of contents, the plurality of contents are in one-to-one correspondence with the plurality of indexes, at least one of the indexes comprises an instance number in the visible object number list, and the plurality of contents comprise points in the homogeneous radar point cloud.

In some exemplary embodiments, the matching list generating module 930 may be configured to perform operations S230, S340, and S540 described above, which will not be repeated here.
The camera information acquisition module 940 is configured to: acquire camera information, wherein the camera information includes the internal parameters and external parameters of the virtual camera outputting the image to be annotated, as well as the position information of the virtual camera in three-dimensional space.

In some exemplary embodiments, the camera information acquisition module 940 may be configured to perform operation S240 described above, which will not be repeated here.
The two-dimensional bounding box labeling module 950 is configured to: mark a two-dimensional bounding box of the detected object on the camera plane of the virtual camera according to the matching list and the camera information.

For example, marking the two-dimensional bounding box of the detected object in the two-dimensional plane of the virtual camera according to the matching list and the camera information may specifically include: projecting a plurality of points in the content with the same index in the matching list onto the two-dimensional camera plane of the virtual camera to obtain a plurality of projection points located on the camera plane; and determining a two-dimensional bounding box surrounding the plurality of projection points, the two-dimensional bounding box surrounding the plurality of projection points being the two-dimensional bounding box of the detected object in the two-dimensional camera plane of the virtual camera.

In some exemplary embodiments, the two-dimensional bounding box labeling module 950 may be configured to perform operation S250 described above, which will not be repeated here.
Alternatively or additionally, the labeling device applied to the autopilot simulation may further include a background static object list acquisition module 960. The background static object list acquisition module 960 may be used to acquire a background static object list. The background static object list comprises three-dimensional space position information and three-dimensional boundary frame vertex position information of each static object in the simulation environment.
Alternatively or additionally, the annotation device applied to the autopilot simulation may further comprise a controllable foreground object list acquisition module 970. The controllable foreground object list acquisition module 970 may be configured to acquire a controllable foreground object list. For example, it may acquire a controllable foreground object list of pedestrians, where the list includes the instance number, three-dimensional spatial position information, and three-dimensional bounding box vertex position information of each pedestrian in the foreground of the simulation environment.
Any number of the modules or units described above may be combined and implemented in one module or unit, or any one of them may be split into multiple modules or units, according to embodiments of the present application. Alternatively, at least part of the functionality of one or more of these modules or units may be combined with at least part of the functionality of other modules or units and implemented in one module or unit. According to embodiments of the present application, at least one of the modules or units described above may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system in a package, or an Application Specific Integrated Circuit (ASIC), or in hardware or firmware in any other reasonable way of integrating or packaging circuits, or in any suitable combination of software, hardware, and firmware. Alternatively, at least one of the modules or units described above may be at least partially implemented as a computer program module that, when executed, performs the corresponding function.
Fig. 10 schematically shows a block diagram of an electronic device adapted to implement the labeling method according to an embodiment of the application.
As shown in fig. 10, an electronic device 1000 according to an embodiment of the present application includes a processor 1001 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. The processor 1001 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 1001 may also include on-board memory for caching purposes. The processor 1001 may include a single processing unit or multiple processing units for performing different actions in accordance with the method flows of embodiments of the present application.
In the RAM 1003, various programs and data necessary for the operation of the electronic apparatus 1000 are stored. The processor 1001, the ROM 1002, and the RAM 1003 are connected to each other by a bus 1004. The processor 1001 performs various operations of the method flow according to the embodiment of the present application by executing programs in the ROM 1002 and/or the RAM 1003. Note that the program may be stored in one or more memories other than the ROM 1002 and the RAM 1003. The processor 1001 may also perform various operations of the method flow according to the embodiments of the present application by executing programs stored in the one or more memories.
According to an embodiment of the present application, the electronic device 1000 may also include an input/output (I/O) interface 1005, the input/output (I/O) interface 1005 also being connected to the bus 1004. The electronic device 1000 may also include one or more of the following components connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse, and the like; an output portion 1007 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), etc., and a speaker, etc.; a storage portion 1008 including a hard disk or the like; and a communication section 1009 including a network interface card such as a LAN card, a modem, or the like. The communication section 1009 performs communication processing via a network such as the internet. The driver 1010 is also connected to the I/O interface 1005 as needed. A removable medium 1011, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is installed as needed in the drive 1010, so that a computer program read out therefrom is installed as needed in the storage section 1008.
The present application also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs that when executed implement methods according to embodiments of the present application.
According to embodiments of the present application, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present application, the computer-readable storage medium may include ROM 1002 and/or RAM 1003 described above and/or one or more memories other than ROM 1002 and RAM 1003.
Embodiments of the present application also include a computer program product comprising a computer program containing program code for performing the method shown in the flowcharts. When the computer program product is run on a computer system, the program code causes the computer system to carry out the labeling methods provided by the embodiments of the present application.
The above-described functions defined in the system/apparatus of the embodiments of the present application are performed when the computer program is executed by the processor 1001. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the application.
In one embodiment, the computer program may be carried on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of a signal over a network medium, downloaded and installed via the communication section 1009, and/or installed from the removable medium 1011. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to wireless and wired media, or any suitable combination of the foregoing.
According to embodiments of the present application, the program code of the computer programs provided by the embodiments of the present application may be written in any combination of one or more programming languages; in particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or in assembly/machine languages, including but not limited to Java, C++, Python, "C", or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (for example, via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present application may be combined in various ways, even if such combinations are not explicitly recited in the present application. In particular, the features recited in the various embodiments and/or claims of the present application may be combined without departing from the spirit and teachings of the present application, and all such combinations fall within the scope of the present application.

Claims (18)

1. A labeling method applied to automatic driving simulation, characterized in that the method comprises the following steps:
obtaining a homogeneous class of radar point clouds, each point in the homogeneous class of radar point clouds containing the following information of the detected object: the instance number of the detected object, the kind number of the detected object and the three-dimensional space position information of the point, wherein the kind numbers of all the points in the same kind of radar point cloud are the same;
obtaining instance numbers of all points in the similar radar point cloud, and generating a visible object number list according to the instance numbers of all the points;
generating a matching list according to the similar radar point clouds and the visible object number list, wherein the matching list comprises a plurality of indexes and a plurality of contents, the plurality of contents and the plurality of indexes are respectively in one-to-one correspondence, at least one index in the plurality of indexes comprises an instance number in the visible object number list, and the plurality of contents comprise points in the similar radar point clouds;
acquiring camera information, wherein the camera information comprises internal parameters and external parameters of a virtual camera outputting an image to be annotated and position information of the virtual camera in a three-dimensional space; and
marking a two-dimensional boundary box of the detected object on the camera plane of the virtual camera according to the matching list and the camera information,
wherein the marking a two-dimensional bounding box of the detected object in the two-dimensional plane of the virtual camera according to the matching list and the camera information specifically includes:
projecting a plurality of points in the content with the same index in the matching list onto a two-dimensional camera plane of the virtual camera to obtain a plurality of projection points positioned on the camera plane; and
a two-dimensional bounding box surrounding the plurality of projection points is determined, the two-dimensional bounding box surrounding the plurality of projection points being a two-dimensional bounding box of the detected object in a two-dimensional camera plane of the virtual camera.
2. The method of claim 1, wherein the acquiring the same-kind radar point cloud comprises:

acquiring an initial radar point cloud, each point in the initial radar point cloud containing the following information of the detected object: an instance number of the detected object, a kind number of the detected object, and three-dimensional spatial position information of the point; and

selecting points with the same category number from the initial radar point cloud to form a radar point cloud of the same category.
3. The method according to claim 2, wherein the method further comprises: and acquiring a background static object list, wherein the background static object list comprises three-dimensional space position information and three-dimensional boundary frame vertex position information of each static object in the simulation environment.
4. The method according to claim 3, wherein the selecting points with the same category number from the initial radar point cloud to form the same-category radar point cloud specifically includes:

selecting points whose class numbers correspond to vehicles from the initial radar point cloud to form a first same-class radar point cloud.
5. The method according to claim 4, wherein the generating a matching list according to the homogeneous radar point cloud and the visible object number list, in particular comprises:
matching the list of visible object numbers with the list of background static objects to form a first matching result,
the first matching result comprises a first sub-list and a second sub-list;
the first sub-list comprises a plurality of indexes and a plurality of contents, the plurality of indexes in the first sub-list comprise instance numbers of running vehicles recorded in the visible object number list, the instance numbers of the running vehicles are not equal to a first specified value, wherein the first specified value represents the instance numbers of parked vehicles, and the plurality of contents in the first sub-list are all empty;
the second sub-list comprises a plurality of indexes and a plurality of contents, the plurality of indexes in the second sub-list comprise the parked vehicles in the background static object list, and the plurality of contents in the second sub-list are all empty.
6. The method according to claim 5, wherein the generating a matching list according to the homogeneous class radar point cloud and the list of visible object numbers, in particular further comprises:
screening points with instance numbers not equal to a first specified value in the first same-kind radar point cloud to form a point cloud of a running vehicle; and
and according to the corresponding relation between each index in the first sub-list and the instance number of each point in the point cloud of the running vehicle, correspondingly storing each point in the point cloud of the running vehicle in a plurality of contents in the first sub-list to obtain an updated first sub-list.
7. The method according to claim 6, wherein the generating a matching list from the homogeneous class radar point cloud and the list of visible object numbers, in particular further comprises:
screening out points with instance numbers equal to a first specified value in the first same-kind radar point cloud to form a point cloud for parking the vehicle;
traversing the background static object list for each point in the point cloud of the parked vehicle;
when the ith point in the point cloud of the parked vehicle is enclosed in the three-dimensional boundary box of the jth static object in the background static object list, determining the corresponding relation between the ith point and the jth static object, wherein i is more than or equal to 1 and less than or equal to the number of points in the point cloud of the parked vehicle, and j is more than or equal to 1 and less than or equal to the number of static objects in the background static object list;
storing each point in the point cloud of the parked vehicle under the corresponding content in the second sub-list according to the correspondence, so as to obtain an updated second sub-list,
the matching list comprises an updated first sub-list and an updated second sub-list.
8. The method according to any one of claims 3-7, wherein the selecting points with the same category number from the initial radar point cloud to form the same-category radar point cloud specifically comprises:

selecting the points whose category numbers correspond to pedestrians from the initial radar point cloud to form a second similar radar point cloud.
9. The method according to claim 8, wherein the generating a matching list from the homogeneous class radar point cloud and the list of visible object numbers, in particular comprises:
Forming a first matching result according to the visible object number list,
wherein the first matching result includes a third sub-list;
the third sub-list includes a plurality of indexes and a plurality of contents, the plurality of indexes in the third sub-list include instance numbers of pedestrians recorded in the visible object number list, and the plurality of contents in the third sub-list are all empty.
10. The method according to claim 9, wherein the generating a matching list from the homogeneous class radar point cloud and the list of visible object numbers, in particular further comprises:
and according to the corresponding relation between each index in the third sub-list and the instance numbers of each point in the second similar radar point cloud, correspondingly storing each point in the second similar radar point cloud in a plurality of contents in the third sub-list to obtain an updated third sub-list.
11. The method according to claim 10, wherein the method further comprises: and acquiring a pedestrian controllable foreground object list, wherein the pedestrian controllable foreground object list comprises instance numbers, three-dimensional space position information and three-dimensional boundary box vertex position information of all pedestrians in the foreground of the simulation environment.
12. The method of claim 11, wherein the method further comprises:
comparing each index in the third sub-list with the instance numbers of each pedestrian in the controllable foreground object list of the pedestrian;
when the index in the third sub-list is not in the controllable foreground object list of the pedestrian, the index and the content corresponding to the index are removed from the third sub-list to update the third sub-list, and the updated third sub-list is determined as a fourth sub-list,
the matching list includes the fourth sub-list.
13. The method according to any one of claims 3-7 and 9-12, wherein the method further comprises:
projecting the three-dimensional boundary frame vertex position information of each item of background static object in the background static object list onto a two-dimensional camera plane to obtain a two-dimensional boundary frame list of the background static object;
obtaining an instance segmentation map, wherein the configuration of the virtual camera outputting the instance segmentation map is the same as that of the virtual camera outputting the image to be annotated, and each pixel in the instance segmentation map comprises a type number of an object to which the pixel belongs;
for each two-dimensional bounding box in the two-dimensional bounding box list of the background static object: determining, according to the class numbers corresponding to the pixels in the instance segmentation map, the number of pixels in the two-dimensional bounding box that belong to the same background static object; calculating the ratio of the number of pixels to the total number of pixels in the two-dimensional bounding box; retaining the two-dimensional bounding box if the ratio exceeds a predetermined threshold; discarding the two-dimensional bounding box if the ratio does not exceed the predetermined threshold; and
and taking the updated two-dimensional boundary box list as a labeling result of the two-dimensional boundary box of the background static object on the camera plane of the virtual camera.
14. The method of claim 13, wherein the static objects in the background static object list comprise at least one of traffic lights, traffic signs, guardrails, and trees.
15. A labeling apparatus for use in automated driving simulation, the apparatus comprising:
the same type Lei Dadian cloud acquisition module is used for: obtaining a homogeneous class of radar point clouds, each point in the homogeneous class of radar point clouds containing the following information of the detected object: the instance number of the detected object, the kind number of the detected object and the three-dimensional space position information of the point, wherein the kind numbers of all the points in the same kind of radar point cloud are the same;
A visible object number list generating module for: obtaining instance numbers of all points in the similar radar point cloud, and generating a visible object number list according to the instance numbers of all the points;
a matching list generation module, configured to: generating a matching list according to the similar radar point clouds and the visible object number list, wherein the matching list comprises a plurality of indexes and a plurality of contents, the plurality of contents and the plurality of indexes are respectively in one-to-one correspondence, at least one index in the plurality of indexes comprises an instance number in the visible object number list, and the plurality of contents comprise points in the similar radar point clouds;
the camera information acquisition module is used for: acquiring camera information, wherein the camera information comprises internal parameters and external parameters of a virtual camera outputting an image to be annotated and position information of the virtual camera in a three-dimensional space; and
the two-dimensional boundary box labeling module is used for: marking a two-dimensional boundary box of the detected object on the camera plane of the virtual camera according to the matching list and the camera information,
wherein the marking a two-dimensional bounding box of the detected object in the two-dimensional plane of the virtual camera according to the matching list and the camera information specifically includes:
projecting a plurality of points in the content with the same index in the matching list onto a two-dimensional camera plane of the virtual camera to obtain a plurality of projection points positioned on the camera plane; and
a two-dimensional bounding box surrounding the plurality of projection points is determined, the two-dimensional bounding box surrounding the plurality of projection points being a two-dimensional bounding box of the detected object in a two-dimensional camera plane of the virtual camera.
16. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-14.
17. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1 to 14.
18. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 14.