CN112950785B - Point cloud labeling method, device and system - Google Patents


Info

Publication number
CN112950785B
Authority
CN
China
Prior art keywords
dimensional frame
point cloud
dimensional
image
size
Prior art date
Legal status
Active
Application number
CN201911268207.6A
Other languages
Chinese (zh)
Other versions
CN112950785A
Inventor
徐建云
孙杰
朱雨时
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201911268207.6A
Priority to PCT/CN2020/122521 (published as WO2021114884A1)
Publication of CN112950785A
Application granted
Publication of CN112950785B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20 Finite element generation, e.g. wire-frame surface description, tesselation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a point cloud labeling method, device and system, belonging to the field of data processing. The method comprises the following steps: generating a first two-dimensional frame on an image to select the target object to be labeled; and generating a three-dimensional frame on a point cloud according to the first two-dimensional frame so as to label the target object in the point cloud, wherein the three-dimensional frame is located in the viewing cone region corresponding to the target object, and the point cloud and the image are acquired for the same scene. Because the pixel points in the image are rich and dense, the pose and outline of the target object can be determined easily; therefore, labeling the target object in the point cloud in combination with the image can improve labeling efficiency and accuracy.

Description

Point cloud labeling method, device and system
Technical Field
The present disclosure relates to the field of data processing, and in particular, to a method, an apparatus, and a system for point cloud labeling.
Background
With the continuous development of mobile devices such as robots and autonomous vehicles, in order for a mobile device to automatically identify surrounding objects and pedestrians during movement, it is generally necessary to obtain data samples of these objects in advance for machine learning training. When acquiring such data samples, it is generally necessary to first acquire a point cloud containing the objects and label information such as the category, position, and size of each object in the point cloud, so as to obtain a label for each object in the point cloud.
At present, the related art mainly labels a target object according to its outline in the point cloud. However, when the three-dimensional points belonging to the target object are sparse or the target object is far away, its outline is difficult to determine, so the accuracy of the labeling result is low.
Disclosure of Invention
The application provides a point cloud labeling method, device and system, which can solve the problem in the related art that when the three-dimensional points of a target object in the point cloud are sparse or the target object is far away, the outline of the target object is difficult to determine, resulting in low accuracy of the labeling result. The technical solution is as follows:
in a first aspect, a point cloud labeling method is provided, the method including:
generating a first two-dimensional frame on the image to select a target object to be marked currently;
and generating a three-dimensional frame on the point cloud according to the first two-dimensional frame so as to mark the target object in the point cloud, wherein the three-dimensional frame is positioned in a viewing cone area corresponding to the target object, and the point cloud and the image are determined for the same scene.
Optionally, the generating a three-dimensional frame on the point cloud according to the first two-dimensional frame includes:
determining a view cone region corresponding to the target object from the point cloud according to the first two-dimensional frame;
determining the size of the three-dimensional frame according to the category of the target object;
and generating the three-dimensional frame on the point cloud according to the position of the view cone region corresponding to the target object and the size of the three-dimensional frame.
Optionally, the determining, according to the first two-dimensional frame, a viewing cone area corresponding to the target object from the point cloud includes:
determining a point cloud image, wherein the point cloud image refers to an image corresponding to the point cloud in an image coordinate system;
determining a plurality of pixel points located in the first two-dimensional frame from the point cloud image;
and determining the space area occupied by the corresponding three-dimensional points of the plurality of pixel points in the point cloud as a viewing cone area corresponding to the target object.
Optionally, the image is acquired by a camera, and the point cloud is acquired by a point cloud acquisition device, wherein the point cloud acquisition device comprises a time-of-flight TOF camera and/or a laser radar;
the determining the point cloud image includes:
And projecting the point cloud into the image coordinate system according to the external parameters between the point cloud collector and the camera, the internal parameters of the camera and the distortion coefficients to obtain the point cloud image.
Optionally, the image is acquired by a camera, and the point cloud is obtained by converting two images acquired by a binocular camera;
the determining the point cloud image includes:
and converting reference images in the two images acquired by the binocular camera into the image coordinate system to obtain the point cloud image, wherein the pixel points in the reference images have a mapping relationship with three-dimensional points in the point cloud.
Optionally, after generating a three-dimensional frame on the point cloud according to the first two-dimensional frame, the method further includes:
and adjusting the position and the size of the three-dimensional frame according to the position and the size of the first two-dimensional frame.
Optionally, the adjusting the position and the size of the three-dimensional frame according to the position and the size of the first two-dimensional frame includes:
performing primary adjustment on the position and/or the size of the three-dimensional frame;
generating a second two-dimensional frame in the image, wherein the second two-dimensional frame is an outer envelope frame of projection of the three-dimensional frame on the image;
And carrying out secondary adjustment on the position and/or the size of the three-dimensional frame after primary adjustment according to the position and the size of the second two-dimensional frame and the position and the size of the first two-dimensional frame, so that the target object in the point cloud can be surrounded by the three-dimensional frame after secondary adjustment.
Optionally, before the first adjusting the position and/or the size of the three-dimensional frame, the method further includes:
determining a three-dimensional object corresponding to the first two-dimensional frame;
the primary adjustment of the position and/or the size of the three-dimensional frame comprises the following steps:
and performing primary adjustment on the position and/or the size of the three-dimensional frame according to the position and the size of the three-dimensional target corresponding to the first two-dimensional frame.
Optionally, the three-dimensional frame is a three-dimensional rectangular frame;
the first adjusting the position and/or the size of the three-dimensional frame according to the position and the size of the three-dimensional target corresponding to the first two-dimensional frame includes:
moving the geometric center of the three-dimensional frame to the geometric center of the three-dimensional target in the overlooking direction of the point cloud; and/or
When a first adjustment operation is detected, adjusting the orientation of the three-dimensional frame in the top view direction so that the orientation of the adjusted three-dimensional frame coincides with the orientation of the target object in the image; and/or
When a second adjustment operation is detected, at least one of a horizontal position, a length, and a width of the three-dimensional frame is adjusted in the top-down direction so that a contour of the three-dimensional frame in the top-down direction is aligned with a contour of the three-dimensional object in the top-down direction.
Optionally, the three-dimensional frame is a three-dimensional rectangular frame;
the secondary adjustment of the position and/or the size of the three-dimensional frame after the primary adjustment according to the position and the size of the second two-dimensional frame and the position and the size of the first two-dimensional frame comprises the following steps:
when a third adjustment operation is detected, adjusting at least one of a vertical position and a height of the three-dimensional frame in a front view direction of the point cloud so that a position and a size of upper and lower edges of the second two-dimensional frame are aligned with those of upper and lower edges of the first two-dimensional frame; and/or
When a fourth adjustment operation is detected, at least one of the horizontal position, length, width, and orientation of the three-dimensional frame is adjusted in the top view direction of the point cloud so that the position and size of the left and right edges of the second two-dimensional frame are aligned with the position and size of the left and right edges of the first two-dimensional frame.
Optionally, the method further comprises:
taking the point cloud marked with the three-dimensional frame as a sample point cloud, and training an initial object recognition network to obtain an object recognition model;
when the object is identified, the point cloud comprising the object to be identified is identified through the object identification model.
In a second aspect, a point cloud labeling apparatus is provided, the apparatus including:
the first generation module is used for generating a first two-dimensional frame on the image so as to select a target object to be marked currently;
the second generation module is used for generating a three-dimensional frame on the point cloud according to the first two-dimensional frame so as to mark the target object in the point cloud, the three-dimensional frame is positioned in a view cone area corresponding to the target object, and the point cloud and the image are determined aiming at the same scene.
Optionally, the second generating module includes:
the first determining submodule is used for determining a view cone area corresponding to the target object from the point cloud according to the first two-dimensional frame;
a second determining submodule, configured to determine a size of the three-dimensional frame according to a category of the target object;
and the first generation submodule is used for generating the three-dimensional frame on the point cloud according to the position of the view cone region corresponding to the target object and the size of the three-dimensional frame.
Optionally, the first determining submodule includes:
the first determining unit is used for determining a point cloud image, wherein the point cloud image refers to an image corresponding to the point cloud in an image coordinate system;
a second determining unit, configured to determine a plurality of pixel points located in the first two-dimensional frame from the point cloud image;
and the third determining unit is used for determining the space area occupied by the three-dimensional points corresponding to the pixel points in the point cloud as the viewing cone area corresponding to the target object.
Optionally, the image is acquired by a camera, and the point cloud is acquired by a point cloud acquisition device, wherein the point cloud acquisition device comprises a time-of-flight TOF camera and/or a laser radar;
the first determining unit is specifically configured to:
and projecting the point cloud into the image coordinate system according to the external parameters between the point cloud collector and the camera, the internal parameters of the camera and the distortion coefficients to obtain the point cloud image.
Optionally, the image is acquired by a camera, and the point cloud is obtained by converting two images acquired by a binocular camera;
the first determining unit is specifically configured to:
And converting reference images in the two images acquired by the binocular camera into the image coordinate system to obtain the point cloud image, wherein the pixel points in the reference images have a mapping relationship with three-dimensional points in the point cloud.
Optionally, the apparatus further comprises:
and the adjusting module is used for adjusting the position and the size of the three-dimensional frame according to the position and the size of the first two-dimensional frame.
Optionally, the adjusting module includes:
the first adjusting sub-module is used for carrying out primary adjustment on the position and/or the size of the three-dimensional frame;
a second generation sub-module, configured to generate a second two-dimensional frame in the image, where the second two-dimensional frame is an outer envelope frame of the projection of the three-dimensional frame on the image;
and the second adjustment sub-module is used for carrying out secondary adjustment on the position and/or the size of the three-dimensional frame after the primary adjustment according to the position and the size of the second two-dimensional frame and the position and the size of the first two-dimensional frame so that the three-dimensional frame after the secondary adjustment can surround the target object in the point cloud.
Optionally, the adjustment module further includes:
a third determination sub-module for determining a three-dimensional object corresponding to the first two-dimensional box;
The first adjustment submodule is specifically configured to:
and performing primary adjustment on the position and/or the size of the three-dimensional frame according to the position and the size of the three-dimensional target corresponding to the first two-dimensional frame.
Optionally, the three-dimensional frame is a three-dimensional rectangular frame;
the first adjusting submodule is specifically:
moving the geometric center of the three-dimensional frame to the geometric center of the three-dimensional target in the overlooking direction of the point cloud; and/or
When a first adjustment operation is detected, adjusting the orientation of the three-dimensional frame in the top view direction so that the orientation of the adjusted three-dimensional frame coincides with the orientation of the target object in the image; and/or
When a second adjustment operation is detected, at least one of a horizontal position, a length, and a width of the three-dimensional frame is adjusted in the top-down direction so that a contour of the three-dimensional frame in the top-down direction is aligned with a contour of the three-dimensional object in the top-down direction.
Optionally, the three-dimensional frame is a three-dimensional rectangular frame;
the second adjusting submodule is specifically used for:
when a third adjustment operation is detected, adjusting at least one of a vertical position and a height of the three-dimensional frame in a front view direction of the point cloud so that a position and a size of upper and lower edges of the second two-dimensional frame are aligned with those of upper and lower edges of the first two-dimensional frame; and/or
When a fourth adjustment operation is detected, at least one of the horizontal position, length, width, and orientation of the three-dimensional frame is adjusted in the top view direction of the point cloud so that the position and size of the left and right edges of the second two-dimensional frame are aligned with the position and size of the left and right edges of the first two-dimensional frame.
Optionally, the apparatus further comprises:
the training module is used for training the initial object recognition network by taking the point cloud marked with the three-dimensional frame as a sample point cloud to obtain an object recognition model;
and the identification module is used for identifying the point cloud comprising the object to be identified through the object identification model when the object identification is carried out.
In a third aspect, a point cloud labeling system is provided, the system includes a movable device and a point cloud labeling device, the movable device includes a point cloud acquisition component and a camera;
the point cloud acquisition component is used for acquiring point clouds and sending the point clouds to the point cloud labeling equipment;
the camera is used for capturing an image and sending the image to the point cloud labeling device, wherein the point cloud and the image are acquired for the same scene;
the point cloud labeling equipment is used for receiving the point cloud and the image, and generating a first two-dimensional frame on the image so as to select a target object to be labeled currently; and generating a three-dimensional frame on the point cloud according to the first two-dimensional frame so as to mark the target object in the point cloud, wherein the three-dimensional frame is positioned in a viewing cone area corresponding to the target object.
In a fourth aspect, a point cloud labeling apparatus is provided, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the steps of any of the methods of the first aspect above.
In a fifth aspect, there is provided a computer readable storage medium having stored thereon instructions which, when executed by a processor, implement the steps of any of the methods of the first aspect described above.
In a sixth aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of the method of any of the first aspects above.
The technical solution provided by the present application can bring at least the following beneficial effects:
since three-dimensional points included in the point cloud are sparse, for a target object that is far away or a target object that is severely occluded, it is difficult to observe the pose and contour of the target object only with the naked eye. However, the pixels in the image are dense, especially for a high-resolution camera, the pose and contour of the target object can be easily determined even if the target object is far away or is severely blocked. Therefore, the target object in the point cloud is marked by combining the image, so that the marking difficulty can be reduced, and the marking efficiency and the marking accuracy can be improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an implementation environment provided by embodiments of the present application.
Fig. 2 is a schematic diagram of a point cloud according to an embodiment of the present application.
Fig. 3 is a schematic diagram of an image provided in an embodiment of the present application.
Fig. 4 is a flowchart of a point cloud labeling method provided in an embodiment of the present application.
Fig. 5 is a schematic diagram of generating a first two-dimensional frame on an image according to an embodiment of the present application.
Fig. 6 is a schematic diagram of generating a three-dimensional frame on a point cloud according to an embodiment of the present application.
Fig. 7 is a schematic diagram of indicating the orientation of a target object according to an embodiment of the present application.
Fig. 8 is a schematic diagram of generating a second two-dimensional frame on an image according to an embodiment of the present application.
Fig. 9 is a block diagram of a point cloud labeling device provided in an embodiment of the present application.
Fig. 10 is a block diagram of another point cloud labeling apparatus according to an embodiment of the present application.
Fig. 11 is a schematic structural diagram of a point cloud labeling device provided in an embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of apparatuses and methods consistent with aspects of the present application.
Before explaining the embodiments of the present application in detail, an application scenario of the embodiments of the present application is described:
Fig. 1 is a schematic diagram of an implementation environment provided in an embodiment of the present application. Referring to fig. 1, the implementation environment includes a movable device 100 and a point cloud labeling device 200, which are connected through a network. A point cloud acquisition component 110 and a camera 120 are mounted on the movable device 100. The point cloud acquisition component 110 may acquire a point cloud during the movement of the movable device 100; fig. 2 is a schematic diagram of one frame of point cloud acquired by the point cloud acquisition component 110. The camera 120 may capture images during the movement of the movable device 100; fig. 3 is a schematic diagram of an image captured by the camera 120. After the point cloud acquisition component 110 acquires the point cloud, the point cloud may be transmitted to the point cloud labeling device 200, and after the camera 120 captures an image, the image may be transmitted to the point cloud labeling device 200. Alternatively, the point cloud acquired by the point cloud acquisition component 110 and the image captured by the camera 120 may also be transmitted to the point cloud labeling device 200 by the movable device 100.
The point cloud is a set composed of a plurality of three-dimensional points, the three-dimensional points included in the point cloud can be points on objects such as vehicles, people or buildings included in a scene, and the point cloud data in the point cloud is information such as three-dimensional coordinates of each three-dimensional point included in the point cloud. The image captured by the camera 120 may include objects such as vehicles, people, or buildings in a scene.
Illustratively, the mobile device 100 may be an automobile or a robot, and the point cloud acquisition component 110 may be a point cloud collector such as a laser radar, a Time of Flight (TOF) camera, or a binocular camera. Wherein the TOF camera may also be referred to as a time of flight camera. The point cloud acquisition section 110 may acquire the point cloud at a certain cycle. The period may be preset according to the use requirement, which is not limited in the embodiment of the present application. For example, the period may be 0.1 seconds or the like, that is, the point cloud acquisition section 110 may acquire one frame of the point cloud every 0.1 seconds.
In one possible implementation, the point cloud acquisition component 110 is a lidar, which emits laser light that is reflected back to the lidar when it hits the surface of an object in the scene. The lidar can determine the distance between the object and the lidar based on the time taken for the laser to travel out and back and the propagation speed of the laser. The lidar may emit 4, 8, 16 or more laser lines. The fewer laser lines the lidar emits, the sparser the acquired point cloud; the more laser lines it emits, the denser the acquired point cloud. Moreover, within the scene corresponding to a single frame of point cloud, objects farther from the lidar are represented by sparser three-dimensional points, while objects closer to the lidar are represented by denser three-dimensional points. For convenience of description, the point clouds referred to in the embodiments of the present application are described as being collected by a lidar. Of course, the point clouds in the embodiments of the present application may also be collected by other types of point cloud acquisition components 110, which is not limited in the embodiments of the present application.
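As a hedged illustration of the range computation mentioned above (not taken from the patent, and with illustrative names), the one-way distance follows from half the round-trip time multiplied by the propagation speed of the laser:

```python
# Illustrative sketch only: range from the laser's round-trip time of flight.
SPEED_OF_LIGHT = 299_792_458.0  # m/s, propagation speed of the laser

def lidar_range(round_trip_time_s: float) -> float:
    """One-way distance to the reflecting surface, in metres."""
    return SPEED_OF_LIGHT * round_trip_time_s / 2.0

print(lidar_range(0.2e-6))  # a 0.2 microsecond round trip is roughly 30 m
```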
In addition, in some examples, the point cloud acquisition component 110 is a mechanically rotating lidar. Such a lidar rotates 360 degrees around its own central axis and obtains one point cloud data packet each time it rotates through a small angle, so a full 360-degree rotation yields a plurality of point cloud data packets, and one complete frame of point cloud is obtained by stitching the point clouds in these packets. In other words, the mechanically rotating lidar needs to rotate one full cycle around its central axis to acquire a point cloud of the 360-degree scene.
However, within one such period the movable device 100 may undergo a large pose change while the lidar rotates, that is, the position or attitude of the movable device 100 may change considerably, so stitching the point clouds in the packets collected during that period easily introduces measurement deviation. To eliminate this deviation, the relative pose information of the movable device 100 within the period can be acquired first; then, according to this relative pose information and the relative pose relationships between the point clouds in the packets collected in the period, the point clouds in the packets are projected into the same three-dimensional coordinate system and stitched into one frame of point cloud. This process is called motion compensation. The three-dimensional coordinate system may be the coordinate system corresponding to the point cloud in any one of the packets, which is not limited in the embodiments of the present application.
Of course, point clouds acquired by the point cloud acquisition component 110 in other situations may also require motion compensation; the above is just one possible example. Conversely, when the point cloud acquired by the point cloud acquisition component 110 has no deviation or only a small deviation, the deviation may be ignored and motion compensation may be skipped. That is, motion compensation is not an essential step for the point cloud acquired by the point cloud acquisition component 110.
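A minimal sketch of the motion compensation described above, assuming the relative poses are available as 4x4 homogeneous transforms into a common reference frame (the function and argument names are illustrative, not from the patent):

```python
import numpy as np

def stitch_packets(packets, relative_poses):
    """Motion-compensated stitching of point cloud data packets.

    packets: list of (N_i, 3) arrays, each in its own capture frame.
    relative_poses: list of 4x4 transforms mapping each packet's frame into
    the chosen reference frame (e.g. the frame of the first packet).
    Returns one (sum N_i, 3) array, i.e. one complete frame of point cloud."""
    compensated = []
    for pts, pose in zip(packets, relative_poses):
        rotation, translation = pose[:3, :3], pose[:3, 3]
        compensated.append(pts @ rotation.T + translation)
    return np.vstack(compensated)
```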
The point cloud labeling apparatus 200 is an apparatus that provides a background service for the removable apparatus 100. The point cloud labeling apparatus 200 includes a user data store. The point cloud labeling device 200 may receive the point cloud and the image transmitted by the mobile device 100, and may determine the acquisition time of each frame of the received point cloud and each image.
The point cloud labeling method provided by the embodiment of the application is explained in detail below.
Fig. 4 is a flowchart of a point cloud labeling method provided in an embodiment of the present application, and an execution subject of the method may be the point cloud labeling apparatus 200 shown in fig. 1. Referring to fig. 4, the method includes:
step 401: and generating a first two-dimensional frame on the image to select the current target object to be marked.
In some embodiments, the target object on the image may be labeled, resulting in a two-dimensional tag for the target object. The two-dimensional label comprises information such as the category of the target object, the position and the size of the target object in the image and the like. In this way, a first two-dimensional frame may be generated on the image based on the position and size of the target object in the image.
In other embodiments, a two-dimensional box may also be generated directly on the image. When a moving operation and a size adjusting operation for the position of the two-dimensional frame are detected, the position of the two-dimensional frame may be moved and the size of the two-dimensional frame may be adjusted, and the two-dimensional frame with the adjusted position and size may be determined as the first two-dimensional frame.
The first two-dimensional frame is used for selecting the target object; it may surround only the portion of the target object that is visible in the image, or it may also cover portions of the target object that are not visible in the image.
In addition, the first two-dimensional frame may be a rectangular frame, a polygonal frame, or the like, that is, the shape of the first two-dimensional frame may be a rectangle, a polygon, or the like.
For example, as shown in fig. 5, assuming that the target object to be marked is the closest white vehicle, a white solid frame may be generated in the image, where the white solid frame is the first two-dimensional frame, so as to select the target object.
Step 402: according to the first two-dimensional frame, a three-dimensional frame is generated on the point cloud, the three-dimensional frame is located in a viewing cone area corresponding to the target object, and the point cloud and the image are determined for the same scene.
In some embodiments, a three-dimensional box may be generated on the point cloud by the following steps (1) - (3).
(1) And determining a viewing cone region corresponding to the target object from the point cloud according to the first two-dimensional frame.
In some embodiments, a point cloud image may be determined first, where the point cloud image is the image corresponding to the point cloud in an image coordinate system. A plurality of pixel points located within the first two-dimensional frame are then determined from the point cloud image, and the spatial region occupied in the point cloud by the three-dimensional points corresponding to these pixel points is determined as the viewing cone region corresponding to the target object.
The point cloud acquisition component can be a point cloud collector such as a laser radar, a TOF camera and the like, and can also be a binocular camera. Therefore, the manner of determining the point cloud image is described next in two cases.
In the first case, the image is acquired by a camera, and the point cloud is acquired by a point cloud acquisition unit, which may include a TOF camera and/or a lidar. Therefore, the point cloud can be projected into an image coordinate system according to the external parameters between the point cloud collector and the camera, the internal parameters of the camera and the distortion coefficients, and a point cloud image is obtained.
As described above, the point cloud consists of a plurality of three-dimensional points whose coordinates are expressed in the three-dimensional coordinate system of the point cloud collector, and the external parameters between the point cloud collector and the camera can be determined. Based on these external parameters, the point cloud can first be projected into the three-dimensional coordinate system of the camera, that is, the coordinates of all three-dimensional points in the point cloud are converted into the camera's three-dimensional coordinate system. The internal parameters and distortion coefficients of the camera determine the two-dimensional coordinates, in the image coordinate system, of the two-dimensional points (also called pixel points) of the image shot by the camera; using them, all three-dimensional points expressed in the camera's coordinate system can then be converted into the image coordinate system, yielding the point cloud image.
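The following is a hedged sketch of this projection, assuming OpenCV conventions for the extrinsics (a rotation vector and translation mapping collector coordinates into the camera frame), the intrinsic matrix, and the distortion coefficients; it also shows how the projected pixels can be tested against the first two-dimensional frame to pick out the viewing cone region:

```python
import numpy as np
import cv2

def project_point_cloud(points_lidar, rvec, tvec, camera_matrix, dist_coeffs):
    """points_lidar: (N, 3) array in the point cloud collector's coordinate system.
    Returns (N, 2) pixel coordinates in the image coordinate system.
    (In practice, points behind the camera should be discarded beforehand.)"""
    pixels, _ = cv2.projectPoints(points_lidar.astype(np.float64),
                                  rvec, tvec, camera_matrix, dist_coeffs)
    return pixels.reshape(-1, 2)

def frustum_indices(pixels, box):
    """Indices of projected points falling inside a two-dimensional frame
    given as (x_min, y_min, x_max, y_max); the corresponding three-dimensional
    points form the viewing cone region of the selected target object."""
    x_min, y_min, x_max, y_max = box
    inside = ((pixels[:, 0] >= x_min) & (pixels[:, 0] <= x_max) &
              (pixels[:, 1] >= y_min) & (pixels[:, 1] <= y_max))
    return np.nonzero(inside)[0]
```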
In the second case, the image is acquired by a camera, and the point cloud is obtained by converting two images acquired by a binocular camera. In this way, the reference images in the two images acquired by the binocular camera can be converted into an image coordinate system to obtain a point cloud image. Wherein, the pixel point in the reference image has a mapping relation with the three-dimensional point in the point cloud.
First, a process of converting two images acquired by a binocular camera into a point cloud will be described.
The binocular camera captures two images, one of which serves as the reference image and the other as the comparison image. By comparing the comparison image with the reference image, a parallax (disparity) value corresponding to each pixel point in the reference image can be determined, thereby obtaining a parallax image; the parallax image has the same size as the reference image, and the pixel value of each pixel point in the parallax image is the parallax value of the corresponding pixel point in the reference image. Then, a depth image corresponding to the parallax image can be determined; the depth image has the same size as the parallax image, and the pixel value of each pixel point in the depth image is the depth value of the corresponding pixel point in the reference image. Finally, a point cloud can be constructed from the depth image.
That is, the size of the reference image, the size of the parallax image, and the size of the depth image are the same. In addition, the point cloud is constructed by the depth image, so that a mapping relationship exists between the pixel points in the reference image and the three-dimensional points in the point cloud. Further, the reference image is in a two-dimensional coordinate system, and then an image obtained after converting the reference image into the image coordinate system can be used as the above-described point cloud image. In this way, a plurality of pixel points located within the first two-dimensional frame may be determined from the point cloud image.
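A minimal sketch of the disparity-to-point-cloud conversion described above, assuming a rectified stereo pair; fx, fy, cx, cy denote the reference camera's intrinsics and baseline the distance between the two camera centres (these names are illustrative, not from the patent):

```python
import numpy as np

def disparity_to_point_cloud(disparity, fx, fy, cx, cy, baseline):
    """disparity: (H, W) parallax values for the reference image.
    Returns an (M, 3) point cloud; pixels with non-positive disparity are
    dropped, and each remaining reference-image pixel maps to exactly one
    three-dimensional point (the mapping relationship mentioned above)."""
    h, w = disparity.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = disparity > 0
    z = fx * baseline / disparity[valid]      # depth image values
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=-1)
```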
Since the first two-dimensional frame is used for selecting the target object and the point cloud image and the image are located in the same coordinate system, a plurality of pixel points located in the first two-dimensional frame can be determined from the point cloud image, and the space area occupied by the three-dimensional points corresponding to the pixel points in the point cloud can be determined as the viewing cone area corresponding to the target object.
It should be noted that, in the imaging process, a certain two-dimensional frame on the image plane may correspond to a real three-dimensional space, and for some cameras, the space is a frustum, so it is called a viewing cone.
Another point to be noted is that the above-described point cloud may be any one frame of the point clouds that the point cloud acquisition section has currently acquired. The image may be any one of the images currently captured by the camera. Because the point cloud collector and the camera are both installed on the movable equipment, the point cloud collected by the point cloud collector and the image shot by the camera are aimed at the same scene. Alternatively, the acquisition time of the point cloud and the shooting time of the image may be the same time. Thus, two-dimensional images and three-dimensional point clouds are acquired simultaneously, and the influence caused by the movement of an object can be avoided.
Optionally, after the viewing cone region corresponding to the target object is determined, the three-dimensional points located in the viewing cone region may further be emphasized, for example by highlighting or thickening them.
(2) The size of the three-dimensional frame is determined according to the class of the target object.
In some embodiments, the same three-dimensional box size may be set for the same class of objects. That is, the mapping relationship between the category and the three-dimensional frame size is stored in advance. Thus, the size of the three-dimensional frame can be determined from the mapping relation according to the category of the target object.
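Purely as an illustration of such a pre-stored mapping (the categories and dimensions below are assumptions, not values from the patent):

```python
# Assumed default sizes per category: (length, width, height) in metres.
DEFAULT_BOX_SIZE = {
    "car":        (4.5, 1.8, 1.6),
    "pedestrian": (0.6, 0.6, 1.7),
    "building":   (20.0, 15.0, 10.0),
}

def box_size_for(category, fallback=(1.0, 1.0, 1.0)):
    """Size of the three-dimensional frame determined from the category."""
    return DEFAULT_BOX_SIZE.get(category, fallback)
```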
Of course, the size of the three-dimensional frame may be determined in other manners, which are not limited in this embodiment of the present application.
It should be noted that the category of the target object may be a car, a person, a building, or the like, which is not limited in the embodiment of the present application.
(3) And generating the three-dimensional frame on the point cloud according to the position of the view cone region corresponding to the target object and the size of the three-dimensional frame.
In some embodiments, the geometric center of the view cone region corresponding to the target object may be used as the geometric center of the three-dimensional frame, so that the three-dimensional frame is generated on the point cloud according to the size of the three-dimensional frame.
Of course, taking the geometric center of the viewing cone region corresponding to the target object as the geometric center of the three-dimensional frame is only one implementation. In the embodiments of the present application, other points in the viewing cone region corresponding to the target object may also be used as the geometric center of the three-dimensional frame, as long as the chosen center is located within that viewing cone region.
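A hedged sketch of this step, using the geometric centre of the frustum points as the box centre (the box representation and the zero initial orientation are assumptions for illustration):

```python
import numpy as np

def initial_box(frustum_points, size):
    """frustum_points: (N, 3) three-dimensional points of the viewing cone region.
    size: (length, width, height) determined from the target object's category.
    Returns a simple box description; the orientation (yaw) starts at zero and
    is refined by the later adjustment steps."""
    center = frustum_points.mean(axis=0)   # geometric centre of the region
    length, width, height = size
    return {"center": center, "length": length, "width": width,
            "height": height, "yaw": 0.0}
```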
For example, as shown in fig. 6, the three-dimensional box generated on the point cloud may be as the black solid line box in fig. 6.
It should be noted that, in some embodiments, after the three-dimensional frame is generated on the point cloud in the above manner, the three-dimensional frame may be labeled for the target object in the point cloud. That is, the three-dimensional box may enclose the target object in the point cloud. At this time, the operation of the point cloud annotation may end. However, in other embodiments, after the three-dimensional frame is generated on the point cloud in the above manner, the three-dimensional frame may not surround the target object in the point cloud, and at this time, the position and the size of the three-dimensional frame may be adjusted according to step 403 described below, so that the adjusted three-dimensional frame can surround the target object in the point cloud.
Step 403: the position and size of the three-dimensional frame are adjusted according to the position and size of the first two-dimensional frame.
In some embodiments, the position and size of the three-dimensional frame may be adjusted according to the position and size of the first two-dimensional frame by the following steps (1) -step (3).
(1) Performing a primary adjustment of the position and/or the size of the three-dimensional frame.
In some embodiments, a three-dimensional object corresponding to the first two-dimensional frame may also be determined prior to the initial adjustment of the position and/or size of the three-dimensional frame. Thus, when the position and/or the size of the three-dimensional frame are/is primarily adjusted, the position and/or the size of the three-dimensional frame can be primarily adjusted according to the position and the size of the three-dimensional object corresponding to the first two-dimensional frame.
As an example, the three-dimensional object located in the cone region corresponding to the target object may be directly determined as the three-dimensional object corresponding to the first two-dimensional frame. In some cases, however, there may be more than one three-dimensional object located in the cone region corresponding to the target object, at which time the three-dimensional object corresponding to the first two-dimensional frame may be determined by the user. That is, receiving a selection operation triggered by a user in the point cloud, and determining a three-dimensional target selected by the selection operation as a three-dimensional target corresponding to the first two-dimensional frame.
In some embodiments, the three-dimensional frame may be a three-dimensional rectangular frame, so the position and/or the size of the three-dimensional frame may be adjusted for the first time according to the position and the size of the three-dimensional object corresponding to the first two-dimensional frame through at least one of the following three possible implementations.
A first possible implementation: the geometric center of the three-dimensional frame is moved to the geometric center of the three-dimensional target corresponding to the first two-dimensional frame in the top view direction of the point cloud.
Because the three-dimensional frame on the point cloud is subsequently used for labeling the target object, and the first two-dimensional frame is used for selecting the target object on the image, the three-dimensional target corresponding to the first two-dimensional frame can be regarded as the target object. Therefore, after the geometric center of the three-dimensional frame is moved to the geometric center of that three-dimensional target, the target object in the point cloud can be labeled through the three-dimensional frame; this lays the groundwork for the subsequent labeling of the target object in the point cloud and thus improves the subsequent labeling efficiency.
Note that, the top view direction of the point cloud may be a direction as shown in fig. 6, that is, a direction of looking down the ground, and the line of sight of the observation target object may be perpendicular to the ground.
In the first possible implementation, the position of the geometric center of the three-dimensional frame may be adjusted directly by the point cloud labeling device. In other embodiments, however, the position of the geometric center of the three-dimensional frame may also be adjusted through user interaction with the point cloud labeling device; that is, when a fifth adjustment operation is detected, the position of the geometric center of the three-dimensional frame may be adjusted according to the fifth adjustment operation.
A second possible implementation: when the first adjustment operation is detected, the orientation of the three-dimensional frame is adjusted in the top view direction of the point cloud so that the orientation of the adjusted three-dimensional frame is consistent with the orientation of the target object in the image.
The target object may be a movable object such as a vehicle or a pedestrian. When such an object moves towards the movable device, the movable device may be affected; that is, the orientation of the target object has a certain influence on the movable device, and this orientation can generally be presented in the top view direction of the point cloud.
In other words, the three-dimensional frame is directional, that is, not only the target object can be surrounded by the three-dimensional frame, but also the orientation of the target object can be indicated. For example, as shown in fig. 7, the direction of the arrow in fig. 7 may indicate the orientation of the target object.
It should be noted that, because the point cloud contains only discrete three-dimensional points, the orientation of the target object may not be discernible from the three-dimensional points themselves, whereas the orientation of the target object in the image is generally easy to determine. The first adjustment operation may be a single-click, double-click, or drag operation, all of which are implemented through interaction between the user and the point cloud labeling device. Thus, with reference to the orientation of the target object in the image, the user can quickly adjust the orientation of the three-dimensional frame, which improves both the adjustment speed and the adjustment accuracy.
In some cases, after adjustment in at least one of the two ways, the three-dimensional frame is already able to surround the target object, and the orientation of the three-dimensional frame is consistent with the orientation of the target object, at which time subsequent position and size adjustments may not be required. However, since the size of the three-dimensional frame is determined based on the class of the target object, the three-dimensional frame may not necessarily just surround the target object or may not be close to the outline of the target object, and at this time, the position and/or size of the three-dimensional frame may need to be adjusted according to the following implementation manner.
A third possible implementation: when the second adjustment operation is detected, at least one of the horizontal position, the length, and the width of the three-dimensional frame is adjusted in the top view direction of the point cloud so that the outline of the three-dimensional frame in the top view direction of the point cloud is aligned with the outline of the three-dimensional target in the top view direction of the point cloud.
In some embodiments, when the outline of the three-dimensional target corresponding to the first two-dimensional frame is relatively clear, the position and size of the three-dimensional frame can be adjusted according to that outline. In the three-dimensional coordinate system of the point cloud collector, the three-dimensional rectangular frame has a horizontal position and a vertical position, as well as a length, a width, and a height. The three-dimensional frame can be displayed in the top view direction of the point cloud, and its horizontal position, length, and width are all visible in that direction. Therefore, when the second adjustment operation is detected, at least one of the horizontal position, the length, and the width of the three-dimensional frame can be adjusted so that the contour of the three-dimensional frame in the top view direction of the point cloud is aligned with the contour of the three-dimensional target in that direction.
The horizontal position of the three-dimensional frame refers to a position in a direction parallel to the ground. The vertical position of the three-dimensional frame refers to a position in a direction perpendicular to the ground.
In addition, the second adjustment operation may be the same as the first adjustment operation, and of course, the second adjustment operation may also be different from the first adjustment operation, which is not limited in the embodiment of the present application.
Because the second adjustment operation is realized through interaction between the user and the point cloud labeling device, when the outline of the three-dimensional target corresponding to the first two-dimensional frame is relatively clear, the user can refer to that outline to initially adjust at least one of the horizontal position, the length, and the width of the three-dimensional frame, which simplifies the subsequent adjustment operations.
(2) And generating a second two-dimensional frame in the image, wherein the second two-dimensional frame is an outer envelope frame of the projection of the three-dimensional frame on the image.
In some embodiments, each vertex included in the three-dimensional frame may be projected into an image coordinate system to obtain two-dimensional coordinates of projection points corresponding to the vertices, and then, according to the two-dimensional coordinates of the projection points, a projection area of the three-dimensional frame is generated on the image, and then, a two-dimensional frame surrounding the projection area is determined, so as to obtain a second two-dimensional frame.
For example, as shown in fig. 8, the white dashed box in fig. 8 is a second two-dimensional box.
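A hedged sketch of how the second two-dimensional frame could be computed, under the same OpenCV-style assumptions as the earlier projection sketch and assuming an upright rectangular frame described by its centre, dimensions, and yaw:

```python
import numpy as np
import cv2

def box_vertices(center, length, width, height, yaw):
    """Eight corners of an upright three-dimensional rectangular frame,
    rotated by yaw about the vertical axis and translated to the centre."""
    dx, dy, dz = length / 2.0, width / 2.0, height / 2.0
    corners = np.array([[sx * dx, sy * dy, sz * dz]
                        for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
                       dtype=np.float64)
    cos_y, sin_y = np.cos(yaw), np.sin(yaw)
    rot = np.array([[cos_y, -sin_y, 0.0], [sin_y, cos_y, 0.0], [0.0, 0.0, 1.0]])
    return corners @ rot.T + np.asarray(center, dtype=np.float64)

def second_two_dimensional_frame(vertices, rvec, tvec, camera_matrix, dist_coeffs):
    """Project the box vertices into the image and take the outer envelope
    (x_min, y_min, x_max, y_max) of the projection as the second 2D frame."""
    pixels, _ = cv2.projectPoints(vertices, rvec, tvec, camera_matrix, dist_coeffs)
    pixels = pixels.reshape(-1, 2)
    x_min, y_min = pixels.min(axis=0)
    x_max, y_max = pixels.max(axis=0)
    return x_min, y_min, x_max, y_max
```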
(3) And carrying out secondary adjustment on the position and/or the size of the three-dimensional frame after the primary adjustment according to the position and the size of the second two-dimensional frame and the position and the size of the first two-dimensional frame, so that the target object in the point cloud can be surrounded by the three-dimensional frame after the secondary adjustment.
In some embodiments, when the third adjustment operation is detected, at least one of a vertical position and a height of the three-dimensional frame is adjusted in a front view direction of the point cloud so that a position and a size of upper and lower edges of the second two-dimensional frame are aligned with a position and a size of upper and lower edges of the first two-dimensional frame. And/or, when the fourth adjustment operation is detected, adjusting at least one of the horizontal position, the length, the width, and the orientation of the three-dimensional frame in the top view direction of the point cloud so that the positions and the dimensions of the left and right edges of the second two-dimensional frame are aligned with the positions and the dimensions of the left and right edges of the first two-dimensional frame.
Because the vertical position and height of the three-dimensional frame can be seen in the front view, the three-dimensional frame can be displayed in the front view direction of the point cloud. In addition, since the second two-dimensional frame is the outer envelope frame of the projection of the three-dimensional frame on the image, whenever the position or size of the three-dimensional frame changes, the position and size of the second two-dimensional frame change correspondingly, while the first two-dimensional frame selects the target object in the image. Therefore, when the third adjustment operation is detected, at least one of the vertical position and the height of the three-dimensional frame can be adjusted by comparing the position and size of the upper and lower edges of the second two-dimensional frame with those of the upper and lower edges of the first two-dimensional frame. Similarly, when the fourth adjustment operation is detected, at least one of the horizontal position, length, width, and orientation of the three-dimensional frame can be adjusted by comparing the position and size of the left and right edges of the second two-dimensional frame with those of the left and right edges of the first two-dimensional frame.
Because the outline and the orientation of the target object in the image are clearer, under the condition that the outline of the three-dimensional target is not clear enough, the position and the size of the first two-dimensional frame and the position and the size of the second two-dimensional frame in the image can be combined, and at least one of the position, the size and the orientation of the three-dimensional frame can be adjusted, so that the labeling accuracy of the target object can be ensured.
It should be noted that the front view direction of the point cloud may be a direction parallel to the ground, that is, the line of sight of the observation target object may be parallel to the ground.
In addition, the third adjustment operation and the fourth adjustment operation may be the same as or different from the first adjustment operation, which is not limited in the embodiment of the present application.
Optionally, before adjusting at least one of the vertical position and the height of the three-dimensional frame, the point cloud data that lies outside the three-dimensional frame in the top view plane may also be hidden.
Because the line of sight in the front view direction is parallel to the ground, other objects on the ground may lie in the same horizontal direction as the target object. Therefore, in order to adjust at least one of the vertical position and the height of the three-dimensional frame more efficiently in the front view direction, the point cloud data lying outside the three-dimensional frame in the top view plane may be hidden first. In this way, when at least one of the vertical position and the height of the three-dimensional frame is adjusted in the front view direction, the three-dimensional frame is not occluded by other objects, which improves the labeling efficiency for the target object.
It should be noted that, after the above steps are performed, the target object may be enclosed within a three-dimensional frame. Thus, the point cloud labeling process can be completed. However, in some cases, in addition to labeling the target object in the point cloud through a three-dimensional box, some other information of the target object needs to be labeled. Such as some text labels, or some color information. At this point, the target object may be further labeled as per step 404 described below.
Step 404: and marking the target object in the point cloud according to the adjusted three-dimensional frame.
The adjusted three-dimensional frame encloses the target object, so information such as the length, the width, and the height of the target object can be determined from the corresponding dimensions of the three-dimensional frame and used as a three-dimensional label of the target object. Further, in a possible implementation, the three-dimensional coordinates of the movable device in the world coordinate system can be determined from the positioning information of the movable device, and the three-dimensional coordinates of the target object in the world coordinate system can then be determined from the three-dimensional coordinates of the movable device in the world coordinate system and the three-dimensional coordinates of the target object in the coordinate system of the point cloud collector. The three-dimensional coordinates of the target object in the world coordinate system can also be used as a three-dimensional label of the target object.
The world coordinate system is an absolute coordinate system and can be preset according to usage requirements, which is not limited in the embodiments of the present application. In addition, the category of the target object determined in the above steps may also be used as a three-dimensional label of the target object. Of course, the three-dimensional label of the target object may include other data as well, which is not limited in the embodiments of the present application.
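The following sketch illustrates how such a three-dimensional label could be assembled, under the assumptions stated above: the pose of the movable device in the world coordinate system is known from its positioning information, and the coordinates of the target object are given in the coordinate system of the point cloud collector. All names and numeric values are illustrative only.

import numpy as np

def target_in_world(p_target_lidar, R_world_device, t_world_device,
                    R_device_lidar=np.eye(3), t_device_lidar=np.zeros(3)):
    """Chain the collector->device and device->world rigid transforms."""
    p_device = R_device_lidar @ p_target_lidar + t_device_lidar
    return R_world_device @ p_device + t_world_device

label = {
    "size_lwh": (4.5, 1.8, 1.6),   # taken from the adjusted three-dimensional frame
    "world_xyz": target_in_world(np.array([12.0, -3.2, 0.8]),
                                 np.eye(3), np.array([300.0, 150.0, 0.0])).tolist(),
    "category": "car",             # category determined in the earlier steps
}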
Because the point cloud acquisition component and the camera differ in installation position and angle, the two components observe different spatial ranges, that is, there is parallax between them. As a result, the target object may be completely occluded in the image while its outline in the point cloud remains clear. In this case, a three-dimensional frame may be generated directly in the point cloud, and the position and/or the size of the three-dimensional frame may be adjusted through interaction between the user and the point cloud labeling device, so that the target object in the point cloud is labeled according to the adjusted three-dimensional frame.
In the embodiments of the present application, after point cloud labeling is performed by the above method, the point cloud labeled with the three-dimensional frame can be used as a sample point cloud to train an initial object recognition network and obtain an object recognition model. Then, when object recognition is performed, a point cloud including an object to be recognized can be recognized through the object recognition model.
It should be noted that the point cloud labeled with the three-dimensional frame may be used not only to train the initial object recognition network but also in other scenarios, which is not limited in the embodiments of the present application.
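Purely as an illustration of this training step, the sketch below uses the labeled point clouds as samples for an initial object recognition network; the dataset, the network, and its loss function are placeholders and are not specified by the present application.

import torch
from torch.utils.data import DataLoader

def train(initial_network, labeled_cloud_dataset, epochs=10, lr=1e-3):
    loader = DataLoader(labeled_cloud_dataset, batch_size=4, shuffle=True)
    optimizer = torch.optim.Adam(initial_network.parameters(), lr=lr)
    for _ in range(epochs):
        for clouds, boxes3d in loader:                 # boxes3d: annotated 3D frames
            pred = initial_network(clouds)
            loss = initial_network.loss(pred, boxes3d)  # placeholder loss function
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return initial_network                             # the trained object recognition model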
Since three-dimensional points included in the point cloud are sparse, for a target object that is far away or a target object that is severely occluded, it is difficult to observe the pose and contour of the target object only with the naked eye. However, the pixels in the image are dense, especially for a high-resolution camera, the pose and contour of the target object can be easily determined even if the target object is far away or is severely blocked. Therefore, the target object in the point cloud is marked by combining the image, so that the marking difficulty can be reduced, and the marking efficiency and the marking accuracy can be improved.
Fig. 9 is a block diagram of a point cloud labeling apparatus provided in an embodiment of the present application, where the apparatus may be applied to a point cloud labeling device, and referring to fig. 9, the apparatus includes a first generating module 901 and a second generating module 902.
The first generating module 901 is configured to generate a first two-dimensional frame on the image, so as to select a target object to be annotated currently;
The second generating module 902 is configured to generate a three-dimensional frame on the point cloud according to the first two-dimensional frame, so as to label the target object in the point cloud, where the three-dimensional frame is located in the view cone area corresponding to the target object, and the point cloud and the image are determined for the same scene.
Optionally, the second generating module 902 includes:
the first determining submodule is used for determining a view cone area corresponding to the target object from the point cloud according to the first two-dimensional frame;
a second determining sub-module for determining the size of the three-dimensional frame according to the category of the target object;
the first generation sub-module is used for generating the three-dimensional frame on the point cloud according to the position of the view cone area corresponding to the target object and the size of the three-dimensional frame.
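A possible realization of the last two sub-modules is sketched below: a per-category default size is looked up for the three-dimensional frame, and the frame is initially placed at the centroid of the points in the view cone region. The category names and default sizes are illustrative assumptions, not values given in the present application.

import numpy as np

DEFAULT_SIZE_LWH = {"car": (4.5, 1.8, 1.6), "pedestrian": (0.6, 0.6, 1.7),
                    "cyclist": (1.8, 0.6, 1.7)}

def initial_box(frustum_points, category):
    """Place a frame of the category's default size at the frustum centroid."""
    size = DEFAULT_SIZE_LWH[category]
    center = frustum_points.mean(axis=0)
    yaw = 0.0    # orientation is refined later by the adjustment operations
    return center, size, yaw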
Optionally, the first determining submodule includes:
a first determining unit, configured to determine a point cloud image, where the point cloud image is an image corresponding to the point cloud in an image coordinate system;
a second determining unit, configured to determine a plurality of pixel points located in the first two-dimensional frame from the point cloud image;
and the third determining unit is used for determining the space area occupied by the corresponding three-dimensional points of the plurality of pixel points in the point cloud as the viewing cone area corresponding to the target object.
Optionally, the image is acquired by a camera, the point cloud is acquired by a point cloud acquisition device, and the point cloud acquisition device comprises a time-of-flight TOF camera and/or a laser radar;
the first determining unit is specifically configured to:
and projecting the point cloud into an image coordinate system according to the external parameters between the point cloud collector and the camera, the internal parameters of the camera and the distortion coefficients to obtain a point cloud image.
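A sketch of this projection, assuming OpenCV conventions: the external parameters between the point cloud collector and the camera are given as a rotation vector and a translation vector, the internal parameters as a 3x3 matrix K, and the distortion coefficients as dist; the points whose projections fall inside the first two-dimensional frame then form the view cone region.

import numpy as np
import cv2

def frustum_points(cloud_xyz, rvec, tvec, K, dist, first_box2d):
    """Return the 3D points whose image projections fall inside the first 2D frame."""
    uv, _ = cv2.projectPoints(cloud_xyz.astype(np.float64), rvec, tvec, K, dist)
    uv = uv.reshape(-1, 2)                      # pixels of the "point cloud image"
    u0, v0, u1, v1 = first_box2d
    inside = (uv[:, 0] >= u0) & (uv[:, 0] <= u1) & (uv[:, 1] >= v0) & (uv[:, 1] <= v1)
    # Points behind the camera are excluded (assumed convention: the camera looks
    # along +z after the extrinsic transform).
    R, _ = cv2.Rodrigues(rvec)
    z_cam = (cloud_xyz @ R.T + tvec.reshape(1, 3))[:, 2]
    return cloud_xyz[inside & (z_cam > 0)]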
Optionally, the images are acquired by a camera, and the point cloud is obtained by converting two images acquired by a binocular camera;
the first determining unit is specifically configured to:
and converting reference images in the two images acquired by the binocular camera into an image coordinate system to obtain a point cloud image, wherein the pixel points in the reference image have a mapping relation with three-dimensional points in the point cloud.
Optionally, referring to fig. 10, the apparatus further includes:
the adjusting module 903 is configured to adjust the position and the size of the three-dimensional frame according to the position and the size of the first two-dimensional frame.
Optionally, the adjustment module 903 includes:
the first adjusting sub-module is used for carrying out primary adjustment on the position and/or the size of the three-dimensional frame;
the second generation submodule is used for generating a second two-dimensional frame in the image, wherein the second two-dimensional frame is an outer envelope frame of projection of the three-dimensional frame on the image;
And the second adjustment sub-module is used for carrying out secondary adjustment on the position and/or the size of the three-dimensional frame after the primary adjustment according to the position and the size of the second two-dimensional frame and the position and the size of the first two-dimensional frame so that the three-dimensional frame after the secondary adjustment can surround a target object in the point cloud.
Optionally, the adjustment module 903 further includes:
a third determination sub-module for determining a three-dimensional object corresponding to the first two-dimensional box;
the first adjustment submodule is specifically configured to:
and performing primary adjustment on the position and/or the size of the three-dimensional frame according to the position and the size of the three-dimensional target corresponding to the first two-dimensional frame.
Optionally, the three-dimensional frame is a three-dimensional rectangular frame;
the first adjustment submodule is specifically configured to:
moving the geometric center of the three-dimensional frame to the geometric center of the three-dimensional target in the top view direction of the point cloud; and/or
When the first adjustment operation is detected, adjusting the orientation of the three-dimensional frame in the top view direction so that the orientation of the adjusted three-dimensional frame is consistent with the orientation of the target object in the image; and/or
When the second adjustment operation is detected, at least one of the horizontal position, the length, and the width of the three-dimensional frame is adjusted in the top view direction so that the contour of the three-dimensional frame in the top view direction is aligned with the contour of the three-dimensional object in the top view direction.
Optionally, the three-dimensional frame is a three-dimensional rectangular frame;
the second adjustment submodule is specifically configured to:
when the third adjustment operation is detected, adjusting at least one of the vertical position and the height of the three-dimensional frame in the front view direction of the point cloud so that the position and the size of the upper and lower edges of the second two-dimensional frame are aligned with the position and the size of the upper and lower edges of the first two-dimensional frame; and/or
When the fourth adjustment operation is detected, at least one of the horizontal position, the length, the width, and the orientation of the three-dimensional frame is adjusted in the top view direction of the point cloud so that the position and the size of the left and right edges of the second two-dimensional frame are aligned with the position and the size of the left and right edges of the first two-dimensional frame.
Optionally, the apparatus further comprises:
the training module is used for training the initial object recognition network by taking the point cloud marked with the three-dimensional frame as a sample point cloud to obtain an object recognition model;
and the identification module is used for identifying the point cloud comprising the object to be identified through the object identification model when the object identification is carried out.
Since three-dimensional points included in the point cloud are sparse, for a target object that is far away or a target object that is severely occluded, it is difficult to observe the pose and contour of the target object only with the naked eye. However, the pixels in the image are dense, especially for a high-resolution camera, the pose and contour of the target object can be easily determined even if the target object is far away or is severely blocked. Therefore, the target object in the point cloud is marked by combining the image, so that the marking difficulty can be reduced, and the marking efficiency and the marking accuracy can be improved.
It should be noted that: in the point cloud labeling device provided in the above embodiment, only the division of the above functional modules is used for illustration when point cloud labeling is performed, and in practical application, the above functional allocation may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the point cloud labeling device and the point cloud labeling method provided in the foregoing embodiments belong to the same concept, and specific implementation processes of the point cloud labeling device and the point cloud labeling method are detailed in the method embodiments and are not described herein again.
Fig. 11 is a schematic structural diagram of a point cloud labeling device provided in an embodiment of the present application. The point cloud labeling device 1100 may vary considerably in configuration or performance, and may include one or more processors (central processing units, CPU) 1101 and one or more memories 1102, where at least one instruction is stored in the memories 1102 and is loaded and executed by the processors 1101. Of course, the point cloud labeling device 1100 may also have a wired or wireless network interface, a keyboard, an input/output interface, and other components for implementing the functions of the device, which are not described here again.
In an exemplary embodiment, a computer-readable storage medium, such as a memory comprising instructions, is also provided, where the instructions are executable by a processor in the point cloud labeling device to perform the point cloud labeling method of the above embodiments. For example, the computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
In an exemplary embodiment, there is also provided a point cloud labeling system, including a movable device and a point cloud labeling device, the movable device including a point cloud acquisition component and a camera;
the point cloud acquisition component is used for acquiring point cloud and sending the point cloud to the point cloud labeling equipment;
the camera is used for capturing an image and sending the image to the point cloud labeling device, wherein the point cloud and the image are determined for the same scene;
the point cloud labeling equipment is used for receiving the point cloud and the image, and generating a first two-dimensional frame on the image so as to select a target object to be labeled currently; and generating a three-dimensional frame on the point cloud according to the first two-dimensional frame so as to mark the target object in the point cloud, wherein the three-dimensional frame is positioned in a viewing cone area corresponding to the target object.
Optionally, the point cloud labeling device is configured to generate a three-dimensional frame on the point cloud according to the first two-dimensional frame, and includes:
Determining a view cone region corresponding to the target object from the point cloud according to the first two-dimensional frame;
determining the size of the three-dimensional frame according to the category of the target object;
and generating a three-dimensional frame on the point cloud according to the position of the view cone region corresponding to the target object and the size of the three-dimensional frame.
Optionally, the point cloud labeling device is configured to determine, according to a first two-dimensional frame, a view cone region corresponding to the target object from the point cloud, including:
determining a point cloud image, wherein the point cloud image is an image corresponding to the point cloud in an image coordinate system;
determining a plurality of pixel points positioned in a first two-dimensional frame from the point cloud image;
and determining a space area occupied by the corresponding three-dimensional points of the plurality of pixel points in the point cloud as a viewing cone area corresponding to the target object.
Optionally, the image is acquired by a camera, the point cloud is acquired by a point cloud acquisition device, and the point cloud acquisition device comprises a time-of-flight TOF camera and/or a laser radar;
the point cloud labeling device is used for determining a point cloud image, and comprises:
and projecting the point cloud into an image coordinate system according to the external parameters between the point cloud collector and the camera, the internal parameters of the camera and the distortion coefficients to obtain a point cloud image.
Optionally, the images are acquired by a camera, and the point cloud is obtained by converting two images acquired by a binocular camera;
The point cloud labeling device is used for determining a point cloud image, and comprises:
and converting reference images in the two images acquired by the binocular camera into an image coordinate system to obtain a point cloud image, wherein the pixel points in the reference image have a mapping relation with three-dimensional points in the point cloud.
Optionally, the point cloud labeling device is configured to adjust the position and the size of the three-dimensional frame according to the position and the size of the first two-dimensional frame after generating a three-dimensional frame on the point cloud according to the first two-dimensional frame.
Optionally, the point cloud labeling device is configured to adjust the position and the size of the three-dimensional frame according to the position and the size of the first two-dimensional frame, including:
performing primary adjustment on the position and/or the size of the three-dimensional frame;
generating a second two-dimensional frame in the image, wherein the second two-dimensional frame is an outer envelope frame of projection of the three-dimensional frame on the image;
and carrying out secondary adjustment on the position and/or the size of the three-dimensional frame after the primary adjustment according to the position and the size of the second two-dimensional frame and the position and the size of the first two-dimensional frame, so that the target object in the point cloud can be surrounded by the three-dimensional frame after the secondary adjustment.
Optionally, before the point cloud labeling device is used for performing primary adjustment on the position and/or the size of the three-dimensional frame, the point cloud labeling device is further used for determining a three-dimensional target corresponding to the first two-dimensional frame;
The point cloud labeling device is used for primarily adjusting the position and/or the size of the three-dimensional frame, and comprises the following components:
and performing primary adjustment on the position and/or the size of the three-dimensional frame according to the position and the size of the three-dimensional target corresponding to the first two-dimensional frame.
Optionally, the three-dimensional frame is a three-dimensional rectangular frame;
the point cloud labeling device is used for primarily adjusting the position and/or the size of the three-dimensional frame according to the position and the size of the three-dimensional target corresponding to the first two-dimensional frame, and comprises the following steps:
moving the geometric center of the three-dimensional frame to the geometric center of the three-dimensional target in the top view direction of the point cloud; and/or
When the first adjustment operation is detected, adjusting the orientation of the three-dimensional frame in the top view direction so that the orientation of the adjusted three-dimensional frame is consistent with the orientation of the target object in the image; and/or
When the second adjustment operation is detected, at least one of the horizontal position, the length, and the width of the three-dimensional frame is adjusted in the top view direction so that the contour of the three-dimensional frame in the top view direction is aligned with the contour of the three-dimensional object in the top view direction.
Optionally, the three-dimensional frame is a three-dimensional rectangular frame;
the point cloud labeling device is used for performing secondary adjustment on the position and/or the size of the three-dimensional frame after primary adjustment according to the position and the size of the second two-dimensional frame and the position and the size of the first two-dimensional frame, and comprises the following steps:
When the third adjustment operation is detected, adjusting at least one of the vertical position and the height of the three-dimensional frame in the front view direction of the point cloud so that the position and the size of the upper and lower edges of the second two-dimensional frame are aligned with the position and the size of the upper and lower edges of the first two-dimensional frame; and/or
When the fourth adjustment operation is detected, at least one of the horizontal position, the length, the width, and the orientation of the three-dimensional frame is adjusted in the top view direction of the point cloud so that the position and the size of the left and right edges of the second two-dimensional frame are aligned with the position and the size of the left and right edges of the first two-dimensional frame.
Optionally, the point cloud labeling device is further configured to train the initial object recognition network by using the point cloud labeled with the three-dimensional frame as a sample point cloud to obtain an object recognition model; when the object is identified, the point cloud comprising the object to be identified is identified through the object identification model.
In an exemplary embodiment, a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of the point cloud labeling method described above is also provided.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing description is merely illustrative of preferred embodiments of the present application and is not intended to limit the present application; any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the protection scope of the present application.

Claims (17)

1. A point cloud labeling method, the method comprising:
generating a two-dimensional frame on the image; when detecting a moving operation for the position of the two-dimensional frame and an adjusting operation for the size of the two-dimensional frame, moving the position of the two-dimensional frame, adjusting the size of the two-dimensional frame, and determining the two-dimensional frame subjected to the position movement and the size adjustment as a first two-dimensional frame so as to select a target object to be marked currently;
generating a three-dimensional frame on the point cloud according to the first two-dimensional frame so as to mark the target object in the point cloud, wherein the three-dimensional frame is positioned in a viewing cone area corresponding to the target object, the point cloud and the image are determined for the same scene, and the three-dimensional frame is a three-dimensional rectangular frame;
performing primary adjustment on the position and/or the size of the three-dimensional frame; generating a second two-dimensional frame in the image, wherein the second two-dimensional frame is an outer envelope frame of projection of the three-dimensional frame on the image; according to the position and the size of the second two-dimensional frame and the position and the size of the first two-dimensional frame, the position and/or the size of the three-dimensional frame after primary adjustment is subjected to secondary adjustment, so that the target object in the point cloud can be surrounded by the three-dimensional frame after secondary adjustment;
The secondary adjustment of the position and/or the size of the three-dimensional frame after the primary adjustment is performed according to the position and the size of the second two-dimensional frame and the position and the size of the first two-dimensional frame, including: when a third adjustment operation is detected, adjusting at least one of a vertical position and a height of the three-dimensional frame in a front view direction of the point cloud so that a position and a size of upper and lower edges of the second two-dimensional frame are aligned with those of upper and lower edges of the first two-dimensional frame; and/or, when the fourth adjustment operation is detected, adjusting at least one of the horizontal position, the length, the width, and the orientation of the three-dimensional frame in the top view direction of the point cloud so that the positions and the dimensions of the left and right edges of the second two-dimensional frame are aligned with the positions and the dimensions of the left and right edges of the first two-dimensional frame.
2. The method of claim 1, wherein generating a three-dimensional box on the point cloud from the first two-dimensional box comprises:
determining a view cone region corresponding to the target object from the point cloud according to the first two-dimensional frame;
determining the size of the three-dimensional frame according to the category of the target object;
And generating the three-dimensional frame on the point cloud according to the position of the view cone region corresponding to the target object and the size of the three-dimensional frame.
3. The method of claim 2, wherein the determining, from the point cloud, a view cone region corresponding to the target object according to the first two-dimensional frame, includes:
determining a point cloud image, wherein the point cloud image refers to an image corresponding to the point cloud in an image coordinate system;
determining a plurality of pixel points located in the first two-dimensional frame from the point cloud image;
and determining the space area occupied by the corresponding three-dimensional points of the plurality of pixel points in the point cloud as a viewing cone area corresponding to the target object.
4. A method according to claim 3, wherein the image is acquired by a camera, the point cloud is acquired by a point cloud acquisition device comprising a time of flight TOF camera and/or a lidar;
the determining the point cloud image includes:
and projecting the point cloud into the image coordinate system according to the external parameters between the point cloud collector and the camera, the internal parameters of the camera and the distortion coefficients to obtain the point cloud image.
5. The method of claim 3, wherein the image is acquired by a camera and the point cloud is obtained by converting two images acquired by a binocular camera;
the determining the point cloud image includes:
and converting reference images in the two images acquired by the binocular camera into the image coordinate system to obtain the point cloud image, wherein the pixel points in the reference images have a mapping relationship with three-dimensional points in the point cloud.
6. The method of claim 1, wherein prior to the initial adjustment of the position and/or size of the three-dimensional frame, further comprising:
determining a three-dimensional object corresponding to the first two-dimensional frame;
the primary adjustment of the position and/or the size of the three-dimensional frame comprises the following steps:
and performing primary adjustment on the position and/or the size of the three-dimensional frame according to the position and the size of the three-dimensional target corresponding to the first two-dimensional frame.
7. The method of claim 6, wherein the primarily adjusting the position and/or the size of the three-dimensional frame according to the position and the size of the three-dimensional object corresponding to the first two-dimensional frame comprises:
Moving the geometric center of the three-dimensional frame to the geometric center of the three-dimensional target in the overlooking direction of the point cloud; and/or
When a first adjustment operation is detected, adjusting the orientation of the three-dimensional frame in a top view direction so that the orientation of the adjusted three-dimensional frame coincides with the orientation of the target object in the image; and/or
When a second adjustment operation is detected, at least one of the horizontal position, the length, and the width of the three-dimensional frame is adjusted in the top-down direction so that the contour of the three-dimensional frame in the top-down direction is aligned with the contour of the three-dimensional object in the top-down direction.
8. The method of claim 1, wherein the method further comprises:
taking the point cloud marked with the three-dimensional frame as a sample point cloud, and training an initial object recognition network to obtain an object recognition model;
when the object is identified, the point cloud comprising the object to be identified is identified through the object identification model.
9. A point cloud labeling apparatus, the apparatus comprising:
the first generation module is used for generating a two-dimensional frame on the image; when detecting a moving operation for the position of the two-dimensional frame and an adjusting operation for the size of the two-dimensional frame, moving the position of the two-dimensional frame, adjusting the size of the two-dimensional frame, and determining the two-dimensional frame subjected to the position movement and the size adjustment as a first two-dimensional frame so as to select a target object to be marked currently;
The second generation module is used for generating a three-dimensional frame on the point cloud according to the first two-dimensional frame so as to mark the target object in the point cloud, the three-dimensional frame is positioned in a view cone area corresponding to the target object, and the point cloud and the image are determined aiming at the same scene;
the apparatus further comprises:
the adjusting module is used for adjusting the position and the size of the three-dimensional frame according to the position and the size of the first two-dimensional frame;
the adjustment module includes:
the first adjusting sub-module is used for carrying out primary adjustment on the position and/or the size of the three-dimensional frame;
a second generation sub-module, configured to generate a second two-dimensional frame in the image, where the second two-dimensional frame is an outer envelope frame of the projection of the three-dimensional frame on the image;
the second adjustment sub-module is used for carrying out secondary adjustment on the position and/or the size of the three-dimensional frame after the primary adjustment according to the position and the size of the second two-dimensional frame and the position and the size of the first two-dimensional frame so that the target object in the point cloud can be surrounded by the three-dimensional frame after the secondary adjustment;
the three-dimensional frame is a three-dimensional rectangular frame; the second adjusting submodule is specifically used for:
When a third adjustment operation is detected, adjusting at least one of a vertical position and a height of the three-dimensional frame in a front view direction of the point cloud so that a position and a size of upper and lower edges of the second two-dimensional frame are aligned with those of upper and lower edges of the first two-dimensional frame; and/or
When a fourth adjustment operation is detected, at least one of the horizontal position, length, width, and orientation of the three-dimensional frame is adjusted in the top view direction of the point cloud so that the position and size of the left and right edges of the second two-dimensional frame are aligned with the position and size of the left and right edges of the first two-dimensional frame.
10. The apparatus of claim 9, wherein the second generation module comprises:
the first determining submodule is used for determining a view cone area corresponding to the target object from the point cloud according to the first two-dimensional frame;
a second determining submodule, configured to determine a size of the three-dimensional frame according to a category of the target object;
and the first generation submodule is used for generating the three-dimensional frame on the point cloud according to the position of the view cone region corresponding to the target object and the size of the three-dimensional frame.
11. The apparatus of claim 10, wherein the first determination submodule comprises:
The first determining unit is used for determining a point cloud image, wherein the point cloud image refers to an image corresponding to the point cloud in an image coordinate system;
a second determining unit, configured to determine a plurality of pixel points located in the first two-dimensional frame from the point cloud image;
and the third determining unit is used for determining the space area occupied by the three-dimensional points corresponding to the pixel points in the point cloud as the viewing cone area corresponding to the target object.
12. The apparatus of claim 11, wherein the image is acquired by a camera, the point cloud is acquired by a point cloud acquisition device, the point cloud acquisition device comprises a time of flight TOF camera and/or a lidar;
the first determining unit is specifically configured to:
and projecting the point cloud into the image coordinate system according to the external parameters between the point cloud collector and the camera, the internal parameters of the camera and the distortion coefficients to obtain the point cloud image.
13. The apparatus of claim 11, wherein the image is acquired by a camera and the point cloud is obtained by converting two images acquired by a binocular camera;
The first determining unit is specifically configured to:
and converting reference images in the two images acquired by the binocular camera into the image coordinate system to obtain the point cloud image, wherein the pixel points in the reference images have a mapping relationship with three-dimensional points in the point cloud.
14. The apparatus of claim 9, wherein the adjustment module further comprises:
a third determination sub-module for determining a three-dimensional object corresponding to the first two-dimensional box;
the first adjustment submodule is specifically configured to:
and performing primary adjustment on the position and/or the size of the three-dimensional frame according to the position and the size of the three-dimensional target corresponding to the first two-dimensional frame.
15. The apparatus of claim 14, wherein the first adjustment submodule is specifically configured to:
moving the geometric center of the three-dimensional frame to the geometric center of the three-dimensional target in the overlooking direction of the point cloud; and/or
When a first adjustment operation is detected, adjusting the orientation of the three-dimensional frame in a top view direction so that the orientation of the adjusted three-dimensional frame coincides with the orientation of the target object in the image; and/or
When a second adjustment operation is detected, at least one of the horizontal position, the length, and the width of the three-dimensional frame is adjusted in the top-down direction so that the contour of the three-dimensional frame in the top-down direction is aligned with the contour of the three-dimensional object in the top-down direction.
16. The apparatus of claim 9, wherein the apparatus further comprises:
the training module is used for training the initial object recognition network by taking the point cloud marked with the three-dimensional frame as a sample point cloud to obtain an object recognition model;
and the identification module is used for identifying the point cloud comprising the object to be identified through the object identification model when the object identification is carried out.
17. A point cloud labeling system, which is characterized by comprising a movable device and a point cloud labeling device, wherein the movable device comprises a point cloud acquisition component and a camera;
the point cloud acquisition component is used for acquiring point clouds and sending the point clouds to the point cloud labeling equipment;
the camera is used for shooting an image, sending the image to the point cloud labeling equipment, and determining the point cloud and the image aiming at the same scene;
the point cloud labeling equipment is used for receiving the point cloud and the image and generating a two-dimensional frame on the image; when detecting a moving operation for the position of the two-dimensional frame and an adjusting operation for the size of the two-dimensional frame, moving the position of the two-dimensional frame, adjusting the size of the two-dimensional frame, and determining the two-dimensional frame subjected to the position movement and the size adjustment as a first two-dimensional frame so as to select a target object to be marked currently; generating a three-dimensional frame on the point cloud according to the first two-dimensional frame so as to mark the target object in the point cloud, wherein the three-dimensional frame is positioned in a viewing cone area corresponding to the target object, and the three-dimensional frame is a three-dimensional rectangular frame;
Performing primary adjustment on the position and/or the size of the three-dimensional frame; generating a second two-dimensional frame in the image, wherein the second two-dimensional frame is an outer envelope frame of projection of the three-dimensional frame on the image; according to the position and the size of the second two-dimensional frame and the position and the size of the first two-dimensional frame, the position and/or the size of the three-dimensional frame after primary adjustment is subjected to secondary adjustment, so that the target object in the point cloud can be surrounded by the three-dimensional frame after secondary adjustment;
the secondary adjustment of the position and/or the size of the three-dimensional frame after the primary adjustment is performed according to the position and the size of the second two-dimensional frame and the position and the size of the first two-dimensional frame, including: when a third adjustment operation is detected, adjusting at least one of a vertical position and a height of the three-dimensional frame in a front view direction of the point cloud so that a position and a size of upper and lower edges of the second two-dimensional frame are aligned with those of upper and lower edges of the first two-dimensional frame; and/or, when the fourth adjustment operation is detected, adjusting at least one of the horizontal position, the length, the width, and the orientation of the three-dimensional frame in the top view direction of the point cloud so that the positions and the dimensions of the left and right edges of the second two-dimensional frame are aligned with the positions and the dimensions of the left and right edges of the first two-dimensional frame.
CN201911268207.6A 2019-12-11 2019-12-11 Point cloud labeling method, device and system Active CN112950785B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201911268207.6A CN112950785B (en) 2019-12-11 2019-12-11 Point cloud labeling method, device and system
PCT/CN2020/122521 WO2021114884A1 (en) 2019-12-11 2020-10-21 Point cloud labeling method, apparatus, and system, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911268207.6A CN112950785B (en) 2019-12-11 2019-12-11 Point cloud labeling method, device and system

Publications (2)

Publication Number Publication Date
CN112950785A CN112950785A (en) 2021-06-11
CN112950785B true CN112950785B (en) 2023-05-30

Family

ID=76234061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911268207.6A Active CN112950785B (en) 2019-12-11 2019-12-11 Point cloud labeling method, device and system

Country Status (2)

Country Link
CN (1) CN112950785B (en)
WO (1) WO2021114884A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230011818A1 (en) * 2021-07-07 2023-01-12 Faro Technologies, Inc. Detection of computer-aided design (cad) objects in point clouds
CN114185476A (en) * 2021-11-18 2022-03-15 路米科技(江苏)有限公司 Stereo frame interaction method and system
CN113903029B (en) * 2021-12-10 2022-03-22 智道网联科技(北京)有限公司 Method and device for marking 3D frame in point cloud data
CN114495038B (en) * 2022-01-12 2023-04-07 九识(苏州)智能科技有限公司 Post-processing method for automatic driving detection marking data
CN114089836B (en) * 2022-01-20 2023-02-28 中兴通讯股份有限公司 Labeling method, terminal, server and storage medium
CN114549644A (en) * 2022-02-24 2022-05-27 北京百度网讯科技有限公司 Data labeling method and device, electronic equipment and storage medium
CN114895832B (en) * 2022-05-17 2023-08-08 网易(杭州)网络有限公司 Object adjustment method, device, electronic equipment and computer readable medium
CN114792343B (en) * 2022-06-21 2022-09-30 阿里巴巴达摩院(杭州)科技有限公司 Calibration method of image acquisition equipment, method and device for acquiring image data
CN115239899B (en) * 2022-06-29 2023-09-05 北京百度网讯科技有限公司 Pose map generation method, high-precision map generation method and device
CN115171097B (en) * 2022-09-05 2022-12-09 中科航迈数控软件(深圳)有限公司 Processing control method and system based on three-dimensional point cloud and related equipment
CN116309962B (en) * 2023-05-10 2023-09-26 倍基智能科技(四川)有限公司 Laser radar point cloud data labeling method, system and application

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016185637A1 (en) * 2015-05-20 2016-11-24 三菱電機株式会社 Point-cloud-image generation device and display system
WO2018133851A1 (en) * 2017-01-22 2018-07-26 腾讯科技(深圳)有限公司 Point cloud data processing method and apparatus, and computer storage medium
CN108734120A (en) * 2018-05-15 2018-11-02 百度在线网络技术(北京)有限公司 Mark method, apparatus, equipment and the computer readable storage medium of image
CN109727312A (en) * 2018-12-10 2019-05-07 广州景骐科技有限公司 Point cloud mask method, device, computer equipment and storage medium
CN110135453A (en) * 2019-03-29 2019-08-16 初速度(苏州)科技有限公司 A kind of laser point cloud data mask method and device
CN110163904A (en) * 2018-09-11 2019-08-23 腾讯大地通途(北京)科技有限公司 Object marking method, control method for movement, device, equipment and storage medium
CN110264468A (en) * 2019-08-14 2019-09-20 长沙智能驾驶研究院有限公司 Point cloud data mark, parted pattern determination, object detection method and relevant device

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20160147491A (en) * 2015-06-15 2016-12-23 한국전자통신연구원 Apparatus and method for 3D model generation
US10032310B2 (en) * 2016-08-22 2018-07-24 Pointivo, Inc. Methods and systems for wireframes of a structure or element of interest and wireframes generated therefrom
CN107871129B (en) * 2016-09-27 2019-05-10 北京百度网讯科技有限公司 Method and apparatus for handling point cloud data
CN109740487B (en) * 2018-12-27 2021-06-15 广州文远知行科技有限公司 Point cloud labeling method and device, computer equipment and storage medium
CN109978955B (en) * 2019-03-11 2021-03-19 武汉环宇智行科技有限公司 Efficient marking method combining laser point cloud and image
CN110197148B (en) * 2019-05-23 2020-12-01 北京三快在线科技有限公司 Target object labeling method and device, electronic equipment and storage medium
CN110264416B (en) * 2019-05-28 2020-09-29 深圳大学 Sparse point cloud segmentation method and device
CN110276793A (en) * 2019-06-05 2019-09-24 北京三快在线科技有限公司 A kind of method and device for demarcating three-dimension object
CN110598743A (en) * 2019-08-12 2019-12-20 北京三快在线科技有限公司 Target object labeling method and device
CN110929612A (en) * 2019-11-13 2020-03-27 北京云聚智慧科技有限公司 Target object labeling method, device and equipment

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016185637A1 (en) * 2015-05-20 2016-11-24 三菱電機株式会社 Point-cloud-image generation device and display system
WO2018133851A1 (en) * 2017-01-22 2018-07-26 腾讯科技(深圳)有限公司 Point cloud data processing method and apparatus, and computer storage medium
CN108734120A (en) * 2018-05-15 2018-11-02 百度在线网络技术(北京)有限公司 Mark method, apparatus, equipment and the computer readable storage medium of image
CN110163904A (en) * 2018-09-11 2019-08-23 腾讯大地通途(北京)科技有限公司 Object marking method, control method for movement, device, equipment and storage medium
CN109727312A (en) * 2018-12-10 2019-05-07 广州景骐科技有限公司 Point cloud mask method, device, computer equipment and storage medium
CN110135453A (en) * 2019-03-29 2019-08-16 初速度(苏州)科技有限公司 A kind of laser point cloud data mask method and device
CN110264468A (en) * 2019-08-14 2019-09-20 长沙智能驾驶研究院有限公司 Point cloud data mark, parted pattern determination, object detection method and relevant device

Also Published As

Publication number Publication date
WO2021114884A1 (en) 2021-06-17
CN112950785A (en) 2021-06-11

Similar Documents

Publication Publication Date Title
CN112950785B (en) Point cloud labeling method, device and system
CN108830894B (en) Remote guidance method, device, terminal and storage medium based on augmented reality
EP3690815B1 (en) Method, medium and apparatus for automatically labeling target object within image
US10602059B2 (en) Method for generating a panoramic image
CN106878687A (en) A kind of vehicle environment identifying system and omni-directional visual module based on multisensor
CN108648225B (en) Target image acquisition system and method
US10545215B2 (en) 4D camera tracking and optical stabilization
US20220309761A1 (en) Target detection method, device, terminal device, and medium
CN112419233B (en) Data annotation method, device, equipment and computer readable storage medium
US11989827B2 (en) Method, apparatus and system for generating a three-dimensional model of a scene
CN206611521U (en) A kind of vehicle environment identifying system and omni-directional visual module based on multisensor
CN110275179A (en) A kind of building merged based on laser radar and vision ground drawing method
CN110706278A (en) Object identification method and device based on laser radar and camera
CN110807431A (en) Object positioning method and device, electronic equipment and storage medium
JP2022045947A5 (en)
CN113378605A (en) Multi-source information fusion method and device, electronic equipment and storage medium
EP4071713A1 (en) Parameter calibration method and apapratus
CN112017202A (en) Point cloud labeling method, device and system
WO2023088127A1 (en) Indoor navigation method, server, apparatus and terminal
Lin et al. Real-time low-cost omni-directional stereo vision via bi-polar spherical cameras
CN114089836B (en) Labeling method, terminal, server and storage medium
CN112312041B (en) Shooting-based image correction method and device, electronic equipment and storage medium
WO2021149509A1 (en) Imaging device, imaging method, and program
CN111950420A (en) Obstacle avoidance method, device, equipment and storage medium
CN109587303B (en) Electronic equipment and mobile platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant