WO2022016368A1 - 3D frame labeling method, device, and computer-readable storage medium - Google Patents

3D frame labeling method, device, and computer-readable storage medium

Info

Publication number
WO2022016368A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
target object
corner point
corner
determining
Prior art date
Application number
PCT/CN2020/103263
Other languages
English (en)
French (fr)
Inventor
陈创荣
徐斌
陈晓智
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to CN202080033702.3A (published as CN113795847A)
Priority to PCT/CN2020/103263 (published as WO2022016368A1)
Publication of WO2022016368A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/40 Software arrangements specially adapted for pattern recognition, e.g. user interfaces or toolboxes therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Definitions

  • the embodiments of the present application relate to image processing technologies, and in particular, to a 3D frame labeling method, device, and computer-readable storage medium.
  • With the rapid development of artificial intelligence (AI) technologies such as neural networks and deep learning, these technologies can already be used to perceive the surrounding environment. For example, in autonomous driving, neural networks can be used to recognize images collected by cameras mounted on a vehicle, so as to obtain 2D or 3D information of surrounding target objects (such as surrounding vehicles, pedestrians, and trees).
  • However, to obtain highly accurate recognition results, the neural network used needs to be trained first. For example, if a neural network is to be used to recognize an image and obtain a target object and its 3D information, the neural network first needs to be trained with training images and the known target objects and their 3D information in those images.
  • Obtaining the known target object in an image and its 3D information (such as a pseudo 3D frame, i.e., the projection of the three-dimensional frame on the two-dimensional image plane, hereinafter referred to simply as the 3D frame) usually requires manual annotation.
  • One existing annotation method relies on active depth sensors such as external lidar sensors or depth cameras. This method relies on the external depth sensor to obtain depth information and generate point cloud data, directly labels the actual 3D frame of the object in 3D space, and then projects that 3D frame onto the image according to the coordinate transformation relationship between the sensors to obtain the 3D frame.
  • However, this kind of labeling method has a complicated process and high cost, and the 3D frame obtained from point cloud data differs considerably from the actual size of the object, which affects the subsequent training of the neural network and thus its recognition performance. Therefore, it is necessary to provide a more efficient and accurate annotation method.
  • Embodiments of the present application provide a 3D frame labeling method, device, and computer-readable storage medium to overcome at least one of the above problems.
  • In a first aspect, an embodiment of the present application provides a 3D frame labeling method, including: acquiring a 2D frame labeling operation, and determining, according to the 2D frame labeling operation, a 2D frame of a target object on a two-dimensional image containing the target object; acquiring a corner point labeling operation, wherein the corner point is located on one edge of the 2D frame, and labeling the corner point on the 2D frame according to the corner point labeling operation; and determining and displaying a 3D frame of the target object based on the 2D frame and the corner point.
  • In a second aspect, an embodiment of the present application provides another 3D frame labeling method, including: determining a 2D frame of a target object on a two-dimensional image containing the target object; acquiring a labeled corner point, wherein the corner point is located on one edge of the 2D frame; and determining a 3D frame of the target object based on the 2D frame and the corner point.
  • In a third aspect, an embodiment of the present application provides a 3D frame labeling device, including a memory, a processor, an interaction unit, and computer instructions stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer instructions: acquiring a 2D frame labeling operation through the interaction unit, and determining, according to the 2D frame labeling operation, a 2D frame of a target object on a two-dimensional image containing the target object; acquiring a corner point labeling operation through the interaction unit, wherein the corner point is located on one edge of the 2D frame, and labeling the corner point on the 2D frame according to the corner point labeling operation; and determining a 3D frame of the target object based on the 2D frame and the corner point, and displaying it through the interaction unit.
  • In a fourth aspect, an embodiment of the present application provides another 3D frame labeling device, including a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer instructions: determining a 2D frame of a target object on a two-dimensional image containing the target object; acquiring a labeled corner point, wherein the corner point is located on one edge of the 2D frame; and determining a 3D frame of the target object based on the 2D frame and the corner point.
  • In a fifth aspect, an embodiment of the present application provides a neural network training method, including: training a neural network using the 3D frame of the target object determined by the 3D frame labeling method described in the first aspect and the various possible designs of the first aspect, and the two-dimensional image containing the target object.
  • In a sixth aspect, an embodiment of the present application provides another neural network training method, including: training a neural network using the 3D frame of the target object determined by the 3D frame labeling method described in the second aspect and the various possible designs of the second aspect, and the two-dimensional image containing the target object.
  • In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium, where computer instructions are stored in the computer-readable storage medium, and when a processor executes the computer instructions, the 3D frame labeling method described in the first aspect and the various possible designs of the first aspect is implemented.
  • In an eighth aspect, an embodiment of the present application provides another computer-readable storage medium, where computer instructions are stored in the computer-readable storage medium, and when a processor executes the computer instructions, the 3D frame labeling method described in the second aspect and the various possible designs of the second aspect is implemented.
  • According to the 3D frame labeling method, device, and computer-readable storage medium provided by the embodiments of the present application, the 3D frame is labeled directly at the image level without relying on an additional depth sensor, which reduces cost. Moreover, the 3D frame of the target object can be obtained simply by labeling, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on the 2D frame; the processing is simple and the workload is reduced. In addition, the 3D frame obtained in this way differs only slightly from the actual size of the object, which solves the problem that the 3D frame obtained by existing point-cloud-based labeling differs considerably from the actual size of the object, thereby ensuring accurate subsequent training of the neural network and improving its recognition performance.
  • FIG. 1 is a schematic diagram of the architecture of a 3D frame labeling system provided by an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a 3D frame labeling method provided by an embodiment of the present application
  • FIG. 3 is a schematic diagram of a 2D frame provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a corner on a 2D frame provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a 3D frame provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of another 3D frame labeling method provided by an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of still another 3D frame labeling method provided by an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of another 3D frame labeling method provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of another 3D frame labeling method provided by an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of another 3D frame labeling method provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of the correspondence between a 2D frame and a 3D frame provided by an embodiment of the present application;
  • FIG. 12 is a schematic flowchart of another 3D frame labeling method provided by an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a 3D frame labeling device provided by an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of another 3D frame labeling device provided by an embodiment of the present application.
  • FIG. 15 is a basic hardware architecture of a 3D frame labeling device provided by an embodiment of the present application.
  • FIG. 16 is a basic hardware architecture of another 3D frame labeling device provided by an embodiment of the present application.
  • The 3D frame labeling method provided in the embodiments of this application can be applied to the data labeling that precedes neural network training, where the neural network can be used to obtain 2D or 3D information of target objects (such as vehicles, houses, etc.), which is not particularly limited in the embodiments of the present application.
  • the 3D frame marked in the embodiments of the present application refers to the projection of the three-dimensional frame of the object on the two-dimensional image, rather than the three-dimensional frame in the actual three-dimensional space.
  • the 3D frame labeling method provided in this embodiment of the present application may be applied to the application scenario shown in FIG. 1 .
  • FIG. 1 only describes a possible application scenario of the 3D frame labeling method provided by the embodiment of the present application, and the application scenario of the 3D frame labeling method provided by the embodiment of the present application is not limited to the application scenario shown in FIG. 1 .
  • FIG. 1 is a schematic architecture diagram of the 3D frame labeling system. In FIG. 1, obtaining the 3D information of surrounding vehicles is taken as an example. The architecture includes the processing device 11 and multiple cameras; here, the multiple cameras are exemplified by a first camera 12, a second camera 13, and a third camera 14.
  • the structures illustrated in the embodiments of the present application do not constitute a specific limitation on the 3D frame annotation architecture.
  • In other feasible implementations of the present application, the above architecture may include more or fewer components than shown, or combine some components, or split some components, or arrange the components differently, which may be determined according to the actual application scenario and is not limited here.
  • the components shown in Figure 1 may be implemented in hardware, software, or a combination of software and hardware.
  • the first camera 12 , the second camera 13 , and the third camera 14 in the embodiment of the present application may collect images of surrounding vehicles, respectively.
  • the captured images may be sent to the processing device 11 .
  • the processing device 11 uses the received image as sample data, and the sample data can be used to train the neural network after being marked. After acquiring the basic operation of the user at the image level, the processing device 11 can directly generate a marked 3D frame, so that the neural network can be trained by using the known vehicle and its 3D information in the above image.
  • The 3D frame labeling involved in the embodiments of the present application refers to labeling the 3D frame of the target object directly at the image level, that is, the 3D frame obtained by projecting the actual 3D frame of the target object in three-dimensional space onto the two-dimensional image. The 3D frame of the target object is obtained by labeling, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on the 2D frame, without relying on an additional depth sensor, which reduces cost, simplifies processing, and reduces workload.
  • system architecture and business scenarios described in the embodiments of the present application are for the purpose of illustrating the technical solutions of the embodiments of the present application more clearly, and do not constitute limitations on the technical solutions provided by the embodiments of the present application.
  • the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems.
  • the execution body of the method may be the processing device 11 in FIG. 1 .
  • the workflow of the processing device 11 mainly includes a 2D frame stage and a 3D frame stage.
  • the processing device 11 acquires the 2D frame of the target object marked on the two-dimensional image containing the target object and a corner point on the 2D frame.
  • In the 3D frame stage, the processing device 11 generates the 3D frame of the target object according to the above 2D frame and the corner point on the 2D frame, without relying on an additional depth sensor, which reduces cost, simplifies processing, and reduces workload.
  • FIG. 2 is a schematic flowchart of a 3D frame labeling method provided by an embodiment of the present application.
  • the execution body of this embodiment may be the processing device 11 in FIG. 1 , and the specific execution body may be determined according to an actual application scenario.
  • the 3D frame labeling method provided by the embodiment of the present application includes the following steps:
  • S201 Acquire a 2D frame labeling operation, and according to the 2D frame labeling operation, determine the 2D frame of the target object on the two-dimensional image including the target object.
  • the above-mentioned target object may be determined according to the actual situation, such as a vehicle, a house, etc., which is not particularly limited in this embodiment of the present application.
  • the processing device 11 determines the 2D frame of the above-mentioned vehicle on the two-dimensional image including the above-mentioned vehicle.
  • the 2D frame labeling operation can be completed by the labeling personnel.
  • the 2D frame of the above-mentioned vehicle completely frames the above-mentioned vehicle on the two-dimensional image including the above-mentioned vehicle, and the specific size of the above-mentioned 2D frame can be determined according to actual needs. For example, the closer the size of the 2D frame is to the size of the vehicle on the two-dimensional image, the better, which is not particularly limited in this embodiment of the present application.
  • the processing device 11 may also acquire a 2D frame adjustment operation, and adjust the 2D frame according to the operation, for example, adjust the size and position of the 2D frame.
  • S202 Obtain a corner point labeling operation, wherein the corner point is located on one side of the 2D frame, and label the corner point on the 2D frame according to the above corner point labeling operation.
  • a corner point refers to a point where a certain corner of the actual three-dimensional frame of the marked object is projected on the two-dimensional image.
  • For example, when the object is a vehicle, its actual three-dimensional frame is a cuboid, and the marked corner points are projection points of corners of the cuboid on the two-dimensional image.
  • Which side of the above-mentioned 2D frame the above-mentioned corner point is specifically located on can be determined according to the actual situation, which is not particularly limited in this embodiment of the present application.
  • the above-mentioned corner point is located on the bottom edge of the above-mentioned 2D frame.
  • S203 Determine and display the 3D frame of the target object based on the 2D frame and the corner points.
  • the 3D frame of the vehicle is obtained on the basis of the above-mentioned 2D frame and the above-mentioned corner points of the vehicle.
  • In the embodiments of the present application, the 3D frame is annotated directly at the image level without relying on an additional depth sensor, which reduces cost. The method only requires labeling, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on the 2D frame, and the 3D frame of the target object can then be obtained, so the processing is simple and the workload is reduced. In addition, the 3D frame obtained in the embodiment of the present application differs only slightly from the actual size of the object, which solves the problem that the existing 3D frame obtained from point cloud data differs considerably from the actual size of the object, thereby ensuring accurate subsequent training of the neural network and improving its recognition performance.
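  • To make the inputs of the above labeling workflow concrete, the following is a minimal Python sketch (illustration only, not part of the original disclosure; all names such as Box2D, CornerAnnotation and Annotation are hypothetical) of the data a labeling tool would collect in steps S201-S203 before the 3D frame is generated.

```python
from dataclasses import dataclass

@dataclass
class Box2D:
    """Axis-aligned 2D frame of the target object on the image (pixel coordinates)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

@dataclass
class CornerAnnotation:
    """A single labeled corner point lying on one edge of the 2D frame (step S202)."""
    u: float               # image x-coordinate of the corner point
    v: float               # image y-coordinate, e.g. on the bottom edge of the 2D frame
    corner_id: str = "p0"  # optional corner number, e.g. "p0".."p3" for the bottom corners

@dataclass
class Annotation:
    """Everything the labeling tool collects before the 3D frame is generated (step S203)."""
    box2d: Box2D
    corner: CornerAnnotation
    orientation_deg: float = 0.0  # default: the target object faces straight ahead

# Example: a vehicle whose 2D frame and one bottom corner have been labeled.
ann = Annotation(
    box2d=Box2D(100.0, 220.0, 380.0, 420.0),
    corner=CornerAnnotation(u=330.0, v=420.0, corner_id="p0"),
)
print(ann.box2d.y_max == ann.corner.v)  # the labeled corner lies on the bottom edge
```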
  • FIG. 6 is a schematic flowchart of another 3D frame labeling method proposed by an embodiment of the present application. As shown in Figure 6, the method includes:
  • S601 Acquire a 2D frame labeling operation, and according to the 2D frame labeling operation, determine the 2D frame of the target object on the two-dimensional image including the target object.
  • S602 Obtain a corner point labeling operation, wherein the corner point is located on an edge of the above-mentioned 2D frame, and the above-mentioned corner point is marked on the above-mentioned 2D frame according to the above-mentioned corner point labeling operation.
  • steps S601-S602 are implemented in the same manner as the above-mentioned steps S201-S202, and are not repeated here.
  • S603 Obtain the corner point number of the above-mentioned corner point, where the corner point number is used to indicate the position of the above-mentioned corner point relative to the above-mentioned target object.
  • the order of the corner point numbers is not limited, as long as there is a mapping relationship between the corner points and their corner point numbers.
  • The above mapping relationship can be understood as a correspondence between the position of a corner point relative to the object and its corner point number. For example, for the four corner points located on the bottom edge of the object's 2D frame, starting from the rear-right position and going clockwise, the corner numbers can be determined as p0, p1, p2 and p3; if a corner point is located at the rear left on the bottom edge of the object's 2D frame, its corner number is p1.
  • the above-mentioned processing device 11 may pre-store the above-mentioned mapping relationship, thereby obtaining the corner point number of the above-mentioned corner point based on the mapping relationship, and the above-mentioned corner point number is used to indicate the position of the above-mentioned corner point relative to the target object.
  • corner point number may be input by the user, or may be pre-configured, which is not particularly limited in this embodiment of the present application.
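  • As an illustration of the mapping relationship described above, the corner point numbers for the four bottom corners could be stored as a simple lookup table. This is a sketch under the assumptions stated in the example (rear-right as p0, then clockwise); the names and ordering are illustrative only.

```python
# Hypothetical corner-number mapping for the four corner points on the bottom,
# starting at the rear-right position and going clockwise, as in the example above.
CORNER_NUMBERS = {
    "rear_right":  "p0",
    "rear_left":   "p1",
    "front_left":  "p2",
    "front_right": "p3",
}

def corner_number(position: str) -> str:
    """Return the corner number for a bottom position such as 'rear_left'."""
    return CORNER_NUMBERS[position]

assert corner_number("rear_left") == "p1"  # matches the example given in the text
```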
  • S604 Determine and display the 3D frame of the target object based on the corner number, the 2D frame, and the corner points.
  • In the embodiment of the present application, the corner point number of the corner point is also obtained, and then, based on the corner point number, the 2D frame and the corner point, the 3D frame of the target object is accurately determined and displayed, meeting application needs.
  • For example, when the target object is a vehicle, the vehicle orientation can be set to straight ahead by default. After a corner number of the 3D frame of the vehicle is determined, because each corner of the 3D frame on the image falls on a side of the marked 2D frame, the 3D frame of the vehicle can be directly generated according to the vanishing point of the 3D object in the 2D view, the determined corner number, and the mapping relationship.
  • In the embodiments of the present application, the 3D frame is marked directly at the image level without relying on an additional depth sensor, thereby reducing cost. The method only requires marking, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on the 2D frame, and the 3D frame of the target object can then be obtained; the processing is simple and the workload is reduced. Compared with fully manual annotation of the 3D frame of the target object, this better satisfies the geometric constraints of the actual 3D frame projected on the 2D image and avoids errors in manual annotation, thereby improving the accuracy and usability of the annotation data. In addition, the 3D frame obtained in the embodiments of the present application differs only slightly from the actual size of the object, which solves the problem that the existing 3D frame obtained from point cloud data differs considerably from the actual size of the object, thereby ensuring accurate subsequent training of the neural network and improving its recognition performance.
  • FIG. 7 is a schematic flowchart of still another 3D frame labeling method proposed by an embodiment of the present application. As shown in Figure 7, the method includes:
  • S701 Acquire a 2D frame labeling operation, and according to the 2D frame labeling operation, determine the 2D frame of the target object on the two-dimensional image including the target object.
  • S702 Obtain a corner point labeling operation, wherein the corner point is located on an edge of the above-mentioned 2D frame, and the above-mentioned corner point is labelled on the above-mentioned 2D frame according to the above-mentioned corner point labeling operation.
  • steps S701-S702 are implemented in the same manner as the above-mentioned steps S201-S202, and are not repeated here.
  • S703 Acquire an orientation angle of the target object in the 3D space, where the orientation angle is used to indicate the orientation of the target object.
  • the above-mentioned orientation angle may be input by the user or preset, which is not particularly limited in this embodiment of the present application.
  • For example, the user may not input an orientation angle, in which case the target object is assumed by default to face straight ahead, with an orientation angle of 0 degrees.
  • the user can also input other orientation angles.
  • For example, when the target object is a vehicle and most vehicles on the road are traveling straight ahead, the default heading angle can be used; when some vehicles travel obliquely or in the opposite direction, the actual heading angle can be input according to the actual situation.
  • S704 Determine and display the 3D frame of the target object based on the orientation angle, the 2D frame, and the corner points.
  • In the embodiment of the present application, the orientation angle of the target object in 3D space is also obtained, and then, based on the orientation angle, the 2D frame, and the corner point, the 3D frame of the target object is determined and displayed, so that the obtained 3D frame is more consistent with reality.
  • When the orientation angle takes the default setting, for example 0 degrees, the target object faces straight ahead. In this case, the vanishing point corresponding to the target object is located at the center of the image, so that the positions of the other corners of the target object can be determined from the vanishing point and the generation of the 3D frame is completed. When the orientation angle is another value input by the user, the target object faces another direction and the vanishing point is located elsewhere, but the 3D frame can likewise be generated from the vanishing point.
  • In the embodiments of the present application, the 3D frame is marked directly at the image level without relying on an additional depth sensor, thereby reducing cost. The method only requires marking, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on the 2D frame, and the 3D frame of the target object can then be obtained; the processing is simple and the workload is reduced. In addition, the 3D frame obtained in the embodiments of the present application differs only slightly from the actual size of the object, which solves the problem that the existing 3D frame obtained from point cloud data differs considerably from the actual size of the object, thereby ensuring accurate subsequent training of the neural network and improving its recognition performance.
  • FIG. 8 is a schematic flowchart of still another 3D frame labeling method proposed by an embodiment of the present application. As shown in Figure 8, the method includes:
  • S801 Acquire a 2D frame labeling operation, and according to the 2D frame labeling operation, determine the 2D frame of the target object on a two-dimensional image including the target object.
  • S802 Obtain a corner point labeling operation, wherein the corner point is located on one side of the above-mentioned 2D frame, and the above-mentioned corner point is labelled on the above-mentioned 2D frame according to the above-mentioned corner point labeling operation.
  • S803 Determine and display the 3D frame of the target object based on the 2D frame and the corner points.
  • steps S801-S803 are implemented in the same manner as the above-mentioned steps S201-S203, which will not be repeated here.
  • the above-mentioned 2D frame labeling operation includes at least one of a frame selection operation, a moving operation, and a rotation operation, and the above-mentioned 2D frame is adjusted according to the above-mentioned 2D frame labeling operation.
  • the above-mentioned 2D frame labeling operation may include other operations in addition to the above, which is not particularly limited in this embodiment of the present application.
  • the 2D frame can be adjusted, so that a new 3D frame can be generated based on the adjusted 2D frame to meet various application needs.
  • In the embodiments of the present application, the 3D frame is labeled directly at the image level without relying on an additional depth sensor, thereby reducing cost. The method only requires labeling, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on the 2D frame, and the 3D frame of the target object can then be obtained; the processing is simple and the workload is reduced. In addition, the 3D frame obtained in the embodiments of the present application differs only slightly from the actual size of the object, which solves the problem that the existing 3D frame obtained from point cloud data differs considerably from the actual size of the object, thereby ensuring accurate subsequent training of the neural network and improving its recognition performance.
  • FIG. 9 is a schematic flowchart of another 3D frame labeling method provided by an embodiment of the present application.
  • the execution body of this embodiment may be the processing device 11 in FIG. 1 , and the specific execution body may be determined according to an actual application scenario.
  • the 3D frame labeling method provided by the embodiment of the present application includes the following steps:
  • S901 Determine the 2D frame of the target object on the two-dimensional image including the target object.
  • the processing device 11 may also obtain a 2D frame adjustment operation, and adjust the 2D frame according to the operation, for example, adjust the size and position of the 2D frame.
  • S902 Acquire a marked corner point, where the corner point is located on an edge of the 2D frame.
  • the above-mentioned corner point is located on the bottom edge of the above-mentioned 2D frame.
  • S903 Determine the 3D frame of the target object based on the 2D frame and the corner points.
  • In the embodiments of the present application, the 3D frame is annotated directly at the image level without relying on an additional depth sensor, which reduces cost. The method only requires labeling, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on the 2D frame, and the 3D frame of the target object can then be obtained, so the processing is simple and the workload is reduced. In addition, the 3D frame obtained in the embodiment of the present application differs only slightly from the actual size of the object, which solves the problem that the existing 3D frame obtained from point cloud data differs considerably from the actual size of the object, thereby ensuring accurate subsequent training of the neural network and improving its recognition performance.
  • FIG. 10 is a schematic flowchart of still another 3D frame labeling method proposed by an embodiment of the present application. As shown in Figure 10, the method includes:
  • S1001 Determine the 2D frame of the target object on the two-dimensional image including the target object.
  • S1002 Acquire a marked corner point, where the corner point is located on an edge of the 2D frame.
  • steps S1001-S1002 are implemented in the same manner as the above-mentioned steps S901-S902, and are not repeated here.
  • S1003 Acquire the corner point number of the above-mentioned corner point, where the corner point number is used to indicate the position of the above-mentioned corner point relative to the above-mentioned target object.
  • the above corner point number may be input by the user, or may be pre-configured.
  • S1004 Determine the 3D frame of the target object based on the corner number, the 2D frame, and the corner points.
  • For example, the processing device 11 may determine the correspondence between the 2D frame and the 3D frame based on the corner number and the corner point, and then determine the 3D frame according to the correspondence and the 2D frame.
  • For example, as shown in FIG. 11, in the figure, "front" represents the front of the object and "rear" represents the rear.
  • Optionally, determining the correspondence between the 2D frame and the 3D frame may include: determining the correspondence according to a correspondence rule, together with the corner point number and the corner point.
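  • Purely as an illustration of what such a correspondence rule might look like (FIG. 11 is not reproduced here, so the concrete assignment below is hypothetical), the correspondence can be thought of as a table stating which edge of the 2D frame each projected 3D-frame corner must lie on:

```python
# Hypothetical correspondence of the kind illustrated in FIG. 11 (not reproduced
# here): each entry states which edge of the 2D frame a given projected 3D-frame
# corner is constrained to lie on. The real assignment depends on the labeled
# corner number; this table and the numbers below are illustrative only.
CORRESPONDENCE_EXAMPLE = {
    "p1": "bottom",  # the labeled corner lies on the bottom edge of the 2D frame
    "p0": "right",   # rear-right bottom corner touches the right edge
    "p2": "left",    # front-left bottom corner touches the left edge
    "p6": "top",     # one of the top corners touches the top edge
}

def on_edge(point, box, edge, tol=1e-3):
    """Check that image point (u, v) lies on the named edge of box = (x_min, y_min, x_max, y_max)."""
    u, v = point
    x_min, y_min, x_max, y_max = box
    return {
        "left":   abs(u - x_min) < tol,
        "right":  abs(u - x_max) < tol,
        "top":    abs(v - y_min) < tol,
        "bottom": abs(v - y_max) < tol,
    }[edge]

# Verify a candidate set of projected corners against the correspondence.
box = (100.0, 220.0, 380.0, 420.0)
corners = {"p1": (150.0, 420.0), "p0": (380.0, 410.0), "p2": (100.0, 405.0), "p6": (230.0, 220.0)}
assert all(on_edge(corners[k], box, e) for k, e in CORRESPONDENCE_EXAMPLE.items())
```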
  • the corner point number of the corner point is also obtained, and further, based on the corner point number, the 2D frame and the corner point, the 3D frame of the target object is accurately determined, which satisfies the application need.
  • In the embodiments of the present application, the 3D frame is labeled directly at the image level without relying on an additional depth sensor, thereby reducing cost. The method only requires labeling, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on the 2D frame, and the 3D frame of the target object can then be obtained; the processing is simple and the workload is reduced. In addition, the 3D frame obtained in the embodiments of the present application differs only slightly from the actual size of the object, which solves the problem that the existing 3D frame obtained from point cloud data differs considerably from the actual size of the object.
  • FIG. 12 is a schematic flowchart of still another 3D frame labeling method proposed by an embodiment of the present application. As shown in Figure 12, the method includes:
  • S1201 Determine the 2D frame of the target object on the two-dimensional image including the target object.
  • S1202 Acquire a marked corner point, where the corner point is located on an edge of the 2D frame.
  • steps S1201-S1202 are implemented in the same manner as the above-mentioned steps S901-S902, and are not repeated here.
  • S1203 Acquire an orientation angle of the target object in the 3D space, where the orientation angle is used to indicate the orientation of the target object.
  • the above-mentioned orientation angle may be input by the user, or may be pre-configured.
  • S1204 Determine the 3D frame of the target object based on the above-mentioned orientation angle, the 2D frame and the corner points.
  • the above-mentioned processing device 11 may determine three vanishing points corresponding to the above-mentioned 3D frame according to the above-mentioned orientation angle, and then determine the above-mentioned 3D frame according to the three vanishing points, the above-mentioned 2D frame and the corner points .
  • the above-mentioned 3D frame is determined according to the above-mentioned three vanishing points, the above-mentioned 2D frame and the corner points in combination with the parallel relationship of the sides of the cuboid.
  • determining the three vanishing points corresponding to the above-mentioned 3D frame according to the above-mentioned orientation angle may include:
  • Optionally, the three vanishing points are determined according to the projection matrix and the rotation matrix corresponding to the orientation angle. For example, the projection matrix is multiplied by the rotation matrix corresponding to the orientation angle, and the three vanishing points are determined from the result of the matrix multiplication.
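  • As a sketch of this computation (the patent does not give explicit formulas; the camera intrinsic matrix K and the yaw-only rotation used below are assumptions), the three vanishing points can be read from the columns of the product of the projection matrix and the rotation matrix corresponding to the orientation angle:

```python
import numpy as np

def vanishing_points(K: np.ndarray, yaw_rad: float) -> np.ndarray:
    """Vanishing points of the three cuboid axes, one per column of K @ R.

    K       : assumed 3x3 camera projection (intrinsic) matrix.
    yaw_rad : orientation angle of the target object about the vertical axis.
    """
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    R = np.array([[  c, 0.0,   s],     # rotation implied by the orientation angle
                  [0.0, 1.0, 0.0],     # (yaw only; roll and pitch assumed zero)
                  [ -s, 0.0,   c]])
    M = K @ R                          # projection matrix times rotation matrix
    vps = []
    for axis in range(3):              # each column gives one vanishing direction
        v = M[:, axis]
        # A (near-)zero third component means the axis is parallel to the image
        # plane and its vanishing point lies at infinity (e.g. vertical edges).
        vps.append(v[:2] / v[2] if abs(v[2]) > 1e-9 else np.array([np.inf, np.inf]))
    return np.array(vps)

K = np.array([[1000.0,    0.0, 640.0],
              [   0.0, 1000.0, 360.0],
              [   0.0,    0.0,   1.0]])
print(vanishing_points(K, yaw_rad=0.0))  # with yaw 0, the forward-axis vanishing
                                         # point coincides with the principal point
```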
  • the method further includes:
  • the corner point number of the above-mentioned corner point is acquired, where the corner point number is used to indicate the position of the above-mentioned corner point relative to the target object.
  • the above-mentioned 3D frame is determined according to the above-mentioned three vanishing points, 2D frame and corner point, including:
  • the above 3D frame is determined according to the above corner point number, the three vanishing points, the 2D frame and the corner points.
  • For example, the processing device 11 may determine the correspondence between the 2D frame and the 3D frame based on the corner point number and the corner point, and further, in combination with the parallel relationship of the edges of the cuboid, solve for the eight projected corner positions of the 3D frame according to the correspondence, the three vanishing points and the 2D frame, such as the corner points p0, p1, p2, p3, p4, p5, p6 and p7 in FIG. 11, so as to determine the 3D frame.
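  • The following sketch illustrates the kind of construction involved (illustration only; the concrete sequence of intersections depends on the corner number and on the 2D-frame/3D-frame correspondence of FIG. 11, and the numeric values are made up): parallel cuboid edges project to image lines through a common vanishing point, so an unknown corner can be recovered by intersecting such a line with the image line that the correspondence assigns to it.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def intersect(l1, l2):
    """Intersection of two homogeneous lines, returned as a 2D image point."""
    x = np.cross(l1, l2)
    return x[:2] / x[2]

# Assumed inputs (hypothetical values, for illustration only):
#   p0     - the labeled corner point, here on the bottom edge of the 2D frame
#   vp_len - vanishing point of one cuboid axis, e.g. obtained from K @ R above
#   2D frame given as (x_min, y_min, x_max, y_max)
p0 = np.array([330.0, 420.0])
vp_len = np.array([900.0, 480.0])
x_min, y_min, x_max, y_max = 100.0, 220.0, 380.0, 420.0

# Parallel cuboid edges converge at a common vanishing point, so the edge from
# p0 along this axis is the image line through p0 and vp_len. Intersecting it
# with the 2D-frame edge assigned to the unknown corner (here, the left edge,
# purely for illustration) yields that corner's projected position.
left_edge = line_through([x_min, y_min], [x_min, y_max])
p1 = intersect(line_through(p0, vp_len), left_edge)
print(p1)  # repeating such intersections for each unknown corner, following the
           # correspondence, yields the eight projected corners p0..p7
```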
  • the embodiment of the present application combines the intrinsic geometric relationship of the cuboid with the projection model of the above-mentioned image acquisition device, so as to ensure that the labeled pseudo 3D frame satisfies the intrinsic geometric relationship of the cuboid, and the labeling accuracy and labeling consistency are higher.
  • In the embodiment of the present application, the orientation angle of the target object in 3D space is also obtained, and further, based on the orientation angle, the 2D frame, and the corner point, the 3D frame of the target object is determined and displayed, so that the obtained 3D frame is more consistent with reality.
  • In the embodiments of the present application, the 3D frame is marked directly at the image level without relying on an additional depth sensor, and cost is reduced; the 3D frame of the target object can be obtained simply by labeling, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on the 2D frame, so the processing is simple and the workload is reduced. In addition, the 3D frame obtained in the embodiments of the present application differs only slightly from the actual size of the object, which solves the problem that the existing 3D frame obtained from point cloud data differs considerably from the actual size of the object, thereby ensuring accurate subsequent training of the neural network and improving its recognition performance.
  • FIG. 13 is a schematic structural diagram of a 3D frame labeling apparatus provided by an embodiment of the present application.
  • the 3D frame labeling apparatus 1300 includes: a first acquisition module 1301, a second acquisition module 1302, and a display module 1303.
  • The 3D frame labeling device here may be the aforementioned processing device 11 itself, or a chip or an integrated circuit that implements the functions of the processing device 11. It should be noted that the division into the first acquisition module, the second acquisition module, and the display module is only a division of logical functions; physically, they may be integrated or independent.
  • the first obtaining module 1301 is configured to obtain a 2D frame labeling operation, and according to the 2D frame labeling operation, determine the 2D frame of the target object on the 2D image including the target object.
  • the second acquiring module 1302 is configured to acquire a corner point labeling operation, wherein the corner point is located on one side of the 2D frame, and the corner point is marked on the 2D frame according to the corner point labeling operation;
  • the display module 1303 is configured to determine and display the 3D frame of the target object based on the 2D frame and the corner points.
  • Optionally, before determining and displaying the 3D frame of the target object, the display module 1303 is further configured to:
  • acquire the corner point number of the corner point, where the corner point number is used to indicate the position of the corner point relative to the target object.
  • The display module 1303 is then specifically configured to:
  • determine and display the 3D frame based on the corner point number, the 2D frame and the corner point.
  • Optionally, before determining and displaying the 3D frame of the target object, the display module 1303 is further configured to:
  • acquire the orientation angle of the target object in 3D space, where the orientation angle is used to indicate the orientation of the target object.
  • The display module 1303 is then specifically configured to:
  • determine and display the 3D frame based on the orientation angle, the 2D frame and the corner point.
  • the corner point is located on the bottom edge of the 2D frame.
  • the 2D frame labeling operation includes at least one of a frame selection operation, a move operation, and a rotation operation.
  • Optionally, the display module 1303 is further configured to:
  • adjust the 2D frame according to the 2D frame labeling operation.
  • the corner point number is input by the user or pre-configured.
  • the orientation angle is user input or pre-configured.
  • FIG. 14 is a schematic structural diagram of another 3D frame labeling apparatus provided by an embodiment of the present application.
  • the 3D frame labeling apparatus 1400 includes: a first determination module 1401 , a third acquisition module 1402 , and a second determination module 1403 .
  • The 3D frame labeling device here may be the aforementioned processing device 11 itself, or a chip or an integrated circuit that implements the functions of the processing device 11. It should be noted that the division into the first determination module, the third acquisition module, and the second determination module is only a division of logical functions; physically, they may be integrated or independent.
  • the first determining module 1401 is configured to determine the 2D frame of the target object on the two-dimensional image containing the target object.
  • a third obtaining module 1402 configured to obtain a marked corner point, wherein the corner point is located on an edge of the 2D frame;
  • the second determination module 1403 is configured to determine the 3D frame of the target object based on the 2D frame and the corner points.
  • Optionally, before determining the 3D frame of the target object, the second determination module 1403 is further configured to:
  • a corner point number of the corner point is acquired, where the corner point number is used to indicate the position of the corner point relative to the target object.
  • the second determining module 1403 is specifically configured to:
  • the 3D box is determined based on the corner point number, the 2D box, and the corner points.
  • Optionally, the second determination module 1403 determining the 3D frame based on the corner point number, the 2D frame and the corner point includes: determining the correspondence between the 2D frame and the 3D frame based on the corner point number and the corner point; and determining the 3D frame according to the correspondence and the 2D frame.
  • Optionally, the second determination module 1403 determining the correspondence between the 2D frame and the 3D frame includes: determining the correspondence according to a correspondence rule, together with the corner point number and the corner point.
  • Optionally, before determining the 3D frame of the target object, the second determination module 1403 is further configured to:
  • the orientation angle of the target object in the 3D space is obtained, and the orientation angle is used to indicate the orientation of the target object.
  • the second determining module 1403 is specifically configured to:
  • the 3D box is determined based on the orientation angle, the 2D box and the corner points.
  • Optionally, the second determination module 1403 determining the 3D frame based on the orientation angle, the 2D frame and the corner point includes: determining three vanishing points corresponding to the 3D frame according to the orientation angle; and determining the 3D frame according to the three vanishing points, the 2D frame and the corner point.
  • the second determination module 1403 determines three vanishing points corresponding to the 3D frame according to the orientation angle, including:
  • the three vanishing points are determined according to the projection matrix and the rotation matrix corresponding to the orientation angle.
  • Optionally, before determining the 3D frame according to the three vanishing points, the 2D frame and the corner point, the second determination module 1403 is further configured to: acquire the corner point number of the corner point, where the corner point number is used to indicate the position of the corner point relative to the target object.
  • the determining the 3D frame according to the three vanishing points, the 2D frame and the corner points includes:
  • the 3D frame is determined according to the corner point number, the three vanishing points, the 2D frame and the corner points.
  • the corner point is located on the bottom edge of the 2D frame.
  • the corner point number is input by the user or pre-configured.
  • the orientation angle is user input or pre-configured.
  • FIG. 15 schematically provides a possible basic hardware architecture of the 3D frame annotation device described in this application.
  • the 3D frame annotation device 1500 includes at least one processor 1501 and a memory 1502 . Further optionally, a communication interface 1503 and a bus 1504 may also be included.
  • the 3D frame marking device 1500 may be a computer or a server, which is not particularly limited in this application.
  • the number of processors 1501 may be one or more, and FIG. 15 only illustrates one of the processors 1501.
  • the processor 1501 may be a central processing unit (central processing unit, CPU), a graphics processing unit (graphics processing unit, GPU) or a digital signal processor (digital signal processor, DSP). If the 3D box annotation device 1500 has multiple processors 1501, the types of the multiple processors 1501 may be different, or may be the same. Optionally, the multiple processors 1501 of the 3D box annotation device 1500 may also be integrated into a multi-core processor.
  • the memory 1502 stores computer instructions and data; the memory 1502 can store computer instructions and data required to implement the above-mentioned 3D frame annotation method provided by the present application, for example, the memory 1502 stores instructions for implementing the steps of the above-mentioned 3D frame annotation method.
  • The memory 1502 may be any one or any combination of the following storage media: non-volatile memory (e.g., read-only memory (ROM), solid state drive (SSD), hard disk drive (HDD), optical disc) and volatile memory.
  • Communication interface 1503 may provide information input/output for the at least one processor. It may also include any one or any combination of the following devices: a network interface (eg, an Ethernet interface), a wireless network card, and other devices with network access functions.
  • the communication interface 1503 may also be used for data communication between the 3D frame annotation device 1500 and other computing devices or terminals.
  • Figure 15 represents bus 1504 with a thick line.
  • the bus 1504 may connect the processor 1501 with the memory 1502 and the communication interface 1503 .
  • the processor 1501 can access the memory 1502, and can also use the communication interface 1503 to perform data interaction with other computing devices or terminals.
  • The processor 1501 executes the computer instructions in the memory 1502 to implement the following steps: acquiring a 2D frame labeling operation, and determining, according to the 2D frame labeling operation, the 2D frame of the target object on a two-dimensional image containing the target object; acquiring a corner point labeling operation, wherein the corner point is located on one edge of the 2D frame, and labeling the corner point on the 2D frame according to the corner point labeling operation; and determining and displaying a 3D frame of the target object based on the 2D frame and the corner point.
  • Optionally, before the determining and displaying of the 3D frame of the target object, the processor 1501 further implements the following step when executing the computer instructions:
  • acquiring a corner point number of the corner point, where the corner point number is used to indicate the position of the corner point relative to the target object.
  • The determining and displaying of the 3D frame of the target object then includes:
  • determining and displaying the 3D frame based on the corner point number, the 2D frame and the corner point.
  • Optionally, before the determining and displaying of the 3D frame of the target object, the processor 1501 further implements the following step when executing the computer instructions:
  • acquiring the orientation angle of the target object in 3D space, where the orientation angle is used to indicate the orientation of the target object.
  • The determining and displaying of the 3D frame of the target object then includes:
  • determining and displaying the 3D frame based on the orientation angle, the 2D frame and the corner point.
  • the corner point is located on the bottom edge of the 2D frame.
  • the 2D frame labeling operation includes at least one of a frame selection operation, a move operation, and a rotation operation.
  • the processor 1501 further implements the following steps when executing the computer instructions:
  • the 2D box is adjusted according to the 2D box labeling operation.
  • the corner point number is input by the user or pre-configured.
  • the orientation angle is user input or pre-configured.
  • the memory 1502 may include a first acquisition module 1301 , a second acquisition module 1302 , and a display module 1303 .
  • the inclusion here only refers to that the functions of the first obtaining module 1301 , the second obtaining module 1302 and the display module 1303 can be implemented respectively when the instructions stored in the memory are executed, and are not limited to physical structures.
  • the above-mentioned 3D frame labeling device can be implemented as a hardware module, or as a circuit unit, in addition to being implemented by software as in the above-mentioned FIG. 15 .
  • FIG. 16 schematically provides another possible basic hardware architecture of the 3D frame annotation device described in this application.
  • the 3D frame annotation device 1600 includes at least one processor 1601 and a memory 1602 . Further optionally, a communication interface 1603 and a bus 1604 may also be included.
  • the 3D frame marking device 1600 may be a computer or a server, which is not particularly limited in this application.
  • the number of processors 1601 may be one or more, and FIG. 16 only illustrates one of the processors 1601.
  • the processor 1601 may be a central processing unit (central processing unit, CPU), a graphics processing unit (graphics processing unit, GPU) or a digital signal processor (digital signal processor, DSP).
  • the 3D box annotation device 1600 has a plurality of processors 1601, the types of the plurality of processors 1601 may be different, or may be the same.
  • the multiple processors 1601 of the 3D box annotation device 1600 may also be integrated into a multi-core processor.
  • The memory 1602 stores computer instructions and data; the memory 1602 can store the computer instructions and data required to implement the 3D frame labeling method provided by the present application, for example, instructions for implementing the steps of the 3D frame labeling method.
  • the memory 1602 may be any one or any combination of the following storage media: non-volatile memory (eg, read only memory (ROM), solid state disk (SSD), hard disk (HDD), optical disk), volatile memory.
  • Communication interface 1603 may provide information input/output for the at least one processor. It may also include any one or any combination of the following devices: a network interface (eg, an Ethernet interface), a wireless network card, and other devices with network access functions.
  • the communication interface 1603 may also be used for data communication between the 3D frame annotation device 1600 and other computing devices or terminals.
  • Figure 16 represents bus 1604 with a thick line.
  • a bus 1604 may connect the processor 1601 with the memory 1602 and the communication interface 1603 .
  • the processor 1601 can access the memory 1602, and can also use the communication interface 1603 to perform data interaction with other computing devices or terminals.
  • The processor 1601 executes the computer instructions in the memory 1602 to implement the following steps: determining the 2D frame of the target object on a two-dimensional image containing the target object; acquiring a labeled corner point, wherein the corner point is located on one edge of the 2D frame; and determining a 3D frame of the target object based on the 2D frame and the corner point.
  • Optionally, before the determining of the 3D frame of the target object, the processor 1601 further implements the following step when executing the computer instructions:
  • a corner point number of the corner point is acquired, where the corner point number is used to indicate the position of the corner point relative to the target object.
  • the determining of the 3D frame of the target object includes:
  • the 3D box is determined based on the corner point number, the 2D box, and the corner points.
  • The determining of the 3D frame based on the corner point number, the 2D frame and the corner point includes: determining the correspondence between the 2D frame and the 3D frame based on the corner point number and the corner point; and determining the 3D frame according to the correspondence and the 2D frame.
  • The determining of the correspondence between the 2D frame and the 3D frame includes: determining the correspondence according to a correspondence rule, together with the corner point number and the corner point.
  • Optionally, before the determining of the 3D frame of the target object, the processor 1601 further implements the following step when executing the computer instructions:
  • the orientation angle of the target object in the 3D space is obtained, and the orientation angle is used to indicate the orientation of the target object.
  • the determining of the 3D frame of the target object includes:
  • the 3D box is determined based on the orientation angle, the 2D box and the corner points.
  • The determining of the 3D frame based on the orientation angle, the 2D frame and the corner point includes: determining three vanishing points corresponding to the 3D frame according to the orientation angle; and determining the 3D frame according to the three vanishing points, the 2D frame and the corner point.
  • the determining three vanishing points corresponding to the 3D frame according to the orientation angle includes:
  • the three vanishing points are determined according to the projection matrix and the rotation matrix corresponding to the orientation angle.
  • Optionally, before the determining of the 3D frame according to the three vanishing points, the 2D frame and the corner point, the processor 1601 further implements the following step when executing the computer instructions: acquiring the corner point number of the corner point, where the corner point number is used to indicate the position of the corner point relative to the target object.
  • the determining the 3D frame according to the three vanishing points, the 2D frame and the corner points includes:
  • the 3D frame is determined according to the corner point number, the three vanishing points, the 2D frame and the corner points.
  • the corner point is located on the bottom edge of the 2D frame.
  • the corner point number is input by the user or pre-configured.
  • the orientation angle is user input or pre-configured.
  • the memory 1602 may include a first determination module 1401 , a third acquisition module 1402 and a second determination module 1403 .
  • the inclusion here only refers to that the functions of the first determination module 1401 , the third acquisition module 1402 and the second determination module 1403 can be implemented respectively when the instructions stored in the memory are executed, and are not limited to physical structures.
  • the above-mentioned 3D frame labeling device can be implemented as a hardware module, or as a circuit unit, in addition to being implemented by software as in the above-mentioned FIG. 16 .
  • an embodiment of the present application provides a neural network training method, including: using the 3D frame of the target object determined by the above-mentioned 3D frame labeling method, and a two-dimensional image including the above-mentioned target object, to train the neural network.
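  • For illustration only (the patent does not prescribe a training framework, network architecture, or loss function; the use of PyTorch and the 16-value corner encoding below are assumptions), training could consume pairs of two-dimensional images and the labeled projected 3D-frame corners roughly as follows:

```python
import torch
from torch import nn

class Box3DRegressor(nn.Module):
    """Toy network: image -> 16 values, the (u, v) coordinates of the 8 projected corners."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 16)

    def forward(self, x):
        return self.head(self.backbone(x))

model = Box3DRegressor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# `dataset` would yield (image, corners) pairs produced by the labeling method:
#   image   : 3xHxW tensor of the two-dimensional image containing the target object
#   corners : 16-vector with the image coordinates of the labeled 3D frame (p0..p7)
dataset = [(torch.rand(3, 128, 128), torch.rand(16)) for _ in range(8)]  # dummy data

for image, corners in dataset:
    pred = model(image.unsqueeze(0))
    loss = nn.functional.mse_loss(pred, corners.unsqueeze(0))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```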
  • the present application provides a computer-readable storage medium, including instructions, which, when executed on a computer, cause the computer to execute the above-mentioned 3D frame labeling method.
  • the present application provides a computer program product, characterized in that, the computer program product includes instructions that, when executed on a computer, cause the computer to execute the above-mentioned 3D box marking method.
  • the present application provides a movable platform, which can be a smart device or a transportation tool, such as an unmanned aerial vehicle, an unmanned vehicle, or a robot, etc., on which the above-mentioned 3D frame marking device is included.
  • the disclosed apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • The division of the units is only a division of logical functions; in actual implementation, there may be other ways of division. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional units.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Human Computer Interaction (AREA)
  • Processing Or Creating Images (AREA)
  • Image Analysis (AREA)

Abstract

A 3D frame labeling method, device, and computer-readable storage medium. The method labels the 3D frame directly at the image level, without relying on an additional depth sensor, which reduces cost. Moreover, the 3D frame of the target object can be obtained simply by labeling, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on the 2D frame; the processing is simple and the workload is reduced. In addition, the 3D frame obtained in the embodiments of the present application differs only slightly from the actual size of the object, which solves the problem that the 3D frame obtained by existing point-cloud-based labeling differs considerably from the actual size of the object, thereby ensuring accurate subsequent training of the neural network and improving its recognition performance.

Description

3D frame labeling method, device, and computer-readable storage medium
TECHNICAL FIELD
Embodiments of the present application relate to image processing technologies, and in particular, to a 3D frame labeling method, device, and computer-readable storage medium.
BACKGROUND
With the rapid development of artificial intelligence (AI) technologies ranging from neural networks to deep learning, these AI technologies can already be used to perceive the surrounding environment. For example, in autonomous driving, a neural network can be used to recognize images collected by cameras mounted on a vehicle, so as to obtain 2D or 3D information of surrounding target objects (such as surrounding vehicles, pedestrians, and trees). However, to obtain highly accurate recognition results, the neural network used needs to be trained first. For example, if a neural network is to be used to recognize an image and obtain a target object and its 3D information, the neural network first needs to be trained with training images and the known target objects and their 3D information in those images.
Obtaining the known target objects in an image and their 3D information (such as a pseudo 3D frame, i.e., the projection of the 3D frame on the two-dimensional image plane, hereinafter referred to simply as the 3D frame) usually requires manual annotation. One existing annotation method relies on active depth sensors such as external lidar sensors or depth cameras. This method relies on the external depth sensor to obtain depth information and generate point cloud data, directly labels the actual 3D frame of the object in 3D space, and then projects that 3D frame onto the image according to the coordinate transformation relationship between the sensors, thereby obtaining the 3D frame.
However, this kind of labeling method has a complicated process and high cost, and the 3D frame obtained from point cloud data differs considerably from the actual size of the object, which affects the subsequent training of the neural network and thus its recognition performance. Therefore, it is necessary to provide a more efficient and accurate labeling method.
Summary of the Invention
The embodiments of the present application provide a 3D frame labeling method, device, and computer-readable storage medium to overcome at least one of the above problems.
In a first aspect, an embodiment of the present application provides a 3D frame labeling method, including:
obtaining a 2D frame labeling operation, and determining, according to the 2D frame labeling operation, the 2D frame of a target object on a two-dimensional image containing the target object;
obtaining a corner point labeling operation, where the corner point is located on one edge of the 2D frame, and labeling the corner point on the 2D frame according to the corner point labeling operation;
determining and displaying the 3D frame of the target object based on the 2D frame and the corner point.
In a second aspect, an embodiment of the present application provides another 3D frame labeling method, including:
determining the 2D frame of a target object on a two-dimensional image containing the target object;
obtaining a labeled corner point, where the corner point is located on one edge of the 2D frame;
determining the 3D frame of the target object based on the 2D frame and the corner point.
In a third aspect, an embodiment of the present application provides a 3D frame labeling device, including a memory, a processor, an interaction unit, and computer instructions stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer instructions:
obtaining a 2D frame labeling operation through the interaction unit, and determining, according to the 2D frame labeling operation, the 2D frame of a target object on a two-dimensional image containing the target object;
obtaining a corner point labeling operation through the interaction unit, where the corner point is located on one edge of the 2D frame, and labeling the corner point on the 2D frame according to the corner point labeling operation;
determining the 3D frame of the target object based on the 2D frame and the corner point, and displaying it through the interaction unit.
In a fourth aspect, an embodiment of the present application provides another 3D frame labeling device, including a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer instructions:
determining the 2D frame of a target object on a two-dimensional image containing the target object;
obtaining a labeled corner point, where the corner point is located on one edge of the 2D frame;
determining the 3D frame of the target object based on the 2D frame and the corner point.
In a fifth aspect, an embodiment of the present application provides a neural network training method, including:
training a neural network using the 3D frame of a target object determined by the 3D frame labeling method described in the first aspect and its various possible designs, together with the two-dimensional image containing the target object.
In a sixth aspect, an embodiment of the present application provides another neural network training method, including:
training a neural network using the 3D frame of a target object determined by the 3D frame labeling method described in the second aspect and its various possible designs, together with the two-dimensional image containing the target object.
In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the 3D frame labeling method described in the first aspect and its various possible designs.
In an eighth aspect, an embodiment of the present application provides another computer-readable storage medium storing computer instructions that, when executed by a processor, implement the 3D frame labeling method described in the second aspect and its various possible designs.
With the 3D frame labeling method, device, and computer-readable storage medium provided by the embodiments of the present application, the 3D frame is labeled directly at the image level, without relying on an additional depth sensor, which reduces cost. The 3D frame of the target object is obtained simply by labeling, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on that 2D frame, so the process is simple and the workload is low. In addition, the 3D frame obtained by the embodiments differs only slightly from the actual size of the object, solving the problem that 3D frames obtained from existing point-cloud-based labeling differ considerably from the actual object size, thereby ensuring accurate subsequent training of the neural network and improving its recognition performance.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the present application.
Fig. 1 is a schematic diagram of a 3D frame labeling system architecture provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of a 3D frame labeling method provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of a 2D frame provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of a corner point on a 2D frame provided by an embodiment of the present application;
Fig. 5 is a schematic diagram of a 3D frame provided by an embodiment of the present application;
Fig. 6 is a schematic flowchart of another 3D frame labeling method provided by an embodiment of the present application;
Fig. 7 is a schematic flowchart of still another 3D frame labeling method provided by an embodiment of the present application;
Fig. 8 is a schematic flowchart of yet another 3D frame labeling method provided by an embodiment of the present application;
Fig. 9 is a schematic flowchart of yet another 3D frame labeling method provided by an embodiment of the present application;
Fig. 10 is a schematic flowchart of yet another 3D frame labeling method provided by an embodiment of the present application;
Fig. 11 is a schematic diagram of the correspondence between a 2D frame and a 3D frame provided by an embodiment of the present application;
Fig. 12 is a schematic flowchart of yet another 3D frame labeling method provided by an embodiment of the present application;
Fig. 13 is a schematic structural diagram of a 3D frame labeling apparatus provided by an embodiment of the present application;
Fig. 14 is a schematic structural diagram of another 3D frame labeling apparatus provided by an embodiment of the present application;
Fig. 15 is the basic hardware architecture of a 3D frame labeling device provided by an embodiment of the present application;
Fig. 16 is the basic hardware architecture of another 3D frame labeling device provided by an embodiment of the present application.
The above drawings show specific embodiments of the present application, which are described in more detail below. These drawings and the written description are not intended to limit the scope of the concept of the present application in any way, but to explain the concept of the present application to those skilled in the art by reference to specific embodiments.
Detailed Description
Exemplary embodiments are described in detail here, examples of which are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application as detailed in the appended claims.
The terms "first", "second", "third", "fourth", and the like (if present) in the specification, claims, and drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present application described here can be implemented in an order other than those illustrated or described here. Furthermore, the terms "comprising" and "having", and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such process, method, product, or device.
The 3D frame labeling method provided by the embodiments of the present application can be applied to the data annotation performed before neural network training, where the neural network can be used to obtain 2D or 3D information of target objects (such as vehicles and houses); the embodiments of the present application place no particular limitation on this. It should be noted that the 3D frame labeled in the embodiments of the present application refers to the projection of the object's three-dimensional frame onto the two-dimensional image, not the three-dimensional frame in actual three-dimensional space.
Optionally, the 3D frame labeling method provided by the embodiments of the present application can be applied to the application scenario shown in Fig. 1. Fig. 1 describes, only by way of example, one possible application scenario of the method; the application scenario of the method is not limited to that shown in Fig. 1.
Fig. 1 is a schematic diagram of the 3D frame labeling system architecture. In Fig. 1, obtaining 3D information of surrounding vehicles is taken as an example. The architecture includes a processing apparatus 11 and multiple cameras, here exemplified by a first camera 12, a second camera 13, and a third camera 14.
It can be understood that the structure illustrated in the embodiments of the present application does not constitute a specific limitation on the 3D frame labeling architecture. In other feasible implementations of the present application, the architecture may include more or fewer components than shown, combine some components, split some components, or arrange components differently, which can be determined according to the actual application scenario and is not limited here. The components shown in Fig. 1 can be implemented in hardware, in software, or in a combination of software and hardware.
In a specific implementation, the first camera 12, the second camera 13, and the third camera 14 can each capture images of surrounding vehicles. In the above application scenario, after capturing images, the cameras can send them to the processing apparatus 11. The processing apparatus 11 uses the received images as sample data, which, after annotation, can be used to train a neural network. After the processing apparatus 11 obtains the user's basic operations at the image level, it can directly generate the completed 3D frame annotation and then train the neural network with the known vehicles and their 3D information in those images.
The 3D frame labeling involved in the embodiments of the present application means labeling the 3D frame of the target object directly at the image level; it refers to the 3D frame obtained by projecting the actual 3D frame of the target object in three-dimensional space onto the two-dimensional image. That is, the 3D frame of the target object is obtained by labeling, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on that 2D frame, without relying on an additional depth sensor, which reduces cost; the process is simple and the workload is low.
In addition, the system architecture and business scenarios described in the embodiments of the present application are intended to explain the technical solutions of the embodiments more clearly and do not limit them. A person of ordinary skill in the art will appreciate that, as system architectures evolve and new business scenarios emerge, the technical solutions provided by the embodiments of the present application remain applicable to similar technical problems.
The 3D frame labeling method provided by the embodiments of the present application is described in detail below with reference to the drawings. The method may be executed by the processing apparatus 11 in Fig. 1. The workflow of the processing apparatus 11 mainly includes a 2D frame stage and a 3D frame stage. In the 2D frame stage, the processing apparatus 11 obtains the 2D frame of the target object labeled on the two-dimensional image containing the target object and one corner point on that 2D frame. In the 3D frame stage, the processing apparatus 11 generates the 3D frame of the target object from the 2D frame and the corner point on it, without relying on an additional depth sensor, which reduces cost; the process is simple and the workload is low.
The technical solution of the present application is described below through several embodiments; identical or similar concepts or processes may not be repeated in some embodiments.
Fig. 2 is a schematic flowchart of a 3D frame labeling method provided by an embodiment of the present application. The method may be executed by the processing apparatus 11 in Fig. 1; the specific executor can be determined according to the actual application scenario. As shown in Fig. 2, on the basis of the application scenario shown in Fig. 1, the method includes the following steps:
S201: Obtain a 2D frame labeling operation, and determine, according to the 2D frame labeling operation, the 2D frame of the target object on the two-dimensional image containing the target object.
Here, the target object can be determined according to the actual situation, for example a vehicle or a house; the embodiments of the present application place no particular limitation on this.
In the embodiments of the present application, taking a vehicle as the target object, after obtaining the 2D frame labeling operation, the processing apparatus 11 determines the 2D frame of the vehicle on the two-dimensional image containing the vehicle. Illustratively, the 2D frame labeling operation can be performed by an annotator. As shown in Fig. 3, the 2D frame of the vehicle completely encloses the vehicle on the two-dimensional image; the size of the 2D frame can be determined according to actual needs, and the closer it is to the size of the vehicle in the image, the better. The embodiments of the present application place no particular limitation on this.
In addition, after determining the 2D frame of the target object, the processing apparatus 11 can also obtain a 2D frame adjustment operation and, according to that operation, adjust the 2D frame, for example its size and position. A minimal data structure for such an annotation is sketched below.
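To make the annotation payload concrete, the following is a minimal sketch of how such an annotation record could be represented in code. It is illustrative only: the class and field names (`Box2D`, `Annotation`, `corner_id`, `yaw`) are assumptions introduced here, not identifiers from the application.

```python
from dataclasses import dataclass

@dataclass
class Box2D:
    """Axis-aligned 2D frame in pixel coordinates: (x_min, y_min) top-left, (x_max, y_max) bottom-right."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

@dataclass
class Annotation:
    """One labeled target object: its 2D frame, one corner point on a frame edge,
    an optional corner number, and an optional heading angle in 3D space (radians)."""
    box: Box2D
    corner: tuple        # (u, v) pixel coordinates of the labeled corner point
    corner_id: int = 0   # index of the corner relative to the object
    yaw: float = 0.0     # heading angle; 0.0 = facing straight ahead by default

# Example: a vehicle enclosed by a 2D frame, with the labeled corner on the bottom edge.
ann = Annotation(box=Box2D(100, 220, 380, 420), corner=(150.0, 420.0), corner_id=1)
```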
S202: Obtain a corner point labeling operation, where the corner point is located on one edge of the 2D frame, and label the corner point on the 2D frame according to the corner point labeling operation.
In the embodiments of the present application, a corner point is the projection onto the two-dimensional image of a corner of the actual three-dimensional frame of the labeled object. For example, when the object is a vehicle, its actual three-dimensional frame is usually a cuboid, and the labeled corner points are the projections of the corners of that cuboid onto the two-dimensional image. Which edge of the 2D frame the corner point lies on can be determined according to the actual situation; the embodiments of the present application place no particular limitation on this.
Illustratively, again taking a vehicle as the target object, as shown in Fig. 4, the corner point lies on the bottom edge of the 2D frame.
S203: Determine and display the 3D frame of the target object based on the 2D frame and the corner point.
Here, again taking a vehicle as the target object, as shown in Fig. 5, the 3D frame of the vehicle is obtained from its 2D frame and the corner point.
In the embodiments of the present application, the 3D frame is labeled directly at the image level, without relying on an additional depth sensor, which reduces cost. Moreover, the 3D frame of the target object is obtained simply by labeling, on the two-dimensional image containing the target object, the 2D frame of the target object and one corner point on that 2D frame, so the process is simple and the workload is low. In addition, the 3D frame obtained by the embodiments differs only slightly from the actual size of the object, solving the problem that 3D frames obtained from existing point-cloud-based labeling differ considerably from the actual object size, thereby ensuring accurate subsequent training of the neural network and improving its recognition performance.
In addition, in the embodiments of the present application, before determining and displaying the 3D frame of the target object, the corner number of the corner point is also obtained. Fig. 6 is a schematic flowchart of another 3D frame labeling method proposed by an embodiment of the present application. As shown in Fig. 6, the method includes:
S601: Obtain a 2D frame labeling operation, and determine, according to the 2D frame labeling operation, the 2D frame of the target object on the two-dimensional image containing the target object.
S602: Obtain a corner point labeling operation, where the corner point is located on one edge of the 2D frame, and label the corner point on the 2D frame according to the corner point labeling operation.
Steps S601-S602 are implemented in the same way as steps S201-S202 above and are not repeated here.
S603: Obtain the corner number of the corner point, where the corner number indicates the position of the corner point relative to the target object.
Here, the order of the corner numbers is not limited, as long as there is a mapping between a corner point and its corner number.
The mapping can be understood as a correspondence between the position of a corner point relative to the object and its corner number. For example, for the four corner points on the bottom edge of the object's 2D frame, starting from the rear-right position and going clockwise, the corner numbers may be p0, p1, p2, and p3. If a corner point is at the rear-left position of the bottom edge of the object's 2D frame, its corner number is p1.
The processing apparatus 11 can pre-store the mapping and, based on it, obtain the corner number of the corner point, which indicates the position of the corner point relative to the target object.
In addition, the corner number may be input by the user or preconfigured; the embodiments of the present application place no particular limitation on this. A sketch of such a numbering convention follows.
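As an illustration of one possible pre-stored mapping (the exact convention is left open by the application, so the names and ordering below are assumptions), a numbering of the four bottom corners could look like this:

```python
# Hypothetical corner-numbering convention for the four bottom corners of the
# object's cuboid, following the example in the text: start at the rear-right
# corner and proceed in the order rear-right, rear-left, front-left, front-right.
BOTTOM_CORNER_POSITIONS = {
    0: "rear-right",   # p0
    1: "rear-left",    # p1
    2: "front-left",   # p2
    3: "front-right",  # p3
}

def corner_id_for(position: str) -> int:
    """Return the corner number for a position such as 'rear-left' -> 1."""
    for cid, name in BOTTOM_CORNER_POSITIONS.items():
        if name == position:
            return cid
    raise ValueError(f"unknown position: {position}")

assert corner_id_for("rear-left") == 1  # matches the example in the description
```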
S604: Determine and display the 3D frame of the target object based on the corner number, the 2D frame, and the corner point.
In the embodiments of the present application, before determining and displaying the 3D frame of the target object, the corner number of the corner point is also obtained, and the 3D frame of the target object is then accurately determined and displayed based on the corner number, the 2D frame, and the corner point, meeting application needs. Illustratively, when the target object is a vehicle, since the vehicle is usually ahead and driving forward, its heading can be set by default to straight ahead. Once one corner number of the vehicle's 3D frame has been determined, and since the corner points of the 3D frame in the image fall on the edges of the labeled 2D frame, the vehicle's 3D frame can be generated directly from the vanishing points of the three-dimensional object in the 2D view together with the determined corner number and mapping. The embodiments label the 3D frame directly at the image level without relying on an additional depth sensor, which reduces cost; the 3D frame of the target object is obtained simply by labeling its 2D frame and one corner point on that frame, so the process is simple and the workload is low. Moreover, compared with manually drawing the 3D frame of the target object, this better satisfies the geometric constraints of projecting the actual three-dimensional frame onto the two-dimensional image, avoiding errors or mistakes caused by manual labeling and thus improving the accuracy and usability of the labeled data. The 3D frame obtained by the embodiments differs only slightly from the actual size of the object, solving the problem that 3D frames obtained from existing point-cloud-based labeling differ considerably from the actual object size, thereby ensuring accurate subsequent training of the neural network and improving its recognition performance.
In some other embodiments, before determining and displaying the 3D frame of the target object, the present application also obtains the heading angle of the target object in 3D space. Fig. 7 is a schematic flowchart of still another 3D frame labeling method proposed by an embodiment of the present application. As shown in Fig. 7, the method includes:
S701: Obtain a 2D frame labeling operation, and determine, according to the 2D frame labeling operation, the 2D frame of the target object on the two-dimensional image containing the target object.
S702: Obtain a corner point labeling operation, where the corner point is located on one edge of the 2D frame, and label the corner point on the 2D frame according to the corner point labeling operation.
Steps S701-S702 are implemented in the same way as steps S201-S202 above and are not repeated here.
S703: Obtain the heading angle of the target object in 3D space, where the heading angle indicates the orientation of the target object.
The heading angle may be input by the user or preset; the embodiments of the present application place no particular limitation on this. For example, taking a vehicle as the target object, the user may not input a heading angle, in which case the target object is assumed by default to face straight ahead with a heading angle of 0 degrees. The user may of course input other heading angles. For example, when the target object is a vehicle driving straight ahead, the default heading angle can be used; when a vehicle is driving at an angle or in the opposite direction, the actual heading angle can be entered according to the situation.
S704: Determine and display the 3D frame of the target object based on the heading angle, the 2D frame, and the corner point.
In the embodiments of the present application, before determining and displaying the 3D frame of the target object, the heading angle of the target object in 3D space is also obtained, and the 3D frame is then determined and displayed based on the heading angle, the 2D frame, and the corner point, so that the obtained 3D frame better matches reality. Specifically, when the heading angle takes the default setting, for example 0 degrees, the target object faces straight ahead and the corresponding vanishing point lies at the image center, so the positions of the other corner points of the target object can be determined from the vanishing point to generate the 3D frame. When the heading angle is another value input by the user, the target object faces another direction and the vanishing point lies elsewhere, but the 3D frame can likewise be generated from the vanishing point. The embodiments label the 3D frame directly at the image level without relying on an additional depth sensor, reducing cost; the 3D frame is obtained simply from the 2D frame and one corner point labeled on the two-dimensional image, so the process is simple and the workload is low; and the obtained 3D frame differs only slightly from the actual size of the object, solving the problem of existing point-cloud-based labeling and ensuring accurate subsequent training of the neural network and better recognition performance.
In addition, in the embodiments of the present application, the 2D frame can also be adjusted after the 3D frame of the target object is determined and displayed. Fig. 8 is a schematic flowchart of yet another 3D frame labeling method proposed by an embodiment of the present application. As shown in Fig. 8, the method includes:
S801: Obtain a 2D frame labeling operation, and determine, according to the 2D frame labeling operation, the 2D frame of the target object on the two-dimensional image containing the target object.
S802: Obtain a corner point labeling operation, where the corner point is located on one edge of the 2D frame, and label the corner point on the 2D frame according to the corner point labeling operation.
S803: Determine and display the 3D frame of the target object based on the 2D frame and the corner point.
Steps S801-S803 are implemented in the same way as steps S201-S203 above and are not repeated here.
S804: The 2D frame labeling operation includes at least one of a frame-selection operation, a move operation, and a rotate operation; adjust the 2D frame according to the 2D frame labeling operation.
Besides the above, the 2D frame labeling operation may include other operations; the embodiments of the present application place no particular limitation on this.
In the embodiments of the present application, the 2D frame can also be adjusted after the 3D frame of the target object is determined and displayed, so that a new 3D frame can be generated based on the adjusted 2D frame, meeting a variety of application needs. Moreover, the 3D frame is labeled directly at the image level without relying on an additional depth sensor, reducing cost; the 3D frame is obtained simply from the 2D frame and one corner point labeled on the two-dimensional image, so the process is simple and the workload is low; and the obtained 3D frame differs only slightly from the actual size of the object, solving the problem of existing point-cloud-based labeling and ensuring accurate subsequent training of the neural network and better recognition performance.
Fig. 9 is a schematic flowchart of yet another 3D frame labeling method provided by an embodiment of the present application. The method may be executed by the processing apparatus 11 in Fig. 1; the specific executor can be determined according to the actual application scenario. As shown in Fig. 9, on the basis of the application scenario shown in Fig. 1, the method includes the following steps:
S901: Determine the 2D frame of the target object on the two-dimensional image containing the target object.
Here, after determining the 2D frame of the target object, the processing apparatus 11 can also obtain a 2D frame adjustment operation and, according to that operation, adjust the 2D frame, for example its size and position.
S902: Obtain a labeled corner point, where the corner point is located on one edge of the 2D frame.
In a possible implementation, the corner point lies on the bottom edge of the 2D frame.
S903: Determine the 3D frame of the target object based on the 2D frame and the corner point.
In the embodiments of the present application, the 3D frame is labeled directly at the image level without relying on an additional depth sensor, reducing cost; the 3D frame of the target object is obtained simply by labeling its 2D frame and one corner point on that frame on the two-dimensional image, so the process is simple and the workload is low; and the obtained 3D frame differs only slightly from the actual size of the object, solving the problem that 3D frames obtained from existing point-cloud-based labeling differ considerably from the actual object size, thereby ensuring accurate subsequent training of the neural network and improving its recognition performance.
In addition, in the embodiments of the present application, before determining the 3D frame of the target object, the corner number of the corner point is also obtained. Fig. 10 is a schematic flowchart of yet another 3D frame labeling method proposed by an embodiment of the present application. As shown in Fig. 10, the method includes:
S1001: Determine the 2D frame of the target object on the two-dimensional image containing the target object.
S1002: Obtain a labeled corner point, where the corner point is located on one edge of the 2D frame.
Steps S1001-S1002 are implemented in the same way as steps S901-S902 above and are not repeated here.
S1003: Obtain the corner number of the corner point, where the corner number indicates the position of the corner point relative to the target object.
The corner number may be input by the user or preconfigured.
S1004: Determine the 3D frame of the target object based on the corner number, the 2D frame, and the corner point.
In a possible implementation, the processing apparatus 11 can determine the correspondence between the 2D frame and the 3D frame based on the corner number and the corner point, and then determine the 3D frame from that correspondence and the 2D frame.
Illustratively, again taking a vehicle as the target object, the correspondence between the 2D frame and the 3D frame is shown in Fig. 11, where "front" denotes the front and "rear" denotes the rear.
In the embodiments of the present application, determining the correspondence between the 2D frame and the 3D frame may include:
obtaining a pre-stored correspondence rule between an object's 2D frame and the corner points of its 3D frame;
determining the correspondence according to the correspondence rule, the corner number, and the corner point.
The correspondence rule can be set according to the actual situation; the embodiments of the present application place no particular limitation on this. A sketch of one possible rule is given below.
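As an illustration only (the application leaves the rule open, so the table below is an assumption, not the rule used by the application), a pre-stored correspondence rule could record, for each labeled corner number, which projected 3D corners are expected to touch which edges of the 2D frame:

```python
# Hypothetical pre-stored correspondence rule, keyed by the labeled corner number.
# p0..p3 denote the bottom corners and p4..p7 the top corners in Fig. 11's naming;
# the concrete entries are illustrative placeholders, not values from the application.
CORRESPONDENCE_RULE = {
    0: {"bottom": "p0", "left": "p1", "right": "p3", "top": "p6"},
    1: {"bottom": "p1", "left": "p2", "right": "p0", "top": "p7"},
    2: {"bottom": "p2", "left": "p3", "right": "p1", "top": "p4"},
    3: {"bottom": "p3", "left": "p0", "right": "p2", "top": "p5"},
}

def corners_on_edges(corner_id: int) -> dict:
    """Which 3D corners the rule pins to which edges of the 2D frame."""
    return CORRESPONDENCE_RULE[corner_id]

print(corners_on_edges(1))  # constraints used when the labeled corner is p1
```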
In the embodiments of the present application, before determining the 3D frame of the target object, the corner number of the corner point is also obtained, and the 3D frame of the target object is then accurately determined based on the corner number, the 2D frame, and the corner point, meeting application needs. Moreover, the 3D frame is labeled directly at the image level without relying on an additional depth sensor, reducing cost; the 3D frame is obtained simply from the 2D frame and one corner point labeled on the two-dimensional image, so the process is simple and the workload is low; and the obtained 3D frame differs only slightly from the actual size of the object, solving the problem of existing point-cloud-based labeling and ensuring accurate subsequent training of the neural network and better recognition performance.
In addition, in the embodiments of the present application, before determining the 3D frame of the target object, the heading angle of the target object in 3D space is also obtained. Fig. 12 is a schematic flowchart of yet another 3D frame labeling method proposed by an embodiment of the present application. As shown in Fig. 12, the method includes:
S1201: Determine the 2D frame of the target object on the two-dimensional image containing the target object.
S1202: Obtain a labeled corner point, where the corner point is located on one edge of the 2D frame.
Steps S1201-S1202 are implemented in the same way as steps S901-S902 above and are not repeated here.
S1203: Obtain the heading angle of the target object in 3D space, where the heading angle indicates the orientation of the target object.
The heading angle may be input by the user or preconfigured.
S1204: Determine the 3D frame of the target object based on the heading angle, the 2D frame, and the corner point.
In a possible implementation, the processing apparatus 11 can determine, from the heading angle, the three vanishing points corresponding to the 3D frame, and then determine the 3D frame from the three vanishing points, the 2D frame, and the corner point. Illustratively, the 3D frame is determined from the three vanishing points, the 2D frame, and the corner point by exploiting the parallelism of the edges of a cuboid.
Determining the three vanishing points corresponding to the 3D frame from the heading angle may include:
obtaining the projection matrix of the image acquisition apparatus corresponding to the two-dimensional image;
determining the three vanishing points from the projection matrix and the projection matrix of the heading angle.
Illustratively, the projection matrix is multiplied by the projection matrix of the heading angle, and the three vanishing points are determined from the result of the matrix multiplication, for example the three vanishing points vp0, vp1, and vp2 shown in Fig. 11. A sketch of this computation is given below.
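The following is a minimal sketch of one common way to compute such vanishing points, assuming a pinhole camera with intrinsic matrix K and interpreting "the projection matrix of the heading angle" as the rotation matrix built from the yaw angle; these interpretations, the axis convention, and the variable names are assumptions, not details fixed by the application.

```python
import numpy as np

def vanishing_points(K: np.ndarray, yaw: float) -> np.ndarray:
    """Return the pixel coordinates of the three vanishing points (vp0, vp1, vp2)
    of a cuboid whose heading (yaw) is given, under camera intrinsics K (3x3).

    Each vanishing point is the projection of one of the cuboid's three edge
    directions: K @ R @ e_i in homogeneous coordinates, then dehomogenized.
    """
    c, s = np.cos(yaw), np.sin(yaw)
    # Rotation about the vertical axis (assumed convention: x right, y down, z forward).
    R = np.array([[ c, 0.0,   s],
                  [0.0, 1.0, 0.0],
                  [ -s, 0.0,   c]])
    vps = []
    for direction in np.eye(3):          # the cuboid's three edge directions
        h = K @ R @ direction            # homogeneous image point
        vps.append(h[:2] / h[2] if abs(h[2]) > 1e-9 else np.array([np.inf, np.inf]))
    return np.array(vps)

# Example with a hypothetical camera whose principal point is the image centre.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0,    0.0,   1.0]])
print(vanishing_points(K, yaw=0.0))  # with yaw 0, the forward vanishing point is the image centre
```

This is consistent with the earlier observation in the description that, for a heading angle of 0 degrees, the vanishing point of the forward direction falls at the image center, while the two lateral directions vanish at infinity.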
In addition, before determining the 3D frame from the three vanishing points, the 2D frame, and the corner point, the method further includes:
obtaining the corner number of the corner point, where the corner number indicates the position of the corner point relative to the target object.
Correspondingly, determining the 3D frame from the three vanishing points, the 2D frame, and the corner point includes:
determining the 3D frame from the corner number, the three vanishing points, the 2D frame, and the corner point.
Illustratively, the processing apparatus 11 can determine the correspondence between the 2D frame and the 3D frame based on the corner number and the corner point, and then, exploiting the parallelism of the edges of a cuboid, solve for the positions of the eight projected corner points of the 3D frame from that correspondence, the three vanishing points, and the 2D frame, for example the corner points p0, p1, p2, p3, p4, p5, p6, and p7 in Fig. 11, thereby determining the 3D frame.
Here, the embodiments of the present application combine the intrinsic geometry of the cuboid with the projection model of the image acquisition apparatus, so that the labeled pseudo 3D frame is guaranteed to satisfy the cuboid's intrinsic geometric relations, giving higher labeling accuracy and consistency. A sketch of the line-intersection step used in such a construction is given below.
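The core geometric operation in such a construction is intersecting image lines that pass through a vanishing point. The sketch below shows only that building block, in homogeneous coordinates; how the lines are chosen from the 2D frame, the labeled corner, and the correspondence rule is not spelled out here, so the surrounding numbers and the naming of the recovered corner are assumptions for illustration.

```python
import numpy as np

def line_through(p: np.ndarray, q: np.ndarray) -> np.ndarray:
    """Homogeneous line through two 2D points given as (u, v)."""
    return np.cross(np.append(p, 1.0), np.append(q, 1.0))

def intersect(l1: np.ndarray, l2: np.ndarray) -> np.ndarray:
    """Intersection of two homogeneous lines, returned as (u, v)."""
    h = np.cross(l1, l2)
    return h[:2] / h[2]

# Example: the labeled bottom corner and the forward vanishing point define one
# cuboid edge in the image; intersecting that edge with the left border of the
# 2D frame gives the next projected corner. All numbers are made up.
p1 = np.array([150.0, 420.0])          # labeled corner on the bottom edge
vp_forward = np.array([640.0, 360.0])  # vanishing point of the forward direction
x_left = 100.0                         # left border of the 2D frame: u = x_left
left_border = np.array([1.0, 0.0, -x_left])   # line u - x_left = 0

edge = line_through(p1, vp_forward)
next_corner = intersect(edge, left_border)
print(next_corner)   # projected corner lying on the left border of the 2D frame
```

Repeating this intersection for the edges implied by the correspondence rule and the three vanishing points yields the eight projected corners p0 to p7.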
In the embodiments of the present application, before determining and displaying the 3D frame of the target object, the heading angle of the target object in 3D space is also obtained, and the 3D frame is then determined and displayed based on the heading angle, the 2D frame, and the corner point, so that the obtained 3D frame better matches reality. Moreover, the 3D frame is labeled directly at the image level without relying on an additional depth sensor, reducing cost; the 3D frame is obtained simply from the 2D frame and one corner point labeled on the two-dimensional image, so the process is simple and the workload is low; and the obtained 3D frame differs only slightly from the actual size of the object, solving the problem of existing point-cloud-based labeling and ensuring accurate subsequent training of the neural network and better recognition performance.
Corresponding to the 3D frame labeling methods of the above embodiments, Fig. 13 is a schematic structural diagram of a 3D frame labeling apparatus provided by an embodiment of the present application. For ease of description, only the parts relevant to the embodiments of the present application are shown. The 3D frame labeling apparatus 1300 includes a first obtaining module 1301, a second obtaining module 1302, and a display module 1303. The 3D frame labeling apparatus here may be the processing apparatus 11 itself, or a chip or integrated circuit implementing the functions of the processing apparatus 11. It should be noted that the division into the first obtaining module, the second obtaining module, and the display module is only a division by logical function; physically they may be integrated or independent.
The first obtaining module 1301 is configured to obtain a 2D frame labeling operation and determine, according to the 2D frame labeling operation, the 2D frame of the target object on the two-dimensional image containing the target object.
The second obtaining module 1302 is configured to obtain a corner point labeling operation, where the corner point is located on one edge of the 2D frame, and to label the corner point on the 2D frame according to the corner point labeling operation.
The display module 1303 is configured to determine and display the 3D frame of the target object based on the 2D frame and the corner point.
In a possible implementation, before determining and displaying the 3D frame of the target object, the display module 1303 is further configured to:
obtain the corner number of the corner point, where the corner number indicates the position of the corner point relative to the target object.
In a possible implementation, the display module 1303 is specifically configured to:
determine and display the 3D frame based on the corner number, the 2D frame, and the corner point.
In a possible implementation, before determining and displaying the 3D frame of the target object, the display module 1303 is further configured to:
obtain the heading angle of the target object in 3D space, where the heading angle indicates the orientation of the target object.
In a possible implementation, the display module 1303 is specifically configured to:
determine and display the 3D frame based on the heading angle, the 2D frame, and the corner point.
In a possible implementation, the corner point lies on the bottom edge of the 2D frame.
In a possible implementation, the 2D frame labeling operation includes at least one of a frame-selection operation, a move operation, and a rotate operation.
In a possible implementation, after determining and displaying the 3D frame of the target object, the display module 1303 is further configured to:
adjust the 2D frame according to the 2D frame labeling operation.
In a possible implementation, the corner number is input by the user or preconfigured.
In a possible implementation, the heading angle is input by the user or preconfigured.
The apparatus provided by the embodiments of the present application can be used to execute the technical solutions of the above method embodiments; its implementation principle and technical effects are similar and are not repeated here.
Fig. 14 is a schematic structural diagram of another 3D frame labeling apparatus provided by an embodiment of the present application. The 3D frame labeling apparatus 1400 includes a first determining module 1401, a third obtaining module 1402, and a second determining module 1403. The 3D frame labeling apparatus here may be the processing apparatus 11 itself, or a chip or integrated circuit implementing the functions of the processing apparatus 11. It should be noted that the division into the first determining module, the third obtaining module, and the second determining module is only a division by logical function; physically they may be integrated or independent.
The first determining module 1401 is configured to determine the 2D frame of the target object on the two-dimensional image containing the target object.
The third obtaining module 1402 is configured to obtain a labeled corner point, where the corner point is located on one edge of the 2D frame.
The second determining module 1403 is configured to determine the 3D frame of the target object based on the 2D frame and the corner point.
In a possible implementation, before determining the 3D frame of the target object, the second determining module 1403 is further configured to:
obtain the corner number of the corner point, where the corner number indicates the position of the corner point relative to the target object.
In a possible implementation, the second determining module 1403 is specifically configured to:
determine the 3D frame based on the corner number, the 2D frame, and the corner point.
In a possible implementation, the second determining module 1403 determining the 3D frame based on the corner number, the 2D frame, and the corner point includes:
determining the correspondence between the 2D frame and the 3D frame based on the corner number and the corner point;
determining the 3D frame from the correspondence and the 2D frame.
In a possible implementation, the second determining module 1403 determining the correspondence between the 2D frame and the 3D frame includes:
obtaining a pre-stored correspondence rule between an object's 2D frame and the corner points of its 3D frame;
determining the correspondence according to the correspondence rule, the corner number, and the corner point.
In a possible implementation, before determining the 3D frame of the target object, the second determining module 1403 is further configured to:
obtain the heading angle of the target object in 3D space, where the heading angle indicates the orientation of the target object.
In a possible implementation, the second determining module 1403 is specifically configured to:
determine the 3D frame based on the heading angle, the 2D frame, and the corner point.
In a possible implementation, the second determining module 1403 determining the 3D frame based on the heading angle, the 2D frame, and the corner point includes:
determining, from the heading angle, the three vanishing points corresponding to the 3D frame;
determining the 3D frame from the three vanishing points, the 2D frame, and the corner point.
In a possible implementation, the second determining module 1403 determining the three vanishing points corresponding to the 3D frame from the heading angle includes:
obtaining the projection matrix of the image acquisition apparatus corresponding to the two-dimensional image;
determining the three vanishing points from the projection matrix and the projection matrix of the heading angle.
In a possible implementation, before determining the 3D frame from the three vanishing points, the 2D frame, and the corner point, the second determining module 1403 is further configured to:
obtain the corner number of the corner point, where the corner number indicates the position of the corner point relative to the target object;
and determining the 3D frame from the three vanishing points, the 2D frame, and the corner point includes:
determining the 3D frame from the corner number, the three vanishing points, the 2D frame, and the corner point.
In a possible implementation, the corner point lies on the bottom edge of the 2D frame.
In a possible implementation, the corner number is input by the user or preconfigured.
In a possible implementation, the heading angle is input by the user or preconfigured.
The apparatus provided by the embodiments of the present application can be used to execute the technical solutions of the above method embodiments; its implementation principle and technical effects are similar and are not repeated here.
Optionally, Fig. 15 schematically provides one possible basic hardware architecture of the 3D frame labeling device described in the present application.
Referring to Fig. 15, the 3D frame labeling device 1500 includes at least one processor 1501 and a memory 1502. Further optionally, it may also include a communication interface 1503 and a bus 1504.
The 3D frame labeling device 1500 may be a computer or a server; the present application places no particular limitation on this. The number of processors 1501 in the device 1500 may be one or more; Fig. 15 shows only one processor 1501. Optionally, the processor 1501 may be a central processing unit (CPU), a graphics processing unit (GPU), or a digital signal processor (DSP). If the device 1500 has multiple processors 1501, their types may be different or the same. Optionally, the multiple processors 1501 of the device 1500 may also be integrated as a multi-core processor.
The memory 1502 stores computer instructions and data; it can store the computer instructions and data required to implement the 3D frame labeling methods provided by the present application, for example instructions for implementing the steps of those methods. The memory 1502 may be any one or any combination of the following storage media: non-volatile memory (such as read-only memory (ROM), solid-state drive (SSD), hard disk drive (HDD), or optical disc) and volatile memory.
The communication interface 1503 can provide information input/output for the at least one processor, and may also include any one or any combination of devices with network access functions, such as a network interface (for example an Ethernet interface) or a wireless network card.
Optionally, the communication interface 1503 can also be used for data communication between the 3D frame labeling device 1500 and other computing devices or terminals.
Further optionally, Fig. 15 represents the bus 1504 with a thick line. The bus 1504 can connect the processor 1501 with the memory 1502 and the communication interface 1503, so that through the bus 1504 the processor 1501 can access the memory 1502 and can also exchange data with other computing devices or terminals through the communication interface 1503.
In the present application, the processor 1501 executes the computer instructions in the memory 1502 to implement the following steps:
obtaining a 2D frame labeling operation, and determining, according to the 2D frame labeling operation, the 2D frame of the target object on the two-dimensional image containing the target object;
obtaining a corner point labeling operation, where the corner point is located on one edge of the 2D frame, and labeling the corner point on the 2D frame according to the corner point labeling operation;
determining and displaying the 3D frame of the target object based on the 2D frame and the corner point.
In a possible implementation, before determining and displaying the 3D frame of the target object, the processor 1501, when executing the computer instructions, further implements the following step:
obtaining the corner number of the corner point, where the corner number indicates the position of the corner point relative to the target object.
In a possible implementation, determining and displaying the 3D frame of the target object includes:
determining and displaying the 3D frame based on the corner number, the 2D frame, and the corner point.
In a possible implementation, before determining and displaying the 3D frame of the target object, the processor 1501, when executing the computer instructions, further implements the following step:
obtaining the heading angle of the target object in 3D space, where the heading angle indicates the orientation of the target object.
In a possible implementation, determining and displaying the 3D frame of the target object includes:
determining and displaying the 3D frame based on the heading angle, the 2D frame, and the corner point.
In a possible implementation, the corner point lies on the bottom edge of the 2D frame.
In a possible implementation, the 2D frame labeling operation includes at least one of a frame-selection operation, a move operation, and a rotate operation.
In a possible implementation, after determining and displaying the 3D frame of the target object, the processor 1501, when executing the computer instructions, further implements the following step:
adjusting the 2D frame according to the 2D frame labeling operation.
In a possible implementation, the corner number is input by the user or preconfigured.
In a possible implementation, the heading angle is input by the user or preconfigured.
In addition, in terms of division by logical function, as shown by way of example in Fig. 15, the memory 1502 may include the first obtaining module 1301, the second obtaining module 1302, and the display module 1303. This inclusion only means that the functions of those modules can be realized when the instructions stored in the memory are executed; it does not imply a physical structure.
Besides being implemented in software as in Fig. 15, the above 3D frame labeling device can also be implemented in hardware, as a hardware module or as a circuit unit.
Optionally, Fig. 16 schematically provides another possible basic hardware architecture of the 3D frame labeling device described in the present application.
Referring to Fig. 16, the 3D frame labeling device 1600 includes at least one processor 1601 and a memory 1602. Further optionally, it may also include a communication interface 1603 and a bus 1604.
The 3D frame labeling device 1600 may be a computer or a server; the present application places no particular limitation on this. The number of processors 1601 in the device 1600 may be one or more; Fig. 16 shows only one processor 1601. Optionally, the processor 1601 may be a central processing unit (CPU), a graphics processing unit (GPU), or a digital signal processor (DSP). If the device 1600 has multiple processors 1601, their types may be different or the same. Optionally, the multiple processors 1601 of the device 1600 may also be integrated as a multi-core processor.
The memory 1602 stores computer instructions and data; it can store the computer instructions and data required to implement the 3D frame labeling methods provided by the present application, for example instructions for implementing the steps of those methods. The memory 1602 may be any one or any combination of the following storage media: non-volatile memory (such as read-only memory (ROM), solid-state drive (SSD), hard disk drive (HDD), or optical disc) and volatile memory.
The communication interface 1603 can provide information input/output for the at least one processor, and may also include any one or any combination of devices with network access functions, such as a network interface (for example an Ethernet interface) or a wireless network card.
Optionally, the communication interface 1603 can also be used for data communication between the 3D frame labeling device 1600 and other computing devices or terminals.
Further optionally, Fig. 16 represents the bus 1604 with a thick line. The bus 1604 can connect the processor 1601 with the memory 1602 and the communication interface 1603, so that through the bus 1604 the processor 1601 can access the memory 1602 and can also exchange data with other computing devices or terminals through the communication interface 1603.
In the present application, the processor 1601 executes the computer instructions in the memory 1602 to implement the following steps:
determining the 2D frame of the target object on the two-dimensional image containing the target object;
obtaining a labeled corner point, where the corner point is located on one edge of the 2D frame;
determining the 3D frame of the target object based on the 2D frame and the corner point.
In a possible implementation, before determining the 3D frame of the target object, the processor 1601, when executing the computer instructions, further implements the following step:
obtaining the corner number of the corner point, where the corner number indicates the position of the corner point relative to the target object.
In a possible implementation, determining the 3D frame of the target object includes:
determining the 3D frame based on the corner number, the 2D frame, and the corner point.
In a possible implementation, determining the 3D frame based on the corner number, the 2D frame, and the corner point includes:
determining the correspondence between the 2D frame and the 3D frame based on the corner number and the corner point;
determining the 3D frame from the correspondence and the 2D frame.
In a possible implementation, determining the correspondence between the 2D frame and the 3D frame includes:
obtaining a pre-stored correspondence rule between an object's 2D frame and the corner points of its 3D frame;
determining the correspondence according to the correspondence rule, the corner number, and the corner point.
In a possible implementation, before determining the 3D frame of the target object, the processor 1601, when executing the computer instructions, further implements the following step:
obtaining the heading angle of the target object in 3D space, where the heading angle indicates the orientation of the target object.
In a possible implementation, determining the 3D frame of the target object includes:
determining the 3D frame based on the heading angle, the 2D frame, and the corner point.
In a possible implementation, determining the 3D frame based on the heading angle, the 2D frame, and the corner point includes:
determining, from the heading angle, the three vanishing points corresponding to the 3D frame;
determining the 3D frame from the three vanishing points, the 2D frame, and the corner point.
In a possible implementation, determining the three vanishing points corresponding to the 3D frame from the heading angle includes:
obtaining the projection matrix of the image acquisition apparatus corresponding to the two-dimensional image;
determining the three vanishing points from the projection matrix and the projection matrix of the heading angle.
In a possible implementation, before determining the 3D frame from the three vanishing points, the 2D frame, and the corner point, the processor 1601, when executing the computer instructions, further implements the following step:
obtaining the corner number of the corner point, where the corner number indicates the position of the corner point relative to the target object;
and determining the 3D frame from the three vanishing points, the 2D frame, and the corner point includes:
determining the 3D frame from the corner number, the three vanishing points, the 2D frame, and the corner point.
In a possible implementation, the corner point lies on the bottom edge of the 2D frame.
In a possible implementation, the corner number is input by the user or preconfigured.
In a possible implementation, the heading angle is input by the user or preconfigured.
In addition, in terms of division by logical function, as shown by way of example in Fig. 16, the memory 1602 may include the first determining module 1401, the third obtaining module 1402, and the second determining module 1403. This inclusion only means that the functions of those modules can be realized when the instructions stored in the memory are executed; it does not imply a physical structure.
Besides being implemented in software as in Fig. 16, the above 3D frame labeling device can also be implemented in hardware, as a hardware module or as a circuit unit.
In addition, an embodiment of the present application provides a neural network training method, including: training a neural network using the 3D frame of a target object determined by the above 3D frame labeling method, together with the two-dimensional image containing the target object. A sketch of such a training setup is given below.
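For illustration only, the following is a minimal sketch of how the labeled pseudo 3D frames could be used as training targets for a network that regresses the eight projected corners from an image; the network architecture, dataset handling, and loss choice are all assumptions, not specified by the application.

```python
import torch
import torch.nn as nn

class CornerRegressor(nn.Module):
    """Toy network: image -> the 8 projected corner points of the 3D frame (16 numbers)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 16)

    def forward(self, x):
        return self.head(self.backbone(x))

def train_step(model, optimizer, images, corner_targets):
    """images: (B, 3, H, W); corner_targets: (B, 16) pixel coordinates of the labeled 3D frame."""
    optimizer.zero_grad()
    loss = nn.functional.l1_loss(model(images), corner_targets)
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy usage with random tensors standing in for annotated images and 3D frames.
model = CornerRegressor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
images = torch.rand(4, 3, 128, 128)
targets = torch.rand(4, 16) * 128
print(train_step(model, opt, images, targets))
```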
The present application provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to execute the above 3D frame labeling method.
The present application provides a computer program product including instructions that, when run on a computer, cause the computer to execute the above 3D frame labeling method.
The present application provides a movable platform, which may be a smart device or a means of transport, such as an unmanned aerial vehicle, an unmanned vehicle, or a robot, on which the above 3D frame labeling device is included.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are only illustrative; the division into units is only a division by logical function, and in actual implementation there may be other ways of division, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.

Claims (50)

  1. A 3D frame labeling method, characterized by comprising:
    obtaining a 2D frame labeling operation, and determining, according to the 2D frame labeling operation, the 2D frame of a target object on a two-dimensional image containing the target object;
    obtaining a corner point labeling operation, wherein the corner point is located on one edge of the 2D frame, and labeling the corner point on the 2D frame according to the corner point labeling operation;
    determining and displaying the 3D frame of the target object based on the 2D frame and the corner point.
  2. The method according to claim 1, characterized in that, before determining and displaying the 3D frame of the target object, the method further comprises:
    obtaining the corner number of the corner point, the corner number indicating the position of the corner point relative to the target object.
  3. The method according to claim 2, characterized in that determining and displaying the 3D frame of the target object comprises:
    determining and displaying the 3D frame based on the corner number, the 2D frame, and the corner point.
  4. The method according to claim 1, characterized in that, before determining and displaying the 3D frame of the target object, the method further comprises:
    obtaining the heading angle of the target object in 3D space, the heading angle indicating the orientation of the target object.
  5. The method according to claim 4, characterized in that determining and displaying the 3D frame of the target object comprises:
    determining and displaying the 3D frame based on the heading angle, the 2D frame, and the corner point.
  6. The method according to any one of claims 1 to 5, characterized in that the corner point is located on the bottom edge of the 2D frame.
  7. The method according to any one of claims 1 to 6, characterized in that the 2D frame labeling operation comprises at least one of a frame-selection operation, a move operation, and a rotate operation.
  8. The method according to claim 7, characterized in that, after determining and displaying the 3D frame of the target object, the method further comprises:
    adjusting the 2D frame according to the 2D frame labeling operation.
  9. The method according to claim 2 or 3, characterized in that the corner number is input by a user or preconfigured.
  10. The method according to claim 4 or 5, characterized in that the heading angle is input by a user or preconfigured.
  11. A 3D frame labeling method, characterized by comprising:
    determining the 2D frame of a target object on a two-dimensional image containing the target object;
    obtaining a labeled corner point, wherein the corner point is located on one edge of the 2D frame;
    determining the 3D frame of the target object based on the 2D frame and the corner point.
  12. The method according to claim 11, characterized in that, before determining the 3D frame of the target object, the method further comprises:
    obtaining the corner number of the corner point, the corner number indicating the position of the corner point relative to the target object.
  13. The method according to claim 12, characterized in that determining the 3D frame of the target object comprises:
    determining the 3D frame based on the corner number, the 2D frame, and the corner point.
  14. The method according to claim 13, characterized in that determining the 3D frame based on the corner number, the 2D frame, and the corner point comprises:
    determining the correspondence between the 2D frame and the 3D frame based on the corner number and the corner point;
    determining the 3D frame according to the correspondence and the 2D frame.
  15. The method according to claim 14, characterized in that determining the correspondence between the 2D frame and the 3D frame comprises:
    obtaining a pre-stored correspondence rule between an object's 2D frame and the corner points of its 3D frame;
    determining the correspondence according to the correspondence rule, the corner number, and the corner point.
  16. The method according to claim 11, characterized in that, before determining the 3D frame of the target object, the method further comprises:
    obtaining the heading angle of the target object in 3D space, the heading angle indicating the orientation of the target object.
  17. The method according to claim 16, characterized in that determining the 3D frame of the target object comprises:
    determining the 3D frame based on the heading angle, the 2D frame, and the corner point.
  18. The method according to claim 17, characterized in that determining the 3D frame based on the heading angle, the 2D frame, and the corner point comprises:
    determining, according to the heading angle, the three vanishing points corresponding to the 3D frame;
    determining the 3D frame according to the three vanishing points, the 2D frame, and the corner point.
  19. The method according to claim 18, characterized in that determining, according to the heading angle, the three vanishing points corresponding to the 3D frame comprises:
    obtaining the projection matrix of the image acquisition apparatus corresponding to the two-dimensional image;
    determining the three vanishing points according to the projection matrix and the projection matrix of the heading angle.
  20. The method according to claim 18 or 19, characterized in that, before determining the 3D frame according to the three vanishing points, the 2D frame, and the corner point, the method further comprises:
    obtaining the corner number of the corner point, the corner number indicating the position of the corner point relative to the target object;
    and determining the 3D frame according to the three vanishing points, the 2D frame, and the corner point comprises:
    determining the 3D frame according to the corner number, the three vanishing points, the 2D frame, and the corner point.
  21. The method according to any one of claims 11 to 20, characterized in that the corner point is located on the bottom edge of the 2D frame.
  22. The method according to any one of claims 12 to 15, characterized in that the corner number is input by a user or preconfigured.
  23. The method according to any one of claims 16 to 20, characterized in that the heading angle is input by a user or preconfigured.
  24. A 3D frame labeling device, characterized by comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer instructions, implements the following steps:
    obtaining a 2D frame labeling operation, and determining, according to the 2D frame labeling operation, the 2D frame of a target object on a two-dimensional image containing the target object;
    obtaining a corner point labeling operation, wherein the corner point is located on one edge of the 2D frame, and labeling the corner point on the 2D frame according to the corner point labeling operation;
    determining and displaying the 3D frame of the target object based on the 2D frame and the corner point.
  25. The device according to claim 24, characterized in that, before determining and displaying the 3D frame of the target object, the processor, when executing the computer instructions, further implements the following step:
    obtaining the corner number of the corner point, the corner number indicating the position of the corner point relative to the target object.
  26. The device according to claim 25, characterized in that determining and displaying the 3D frame of the target object comprises:
    determining and displaying the 3D frame based on the corner number, the 2D frame, and the corner point.
  27. The device according to claim 24, characterized in that, before determining and displaying the 3D frame of the target object, the processor, when executing the computer instructions, further implements the following step:
    obtaining the heading angle of the target object in 3D space, the heading angle indicating the orientation of the target object.
  28. The device according to claim 27, characterized in that determining and displaying the 3D frame of the target object comprises:
    determining and displaying the 3D frame based on the heading angle, the 2D frame, and the corner point.
  29. The device according to any one of claims 24 to 28, characterized in that the corner point is located on the bottom edge of the 2D frame.
  30. The device according to any one of claims 24 to 29, characterized in that the 2D frame labeling operation comprises at least one of a frame-selection operation, a move operation, and a rotate operation.
  31. The device according to claim 30, characterized in that, after determining and displaying the 3D frame of the target object, the processor, when executing the computer instructions, further implements the following step:
    adjusting the 2D frame according to the 2D frame labeling operation.
  32. The device according to claim 25 or 26, characterized in that the corner number is input by a user or preconfigured.
  33. The device according to claim 27 or 28, characterized in that the heading angle is input by a user or preconfigured.
  34. A 3D frame labeling device, characterized by comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer instructions, implements the following steps:
    determining the 2D frame of a target object on a two-dimensional image containing the target object;
    obtaining a labeled corner point, wherein the corner point is located on one edge of the 2D frame;
    determining the 3D frame of the target object based on the 2D frame and the corner point.
  35. The device according to claim 34, characterized in that, before determining the 3D frame of the target object, the processor, when executing the computer instructions, further implements the following step:
    obtaining the corner number of the corner point, the corner number indicating the position of the corner point relative to the target object.
  36. The device according to claim 35, characterized in that determining the 3D frame of the target object comprises:
    determining the 3D frame based on the corner number, the 2D frame, and the corner point.
  37. The device according to claim 36, characterized in that determining the 3D frame based on the corner number, the 2D frame, and the corner point comprises:
    determining the correspondence between the 2D frame and the 3D frame based on the corner number and the corner point;
    determining the 3D frame according to the correspondence and the 2D frame.
  38. The device according to claim 37, characterized in that determining the correspondence between the 2D frame and the 3D frame comprises:
    obtaining a pre-stored correspondence rule between an object's 2D frame and the corner points of its 3D frame;
    determining the correspondence according to the correspondence rule, the corner number, and the corner point.
  39. The device according to claim 34, characterized in that, before determining the 3D frame of the target object, the processor, when executing the computer instructions, further implements the following step:
    obtaining the heading angle of the target object in 3D space, the heading angle indicating the orientation of the target object.
  40. The device according to claim 39, characterized in that determining the 3D frame of the target object comprises:
    determining the 3D frame based on the heading angle, the 2D frame, and the corner point.
  41. The device according to claim 40, characterized in that determining the 3D frame based on the heading angle, the 2D frame, and the corner point comprises:
    determining, according to the heading angle, the three vanishing points corresponding to the 3D frame;
    determining the 3D frame according to the three vanishing points, the 2D frame, and the corner point.
  42. The device according to claim 41, characterized in that determining, according to the heading angle, the three vanishing points corresponding to the 3D frame comprises:
    obtaining the projection matrix of the image acquisition apparatus corresponding to the two-dimensional image;
    determining the three vanishing points according to the projection matrix and the projection matrix of the heading angle.
  43. The device according to claim 41 or 42, characterized in that, before determining the 3D frame according to the three vanishing points, the 2D frame, and the corner point, the processor, when executing the computer instructions, further implements the following step:
    obtaining the corner number of the corner point, the corner number indicating the position of the corner point relative to the target object;
    and determining the 3D frame according to the three vanishing points, the 2D frame, and the corner point comprises:
    determining the 3D frame according to the corner number, the three vanishing points, the 2D frame, and the corner point.
  44. The device according to any one of claims 34 to 43, characterized in that the corner point is located on the bottom edge of the 2D frame.
  45. The device according to any one of claims 35 to 38, characterized in that the corner number is input by a user or preconfigured.
  46. The device according to any one of claims 39 to 43, characterized in that the heading angle is input by a user or preconfigured.
  47. A neural network training method, characterized by comprising:
    training a neural network using the 3D frame of a target object determined by the 3D frame labeling method according to any one of claims 1 to 10, and the two-dimensional image containing the target object.
  48. A neural network training method, characterized by comprising:
    training a neural network using the 3D frame of a target object determined by the 3D frame labeling method according to any one of claims 11 to 23, and the two-dimensional image containing the target object.
  49. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions that, when executed by a processor, implement the 3D frame labeling method according to any one of claims 1 to 10.
  50. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions that, when executed by a processor, implement the 3D frame labeling method according to any one of claims 11 to 23.
PCT/CN2020/103263 2020-07-21 2020-07-21 3d框标注方法、设备及计算机可读存储介质 WO2022016368A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080033702.3A CN113795847A (zh) 2020-07-21 2020-07-21 3d框标注方法、设备及计算机可读存储介质
PCT/CN2020/103263 WO2022016368A1 (zh) 2020-07-21 2020-07-21 3d框标注方法、设备及计算机可读存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/103263 WO2022016368A1 (zh) 2020-07-21 2020-07-21 3d框标注方法、设备及计算机可读存储介质

Publications (1)

Publication Number Publication Date
WO2022016368A1 true WO2022016368A1 (zh) 2022-01-27

Family

ID=79181475

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/103263 WO2022016368A1 (zh) 2020-07-21 2020-07-21 3d框标注方法、设备及计算机可读存储介质

Country Status (2)

Country Link
CN (1) CN113795847A (zh)
WO (1) WO2022016368A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190088027A1 (en) * 2017-09-18 2019-03-21 Shoppar Inc. Method for developing augmented reality experiences in low computer power systems and devices
CN109829447A (zh) * 2019-03-06 2019-05-31 百度在线网络技术(北京)有限公司 用于确定车辆三维框架的方法和装置
CN110390258A (zh) * 2019-06-05 2019-10-29 东南大学 图像目标三维信息标注方法
CN111079619A (zh) * 2019-12-10 2020-04-28 北京百度网讯科技有限公司 用于检测图像中的目标对象的方法和装置
CN111126161A (zh) * 2019-11-28 2020-05-08 北京联合大学 一种基于关键点回归的3d车辆检测方法
CN111310667A (zh) * 2020-02-18 2020-06-19 北京小马慧行科技有限公司 确定标注是否准确的方法、装置、存储介质与处理器

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005275500A (ja) * 2004-03-23 2005-10-06 Zenrin Co Ltd 消失点決定方法
KR101677232B1 (ko) * 2015-11-27 2016-11-30 중앙대학교 산학협력단 자동차 카메라 보정 장치 및 방법
WO2018093796A1 (en) * 2016-11-15 2018-05-24 Magic Leap, Inc. Deep learning system for cuboid detection
CN108257139B (zh) * 2018-02-26 2020-09-08 中国科学院大学 基于深度学习的rgb-d三维物体检测方法
CN110969064B (zh) * 2018-09-30 2023-10-27 北京四维图新科技股份有限公司 一种基于单目视觉的图像检测方法、装置及存储设备
CN110298262B (zh) * 2019-06-06 2024-01-02 华为技术有限公司 物体识别方法及装置
CN110745140B (zh) * 2019-10-28 2021-01-01 清华大学 一种基于连续图像约束位姿估计的车辆换道预警方法
CN111008557A (zh) * 2019-10-30 2020-04-14 长安大学 一种基于几何约束的车辆细粒度识别方法
CN110826499A (zh) * 2019-11-08 2020-02-21 上海眼控科技股份有限公司 物体空间参数检测方法、装置、电子设备及存储介质
CN111126269B (zh) * 2019-12-24 2022-09-30 京东科技控股股份有限公司 三维目标检测方法、装置以及存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190088027A1 (en) * 2017-09-18 2019-03-21 Shoppar Inc. Method for developing augmented reality experiences in low computer power systems and devices
CN109829447A (zh) * 2019-03-06 2019-05-31 百度在线网络技术(北京)有限公司 用于确定车辆三维框架的方法和装置
CN110390258A (zh) * 2019-06-05 2019-10-29 东南大学 图像目标三维信息标注方法
CN111126161A (zh) * 2019-11-28 2020-05-08 北京联合大学 一种基于关键点回归的3d车辆检测方法
CN111079619A (zh) * 2019-12-10 2020-04-28 北京百度网讯科技有限公司 用于检测图像中的目标对象的方法和装置
CN111310667A (zh) * 2020-02-18 2020-06-19 北京小马慧行科技有限公司 确定标注是否准确的方法、装置、存储介质与处理器

Also Published As

Publication number Publication date
CN113795847A (zh) 2021-12-14

Similar Documents

Publication Publication Date Title
JP7403700B2 (ja) ホモグラフィ適合を介した完全畳み込み着目点検出および記述
CN109345596B (zh) 多传感器标定方法、装置、计算机设备、介质和车辆
WO2022068225A1 (zh) 点云标注的方法、装置、电子设备、存储介质及程序产品
Wang et al. Monocular 3d object detection with depth from motion
US10726580B2 (en) Method and device for calibration
EP3621032A2 (en) Method and apparatus for determining motion vector field, device, storage medium and vehicle
WO2021114776A1 (en) Object detection method, object detection device, terminal device, and medium
US20220301277A1 (en) Target detection method, terminal device, and medium
US12106514B2 (en) Efficient localization based on multiple feature types
WO2021114773A1 (en) Target detection method, device, terminal device, and medium
KR20200136723A (ko) 가상 도시 모델을 이용하여 객체 인식을 위한 학습 데이터 생성 방법 및 장치
CN106570482A (zh) 人体动作识别方法及装置
WO2022199195A1 (zh) 地图更新方法、系统、车载终端、服务器及存储介质
CN111161398A (zh) 一种图像生成方法、装置、设备及存储介质
US20220301176A1 (en) Object detection method, object detection device, terminal device, and medium
CN115953464A (zh) 全局定位方法和装置
Kim et al. Piccolo: Point cloud-centric omnidirectional localization
CN114529800A (zh) 一种旋翼无人机避障方法、系统、装置及介质
CN113033426A (zh) 动态对象标注方法、装置、设备和存储介质
CN117830397A (zh) 重定位方法、装置、电子设备、介质和车辆
CN117870716A (zh) 地图兴趣点的显示方法、装置、电子设备及存储介质
WO2022016368A1 (zh) 3d框标注方法、设备及计算机可读存储介质
Kniaz Real-time optical flow estimation on a GPU for a skied-steered mobile robot
WO2023283929A1 (zh) 双目相机外参标定的方法及装置
US9165208B1 (en) Robust ground-plane homography estimation using adaptive feature selection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20946392

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20946392

Country of ref document: EP

Kind code of ref document: A1