CN111046743B

CN111046743B - Barrier information labeling method and device, electronic equipment and storage medium

Info

Publication number: CN111046743B
Application number: CN201911151569.7A
Authority: CN
Inventors: 程邦胜; 方晓波; 张辉
Original assignee: Newpoint Intelligent Technology Group Co ltd
Current assignee: Newpoint Intelligent Technology Group Co ltd
Priority date: 2019-11-21
Filing date: 2019-11-21
Publication date: 2023-05-05
Anticipated expiration: 2039-11-21
Also published as: CN111046743A

Abstract

The application provides a method, a device, electronic equipment and a storage medium for marking obstacle information, wherein the method comprises the following steps: acquiring a video image acquired by a camera, a projection transformation matrix of the camera and a pavement coordinate system corresponding to a scene shot by the camera; determining a target obstacle in the video image; under the condition that the target obstacle does not have initialized three-dimensional labeling information, sequentially acquiring 4 anchor points meeting geometric constraint conditions according to a three-dimensional coordinate system where the camera is positioned and a road surface coordinate system corresponding to a scene shot by the camera; initializing the attitude angle of the three-dimensional rectangular labeling frame as an initial attitude angle; obtaining initial three-dimensional coordinates corresponding to the 4 anchor points respectively; and determining the three-dimensional labeling information of the target obstacle. The method has the advantages of low cost, high information accuracy and strong obstacle distinguishing capability, and can meet the requirement of three-dimensional accurate positioning of the target obstacle in the road in various applications related to the road.

Description

Barrier information labeling method and device, electronic equipment and storage medium

Technical Field

The present disclosure relates to the field of image processing technologies, and in particular, to a method and apparatus for labeling obstacle information, an electronic device, and a storage medium.

Background

In the technologies of digital rail, automatic driving, car-road coordination, robot navigation, etc., road condition information of a road area monitored by a camera installed on a vehicle or a road side unit is perceived to obtain a video image on a road condition, and then related information of an obstacle on a monitored road is obtained by analyzing the video image.

In the related art, a method for marking a target obstacle in a video image by using information of a two-dimensional area is provided, but the method can only determine the two-dimensional information of the obstacle under a two-dimensional image coordinate system, so that the three-dimensional information of the target obstacle cannot be obtained, and the requirement of accurately positioning the obstacle in a three-dimensional space, which is required in road condition analysis, in automatic driving, vehicle-road coordination and auxiliary driving is difficult to meet.

In the related art, a method for determining three-dimensional information of an obstacle by relying on a laser radar sensor is also provided, but the method has the defects of high cost, low resolution of the laser radar and difficulty in accurately positioning the obstacle in a road.

Disclosure of Invention

The application provides a method and a device for marking obstacle information, electronic equipment and a storage medium, so as to solve the technical problems.

The first aspect of the application provides a method for labeling obstacle information, which comprises the following steps:

acquiring a video image acquired by a camera, a projection transformation matrix of the camera and a pavement coordinate system corresponding to a scene shot by the camera;

determining a target obstacle in the video image;

under the condition that the target obstacle does not have initialized three-dimensional labeling information, sequentially acquiring 4 anchor points meeting geometric constraint conditions according to a three-dimensional coordinate system where the camera is positioned and a road surface coordinate system corresponding to a scene shot by the camera; the geometric constraint conditions are as follows: the three line segments corresponding to the 3 line segments formed by the 4 anchor points according to the acquisition sequence in the three-dimensional space are mutually perpendicular, so that the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by the three-dimensional rectangular marking frame formed by the 4 anchor points;

initializing the attitude angle of the three-dimensional rectangular labeling frame as an initial attitude angle; obtaining initial three-dimensional coordinates corresponding to the 4 anchor points according to initial two-dimensional coordinates of the 4 anchor points in the video image, a projection transformation matrix of the camera, a road surface coordinate system corresponding to a scene shot by the camera and an initial attitude angle of the three-dimensional rectangular annotation frame; wherein, the attitude angle of three-dimensional rectangle mark frame is: an included angle between the forward direction of the three-dimensional rectangular marking frame and the forward direction of the road surface coordinate system;

And determining three-dimensional labeling information of the target obstacle according to the initial three-dimensional coordinates corresponding to the 4 anchor points and the attitude angles of the three-dimensional rectangular labeling frame, and labeling the three-dimensional labeling information of the target obstacle in the video image.

A second aspect of the present application provides an obstacle information labeling device, the device comprising:

the first acquisition module is used for acquiring video images acquired by the camera, a projection transformation matrix of the camera and a pavement coordinate system corresponding to a scene shot by the camera;

a first determining module for determining a target obstacle in the video image;

the anchor point acquisition module is used for sequentially acquiring 4 anchor points meeting geometric constraint conditions according to a three-dimensional coordinate system where the camera is positioned and a road surface coordinate system corresponding to a scene shot by the camera under the condition that the target obstacle does not have initialized three-dimensional labeling information; the geometric constraint conditions are as follows: the three line segments corresponding to the 3 line segments formed by the 4 anchor points according to the acquisition sequence in the three-dimensional space are mutually perpendicular, so that the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by the three-dimensional rectangular marking frame formed by the 4 anchor points;

The second acquisition module is used for initializing the attitude angle of the three-dimensional rectangular annotation frame to be an initial attitude angle; obtaining initial three-dimensional coordinates corresponding to the 4 anchor points according to initial two-dimensional coordinates of the 4 anchor points in the video image, the projection transformation matrix of the camera, road surface coordinate system information of a scene shot by the camera and an initial attitude angle of the three-dimensional rectangular annotation frame; wherein, the attitude angle of three-dimensional rectangle mark frame is: an included angle between the forward direction of the three-dimensional rectangular marking frame and the forward direction of the road surface coordinate system;

the first labeling module is used for determining three-dimensional labeling information of the target obstacle according to initial three-dimensional coordinates corresponding to the 4 anchor points, the three-dimensional rectangular labeling frame and the attitude angle of the three-dimensional rectangular labeling frame, and labeling the three-dimensional labeling information of the target obstacle in the video image.

A third aspect of the present application provides an electronic device, comprising:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute to implement a method of obstacle information labeling.

A fourth aspect of the present application provides a non-transitory computer-readable storage medium, which when executed by a processor of a terminal, enables the terminal to perform a method of implementing an obstacle information labeling.

Compared with the prior art, the application has the following advantages:

according to the method, a target obstacle of a video image is determined according to the video image and the projective transformation matrix acquired by a camera, a corresponding anchor point of the target obstacle is selected from the video image, a three-dimensional rectangular annotation frame of the target obstacle is obtained, and the three-dimensional annotation information of the target obstacle is obtained by obtaining the position, the size and the gesture information of the three-dimensional rectangular annotation frame. The method and the device rely on the video image to obtain the three-dimensional labeling information of the target obstacle, have low cost, high information accuracy and strong obstacle distinguishing capability, and can meet the requirement of three-dimensional accurate positioning of the target obstacle in the road in the applications of automatic driving, vehicle-road coordination, auxiliary driving and the like.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

In order to more clearly illustrate the technical solutions of the various embodiments of the present application, the drawings that are required to be used in the description of the various embodiments of the present application will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort to a person skilled in the art.

FIG. 1 is a flowchart illustrating steps of a method for labeling obstacle information according to an embodiment of the present invention;

FIG. 2 is a flowchart of steps of a method for labeling obstacle information, which includes determining an anchor point and then adjusting an attitude angle according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a road surface coordinate system according to an embodiment of the present invention;

FIG. 4 is a three-dimensional rectangular annotation frame under a two-dimensional display coordinate system according to one embodiment of the invention;

FIG. 5 is a three-dimensional rectangular annotation frame in a pavement coordinate system according to one embodiment of the present invention;

FIG. 6 is a flow chart of specific 4 anchor points p1, p2, p3 and p4 shown in FIGS. 4 and 5 in the present invention;

FIG. 7 is a three-dimensional rectangular annotation frame in a two-dimensional display coordinate system according to another embodiment of the invention;

FIG. 8 is a three-dimensional rectangular annotation box in a pavement coordinate system provided by another embodiment of the present invention;

FIG. 9 is a flow chart of determining three-dimensional annotation information with initialized three-dimensional annotation information in accordance with the present invention;

fig. 10 is a schematic structural diagram of an obstacle information labeling device according to another embodiment of the present invention;

FIG. 11 is a schematic structural diagram of an electronic device according to another embodiment of the present invention;

fig. 12 is a schematic structural diagram of an obstacle information labeling system according to another embodiment of the present invention.

Detailed Description

In order that the above-recited objects, features and advantages of the present application will become more readily apparent, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments that are illustrated in the appended drawings.

In technologies such as automatic driving, vehicle-road coordination, auxiliary driving and the like, an algorithm model needs to be trained to detect and position obstacles in a video. Training these models requires labeling the video data acquired by the camera to mark the position and size of the obstacle in the video image in the image. The video image is typically displayed in a display window, and then a two-dimensional area in the video image where the target obstacle is located is marked in the display window. The annotated two-dimensional annotation information is then stored to a storage device.

In the related art, a target obstacle in a video image can be marked using two-dimensional information. The method for marking the target obstacle by utilizing the two-dimensional information is specifically as follows: the camera collects video images of the road area, and a two-dimensional area occupied by the target obstacle is marked in the video images. However, the two-dimensional labeling information labeled by this method cannot reflect the information of the target obstacle in the video image in the three-dimensional space. If the video image marked with the two-dimensional marking information is used for training the detection model or the positioning model or the related video analysis model, the trained model can only position the two-dimensional information of the target obstacle under the two-dimensional image coordinate system and cannot obtain the three-dimensional marking information of the target obstacle in the three-dimensional space, so that the model is difficult to position the target obstacle in the camera in three dimensions and is difficult to meet the requirement of accurate positioning of the target obstacle in the three-dimensional space, which is required in road condition analysis in automatic driving, vehicle-road coordination and auxiliary driving.

Therefore, the related technology of marking the target obstacle in the video image by adopting the two-dimensional information only can mark the two-dimensional area information of the target obstacle in the video image under the two-dimensional image coordinate system, and cannot mark the three-dimensional information of the target obstacle in the video image in the three-dimensional space, so that the obtained marking data does not contain the three-dimensional information of the target obstacle in the video image of the monitored road area, cannot be used for training a three-dimensional detection or positioning model of the target obstacle or a related video analysis model, cannot obtain the three-dimensional space information of the target obstacle through analysis of the video image, and cannot accurately position the obstacle in the video image in the three-dimensional space, so that the requirement of three-dimensional accurate positioning of the target obstacle in the road in the application of automatic driving, vehicle-road coordination, auxiliary driving and the like is difficult to be met.

In the related art, another way of obtaining three-dimensional information of an obstacle depending on three-dimensional point cloud information perceived by a lidar sensor is also provided. That is, the laser radar-based point cloud information includes three-dimensional information of sampling points of the monitored road region, and the target obstacle in the three-dimensional point cloud can be marked with a target region of the three-dimensional space region. Because the three-dimensional point cloud contains three-dimensional coordinate information of the road region relative to the lidar coordinate system, three-dimensional information of the target obstacle can be labeled, including position information of the target obstacle, occupied three-dimensional space region information, and three-dimensional posture information.

However, in the above-mentioned method of obtaining three-dimensional information of an obstacle depending on three-dimensional point cloud information perceived by a laser radar sensor, although three-dimensional information can be marked on a target obstacle on a road by monitoring point cloud data of the road perceived by the laser radar, the laser radar is expensive, the resolution of the laser radar is very low compared with that of a video image, it is difficult to accurately locate the target obstacle in the road, and the location of a three-dimensional space region occupied by the target obstacle in the three-dimensional space is also lower than that of the video image. In addition, the lidar cannot sense color information on the road area, and the accuracy of classifying the target obstacle is lower than that of classifying the target obstacle based on the video image. For example, there are people, boxes and vehicles on the road surface, wherein the difference between the people and the boxes is not great, if the target obstacle is positioned by the laser radar, the people and the boxes are easily confused because the color information cannot be perceived, and the accuracy of classifying the target obstacle is further reduced.

The three-dimensional labeling information of the obstacle comprises position information of the obstacle, three-dimensional space information occupied by the obstacle and posture information of the obstacle. The three-dimensional labeling information is critical to the positioning and analysis of the obstacle in the three-dimensional space, and only the position of the obstacle in the three-dimensional space is accurately positioned, and the accurate analysis is performed on the state of the obstacle, so that an automatic driving vehicle can make a correct driving decision according to the information, avoid the obstacle near the vehicle, and ensure the safe and smooth driving of the vehicle.

To analyze three-dimensional spatial information of an obstacle in a monocular video image of a monitored road area, it is necessary to train a video analysis model with monocular video data labeled with the three-dimensional information of the obstacle. Therefore, there is a need for related techniques and systems that can label three-dimensional information of an obstacle in a monocular video, and accurately reflect position information of the obstacle in a three-dimensional space, occupied three-dimensional space region information, and pose information in the three-dimensional space in a monocular video image.

In order to solve the technical problems, the application provides an obstacle information labeling method, which comprises the steps of determining a target obstacle of a video image according to the video image acquired by a camera and a projective transformation matrix, selecting a corresponding anchor point of the target obstacle in the video image, obtaining a three-dimensional rectangular labeling frame of the target obstacle, and obtaining the three-dimensional labeling information of the target obstacle by obtaining the position, the size and the gesture information of the three-dimensional rectangular labeling frame.

Compared with the mode of acquiring three-dimensional information by a laser radar in the related art, the method and the device for acquiring the three-dimensional annotation information of the target obstacle rely on the video image, the three-dimensional annotation information of the target obstacle can be acquired only by relying on the video image, and the video image with higher resolution can be easily acquired, so that the cost of the annotation method provided by the method and the device for acquiring the three-dimensional annotation information of the target obstacle is low; the three-dimensional annotation information of the target obstacle is obtained based on the video image, and the resolution of the video image can be very high, so that the three-dimensional information of the target obstacle can be obtained more accurately based on the high-resolution video image; and the video image can identify and record the color of the target obstacle, so that the obstacle distinguishing capability can be effectively improved. Compared with the mode of marking the obstacle through the two-dimensional information in the related art, the method and the device can acquire accurate three-dimensional marking information, so that the requirement of three-dimensional accurate positioning of the target obstacle in the road in the application of automatic driving, vehicle-road coordination, auxiliary driving and the like can be met.

The method for labeling obstacle information provided by the application is shown in fig. 1 and fig. 2, and specifically comprises the following steps:

and S11, acquiring a video image acquired by a camera, a projection transformation matrix of the camera and a pavement coordinate system corresponding to a scene shot by the camera.

The video collected by the camera is composed of video images of one frame, and the obstacle of each video image of each frame in the video is marked, so that the mark of the obstacle appearing in the whole video is completed.

In the application, for a given camera, according to the position and the posture of the camera relative to a three-dimensional global coordinate system and the internal parameter value of the camera, the three-dimensional coordinate of a point in the three-dimensional space under the three-dimensional global coordinate system can be obtained, and the coordinate transformation relation between the two-dimensional coordinates of the corresponding two-dimensional point in the two-dimensional plane of the imaging coordinate system on the imaging plane of the camera can be expressed as a projective transformation matrix. This projective transformation matrix may be represented as a 3 x 4 homogeneous matrix P, typically in the form of K r|t, K being a 3 x 3 internal matrix, R being a 3 x 3 rotation matrix, t being a 3 x 1 three-dimensional vector. The projection transformation matrix transforms the homogeneous coordinates of the 3-dimensional points into homogeneous coordinates of the 2-dimensional points, thus completing the projection transformation process from three dimensions to two dimensions.

And determining a three-dimensional coordinate system (namely a three-dimensional global coordinate system, which refers to a world coordinate system) in which the camera is positioned, and a pavement coordinate system corresponding to the camera.

The road surface coordinate system corresponding to the camera is the road surface coordinate system of the road where the camera is located. The road surface coordinate system is obtained as follows:

as shown in fig. 3, the monitored road surface area is approximately a plane. The road surface plane is described by a three-dimensional coordinate system of a normal vector v_n and a forward vector v_f of the road surface, and one anchor point a on the road surface, with respect to the three-dimensional global coordinate system. The anchor point a may be any point of the road surface.

From the normal vector v_n of the road surface and the forward direction vector v_f of the road surface, the left direction vector of the road surface can be calculated

Wherein->

Is the cross product of the vectors. Wherein v_n, v_f, and v_l constitute three coordinate axes of the road surface coordinate system.

Step S12, determining a target obstacle in the video image.

All the obstacles appearing in one frame of video image can be target obstacles. In step S12, all the obstacles needing to determine the three-dimensional labeling information may be determined as target obstacles, and then the three-dimensional labeling information is determined for all the target obstacles one by one; or only one target obstacle is determined, after the three-dimensional labeling information of the target obstacle is determined, the next target obstacle needing to be labeled with the three-dimensional labeling information is determined, and the three-dimensional labeling information is determined for the next target obstacle.

Step S13, under the condition that the target obstacle does not have initialized three-dimensional labeling information, sequentially acquiring 4 anchor points meeting geometric constraint conditions according to a three-dimensional coordinate system where the camera is positioned and a road surface coordinate system corresponding to a scene shot by the camera; the geometric constraint conditions are as follows: the three line segments corresponding to the 3 line segments formed by the 4 anchor points according to the acquisition sequence in the three-dimensional space are mutually perpendicular, so that the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by the three-dimensional rectangular marking frame formed by the 4 anchor points.

After determining the target obstacle, it is first necessary to determine whether the target obstacle has initialized three-dimensional labeling information, which is pre-existing. The initialized three-dimensional labeling information may be obtained by a method in the related art.

In the process of automatic driving, vehicle-road coordination and auxiliary driving, since the target obstacle is on the road surface on which the vehicle is traveling, the vehicle may be possibly caused to travel obstacle, and therefore, information such as the position of the target obstacle in the road, the spatial region and the like needs to be determined. Although the position and the space area occupied by the obstacle on the road are different according to the obstacle, the influence of the obstacle on the running vehicle can be determined as long as the longest length of the obstacle in the horizontal direction, the longest length in the vertical direction and the occupied position in the three-dimensional global coordinate system are obtained. Therefore, the method and the device abstract the target obstacle into the three-dimensional rectangular labeling frame, and obtain the three-dimensional information of the target obstacle by obtaining the three-dimensional information of the three-dimensional rectangular labeling frame.

In the application, the three-dimensional rectangular labeling frame is obtained in the following way:

as shown in fig. 4, the three-dimensional rectangular label frame is shown in a two-dimensional display coordinate system (wherein the two-dimensional display coordinate system refers to two-dimensional coordinates on a screen of a display window where a video image is displayed, O is an origin of the two-dimensional display coordinate system, U is a horizontal axis of the two-dimensional display coordinate system, and V is a vertical axis of the two-dimensional display coordinate system). 4 anchor points P1, P2, P3, P4 are sequentially selected near the presented target obstacle in the video image or on the edge of the target obstacle (i.e. the 4 anchor points are two-dimensional points in a two-dimensional coordinate system selected from the video image), and each of the 4 anchor points in the two-dimensional display coordinate system corresponds to one anchor point in three-dimensional space, which are respectively denoted as P1, P2, P3 and P4. The anchor points P1, P2, P3 and P4 in the three-dimensional space are required to meet the geometric constraint condition, wherein the geometric constraint condition refers to that line segments P1-P2, P2-P3 and P3-P4 formed by 4 anchor points obtained according to the acquisition sequence are mutually perpendicular; therefore, 4 anchor points can be constrained to form a cuboid, and the cuboid is the three-dimensional rectangular marking frame. As shown in fig. 5, a three-dimensional rectangular label frame in the road surface coordinate system.

As shown in fig. 6, fig. 6 is a flowchart of determining three-dimensional annotation information in combination with specific 4 anchor points p1, p2, p3, and p 4. The method comprises the following specific steps: inputting video images, a camera projection transformation matrix and road surface coordinate system information, and displaying the input video images in a display window; determining a target obstacle in a video image, sequentially selecting anchor points P1, P2, P3 and P4 in a display window, and simultaneously calculating three-dimensional coordinates of corresponding three-dimensional points P1, P2, P3 and P4; according to the requirement, the positions of anchor points p1, p2, p3 and p4 in a two-dimensional coordinate system are adjusted, and the three-dimensional coordinates of the corresponding three-dimensional points are updated; adjusting the attitude angle according to the requirement to obtain three-dimensional labeling information of the target obstacle; determining three-dimensional labeling information of the next target obstacle according to the mode; and then transmitting or storing the obtained three-dimensional labeling information.

In addition, as shown in fig. 7, the positions of the 4 anchor points may be three-dimensional rectangular labeling frames in the road surface coordinate system determined by the anchor point positions shown in fig. 7 in fig. 8, and the three-dimensional labeling information may be determined according to the flowchart for determining the three-dimensional labeling information shown in fig. 6.

Because the target obstacle is positioned in the three-dimensional rectangular marking frame, the vehicle can avoid the target obstacle only by avoiding the area of the three-dimensional rectangular marking frame in the specific automatic driving process.

In order to simplify the three-dimensional rectangular marking frame, all barriers can be set to be on the road surface; the method can also set the superposition of the bottom surface of the three-dimensional rectangular marking frame and the road surface. Setting the forward direction of the three-dimensional rectangular labeling frame to be consistent with the forward direction of the road, namely, assuming that the attitude angle of the three-dimensional rectangular labeling frame is 0, and subsequently adjusting the three-dimensional rectangular labeling frame to enable the attitude angle of the three-dimensional rectangular labeling frame to be the same as the attitude angle of the target obstacle.

Step S14, initializing the attitude angle of the three-dimensional rectangular labeling frame as an initial attitude angle; obtaining initial three-dimensional coordinates corresponding to the 4 anchor points according to initial two-dimensional coordinates of the 4 anchor points in the video image, a projection transformation matrix of the camera, a road surface coordinate system corresponding to a scene shot by the camera and an initial attitude angle of the three-dimensional rectangular annotation frame; wherein, the attitude angle of three-dimensional rectangle mark frame is: and an included angle between the forward direction of the three-dimensional rectangular labeling frame and the forward direction of the road surface coordinate system.

The initial three-dimensional coordinates of the 4 anchor points in the three-dimensional global coordinate system are obtained according to the initial two-dimensional coordinates, the projective transformation matrix, the road surface coordinate system and the initial attitude angle.

And S15, determining three-dimensional labeling information of the target obstacle according to the initial three-dimensional coordinates, the three-dimensional rectangular labeling frame and the attitude angles of the three-dimensional rectangular labeling frame corresponding to the 4 anchor points, and labeling the three-dimensional labeling information of the target obstacle in the video image.

The three-dimensional labeling information comprises position information of the target obstacle, three-dimensional space information occupied by the target obstacle and posture information of the target obstacle.

The position information of the target obstacle is represented by three-dimensional coordinates of an anchor point of the three-dimensional rectangular labeling frame, wherein the anchor point can be a vertex set on the bottom surface of the three-dimensional rectangular labeling frame, a central point of a rectangle on the bottom surface of the three-dimensional rectangular labeling frame, and the like.

The three-dimensional space information occupied by the target obstacle refers to the space volume occupied by the three-dimensional rectangular labeling frame.

The posture information of the target obstacle refers to an angle between a forward direction of the target obstacle and a forward direction vector of the road surface.

For example, the target obstacle is a person whose front face is directed forward of the road surface, and the angle between the front face of the person and the forward vector of the road surface is 0 °. The front of the human body faces away from the front of the road pavement, and the included angle between the front of the human body and the forward vector of the road pavement is 180 degrees.

After the three-dimensional labeling information is obtained, the three-dimensional labeling information can be labeled in the video image; the information may also be stored in an information storage device and/or transmitted to the target device via an information transmission device. The target device is a device requiring three-dimensional labeling information, such as an automatic driving automobile.

The step S15 specifically includes:

step S1501, determining the positions and sizes of the three-dimensional rectangular labeling frames where the 4 anchor points are located according to the initial three-dimensional coordinates corresponding to the 4 anchor points.

In step S14, knowing the initial three-dimensional coordinates of the 4 anchor points, the position and the size of the three-dimensional rectangular labeling frame can be determined, where the position of the three-dimensional rectangular labeling frame is represented by the three-dimensional coordinates of one anchor point of the three-dimensional rectangular labeling frame, and the anchor point may be a vertex set by the bottom surface of the three-dimensional rectangular labeling frame, or may be a center point of the rectangle on the bottom surface of the three-dimensional rectangular labeling frame. The size of the three-dimensional rectangular marking frame is the volume of the three-dimensional space occupied by the length, width and height of the three-dimensional rectangular marking frame. Wherein, the length refers to the length of the three-dimensional rectangular labeling frame in the v_f direction, the width refers to the width of the three-dimensional rectangular labeling frame in the v_l direction, and the height refers to the height of the three-dimensional rectangular labeling frame in the v_n direction. The three-dimensional posture information of the obstacle refers to an angle θ between the forward direction of the obstacle and the road forward direction v_f.

The forward direction of the three-dimensional rectangular annotation frame is determined by the forward direction of the corresponding target obstacle.

Step S1502, determining three-dimensional labeling information of the target obstacle according to the initial three-dimensional coordinates corresponding to the 4 anchor points, the position and the size of the three-dimensional rectangular labeling frame, and the initial attitude angle of the three-dimensional rectangular labeling frame when the initial attitude angle of the three-dimensional rectangular labeling frame is the same as the attitude angle of the target obstacle; wherein, the attitude angle of the target obstacle is: an angle between a forward direction of the target obstacle and a forward direction of the road surface coordinate system.

When 4 anchor points are selected to construct a three-dimensional rectangular labeling frame, the posture angle of the three-dimensional rectangular labeling frame is possibly different from the posture angle of the target obstacle, and under the condition that the posture angles of the three-dimensional rectangular labeling frame and the target obstacle are different, the three-dimensional labeling information obtained according to the three-dimensional rectangular labeling frame is different from the actual three-dimensional labeling information of the target obstacle.

To avoid such a difference, it is necessary to determine whether the attitude angle of the three-dimensional rectangular callout box is the same as that of the target obstacle. The attitude angle of the three-dimensional rectangular labeling frame is embodied after the attitude angle of the target obstacle is estimated, and the attitude angle of the three-dimensional rectangular labeling frame is confirmed by whether the three-dimensional rectangular space occupied by the three-dimensional rectangular labeling frame is consistent with the three-dimensional rectangular space occupied by the obstacle. If the three-dimensional rectangular space occupied by the three-dimensional rectangular labeling frame is consistent with the three-dimensional rectangular space occupied by the obstacle, the three-dimensional rectangular labeling frame is considered to reflect the attitude angle of the obstacle.

When the initial attitude angle of the three-dimensional rectangular labeling frame is the same as the attitude angle of the target obstacle, namely, when the initial attitude angle of the three-dimensional rectangular labeling frame can enable the three-dimensional rectangular space area occupied by the three-dimensional rectangular labeling frame to be consistent with the three-dimensional rectangular space area occupied by the target obstacle, the attitude angle of the target obstacle can be accurately reflected, which means that the three-dimensional rectangular labeling frame can accurately represent the three-dimensional information of the target obstacle, the three-dimensional labeling information of the target obstacle can be determined according to the initial three-dimensional coordinates corresponding to the 4 anchor points, the position and the size of the three-dimensional rectangular labeling frame and the attitude angle of the three-dimensional rectangular labeling frame.

Step S1503, when the initial attitude angle of the three-dimensional rectangular labeling frame is different from the attitude angle of the target obstacle, adjusting the initial attitude angle of the three-dimensional rectangular labeling frame so that the adjusted attitude angle of the three-dimensional rectangular labeling frame is the same as the attitude angle of the target obstacle;

that is, when the initial attitude angle of the initial three-dimensional rectangular labeling frame cannot make the three-dimensional rectangular space occupied by the three-dimensional rectangular labeling frame coincide with the three-dimensional rectangular space occupied by the target obstacle, and thus the attitude angle of the target obstacle cannot be reflected correctly, the initial attitude angle of the three-dimensional rectangular labeling frame is adjusted, so that the adjusted attitude angle of the three-dimensional rectangular labeling frame makes the three-dimensional rectangular space occupied by the three-dimensional rectangular labeling frame coincide with the three-dimensional rectangular space occupied by the obstacle.

According to the adjusted attitude angles of the three-dimensional rectangular annotation frame, determining adjusted two-dimensional coordinates of the 4 anchor points;

according to the adjusted two-dimensional coordinates of the 4 anchor points, the projection transformation matrix of the camera, the road surface coordinate system corresponding to the scene shot by the camera and the adjusted attitude angle of the three-dimensional rectangular marking frame, determining the corresponding adjusted three-dimensional coordinates after the 4 anchor points adjust the attitude angles;

and determining three-dimensional labeling information of the target obstacle according to the adjusted three-dimensional coordinates, the positions and the sizes of the three-dimensional rectangular labeling frames and the posture angles of the three-dimensional rectangular labeling frames after adjustment, which correspond to the 4 anchor points, and labeling the three-dimensional labeling information of the target obstacle in the video image.

When the attitude angle of the three-dimensional rectangular labeling frame is different from the attitude angle of the target obstacle, that is, when the attitude angle of the three-dimensional rectangular labeling frame cannot make the three-dimensional rectangular space occupied by the three-dimensional rectangular labeling frame coincide with the three-dimensional rectangular space occupied by the target obstacle, the attitude angle of the target obstacle cannot be reflected correctly, it means that the three-dimensional rectangular labeling frame cannot express the three-dimensional information of the target obstacle correctly, and the attitude angle of the three-dimensional rectangular labeling frame needs to be adjusted.

In the marking process, the three-dimensional rectangular marking frame can be rotated on the road surface by continuously adjusting the initial value of the attitude angle until reaching an angle, so that the angle of the three-dimensional rectangular marking frame is consistent with the angle of the target obstacle, and the position and the size of the three-dimensional rectangular marking frame can be adjusted at the moment, so that the three-dimensional rectangular space area represented by the three-dimensional rectangular marking frame is consistent with the three-dimensional rectangular space area occupied by the target obstacle.

Finally, the angle which is the most consistent between the three-dimensional rectangular space area represented by the three-dimensional rectangular labeling frame and the three-dimensional space area occupied by the target obstacle can accurately reflect the attitude angle of the target obstacle.

For example, by changing the position of p1 in the display window, the position of the three-dimensional rectangular frame in the three-dimensional space can be changed. Further, the height of the three-dimensional rectangular frame can be changed by changing the position of p2 with respect to p1 in the display window, the length of the three-dimensional rectangular frame can be changed by changing the position of p3 with respect to p2 in the display window, and the width of the three-dimensional rectangular frame can be changed by changing the position of p4 with respect to p3 in the display window.

And the two-dimensional coordinates and the three-dimensional coordinates of the 4 anchor points are in one-to-one correspondence, so that the adjusted three-dimensional rectangular marking frame can determine the information such as the position, the size and the like of the three-dimensional rectangular marking frame depending on the relation between the two-dimensional coordinates and the three-dimensional coordinates.

In addition, on the basis that the size of the volume of the three-dimensional rectangular labeling frame is determined, the whole three-dimensional rectangular labeling frame can be driven to rotate or move by moving the two-dimensional coordinates of 1 anchor point of the three-dimensional rectangular labeling frame, so that the attitude angle of the three-dimensional rectangular labeling frame can be adjusted, and the attitude angle of the three-dimensional rectangular labeling frame is consistent with the attitude angle of a target obstacle.

If the direction of the obstacle is not consistent with the direction of v_f, the three-dimensional rectangular labeling frame can be rotated anticlockwise around the v_n axis by adjusting the angle theta of the attitude angle of the three-dimensional rectangular labeling frame, so that the direction of the three-dimensional rectangular labeling frame is consistent with the direction of the obstacle.

When the attitude angle of the three-dimensional rectangular labeling frame is the same as that of the target obstacle, which means that the three-dimensional rectangular labeling frame can accurately represent the three-dimensional information of the target obstacle, the three-dimensional labeling information of the target obstacle can be determined according to the adjusted three-dimensional coordinates corresponding to the 4 anchor points, the positions and the sizes of the adjusted three-dimensional rectangular labeling frame and the attitude angle of the adjusted three-dimensional rectangular labeling frame.

In another embodiment provided herein, after determining the target obstacle, it is first determined whether the target obstacle has initialized three-dimensional labeling information, which is pre-existing. The initialized three-dimensional labeling information can be obtained through a mode in the related technology, and can also be provided in the following mode, and the method is specifically as follows:

And S21, acquiring a video image acquired by a camera, a projection transformation matrix of the camera and a pavement coordinate system corresponding to a scene shot by the camera.

The principle of step S21 is similar to that of step S11, please refer to the relevant content of step S11, and the description is omitted here.

Step S22, determining a target obstacle in the video image.

The principle of step S22 is similar to that of step S12, please refer to the relevant content of step S12, and the description is omitted here.

Step S23, estimating the attitude angle of the target obstacle according to the direction of the target obstacle to obtain the estimated attitude angle of the target obstacle; wherein, the attitude angle of the target obstacle is: the angle between the forward direction of the target obstacle and the forward direction of the road surface coordinate system.

The attitude angle of the target obstacle in the video image is estimated, and an estimated attitude angle which is not very accurate is obtained. For example, if the target obstacle is a person, the face of the person is oriented forward, and if the face of the person is oriented forward on the road, the estimated attitude angle is 0. For another example, if the target obstacle is a car parked at a roadside and the direction of the head is parallel to the left-hand vector of the road surface, the angle between the forward direction of the car and the forward direction of the road is 90 °, and the estimated attitude angle is 90 °. For another example, if the target obstacle is another car parked at a roadside and the direction of the head of the car is inclined, the angle between the forward direction of the car and the forward direction of the road is 45 °, and the estimated attitude angle is 45 °.

Step S24, sequentially obtaining 4 initial anchor points meeting geometric constraint conditions according to the estimated attitude angle of the target obstacle, the three-dimensional coordinate system of the camera and the road surface coordinate system corresponding to the scene shot by the camera; the geometric constraint conditions are as follows: the three line segments corresponding to the 3 line segments formed by the 4 initial anchor points according to the acquisition sequence in the three-dimensional space are mutually perpendicular, so that the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by the initial three-dimensional rectangular marking frame formed by the 4 initial anchor points.

Taking the estimated attitude angle of the target obstacle as a reference, sequentially acquiring 4 initial anchor points meeting geometric constraint conditions according to a three-dimensional coordinate system where the camera is positioned and a pavement coordinate system corresponding to the camera, and constructing an initial three-dimensional rectangular frame by using the 4 initial anchor points, wherein the mode of constructing the initial three-dimensional rectangular frame is the same as that of acquiring the three-dimensional rectangular frame in the step S13, and the description is omitted.

Step S25, obtaining initial three-dimensional coordinates corresponding to the 4 initial anchor points according to the initial two-dimensional coordinates of the 4 initial anchor points in the video image, the projective transformation matrix of the camera, the road surface coordinate system corresponding to the scene shot by the camera, and the estimated attitude angle of the target obstacle.

The principle of step S25 is similar to that of step S14, please refer to the relevant content of step S14, and the description thereof is omitted here.

And S26, determining initialized three-dimensional labeling information of the target obstacle according to the initial three-dimensional coordinates corresponding to the 4 initial anchor points, the position and the size of the initial three-dimensional rectangular labeling frame and the estimated attitude angle.

The principle of step S26 is similar to that of step S15, please refer to the relevant content of step S15, and the description is omitted here.

And step S27, labeling the initialized three-dimensional labeling information of the target obstacle in the video image.

If the initialized three-dimensional labeling information exists in the target obstacle, it is necessary to determine whether the initialized three-dimensional labeling information can accurately reflect the information of the target obstacle. If the three-dimensional rectangular marking frame determined by the three-dimensional marking information can completely cover the target obstacle inside the three-dimensional rectangular marking frame, the initialized three-dimensional marking information can accurately reflect the information of the target obstacle, otherwise, the initialized three-dimensional marking information is not used.

Specifically, as shown in fig. 9, fig. 9 is a flowchart for determining three-dimensional labeling information in the case of having initialized three-dimensional labeling information, and in the case of having initialized three-dimensional labeling information of the target obstacle, the method specifically includes:

Step S31, an initialized three-dimensional rectangular annotation frame is obtained according to the initialized three-dimensional annotation information;

and step S32, labeling the initialized three-dimensional labeling information in the video image when the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by the initialized three-dimensional rectangular labeling frame.

Step S33, adjusting the position of at least 1 anchor point in the initialized three-dimensional rectangular labeling frame and/or the attitude angle of the three-dimensional rectangular labeling frame under the condition that the three-dimensional rectangular space area occupied by the target obstacle is inconsistent with the three-dimensional rectangular space area occupied by the initialized three-dimensional rectangular labeling frame, so that the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by the new three-dimensional rectangular labeling frame formed by the new positions of the 4 anchor points after adjustment.

And the positions of the 4 initial anchors are adjusted to obtain new positions of the 4 anchors, and a new three-dimensional rectangular annotation frame is determined according to the 4 new anchors.

And adjusting the positions of at least 1 anchor point in the initialized three-dimensional rectangular labeling frame and/or the attitude angle of the three-dimensional rectangular labeling frame, wherein the length, width and height and the attitude angle of the three-dimensional rectangular labeling frame are changed by adjusting the positions of two-dimensional coordinates of the anchor points in a two-dimensional display coordinate system in a display window for displaying video images.

The conditions and principles of determining 4 new anchor points in step S33 are the same as those of step S13, please refer to the relevant content of step S13, and the description thereof is omitted herein.

Step S34, initializing the attitude angle of the three-dimensional rectangular labeling frame as an initial attitude angle; and obtaining initial three-dimensional coordinates corresponding to the new positions of the 4 anchor points after adjustment according to the initial two-dimensional coordinates of the new positions of the 4 anchor points in the video image, the projection transformation matrix of the camera, the road surface coordinate system corresponding to the scene shot by the camera and the initial attitude angle of the three-dimensional rectangular labeling frame. Wherein, the attitude angle of the new three-dimensional rectangle marking frame is: the included angle between the forward direction of the new three-dimensional rectangular marking frame and the forward direction of the road surface coordinate system

The principle of step S34 is similar to that of step S14, please refer to the relevant content of step S14, and the description thereof is omitted here.

And step S35, determining the position and the size of the new three-dimensional rectangular labeling frame according to the initial three-dimensional coordinates corresponding to the new positions of the adjusted 4 anchor points.

The principle of step S35 is similar to that of step S1501, please refer to the relevant content of step S1501, and the description is omitted here.

Step S36, when the attitude angle of the new three-dimensional rectangular labeling frame is the same as that of the target obstacle, determining new three-dimensional labeling information of the target obstacle according to initial three-dimensional coordinates corresponding to the new positions of the adjusted 4 anchor points, the positions and the sizes of the new three-dimensional rectangular labeling frame and the attitude angle of the new three-dimensional rectangular labeling frame, and labeling the new three-dimensional labeling information in the video image; wherein, the attitude angle of the target obstacle is: an angle between a forward direction of the target obstacle and a forward direction of the road surface coordinate system.

The principle of step S36 is similar to that of step S1502, please refer to the relevant content of step S1502, which is not described herein.

Step S37, when the attitude angle of the new three-dimensional rectangular labeling frame is different from the attitude angle of the target obstacle, adjusting the initial attitude angle of the new three-dimensional rectangular labeling frame so that the adjusted attitude angle of the new three-dimensional rectangular labeling frame is the same as the attitude angle of the target obstacle;

according to the adjusted attitude angle of the new three-dimensional rectangular annotation frame and the projective transformation matrix, determining adjusted three-dimensional coordinates corresponding to the adjusted 4 new anchor points respectively;

And determining three-dimensional labeling information of the target obstacle according to the adjusted three-dimensional coordinates, the positions and the sizes of the new three-dimensional rectangular labeling frames and the posture angles of the new three-dimensional rectangular labeling frames, which correspond to the adjusted 4 anchor points, and labeling the new three-dimensional labeling information of the target obstacle in the video image.

The principle of step S37 is similar to that of step S1503, please refer to the relevant content of step S1503, and the description thereof is omitted here.

Specifically, the adjustment process of the attitude angle θ of the three-dimensional rectangular labeling frame and the adjustment process of the positions of the anchor points p1, p2, p3, and p4 in the display window can be continuously performed until the labeled three-dimensional rectangular labeling frame can accurately reflect the position of the labeled obstacle in the three-dimensional space and the space area occupied in the three-dimensional space; and then, the position information, the length, width and height information and the attitude angle information of the three-dimensional rectangular labeling frame in the three-dimensional space are stored as the three-dimensional labeling information of the labeled obstacle.

Repeating the above process for all the barriers in the video image, so as to finish the marking of the three-dimensional information of all the barriers. Once the labeling of the three-dimensional information of all the obstacles in the video image is completed, the labeling of the three-dimensional information of the video image is finished, the corresponding three-dimensional labeling information is stored in the information storage device, and/or the corresponding three-dimensional labeling information is sent to the target device through the information transmission device.

Through the implementation mode of the example, the three-dimensional information of the obstacle on the road surface can be accurately and effectively marked on the monocular video image, so that a monocular video data set with the three-dimensional information marking can be obtained and used for training a model algorithm capable of detecting the obstacle in the monocular video image and positioning the obstacle in the three-dimensional space, the obstacle on the road surface can be detected and accurately positioned through the monocular camera under the application scenes of vehicle-road cooperation, automatic driving, robot obstacle avoidance and the like, the cost of technologies of vehicle-road cooperation, automatic driving, robot obstacle avoidance and the like is greatly reduced, and the positioning accuracy of the obstacle is improved. The safety of the automatic driving vehicle is guaranteed.

The method and the device can finish adjustment and modification of the existing three-dimensional annotation information of the video image, and ensure the accuracy of the three-dimensional annotation information of the video image; the method can be used for pre-marking the three-dimensional information of the video image by an automatic tool, and then the video image is adjusted and modified, so that the efficiency of marking the three-dimensional information of the video image is improved, and the cost of marking the three-dimensional information of the video image is reduced.

The application marks an obstacle in a video image by using a three-dimensional rectangular frame, and reflects the position of the obstacle in a three-dimensional space, the occupied three-dimensional space area and three-dimensional posture information. Wherein, the bottom surface of three-dimensional rectangle frame and road surface coincidence.

Firstly, inputting a video image to be marked, and corresponding camera projective transformation matrix and road surface coordinate system information, displaying the video image on a display window, and if initial three-dimensional marking information exists, displaying the initial three-dimensional marking information on the display window.

After a target obstacle to be marked in a video image is determined, selecting the position of a first anchor point in a display window, restricting the anchor point to be on a road surface, calculating the three-dimensional coordinate of a first three-dimensional point corresponding to the first anchor point in a three-dimensional space according to a camera projective transformation matrix and road surface coordinate system information, restricting the first three-dimensional point to be a vertex on a three-dimensional rectangular marking frame, and positioning the position of the three-dimensional rectangular marking frame in the three-dimensional space by the first three-dimensional point.

Selecting the position of a second anchor point in the display window, calculating the three-dimensional coordinate of a second three-dimensional point corresponding to the second anchor point in the three-dimensional space according to the projection transformation matrix of the camera and the road surface coordinate system information, restraining the second three-dimensional point on one vertex of the three-dimensional rectangular labeling frame, and forming a first edge of the three-dimensional rectangular labeling frame by connecting the second three-dimensional point with the first three-dimensional point. The size of the three-dimensional rectangular frame in the direction of the first edge is determined by the first edge.

Selecting the position of a third anchor point in the display window, calculating the three-dimensional coordinate of a third three-dimensional point corresponding to the third anchor point in the three-dimensional space according to the projection transformation matrix of the camera and the road surface coordinate system information, restraining the third three-dimensional point on one vertex of the three-dimensional rectangular labeling frame, and forming a second edge of the three-dimensional rectangular labeling frame by connecting the third three-dimensional point with the second three-dimensional point. The second rib and the first rib are constrained to be perpendicular to each other in the three-dimensional space. The second edges are used for determining the dimension of the three-dimensional rectangular labeling frame in the direction of the second edges.

And selecting the position of a fourth anchor point in the display window, calculating the three-dimensional coordinate of a fourth three-dimensional point corresponding to the fourth anchor point in the three-dimensional space according to the projection transformation matrix of the camera and the road surface coordinate system information, restraining the fourth three-dimensional point on one vertex of the three-dimensional rectangular labeling frame, and forming a third edge of the three-dimensional rectangular labeling frame by connecting the third three-dimensional point with the third vertex. And constraining the third edge to be mutually perpendicular to the first edge and the second edge in the three-dimensional space, and determining the dimension of the three-dimensional rectangular labeling frame in the direction of the third edge by using the third edge.

And the three-dimensional rectangular marking frame rotates around one axis in the normal vector direction of the pavement plane by adjusting the value of the attitude angle of the three-dimensional rectangular marking frame, so that the attitude of the three-dimensional rectangular marking frame is consistent with the attitude of the corresponding obstacle in the video image.

And adjusting the position of the three-dimensional rectangular annotation frame in the three-dimensional space by adjusting the position of the first anchor point. And adjusting the size of the three-dimensional rectangular marking frame in the direction of the first edge by adjusting the position of the second anchor point. And adjusting the size of the three-dimensional rectangular marking frame in the direction of the second edge by adjusting the position of the third anchor point. And adjusting the size of the three-dimensional rectangular marking frame in the direction of the third edge by adjusting the position of the fourth anchor point.

According to the projection transformation information of the camera and the road surface coordinate system information, each edge of the three-dimensional rectangular marking frame is displayed on the display window, and the size of the three-dimensional rectangular marking frame can be adjusted by moving the position of the edge in the display window.

And marking three-dimensional information by using the position, the length, the width, the height, the size and the attitude angle of the corresponding three-dimensional rectangular marking frame in the three-dimensional space of each obstacle in the video image. After the three-dimensional information of the video image is marked, the three-dimensional marked information is stored in a storage device and/or is sent to a target device through a transmission device.

According to the same technical concept, the application provides an obstacle information labeling device, as shown in fig. 11, specifically as follows:

the first obtaining module 121 is configured to obtain a video image collected by a camera, a projective transformation matrix of the camera, and a road surface coordinate system corresponding to a scene shot by the camera;

A first determining module 122 for determining a target obstacle in the video image;

an anchor point obtaining module 123, configured to sequentially obtain 4 anchor points that satisfy geometric constraint conditions according to a three-dimensional coordinate system in which the camera is located and a road surface coordinate system corresponding to a scene shot by the camera, when the target obstacle does not have initialized three-dimensional labeling information; the geometric constraint conditions are as follows: the three line segments corresponding to the 3 line segments formed by the 4 anchor points according to the acquisition sequence in the three-dimensional space are mutually perpendicular, so that the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by the three-dimensional rectangular marking frame formed by the 4 anchor points;

the second obtaining module 124 is configured to initialize a posture angle of the three-dimensional rectangular labeling frame to an initial posture angle; obtaining initial three-dimensional coordinates corresponding to the 4 anchor points according to initial two-dimensional coordinates of the 4 anchor points in the video image, the projection transformation matrix of the camera, road surface coordinate system information of a scene shot by the camera and an initial attitude angle of the three-dimensional rectangular annotation frame; wherein, the attitude angle of three-dimensional rectangle mark frame is: an included angle between the forward direction of the three-dimensional rectangular marking frame and the forward direction of the road surface coordinate system;

The first labeling module 125 is configured to determine three-dimensional labeling information of the target obstacle according to the initial three-dimensional coordinates corresponding to the 4 anchor points, and the attitude angles of the three-dimensional rectangular labeling frame, and label the three-dimensional labeling information of the target obstacle in the video image. The first labeling module comprises: the first determining submodule is used for determining the position and the size of the three-dimensional rectangular marking frame where the 4 anchor points are located according to the initial three-dimensional coordinates corresponding to the 4 anchor points;

the second determining submodule is used for determining three-dimensional marking information of the target obstacle according to the initial three-dimensional coordinates corresponding to the 4 anchor points, the position and the size of the three-dimensional rectangular marking frame and the initial attitude angle of the three-dimensional rectangular marking frame under the condition that the initial attitude angle of the three-dimensional rectangular marking frame is the same as the attitude angle of the target obstacle; wherein, the attitude angle of the target obstacle is: an angle between a forward direction of the target obstacle and a forward direction of the road surface coordinate system.

The attitude angle adjusting module is used for adjusting the initial attitude angle of the three-dimensional rectangular annotation frame under the condition that the initial attitude angle of the three-dimensional rectangular annotation frame is different from the attitude angle of the target obstacle, so that the adjusted attitude angle of the three-dimensional rectangular annotation frame is the same as the attitude angle of the target obstacle;

The second determining module is used for determining the adjusted two-dimensional coordinates of the 4 anchor points according to the adjusted attitude angles of the three-dimensional rectangular annotation frame;

the third determining module is used for determining the corresponding adjusted three-dimensional coordinates after the 4 anchor points adjust the attitude angles according to the adjusted two-dimensional coordinates of the 4 anchor points, the projection transformation matrix of the camera, the road surface coordinate system corresponding to the scene shot by the camera and the adjusted attitude angle of the three-dimensional rectangular marking frame;

the second labeling module is used for determining three-dimensional labeling information of the target obstacle according to the adjusted three-dimensional coordinates, the positions and the sizes of the three-dimensional rectangular labeling frames and the posture angles of the three-dimensional rectangular labeling frames after adjustment, which are respectively corresponding to the 4 anchor points, and labeling the three-dimensional labeling information of the target obstacle in the video image.

The third acquisition module is used for acquiring an initialized three-dimensional rectangular annotation frame according to the initialized three-dimensional annotation information under the condition that the object obstacle has the initialized three-dimensional annotation information;

and the third labeling module is used for labeling the initialized three-dimensional labeling information in the video image under the condition that the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by the initialized three-dimensional rectangular labeling frame.

The new anchor point determining module is used for adjusting the position of at least 1 anchor point in the initialized three-dimensional rectangular labeling frame and/or the attitude angle of the three-dimensional rectangular labeling frame under the condition that the three-dimensional rectangular space area occupied by the target obstacle is inconsistent with the three-dimensional rectangular space area occupied by the initialized three-dimensional rectangular labeling frame, so that the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by the new three-dimensional rectangular labeling frame formed by the new positions of the 4 anchor points after adjustment;

a fourth obtaining module, configured to obtain initial three-dimensional coordinates corresponding to each of the adjusted new positions of the 4 anchor points according to initial two-dimensional coordinates of the adjusted new positions of the 4 anchor points in the video image, the projective transformation matrix of the camera, and road surface coordinate system information corresponding to the camera;

and the fourth labeling module is used for determining new three-dimensional labeling information of the target obstacle according to the initial three-dimensional coordinates corresponding to the new positions of the 4 anchor points, the new three-dimensional rectangular labeling frame and the attitude angle of the new three-dimensional rectangular labeling frame, and labeling the new three-dimensional labeling information of the target obstacle in the video image.

Wherein, three-dimensional rectangle mark frame acquisition module includes:

the first acquisition submodule is used for acquiring video images acquired by the camera, a projection transformation matrix of the camera and a pavement coordinate system corresponding to a scene shot by the camera;

a third determination submodule for determining a target obstacle in the video image;

the estimating sub-module is used for estimating the attitude angle of the target obstacle to obtain the estimated attitude angle of the target obstacle; wherein, the attitude angle of the target obstacle is: an included angle between the forward direction of the target obstacle and the forward direction of a road surface coordinate system;

the second acquisition submodule is used for sequentially acquiring 4 initial anchor points meeting geometric constraint conditions according to the estimated attitude angle of the target obstacle, the three-dimensional coordinate system of the camera and the road surface coordinate system corresponding to the scene shot by the camera; the geometric constraint conditions are as follows: three line segments corresponding to 3 line segments formed by the 4 initial anchor points according to the acquisition sequence in a three-dimensional space are mutually perpendicular, so that the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by the initial three-dimensional rectangular marking frame formed by the 4 initial anchor points;

The initial three-dimensional coordinate sub-module is used for obtaining initial three-dimensional coordinates corresponding to the 4 initial anchor points respectively according to the initial two-dimensional coordinates of the 4 initial anchor points in the video image, the projection transformation matrix of the camera, a road surface coordinate system corresponding to a scene shot by the camera and the estimated attitude angle of the target obstacle;

a fourth determining sub-module, configured to determine initialized three-dimensional labeling information of the target obstacle according to the initial three-dimensional coordinates corresponding to the 4 initial anchor points, the position and the size of the initial three-dimensional rectangular labeling frame, and the estimated attitude angle;

and the labeling sub-module is used for labeling the initialized three-dimensional labeling information of the target obstacle in the video image.

The present application also provides an electronic device, as shown in fig. 11, including:

a processor 111;

a memory 112 for storing the processor-executable instructions;

wherein the processor 111 is configured to perform operations performed to implement a method for tagging obstacle information.

The present application also provides a non-transitory computer-readable storage medium, which when executed by a processor of a terminal, enables the terminal to perform operations performed to implement a method of obstacle information tagging.

In addition, a video image three-dimensional information labeling system on which the method for labeling obstacle information can depend is provided, as shown in fig. 12, and includes: display device, interaction device, computing device, information storage device, information transmission device.

The display device is used for displaying video images and corresponding three-dimensional annotation information in a display window, the interaction device is used for selecting anchor points in the display window, the coordinate of a three-dimensional point corresponding to each anchor point in a three-dimensional space is calculated through the computing device, and the three-dimensional space position information, the three-dimensional rectangular frame gesture information and the length and width information of the three-dimensional rectangular frame corresponding to the three-dimensional rectangular annotation frame are calculated according to the three-dimensional coordinate information corresponding to the 4 selected anchor points, so that the three-dimensional annotation information of the target obstacle is obtained. Once the three-dimensional information of the obstacle in the video image is marked, the three-dimensional marked information in the video image is stored in the information storage device and/or sent out through the information transmission device.

For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.

In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described by differences from other embodiments, and identical and similar parts between the embodiments are all enough to be referred to each other.

The above description is provided in detail for a method, an apparatus, an electronic device and a storage medium for labeling obstacle information, and specific examples are applied to illustrate the principles and embodiments of the present application, where the description of the above examples is only used to help understand the method and core idea of the present application; meanwhile, as those skilled in the art will have modifications in the specific embodiments and application scope in accordance with the ideas of the present application, the present description should not be construed as limiting the present application in view of the above.

Claims

1. A method for marking obstacle information, the method comprising:

determining a target obstacle in the video image;

and determining three-dimensional labeling information of the target obstacle according to the initial three-dimensional coordinates corresponding to the 4 anchor points, the three-dimensional rectangular labeling frame and the attitude angle of the three-dimensional rectangular labeling frame, and labeling the three-dimensional labeling information of the target obstacle in the video image.

2. The method according to claim 1, wherein determining the three-dimensional labeling information of the target obstacle according to the initial three-dimensional coordinates corresponding to the 4 anchor points, the three-dimensional rectangular labeling frame and the attitude angle of the three-dimensional rectangular labeling frame specifically comprises:

determining the position and the size of a three-dimensional rectangular marking frame where the 4 anchor points are located according to the initial three-dimensional coordinates corresponding to the 4 anchor points;

Under the condition that the initial attitude angle of the three-dimensional rectangular labeling frame is the same as the attitude angle of the target obstacle, determining three-dimensional labeling information of the target obstacle according to the initial three-dimensional coordinates corresponding to the 4 anchor points, the position and the size of the three-dimensional rectangular labeling frame and the initial attitude angle of the three-dimensional rectangular labeling frame; wherein, the attitude angle of the target obstacle is: an angle between a forward direction of the target obstacle and a forward direction of the road surface coordinate system.

3. The method according to claim 2, wherein the method further comprises:

under the condition that the initial attitude angle of the three-dimensional rectangular labeling frame is different from the attitude angle of the target obstacle, adjusting the initial attitude angle of the three-dimensional rectangular labeling frame so that the adjusted attitude angle of the three-dimensional rectangular labeling frame is the same as the attitude angle of the target obstacle;

4. The method according to claim 1, wherein the method further comprises:

under the condition that the target obstacle has initialized three-dimensional labeling information, acquiring an initialized three-dimensional rectangular labeling frame according to the initialized three-dimensional labeling information;

and labeling the initialized three-dimensional labeling information in the video image under the condition that the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by the initialized three-dimensional rectangular labeling frame.

5. The method according to claim 4, wherein the method further comprises:

under the condition that the three-dimensional rectangular space area occupied by the target obstacle is inconsistent with the three-dimensional rectangular space area occupied by the initialized three-dimensional rectangular labeling frame, adjusting the position of at least 1 anchor point in the initialized three-dimensional rectangular labeling frame and/or the attitude angle of the three-dimensional rectangular labeling frame so that the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by a new three-dimensional rectangular labeling frame formed by the new positions of the 4 anchor points after adjustment;

Obtaining initial three-dimensional coordinates corresponding to the new positions of the 4 anchor points after adjustment according to initial two-dimensional coordinates of the new positions of the 4 anchor points in the video image, a projective transformation matrix of the camera and road surface coordinate system information corresponding to the camera;

and determining new three-dimensional labeling information of the target obstacle according to initial three-dimensional coordinates corresponding to the new positions of the 4 anchor points, the new three-dimensional rectangular labeling frame and the attitude angle of the new three-dimensional rectangular labeling frame, and labeling the new three-dimensional labeling information of the target obstacle in the video image.

6. The method of claim 4, wherein the initialized three-dimensional annotation information is obtained as follows:

acquiring video images acquired by a camera, a projection transformation matrix of the camera and road surface coordinate system information corresponding to a scene shot by the camera;

determining a target obstacle in the video image;

estimating the attitude angle of the target obstacle according to the orientation of the target obstacle to obtain an estimated attitude angle of the target obstacle; wherein, the attitude angle of the target obstacle is: an included angle between the forward direction of the target obstacle and the forward direction of a road surface coordinate system;

Sequentially obtaining 4 initial anchor points meeting geometric constraint conditions according to the estimated attitude angle of the target obstacle, a three-dimensional coordinate system in which the camera is positioned and a road surface coordinate system corresponding to a scene shot by the camera; the geometric constraint conditions are as follows: three line segments corresponding to 3 line segments formed by the 4 initial anchor points according to the acquisition sequence in a three-dimensional space are mutually perpendicular, so that the three-dimensional rectangular space area occupied by the target obstacle is consistent with the three-dimensional rectangular space area occupied by the initial three-dimensional rectangular marking frame formed by the 4 initial anchor points;

obtaining initial three-dimensional coordinates corresponding to the 4 initial anchor points according to initial two-dimensional coordinates of the 4 initial anchor points in the video image, a projection transformation matrix of the camera, a road surface coordinate system corresponding to a scene shot by the camera and an estimated attitude angle of the target obstacle;

according to the initial three-dimensional coordinates corresponding to the 4 initial anchor points, the position and the size of the initial three-dimensional rectangular labeling frame and the estimated attitude angle, initialized three-dimensional labeling information of the target obstacle is determined;

the method further comprises the steps of:

And labeling the initialized three-dimensional labeling information of the target obstacle in the video image.

7. An obstacle information labeling device, the device comprising:

The first labeling module is used for determining three-dimensional labeling information of the target obstacle according to initial three-dimensional coordinates corresponding to the 4 anchor points, the three-dimensional rectangular labeling frame and attitude angles of the three-dimensional rectangular labeling frame, and labeling the three-dimensional labeling information of the target obstacle in the video image.

8. The apparatus of claim 7, wherein the first labeling module comprises:

the first determining submodule is used for determining the position and the size of the three-dimensional rectangular marking frame where the 4 anchor points are located according to the initial three-dimensional coordinates corresponding to the 4 anchor points;

9. An electronic device, comprising:

A processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute to implement an obstacle information labeling method as claimed in any one of claims 1 to 6.

10. A non-transitory computer readable storage medium, which when executed by a processor of a terminal, causes the terminal to perform a method of implementing an obstacle information labeling method according to any one of claims 1 to 6.