WO2023273108A1 - Monocular distance measurement method and apparatus, and intelligent apparatus

Info

Publication number: WO2023273108A1
Application number: PCT/CN2021/131460
Authority: WO
Inventors: 李奕润; 程骏; 庞建新
Original assignee: 深圳市优必选科技股份有限公司
Priority date: 2021-06-30
Filing date: 2021-11-18
Publication date: 2023-01-05
Also published as: CN113465573A

Abstract

A monocular distance measurement method and apparatus, and an intelligent apparatus. The method comprises: capturing a head-up view by means of a monocular camera (201); determining a homography matrix according to the vertex pixel coordinates of a first area image (203); determining a reference point on a bottom edge of a rectangular area, and acquiring the distance between the reference point and the monocular camera (201); determining the reference point pixel coordinates in the first area image (203) that correspond to the reference point, and acquiring the inverse perspective reference point coordinates corresponding to the reference point pixel coordinates; performing target object (302) detection on another head-up view captured by the monocular camera (201) to obtain a target object bounding box (301), and, according to the homography matrix, performing inverse perspective transformation on the coordinates of any pixel point on a bottom edge (3013) of the target object bounding box (301) to acquire the target inverse perspective coordinates corresponding to that pixel point; and determining the distance between the target object (302) and the monocular camera (201) according to the acquired relevant parameters. In this way, the distance to a target object (302) can be measured by means of a monocular camera (201), which relaxes the conditions required for distance measurement and improves distance measurement efficiency.

Description

Monocular ranging method, device and intelligent device

Cross References to Related Applications

This application claims priority to Chinese patent application No. 202110738325X, titled "Monocular distance measuring method, device and intelligent device" and filed with the China Patent Office on June 30, 2021, the entire contents of which are incorporated herein by reference.

Technical Field

The present application relates to the technical field of image processing, in particular to a monocular ranging method, device and intelligent device.

Background

When a monocular camera takes a photo, the photo is essentially a projection of the scene onto the imaging plane of the camera: it represents the three-dimensional world in two-dimensional form. The depth information of the scene is therefore lost during shooting, and with a monocular camera the distance of an object in the scene cannot be calculated directly from a single image.

With the continuous development of artificial intelligence in recent years, automatic driving technology based on artificial intelligence has received extensive attention, and some deep-learning-based ranging algorithms for monocular cameras have been proposed. Most of these algorithms, however, use a neural network to compute the distance of an object from the image size and pose, in the camera, of a known object at a known distance. For irregular objects, such an algorithm needs the object distance to be collected under every pose. The existing technology therefore has the problem that the ranging conditions are difficult to meet.

Summary of the Application

In order to solve the above technical problems, the embodiments of the present application provide a monocular ranging method, device and intelligent device.

In the first aspect, the embodiment of the present application provides a monocular ranging method, the method comprising:

Capturing a plane view with a monocular camera, wherein the ground area visible to the monocular camera includes a rectangular area, and the plane view includes a first area image corresponding to the rectangular area;

Determining, according to the vertex pixel coordinates of the first area image, a homography matrix for inverse perspective transformation of the plane view into a top view;

Determining a reference point on the bottom edge of the rectangular area, and obtaining the distance between the reference point and the monocular camera;

Determining the reference point pixel coordinates corresponding to the reference point in the first area image, and obtaining the inverse perspective reference point coordinates corresponding to the reference point pixel coordinates;

Performing target object detection on other plane views captured by the monocular camera to obtain a target object bounding box, and performing inverse perspective transformation, according to the homography matrix, on the coordinates of any pixel point on the bottom edge of the target object bounding box to obtain the target inverse perspective coordinates corresponding to the coordinates of that pixel point;

Determining the distance between the target object and the monocular camera according to the inverse perspective reference point coordinates, the target inverse perspective coordinates, the correspondence between physical scale and pixel scale, and the distance between the reference point and the monocular camera.

Optionally, the determining, according to the vertex pixel coordinates of the first area image, a homography matrix for inverse perspective transformation of the plane view into a top view includes:

Acquiring the vertex pixel coordinates of the first area image and the corresponding inverse perspective vertex pixel coordinates;

Determining the homography matrix according to the vertex pixel coordinates of the first area image and the inverse perspective vertex pixel coordinates.

Optionally, the obtaining the inverse perspective reference point coordinates corresponding to the reference point pixel coordinates includes:

Setting the inverse perspective coordinates of the reference point pixel coordinates according to the size of the top view.

Optionally, the acquiring the vertex pixel coordinates of the first area image and the corresponding inverse perspective vertex pixel coordinates includes:

Correcting the plane view to obtain a corrected plane view, the corrected plane view including a second area image, and using the vertex pixel coordinates of the second area image as the vertex pixel coordinates of the first area image;

Determining the inverse perspective vertex pixel coordinates according to the inverse perspective coordinates of the reference point pixel coordinates, the correspondence between physical scale and pixel scale, and the side lengths of the rectangular area.

Optionally, the method also includes:

Capturing a checkerboard image with the monocular camera, and obtaining the intrinsic parameters and distortion parameters of the monocular camera according to the checkerboard image;

The correcting the plane view to obtain the corrected plane view includes:

Correcting the plane view according to the intrinsic parameters and the distortion parameters to obtain the corrected plane view.

Optionally, the inverse perspective coordinates of the reference point pixel coordinates include a first-direction reference point pixel coordinate and a second-direction reference point pixel coordinate, the second direction being perpendicular to the first direction;

The determining the inverse perspective vertex pixel coordinates according to the inverse perspective coordinates of the reference point pixel coordinates, the correspondence between physical scale and pixel scale, and the side lengths of the rectangular area includes:

Determining the first-direction coordinates and second-direction coordinates of the inverse perspective vertex pixel coordinates according to the first-direction reference point pixel coordinate, the second-direction reference point pixel coordinate, the side lengths of the rectangular area, the correspondence between physical scale and pixel scale, and the positional relationship between the inverse perspective vertex pixel coordinates and the inverse perspective reference point coordinates.

Optionally, the determining the distance between the target object and the monocular camera according to the inverse perspective reference point coordinates, the target inverse perspective coordinates, the correspondence between physical scale and pixel scale, and the distance between the reference point and the monocular camera includes:

Subtracting the first-direction reference point pixel coordinate from the first-direction coordinate of the target inverse perspective coordinates to obtain a first pixel difference; multiplying the first pixel difference by the correspondence between physical scale and pixel scale to obtain a first product; and using the first product as the first-direction distance between the target object and the monocular camera; and/or,

Subtracting the second-direction reference point pixel coordinate from the second-direction coordinate of the target inverse perspective coordinates to obtain a second pixel difference; multiplying the second pixel difference by the correspondence between physical scale and pixel scale to obtain a second product; and using the sum of the second product and the second-direction distance between the reference point and the monocular camera as the second-direction distance between the target object and the monocular camera.

In the second aspect, the embodiment of the present application provides a monocular ranging device, and the monocular ranging device includes:

A photographing module, configured to photograph a plane view through a monocular camera, the ground visible area of the monocular camera includes a rectangular area, and the plane view includes a first area image corresponding to the rectangular area;

A first determination module, configured to determine a homography matrix for inverse perspective transformation of the plane view into a top view according to the top corner pixel coordinates of the first area image;

An acquisition module, configured to determine a reference point from the bottom of the rectangular area, and acquire a distance between the reference point and the monocular camera;

The first processing module is configured to determine the pixel coordinates of the reference point corresponding to the reference point in the first region image, and acquire the reverse perspective reference point coordinates corresponding to the pixel coordinates of the reference point;

The second processing module is configured to perform target object detection on other plane views captured by the monocular camera to obtain a target object bounding box, and to perform inverse perspective transformation, according to the homography matrix, on the coordinates of any pixel point on the bottom edge of the target object bounding box to obtain the target inverse perspective coordinates corresponding to the coordinates of that pixel point;

The second determination module is configured to determine the distance between the target object and the monocular camera according to the inverse perspective reference point coordinates, the target inverse perspective coordinates, the correspondence between physical scale and pixel scale, and the distance between the reference point and the monocular camera.

Optionally, the first determining module is further configured to acquire the pixel coordinates of the top corner of the first area image and the corresponding pixel coordinates of the top corner of the reverse perspective;

Optionally, the first determining module is further configured to set reverse perspective coordinates of pixel coordinates of the reference point according to the size of the top view.

Optionally, the first determination module is further configured to correct the plane view to obtain a corrected plane view, the corrected plane view including a second area image, and to use the vertex pixel coordinates of the second area image as the vertex pixel coordinates of the first area image;

Optionally, the monocular distance measuring device also includes:

A rectification module that takes a checkerboard image through the monocular camera, and acquires internal references and distortion parameters of the monocular camera according to the checkerboard image;

The first determining module is further configured to correct the plane view according to the internal reference and distortion parameters to obtain the corrected plane view.

The first determination module is further configured to determine the first-direction coordinates and second-direction coordinates of the inverse perspective vertex pixel coordinates according to the first-direction reference point pixel coordinate, the second-direction reference point pixel coordinate, the side lengths of the rectangular area, the correspondence between physical scale and pixel scale, and the positional relationship between the inverse perspective vertex pixel coordinates and the inverse perspective reference point coordinates.

Optionally, the second determination module is further configured to subtract the first-direction reference point pixel coordinate from the first-direction coordinate of the target inverse perspective coordinates to obtain a first pixel difference; multiply the first pixel difference by the correspondence between physical scale and pixel scale to obtain a first product; and use the first product as the first-direction distance between the target object and the monocular camera; and/or,

In a third aspect, an embodiment of the present application provides an intelligent device, including a monocular camera, a memory, and a processor, wherein the memory stores a computer program, and the computer program, when run by the processor, executes the monocular ranging method provided in the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, which stores a computer program, and the computer program executes the monocular ranging method provided in the first aspect when running on a processor.

In the monocular ranging method, device and intelligent device provided by the present application, a plane view is captured by a monocular camera; a homography matrix for inverse perspective transformation of the plane view into a top view is determined according to the vertex pixel coordinates of the first area image; a reference point is determined on the bottom edge of the rectangular area, and the distance between the reference point and the monocular camera is obtained; the reference point pixel coordinates corresponding to the reference point in the first area image are determined, and the inverse perspective reference point coordinates corresponding to the reference point pixel coordinates are obtained; target object detection is performed on other plane views captured by the monocular camera to obtain a target object bounding box, and inverse perspective transformation is performed, according to the homography matrix, on the coordinates of any pixel point on the bottom edge of the target object bounding box to obtain the corresponding target inverse perspective coordinates; and the distance between the target object and the monocular camera is determined according to the inverse perspective reference point coordinates, the target inverse perspective coordinates, the correspondence between physical scale and pixel scale, and the distance between the reference point and the monocular camera. In this way, the distance to the target object can be measured with a monocular camera without requiring large amounts of data, irregular target objects can also be ranged, the conditions required for ranging are relaxed, and ranging efficiency is improved.

Description of Drawings

In order to illustrate the technical solution of the present application more clearly, the accompanying drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present application and therefore should not be regarded as limiting its scope of protection. In the drawings, similar components are given similar reference numerals.

Fig. 1 shows a schematic flow chart of the monocular ranging method provided by the embodiment of the present application;

Fig. 2 shows a schematic diagram of a plane view provided by the embodiment of the present application;

Fig. 3 shows a schematic diagram of another plane view provided by the embodiment of the present application;

Fig. 4 shows a schematic flow diagram of a rectified plane view provided by the embodiment of the present application;

Fig. 5 shows a schematic diagram of a reverse perspective view provided by the embodiment of the present application;

Fig. 6 shows a schematic diagram of a physical coordinate system provided by the embodiment of the present application;

FIG. 7 shows a schematic structural diagram of a monocular ranging device provided by an embodiment of the present application.

Detailed Description

The following will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them.

The components of the embodiments of the application generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments of the application provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of the application. Based on the embodiments of the present application, all other embodiments obtained by those skilled in the art without making creative efforts belong to the scope of protection of the present application.

Hereinafter, the terms "comprising" and "having" and their cognates, as may be used in various embodiments of the present application, are intended only to denote specific features, numbers, steps, operations, elements, components, or combinations of the foregoing, and should not be understood as excluding the existence of, or the possibility of adding, one or more other features, numbers, steps, operations, elements, components, or combinations of the foregoing.

In addition, the terms "first", "second", "third", etc. are only used for distinguishing descriptions, and should not be construed as indicating or implying relative importance.

Unless otherwise defined, all terms (including technical terms and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which various embodiments of the application belong. Such terms (for example, those defined in commonly used dictionaries) are to be interpreted as having the same meaning as their contextual meaning in the relevant technical field, and are not to be interpreted in an idealized or overly formal sense unless clearly so defined in the various embodiments of the present application.

Example 1

An embodiment of the present disclosure provides a monocular ranging method.

Specifically, as shown in Figure 1, the monocular ranging method includes:

Step S101, capturing a plane view with a monocular camera.

In this embodiment, the ground viewable area of the monocular camera includes a rectangular area, and the plane view includes a first area image corresponding to the rectangular area.

It should be added that tiles with a rectangular pattern may be laid on the ground, and several such tiles can form a rectangular area; the corresponding first area image can be determined by performing image detection on the plane view. In other implementations, when the ground has no rectangular pattern such as tiles, a rectangular area may be marked on the ground within the ground area visible to the monocular camera.

Referring to Fig. 2, the monocular camera 201 is set on the ground 202, which is laid with rectangular-pattern tiles. Several such tiles can form a rectangular area; for example, four rectangular-pattern tiles form a rectangular area 203. By performing image detection on the plane view, the corresponding first area image 203 can be determined.

Step S102, determining a homography matrix for inverse perspective transformation of the plane view into a top view according to the pixel coordinates of the top corner of the first area image.

Referring to FIG. 2, a pixel coordinate system can be established with the upper left corner of FIG. 2 as the origin, the left vertical side as the y-axis, and the top side as the x-axis. The vertex pixel coordinates of the first area image are then the pixel coordinates of pixel points A, B, C, and D.

In this embodiment, the homography matrix can be calculated from the vertex pixel coordinates of the first area image and the corresponding inverse perspective coordinates of those vertices, which are determined in advance.

Step S103, determining a reference point on the bottom edge of the rectangular area, and acquiring the distance between the reference point and the monocular camera.

In this embodiment, the reference point may be any point on the bottom edge of the rectangular area. For example, the midpoint of the bottom edge of the rectangular area may be used as the reference point.

In this embodiment, the distance between the reference point and the monocular camera can be measured by the user, or can be obtained through image analysis from the number of rectangular-pattern tiles lying between the reference point and the monocular camera; no limitation is imposed here.

Step S104, determining the pixel coordinates of the reference point corresponding to the reference point in the first region image, and obtaining the coordinates of the reverse perspective reference point corresponding to the pixel coordinates of the reference point.

In this embodiment, the rectangular area lies within the ground area visible to the monocular camera, and the first area image in the plane view includes all sides of the rectangular area. When the midpoint of the bottom edge of the rectangular area is used as the reference point, as shown in FIG. 2, the reference point pixel coordinates corresponding to the reference point in the first area image 203 are the coordinates of pixel point E.

Step S105, performing target object detection on other plane views captured by the monocular camera to obtain a target object bounding box, and performing inverse perspective transformation, according to the homography matrix, on the coordinates of any pixel point on the bottom edge of the target object bounding box to obtain the target inverse perspective coordinates corresponding to the coordinates of that pixel point.

In this embodiment, with the height and angle of the monocular camera kept unchanged, other plane views are captured by the monocular camera. A target detection algorithm such as YOLOv5 or Fast R-CNN can be used for target object detection. The main purpose of target object detection is to frame the outline of the object to be detected with a rectangular box. It follows that the bottom edge of the bounding box framing the target object lies in the calibrated two-dimensional plane.

Referring to FIG. 3, the first plane view 300 includes a target object 302 and a target object bounding box 301, and the target object bounding box 301 includes a bottom edge 3013, a top edge 3011, a left vertical edge 3012, and a right vertical edge 3013. Inverse perspective transformation is performed, according to the homography matrix, on the coordinates of any pixel point on the bottom edge 3013 to obtain the target inverse perspective coordinates corresponding to that pixel point. For example, inverse perspective transformation is performed on the coordinates of the midpoint of the bottom edge 3013 according to the homography matrix, and the target inverse perspective coordinates corresponding to the midpoint of the bottom edge 3013 are obtained.
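As a minimal sketch of how the midpoint of the bottom edge might be picked from a detection result, assuming the detector returns an axis-aligned box as (u_min, v_min, u_max, v_max) in a pixel coordinate system whose origin is the upper left corner with v increasing downward. The function name and box layout are illustrative assumptions, not part of the application:

```python
def bottom_edge_midpoint(box):
    """Return the pixel coordinates (u, v) of the midpoint of a box's bottom edge.

    `box` is assumed to be (u_min, v_min, u_max, v_max). Because v increases
    downward in the pixel coordinate system, the bottom edge lies at v = v_max.
    """
    u_min, v_min, u_max, v_max = box
    return ((u_min + u_max) / 2.0, float(v_max))


# Hypothetical bounding box for a detected target object.
example_box = (100, 50, 300, 400)
print(bottom_edge_midpoint(example_box))  # prints (200.0, 400.0)
```

Any other pixel point on the bottom edge could be used in the same way; the midpoint is simply the example given in this embodiment.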

In this embodiment, inverse perspective transformation is performed, according to the homography matrix and Formula 1, on the pixel coordinates of the midpoint of the bottom edge of the target object bounding box, and the target inverse perspective coordinates corresponding to the pixel coordinates of that midpoint are obtained.

Formula 1:

$$Z \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = H \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$$

Here H is the homography matrix, (u, v) are the pixel coordinates of the midpoint of the bottom edge of the target object bounding box, (x, y) are the target inverse perspective coordinates corresponding to the pixel coordinates of that midpoint, and Z represents the depth value.
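The mapping from (u, v) to (x, y) can be sketched as follows, assuming the homography is stored as a 3×3 nested list and that the third homogeneous component plays the role of Z and is divided out. The function name and the numeric matrix are illustrative assumptions only:

```python
def inverse_perspective(H, u, v):
    """Map pixel (u, v) to top-view coordinates (x, y) with a 3x3 homography H."""
    x_h = H[0][0] * u + H[0][1] * v + H[0][2]
    y_h = H[1][0] * u + H[1][1] * v + H[1][2]
    z = H[2][0] * u + H[2][1] * v + H[2][2]  # scale factor (Z in the text)
    if abs(z) < 1e-12:
        raise ValueError("pixel maps to infinity under this homography")
    return x_h / z, y_h / z


# Hypothetical homography, chosen only to exercise the function.
H_example = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.001, 1.0],
]
x, y = inverse_perspective(H_example, 200.0, 400.0)
print(round(x, 3), round(y, 3))
```

With the identity matrix as H the function returns the input pixel unchanged, which is a quick sanity check before plugging in a calibrated matrix.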

Step S106, determining the distance between the target object and the monocular camera according to the inverse perspective reference point coordinates, the target inverse perspective coordinates, the correspondence between physical scale and pixel scale, and the distance between the reference point and the monocular camera.

In this embodiment, in order to map pixel coordinates to real physical coordinates, the correspondence between pixel scale and physical scale is preset, that is, the real-world length to which each pixel corresponds. For example, 1 pixel corresponds to s meters in the real world.

Optionally, step S102 includes:

In this embodiment, the vertex pixel coordinates of the first area image are p1(u1, v1), p2(u2, v2), p3(u3, v3), and p4(u4, v4), and the corresponding inverse perspective vertex pixel coordinates are q1(x1, y1), q2(x2, y2), q3(x3, y3), and q4(x4, y4). The homography matrix is computed from these four pairs of corresponding points.

Optionally, in step S104, the acquiring the coordinates of the reverse perspective reference point corresponding to the pixel coordinates of the reference point includes:

In this embodiment, the size of the top view obtained by inverse perspective transformation of the plane view can be set. Half of the width of the top view can be set as the first-direction pixel coordinate of the inverse perspective coordinates of the reference point pixel coordinates, and three quarters of the height of the top view can be set as the second-direction pixel coordinate. In other implementations, the full height of the top view may instead be set as the second-direction pixel coordinate. No limitation is imposed here.
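A minimal sketch of this choice, assuming the top view size is given in pixels. The function name and the `height_fraction` parameter are illustrative; 0.75 corresponds to the three-quarters example and 1.0 to the alternative that uses the full height:

```python
def reference_point_in_top_view(top_width, top_height, height_fraction=0.75):
    """Place the reference point in the top view.

    First-direction coordinate: half of the top-view width.
    Second-direction coordinate: `height_fraction` of the top-view height.
    """
    return (top_width / 2.0, top_height * height_fraction)


# Hypothetical 640 x 480 top view.
print(reference_point_in_top_view(640, 480))       # prints (320.0, 360.0)
print(reference_point_in_top_view(640, 480, 1.0))  # prints (320.0, 480.0)
```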

Optionally, the obtaining the apex pixel coordinates of the first region image and their corresponding inverse perspective apex pixel coordinates includes:

Referring to FIG. 4, the vertex pixel coordinates of the second area image 401 in the corrected plane view are the pixel coordinates of pixel points F, G, H, and I. Compared with the first area image 203 before correction, the edges of the second area image 401 are straighter; in other words, the distortion effect is reduced.

Optionally, the method also includes:

The plane view is corrected to obtain the corrected plane view, including:

In this embodiment, in order to determine the relationship between the three-dimensional position of a point on the surface of an object in space and its corresponding point in the image, a geometric model of camera imaging must be established; the parameters of this geometric model are the camera parameters. In most cases these parameters must be obtained through experiment and calculation, and the process of solving for the intrinsic parameters, extrinsic parameters, and distortion parameters is called camera calibration. The conversion between the image coordinate system and the camera coordinate system is given by Formula 2, where the 3×4 matrix on the right side of the equation is called the intrinsic parameter matrix of the camera, fx and fy are the focal lengths along the x-axis and y-axis respectively, (u0, v0) are the optical center coordinates of the camera, and D is the physical scale parameter.

Formula 2:
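As a hedged sketch, the standard pinhole projection with a 3×4 intrinsic parameter matrix built from fx, fy, u0, and v0 is commonly written as below. The camera-frame coordinates (X_c, Y_c, Z_c) are assumed notation that the text does not name, and because the text does not say where the physical scale parameter D enters, D is left out of this sketch:

```latex
Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\begin{bmatrix}
f_x & 0   & u_0 & 0 \\
0   & f_y & v_0 & 0 \\
0   & 0   & 1   & 0
\end{bmatrix}
\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix}
```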

In this embodiment, in addition to calibrating the intrinsic parameters of the monocular camera, the distortion parameters of the monocular camera must also be calibrated. Put simply, distortion means that, because of the lens of the monocular camera, a straight line projected onto the image is no longer a straight line, which produces optical distortion. Camera distortion is mainly of two types: radial distortion and tangential distortion.

In this embodiment, the intrinsic parameters and distortion parameters of the monocular camera are obtained from a checkerboard image captured with the monocular camera. Distortion correction is then performed on the plane view captured by the monocular camera according to the calibrated intrinsic parameters and distortion parameters.
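The text does not name a particular distortion model. One widely used formulation that has both radial and tangential terms is the Brown-Conrady model, sketched here as an assumption. The coefficient names k1, k2, k3 (radial) and p1, p2 (tangential) and the function name are not taken from the application, and the inputs are normalized image coordinates rather than pixels:

```python
def distort_normalized(x, y, k1, k2, p1, p2, k3=0.0):
    """Apply Brown-Conrady distortion to normalized image coordinates (x, y).

    Radial terms scale the point by a polynomial in r^2; tangential terms
    shift it to model a lens that is not perfectly parallel to the sensor.
    """
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return x_d, y_d


# With all coefficients zero the point is unchanged; a positive k1 pushes
# points away from the optical center. Coefficient values are hypothetical.
print(distort_normalized(0.5, 0.0, 0.0, 0.0, 0.0, 0.0))
print(distort_normalized(0.5, 0.0, 0.1, 0.0, 0.0, 0.0))
```

Distortion correction inverts this mapping, which is usually done numerically because the model has no closed-form inverse.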

In this embodiment, if the reference point is the midpoint of the bottom edge of the rectangular area, half of the width of the top view is set as the first-direction pixel coordinate of the inverse perspective coordinates of the reference point pixel coordinates, three quarters of the height of the top view is set as the second-direction pixel coordinate, and the inverse perspective coordinates of the reference point pixel coordinates are denoted (x, y). Since the reference point is the midpoint of the bottom edge of the rectangular area, the inverse perspective vertex pixel coordinates can be derived from their positional relationship to the inverse perspective reference point coordinates.

Referring to Figure 5, if the reference point is the midpoint of the bottom edge of the rectangular area, the inverse perspective coordinates of the reference point pixel coordinates are the coordinates of pixel point N in Figure 5, denoted (x, y), and the inverse perspective vertex pixel coordinates are the coordinates of pixel points J, K, L, and M. If the bottom edge of the rectangular area is W meters long, the vertical edge perpendicular to the bottom edge is L meters long, and each pixel corresponds to s meters in the physical world, then the bottom edge spans W/s pixels and the vertical edge spans L/s pixels in the inverse perspective image. The coordinates of pixel points J, K, L, and M are therefore q1(x - W/(2s), y - L/s), q2(x + W/(2s), y - L/s), q3(x - W/(2s), y), and q4(x + W/(2s), y).
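The four corner coordinates can be computed directly from the reference point, the side lengths, and the scale. The sketch below assumes the reference point is the midpoint of the bottom edge, as in this embodiment; the function name and the numeric values are illustrative only:

```python
def inverse_perspective_corners(x, y, W, L, s):
    """Return [q1, q2, q3, q4] for the rectangle in the top view.

    (x, y): top-view coordinates of the bottom-edge midpoint (reference point).
    W, L:   bottom-edge and vertical-edge lengths of the rectangle, in meters.
    s:      meters represented by one top-view pixel.
    """
    half_width_px = W / (2.0 * s)  # half of the bottom edge, in pixels
    depth_px = L / s               # vertical edge, in pixels
    q1 = (x - half_width_px, y - depth_px)  # far left corner
    q2 = (x + half_width_px, y - depth_px)  # far right corner
    q3 = (x - half_width_px, y)             # near left corner
    q4 = (x + half_width_px, y)             # near right corner
    return [q1, q2, q3, q4]


# Hypothetical setup: 1.2 m x 1.2 m rectangle, 1 cm per pixel,
# reference point at (320, 360) in the top view.
for corner in inverse_perspective_corners(320.0, 360.0, 1.2, 1.2, 0.01):
    print(corner)
```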

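It is further added that the homography matrix H of the inverse perspective transformation is set according to Formula 3:

Formula 3: H = [h₁ h₂ h₃; h₄ h₅ h₆; h₇ h₈ h₉], a 3×3 matrix written row by row, with rows separated by semicolons.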

q1 is obtained by inverse perspective transformation of p1. Taking the correspondence between p1 and q1 as an example, the coordinates of p1 and q1 are first converted into the homogeneous coordinates (u1, v1, 1) and (x1, y1, 1); relating the two through the homography matrix yields Formula 4:

Formula 4: (x₁, y₁, 1)ᵀ = H·(u₁, v₁, 1)ᵀ, where H is the 3×3 homography matrix with entries h₁ to h₉.

Expanding the equation gives Formula 5 to Formula 7:

Formula 5: h₁u₁ + h₂v₁ + h₃ = x₁;

Formula 6: h₄u₁ + h₅v₁ + h₆ = y₁;

Formula 7: h₇u₁ + h₈v₁ + h₉ = 1;

Since a homography matrix is defined only up to scale, h₉ can be set to 1. Formula 7 then gives h₇u₁ + h₈v₁ = 0. Multiplying this by x₁ and subtracting it from Formula 5, and multiplying it by y₁ and subtracting it from Formula 6, yields Formula 8 and Formula 9:

Formula 8: h₁u₁ + h₂v₁ + h₃ - h₇u₁x₁ - h₈v₁x₁ = x₁;

Formula 9: h₄u₁ + h₅v₁ + h₆ - h₇u₁y₁ - h₈v₁y₁ = y₁;

Based on the correspondence between p2 and q2, Formula 10 and Formula 11 can be deduced:

Formula 10: h₁u₂ + h₂v₂ + h₃ - h₇u₂x₂ - h₈v₂x₂ = x₂;

Formula 11: h₄u₂ + h₅v₂ + h₆ - h₇u₂y₂ - h₈v₂y₂ = y₂;

Based on the correspondence between p3 and q3, Formula 12 and Formula 13 can be deduced:

Formula 12: h₁u₃ + h₂v₃ + h₃ - h₇u₃x₃ - h₈v₃x₃ = x₃;

Formula 13: h₄u₃ + h₅v₃ + h₆ - h₇u₃y₃ - h₈v₃y₃ = y₃;

Based on the correspondence between p4 and q4, Formula 14 and Formula 15 can be deduced:

Formula 14: h₁u₄ + h₂v₄ + h₃ - h₇u₄x₄ - h₈v₄x₄ = x₄;

Formula 15: h₄u₄ + h₅v₄ + h₆ - h₇u₄y₄ - h₈v₄y₄ = y₄;

p1(u1, v1), p2(u2, v2), p3(u3, v3) and p4(u4, v4) can be read directly from the flat view. The corresponding inverse perspective pixel coordinates q1(x1, y1), q2(x2, y2), q3(x3, y3) and q4(x4, y4) are replaced with q1(x-W/2s, y-L/s), q2(x+W/2s, y-L/s), q3(x-W/2s, y) and q4(x+W/2s, y) and substituted into Formula 8 to Formula 15, from which the homography matrix H is calculated.
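Solving the eight linear equations for h₁ to h₈ can be sketched in plain Python as below. This is a minimal sketch of the standard four-point linear solve with h₉ fixed to 1, not necessarily the implementation used in the embodiment. Each x-equation multiplies the h₇ and h₈ terms by the target x, and each y-equation by the target y. The names `solve_linear`, `homography_from_4_points` and `apply_homography` are hypothetical, and dividing by the third homogeneous component when mapping a point is an assumption that goes beyond the text.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a*h = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    h = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        acc = m[r][n] - sum(m[r][c] * h[c] for c in range(r + 1, n))
        h[r] = acc / m[r][r]
    return h


def homography_from_4_points(src: Sequence[Point],
                             dst: Sequence[Point]) -> List[List[float]]:
    """Estimate H (with h9 = 1) mapping flat-view (u, v) to top-view (x, y)."""
    a: List[List[float]] = []
    b: List[float] = []
    for (u, v), (x, y) in zip(src, dst):
        a.append([u, v, 1.0, 0.0, 0.0, 0.0, -u * x, -v * x])  # x-equation
        b.append(x)
        a.append([0.0, 0.0, 0.0, u, v, 1.0, -u * y, -v * y])  # y-equation
        b.append(y)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(h: List[List[float]], p: Point) -> Point:
    """Map a flat-view pixel to top-view coordinates (assumed normalisation)."""
    u, v = p
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    return ((h[0][0] * u + h[0][1] * v + h[0][2]) / w,
            (h[1][0] * u + h[1][1] * v + h[1][2]) / w)
```

A typical use would pass the four vertex pixel coordinates p1 to p4 as `src` and q1 to q4 as `dst`, then call `apply_homography` on pixels of later flat views.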

Optionally, step S106 includes:

In this embodiment, in order to calculate the physical coordinates of road pixel points in other flat views, the origin of the physical coordinate system needs to be determined. As shown in Figure 6, the projection point of the monocular camera on the road is set as the origin of the physical coordinate system, the road is taken as the 2D plane, the direction parallel to the line of sight of the camera is the positive direction of the y-axis, and the direction perpendicular to the y-axis and pointing to the right is the positive direction of the x-axis. The distance between the origin of the coordinate system and the reference point of the rectangular area is measured. Specifically, in this embodiment, the reference point of the rectangular area is the midpoint of the bottom edge, and the distance in the y direction between the origin of the coordinate system and the midpoint of the bottom edge is L1 meters. Since the coordinates of the midpoint of the bottom edge of the rectangle in the top view are (x, y), the target inverse perspective coordinates (a, b) of any pixel point on the bottom edge of the target object bounding box in other flat views correspond to the coordinates ((a-x)s, (b-y)s+L1) in the physical coordinate system, where (a-x)s and (b-y)s+L1 represent the real physical distance between the target object and the monocular camera in the x-axis direction and the y-axis direction, respectively, in meters.
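The conversion from target inverse perspective coordinates to physical coordinates can be sketched as follows. The function name `to_physical` is hypothetical. The formula is applied exactly as written in the text, which presumes that the second top-view coordinate grows with distance from the camera; if the top view instead follows the usual image convention, with the second coordinate growing toward the camera, the sign of the (b-y) term would have to be flipped.

```python
from typing import Tuple

Point = Tuple[float, float]


def to_physical(target: Point, ref: Point, s: float, l1: float) -> Point:
    """Convert top-view coordinates (a, b) to physical road coordinates.

    target -- (a, b), target inverse perspective coordinates in pixels
    ref    -- (x, y), inverse perspective reference point coordinates in pixels
    s      -- meters represented by one top-view pixel
    l1     -- measured y-distance L1 from the camera projection to the
              reference point, in meters
    """
    a, b = target
    x, y = ref
    x_m = (a - x) * s       # lateral offset along the x-axis
    y_m = (b - y) * s + l1  # forward distance along the y-axis
    return (x_m, y_m)
```

With assumed values (a, b) = (450, 500), (x, y) = (400, 450), s = 0.01 and L1 = 2.0, the call `to_physical((450, 500), (400, 450), 0.01, 2.0)` evaluates ((450-400)·0.01, (500-450)·0.01 + 2.0).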

It is supplemented that the target inverse perspective coordinates (a1, b1) of the midpoint of the bottom edge of the target object bounding box in other flat views correspond to the coordinates ((a1-x)s, (b1-y)s+L1) in the physical coordinate system, where (a1-x)s and (b1-y)s+L1 represent the real physical distance between the midpoint of the bottom edge of the target object bounding box and the monocular camera in the x-axis direction and the y-axis direction, respectively, in meters.
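Putting the pieces together for a detected bounding box, a minimal sketch might look like the following. The box layout (u_min, v_min, u_max, v_max), with v growing downward so that v_max is the bottom edge, is an assumption, as are the function names and the normalisation by the third homogeneous component. The sign convention of the (b1-y) term is taken from the text as written.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # assumed (u_min, v_min, u_max, v_max)


def bottom_edge_midpoint(box: Box) -> Point:
    """Midpoint of the bottom edge, assuming v grows downward in the image."""
    u_min, _v_min, u_max, v_max = box
    return ((u_min + u_max) / 2.0, v_max)


def apply_homography(h: List[List[float]], p: Point) -> Point:
    """Map a flat-view pixel into the top view with a 3x3 homography."""
    u, v = p
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    return ((h[0][0] * u + h[0][1] * v + h[0][2]) / w,
            (h[1][0] * u + h[1][1] * v + h[1][2]) / w)


def midpoint_distance(box: Box, h: List[List[float]], ref: Point,
                      s: float, l1: float) -> Point:
    """Return ((a1-x)s, (b1-y)s+L1) for the bottom-edge midpoint of the box."""
    a1, b1 = apply_homography(h, bottom_edge_midpoint(box))
    x, y = ref
    return ((a1 - x) * s, (b1 - y) * s + l1)
```

Here `h` would be the homography matrix H obtained during calibration, `ref` the inverse perspective reference point coordinates (x, y), and `box` the output of whatever detector is used for the target object.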

In the monocular ranging method provided in this embodiment, a flat view is captured by a monocular camera; a homography matrix for inverse perspective transformation of the flat view into a top view is determined according to the vertex pixel coordinates of the first area image; a reference point is determined from the bottom edge of the rectangular area, and the distance between the reference point and the monocular camera is obtained; the reference point pixel coordinates corresponding to the reference point in the first area image are determined, and the inverse perspective reference point coordinates corresponding to the reference point pixel coordinates are obtained; target object detection is performed on other flat views captured by the monocular camera to obtain a target object bounding box, and inverse perspective transformation is performed, according to the homography matrix, on the coordinates of any pixel point on the bottom edge of the target object bounding box to obtain the corresponding target inverse perspective coordinates; and the distance between the target object and the monocular camera is determined according to the inverse perspective reference point coordinates, the target inverse perspective coordinates, the correspondence between physical scale and pixel scale, and the distance between the reference point and the monocular camera. In this way, the distance to the target object can be measured with a monocular camera without excessive data requirements, irregular target objects can be ranged, the conditions required for ranging are relaxed, and the ranging efficiency is improved.

Example 2

In addition, an embodiment of the present disclosure provides a monocular ranging device.

In this embodiment, the monocular ranging device may be a smart device such as a smart car or a robot.

Specifically, as shown in Figure 7, the monocular ranging device 700 includes:

The photographing module 701 is configured to photograph a plane view through a monocular camera, the ground visible area of the monocular camera includes a rectangular area, and the plane view includes a first area image corresponding to the rectangular area;

The first determination module 702 is configured to determine a homography matrix used for inverse perspective transformation of the plane view into a top view according to the top corner pixel coordinates of the first area image;

An acquisition module 703, configured to determine a reference point from the bottom of the rectangular area, and acquire the distance between the reference point and the monocular camera;

The first processing module 704 is configured to determine the pixel coordinates of the reference point corresponding to the reference point in the first region image, and obtain the reverse perspective reference point coordinates corresponding to the pixel coordinates of the reference point;

The second processing module 705 is configured to perform target object detection on other plane views captured by the monocular camera to obtain a target object bounding box, and to perform inverse perspective transformation on the coordinates of any pixel point on the bottom edge of the target object bounding box according to the homography matrix, so as to obtain the target inverse perspective coordinates corresponding to the coordinates of that pixel point;

The second determination module 706 is configured to determine the distance between the target object and the monocular camera according to the inverse perspective reference point coordinates, the target inverse perspective coordinates, the correspondence between physical scale and pixel scale, and the distance between the reference point and the monocular camera.

Optionally, the first determining module 702 is further configured to obtain the pixel coordinates of the top corner of the first area image and the corresponding pixel coordinates of the top corner of the reverse perspective;

Optionally, the first determination module 702 is further configured to set reverse perspective coordinates of pixel coordinates of the reference point according to the size of the top view.

Optionally, the first determination module 702 is further configured to correct the plane view to obtain a corrected plane view, the corrected plane view including a second area image, and to use the vertex pixel coordinates of the second area image as the vertex pixel coordinates of the first area image;

Optionally, the monocular ranging device 700 also includes:

The first determination module 702 is further configured to correct the plane view according to the intrinsic parameters and distortion parameters of the monocular camera to obtain the corrected plane view.

The first determination module 702 is further configured to determine the first-direction coordinate and the second-direction coordinate of the inverse perspective vertex pixel coordinates according to the first-direction reference point pixel coordinate, the second-direction reference point pixel coordinate, the side lengths of the rectangular area, the correspondence between physical scale and pixel scale, and the positional relationship between the inverse perspective vertex pixel coordinates and the inverse perspective reference point coordinates.

Optionally, the second determination module 706 is further configured to subtract the first-direction reference point pixel coordinate from the first-direction coordinate of the target inverse perspective coordinates to obtain a first pixel difference; multiply the first pixel difference by the correspondence between physical scale and pixel scale to obtain a first product; and use the first product as the first-direction distance between the target object and the monocular camera; and/or,

The monocular ranging device 700 provided in this embodiment can implement the monocular ranging method shown in Embodiment 1, and details are not repeated here to avoid repetition.

The monocular ranging device provided in this embodiment captures a plane view with a monocular camera; determines a homography matrix for inverse perspective transformation of the plane view into a top view according to the vertex pixel coordinates of the first area image; determines a reference point from the bottom edge of the rectangular area and obtains the distance between the reference point and the monocular camera; determines the reference point pixel coordinates corresponding to the reference point in the first area image and obtains the inverse perspective reference point coordinates corresponding to the reference point pixel coordinates; performs target object detection on other plane views captured by the monocular camera to obtain a target object bounding box, and performs inverse perspective transformation, according to the homography matrix, on the coordinates of any pixel point on the bottom edge of the target object bounding box to obtain the corresponding target inverse perspective coordinates; and determines the distance between the target object and the monocular camera according to the inverse perspective reference point coordinates, the target inverse perspective coordinates, the correspondence between physical scale and pixel scale, and the distance between the reference point and the monocular camera. In this way, the distance to the target object can be measured with a monocular camera without excessive data requirements, irregular target objects can be ranged, the conditions required for ranging are relaxed, and the ranging efficiency is improved.

Example 3

In addition, an embodiment of the present disclosure provides an intelligent device including a monocular camera, a memory and a processor. The memory stores a computer program, and when the computer program runs on the processor, it executes the monocular ranging method provided in Embodiment 1.

In this embodiment, the smart device may be, for example, a smart car or a robot.

Wherein, the processor is configured to: take a plane view through a monocular camera, the ground visible area of the monocular camera includes a rectangular area, and the plane view includes a first area image corresponding to the rectangular area;

Optionally, the processor is further configured to: obtain the pixel coordinates of the top corner of the image of the first region and the corresponding pixel coordinates of the top corner of the image in reverse perspective;

Optionally, the processor is further configured to: set reverse perspective coordinates of pixel coordinates of the reference point according to the size of the top view.

Optionally, the processor is further configured to: correct the plane view to obtain a corrected plane view, the corrected plane view including a second area image, and use the vertex pixel coordinates of the second area image as the vertex pixel coordinates of the first area image;

Optionally, the processor is further configured to: capture a checkerboard image with the monocular camera, and obtain the intrinsic parameters and distortion parameters of the monocular camera according to the checkerboard image;

The processor is further configured to: determine the first-direction coordinate and the second-direction coordinate of the inverse perspective vertex pixel coordinates according to the first-direction reference point pixel coordinate, the second-direction reference point pixel coordinate, the side lengths of the rectangular area, the correspondence between physical scale and pixel scale, and the positional relationship between the inverse perspective vertex pixel coordinates and the inverse perspective reference point coordinates.

Optionally, the processor is further configured to: subtract the first-direction reference point pixel coordinate from the first-direction coordinate of the target inverse perspective coordinates to obtain a first pixel difference; multiply the first pixel difference by the correspondence between physical scale and pixel scale to obtain a first product; and use the first product as the first-direction distance between the target object and the monocular camera; and/or,

The smart device provided in this embodiment can implement the monocular ranging method shown in Embodiment 1, and to avoid repetition, details are not repeated here.

The smart device provided in this embodiment captures a plane view with a monocular camera; determines a homography matrix for inverse perspective transformation of the plane view into a top view according to the vertex pixel coordinates of the first area image; determines a reference point from the bottom edge of the rectangular area and obtains the distance between the reference point and the monocular camera; determines the reference point pixel coordinates corresponding to the reference point in the first area image and obtains the inverse perspective reference point coordinates corresponding to the reference point pixel coordinates; performs target object detection on other plane views captured by the monocular camera to obtain a target object bounding box, and performs inverse perspective transformation, according to the homography matrix, on the coordinates of any pixel point on the bottom edge of the target object bounding box to obtain the corresponding target inverse perspective coordinates; and determines the distance between the target object and the monocular camera according to the inverse perspective reference point coordinates, the target inverse perspective coordinates, the correspondence between physical scale and pixel scale, and the distance between the reference point and the monocular camera. In this way, the distance to the target object can be measured with a monocular camera without excessive data requirements, irregular target objects can be ranged, the conditions required for ranging are relaxed, and the ranging efficiency is improved.

Example 4

The present application also provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the following steps are implemented:

A plane view is captured by a monocular camera, the ground visible area of the monocular camera includes a rectangular area, and the plane view includes a first area image corresponding to the rectangular area;

Optionally, when the computer program is executed by the processor, the following steps are also implemented:

When the computer program is executed by the processor, the following steps are also implemented:

In this embodiment, the computer-readable storage medium may be a read-only memory (Read-Only Memory, ROM for short), a random access memory (Random Access Memory, RAM for short), a magnetic disk or an optical disk, and the like.

The computer-readable storage medium provided in this embodiment can implement the monocular ranging method shown in Embodiment 1, and to avoid repetition, details are not repeated here.

It should be noted that, in this document, the terms "comprising", "including" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or terminal comprising a set of elements includes not only those elements but also other elements not expressly listed, or elements inherent in the process, method, article or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article or terminal comprising the element.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on such an understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk or an optical disc) and contains several instructions that enable a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the embodiments of the present application.

The embodiments of the present application have been described above in conjunction with the accompanying drawings, but the present application is not limited to the above specific implementations, which are only illustrative and not restrictive. Under the inspiration of the present application, those of ordinary skill in the art can also devise many other forms without departing from the purpose of the present application and the scope of protection of the claims, all of which fall within the protection of the present application.

Claims

A monocular ranging method, characterized in that the method comprises:

A plane view is taken by a monocular camera, the ground visible area of the monocular camera includes a rectangular area, and the plane view includes a first area image corresponding to the rectangular area;

determining a homography matrix for inverse perspective transforming the plane view into a top view according to the apex pixel coordinates of the first region image;

Determining a reference point from the bottom edge of the rectangular area, and obtaining the distance between the reference point and the monocular camera;

Determining the pixel coordinates of the reference point corresponding to the reference point in the first region image, and obtaining the coordinates of the reverse perspective reference point corresponding to the pixel coordinates of the reference point;

Performing target object detection on other plane views captured by the monocular camera to obtain a target object bounding box, and performing inverse perspective transformation on the coordinates of any pixel point on the bottom edge of the target object bounding box according to the homography matrix to obtain the target inverse perspective coordinates corresponding to the coordinates of that pixel point on the bottom edge;

Determining the distance between the target object and the monocular camera according to the inverse perspective reference point coordinates, the target inverse perspective coordinates, the correspondence between physical scale and pixel scale, and the distance between the reference point and the monocular camera.
The method according to claim 1, wherein the determining a homography matrix for inverse perspective transformation of the plane view into a top view according to the apex pixel coordinates of the first region image comprises:

Acquiring the pixel coordinates of the top corner of the first area image and the corresponding pixel coordinates of the top corner of the reverse perspective image;

The homography matrix is determined according to the vertex pixel coordinates of the first area image and the reverse perspective vertex pixel coordinates.
The method according to claim 2, wherein said obtaining the coordinates of the reverse perspective reference point corresponding to the pixel coordinates of the reference point comprises:

The reverse perspective coordinates of the pixel coordinates of the reference point are set according to the size of the top view.
The method according to claim 3, wherein the acquiring the pixel coordinates of the top corners of the image of the first region and the corresponding pixel coordinates of the top corners of the reverse perspective comprises:

Correcting the plane view to obtain a corrected plane view, the corrected plane view including a second area image, and using the vertex pixel coordinates of the second area image as the vertex pixel coordinates of the first area image;

The pixel coordinates of the reverse perspective corner are determined according to the reverse perspective coordinates of the pixel coordinates of the reference point, the corresponding relationship between the physical scale and the pixel scale, and the side length of the rectangular area.
The method according to claim 4, characterized in that the method further comprises:

Capturing a checkerboard image with the monocular camera, and obtaining the intrinsic parameters and distortion parameters of the monocular camera according to the checkerboard image;

The correcting the plane view to obtain a corrected plane view comprises:

Correcting the plane view according to the intrinsic parameters and the distortion parameters to obtain the corrected plane view.
The monocular ranging method according to claim 4, wherein the inverse perspective coordinates of the reference point pixel coordinates include a first-direction reference point pixel coordinate and a second-direction reference point pixel coordinate, and the second direction is perpendicular to the first direction;

According to the inverse perspective coordinates of the pixel coordinates of the reference point, the corresponding relationship between the physical scale and the pixel scale, and the side length of the rectangular area, determining the pixel coordinates of the inverse perspective corners includes:

Determining the first-direction coordinate and the second-direction coordinate of the inverse perspective vertex pixel coordinates according to the first-direction reference point pixel coordinate, the second-direction reference point pixel coordinate, the side lengths of the rectangular area, the correspondence between physical scale and pixel scale, and the positional relationship between the inverse perspective vertex pixel coordinates and the inverse perspective reference point coordinates.
The monocular ranging method according to claim 6, wherein the determining the distance between the target object and the monocular camera according to the inverse perspective reference point coordinates, the target inverse perspective coordinates, the correspondence between physical scale and pixel scale, and the distance between the reference point and the monocular camera comprises:

Subtracting the first-direction reference point pixel coordinate from the first-direction coordinate of the target inverse perspective coordinates to obtain a first pixel difference; multiplying the first pixel difference by the correspondence between physical scale and pixel scale to obtain a first product; and using the first product as the first-direction distance between the target object and the monocular camera; and/or,

Subtracting the second-direction reference point pixel coordinate from the second-direction coordinate of the target inverse perspective coordinates to obtain a second pixel difference; multiplying the second pixel difference by the correspondence between physical scale and pixel scale to obtain a second product; and using the sum of the second product and the second-direction distance between the reference point and the monocular camera as the second-direction distance between the target object and the monocular camera.
A monocular ranging device, characterized in that the device comprises:

A photographing module, configured to photograph a plane view through a monocular camera, the ground visible area of the monocular camera includes a rectangular area, and the plane view includes a first area image corresponding to the rectangular area;

A first determination module, configured to determine a homography matrix for inverse perspective transformation of the plane view into a top view according to the top corner pixel coordinates of the first area image;

An acquisition module, configured to determine a reference point from the bottom of the rectangular area, and acquire a distance between the reference point and the monocular camera;

The first processing module is configured to determine the pixel coordinates of the reference point corresponding to the reference point in the first region image, and acquire the reverse perspective reference point coordinates corresponding to the pixel coordinates of the reference point;

The second processing module is configured to perform target object detection on other plane views captured by the monocular camera to obtain a target object bounding box, and to perform inverse perspective transformation on the coordinates of any pixel point on the bottom edge of the target object bounding box according to the homography matrix, so as to obtain the target inverse perspective coordinates corresponding to the coordinates of that pixel point on the bottom edge;

The second determination module is configured to determine the distance between the target object and the monocular camera according to the inverse perspective reference point coordinates, the target inverse perspective coordinates, the correspondence between physical scale and pixel scale, and the distance between the reference point and the monocular camera.
An intelligent device, characterized in that it comprises a monocular camera, a memory and a processor, wherein the memory stores a computer program, and the computer program, when run on the processor, executes the monocular ranging method according to any one of claims 1 to 7.
A computer-readable storage medium, characterized in that it stores a computer program, and the computer program executes the monocular ranging method according to any one of claims 1 to 7 when running on a processor.