CN112033352B - Multi-camera ranging robot and visual ranging method - Google Patents

Multi-camera ranging robot and visual ranging method

Info

Publication number: CN112033352B
Application number: CN202010905113.1A
Authority: CN (China)
Prior art keywords: camera, image, feature point, preset
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN112033352A
Inventors: 赖钦伟, 肖刚军
Current assignee: Zhuhai Amicro Semiconductor Co Ltd (the listed assignee may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: Zhuhai Amicro Semiconductor Co Ltd
Events: application CN202010905113.1A filed by Zhuhai Amicro Semiconductor Co Ltd; publication of CN112033352A; application granted; publication of CN112033352B; legal status Active

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01C: MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 3/00: Measuring distances in line of sight; Optical rangefinders
    • G01C 3/10: Measuring distances in line of sight; Optical rangefinders using a parallactic triangle with variable angles and a base of fixed length in the observation station, e.g. in the instrument
    • G01C 3/18: Measuring distances in line of sight; Optical rangefinders using a parallactic triangle with variable angles and a base of fixed length in the observation station, e.g. in the instrument, with one observation point at each end of the base

Abstract

The invention discloses a multi-camera ranging robot and a visual ranging method. Three or more cameras that do not all lie on the same preset straight line are mounted on the bearing base of the robot, which solves the problem that prior-art binocular cameras cannot detect the depth of parallel-line scenes (thresholds, step edges, and edge contour lines). Compared with the parallel binocular stereoscopic vision model of the prior art, the disclosed visual ranging method combines the binocular ranging result of a first camera and a second camera lying on the same preset straight line with the binocular ranging result of one camera selected from those two paired with a third camera that is not on that line, and calculates an optimal depth value from the two results. This avoids misjudging the distance when the first depth value produced by the binocular ranging of the first and second cameras is 0, and improves the accuracy of distance detection.

Description

Multi-camera ranging robot and visual ranging method
Technical Field
The invention belongs to the technical field of binocular vision ranging, and particularly relates to a multi-camera ranging robot and a visual ranging method.
Background
Distance measurement, particularly of three-dimensional information, is a necessary function for many robots, helping the robot perceive the outside world. Binocular stereoscopic vision ranging has the advantages of being non-contact and automatic and causing no harm to human eyes. However, in actual ranging, although the robot's two cameras can image step edge contour lines and threshold lines that are parallel to their baseline, the depth of an object parallel to that baseline cannot be calculated from image lines parallel to the baseline.
Disclosure of Invention
To solve the problem that binocular cameras cannot detect the depth of parallel lines, the invention adopts three or more cameras that are not all arranged on the same straight line, at least one of which lies off the straight line on which binocular vision ranging is normally performed, thereby improving the accuracy of distance detection. The specific technical scheme is as follows:
the utility model provides a many cameras range finding's robot, this robot is including bearing the base, should bear the base and install at least three cameras that do not distribute on same default straight line. Compared with a parallel binocular stereoscopic vision model in the prior art, the technical scheme has the advantages that at least two cameras are distributed on the same straight line, one camera is not necessarily distributed on the straight line, the problem that the binocular cameras in the prior art cannot detect depth information of parallel line scenes (threshold, step edge and object horizontal edge contour lines) is solved, and the accuracy of distance detection is improved.
Further, at least two cameras are installed on the same preset straight line on the bearing base, and at least one camera is arranged above and/or below the preset straight line. This ensures that the two cameras on the preset straight line can normally perform binocular vision ranging, and that when those two cameras cannot detect parallel lines, a geometrically associated camera that is not on the preset straight line can be called to complete the binocular vision ranging of the parallel lines.
Further, on the bearing base, one camera is arranged vertically above and one vertically below the midpoint of the line connecting the two cameras mounted on the preset straight line. Compared with the parallel binocular stereoscopic vision model of the prior art, this improves the efficiency of matching feature points along the cameras' epipolar lines.
Further, the preset straight line is a horizontal line, which facilitates binocular vision ranging and simplifies the calculation of depth information.
Further, two or more cameras with the same horizontal acquisition direction are arranged on the preset straight line such that, on that line, the distance between one pair of cameras differs from the distance between another pair, i.e. the baseline lengths of the two pairs differ. This makes the robot compatible with both long-range and short-range visual ranging scenarios.
Further, on the preset straight line there are a left camera and a right camera horizontally separated by a baseline distance, and the cameras installed above and/or below the preset straight line lie on a vertical line through that baseline. This facilitates constructing a parallel binocular stereoscopic vision model to calculate depth values and improves computational efficiency.
A visual ranging method based on the multi-camera ranging robot comprises the following steps: searching, in the image shot by a first camera and the image shot by a second camera, for a transversely matching feature-point pair based on the same object space point to be measured, solving a first parallax of that pair, and then constructing a triangle geometric relationship from the baseline distance between the first camera and the second camera and the camera focal length to calculate a first depth value; selecting one of the first camera and the second camera as a reference camera, searching, in the image shot by the reference camera and the image shot by a third camera, for a longitudinally matching feature-point pair based on the same object space point to be measured, solving a second parallax of that pair, and then constructing a triangle geometric relationship from the baseline distance between the reference camera and the third camera and the camera focal length to calculate a second depth value; judging whether the first parallax is 0: if so, determining the second depth value as the optimal depth value representing the distance between the object space point to be measured and the robot, otherwise processing the first depth value and the second depth value according to the magnitude relation between the first parallax and the second parallax to obtain the optimal depth value. Here, the two cameras placed a baseline distance apart on the same preset straight line are set as the first camera and the second camera respectively, and a camera above or below the preset straight line is set as the third camera.
Compared with the parallel binocular stereoscopic vision model of the prior art, this technical scheme combines the binocular ranging result of the first and second cameras on the same preset straight line with the binocular ranging result of the reference camera and the third camera, which is not on that line, to calculate the optimal depth value. It solves the problem that prior-art binocular cameras cannot detect the depth of an object space point in a parallel-line scene (a threshold or a stepped parallel edge), avoids misjudging the distance when the first depth value produced by the binocular ranging of the first and second cameras is 0, and improves the accuracy of distance detection.
Further, one method of processing the first depth value and the second depth value according to the magnitude relation between the first parallax and the second parallax to obtain the optimal depth value comprises: when the first parallax is greater than the second parallax, determining the optimal depth value to be the first depth value; when the second parallax is greater than the first parallax, determining the optimal depth value to be the second depth value. The optimal depth value is thereby closer to the actual distance between the object space point to be measured and the robot (represented as a pixel value in the depth map).
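As a hedged sketch of this selection rule (the function name and numeric values are hypothetical, not from the patent), choosing the depth whose parallax is larger can be written as:

```python
def pick_depth_by_disparity(d1, z1, d2, z2):
    """Return the depth value whose parallax is larger.

    A larger parallax generally yields a smaller relative
    triangulation error, so that depth estimate is trusted more.
    d1, z1: first (horizontal) parallax and first depth value
    d2, z2: second (vertical) parallax and second depth value
    """
    return z1 if d1 > d2 else z2
```

For example, with a first parallax of 12 px (depth 1.5 m) and a second parallax of 8 px (depth 1.6 m), the rule keeps the first depth value.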
Further, another method of processing the first depth value and the second depth value according to the magnitude relation between the first parallax and the second parallax to obtain the optimal depth value comprises: calculating a first proportion, i.e. the share of the first parallax in the sum of the first parallax and the second parallax, and using it as the averaging weight assigned to the first depth value; calculating a second proportion, i.e. the share of the second parallax in that sum, and using it as the averaging weight assigned to the second depth value; and then performing a weighted average of the first depth value and the second depth value with the first and second proportions to obtain the optimal depth value. Processing the depth values obtained by the three cameras that are not on the same preset straight line with a weighted average gives the optimal depth value better stability and reduces the influence of environmental factors.
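The weighted-average variant can be sketched similarly (a hypothetical helper, not the patent's implementation; it assumes at least one parallax is non-zero):

```python
def fuse_depths_weighted(d1, z1, d2, z2):
    """Weighted average of two depth values, each weighted by the
    share of its own parallax in the total parallax, as described
    above: w1 = d1 / (d1 + d2), w2 = d2 / (d1 + d2).
    """
    total = d1 + d2
    w1 = d1 / total  # first proportion
    w2 = d2 / total  # second proportion
    return w1 * z1 + w2 * z2
```

With parallaxes of 3 px and 1 px and depths of 2.0 m and 4.0 m, the weights are 0.75 and 0.25 and the fused depth is 2.5 m.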
Further, when the first parallax is 0, the robot recognizes that the object space point to be measured lies on a line parallel to the baseline between the first camera and the second camera, so the optimal depth value is derived from the perpendicular distance between the object space point and the line connecting the projection center of the third camera with the projection center of the reference camera (the projection center of the second camera or of the first camera).
Further, the method for finding the transversely matching feature-point pair based on the same object space point to be measured is as follows. Set the image shot by the first camera as the left image and the image shot by the second camera as the right image. First, take the left image as the reference image and the right image as the image to be matched: select a first preset feature point in the left image, then search the right image, under the epipolar constraint, for a second preset feature point that matches it. Then take the right image as the reference image and the left image as the image to be matched: select the second preset feature point in the right image and, when the first preset feature point matching it is found in the left image under the epipolar constraint, the first and second preset feature points form a transversely matching feature-point pair. Here the first preset feature point is the projection of the object space point on the left image, and the second preset feature point is its projection on the right image. Alternatively, set the image shot by the first camera as the reference image and the image shot by the second camera as the image to be matched; select a first preset feature point in the reference image, search the image to be matched for a matching second preset feature point under the epipolar constraint, and let the two points form the transversely matching feature-point pair; here the first preset feature point is the projection of the object space point on the reference image and the second preset feature point is its projection on the image to be matched. This narrows the matching range between feature points of the left image shot by the first camera and feature points of the right image shot by the second camera, and improves feature-point matching efficiency in the transverse (horizontal) direction.
Further, the method for finding the longitudinally matching feature-point pair based on the same object space point to be measured is analogous. Set the image shot by the reference camera as the lower image and the image shot by the third camera as the upper image. First, take the lower image as the reference image and the upper image as the image to be matched: select a third preset feature point in the lower image, then search the upper image, under the epipolar constraint, for a fourth preset feature point that matches it. Then take the upper image as the reference image and the lower image as the image to be matched: select the fourth preset feature point in the upper image and, when the third preset feature point matching it is found in the lower image under the epipolar constraint, the third and fourth preset feature points form a longitudinally matching feature-point pair. Here the third preset feature point is the projection of the object space point on the lower image, and the fourth preset feature point is its projection on the upper image. Alternatively, set the image shot by the reference camera as the reference image and the image shot by the third camera as the image to be matched; select a third preset feature point in the reference image, search the image to be matched for a matching fourth preset feature point under the epipolar constraint, and let the two points form the longitudinally matching feature-point pair. This narrows the matching range between feature points of the image shot by the third camera and feature points of the reference image shot by the reference camera (the first or second camera), and improves feature-point matching efficiency in the longitudinal direction.
Drawings
Fig. 1 is a schematic structural view of a multi-camera ranging robot according to an embodiment of the present invention.
Fig. 2 is a flowchart of a visual ranging method based on the robot shown in fig. 1 according to an embodiment of the present invention.
Detailed Description
The following describes the technical solutions in the embodiments of the present invention in detail with reference to the drawings. In the description of the present invention, it should be understood that terms such as "center", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "axial", "radial", and "circumferential" indicate orientations or positional relationships based on those shown in the drawings. They are used merely for convenience and simplicity of description, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they should therefore not be construed as limiting the present invention.
It is noted that relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between those entities or actions. Moreover, the terms "comprises", "comprising", and any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises it.
The embodiment of the invention discloses a multi-camera ranging robot. As shown in fig. 1, the robot comprises a bearing base 101 and a driving wheel 102, and the bearing base 101 is provided with at least three cameras that are not all on the same preset straight line: at least two cameras are distributed on one straight line, and at least one camera must lie off that line. Compared with the parallel binocular stereoscopic vision model of the prior art, installing three or more cameras that are not on the same preset straight line on the bearing base 101 solves the problem that prior-art binocular cameras cannot detect the depth of parallel-line scenes (threshold and step edge areas), and improves the accuracy of distance detection.
In the embodiment of the present invention, at least two cameras are installed on the same preset straight line on the bearing base 101, and at least one further camera is disposed above and/or below that line. This ensures that the two cameras on the preset straight line can normally perform binocular vision ranging, and that when those two cameras cannot detect parallel lines, a geometrically associated camera can be called to complete the binocular vision ranging of the parallel lines. A geometrically associated camera is one that is not on the same preset straight line but can still perform binocular vision ranging together with a camera that is, thereby forming a new binocular stereoscopic vision model.
Preferably, on the bearing base, one camera is arranged vertically above and one vertically below the midpoint of the line connecting the two cameras on the preset straight line, so that the bearing base carries four cameras in a cross arrangement. Compared with the parallel binocular stereoscopic vision model of the prior art, this improves the efficiency of matching feature points along the cameras' epipolar lines.
Preferably, the preset straight line is a horizontal line on the horizontal plane where the bearing base 101 is located, so as to facilitate binocular vision ranging and simplify the calculation amount of depth information.
As an embodiment, as shown in fig. 1, two or more cameras with the same horizontal acquisition direction are arranged on the preset straight line (understood as the horizontal line) such that the distance between one pair of cameras differs from the distance between another pair, i.e. the baseline length of one pair of cameras distributed on the preset straight line differs from that of another pair. This makes the robot compatible with long-range and short-range visual ranging scenes: the larger the baseline between two cameras on the same preset straight line, the farther the scene whose object depth information the robot can measure; the smaller the baseline, the nearer the scene.
As an embodiment, on the preset straight line, as shown in fig. 1, there are a left camera A and a right camera B horizontally placed a baseline distance apart, and a camera C installed above the preset straight line lies on a vertical line through the baseline between the left camera A and the right camera B; equivalently, camera C lies on the perpendicular bisector of the line connecting camera A and camera B. The intrinsic parameters of cameras A, B, and C are the same, and their optical axes are parallel to one another and perpendicular to the baseline between A and B, so that the left image shot by camera A and the right image shot by camera B are coplanar and exhibit horizontal parallax; the image shot by camera C, however, is not necessarily coplanar with the image shot by camera A or with the image shot by camera B. The distance between the projection center of camera A and the projection center of camera B is the baseline distance. This embodiment facilitates constructing the parallel binocular stereoscopic vision model to calculate depth values and improves computational efficiency.
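The cross layout of this embodiment can be illustrated with hypothetical coordinates (the positions and distances below are invented for illustration and are not dimensions from the patent):

```python
import math

# Cameras in the plane of the bearing base, units in meters.
A = (-0.05, 0.00)   # left camera A on the preset (horizontal) line
B = (0.05, 0.00)    # right camera B on the preset line
C = (0.00, 0.04)    # camera C above the midpoint of segment AB

baseline_AB = math.dist(A, B)                    # horizontal baseline A-B
mid_AB = ((A[0] + B[0]) / 2, (A[1] + B[1]) / 2)  # midpoint of the baseline
baseline_CA = math.dist(C, A)                    # baseline used with reference camera A

# Camera C lies on the perpendicular bisector of A-B, as the embodiment requires.
assert math.isclose(C[0], mid_AB[0])
```

The horizontal pair (A, B) and the vertical pair (C with A or B) thus have their own baselines, which is what lets the method below fall back to the vertical pair when the horizontal pair degenerates.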
It should be noted that, in the foregoing embodiment, the bearing base of the robot and the camera installed with the bearing base belong to a visual ranging module.
Another embodiment of the invention discloses a visual ranging method based on the multi-camera ranging robot of the previous embodiment. In that robot, the first camera and the second camera are placed a baseline distance apart; this distance is selected within the same preset straight line according to the robot's actual ranging scene range, because the measuring range is strongly related to the baseline (the distance between the two cameras): the larger the baseline, the farther the measurement range; the smaller the baseline, the nearer the range. The baseline thus limits the measurement range of the depth camera to some extent.
Of course, several cameras with the same image acquisition direction can be installed so that the baseline lengths between any two of them differ, making the system compatible with visual ranging scenes at different distances. One camera above or below the preset straight line is set as the third camera; alternatively, one or more cameras can be provided above and/or below the preset straight line. This improves feature-point matching efficiency, allows richer matching feature points to be found for the depth calculation, reduces the probability that the depth of parallel-line scenes (thresholds, corridors, and step edge areas) goes undetected as with prior-art binocular cameras, and improves the precision of distance detection.
As an embodiment, as shown in fig. 2, the visual ranging method includes the steps of:
step S101, a transverse matching characteristic point pair based on the same object space point to be detected is searched out from an image shot by a first camera and an image shot by a second camera, first parallax of the transverse matching characteristic point pair is obtained, and then a triangular geometric relationship is constructed by combining a base line distance between the first camera and the second camera and a camera focal length to calculate a first depth value; and then proceeds to step S102.
Preferably, the first camera and the second camera are placed horizontally, their optical axes are parallel and perpendicular to the baseline, and their focal lengths are equal. When the two cameras shoot the same object space point to be measured at the same moment, each acquires an image of that point; two mutually matching feature points based on the point are then searched out along the same epipolar line, and their coordinates are subtracted in the same direction to obtain the horizontal parallax. By the triangle geometric relationship, the ratio of the baseline length between the first and second cameras to the horizontal parallax equals the ratio of the depth of the object space point to the camera focal length; this is the classical binocular parallax ranging principle of a dual-camera system placed in the standard configuration. Therefore, in step S101, the first depth value can be calculated from the proportional relationship among the baseline distance between the first and second cameras, the first parallax, and the camera focal length. This depth value Z is the distance between the object space point to be measured and the baseline between the first and second cameras, and a depth map can be obtained by representing Z as the pixel value of the corresponding pixel in the image.
If this object space point to be measured lies on a line parallel to the baseline between the first camera and the second camera (such as a step edge), the transversely matching feature-point pairs based on the same object space point may coincide on the same epipolar line, making an accurate parallax difficult to obtain; in this case the computed parallax is typically zero.
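The triangle relationship above, baseline / parallax = depth / focal length, i.e. Z = f * B / d, can be sketched as a minimal helper; the function name and the choice to return None for zero parallax are illustrative assumptions:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Parallel-stereo triangulation: baseline / disparity = depth / focal,
    hence depth = focal * baseline / disparity.

    Returns None when the disparity is zero, which is the degenerate
    parallel-line case described above.
    """
    if disparity_px == 0:
        return None
    return focal_px * baseline_m / disparity_px
```

For instance, a 700 px focal length, 0.1 m baseline, and 35 px disparity give a depth of about 2.0 m.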
Step S102, selecting one camera from the first camera and the second camera as a reference camera, searching a longitudinal matching characteristic point pair based on the same object space point to be detected from an image shot by the reference camera and an image shot by the third camera, solving a second parallax of the longitudinal matching characteristic point pair, and then constructing a triangle geometric relationship by combining a baseline distance between the reference camera and the third camera and a camera focal length to calculate a second depth value; then, the process advances to step S103.
Preferably, the optical axis of the reference camera is parallel to that of the third camera and perpendicular to the baseline between them, and the two focal lengths are equal. When the reference camera and the third camera shoot the same object space point of step S101 at the same moment, then by the classical binocular parallax ranging principle of a standard-configuration dual-camera system, each camera acquires an image containing a projection point of the object space point. The matching feature points are searched for along the epipolar direction to determine a longitudinally matching feature-point pair based on the same object space point, and the coordinates of its two feature points are subtracted in the same direction to obtain the second parallax. This second parallax of the object space point is a vertical rather than a horizontal parallax, and is therefore not affected by the degenerate configuration of the baseline between the first and second cameras in step S101.
In this embodiment, since the epipolar plane formed by the object space point, the reference camera, and the third camera is perpendicular to the baseline between the first and second cameras, an edge line of the object parallel to that baseline is not parallel to the baseline between the reference camera and the third camera in the imaging planes of the third camera and the reference camera.
In the previous steps, the optical axes of the two cameras in each pair are parallel, the two imaging planes are coplanar, and the epipolar lines are aligned: any point in one image has the same line number as its corresponding point in the other image, so the correspondence can be found by a one-dimensional search along that line. Binocular matching (the transversely matching feature-point pair of step S101 and the longitudinally matching feature-point pair of step S102) matches the projections of the same scene point in the two cameras' views, which is done in order to obtain a disparity map.
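The one-dimensional search along an aligned epipolar line can be sketched as a toy sum-of-absolute-differences (SAD) matcher over rectified grayscale images; the window size, search range, and cost function here are assumptions for illustration, not the patent's method:

```python
import numpy as np

def match_along_row(left, right, y, x, patch=3, max_disp=32):
    """Find the column in row y of `right` that best matches the
    patch centered at (y, x) in `left`, by a 1-D SAD search along
    the same row (rectified images share epipolar lines with rows).
    Returns the horizontal disparity x - x_match (>= 0).
    """
    h = patch // 2
    ref = left[y - h:y + h + 1, x - h:x + h + 1].astype(np.int32)
    best_d, best_cost = 0, None
    for d in range(0, max_disp + 1):
        xr = x - d                      # candidate column in the right image
        if xr - h < 0:
            break                       # candidate patch would leave the image
        cand = right[y - h:y + h + 1, xr - h:xr + h + 1].astype(np.int32)
        cost = np.abs(ref - cand).sum() # sum of absolute differences
        if best_cost is None or cost < best_cost:
            best_cost, best_d = cost, d
    return best_d
```

With a synthetic bright patch shifted 5 columns between the two images, the search recovers a disparity of 5.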
Step S103: determine whether the first parallax is 0; if so, proceed to step S104, otherwise proceed to step S105. In this embodiment, when the first parallax is 0 the first depth value is also 0, and the robot recognizes that the object space point to be detected lies on a line parallel to the baseline between the first camera and the second camera, this parallel line being a horizontal edge line of a threshold or a step.
Step S104: determine the second depth value as the optimal depth value representing the distance between the object space point to be detected and the robot. In this case, the first parallax of the transverse matching feature point pair searched in step S101 from the images captured by the first and second cameras is 0, so depth information between the object space point and the robot clearly cannot be obtained from that pair: because the baseline between the first camera and the second camera is parallel to the edge line on which the object space point lies, the two cameras cannot acquire mutually matched projection points, only lines. The robot may then detect and identify step edge contour lines and threshold lines that are parallel to the baseline of these two cameras. Only the second depth value obtained in step S102 can therefore serve as the optimal depth value representing the distance between the object space point to be detected and the robot.
Step S105: process the first depth value and the second depth value according to the magnitude relation between the first parallax and the second parallax to obtain the optimal depth value. The purpose of this step is to obtain a more accurate and stable optimal depth value by processing the two depth values obtained through the cooperation of three cameras that are not all on the same horizontal straight line, i.e., not all on the preset straight line.
Compared with the parallel binocular stereoscopic vision model of the prior art, this method corrects the binocular ranging result of the first and second cameras (which lie on the same preset straight line) with the binocular ranging result of the reference camera and the third camera (which do not lie on that line) to calculate the optimal depth value. It thereby solves the problem that a prior-art binocular camera cannot detect the depth of an object space point in a parallel-line scene (threshold and step edge lines parallel to the baseline), avoids the distance misjudgment produced when the first depth value from the first/second camera pair is 0, and improves the accuracy of distance detection.
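Steps S103 through S105 can be summarized as a short decision sketch: if the horizontal disparity is 0 (object edge parallel to the first/second camera baseline), the horizontal pair carries no usable depth and the vertical pair's depth is returned; otherwise the two depths are fused. The function and parameter names are illustrative, and `fuse` is a placeholder for either fusion strategy described in the embodiments that follow in the original text.

```python
def optimal_depth(d1, z1, d2, z2, fuse):
    """Sketch of steps S103-S105.
    d1, z1: first parallax and first depth (horizontal camera pair)
    d2, z2: second parallax and second depth (vertical camera pair)
    fuse:   callable implementing one of the fusion strategies."""
    if d1 == 0:
        # Degenerate case: baseline parallel to the object edge line,
        # so only the vertical pair's depth is meaningful (step S104).
        return z2
    return fuse(d1, z1, d2, z2)   # step S105
```

For example, with a fusion rule that keeps the depth of the larger disparity, a nonzero horizontal disparity larger than the vertical one yields the first depth value.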
As one embodiment, the method for processing the first depth value and the second depth value according to the magnitude relation between the first parallax and the second parallax to obtain the optimal depth value is as follows: when the first parallax is greater than the second parallax, the optimal depth value is determined to be the first depth value; when the second parallax is greater than the first parallax, the optimal depth value is determined to be the second depth value. In this way the optimal depth value obtained in step S105 is closer to the actual distance between the object space point to be detected and the robot. If a plurality of third cameras above and/or below the preset straight line each participate, according to the classical binocular parallax ranging principle, in searching for longitudinal matching feature point pairs based on the same object space point to be detected, so as to obtain a plurality of parallaxes and corresponding depth values forming a depth information mapping table to be corrected, then the depth value corresponding to the largest parallax among all the parallaxes in the table together with the first parallax is selected as the optimal depth value.
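The largest-parallax rule above extends naturally from two camera pairs to many: collect each pair's (parallax, depth) entry into the mapping table and keep the depth belonging to the largest parallax (the larger the disparity, the better conditioned the triangulation). A minimal sketch, with an illustrative data layout:

```python
def depth_by_largest_disparity(pairs):
    """`pairs` is a list of (parallax, depth) tuples: the first camera
    pair's entry plus one entry per vertical pair (the 'depth information
    mapping table to be corrected'). Returns the depth value associated
    with the largest parallax."""
    return max(pairs, key=lambda p: p[0])[1]
```

With entries from three camera pairs, the depth of the pair reporting the largest disparity is selected as the optimal depth value.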
As another embodiment, the method for processing the first depth value and the second depth value according to the magnitude relation between the first parallax and the second parallax to obtain the optimal depth value is as follows: calculate a first proportion, the share of the first parallax in the sum of the first parallax and the second parallax, and use it as the averaging weight assigned to the first depth value; calculate a second proportion, the share of the second parallax in that sum, and use it as the averaging weight assigned to the second depth value; then perform a weighted average of the first and second depth values with the first and second proportions to obtain the optimal depth value. If a plurality of third cameras above and/or below the preset straight line each participate, according to the classical binocular parallax ranging principle, in searching for longitudinal matching feature point pairs based on the same object space point to be detected, so as to obtain a plurality of parallaxes and corresponding depth values forming a depth information mapping table to be corrected, then the proportion of every parallax in the table (including the first parallax) is calculated and the corresponding depth values participate in the weighted average; the resulting weighted mean is the optimal depth value. In this embodiment, the weighted-average method processes the depth values obtained by three cameras that are not all on the same preset straight line (at least two of which necessarily lie on the same line), so the optimal depth value is more stable and less affected by environmental factors.
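The disparity-proportion weighting described above can be written in a few lines; each depth is weighted by its disparity's share of the disparity sum, so pairs with larger (better conditioned) disparities contribute more. A sketch under the same illustrative (parallax, depth) table layout as before:

```python
def depth_by_weighted_mean(pairs):
    """`pairs` is a list of (parallax, depth) tuples from all camera
    pairs. Each depth's weight is its parallax divided by the sum of
    all parallaxes; the weighted sum is the optimal depth value."""
    total = sum(d for d, _ in pairs)
    return sum((d / total) * z for d, z in pairs)
```

For two pairs with disparities 1 and 3, the weights are 0.25 and 0.75 respectively, so the second pair's depth dominates the average.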
In the foregoing embodiment, the method for searching out the transverse matching feature point pair based on the same object space point to be detected comprises: setting the image captured by the first camera as the left image and the image captured by the second camera as the right image; first taking the left image as the reference image and the right image as the image to be matched, selecting a first preset feature point in the left image, and then searching the right image, based on the epipolar constraint, for a second preset feature point matching the first preset feature point; then taking the right image as the reference image and the left image as the image to be matched, selecting a second preset feature point in the right image, and, when a first preset feature point matching it is searched out from the left image based on the epipolar constraint, forming a transverse matching feature point pair from the first preset feature point and the second preset feature point. Alternatively, the image captured by the first camera is set as the reference image and the image captured by the second camera as the image to be matched; a first preset feature point is selected in the reference image, a second preset feature point matching it is searched out from the image to be matched based on the epipolar constraint, and the first and second preset feature points form a transverse matching feature point pair. In either case, the first preset feature point is the projection of the same object space point to be detected on the reference image, and the second preset feature point is its projection on the image to be matched.
This reduces the matching range between the feature points of the left image captured by the first camera and those of the right image captured by the second camera, and improves the efficiency of feature point matching in the transverse (horizontal) direction.
It should be noted that the first preset feature point is the projection of the same object space point to be detected on the left image, and the second preset feature point is its projection on the right image. When the left and right images are parallel, the two feature points lie on the same epipolar line, so the second preset feature point matching the first is found in the right image along the epipolar direction of the left image; the corresponding first parallax is obtained by subtracting the coordinates of the first and second preset feature points in the same direction, from which the first depth value is computed. When the left and right images are not parallel, the two feature points do not lie on the same epipolar line; the left and right images are then projectively transformed onto the same plane through a homography matrix, so that the transformed right image contains, along the horizontal epipolar direction of the transformed left image, the second preset feature point matching the first; the first parallax is then obtained by subtracting the two feature points' coordinates along the horizontal epipolar direction, from which the first depth value is computed.
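The "same-direction coordinate subtraction" above yields the disparity, and the triangular geometric relationship mentioned in the claims reduces, for a rectified pair, to the similar-triangles formula Z = f·B/d (focal length f in pixels, baseline B in metric units). A minimal sketch, not the patent's explicit code:

```python
def depth_from_disparity(x_ref, x_match, focal_px, baseline):
    """Rectified-pair triangulation: disparity d is the same-direction
    coordinate difference of the two matched projections, and the depth
    is Z = f * B / d, in the same unit as the baseline B."""
    d = x_ref - x_match
    if d == 0:
        return None   # degenerate case, handled separately in step S104
    return focal_px * baseline / d
```

For example, a 10-pixel disparity with a 500-pixel focal length and a 0.1 m baseline gives a depth of 5 m, and a zero disparity signals the parallel-edge degenerate case that step S104 resolves with the vertical camera pair.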
In the foregoing embodiment, the method for searching out the longitudinal matching feature point pair based on the same object space point to be detected comprises: setting the image captured by the reference camera as the lower image and the image captured by the third camera as the upper image; first taking the lower image as the reference image and the upper image as the image to be matched, selecting a third preset feature point in the lower image, and searching the upper image, based on the epipolar constraint, for a fourth preset feature point matching the third preset feature point; then taking the upper image as the reference image and the lower image as the image to be matched, selecting a fourth preset feature point in the upper image, and, when a third preset feature point matching it is searched out from the lower image based on the epipolar constraint, forming a longitudinal matching feature point pair from the third preset feature point and the fourth preset feature point; here the third preset feature point is the projection of the same object space point to be detected in the lower image, and the fourth preset feature point is its projection in the upper image. Alternatively, the image captured by the reference camera is set as the reference image and the image captured by the third camera as the image to be matched; a third preset feature point is selected in the reference image, a fourth preset feature point matching it is searched out from the image to be matched based on the epipolar constraint, and the third and fourth preset feature points form a longitudinal matching feature point pair; in this case the third preset feature point is the projection of the same object space point to be detected on the reference image, and the fourth preset feature point is its projection on the image to be matched.
This reduces the matching range between the feature points of the image to be matched captured by the third camera and those of the reference image captured by the reference camera (the first camera or the second camera), and improves the efficiency of feature point matching in the longitudinal direction.
When the lower and upper images are parallel, the third and fourth preset feature points lie on the same epipolar line, so the fourth preset feature point matching the third is found in the upper image along the epipolar direction of the lower image; the corresponding second parallax is obtained by subtracting the coordinates of the third and fourth preset feature points in the same direction, from which the second depth value is computed. When the lower and upper images are not parallel, the two feature points do not lie on the same epipolar line; the lower and upper images are then projectively transformed onto the same plane through a homography matrix, so that the transformed upper image contains, along the epipolar direction of the transformed lower image, the fourth preset feature point matching the third; the second parallax is then obtained by subtracting the two feature points' coordinates along the transformed epipolar direction, from which the second depth value is computed.
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the preferred embodiments, those skilled in the art will appreciate that modifications may be made to the specific embodiments, or equivalents substituted for some of their technical features, without departing from the spirit of the invention, and such modifications and substitutions are intended to fall within the scope of the claims.

Claims (7)

1. A visual ranging method, characterized in that it comprises the steps of:
searching a transverse matching feature point pair based on the same object space point to be detected from an image captured by a first camera and an image captured by a second camera, solving a first parallax of the transverse matching feature point pair, and then constructing a triangular geometric relationship in combination with the baseline distance between the first camera and the second camera and the camera focal length to calculate a first depth value;
selecting one of the first camera and the second camera as a reference camera, searching a longitudinal matching feature point pair based on the same object space point to be detected from an image captured by the reference camera and an image captured by a third camera, solving a second parallax of the longitudinal matching feature point pair, and then constructing a triangular geometric relationship in combination with the baseline distance between the reference camera and the third camera and the camera focal length to calculate a second depth value;
judging whether the first parallax is 0, if so, determining the second depth value as an optimal depth value for representing the distance between the space point of the object to be detected and the robot, otherwise, processing the first depth value and the second depth value according to the magnitude relation of the first parallax and the second parallax to obtain the optimal depth value;
the first camera and the second camera are two cameras placed a baseline distance apart on the same preset straight line according to the size of the actual ranging scene range of the robot; a camera above or below the preset straight line is set as the third camera;
the method for processing the first depth value and the second depth value according to the magnitude relation of the first parallax and the second parallax to obtain the optimal depth value comprises the following steps:
when the first parallax is greater than the second parallax, the optimal depth value is determined to be the first depth value;
when the second parallax is greater than the first parallax, the optimal depth value is determined as the second depth value;
calculating a first proportion of the first parallax in the sum value of the first parallax and the second parallax, and taking the first proportion as a mean value processing weight distributed by a first depth value;
calculating a second proportion of the second parallax in the sum of the first parallax and the second parallax, and taking the second proportion as a mean value processing weight distributed by a second depth value;
then, respectively carrying out weighted average processing on the first depth value and the second depth value by utilizing the first proportion and the second proportion so as to obtain an optimal depth value;
when the first parallax is 0, the robot recognizes that the space point of the object to be detected is positioned on a parallel line of a base line between the first camera and the second camera;
wherein this parallel line is a horizontal edge line belonging to a threshold or step.
2. The visual ranging method according to claim 1, wherein the method for searching out the pairs of the transverse matching feature points based on the same object space point to be measured comprises:
setting the image captured by the first camera as a left image and the image captured by the second camera as a right image; firstly taking the left image as a reference image and the right image as an image to be matched, selecting a first preset feature point in the left image, and then searching the right image, based on the epipolar constraint, for a second preset feature point matching the first preset feature point; then taking the right image as the reference image and the left image as the image to be matched, selecting a second preset feature point in the right image, and, based on the epipolar constraint, when a first preset feature point matching the second preset feature point is searched out from the left image, forming a transverse matching feature point pair from the first preset feature point and the second preset feature point; the first preset feature point is a projection point of the same object space point to be detected on the left image, and the second preset feature point is a projection point of the same object space point to be detected on the right image;
or setting the image captured by the first camera as a reference image and the image captured by the second camera as an image to be matched, selecting a first preset feature point in the reference image, searching a second preset feature point matching the first preset feature point from the image to be matched based on the epipolar constraint, and forming a transverse matching feature point pair from the first preset feature point and the second preset feature point; the first preset feature point is a projection point of the same object space point to be detected on the reference image, and the second preset feature point is a projection point of the same object space point to be detected on the image to be matched.
3. The visual ranging method as set forth in claim 1, wherein the method for searching out the longitudinal matching feature point pairs based on the same object space point to be measured comprises:
setting the image captured by the reference camera as a lower image and the image captured by the third camera as an upper image; firstly taking the lower image as a reference image and the upper image as an image to be matched, selecting a third preset feature point in the lower image, and searching the upper image, based on the epipolar constraint, for a fourth preset feature point matching the third preset feature point; then taking the upper image as the reference image and the lower image as the image to be matched, selecting a fourth preset feature point in the upper image, and, based on the epipolar constraint, when a third preset feature point matching the fourth preset feature point is searched out from the lower image, forming a longitudinal matching feature point pair from the third preset feature point and the fourth preset feature point; the third preset feature point is a projection point of the same object space point to be detected in the lower image, and the fourth preset feature point is a projection point of the same object space point to be detected in the upper image;
or setting the image captured by the reference camera as a reference image and the image captured by the third camera as an image to be matched, selecting a third preset feature point in the reference image, searching a fourth preset feature point matching the third preset feature point from the image to be matched based on the epipolar constraint, and forming a longitudinal matching feature point pair from the third preset feature point and the fourth preset feature point; the third preset feature point is a projection point of the same object space point to be detected on the reference image, and the fourth preset feature point is a projection point of the same object space point to be detected on the image to be matched.
4. A multi-camera ranging robot for performing the visual ranging method of any one of claims 1 to 3, the robot comprising a carrying base, characterized in that three cameras not distributed on the same preset straight line are mounted on the carrying base;
the preset straight line is provided with two cameras in the same horizontal acquisition direction, so that: on the preset straight line, the distance between two cameras is different from the distance between the other two cameras, or the base line length of the two cameras is different from the base line length of the other two cameras.
5. The multi-camera ranging robot of claim 4, wherein two cameras are installed on the same preset straight line in the bearing base, and there is one camera disposed above or below the preset straight line.
6. The multi-camera ranging robot of claim 4, wherein the preset straight line is a horizontal line.
7. The multi-camera ranging robot of claim 6, wherein on the preset straight line, there are left and right cameras horizontally placed at a base line distance, and cameras installed above or below the preset straight line are located on a vertical line of a connection line of the left and right cameras.
CN202010905113.1A 2020-09-01 2020-09-01 Multi-camera ranging robot and visual ranging method Active CN112033352B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010905113.1A CN112033352B (en) 2020-09-01 2020-09-01 Multi-camera ranging robot and visual ranging method

Publications (2)

Publication Number Publication Date
CN112033352A CN112033352A (en) 2020-12-04
CN112033352B true CN112033352B (en) 2023-11-07

Family

ID=73590878

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010905113.1A Active CN112033352B (en) 2020-09-01 2020-09-01 Multi-camera ranging robot and visual ranging method

Country Status (1)

Country Link
CN (1) CN112033352B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023115561A1 (en) * 2021-12-24 2023-06-29 深圳市大疆创新科技有限公司 Movement control method and apparatus for movable platform, and movable platform

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102385237A (en) * 2010-09-08 2012-03-21 微软公司 Depth camera based on structured light and stereo vision
CN103292710A (en) * 2013-05-27 2013-09-11 华南理工大学 Distance measuring method applying binocular visual parallax error distance-measuring principle
CN109813251A (en) * 2017-11-21 2019-05-28 蒋晶 Method, apparatus and system for three-dimensional measurement
CN109974659A (en) * 2019-03-31 2019-07-05 徐州工程学院 A kind of embedded range-measurement system based on binocular machine vision
CN212363177U (en) * 2020-09-01 2021-01-15 珠海市一微半导体有限公司 Robot with multiple cameras for distance measurement

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4055998B2 (en) * 2003-04-15 2008-03-05 本田技研工業株式会社 Distance detection device, distance detection method, and distance detection program

Also Published As

Publication number Publication date
CN112033352A (en) 2020-12-04


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 519000 2706, No. 3000, Huandao East Road, Hengqin new area, Zhuhai, Guangdong

Applicant after: Zhuhai Yiwei Semiconductor Co.,Ltd.

Address before: Room 105-514, No.6 Baohua Road, Hengqin New District, Zhuhai City, Guangdong Province

Applicant before: AMICRO SEMICONDUCTOR Co.,Ltd.

GR01 Patent grant