CN110910438B

CN110910438B - High-speed stereo matching algorithm for ultrahigh-resolution binocular image

Info

Publication number: CN110910438B
Application number: CN201811079109.3A
Authority: CN
Inventors: 华春生; 随心
Original assignee: Shenyang Institute of Automation of CAS
Current assignee: Shenyang Institute of Automation of CAS
Priority date: 2018-09-17
Filing date: 2018-09-17
Publication date: 2022-03-22
Anticipated expiration: 2038-09-17
Also published as: CN110910438A

Abstract

The invention relates to a high-speed stereo matching algorithm for ultrahigh-resolution binocular images, comprising: down-sampling the left and right images of a binocular camera, obtaining the parallax range through matching, and setting thresholds; encoding each image and calculating the matching cost; aggregating the matching costs over a minimum spanning tree constructed according to NLCA; and determining the parallax by selecting the parallax value corresponding to the point with the minimum matching cost, and retaining parallax values according to the thresholds corresponding to the different layers. The invention chooses the size of the left and right images to be matched according to the distance of the object. A near object has a large parallax and occupies many pixels, so matching it at full size would consume much time; it is therefore matched on a smaller image. Conversely, a far object is matched on a large image, which preserves the matching precision for far objects.

Description

High-speed stereo matching algorithm for ultrahigh-resolution binocular image

Technical Field

The invention relates to the field of binocular stereo vision, in particular to a high-speed stereo matching algorithm for an ultrahigh-resolution binocular image.

Background

Stereo matching searches for corresponding pixels between two or more images of the same scene taken from different viewing angles, obtains a depth value from the parallax between the corresponding pixels, and thereby recovers the three-dimensional information of an object. Stereo matching is widely applied in fields such as three-dimensional reconstruction, robot navigation and virtual reality.

A stereo matching algorithm estimates the parallax value of each pixel by establishing an energy cost function and minimizing it. In essence it is an optimization problem: a reasonable energy function is established, constraints are added, and an optimization method is used to solve it, which yields the corresponding matching points of the left and right pictures and hence their parallax values. Stereo matching generally comprises four steps: matching cost calculation, cost aggregation, parallax calculation and parallax refinement. The matching cost calculation generally obtains the matching cost of each point from the gray-level images of the left and right pictures; specific algorithms include AD, SAD and census. The cost calculation produces cost maps of the matching cost, and the number of cost maps equals the number of candidate parallax values in the search range. The cost aggregation stage is a filtering stage in which each cost map is filtered; currently popular algorithms such as CSCA and NLCA give good results. In the parallax calculation stage, for a given pixel in the image, the parallax with the smallest matching cost is searched for in the obtained cost maps. The final parallax refinement stage optimizes the obtained parallax, mainly with algorithms such as left-right consistency checking and region voting; these steps are rarely used in practical applications.

In summary, the key stages of stereo matching are the matching cost calculation and the cost aggregation. However, most stereo matching methods match only the full-size image as a whole, which takes a long time at ultrahigh resolution.

Disclosure of Invention

To address the shortcomings of the prior art, the invention provides a high-speed stereo matching algorithm for ultrahigh-resolution binocular images. It solves the technical problem that a near object occupies many pixels in the image and has a large parallax, so that computing it in full costs a lot of time.
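The time saving can be illustrated with a rough estimate. This is only a sketch under assumptions that are not stated in the source: a rectified camera pair with focal length $f$ (in pixels) and baseline $B$, a brute-force search over $D$ candidate parallax values on a $W \times H$ image, and a uniform down-sampling factor $s$ in each direction.

$$d = \frac{f\,B}{Z}, \qquad \text{work} \propto W \cdot H \cdot D, \qquad \frac{W}{s} \cdot \frac{H}{s} \cdot \frac{D}{s} = \frac{W\,H\,D}{s^{3}}$$

The parallax $d$ grows as the depth $Z$ shrinks, so near objects need the widest search range. On an image reduced by $s$ the pixel count falls by $s^{2}$ and the parallax range by $s$; for example, if each of two down-sampling passes halves the image ($s = 4$), the brute-force work on the smallest image is about $1/64$ of the full-size work.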

The technical scheme adopted by the invention for realizing the purpose is as follows:

a high-speed stereo matching algorithm for an ultrahigh-resolution binocular image comprises the following steps:

step 1: down-sampling left and right images of the binocular camera, matching on the image with the smallest size to obtain the parallax range of the whole image, and setting a threshold value for each layer according to the down-sampling times and the parallax range;

step 2: coding each picture through census transformation, and calculating matching costs of different scales according to different distances;

step 3: aggregating the matching costs over a minimum spanning tree constructed according to NLCA, to obtain a filtered matching cost matrix;

step 4: determining the parallax by selecting the parallax value corresponding to the point with the minimum matching cost, and retaining parallax values according to the thresholds corresponding to the different layers.
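The down-sampling in step 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the source does not state the down-sampling factor or filter, so halving by 2 × 2 block averaging is an assumption, and the names `downsample_by_two` and `build_pyramid` are hypothetical.

```python
def downsample_by_two(image):
    """Halve width and height by averaging each 2x2 block (assumed filter).

    `image` is a list of rows of integer gray values; an odd last row or
    column is cropped.
    """
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) // 4
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, num_downsamples):
    """Return [full size, half size, ...]; the last entry is the smallest image."""
    levels = [image]
    for _ in range(num_downsamples):
        levels.append(downsample_by_two(levels[-1]))
    return levels


# Two down-sampling passes give the three layers used in the embodiment.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, 2)
sizes = [(len(level), len(level[0])) for level in pyramid]
```

Matching would start on `pyramid[-1]`, the smallest image, and proceed towards `pyramid[0]`.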

The encoding process of each picture comprises the following steps:

selecting a point in the image and taking an n × n rectangle centered on it; comparing every point in the rectangle other than the center point with the center point, recording 1 when its gray value is smaller than that of the center point and 0 when it is larger; the resulting census sequence of length (n × n − 1) represents the pixel value of the center point.
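The encoding described above can be sketched in a few lines. Two details are assumptions, since the source does not specify them: a neighbor equal to the center is recorded as 0, and windows that cross the image border are handled by clamping coordinates. The function name `census_code` is hypothetical.

```python
def census_code(image, row, col, n):
    """Census bits of the n x n window centered on (row, col), packed in an int.

    A neighbor darker than the center contributes a 1 bit, otherwise 0.
    Ties (equal gray value) give 0 and borders are clamped; both are
    assumptions, not taken from the source.
    """
    half = n // 2
    center = image[row][col]
    code = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            if dr == 0 and dc == 0:
                continue  # the center point itself is not encoded
            r = min(max(row + dr, 0), len(image) - 1)
            c = min(max(col + dc, 0), len(image[0]) - 1)
            bit = 1 if image[r][c] < center else 0
            code = (code << 1) | bit
    return code


gray = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
code = census_code(gray, 1, 1, 3)  # neighbors 1,2,3,4 are darker than 5
```

For the 3 × 3 window the code has 8 bits; here the four neighbors scanned before the center are darker and the four after it are brighter, so `code` is `0b11110000`.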

The calculation process of the matching cost comprises the following steps:

the matching cost of each point is calculated by comparing the Hamming distance between the left and right images:

$$C\big(x(i_l, j_l),\, y(i_r, j_r)\big) = \sum_{k=1}^{n} x_k \oplus y_k$$

wherein x and y are the n-bit codes of the gray values of the pixel points of the left and right images, $x_k$ and $y_k$ are their k-th bits, ⊕ denotes exclusive OR, and C denotes the Hamming distance (namely the matching cost value); I denotes the left parallax image and (i, j) denotes a position in the image, each pixel point of I being the parallax value of that point. The smaller the Hamming distance, the smaller the matching cost and the higher the similarity.
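As an illustration of the cost just described, the Hamming distance between two census codes is the number of 1 bits in their exclusive OR. The sketch below assumes the codes are stored as Python integers; the name `hamming_cost` is hypothetical.

```python
def hamming_cost(x, y):
    """Matching cost of two census codes: number of bits in which they differ."""
    return bin(x ^ y).count("1")


# Codes that differ in the four middle bits have cost 4; identical codes cost 0.
cost_different = hamming_cost(0b11110000, 0b11001100)
cost_identical = hamming_cost(0b10101010, 0b10101010)
```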

The setting of the threshold for each layer according to the number of downsampling and the parallax range includes:

the number of thresholds equals the number of down-sampling operations, which is 1 less than the number of layers, and the threshold of each layer is selected as a percentage of the parallax range.
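A possible way to derive the thresholds is sketched below. The source only says that each threshold is a percentage of the parallax range; the particular fractions (25% and 50%), the example range, and the name `layer_thresholds` are assumptions made for illustration.

```python
def layer_thresholds(d_min, d_max, fractions):
    """One threshold per down-sampling pass, each a fraction of the parallax range."""
    span = d_max - d_min
    return [d_min + fraction * span for fraction in fractions]


num_layers = 3
num_downsamples = num_layers - 1      # one fewer than the number of layers
fractions = (0.25, 0.5)               # hypothetical percentages
assert len(fractions) == num_downsamples

theta_1, theta_2 = layer_thresholds(0, 200, fractions)
```

With a parallax range of 0 to 200 this gives θ₁ = 50 and θ₂ = 100, so the larger threshold θ₂ selects the nearest objects.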

The determining of the parallax by selecting the parallax value corresponding to the point with the minimum matching cost includes the following process:

$$I(i, j) = d\big(C_{\min}(x(i_l, j_l),\, y(i_r, j_r))\big)$$

wherein $d(C_{\min}(x(i_l, j_l), y(i_r, j_r)))$ represents the parallax value when the cost is minimum, I represents the left parallax image, (i, j) represents the position in the image, each pixel point of I is the parallax value of that point, and C represents the matching cost value.
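Selecting the parallax with the minimum cost is commonly called winner-take-all. A minimal sketch follows, assuming the aggregated costs of one pixel are stored in a list indexed by candidate parallax; the names `winner_take_all` and `cost_volume` are hypothetical.

```python
def winner_take_all(costs):
    """Return the parallax index whose cost is smallest (first one on ties)."""
    return min(range(len(costs)), key=costs.__getitem__)


# cost_volume[row][col] holds the costs of every candidate parallax at that pixel.
cost_volume = [
    [[7, 3, 5, 3], [1, 9, 9, 9]],
    [[4, 4, 2, 8], [6, 5, 4, 0]],
]
parallax_map = [[winner_take_all(costs) for costs in row] for row in cost_volume]
```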

The retaining of parallax values according to the thresholds corresponding to the different layers is as follows:

the smaller the threshold, the larger the corresponding layer number; each layer retains the parallax values greater than its threshold.

The invention has the following beneficial effects and advantages:

1. Unlike other stereo matching algorithms, the invention proposes and adopts an image-pyramid matching strategy in the matching cost calculation: cost values are computed only for points that have not yet been matched, which greatly reduces the computation time on the largest image and improves the matching speed;

2. The invention proposes and uses an image-pyramid calculation strategy in the parallax calculation: near objects are computed on the small-scale left and right camera images. When distant objects are computed, it is first checked whether a point has already been matched at a smaller scale, and only unmatched points are computed. This reduces the computation time and also reduces interference from nearby objects during matching.

Drawings

FIG. 1 is a block diagram of a process flow;

FIG. 2 is a flowchart of an algorithm for matching cost calculation;

FIG. 3 is a flowchart of an algorithm for calculating parallax.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples.

In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, embodiments accompanying the drawings are described in detail below. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, but rather should be construed as modified in the spirit and scope of the present invention as set forth in the appended claims.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.

The present invention is described by taking the example of using three-level matching and performing two downsampling operations, and is not intended to limit the present invention.

As shown in FIG. 1, the pictures acquired by the left and right cameras are first down-sampled; because three-layer matching is used in this application, down-sampling is performed twice, and two parallax thresholds θ₁ and θ₂ are selected. The matching cost of each pixel is then calculated: each picture is encoded by the census transform, and matching costs are calculated at different scales according to distance. The matching costs are aggregated over a minimum spanning tree constructed as in NLCA, where the weight of each edge of the minimum spanning tree is defined at least by the distance between the matching cost vector features of the two pixels that the edge connects. Finally the parallax is calculated: the parallax is determined by selecting the parallax value of the point with the minimum matching cost, with different matching modes at different scales. For the nearest objects, points with a parallax greater than θ₂ are retained; the second layer retains points with a parallax greater than θ₁; and the last layer retains all remaining points.
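The tree construction mentioned above can be sketched with Kruskal's algorithm on a 4-connected pixel grid. This only builds the tree; the aggregation of costs along it is not shown. The L1 distance between per-pixel feature vectors is an assumed stand-in for the edge weight, and the names `build_mst` and `features` are hypothetical.

```python
def build_mst(features):
    """Kruskal's minimum spanning tree over a 4-connected pixel grid.

    `features[r][c]` is a per-pixel feature vector; the edge weight is the L1
    distance between the vectors of the two pixels (assumed metric).
    Returns a list of (weight, pixel_a, pixel_b) tree edges.
    """
    rows, cols = len(features), len(features[0])

    def weight(a, b):
        fa, fb = features[a[0]][a[1]], features[b[0]][b[1]]
        return sum(abs(u - v) for u, v in zip(fa, fb))

    edges = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                edges.append((weight((r, c), (r, c + 1)), (r, c), (r, c + 1)))
            if r + 1 < rows:
                edges.append((weight((r, c), (r + 1, c)), (r, c), (r + 1, c)))
    edges.sort()  # cheapest edges first

    parent = {(r, c): (r, c) for r in range(rows) for c in range(cols)}

    def find(node):
        # Union-find lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for w, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two components, so keep it
            parent[root_a] = root_b
            tree.append((w, a, b))
    return tree


features = [[[0], [1]],
            [[10], [11]]]
tree = build_mst(features)
total_weight = sum(w for w, _, _ in tree)
```

In this 2 × 2 example the two cheap horizontal edges are kept first and a single vertical edge connects the rows, so similar pixels stay close together in the tree.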

In the matching cost calculation stage, as shown in FIG. 2, the left and right camera pictures are first converted to grayscale. A point is then selected in the picture and an n × n rectangle centered on it is taken (a 9 × 9 rectangle is used in this application). Every point in the rectangle other than the center point is compared with the center point: a gray value smaller than that of the center point is recorded as 1, and a larger one as 0. The resulting census sequence of length (n × n − 1), which is 80 bits for the 9 × 9 window, represents the pixel value of the center point.

The points to be matched on the left and right images are then examined: a point that has already been matched does not need to be matched again, while for an unmatched point the similarity is calculated by comparing the Hamming distance between the left and right images:

$$C\big(x(i_l, j_l),\, y(i_r, j_r)\big) = \sum_{k=1}^{n} x_k \oplus y_k$$

where x and y are both n-bit encodings of the gray values of the left and right image pixel points, $x_k$ and $y_k$ are their k-th bits, ⊕ denotes exclusive OR, and C denotes the Hamming distance (namely the matching cost value); I denotes the left parallax image and (i, j) denotes a position in the image, each pixel point of I being the parallax value of that point. The smaller the Hamming distance, the higher the similarity.

In the parallax calculation stage, as shown in FIG. 3, it is first determined whether the point has already been matched; if so, no further calculation is needed. If not, the parallax value corresponding to the point with the minimum matching cost is selected, according to the following formula:

$$I(i, j) = d\big(C_{\min}(x(i_l, j_l),\, y(i_r, j_r))\big)$$

In the above formula, $d(C_{\min}(x(i_l, j_l), y(i_r, j_r)))$ represents the parallax value when the cost is minimum, I represents the left parallax image, (i, j) represents the position in the image, each pixel point of I is the parallax value of that point, and C represents the matching cost value.

After matching, a judgment is made as to whether each parallax value needs to be retained: for the nearest objects, points with a parallax greater than θ₂ are retained; the second layer retains points with a parallax greater than θ₁; and the last layer retains all remaining points.
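The layer-by-layer retention can be sketched as below. The source does not say how coordinates and parallax values are mapped between scales, so this sketch assumes every layer reports its parallax in full-resolution pixel coordinates and units. The per-layer estimators are hypothetical callables standing in for the real matching, and the name `fuse_layers` is also hypothetical.

```python
def fuse_layers(layer_estimators, thresholds, rows, cols):
    """Coarse-to-fine fusion: keep a parallax only if it exceeds the layer threshold.

    `layer_estimators` is ordered from the smallest image to the largest;
    `thresholds` holds one value per layer except the last, which keeps
    every point that is still unmatched.
    """
    result = [[None] * cols for _ in range(rows)]
    for layer, estimate in enumerate(layer_estimators):
        threshold = thresholds[layer] if layer < len(thresholds) else None
        for r in range(rows):
            for c in range(cols):
                if result[r][c] is not None:
                    continue  # already matched on a smaller image, skip it
                d = estimate(r, c)
                if threshold is None or d > threshold:
                    result[r][c] = d
    return result


# Toy per-layer results (assumed values), smallest image first.
small = [[120, 30], [60, 10]]
middle = [[118, 32], [62, 12]]
full = [[119, 31], [61, 11]]
estimators = [lambda r, c, m=m: m[r][c] for m in (small, middle, full)]

theta_2, theta_1 = 100, 50
fused = fuse_layers(estimators, [theta_2, theta_1], 2, 2)
```

Here the nearest point (parallax 120) is fixed on the smallest image, the point with parallax 62 on the middle image, and the two remaining points on the full-size image, giving `[[120, 31], [62, 11]]`.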

Claims

1. A high-speed stereo matching algorithm of an ultrahigh-resolution binocular image is characterized in that: the method comprises the following steps:

the calculation process of the matching cost comprises the following steps:

⊕ denotes exclusive OR, I denotes the left parallax image, (i, j) denotes the position in the image, each pixel point of I being the parallax value of that point, and C denotes the Hamming distance; the smaller the Hamming distance is, the smaller the matching cost is, and the higher the similarity is;

step 3: aggregating the matching costs over a minimum spanning tree constructed according to NLCA;

step 4: determining the parallax by selecting the parallax value corresponding to the point with the minimum matching cost, and retaining parallax values according to the thresholds corresponding to the different layers;

where x, y are both n-bit encodings of the gray values of the left and right image pixel points, $d(C_{\min}(x(i_l, j_l), y(i_r, j_r)))$ represents the parallax value when the cost is minimum, I represents the left parallax image, (i, j) represents the position in the image, each pixel point of I is the parallax value of that point, and C represents the matching cost value.

2. The high-speed stereo matching algorithm for the ultrahigh-resolution binocular image according to claim 1, wherein: the encoding process of each picture comprises the following steps:

3. The high-speed stereo matching algorithm for the ultrahigh-resolution binocular image according to claim 1, wherein: the setting of the threshold for each layer according to the number of downsampling and the parallax range includes:

the number of the threshold values is the number of down-sampling times, the number of the down-sampling times is less than the number of layers by 1, and the threshold value of each layer is selected according to the percentage of the parallax range.

4. The high-speed stereo matching algorithm for the ultrahigh-resolution binocular image according to claim 1, wherein: the threshold reserved parallax according to different layer numbers is as follows: