CN109887008B - Method, device and equipment for parallax stereo matching based on forward and backward smoothing and O (1) complexity - Google Patents

Method, device and equipment for parallax stereo matching based on forward and backward smoothing and O (1) complexity Download PDF

Info

Publication number
CN109887008B
CN109887008B CN201811016383.6A CN201811016383A CN109887008B CN 109887008 B CN109887008 B CN 109887008B CN 201811016383 A CN201811016383 A CN 201811016383A CN 109887008 B CN109887008 B CN 109887008B
Authority
CN
China
Prior art keywords
pixel point
value
pixel
confidence
parallax
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811016383.6A
Other languages
Chinese (zh)
Other versions
CN109887008A (en
Inventor
许金鑫
李庆武
罗颖
刘艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou Campus of Hohai University
Original Assignee
Changzhou Campus of Hohai University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changzhou Campus of Hohai University filed Critical Changzhou Campus of Hohai University
Priority to CN201811016383.6A priority Critical patent/CN109887008B/en
Publication of CN109887008A publication Critical patent/CN109887008A/en
Application granted granted Critical
Publication of CN109887008B publication Critical patent/CN109887008B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Abstract

The invention discloses aBased on the forward and backward smoothingO(1) The complexity parallax stereo matching method is used for respectively smoothing the left and right eye images in the forward and backward directions, and combining color and gradient information to construct a cost calculation function and calculate a cost value. In a cost aggregation stage, respectively constructing minimum spanning trees for the smoothed left and right eye images, performing cost aggregation on cost function values, obtaining initial parallax by adopting a WTA (WTA) strategy, judging stable and unstable points through left and right consistency detection and obtaining initial parallax confidence, filling holes in the unstable points to obtain an initial parallax image, obtaining a mixed weight value by combining color information of the left image and the initial parallax image, and performing confidence aggregation on the confidence value by adopting a horizontal tree structure on the basis of the initial parallax confidence and the mixed weight value; carrying out belief propagation on the belief aggregation value to obtain an optimal parallax estimation value so as to obtain a final dense parallax image; the invention effectively improves the accuracy and efficiency of stereo matching.

Description

Method, device and equipment for parallax stereo matching based on forward and backward smoothing and O (1) complexity
Technical Field
The invention belongs to the technical field of image processing, and relates to a parallax stereo matching method based on forward and backward smoothness and O (1) complexity.
Background
The stereo matching algorithm has a wide application range in computer vision, such as 3D reconstruction, image focusing, etc., but still has many challenging problems. The main work of stereo matching is to find corresponding image point pairs in an image, and the method comprises the following four steps: matching cost calculation, cost aggregation, disparity calculation and disparity refinement. Algorithms are generally classified into global and local algorithms.
The objective of the global algorithm is to minimize the energy function of the matching problem, which includes data items and smoothing items, and when there is a large difference in disparity values of neighboring nodes, the smoothing items serve as penalty factors. The global algorithm mainly comprises methods such as dynamic planning, belief propagation and graph cutting. The global method has better robustness in a non-texture area, is not easily influenced by noise, and obtains a disparity map more accurately. However, such methods are computationally complex and not suitable for real-time applications.
Compared with a global method, the local algorithm is sensitive to noise, has low accuracy, consumes less time and has high efficiency. The difficulty with local algorithms is the choice of cost functions and windows. The conventional method of calculating the cost function includes: mutual information, Absolute Difference (AD), Squared Difference (SD), Census transform, and the like. Common local windows include a cross window, an adaptive window, and the like.
The non-local stereo matching algorithm based on the Minimum Spanning Tree (MST) carries out cost aggregation in the whole image without the limitation of a window, all pixel points can support corresponding weights of other points, the accuracy is higher than that of a local algorithm, and the operation efficiency is higher than that of a global algorithm.
At present, many researches are based on two parts of matching cost calculation and cost aggregation, and the complexity is high in the parallax refinement stage, so that the method becomes a difficulty in a stereo matching algorithm. The traditional parallax refinement method comprises left-right consistency detection, hole filling and median filtering, and the steps are used for updating the cost value of unstable points and improving the matching accuracy. The complexity of any pixel point in the image of the dynamic programming algorithm is O (d), wherein d is a parallax range. In the MST-based parallax refinement process, the cost value of the unstable point is updated by transmitting the value of the stable point to the unstable point, and the complexity of any pixel point is O (d) and is higher. In addition, the matching accuracy is affected by a small amount of noise in the image.
Disclosure of Invention
The purpose of the invention is as follows: in order to solve the problems of the prior art, the invention discloses a parallax stereo matching method based on forward and backward smoothness and O (1) complexity, which reduces the computational complexity and effectively improves the accuracy and efficiency of matching.
The technical scheme of the invention is as follows:
a parallax stereo matching method based on forward and backward smoothing and O (1) complexity comprises the following steps:
(1) respectively carrying out forward and backward smoothing treatment on the left eye image and the right eye image;
(2) constructing a cost function based on the color and gradient information of the smoothed left eye image and the smoothed right eye image, and calculating a cost function value;
(3) constructing a minimum spanning tree for the smoothed left eye image and the smoothed right eye image, and performing cost aggregation on the cost function values to generate cost aggregation values;
(4) obtaining a disparity map by adopting a WTA strategy, judging stable points and unstable points through left-right consistency detection, obtaining initial disparity confidence, and filling holes in the unstable points to obtain an initial disparity map;
(5) combining the color information of the smoothed left eye image and the initial parallax image to obtain a mixed weight, and performing confidence aggregation on the initial parallax confidence by adopting a horizontal tree structure based on the initial parallax confidence and the mixed weight to obtain a confidence aggregation value;
(6) and (4) in the parallax value updating stage, performing belief propagation on the belief aggregation value according to the minimum spanning tree generated in the step (3) to obtain the optimal parallax estimation and obtain a dense parallax image.
The step (1) comprises the following steps:
the smoothing process of each pixel point in the left eye image and the right eye image is updated by scanning the pixel points on the horizontal tree structure, each pixel point is taken as a root node, the forward and backward smoothing is carried out by taking the RGB three-channel image as input, and the smoothing processing formula is as shown in formula (1):
Figure GDA0003746370400000021
Figure GDA0003746370400000022
representing the pixel value of the input image after the pixel points (u, v) under the i channel are smoothed;
wherein, I i (u, v) is the pixel value of the pixel point (u, v) of the input image under the i channel,
Figure GDA0003746370400000023
and (3) pixel value updating of pixel points (u, v) of the input image under the i channel in a forward or backward iteration mode:
Figure GDA0003746370400000031
r I i (u,v)=I i (u,v)-I i (u,v-r)
wherein the constant λ is used to adjust the smoothing speed + r I i (u, v) is the difference between a pixel point (u, v) of the input image under the i channel and an adjacent pixel point under the direction r, (u, v-r) is a pixel point which is previous to the pixel point (u, v) in the horizontal propagation direction, and f and b respectively represent the forward direction and the backward direction; ω is a constant.
In order to improve the efficiency of the algorithm operation, the forward and backward smoothing process comprises the following steps:
s1, passing from the leftmost end node to the rightmost end node of each line of the input image in turn, and storing the result of the forward smoothing in a plurality of groups
Figure GDA0003746370400000032
Performing the following steps;
s2, in reverse direction, sequentially transferring from the rightmost node to the leftmost node of each line of the input image, and storing the result of backward smoothing in the array
Figure GDA0003746370400000033
In (3), the smoothing result is obtained as formula (3):
Figure GDA0003746370400000034
Figure GDA0003746370400000035
representing the image matrix smoothed under the I-channel, I i Representing the original image under the i channel; equation (3) is a matrix form of data;
the forward and backward smoothing keeps the real depth edge information of the image while inhibiting the background noise, and the intensity value is updated through the forward and backward smoothing, so that the high texture area of the image is inhibited, and the final matching precision is improved.
The step (2) comprises the following steps:
(201) in order to avoid mismatching between pixels with the same gray level and different color information in the image, RGB three-channel information is adopted to replace single gray level information; setting any pixel point p in the left-eye image as (x, y), setting the corresponding parallax value of the pixel point p as (x, y) as d (the parallax map is a matrix, and the value of each element in the matrix is the parallax value, so that the parallax map and the parallax value are different in that one is the value of a specific point as a whole), and setting the corresponding matching point of the pixel point p in the right-eye image as pd as (x-d, y); color information C AD (p, d) and gradient information C Grad The expression of (p, d) is:
Figure GDA0003746370400000036
Figure GDA0003746370400000037
wherein, C AD (p, d) represents color information of the pixel point p when the parallax value is d, C Grad (p, d) represents gradient information of the pixel point p when the parallax value is d;
Figure GDA0003746370400000038
the pixel value of the pixel point p of the left eye image under the channel i,
Figure GDA0003746370400000039
representing the pixel value of a pixel point pd of a right eye image under an i channel;
Figure GDA0003746370400000041
and
Figure GDA0003746370400000042
representing the gradient in the x and y directions respectively under the i channel of the pixel point p of the left eye image,
Figure GDA0003746370400000043
and
Figure GDA0003746370400000044
respectively representing gradients in x and y directions of a pixel point pd of the right eye image under an i channel;
(202) the constructed cost function is as follows:
C(p,d)=w 1 C AD (p,d)+w 2 C Grad (p,d) (5)
wherein, w 1 、w 2 Weight, w, of color information and gradient information, respectively 1 +w 2 =1;
C (p, d) is a cost function of the pixel point p when the disparity value is d, and the cost function value is calculated based on the cost function.
The step (3) specifically comprises the following steps:
the cost aggregation value based on the minimum spanning tree is the sum of the multiplication of the cost function value and the corresponding weight value, and is represented by formula (6):
Figure GDA0003746370400000045
wherein, C d (q) a cost function value of a pixel point q when the parallax value is d, wherein q is any pixel point in the input image;
Figure GDA0003746370400000046
representing a cost Aggregation value (representing a symbol after Aggregation) of a pixel point P when the parallax value is d, and S (P, q) is a similarity function of the pixel point P and the pixel point q and represents the similarity between the pixel point P and the pixel point q;
Figure GDA0003746370400000047
d (p, q) represents the distance between a pixel point p and a pixel point q, and sigma is a constant and is used for adjusting the similarity between the two pixel points; in the non-texture region, the value of each pixel point is basically the same, the difference value of the color information is very small but not 0, which leads to the problem of small weight accumulation, i.e. many small edge weights are accumulated continuously along the aggregation path, and the aggregation becomes a high weight in the non-texture region, so in order to suppress the problem, the invention proposes an improved weight function, which is shown in formula (8):
Figure GDA0003746370400000048
m, n represents adjacent pixel points in the image;
Figure GDA0003746370400000049
for the largest pixel value in RGB three channels, w (m, n) is the weight of the adjacent pixel point, D (p, q) is the sum accumulated by the weight w (m, n) along the path, and the distance between the pixel point p and the pixel point q is the sum of the weights of the adjacent pixel points on the path.
The step (4) specifically comprises the following steps:
(401) obtaining a disparity map of the left eye image and a disparity map of the right eye image by adopting a WTA (WTA for winner take all);
(402) performing left-right consistency detection on the disparity map of the left eye image and the disparity map of the right eye image to divide pixel points into stable points and unstable points;
(403) the initial parallax confidence reflects the correct probability of the initial parallax value, if the pixel point and the pixels in the neighborhood of the pixel point have larger parallax confidence in the pixel point, and the parallax confidence value is set based on a stable point and an unstable point;
let B be the disparity confidence of the disparity map:
Figure GDA0003746370400000051
wherein, p is any pixel point in the input image (left eye image or right eye image), if p is a stable point, the probability is 1, otherwise, the probability of the pixel point being a correct parallax value is 0.1; b (p) represents the parallax confidence of a pixel point p in the initial parallax map;
(404) for unstable point to enterFilling the row holes: for an unstable point p (occlusion point), the first stable point (non-occlusion point) on the left and right sides is respectively searched in the horizontal direction, and is marked as p left 、p right The parallax value d (p) of the unstable point p is p left And p right The smaller of the median disparity values, i.e.
d(p)=min(d(p left ),d(p right )) (10)
After the filling of the holes is completed, an initial disparity map D is obtained init
The step (5) specifically comprises the following steps:
(501) establishing a new mixed weight function based on the initial disparity map and the smoothed left eye image, wherein the formula (11) is as follows:
Figure GDA0003746370400000052
w H (m, n) represents the mixing weight of the edge connecting the adjacent pixel m and pixel n, w H (n, m) represents the mixing weight of the edge connecting the adjacent pixel point n and the pixel point m, and the subscript H represents the mixing (hybrid), D init (m)、D init (n) refer to the initial disparity values, I, of pixel m and pixel n, respectively i (m)、I i (n) representing pixel values of the pixel point m and the pixel point n under an i channel;
the pixel point m and the pixel point n are two adjacent pixel points on the image, and alpha represents a weight value for balancing information of the initial disparity map and information of the image pixel point after smoothing (alpha is used for balancing the information of the initial disparity map and the information of the image pixel point after smoothing, and alpha is 0.5);
Figure GDA0003746370400000053
S H (p, q) denotes the mixture similarity function of p-and q-points, the subscript H denotes the mixture, D H (p, q) represents the weight w of the mixture from pixel point p to pixel point q H (m, n) distances accumulated along the path; sigma H Is a constant of the mixed similarity function and is used for adjusting the similarity between two pixel points.
(502) And performing confidence aggregation on the initial parallax confidence by adopting a horizontal tree structure, wherein the aggregation process is divided into a left-to-right process and a right-to-left process, and the confidence aggregation value of the aggregated pixel points is as follows:
Figure GDA0003746370400000061
Figure GDA0003746370400000062
Figure GDA0003746370400000063
wherein p is a pixel point in the image, the superscript LR represents the direction of aggregation from left to right, RL represents the direction of aggregation from right to left, pl represents the previous pixel point of the pixel point p, pr represents the next pixel point of the pixel point p, and S H (p, q) represents the mixed similarity between the adjacent pixel point p and the pixel point q;
Figure GDA0003746370400000064
representing the confidence aggregation value of the pixel point p accumulated from left to right based on the horizontal tree when the parallax value is d, B (p) is obtained in formula (9) and represents the parallax confidence value of the p point, B (pr) represents the parallax confidence aggregation value of the next pixel point of the pixel point, S H (p, pr) indicating the mixed similarity of the pixel point p and the next point pr,
Figure GDA0003746370400000065
and the average value of the confidence aggregate value accumulated from left to right, the confidence aggregate value accumulated from right to left and the parallax confidence value of the pixel point p is represented.
The step (6) specifically comprises the following steps:
(601) in the disparity updating phase, namely after confidence aggregation, according to the minimum spanning tree (weight is constructed by color information) established in the step (3), confidence propagation is carried out on the confidence aggregation value based on the minimum spanning tree, and the method comprises the following steps:
(6a) aggregation from leaf node to root node, i.e.:
Figure GDA0003746370400000066
wherein Ch (p) represents a child node of pixel point p,
Figure GDA0003746370400000071
the confidence aggregation value of the pixel point p is expressed as a value after the confidence propagation is carried out from the leaf node to the root node, so the confidence propagation value of the pixel point p comprises the confidence aggregation value of the pixel point p and the sum of the multiplication side weights of all subtrees from the pixel point p;
(6b) aggregation from the root node to the leaf nodes, i.e.:
Figure GDA0003746370400000072
wherein, Pr (p) represents the father node of the pixel point p;
Figure GDA0003746370400000073
performing belief propagation on the belief aggregation value of the pixel point p from the root node to the leaf node;
(602) for any pixel point q, S (p, q) represents the similarity of color information of points p and q in the minimum spanning tree, and the confidence aggregation value B A (q) is the area size similar in both color and disparity information in the neighborhood of q, so S (p, q) B A (q) means the probability that p and q have the same disparity; when in use
Figure GDA0003746370400000074
Then, the disparity value d (p) of p is the optimal disparity estimation of q points,
Figure GDA0003746370400000075
is the probability of the optimal disparity estimate, said probabilityThe rate is obtained by propagating confidence aggregation values of the minimum spanning tree, and the confidence propagation of the node p is defined as B Pro (p) optimal disparity estimation is defined as disparity propagation D Pro (p), for each node p:
Figure GDA0003746370400000076
Figure GDA0003746370400000077
the process is to find the optimal disparity estimation of the unstable point from the stable points, so as to update the disparity value of the unstable point and obtain the final dense disparity map.
A disparity stereo matching device based on forward and backward smoothing and O (1) complexity, comprising:
the device comprises a smoothing processing module, a cost function construction module, a cost aggregation module, a disparity map acquisition module, a confidence aggregation module and a confidence propagation module;
the smoothing module is used for respectively carrying out forward smoothing and backward smoothing on the left eye image and the right eye image;
the cost function construction module constructs a cost function based on the color and gradient information of the smoothed left eye image and the smoothed right eye image and calculates a cost function value;
the cost aggregation module constructs a minimum spanning tree for the smoothed left eye image and the smoothed right eye image, and carries out cost aggregation on the cost function value to generate a cost aggregation value;
the parallax image acquisition module obtains a parallax image by adopting a WTA strategy, judges stable points and unstable points through left-right consistency detection, obtains initial parallax confidence, and fills holes in the unstable points to obtain an initial parallax image;
the confidence aggregation module is combined with the color information of the smoothed left eye image and the initial parallax image to obtain a mixed weight, and based on the initial parallax confidence and the mixed weight, confidence aggregation is carried out on the initial parallax confidence by adopting a horizontal tree structure to obtain a confidence aggregation value;
and in the parallax value updating stage, the belief propagation module performs belief propagation on the belief aggregation value according to the minimum spanning tree to obtain the optimal parallax estimation and obtain a dense parallax image.
The smoothing module for smoothing specifically comprises the following steps:
the smoothing process of each pixel point in the left eye image and the right eye image is updated by scanning the pixel points on the horizontal tree structure, each pixel point is taken as a root node, the forward smoothing and the backward smoothing are input by taking the RGB three-channel image, and the smoothing processing formula is as formula (1):
Figure GDA0003746370400000081
Figure GDA0003746370400000082
representing the pixel value of the input image after the pixel points (u, v) under the i channel are smoothed;
wherein, I i (u, v) is the pixel value of the pixel point (u, v) of the input image under the i channel,
Figure GDA0003746370400000083
updating pixel values representing forward or backward iterations of pixel points (u, v) of the input image under the i channel:
Figure GDA0003746370400000084
r I i (u,v)=I i (u,v)-I i (u,v-r)
wherein the constant λ is used to adjust the smoothing speed + r I i (u, v) is the difference between a pixel (u, v) of the input image under the i channel and an adjacent pixel under the direction r, and (u, v-r) is the horizontal propagation directionThe previous pixel point of the upper pixel point (u, v), f and b represent the forward and backward directions, respectively; ω is a constant.
The processing process of the cost function construction module specifically comprises the following steps:
(201) in order to avoid mismatching between pixels with the same gray level and different color information in the image, RGB three-channel information is adopted to replace single gray level information; setting any pixel point p in the left-eye image as (x, y), setting the corresponding parallax value of the pixel point p as (x, y) as d (the parallax map is a matrix, and the value of each element in the matrix is the parallax value, so that the parallax map and the parallax value are different in that one is the value of a specific point as a whole), and setting the corresponding matching point of the pixel point p in the right-eye image as pd as (x-d, y); color information C AD (p, d) and gradient information C Grad The expression of (p, d) is formula (4):
Figure GDA0003746370400000091
Figure GDA0003746370400000092
wherein, C AD (p, d) represents color information of the pixel point p when the parallax value is d, C Grad (p, d) gradient information of the pixel point p when the parallax value is d;
Figure GDA0003746370400000093
the pixel value of the pixel point p of the left eye image under the channel i,
Figure GDA0003746370400000094
representing the pixel value of a pixel point pd of a right eye image under an i channel;
Figure GDA0003746370400000095
and
Figure GDA0003746370400000096
at x and i channels of pixel p representing the left eye image respectivelyThe gradient in the y-direction is,
Figure GDA0003746370400000097
and
Figure GDA0003746370400000098
respectively representing gradients in x and y directions of a pixel point pd of a right eye image under an i channel;
(202) the constructed cost function is as follows:
C(p,d)=w 1 C AD (p,d)+w 2 C Grad (p,d) (5)
wherein, w 1 、w 2 Weight, w, of color information and gradient information, respectively 1 +w 2 1 in this example w 1 =0.2;
C (p, d) is a cost function of the pixel point p when the parallax value is d, and a cost function value is calculated based on the cost function;
the processing process of the cost aggregation module specifically comprises the following steps:
the cost aggregation value based on the minimum spanning tree is the sum of the multiplication of the cost function value and the corresponding weight value, and is represented by formula (6):
Figure GDA0003746370400000099
wherein, C d (q) a cost function value of a pixel point q when the parallax value is d, wherein q is any pixel point in the input image;
Figure GDA00037463704000000910
representing a cost Aggregation value (representing an aggregated symbol) of a pixel point P when the parallax value is d, and S (P, q) is a similarity function of the pixel point P and the pixel point q and represents the similarity between the pixel point P and the pixel point q;
Figure GDA0003746370400000101
d (p, q) represents the distance between a pixel point p and a pixel point q, and sigma is a constant and is used for adjusting the similarity between the two pixel points; the invention provides an improved weight function, which is shown in an equation (8):
Figure GDA0003746370400000102
m, n represents adjacent pixel points in the image;
Figure GDA0003746370400000103
for the maximum pixel value in RGB three channels, w (m, n) is the weight of the adjacent pixel point, D (p, q) is the sum accumulated by the weight w (m, n) along the path, and the distance between the pixel point p and the pixel point q is the sum of the weights of the adjacent pixel points on the path;
the processing process of the disparity map acquisition module specifically comprises the following steps:
(401) obtaining a disparity map of the left eye image and a disparity map of the right eye image by adopting a WTA (WTA for winner take all);
(402) performing left-right consistency detection on the disparity map of the left eye image and the disparity map of the right eye image, and dividing pixel points into stable points and unstable points;
(403) the initial parallax confidence reflects the correct probability of the initial parallax value, if the pixel point and the pixels in the neighborhood of the pixel point have larger parallax confidence in the pixel point, and the parallax confidence value is set based on a stable point and an unstable point;
let B be the disparity confidence of the disparity map, i.e.:
Figure GDA0003746370400000104
wherein, p is any pixel point in the input image (left eye image or right eye image), if p is a stable point, the probability is 1, otherwise, the probability of the pixel point being a correct parallax value is 0.1; b (p) represents the parallax confidence of a pixel point p in the initial parallax map;
(404) filling holes in unstable points: for the unstable point p (shielded point), respectively in waterLooking for the first stable point (non-occlusion point) on the left and right sides in the direction of the square, and recording as p left 、p right The value of the disparity at the unstable point p is p left And p right The smaller of the median disparity values, i.e.
d(p)=min(d(p left ),d(p right )) (10)
After the filling of the holes is completed, an initial disparity map D is obtained init
The confidence aggregation module processing process specifically comprises the following steps:
(501) establishing a new mixed weight function w based on the initial disparity map and the smoothed left eye image H (m, n) represented by formula (11):
Figure GDA0003746370400000111
w H (m, n) represents the mixing weight of the edge connecting the adjacent pixel m and pixel n, w H (n, m) represents the blending weight of the edge connecting the adjacent pixel n and pixel m, subscript H represents blending (hybrid), D init (m)、D init (n) refer to the initial disparity values, I, of pixel m and pixel n, respectively i (m)、I i (n) representing pixel values of the pixel point m and the pixel point n under the channel i;
the pixel point m and the pixel point n are two adjacent pixel points on the image, and alpha represents the weight value for balancing the information of the initial disparity map and the information of the smoothed image pixel point;
Figure GDA0003746370400000112
S H (p, q) denotes the mixture similarity function of p-and q-points, the subscript H denotes the mixture, D H (p, q) represents the weight w of the mixture from pixel point p to pixel point q H (m, n) distances accumulated along the path; sigma H The constant is a constant of a mixed similarity function and is used for adjusting the similarity between two pixel points;
(502) performing confidence aggregation on the initial parallax confidence by adopting a horizontal tree structure, wherein the aggregation process is divided into a left-to-right process and a right-to-left process, and the confidence aggregation values of the aggregated pixel points are as follows:
Figure GDA0003746370400000113
Figure GDA0003746370400000114
Figure GDA0003746370400000115
wherein p is a pixel point in the image, the superscript LR represents the direction of aggregation from left to right, RL represents the direction of aggregation from right to left, pl represents the previous pixel point of the pixel point p, pr represents the next pixel point of the pixel point p, and S H (p, q) represents the mixed similarity between adjacent pixel points.
Figure GDA0003746370400000116
Representing the confidence aggregation value of the pixel point p accumulated from left to right based on the horizontal tree when the parallax value is d, B (p) is obtained in formula (9) and represents the parallax confidence value of the pixel point p, B (pr) represents the parallax confidence aggregation value of the next pixel point of the pixel point, S H (p, pr) indicating the mixed similarity of the pixel point p and the next point pr,
Figure GDA0003746370400000117
representing confidence aggregate values accumulated from left to right, confidence aggregate values accumulated from right to left, pixel point p p Average of the three disparity confidence values.
The belief propagation module processing process specifically comprises the following steps:
(601) in the disparity updating phase, namely after confidence aggregation, according to the minimum spanning tree (weight is constructed by color information) established in the step (3), confidence propagation is carried out on the confidence aggregation value based on the minimum spanning tree, and the method comprises the following steps:
(6a) aggregation from leaf node to root node, i.e.:
Figure GDA0003746370400000121
wherein Ch (p) represents the child node of the pixel point p,
Figure GDA0003746370400000122
the confidence aggregation value of the pixel point p is expressed as a value after the confidence propagation is carried out from the leaf node to the root node, so the confidence propagation value of the pixel point p comprises the confidence aggregation value of the pixel point p and the sum of the multiplication side weights of all subtrees from the pixel point p;
(6b) from the root node to the leaf nodes, the aggregation is:
Figure GDA0003746370400000123
wherein, Pr (p) represents the father node of the pixel point p;
Figure GDA0003746370400000124
performing belief propagation on the belief aggregation value of the pixel point p from the root node to the leaf node;
(602) for any pixel point q, S (p, q) represents the similarity of color information of points p and q in the minimum spanning tree, and the confidence aggregation value B A (q) is the area size similar in both color and disparity information in the neighborhood of q, so S (p, q) B A (q) means the probability that p and q have the same disparity; when in use
Figure GDA0003746370400000125
Then, the parallax value d (p) of p is the optimal parallax estimation of q point,
Figure GDA0003746370400000126
is the probability of the optimal disparity estimation, which is obtained by propagating the confidence aggregation value of the minimum spanning tree, so it is defined as the confidence propagationIs B Pro The optimal disparity estimate is defined as the disparity propagation D Pro (p), for each node p:
Figure GDA0003746370400000127
Figure GDA0003746370400000128
the process is to find the optimal disparity estimation of the unstable point from the stable points, so as to update the disparity value of the unstable point and obtain the final dense disparity map.
A computing device comprising one or more processors, memory, and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing a front-to-back smoothing and O (1) complexity disparity stereo matching method.
The beneficial effects of the invention include:
the invention discloses a stereo matching method, which comprises the steps of firstly, respectively smoothing left and right eye images in the front-back direction, combining color and gradient information to construct a cost calculation function and calculate a cost value, then respectively constructing a minimum spanning tree for the smoothed left and right eye images, and carrying out cost aggregation on cost function values. And obtaining an initial parallax by adopting a WTA strategy, judging stable and unstable points through left and right consistency detection, obtaining initial parallax confidence, and filling holes in the unstable points to obtain an initial parallax image. And then combining the mixed weight values, and performing confidence aggregation on the confidence values by adopting a horizontal tree structure. And finally, carrying out belief propagation on the belief aggregation value according to the minimum spanning tree constructed in the cost aggregation stage to obtain an optimal parallax estimation value, thereby obtaining a dense parallax image. The method overcomes the defects of low matching precision and high calculation complexity in a parallax refinement stage caused by the existence of noise in the existing binocular stereo matching technology;
the invention respectively carries out forward and backward smoothing treatment on the left and right images in the preprocessing stage, can remove the noise in the original image, simultaneously keeps the information of the image edge and effectively improves the accuracy of the parallax image. In the cost aggregation stage, a non-local method based on minimum spanning tree aggregation is adopted, the accuracy is higher than that of a local algorithm, and the operation efficiency is higher than that of a global algorithm. In the parallax refinement stage, for any pixel point, the calculation complexity is O (1), so that the complexity is greatly reduced, and particularly for high-resolution images, the matching efficiency is effectively improved.
Drawings
The invention is further explained below with reference to the figures and examples;
FIG. 1 is a flow chart of the stereo matching method based on forward and backward smoothing and O (1) complexity parallax according to the present invention;
FIG. 2 is a root node p based smoothing process of the present invention;
FIG. 3a is a bottom-up traversal process of the cost aggregation process of the present invention based on a minimum spanning tree;
FIG. 3b is a top-down traversal process of the cost aggregation process based on the minimum spanning tree according to the present invention;
FIG. 4 is a confidence aggregation process based on a horizontal tree structure in accordance with the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following embodiments, which are illustrative only and not limiting, and the scope of the present invention is not limited thereby. In order to achieve the objectives and effects of the technical means, creation features, working procedures and using methods of the present invention, and to make the evaluation methods easy to understand, the present invention will be further described with reference to the following embodiments.
As shown in fig. 1, a disparity stereo matching method based on forward and backward smoothing and O (1) complexity includes the following steps:
(1) respectively carrying out forward and backward smoothing treatment on the left eye image and the right eye image;
(2) constructing a cost function based on the color and gradient information of the smoothed left eye image and the smoothed right eye image and calculating a cost function value;
(3) constructing a minimum spanning tree for the smoothed left eye image and the smoothed right eye image, and performing cost aggregation on the cost function values to generate cost aggregation values;
(4) obtaining a disparity map by adopting a WTA strategy, judging a stable point and an unstable point through left-right consistency detection, obtaining initial disparity confidence, and filling holes in the unstable point to obtain an initial disparity map;
(5) combining the color information of the smoothed left eye image and the initial parallax image to obtain a mixed weight (the mixed weight passes through a formula 11), and performing confidence aggregation on the initial parallax confidence by adopting a horizontal tree structure based on the initial parallax confidence and the mixed weight to obtain a confidence aggregation value;
(6) and (4) in the parallax value updating stage, performing belief propagation on the belief aggregation value according to the minimum spanning tree generated in the step (3) to obtain the optimal parallax estimation and obtain a dense parallax image.
The step (1) comprises the following steps:
as shown in fig. 2, the smoothing process of each pixel in the left eye image and the right eye image is updated by scanning the pixels on the horizontal tree structure, each pixel is taken as a root node, the forward and backward smoothing is input by using the RGB three-channel image, and the smoothing formula is formula (1):
Figure GDA0003746370400000151
Figure GDA0003746370400000152
representing the pixel value of the input image after pixel points (u, v) under an i channel are smoothed;
wherein, I i (u, v) is a pixel value of a pixel point (u, v) of the input image under the i-channel,
Figure GDA0003746370400000153
and (3) updating pixel values representing forward or backward iteration of pixel points (u, v) of the input image under the i channel:
Figure GDA0003746370400000154
Figure GDA0003746370400000155
where the constant λ is used to adjust the smoothing speed, λ ═ 0.2 is set in the present invention r I i (u, v) is the difference between a pixel point (u, v) of the input image under the i channel and an adjacent pixel point under the direction r, (u, v-r) is a pixel point which is previous to the pixel point (u, v) in the horizontal propagation direction, and f and b respectively represent the forward direction and the backward direction; ω is a constant and can be a fixed value, or based on noise estimation, ω is set to 0.1; when there is a large difference between adjacent pixel points, especially in a high texture region, the value of the exponential term is small, so that the contribution between pixels is small, and the depth information of the image edge can be effectively maintained.
In order to improve the efficiency of the algorithm operation, the forward and backward smoothing process comprises the following steps:
s1, sequentially transferring from the leftmost node to the rightmost node of each line of the input image, and storing the result of forward smoothing in the array
Figure GDA0003746370400000157
Performing the following steps;
s2, in reverse direction, sequentially transferring from the rightmost node to the leftmost node of each line of the input image, and storing the result of backward smoothing in the array
Figure GDA0003746370400000158
In (3), the smoothing result is obtained as formula (3):
Figure GDA0003746370400000159
Figure GDA00037463704000001510
representing the image matrix smoothed under the I-channel, I i Representing the original image under the i channel; equation (3) is a matrix form of data;
the forward and backward smoothing keeps the real depth edge information of the image while inhibiting the background noise, and the intensity value is updated through the forward and backward smoothing, so that the high texture area of the image is inhibited, and the final matching precision is improved.
The step (2) comprises the following steps:
(201) in order to avoid mismatching between pixels with the same gray level and different color information in the image, RGB three-channel information is adopted to replace single gray level information; setting any pixel point p in the left-eye image as (x, y), setting the corresponding parallax value of the pixel point p as (x, y) as d (the parallax map is a matrix, and the value of each element in the matrix is the parallax value, so that the parallax map and the parallax value are different in that one is the value of a specific point as a whole), and setting the corresponding matching point of the pixel point p in the right-eye image as pd as (x-d, y); color information C AD (p, d) and gradient information C Grad The expression of (p, d) is:
Figure GDA0003746370400000161
Figure GDA0003746370400000162
wherein, C AD (p, d) represents color information of the pixel point p when the parallax value is d, C Grad (p, d) represents gradient information of the pixel point p when the parallax value is d;
Figure GDA0003746370400000163
the pixel value of the pixel point p of the left eye image under the channel i,
Figure GDA0003746370400000164
representing the pixel value of a pixel point pd of the right eye image under an i channel;
Figure GDA0003746370400000165
and
Figure GDA0003746370400000166
representing the gradient in the x and y directions respectively under the i channel of the pixel point p of the left eye image,
Figure GDA0003746370400000167
and
Figure GDA0003746370400000168
respectively representing gradients in x and y directions of a pixel point pd of the right eye image under an i channel;
(202) the constructed cost function is as follows:
C(p,d)=w 1 C AD (p,d)+w 2 C Grad (p,d) (5)
wherein, w 1 、w 2 Weights, w, of colour information and gradient information, respectively 1 +w 2 1, w in this embodiment 1 =0.2;
C (p, d) is a cost function of the pixel point p when the disparity value is d, and the cost function value is calculated based on the cost function.
The step (3) specifically comprises the following steps:
the cost aggregation value based on the minimum spanning tree is the sum of the multiplication of the cost function value and the corresponding weight value, and is represented by formula (6):
Figure GDA0003746370400000169
wherein, C d (q) a cost function value of a pixel point q when the parallax value is d, wherein q is any pixel point in the input image;
Figure GDA00037463704000001610
representing a cost Aggregation value (representing an aggregated symbol) of a pixel point P when the parallax value is d, and S (P, q) is a similarity function of the pixel point P and the pixel point q and represents the similarity between the pixel point P and the pixel point q;
Figure GDA00037463704000001611
d (p, q) represents the distance between a pixel point p and a pixel point q, and sigma is a constant and is used for adjusting the similarity between the two pixel points; in the non-texture region, the value of each pixel point is basically the same, the difference value of the color information is very small but not 0, which leads to the problem of small weight accumulation, i.e. many small edge weights are accumulated continuously along the aggregation path, and the aggregation becomes a high weight in the non-texture region, so in order to suppress the problem, the invention proposes an improved weight function, which is shown in formula (8):
Figure GDA0003746370400000171
m, n represents adjacent pixel points in the image;
Figure GDA0003746370400000172
for the largest pixel value in the three channels of RGB, w (m, n) is the weight of the adjacent pixel, D (p, q) in formula (7) is the sum accumulated by the weights w (m, n) along the path, and the distance between the pixel p and the pixel q is the sum of the weights of the adjacent pixels on the path.
The step (4) specifically comprises the following steps:
(401) obtaining a disparity map of the left eye image and a disparity map of the right eye image by adopting a WTA (WTA for winner take all);
(402) performing left-right consistency detection on the disparity map of the left eye image and the disparity map of the right eye image to divide pixel points into stable points and unstable points;
if the parallax value d of the pixel point p in the left eye image L (p) is equal to the disparity value d of the corresponding right eye image R (p-d L (p)), i.e. d L (p)=d R (p-d L (p)), p is considered to be a stable point, otherwise, it is considered to be an unstable point.
(403) The initial parallax confidence reflects the correct probability of the initial parallax value, if the parallax value and the color information of the pixels in the neighborhood of the pixel point are obtained, the pixel point has larger parallax confidence, and the parallax confidence value is set based on a stable point and an unstable point;
let B be the disparity confidence of the disparity map:
Figure GDA0003746370400000173
wherein, p is any pixel point in the input image (left eye image or right eye image), if p is a stable point, the probability is 1, otherwise, the probability of the pixel point being a correct parallax value is 0.1; b (p) represents the parallax confidence of a pixel point p in the initial parallax map;
(404) filling holes in unstable points: for an unstable point p (occlusion point), the first stable point (non-occlusion point) on the left and right sides is respectively searched in the horizontal direction, and is marked as p left 、p right The parallax value d (p) of the unstable point p is p left And p right The smaller of the median disparity values, i.e.
d(p)=min(d(p left ),d(p right )) (10)
After the filling of the holes is completed, an initial disparity map D is obtained init
The step (5) specifically comprises the following steps:
(501) establishing a new mixing weight function based on the initial disparity map and the smoothed left eye image, wherein the formula (11) is as follows:
Figure GDA0003746370400000181
w H (m, n) represents the mixing weight of the edge connecting the adjacent pixel m and pixel n, w H (n, m) represents the blending weight of the edge connecting the adjacent pixel point n and the pixel point mThe value, subscript H represents hybrid, D init (m)、D init (n) refer to the initial disparity values, I, of pixel m and pixel n, respectively i (m)、I i (n) representing pixel values of the pixel point m and the pixel point n under an i channel;
the pixel point m and the pixel point n are two adjacent pixel points on the image, and alpha represents a weight value for balancing information of the initial disparity map and information of the image pixel point after smoothing (alpha is used for balancing the information of the initial disparity map and the information of the image pixel point after smoothing, and alpha is 0.5);
Figure GDA0003746370400000182
S H (p, q) denotes the mixture similarity function of p-and q-points, the subscript H denotes the mixture, D H (p, q) represents the weight w of the mixture from pixel point p to pixel point q H (m, n) distances accumulated along the path; sigma H Is a constant of a mixed similarity function and is used for adjusting the similarity between two pixel points
(502) And performing confidence aggregation on the initial parallax confidence by adopting a horizontal tree structure, wherein the aggregation process is divided into a left-to-right process and a right-to-left process, and the confidence aggregation value of the aggregated pixel points is as follows:
Figure GDA0003746370400000183
Figure GDA0003746370400000184
Figure GDA0003746370400000185
wherein p is a pixel point in the image, the superscript LR represents the direction of aggregation from left to right, RL represents the direction of aggregation from right to left, pl represents the previous pixel point of the pixel point p, pr represents the next pixel point of the pixel point p, and S H (p, q) tableIndicating the mixed similarity between the adjacent pixel point p and the pixel point q;
Figure GDA0003746370400000186
representing the confidence aggregation value of the pixel point p accumulated from left to right based on the horizontal tree when the parallax value is d, B (p) is obtained in formula (9) and represents the parallax confidence aggregation value obtained by the pixel point according to the mixed weight value, B A (pr) a subsequent pixel disparity confidence cluster value, S, representing a pixel H (p, pr) indicating the mixed similarity of the pixel point p and the next point pr,
Figure GDA0003746370400000191
and the average value of the confidence aggregation value accumulated from left to right, the confidence aggregation value accumulated from right to left and the parallax confidence value of the pixel point p is represented.
The step (6) specifically comprises the following steps:
(601) in the disparity updating phase, namely after confidence aggregation, according to the minimum spanning tree (weight is constructed by color information) established in the step (3), confidence propagation is carried out on the confidence aggregation value based on the minimum spanning tree, and the method comprises the following steps:
(6a) as shown in fig. 3a, aggregation is from leaf nodes to root nodes, i.e.:
Figure GDA0003746370400000192
where Ch (p) represents the child node of pixel p,
Figure GDA0003746370400000193
the confidence aggregation value of the pixel point p is expressed as a value after the confidence propagation is carried out from the leaf node to the root node, so the confidence propagation value of the pixel point p comprises the confidence aggregation value of the pixel point p and the sum of the multiplication side weights of all subtrees from the pixel point p;
(6b) as shown in fig. 3b, aggregation is from the root node to the leaf nodes, i.e.:
Figure GDA0003746370400000194
wherein, Pr (p) represents the father node of the pixel point p;
Figure GDA0003746370400000195
the confidence aggregation value of the pixel point p is subjected to confidence propagation from the root node to the leaf node;
(602) as shown in FIG. 4, for any pixel point q, S (p, q) represents the color information similarity of points p and q in the minimum spanning tree, and the confidence aggregation value B A Since (q) is a region size similar to each other in both color and parallax information in the neighborhood of q, S (p, q) B A (q) means the probability that p and q have the same disparity; when the temperature is higher than the set temperature
Figure GDA0003746370400000196
Then, the disparity value of p is the optimal disparity estimation of q points,
Figure GDA0003746370400000197
is the probability of the optimal disparity estimation, the probability is obtained by propagating the confidence aggregation value of the minimum spanning tree, and the confidence propagation of the node p is B Pro (p) optimal disparity estimation is defined as disparity propagation D Pro (p), for each node p:
Figure GDA0003746370400000198
Figure GDA0003746370400000201
the process is to find the optimal disparity estimation of the unstable point from the stable points, so as to update the disparity value of the unstable point and obtain the final dense disparity map.
A disparity stereo matching device based on forward and backward smoothing and O (1) complexity, comprising: the device comprises a smoothing processing module, a cost function construction module, a cost aggregation module, a disparity map acquisition module, a confidence aggregation module and a confidence propagation module;
the smoothing module is used for respectively carrying out forward smoothing and backward smoothing on the left eye image and the right eye image; the cost function construction module constructs a cost function based on the color and gradient information of the smoothed left eye image and the smoothed right eye image and calculates a cost function value; the cost aggregation module constructs a minimum spanning tree for the smoothed left eye image and the smoothed right eye image, and carries out cost aggregation on the cost function value to generate a cost aggregation value; the parallax image acquisition module obtains a parallax image by adopting a WTA strategy, judges stable points and unstable points through left-right consistency detection, obtains initial parallax confidence, and fills holes in the unstable points to obtain an initial parallax image; the confidence aggregation module is combined with the color information of the smoothed left eye image and the initial parallax image to obtain a mixed weight, and based on the initial parallax confidence and the mixed weight, confidence aggregation is carried out on the initial parallax confidence by adopting a horizontal tree structure to obtain a confidence aggregation value; and in the parallax value updating stage, the belief propagation module performs belief propagation on the belief aggregation value according to the minimum spanning tree to obtain the optimal parallax estimation and obtain a dense parallax image.
The smoothing module for smoothing specifically comprises the following steps:
the smoothing process of each pixel point in the left eye image and the right eye image is updated by scanning the pixel points on the horizontal tree structure, each pixel point is taken as a root node, the forward smoothing and the backward smoothing are input by taking the RGB three-channel image, and the smoothing processing formula is as formula (1):
Figure GDA0003746370400000202
Figure GDA0003746370400000203
representing the pixel value of the input image after the pixel points (u, v) under the i channel are smoothed;
wherein,I i (u, v) is the pixel value of the pixel point (u, v) of the input image under the i channel,
Figure GDA0003746370400000204
updating pixel values representing forward or backward iterations of pixel points (u, v) of the input image under the i channel:
Figure GDA0003746370400000211
r I i (u,v)=I i (u,v)-I i (u,v-r)
where a constant λ is used to adjust the smoothing speed, λ ═ 0.2 is set in the present invention r I i (u, v) is the difference between a pixel point (u, v) of the input image under the i channel and an adjacent pixel point under the direction r, (u, v-r) is a pixel point which is previous to the pixel point (u, v) in the horizontal propagation direction, and f and b respectively represent the forward direction and the backward direction; ω is a constant value, which can be a fixed value, or based on noise estimation, ω is set to 0.1; when there is a great difference between adjacent pixels, especially in a high texture region, the value of the exponential term is small, so the contribution between pixels is also small, and the depth information of the image edge can be effectively maintained.
Forward and backward smoothing keeps real depth edge information of the image while inhibiting background noise, and updates the intensity value through the forward and backward smoothing, so that a high-texture area of the image is inhibited, and the accuracy of final matching is improved;
the processing process of the cost function construction module specifically comprises the following steps:
(201) in order to avoid mismatching between pixels with the same gray level and different color information in the image, RGB three-channel information is adopted to replace single gray level information; if any pixel point p in the left eye image is equal to (x, y), the disparity value corresponding to the pixel point p is equal to (x, y) is d (the disparity map is a matrix, and the value of each element in the matrix is the disparity value, so the disparity map and the disparity value are different in that one is the value of a specific point as a whole), and the pixel point p is corresponding to the right eye imageThe coordination point is pd ═ (x-d, y); color information C AD (p, d) and gradient information C Grad The expression of (p, d) is formula (4):
Figure GDA0003746370400000212
Figure GDA0003746370400000213
wherein, C AD (p, d) represents color information of the pixel point p when the parallax value is d, C Grad (p, d) represents gradient information of the pixel point p when the parallax value is d;
Figure GDA0003746370400000214
the pixel value of the pixel point p of the left eye image under the channel i,
Figure GDA0003746370400000215
representing the pixel value of a pixel point pd of a right eye image under an i channel;
Figure GDA0003746370400000216
and
Figure GDA0003746370400000217
representing the gradient in the x and y directions respectively under the i channel of the pixel point p of the left eye image,
Figure GDA0003746370400000218
and
Figure GDA0003746370400000219
respectively representing gradients in x and y directions of a pixel point pd of a right eye image under an i channel;
(202) the constructed cost function is as follows:
C(p,d)=w 1 C AD (p,d)+w 2 C Grad (p,d) (5)
wherein, w 1 、w 2 Respectively color information and gradient informationWeight of, w 1 +w 2 1 in this example w 1 =0.2;
C (p, d) is a cost function of the pixel point p when the parallax value is d, and a cost function value is calculated based on the cost function;
the processing process of the cost aggregation module specifically comprises the following steps:
the cost aggregation value based on the minimum spanning tree is the sum of the multiplication of the cost function value and the corresponding weight value, and is represented by formula (6):
Figure GDA0003746370400000221
wherein, C d (q) a cost function value of a pixel point q when the parallax value is d, wherein q is any pixel point in the input image;
Figure GDA0003746370400000222
representing a cost Aggregation value (representing a symbol after Aggregation) of a pixel point P when the parallax value is d, and S (P, q) is a similarity function of the pixel point P and the pixel point q and represents the similarity between the pixel point P and the pixel point q;
Figure GDA0003746370400000223
d (p, q) represents the distance between a pixel point p and a pixel point q, and sigma is a constant and is used for adjusting the similarity between the two pixel points; in the non-texture region, the value of each pixel point is basically the same, the difference value of the color information is very small but not 0, which leads to the problem of small weight accumulation, i.e. many small edge weights are accumulated continuously along the aggregation path, and the aggregation becomes a high weight in the non-texture region, so in order to suppress the problem, the invention proposes an improved weight function, which is shown in formula (8):
Figure GDA0003746370400000224
m, n represent adjacent pixels in the imagePoint;
Figure GDA0003746370400000225
for the maximum pixel value in RGB three channels, w (m, n) is the weight of the adjacent pixel point, D (p, q) in formula (7) is the sum accumulated by the weight w (m, n) along the path, and the distance between the pixel point p and the pixel point q is the sum of the weights of the adjacent pixel points on the path;
the processing process of the disparity map acquisition module specifically comprises the following steps:
(401) obtaining a disparity map of the left eye image and a disparity map of the right eye image by adopting a WTA (WTA for winner take all);
(402) performing left-right consistency detection on the disparity map of the left eye image and the disparity map of the right eye image, and dividing pixel points into stable points and unstable points;
if the parallax value d of the pixel point p in the left eye image L (p) is equal to the disparity value d of the corresponding right eye image R (p-d L (p)), i.e. d L (p)=d R (p-d L (p)), then p is considered to be a stable point, otherwise, it is considered to be an unstable point.
(403) The initial parallax confidence reflects the correct probability of the initial parallax value, if the pixel point and the pixels in the neighborhood of the pixel point have larger parallax confidence in the pixel point, and the parallax confidence value is set based on a stable point and an unstable point;
let B be the disparity confidence of the disparity map, i.e.:
Figure GDA0003746370400000231
wherein, p is any pixel point in the input image (left eye image or right eye image), if p is a stable point, the probability is 1, otherwise, the probability of the pixel point being a correct parallax value is 0.1; b (p) represents the parallax confidence of a pixel point p in the initial parallax map;
(404) filling holes in unstable points: for an unstable point p (occlusion point), the first stable point (non-occlusion point) on the left and right sides is respectively searched in the horizontal direction, and is marked as p left 、p right The value of the disparity at the unstable point p is p left And p right The smaller of the median disparity values, i.e.
d(p)=min(d(p left ),d(p right )) (10)
After the filling of the holes is completed, an initial disparity map D is obtained init
The processing process of the confidence aggregation module specifically comprises the following steps:
(501) establishing a new mixed weight function w based on the initial disparity map and the smoothed left eye image H (m, n) represented by formula (11):
Figure GDA0003746370400000232
w H (m, n) represents the mixing weight of the edge connecting the adjacent pixel m and pixel n, w H (n, m) represents the blending weight of the edge connecting the adjacent pixel n and pixel m, subscript H represents blending (hybrid), D init (m)、D init (n) refer to the initial disparity values, I, of pixel m and pixel n, respectively i (m)、I i (n) representing pixel values of the pixel point m and the pixel point n under the channel i;
pixel m and pixel n are two adjacent pixels on the image, and α represents a weight value for balancing information of the initial disparity map and information of the image pixel after smoothing (α is used for balancing information of the initial disparity map and information of the image pixel after smoothing, and α is equal to 0.5);
Figure GDA0003746370400000233
S H (p, q) denotes the mixture similarity function of p-and q-points, the subscript H denotes the mixture, D H (p, q) represents that the weight w is mixed from pixel point p to pixel point q H (m, n) cumulative distances along the path; sigma H The constant is a constant of a mixed similarity function and is used for adjusting the similarity between two pixel points;
(502) and performing confidence aggregation on the initial parallax confidence by adopting a horizontal tree structure, wherein the aggregation process is divided into a left-to-right process and a right-to-left process, and the confidence aggregation value of the aggregated pixel points is as follows:
Figure GDA0003746370400000241
Figure GDA0003746370400000242
Figure GDA0003746370400000243
the belief propagation module processing process specifically comprises the following steps:
(601) in the disparity updating phase, namely after confidence aggregation, according to the minimum spanning tree (weight is constructed by color information) established in the step (3), confidence propagation is carried out on the confidence aggregation value based on the minimum spanning tree, and the method comprises the following steps:
6a) as shown in fig. 3a, aggregation is from leaf nodes to root nodes, i.e.:
Figure GDA0003746370400000244
wherein Ch (p) represents the child node of the pixel point p,
Figure GDA0003746370400000245
the confidence aggregation value of the pixel point p is expressed as a value after the confidence propagation is carried out from the leaf node to the root node, so the confidence propagation value of the pixel point p comprises the confidence aggregation value of the pixel point p and the sum of the multiplication side weights of all subtrees from the pixel point p;
(6b) as shown in fig. 3b, aggregation is from the root node to the leaf nodes, i.e.:
Figure GDA0003746370400000246
wherein, Pr (p) represents the father node of the pixel point p;
Figure GDA0003746370400000247
the confidence aggregation value of the pixel point p is subjected to confidence propagation from the root node to the leaf node;
(602) for any pixel point q, S (p, q) represents the similarity of color information of points p and q in the minimum spanning tree, as shown in fig. 4, the confidence aggregation value B A (q) is the area size similar in both color and disparity information in the neighborhood of q, so S (p, q) B A (q) means the probability that p and q have the same disparity; when the temperature is higher than the set temperature
Figure GDA0003746370400000251
Then, the disparity value d (p) of p is the optimal disparity estimation of q points,
Figure GDA0003746370400000252
is the probability of the optimal disparity estimation, which is obtained by propagating the confidence aggregation value of the minimum spanning tree, and the confidence propagation of the node p is B Pro (p) optimal disparity estimation is defined as disparity propagation D Pro (p), for each node p:
Figure GDA0003746370400000253
Figure GDA0003746370400000254
the process is to find the optimal disparity estimation of the unstable point from the stable points, thereby updating the disparity value of the unstable point and obtaining the final dense disparity map.
A computing device, comprising:
one or more processors, memory, and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing a method based on forward-backward smoothing and O (1) complexity disparity stereo matching.
For the non-local stereo matching algorithm based on the minimum spanning tree, in the parallax refinement stage, a new cost value is obtained through left-right consistency detection, then cost aggregation is carried out in each parallax range by adopting a method based on the minimum spanning tree, the parallax value of a stable point is transmitted to an unstable point, the effect of updating the parallax value is achieved, and then a Winner Take All (WTA) strategy is adopted to calculate the final parallax. In the whole parallax refinement process, 2 times of addition operation and 3 times of multiplication operation are required for any pixel point in any parallax range in the image. Therefore, 2N additions and 3N multiplications are required for any pixel, where N is the disparity range and the computational complexity of each pixel is o (N).
The stereo matching method based on forward and backward smoothing and O (1) complexity parallax refinement provided by the invention only needs 4 times of addition and 3 times of multiplication operation for any pixel point in an image in a confidence aggregation part. In the belief propagation stage, any pixel point needs 2 times of addition operation and 3 times of multiplication operation, and 6 times of addition operation and 6 times of multiplication operation are needed in total. Therefore, for any pixel point, the calculation complexity is O (1), the calculation complexity is greatly reduced, and the matching efficiency is improved.
Those skilled in the art can design the invention to be modified or varied without departing from the spirit and scope of the invention. Therefore, if such modifications and variations of the present invention fall within the technical scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A parallax stereo matching method based on forward and backward smoothing and O (1) complexity comprises the following steps:
(1) respectively carrying out forward and backward smoothing treatment on the left eye image and the right eye image;
(2) constructing a cost function based on the color and gradient information of the smoothed left eye image and the smoothed right eye image, and calculating a cost function value;
(3) constructing a minimum spanning tree for the smoothed left eye image and the smoothed right eye image, and performing cost aggregation on the cost function values to generate cost aggregation values;
(4) obtaining a disparity map by adopting a WTA strategy, judging a stable point and an unstable point through left-right consistency detection, obtaining initial disparity confidence, and filling holes in the unstable point to obtain an initial disparity map;
(5) combining the color information of the smoothed left eye image and the initial parallax image to obtain a mixed weight, and performing confidence aggregation on the initial parallax confidence by adopting a horizontal tree structure based on the initial parallax confidence and the mixed weight to obtain a confidence aggregation value;
(6) and (4) in the parallax value updating stage, performing belief propagation on the belief aggregation value according to the minimum spanning tree generated in the step (3) to obtain the optimal parallax estimation and obtain a dense parallax image.
2. The method for disparity stereo matching based on forward and backward smoothing and O (1) complexity as claimed in claim 1, wherein:
the step (1) specifically comprises the following steps:
the smoothing process of each pixel point in the left eye image and the right eye image is updated by scanning the pixel points on the horizontal tree structure, each pixel point is taken as a root node, the forward and backward smoothing is carried out by taking the RGB three-channel image as input, and the smoothing processing formula is as shown in formula (1):
Figure FDA0003753806650000011
Figure FDA0003753806650000012
representing the pixel value of the input image after the pixel points (u, v) under the i channel are smoothed;
wherein the content of the first and second substances,I i (u, v) is the pixel value of the pixel point (u, v) of the input image under the i channel,
Figure FDA0003753806650000013
and (3) updating pixel values representing forward or backward iteration of pixel points (u, v) of the input image under the i channel:
Figure FDA0003753806650000014
Figure FDA0003753806650000015
wherein the constant lambda is used to adjust the smoothing speed,
Figure FDA0003753806650000016
is the difference between a pixel point (u, v) of an input image under an i channel and an adjacent pixel point under a direction r, (u, v-r) is a previous pixel point of the pixel point (u, v) in the horizontal propagation direction, and f and b respectively represent the forward direction and the backward direction; omega is a constant;
the forward and backward smoothing process includes the steps of:
s1, passing from the leftmost end node to the rightmost end node of each line of the input image in turn, and storing the result of the forward smoothing in a plurality of groups
Figure FDA0003753806650000028
The preparation method comprises the following steps of (1) performing;
s2, in reverse direction, passing from the rightmost node to the leftmost node of each line of the input image in sequence, and storing the result of backward smoothing in array I i RL In (3), the smoothing result is obtained as formula (3):
I i new =(I i LR +I i RL +I i )/3,i∈R,G,B (3)
I i new graph showing smoothing under i-channelImage matrix, I i Representing the original image under the i channel; equation (3) is a matrix form of data.
3. The method for disparity stereo matching based on forward and backward smoothing and O (1) complexity as claimed in claim 1, wherein:
the step (2) specifically comprises the following steps:
(201) replacing single gray scale information with RGB three-channel information; setting any pixel point p in the left-eye image as (x, y), setting the parallax value corresponding to the pixel point p as (x, y) as d, and setting the matching point corresponding to the pixel point p in the right-eye image as pd as (x-d, y); color information C AD (p, d) and gradient information C Grad The expression of (p, d) is:
Figure FDA0003753806650000021
Figure FDA0003753806650000022
wherein, C AD (p, d) represents color information of the pixel point p when the parallax value is d, C Grad (p, d) represents gradient information of the pixel point p when the parallax value is d;
Figure FDA0003753806650000027
the pixel value of the pixel point p of the left eye image under the channel i,
Figure FDA0003753806650000029
representing the pixel value of a pixel point pd of the right eye image under an i channel;
Figure FDA0003753806650000023
and
Figure FDA0003753806650000024
under the i channel of the pixel point p respectively representing the left eye imageThe gradient in the x and y directions is,
Figure FDA0003753806650000025
and
Figure FDA0003753806650000026
respectively representing gradients in x and y directions of a pixel point pd of the right eye image under an i channel;
(202) the constructed cost function is as follows:
C(p,d)=w 1 C AD (p,d)+w 2 C Grad (p,d) (5)
wherein, w 1 、w 2 Weights, w, of colour information and gradient information, respectively 1 +w 2 =1;
C (p, d) is a cost function of the pixel point p when the disparity value is d, and the cost function value is calculated based on the cost function.
4. The method for disparity stereo matching based on forward and backward smoothing and O (1) complexity as claimed in claim 1, wherein:
the step (3) specifically comprises the following steps:
the cost aggregation value based on the minimum spanning tree is the sum of the multiplication of the cost function value and the corresponding weight value, and is represented by formula (6):
Figure FDA0003753806650000031
wherein, C d (q) a cost function value of a pixel point q when the parallax value is d, wherein q is any pixel point in the input image;
Figure FDA0003753806650000032
expressing a cost aggregation value of a pixel point P when the parallax value is d, and expressing the similarity between the pixel point P and the pixel point q by using S (P, q) as a similarity function of the pixel point P and the pixel point q;
Figure FDA0003753806650000033
the improved weight function is shown in formula (8):
Figure FDA0003753806650000034
m, n represents adjacent pixel points in the image;
Figure FDA0003753806650000035
for the maximum pixel value in RGB three channels, w (m, n) is the weight of the adjacent pixel point, D (p, q) is the sum accumulated by the weight w (m, n) along the path, and the distance between the pixel point p and the pixel point q is the sum of the weights of the adjacent pixel points on the path.
5. The method for disparity stereo matching based on forward and backward smoothing and O (1) complexity as claimed in claim 1, wherein: the step (4) specifically comprises the following steps:
(401) obtaining a disparity map of a left eye image and a disparity map of a right eye image by adopting a WTA strategy;
(402) performing left-right consistency detection on the disparity map of the left eye image and the disparity map of the right eye image to divide pixel points into stable points and unstable points;
(403) the initial parallax confidence reflects the correct probability of the initial parallax value, if the parallax value and the color information of the pixels in the neighborhood of the pixel point are obtained, the pixel point has larger parallax confidence, and the parallax confidence value is set based on a stable point and an unstable point;
let B be the disparity confidence of the disparity map:
Figure FDA0003753806650000036
wherein, p is any pixel point in the input image, if p is a stable point, the probability is 1, otherwise, the probability of the pixel point being a correct parallax value is 0.1; b (p) represents the parallax confidence of a pixel point p in the initial parallax map;
(404) filling holes in unstable points: for the unstable point p, the first stable point on the left and right sides is respectively searched in the horizontal direction, and is marked as p left 、p right The parallax value d (p) of the unstable point p is p left And p right The smaller of the median disparity values, i.e.
d(p)=min(d(p left ),d(p right )) (10)
After the filling of the holes is completed, an initial disparity map D is obtained init
6. The method for disparity stereo matching based on forward and backward smoothing and O (1) complexity as claimed in claim 1, wherein:
the step (5) specifically comprises the following steps:
(501) establishing a new mixing weight function based on the initial disparity map and the smoothed left eye image, wherein the formula (11) is as follows:
Figure FDA0003753806650000041
w H (m, n) represents the mixing weight of the edge connecting the adjacent pixel m and pixel n, w H (n, m) represents the mixing weight of the edge connecting the adjacent pixel n and pixel m, D init (m)、D init (n) refer to the initial disparity values, I, of pixel m and pixel n, respectively i (m)、I i (n) representing pixel values of the pixel point m and the pixel point n under the channel i;
the pixel point m and the pixel point n are two adjacent pixel points on the image, and alpha represents the weight value for balancing the information of the initial disparity map and the information of the smoothed image pixel point;
Figure FDA0003753806650000042
S H (p, q) representing pixel points p andthe mixture similarity function of pixel point q, subscript H denotes the mixture, D H (p, q) represents that pixel point p and pixel point q are subjected to mixed weight w H (m, n) cumulative distances along the path; sigma H Is a constant of the hybrid similarity function;
(502) and performing confidence aggregation on the initial parallax confidence by adopting a horizontal tree structure, wherein the aggregation process is divided into a left-to-right process and a right-to-left process, and the confidence aggregation value of the aggregated pixel points is as follows:
Figure FDA0003753806650000043
Figure FDA0003753806650000044
Figure FDA0003753806650000051
wherein, p is a pixel point in the image, the superscript LR represents that the aggregation direction goes from left to right, the superscript RL represents that the aggregation direction goes from right to left, the pl represents a former pixel point of the pixel point p, the pr represents a latter pixel point of the pixel point p, and S H (p, q) represents the mixed similarity between the adjacent pixel point p and the pixel point q;
Figure FDA0003753806650000052
representing the confidence aggregation value of the pixel point p accumulated from left to right based on the horizontal tree when the parallax value is d, B (p) is obtained in formula (9) and represents the parallax confidence value of the p point, B (pr) represents the parallax confidence aggregation value of the next pixel point of the pixel point, S H (p, pr) indicating the mixed similarity of the pixel point p and the next point pr,
Figure FDA0003753806650000053
and the average value of the confidence aggregation value accumulated from left to right, the confidence aggregation value accumulated from right to left and the parallax confidence value of the pixel point p is represented.
7. The method according to claim 6, wherein the method comprises the following steps:
the step (6) specifically comprises the following steps:
(601) in the disparity updating phase, namely after confidence aggregation, performing confidence propagation on the confidence aggregation value based on the minimum spanning tree according to the minimum spanning tree established in the step (3), and comprising the following steps of:
(6a) aggregation from leaf node to root node, i.e.:
Figure FDA0003753806650000054
where Ch (p) represents the child node of pixel p,
Figure FDA0003753806650000055
the confidence aggregate value representing the pixel point p is a value after belief propagation, and therefore,
the confidence propagation value of the pixel point p comprises the confidence aggregation value of the pixel point p and the sum of the multiplication side weights of all subtrees from the pixel point p;
(6b) from the root node to the leaf nodes, the aggregation is:
Figure FDA0003753806650000056
wherein, Pr (p) represents the father node of the pixel point p;
Figure FDA0003753806650000057
performing belief propagation on the belief aggregation value of the pixel point p from the root node to the leaf node;
(602) for any pixel point q, S (p, q) represents the similarity of color information of points p and q in the minimum spanning tree, and the confidence aggregation value B A (q) area size similar in both color and disparity information in q-neighborhood, S(p,q)B A (q) indicating the probability that pixel points p and q have the same parallax; when in use
Figure FDA0003753806650000061
Then, the disparity value of p is the optimal disparity estimation of q points,
Figure FDA0003753806650000062
is the probability of the optimal disparity estimation, the probability is obtained by propagating the confidence aggregation value of the minimum spanning tree, and the confidence propagation of the node p is defined as
Figure FDA0003753806650000063
Optimal disparity estimation is defined as disparity propagation
Figure FDA0003753806650000064
For each node p:
Figure FDA0003753806650000065
Figure FDA0003753806650000066
the parallax confidence point p is an unstable point, the point q is a stable point, and the point I represents the whole input image.
8. A disparity stereo matching device based on forward and backward smoothing and O (1) complexity, comprising:
the device comprises a smoothing processing module, a cost function construction module, a cost aggregation module, a disparity map acquisition module, a confidence aggregation module and a confidence propagation module;
the smoothing module is used for respectively carrying out forward smoothing and backward smoothing on the left eye image and the right eye image;
the cost function construction module constructs a cost function based on the color and gradient information of the smoothed left eye image and the smoothed right eye image and calculates a cost function value;
the cost aggregation module constructs a minimum spanning tree for the smoothed left eye image and the smoothed right eye image, and carries out cost aggregation on the cost function value to generate a cost aggregation value;
the parallax image acquisition module obtains a parallax image by adopting a WTA strategy, judges stable points and unstable points through left-right consistency detection, obtains initial parallax confidence, and fills holes in the unstable points to obtain an initial parallax image;
the confidence aggregation module is combined with the color information of the smoothed left eye image and the initial parallax image to obtain a mixed weight, and based on the initial parallax confidence and the mixed weight, confidence aggregation is carried out on the initial parallax confidence by adopting a horizontal tree structure to obtain a confidence aggregation value;
and in the parallax value updating stage, the belief propagation module performs belief propagation on the belief aggregation value according to the minimum spanning tree to obtain the optimal parallax estimation and obtain a dense parallax image.
9. The apparatus for forward-backward smoothing and O (1) complexity-based parallax stereo matching according to claim 8,
the smoothing module for smoothing specifically comprises the following steps:
the smoothing process of each pixel point in the left eye image and the right eye image is updated by scanning the pixel points on the horizontal tree structure, each pixel point is taken as a root node, the forward and backward smoothing is carried out by taking the RGB three-channel image as input, and the smoothing processing formula is as shown in formula (1):
Figure FDA0003753806650000071
Figure FDA0003753806650000072
representing the pixel value of the input image after the pixel points (u, v) under the i channel are smoothed;
wherein, I i (u, v) is the pixel value of the pixel point (u, v) of the input image under the i channel,
Figure FDA0003753806650000073
updating a pixel value representing forward or backward iteration of a pixel point (u, v) of the input image under an i channel:
Figure FDA0003753806650000074
Figure FDA0003753806650000075
wherein the constant lambda is used to adjust the smoothing speed,
Figure FDA0003753806650000076
is the difference between a pixel point (u, v) of an input image under an i channel and an adjacent pixel point under a direction r, (u, v-r) is a previous pixel point of the pixel point (u, v) in the horizontal propagation direction, and f and b respectively represent the forward direction and the backward direction; omega is a constant;
the processing process of the cost function construction module specifically comprises the following steps:
(201) replacing single gray scale information with RGB three-channel information; setting any pixel point p in the left eye image as (x, y), setting the parallax value corresponding to the pixel point p as (x, y) as d, and setting the matching point corresponding to the pixel point p in the right eye image as pd as (x-d, y); color information C AD (p, d) and gradient information C Grad The expression of (p, d) is formula (4):
Figure FDA0003753806650000077
Figure FDA0003753806650000078
wherein, C AD (p, d) represents color information of the pixel point p when the parallax value is d, C Grad (p, d) represents gradient information of the pixel point p when the parallax value is d;
Figure FDA0003753806650000079
the pixel value of the pixel point p of the left eye image under the channel i,
Figure FDA00037538066500000710
representing the pixel value of a pixel point pd of the right eye image under an i channel;
Figure FDA0003753806650000081
and
Figure FDA0003753806650000082
representing the gradient in the x and y directions respectively under the i channel of the pixel point p of the left eye image,
Figure FDA0003753806650000083
and
Figure FDA0003753806650000084
respectively representing gradients in x and y directions of a pixel point pd of the right eye image under an i channel;
(202) the constructed cost function is as follows:
C(p,d)=w 1 C AD (p,d)+w 2 C Grad (p,d) (5)
wherein w 1 、w 2 Weight, w, of color information and gradient information, respectively 1 +w 2 =1;
C (p, d) is a cost function of the pixel point p when the parallax value is d, and a cost function value is calculated based on the cost function;
the processing procedure of the cost aggregation module specifically comprises the following steps:
the cost aggregation value based on the minimum spanning tree is the sum of the multiplication of the cost function value and the corresponding weight value, and is represented by formula (6):
Figure FDA0003753806650000085
wherein, C d (q) a cost function value of a pixel point q when the parallax value is d, wherein q is any pixel point in the input image;
Figure FDA0003753806650000086
expressing a cost aggregation value of a pixel point P when the parallax value is d, and expressing the similarity between the pixel point P and the pixel point q by using S (P, q) as a similarity function of the pixel point P and the pixel point q;
Figure FDA0003753806650000087
d (p, q) represents the distance between a pixel point p and a pixel point q, and sigma is a constant and is used for adjusting the similarity between the two pixel points; the improved weight function is shown in the following equation (8):
Figure FDA0003753806650000088
m, n represents adjacent pixel points in the image;
Figure FDA0003753806650000089
for the maximum pixel value in RGB three channels, w (m, n) is the weight of the adjacent pixel point, D (p, q) is the sum accumulated by the weight w (m, n) along the path, and the distance between the pixel point p and the pixel point q is the sum of the weights of the adjacent pixel points on the path;
the processing process of the disparity map acquisition module specifically comprises the following steps:
(401) obtaining a disparity map of a left eye image and a disparity map of a right eye image by adopting a WTA strategy;
(402) performing left-right consistency detection on the disparity map of the left eye image and the disparity map of the right eye image to divide pixel points into stable points and unstable points;
(403) the initial parallax confidence reflects the correct probability of the initial parallax value, if the pixel point and the pixels in the neighborhood of the pixel point have larger parallax confidence in the pixel point, and the parallax confidence value is set based on a stable point and an unstable point;
let B be the disparity confidence of the disparity map, i.e.:
Figure FDA0003753806650000091
wherein, p is any pixel point in the input image, if p is a stable point, the probability is 1, otherwise, the probability of the pixel point is 0.1 of the correct parallax value; b (p) represents the parallax confidence of a pixel point p in the initial parallax image;
(404) filling holes in unstable points: for the unstable point p, the first stable points on the left and right sides are respectively found in the horizontal direction and are denoted as p left 、p right The parallax value d (p) of the unstable point p is p left And p right The one with the smaller intermediate disparity value:
d(p)=min(d(p left ),d(p right )) (10)
after the filling of the holes is completed, an initial disparity map D is obtained init
The processing process of the confidence aggregation module specifically comprises the following steps:
(501) establishing a new mixed weight function w based on the initial disparity map and the smoothed left eye image H (m, n) represented by formula (11):
Figure FDA0003753806650000092
w H (m, n) represents the mixing weight of the edge connecting the adjacent pixel m and pixel n, w H (n, m) represents the mixing weight of the edge connecting the adjacent pixel n and pixel m, D init (m)、D init (n) refers to the initial disparity values, I, of pixel m and pixel n, respectively i (m)、I i (n) representing pixel m and pixelThe pixel value of n is the pixel value under the channel i;
the pixel point m and the pixel point n are two adjacent pixel points on the image, and alpha represents the weight value for balancing the information of the initial disparity map and the information of the smoothed image pixel point;
Figure FDA0003753806650000093
S H (p, q) represents the mixture similarity function of pixel p and pixel q, subscript H represents the mixture, D H (p, q) represents the weight w of the mixture from pixel point p to pixel point q H (m, n) cumulative distances along the path; sigma H The constant is a constant of a mixed similarity function and is used for adjusting the similarity between two pixel points;
(502) and performing confidence aggregation on the initial parallax confidence by adopting a horizontal tree structure, wherein the aggregation process is divided into a left-to-right process and a right-to-left process, and the confidence aggregation value of the aggregated pixel points is as follows:
Figure FDA0003753806650000101
Figure FDA0003753806650000102
Figure FDA0003753806650000103
wherein p is a pixel point in the image, the superscript LR represents the direction of aggregation from left to right, RL represents the direction of aggregation from right to left, pl represents the previous pixel point of the pixel point p, pr represents the next pixel point of the pixel point p, and S H (p, q) represents the mixed similarity between adjacent pixel points;
Figure FDA0003753806650000104
indicating the parallax value of the pixel point pD is confidence aggregation value accumulated from left to right based on horizontal tree, B (p) is obtained in formula (9) and represents parallax confidence value of p point, B (pr) represents parallax confidence aggregation value of next pixel point of pixel point, S H (p, pr) represents the mixed similarity of the pixel point p and the next point pr,
Figure FDA0003753806650000105
representing the average value of the confidence aggregation value accumulated from left to right, the confidence aggregation value accumulated from right to left and the parallax confidence value of the pixel point p;
the belief propagation module processing process specifically comprises the following steps:
(601) in the disparity updating phase, namely after confidence aggregation, according to the minimum spanning tree established in the step (3), performing confidence propagation on confidence aggregation values based on the minimum spanning tree, and comprising the following steps of:
(6a) aggregation from leaf node to root node, i.e.:
Figure FDA0003753806650000106
wherein Ch (p) represents a child node of the pixel point p,
Figure FDA0003753806650000107
the confidence aggregation value of the pixel point p is expressed as a value after the confidence propagation is carried out from the leaf node to the root node, so the confidence propagation value of the pixel point p comprises the confidence aggregation value of the pixel point p and the sum of the multiplication side weights of all subtrees from the pixel point p;
(6b) from the root node to the leaf nodes, the aggregation is:
Figure FDA0003753806650000111
wherein, Pr (p) represents the father node of the pixel point p;
Figure FDA0003753806650000112
performing belief propagation on the belief aggregation value of the pixel point p from the root node to the leaf node;
(602) for any pixel point q, S (p, q) represents the similarity of color information of points p and q in the minimum spanning tree, and the confidence aggregation value B A (q) is the area size similar in both color and disparity information in the neighborhood of q, so S (p, q) B A (q) means the probability that p and q have the same disparity; when the temperature is higher than the set temperature
Figure FDA0003753806650000113
Then, the disparity value d (p) of p is the optimal disparity estimation of q points,
Figure FDA0003753806650000114
is the probability of the best disparity estimate, I denotes the whole input image, which is propagated from the confidence cluster values of the minimum spanning tree, so it is defined as confidence propagation
Figure FDA0003753806650000115
Optimal disparity estimation is defined as disparity propagation
Figure FDA0003753806650000116
For each node p:
Figure FDA0003753806650000117
Figure FDA0003753806650000118
the process is to find the optimal disparity estimation of the unstable point from the stable points, so as to update the disparity value of the unstable point and obtain the final dense disparity map.
10. A computing device, comprising:
one or more processors, memory, and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing any of the methods of claims 1-7.
CN201811016383.6A 2018-08-31 2018-08-31 Method, device and equipment for parallax stereo matching based on forward and backward smoothing and O (1) complexity Active CN109887008B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811016383.6A CN109887008B (en) 2018-08-31 2018-08-31 Method, device and equipment for parallax stereo matching based on forward and backward smoothing and O (1) complexity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811016383.6A CN109887008B (en) 2018-08-31 2018-08-31 Method, device and equipment for parallax stereo matching based on forward and backward smoothing and O (1) complexity

Publications (2)

Publication Number Publication Date
CN109887008A CN109887008A (en) 2019-06-14
CN109887008B true CN109887008B (en) 2022-09-13

Family

ID=66924833

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811016383.6A Active CN109887008B (en) 2018-08-31 2018-08-31 Method, device and equipment for parallax stereo matching based on forward and backward smoothing and O (1) complexity

Country Status (1)

Country Link
CN (1) CN109887008B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110610503B (en) * 2019-08-21 2023-10-27 河海大学常州校区 Three-dimensional information recovery method for electric knife switch based on three-dimensional matching
CN111242999B (en) * 2020-01-10 2022-09-20 大连理工大学 Parallax estimation optimization method based on up-sampling and accurate re-matching
CN111415305A (en) * 2020-03-10 2020-07-14 桂林电子科技大学 Method for recovering three-dimensional scene, computer-readable storage medium and unmanned aerial vehicle
CN111432194B (en) * 2020-03-11 2021-07-23 北京迈格威科技有限公司 Disparity map hole filling method and device, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105761270B (en) * 2016-03-15 2018-11-27 杭州电子科技大学 A kind of tree-shaped filtering solid matching method based on EP point range conversion
CN106504276B (en) * 2016-10-25 2019-02-19 桂林电子科技大学 Non local solid matching method
CN107274448B (en) * 2017-07-11 2023-05-23 江南大学 Variable weight cost aggregation stereo matching algorithm based on horizontal tree structure

Also Published As

Publication number Publication date
CN109887008A (en) 2019-06-14

Similar Documents

Publication Publication Date Title
CN109887008B (en) Method, device and equipment for parallax stereo matching based on forward and backward smoothing and O (1) complexity
Dai et al. Mvs2: Deep unsupervised multi-view stereo with multi-view symmetry
Pham et al. Domain transformation-based efficient cost aggregation for local stereo matching
Foi Noise estimation and removal in MR imaging: The variance-stabilization approach
CN103236082B (en) Towards the accurate three-dimensional rebuilding method of two-dimensional video of catching static scene
Çiğla et al. Efficient edge-preserving stereo matching
CN107578430B (en) Stereo matching method based on self-adaptive weight and local entropy
CN103310421B (en) The quick stereo matching process right for high-definition image and disparity map acquisition methods
CN106875443B (en) The whole pixel search method and device of 3-dimensional digital speckle based on grayscale restraint
CN101625768A (en) Three-dimensional human face reconstruction method based on stereoscopic vision
CN109146946B (en) Image non-local stereo matching method
CN103702098A (en) In-depth extracting method of three-viewpoint stereoscopic video restrained by time-space domain
CN111223059B (en) Robust depth map structure reconstruction and denoising method based on guide filter
CN103996201A (en) Stereo matching method based on improved gradient and adaptive window
CN103268604B (en) Binocular video depth map acquiring method
CN108629809B (en) Accurate and efficient stereo matching method
CN106408596A (en) Edge-based local stereo matching method
CN104318576A (en) Super-pixel-level image global matching method
Yang Local smoothness enforced cost volume regularization for fast stereo correspondence
CN102740096A (en) Space-time combination based dynamic scene stereo video matching method
CN112734822A (en) Stereo matching algorithm based on infrared and visible light images
CN108805841B (en) Depth map recovery and viewpoint synthesis optimization method based on color map guide
CN103413332B (en) Based on the image partition method of two passage Texture Segmentation active contour models
CN109816781B (en) Multi-view solid geometry method based on image detail and structure enhancement
CN103489183B (en) A kind of sectional perspective matching process split based on edge with seed point

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant