CN112991421A - Robot vision stereo matching method - Google Patents
- Publication number
- CN112991421A (application CN202110304658.1A)
- Authority
- CN
- China
- Prior art keywords
- pixel
- window
- gradient
- representing
- image
- Prior art date: 2021-03-23
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a robot visual stereo matching method. First, in the cost calculation part, a combined SAD and MCT matching cost takes the correlation and the global character of the pixels in the support window into account while preserving matching performance in weak-texture and repetitive-texture regions of the image and avoiding the introduction of noise. Second, in the cost aggregation stage, an adaptive window whose size and direction change with the image gradient is introduced: the gradient information of the image is fully considered, the window is enlarged in regions of gentle gradient and shrunk in regions of steep gradient, so that the edges of the image are preserved to the greatest possible extent; guided filtering is performed within the adaptive window to aggregate the cost, and a multi-scale aggregation method is then applied on this basis to obtain a better cost aggregation result. Finally, disparity calculation and disparity optimization are carried out to obtain the optimal disparity result. The method has the advantage of high robot visual stereo matching accuracy.
Description
Technical Field
The invention belongs to the technical field of visual perception for industrial robots and mobile robots, and relates to a robot visual stereo matching method, in particular to a robot visual stereo matching method based on improved cost calculation and multi-scale aggregation with a gradient-adaptive window.
Background
Stereo matching is an extremely critical step of stereo vision, and matching accuracy and speed are the key factors constraining the application and development of stereo vision, which is widely used in photogrammetry, three-dimensional reconstruction, virtual reality, autonomous driving, mobile robots, mobile carts, and Mars and lunar rovers. The accuracy of stereo matching directly affects the accuracy of depth estimation and three-dimensional terrain reconstruction for a mobile robot, plays a key role in improving the accuracy of stereo-vision-based visual navigation of mobile robots, and directly determines whether the visual perception task of a mobile robot can be completed.
According to the optimization method used, stereo matching algorithms can be divided into global and local algorithms. A global algorithm establishes an energy function over the whole image, optimizes it to obtain a cost value for each pixel, and then performs cost calculation and disparity optimization to obtain the final disparity map. Global algorithms are accurate, but their computational complexity is high and their real-time performance poor, which greatly limits them in practical application scenarios. A local algorithm comprises four steps: cost calculation, cost aggregation, disparity calculation and disparity optimization. The purpose of cost calculation is to measure the correlation of a pair of pixels to be matched. Xing Mei proposed combining the absolute pixel difference (AD) with the census transform in "On Building an Accurate Stereo Matching System on Graphics Hardware", which improves matching in weak-texture and repetitive-texture regions of the image; however, the absolute pixel difference only reflects the correlation of a single pixel and easily introduces noise, and the traditional census transform compares each pixel in the window only with the central pixel and ignores the global character of the whole window. Cost aggregation is the most important step of a local stereo matching algorithm and can essentially be regarded as a filtering of the initial matching cost. Traditional methods filter with a box filter or a Gaussian filter in the cost aggregation stage and cannot protect the edge information of the image well. Pauline Tan proposed aggregating the cost with a guided filter in "Stereo Disparity through Cost Aggregation with Guided Filter", which improves edge preservation, but the guided filtering is performed over the whole image and cannot obtain good results in regions of discontinuous depth and large gradient change. Kang Zhang first proposed cost aggregation through cross-scale fusion in "Cross-Scale Cost Aggregation for Stereo Matching", but the aggregation effect within a single scale remains rather ordinary.
Disclosure of Invention
The invention aims to overcome the above defects of the prior art and provide a robot vision stereo matching method. First, in the cost calculation part, a combined SAD and MCT matching cost takes the correlation and the global character of the window pixels into account while preserving matching performance in weak-texture and repetitive-texture regions and avoiding the introduction of noise. Second, in the cost aggregation stage, the gradient information of the image is fully considered by means of an adaptive window based on gradient change, so that the edges of the image are preserved to the greatest possible extent, and a multi-scale aggregation method is further used on this basis to obtain a better cost aggregation result. Finally, disparity calculation and disparity optimization are carried out to obtain the optimal disparity result. The method has the advantage of high robot visual stereo matching accuracy.
The object of the invention is achieved by the following technical solution. The robot visual stereo matching method comprises the following steps:
step 1, obtaining the binocular images to be matched after distortion correction and stereo rectification;
step 2, improving the traditional gradient-based matching cost calculation mode, and fusing the gradients in the x and y directions;
step 3, fusing the SAD and MCT matching costs, wherein SAD denotes the sum of the absolute gray-level differences over all pixels in the neighborhood of the pixel to be matched, and MCT denotes a modified census transform in which the center pixel of the support window in the binocular images to be matched is replaced by the mean value of the window pixels;
step 4, fusing the improved gradient-based matching cost with the SAD and MCT matching costs again to obtain the final matching cost;
step 5, down-sampling the binocular image to be matched to generate an image pyramid;
step 6, generating, on the image of each scale of the generated image pyramid, an adaptive window whose size changes with the gradient;
step 7, obtaining the disparity space image corresponding to the image of each scale through the matching cost calculation of step 4, sliding the adaptive window obtained in step 6 over each disparity space image, and performing guided filtering within the adaptive window, which constitutes the cost aggregation of each scale;
step 8, performing multi-scale aggregation on the per-scale cost aggregation results obtained in step 7 to obtain the final cost aggregation result;
step 9, computing the pixel disparity from the cost aggregation result of step 8 with the winner-takes-all (WTA) method;
and step 10, optimizing the obtained disparities with adaptive-weight median filtering and left-right consistency checking to obtain the final disparity result, which is the final stereo matching result.
As a further improvement, the step 2 is embodied as follows:
Let ∇_x and ∇_y denote the gradient operators in the x and y directions, applied to the three-channel image whose pixel values in the R, G and B channels are G_R, G_G and G_B respectively. The improved gradient-based matching cost is then:
C_g(p,d) = (1 - α)·min(|∇_x I_l(p) - ∇_x I_r(p-d)|, τ_1) + α·min(|∇_y I_l(p) - ∇_y I_r(p-d)|, τ_2),
wherein p denotes a pixel, d a disparity value, α the proportion of the y-direction gradient in the gradient cost (a set value), ∇_x I_l(p) the gradient value of pixel p in the left image in the x direction, ∇_x I_r(p-d) the gradient value of pixel p-d in the right image in the x direction, ∇_y I_l(p) the gradient value of pixel p in the left image in the y direction, ∇_y I_r(p-d) the gradient value of pixel p-d in the right image in the y direction, and τ_1, τ_2 the set truncation values.
As a further improvement, the SAD matching cost in step 3 is calculated as
C_SAD(p,d) = Σ_{q∈N_p} |I_l(q) - I_r(q-d)|,
wherein I_l(p) denotes the sum of the three channel values of pixel p in the left image, I_r(p-d) the sum of the three channel values of pixel p-d in the right image, and N_p a neighborhood centered on pixel p. The matching cost of MCT is calculated as C_mcent(p,d) = Hamming(C_cl(p), C_cr(p-d)), wherein Hamming denotes the Hamming distance, obtained by applying an XOR operation to C_cl(p) and C_cr(p-d) and counting the number of 1 bits in the result; C_cl(p) denotes the bit string obtained by transforming pixel p in the left image, and C_cr(p-d) the bit string obtained by transforming pixel p-d in the right image.
As a further improvement, the MCT matching cost is calculated as follows:
firstly, each neighborhood pixel is compared with the central value to obtain a Boolean value, and the Boolean values are mapped to a bit string, the central value being the mean of all pixels in the neighborhood window:
C_c(p) = ⊗_{q∈N_p} ξ(Ī(p), I(q)), where ξ(Ī(p), I(q)) = 1 if I(q) > Ī(p) and 0 otherwise,
wherein C_c(p) denotes the bit string obtained after transforming pixel p, ⊗ denotes bit-wise concatenation, N_p denotes the neighborhood of p, Ī(p) denotes the mean value of all pixels in the neighborhood, and I(p) denotes the value of pixel p, taken as the sum of its three channel values;
then the MCT matching cost is obtained as the Hamming distance between the two bit strings, the Hamming distance being the number of corresponding bits in which the two bit strings differ; specifically, an XOR operation is applied to the two bit strings and the number of 1 bits in the result is counted, and the resulting MCT matching cost is
C_mcent(p,d) = Hamming(C_cl(p), C_cr(p-d)).
As a further improvement, the final matching cost C(p,d) obtained in step 4 is a combination of the SAD cost, the MCT cost and the improved gradient-based cost, each normalized by its own control parameter, wherein λ_SAD denotes the control parameter of the SAD matching cost, λ_mcent the control parameter of the MCT matching cost, and λ_g the control parameter of the improved gradient-based matching cost; λ_SAD, λ_mcent and λ_g are all set values.
As a further improvement, the step 6 is embodied as follows:
Step a: compute the gradients g_x and g_y of the binocular images to be matched in the horizontal and vertical directions respectively. The direction of the initial smoothing window centered on pixel p is θ_0(i,j), calculated as θ_0(i,j) = arctan(g_y(i,j)/g_x(i,j)), wherein i denotes the abscissa of pixel p, j its ordinate, g_x(i,j) the gradient of pixel p in the x direction and g_y(i,j) its gradient in the y direction. The size of the initial window is set as w_0(i,j) and H_0(i,j), wherein w_0(i,j) denotes the width of the initial window and H_0(i,j) its height, both determined from A, the maximum square window size of the smoothing window;
Step b: compute the algebraic sums of the gradients in the horizontal and vertical directions within the window,
Δ_x(i,j) = Σ_{(k,l)∈L} g_x(i+k, j+l) and Δ_y(i,j) = Σ_{(k,l)∈L} g_y(i+k, j+l),
wherein k denotes a unit offset in the horizontal direction, l a unit offset in the vertical direction, g_x(i+k,j+l) the horizontal gradient of the pixel with coordinates (i+k,j+l), g_y(i+k,j+l) the vertical gradient of that pixel, and L the adaptive window; the window direction is then updated from the algebraic sums as θ_m(i,j) = arctan(Δ_y(i,j)/Δ_x(i,j));
Step c: compute the sums of the absolute values of the gradients in the horizontal and vertical directions within the window, and update the window size according to these absolute sums, wherein w_m(i,j) denotes the horizontal size of the adaptive window, H_m(i,j) its vertical size and θ_m(i,j) its direction;
Step d: when the adaptive window size satisfies the stopping condition, updating of the window size and direction stops, wherein w_{m+1}(i,j) denotes the horizontal size of the adaptive window at the next iteration and H_{m+1}(i,j) its vertical size.
As a further improvement, the guided filtering within the adaptive window in step 7 is specifically implemented by the following steps:
firstly, the energy-function optimization model of the guided-filtering algorithm is established:
E(a_k, b_k) = Σ_{i∈N_k} ((a_k I_i + b_k - P_i)² + ε a_k²),
wherein a_k and b_k denote the linear coefficients of the guided filter, I_i the input (guidance) image, P_i the image to be filtered, i and k image indices, ε a_k² the regularization term placed in the energy function to prevent a_k from becoming too large, and N_k the adaptive support window of pixel k;
secondly, by minimizing the energy-function optimization model, the linear coefficients of the adaptive-window-based guided filtering are obtained as
a_k = ((1/|N_k|) Σ_{i∈N_k} I_i P_i - μ_k P̄_k) / (σ_k² + ε) and b_k = P̄_k - a_k μ_k,
wherein μ_k denotes the mean of I_i, P̄_k the mean of the image to be filtered within the adaptive window, and σ_k the standard deviation of I_i;
finally, the cost function obtained after filtering is q_i = a_k I_i + b_k, wherein the output a_k I_i + b_k has a local linear relationship with I_i within the window centered on pixel k.
As a further improvement, the final cost aggregation result obtained in step 8, denoted Ĉ, is obtained by solving, through the coefficient matrix A, the system that couples the aggregated costs of all scales, wherein Ĉ denotes the final aggregation cost, S the number of down-sampling layers, A the coefficient matrix of the solving process, s an individual layer, and C̃⁰ the matching cost matrix of layer 0.
According to the robot visual stereo matching method provided by the invention, first, in the cost calculation part, a combined SAD and MCT matching cost takes the correlation and the global character of the pixels in the support window into account while preserving matching performance in weak-texture and repetitive-texture regions of the image and avoiding the introduction of noise; second, in the cost aggregation stage, an adaptive window whose size and direction change with the image gradient is introduced, the gradient information of the image is fully considered, the window is enlarged in regions of gentle gradient and shrunk in regions of steep gradient so that image edges are preserved to the greatest possible extent, guided filtering is performed within the adaptive window to aggregate the cost, and a multi-scale aggregation method is further used on this basis to obtain a better cost aggregation result; finally, disparity calculation and disparity optimization are carried out to obtain the optimal disparity result. The method therefore has the advantage of high robot visual stereo matching accuracy.
Drawings
The invention is further illustrated by means of the attached drawings; the embodiments shown in the drawings do not constitute any limitation of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a robot visual stereo matching method.
Detailed Description
In order to enable those skilled in the art to better understand the technical solution of the present invention, the invention is described in detail below with reference to the accompanying drawings and specific embodiments; it should be noted that the embodiments of the present application and the features of the embodiments can be combined with each other without conflict.
Fig. 1 is a diagram illustrating a robot visual stereo matching method according to an embodiment of the present invention. Referring to fig. 1, the robot vision stereo matching method includes the following steps:
Step 1, obtaining the binocular images to be matched after distortion correction and stereo rectification;
Step 2, improving the traditional gradient-based matching cost calculation and fusing the gradients in the x and y directions; this step is embodied as follows: let ∇_x and ∇_y denote the gradient operators in the x and y directions, applied to the three-channel image whose pixel values in the R, G and B channels are G_R, G_G and G_B respectively; the improved gradient-based matching cost is then:
C_g(p,d) = (1 - α)·min(|∇_x I_l(p) - ∇_x I_r(p-d)|, τ_1) + α·min(|∇_y I_l(p) - ∇_y I_r(p-d)|, τ_2),
wherein p denotes a pixel, d a disparity value, α the proportion of the y-direction gradient in the gradient cost (a set value), ∇_x I_l(p) the gradient value of pixel p in the left image in the x direction, ∇_x I_r(p-d) the gradient value of pixel p-d in the right image in the x direction, ∇_y I_l(p) the gradient value of pixel p in the left image in the y direction, ∇_y I_r(p-d) the gradient value of pixel p-d in the right image in the y direction, and τ_1, τ_2 the set truncation values.
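As an illustration only, the following Python/NumPy sketch evaluates this truncated, fused gradient cost for a single disparity d; the use of simple central-difference gradients on the channel sum and the default values of alpha, tau1 and tau2 are assumptions made for the example rather than values fixed by the invention.

```python
import numpy as np

def gradient_cost(left, right, d, alpha=0.1, tau1=2.0, tau2=2.0):
    """Truncated gradient matching cost C_g(p, d) for one disparity d (illustrative sketch).

    left, right: float arrays of shape (H, W, 3); d: non-negative integer disparity.
    """
    def grads(img):
        gray = img.sum(axis=2)              # I(p) taken as the sum of the three channels
        gx = np.gradient(gray, axis=1)      # gradient in the x direction
        gy = np.gradient(gray, axis=0)      # gradient in the y direction
        return gx, gy

    gxl, gyl = grads(left)
    gxr, gyr = grads(right)
    # shift the right-image gradients by d so that pixel p is compared with pixel p - d
    gxr_d = np.roll(gxr, d, axis=1)
    gyr_d = np.roll(gyr, d, axis=1)
    cost_x = np.minimum(np.abs(gxl - gxr_d), tau1)   # truncated x-gradient difference
    cost_y = np.minimum(np.abs(gyl - gyr_d), tau2)   # truncated y-gradient difference
    return (1.0 - alpha) * cost_x + alpha * cost_y   # (1 - alpha) x-term plus alpha y-term
```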
Step 3, fusing the SAD (sum of absolute differences) and MCT (mean census transform) matching costs, wherein SAD denotes the sum of the absolute gray-level differences over all pixels in the neighborhood of the pixel to be matched, and MCT denotes a modified census transform in which the center pixel of the support window in the binocular images to be matched is replaced by the mean value of the window pixels.
Specifically, the SAD matching cost in this step is calculated as
C_SAD(p,d) = Σ_{q∈N_p} |I_l(q) - I_r(q-d)|,
wherein I_l(p) denotes the sum of the three channel values of pixel p in the left image, I_r(p-d) the sum of the three channel values of pixel p-d in the right image, and N_p a neighborhood centered on pixel p. The MCT matching cost is calculated as follows: firstly, each neighborhood pixel is compared with the central value to obtain a Boolean value, and the Boolean values are mapped to a bit string, the central value being the mean of all pixels in the neighborhood window:
C_c(p) = ⊗_{q∈N_p} ξ(Ī(p), I(q)), where ξ(Ī(p), I(q)) = 1 if I(q) > Ī(p) and 0 otherwise,
wherein C_c(p) denotes the bit string obtained after transforming pixel p, ⊗ denotes bit-wise concatenation, N_p the neighborhood of p, Ī(p) the mean value of all pixels in the neighborhood, and I(p) the value of pixel p, taken as the sum of its three channel values; then the MCT matching cost is obtained as the Hamming distance between the two bit strings, the Hamming distance being the number of corresponding bits in which the two bit strings differ; specifically, an XOR operation is applied to the two bit strings and the number of 1 bits in the result is counted, giving
C_mcent(p,d) = Hamming(C_cl(p), C_cr(p-d)),
wherein C_cl(p) denotes the bit string obtained by transforming pixel p in the left image and C_cr(p-d) the bit string obtained by transforming pixel p-d in the right image.
Step 4, fusing the improved gradient-based matching cost with the SAD and MCT matching costs again to obtain the final matching cost C(p,d), which combines the three costs, each normalized by its own control parameter, wherein λ_SAD denotes the control parameter of the SAD calculation, λ_mcent the control parameter of the MCT calculation, and λ_g the control parameter of the improved gradient method; all three are set values;
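The fusion above is specified here only through its three control parameters; as a hedged illustration, the sketch below assumes the exponential normalization rho(c, lam) = 1 - exp(-c/lam) commonly used in AD-Census-style cost fusion. That normalization and the default lambda values are assumptions for the example, not the formula stated by the invention.

```python
import numpy as np

def fused_cost(c_sad, c_mct, c_grad, lam_sad=10.0, lam_mct=30.0, lam_g=2.0):
    """Combine the SAD, MCT and gradient costs (assumed AD-Census-style normalization)."""
    def rho(cost, lam):
        return 1.0 - np.exp(-cost / lam)   # maps each term into [0, 1) so no term dominates
    return rho(c_sad, lam_sad) + rho(c_mct, lam_mct) + rho(c_grad, lam_g)
```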
step 5, down-sampling the binocular image to be matched to generate an image pyramid;
Step 6, generating, on the image of each scale of the generated image pyramid, an adaptive window whose size changes with the gradient; it should be noted that the size, shape and direction of the window vary with the image structure information. The specific steps for generating the adaptive window are as follows:
Step a: compute the gradients g_x and g_y of the binocular images to be matched in the horizontal and vertical directions respectively. The direction of the initial smoothing window centered on pixel p is θ_0(i,j), calculated as θ_0(i,j) = arctan(g_y(i,j)/g_x(i,j)), wherein i denotes the abscissa of pixel p, j its ordinate, g_x(i,j) the gradient of pixel p in the x direction and g_y(i,j) its gradient in the y direction. The size of the initial window is set as w_0(i,j) and H_0(i,j), wherein w_0(i,j) denotes the width of the initial window and H_0(i,j) its height, both determined from A, the maximum square window size of the smoothing window;
Step b: compute the algebraic sums of the gradients in the horizontal and vertical directions within the window,
Δ_x(i,j) = Σ_{(k,l)∈L} g_x(i+k, j+l) and Δ_y(i,j) = Σ_{(k,l)∈L} g_y(i+k, j+l),
wherein k denotes a unit offset in the horizontal direction, l a unit offset in the vertical direction, g_x(i+k,j+l) the horizontal gradient of the pixel with coordinates (i+k,j+l), g_y(i+k,j+l) the vertical gradient of that pixel, and L the adaptive window; the window direction is then updated from the algebraic sums as θ_m(i,j) = arctan(Δ_y(i,j)/Δ_x(i,j));
Step c: compute the sums of the absolute values of the gradients in the horizontal and vertical directions within the window, and update the window size according to these absolute sums, wherein w_m(i,j) denotes the horizontal size of the adaptive window, H_m(i,j) its vertical size and θ_m(i,j) its direction;
Step d: when the adaptive window size satisfies the stopping condition, updating of the window size and direction stops, wherein w_{m+1}(i,j) denotes the horizontal size of the adaptive window at the next iteration and H_{m+1}(i,j) its vertical size.
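The sketch below shows one way such a gradient-driven window could be iterated; because the exact size-update formula is not reproduced above, the inverse-gradient scaling rule, the size bounds, the iteration cap and the convergence test used here are all assumptions made purely for illustration.

```python
import numpy as np

def adaptive_window(gx, gy, i, j, max_size=15, min_size=3, iters=5, scale=50.0):
    """Illustrative gradient-adaptive window around pixel (i, j) (i: row, j: column).

    gx, gy: horizontal and vertical gradient maps; returns (width, height, theta).
    """
    w = h = max_size
    theta = np.arctan2(gy[i, j], gx[i, j])            # initial window direction theta_0
    for _ in range(iters):
        hw, hh = w // 2, h // 2
        win_x = gx[max(i - hh, 0): i + hh + 1, max(j - hw, 0): j + hw + 1]
        win_y = gy[max(i - hh, 0): i + hh + 1, max(j - hw, 0): j + hw + 1]
        theta = np.arctan2(win_y.sum(), win_x.sum())  # direction from the algebraic gradient sums
        # assumed rule: shrink the window where absolute gradients are large, grow it where small
        new_w = int(np.clip(scale / (np.abs(win_x).mean() + 1e-6), min_size, max_size)) | 1
        new_h = int(np.clip(scale / (np.abs(win_y).mean() + 1e-6), min_size, max_size)) | 1
        if new_w == w and new_h == h:                 # size no longer changes: stop updating
            break
        w, h = new_w, new_h
    return w, h, theta
```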
Step 7, obtaining the disparity space image corresponding to the image of each scale through the matching cost calculation of step 4, sliding the adaptive window obtained in step 6 over each disparity space image, and performing guided filtering within the adaptive window, which constitutes the cost aggregation of each scale. In other words, in this step the guided filtering is performed within the adaptive window, and its target is the result of the preceding matching cost calculation (a disparity space image); the image of each scale yields its own disparity space image, each disparity space image is filtered with the adaptive-window guided filter and the results are then combined, and the adaptive-window guided filtering of each disparity space image is the cost aggregation.
Specifically, in this step, the energy-function optimization model of the guided-filtering algorithm is first established:
E(a_k, b_k) = Σ_{i∈N_k} ((a_k I_i + b_k - P_i)² + ε a_k²),
wherein a_k and b_k denote the linear coefficients of the guided filter, I_i the input (guidance) image, P_i the image to be filtered, i and k image indices, ε a_k² the regularization term placed in the energy function to prevent a_k from becoming too large, and N_k the adaptive support window of pixel k;
secondly, by minimizing the energy-function optimization model, the linear coefficients of the adaptive-window-based guided filtering are obtained as
a_k = ((1/|N_k|) Σ_{i∈N_k} I_i P_i - μ_k P̄_k) / (σ_k² + ε) and b_k = P̄_k - a_k μ_k,
wherein μ_k denotes the mean of I_i, P̄_k the mean of the image to be filtered within the adaptive window, and σ_k the standard deviation of I_i;
finally, the cost function obtained after filtering is q_i = a_k I_i + b_k, wherein the output a_k I_i + b_k has a local linear relationship with I_i within the window centered on pixel k.
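A compact sketch of this guided filtering follows; to keep the example short, a fixed square box window stands in for the adaptive window, and the radius r and regularization eps are assumed values.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter_cost(I, P, r=4, eps=1e-3):
    """Guided filtering of a cost slice P with guidance image I (He et al. formulation).

    A fixed (2r+1)x(2r+1) box window is used here in place of the adaptive window.
    """
    mean = lambda x: uniform_filter(x, size=2 * r + 1, mode='nearest')
    mu = mean(I)                    # mean of the guidance image in each window (mu_k)
    p_bar = mean(P)                 # mean of the image to be filtered (P_bar_k)
    var = mean(I * I) - mu * mu     # variance of the guidance image (sigma_k^2)
    a = (mean(I * P) - mu * p_bar) / (var + eps)   # linear coefficient a_k
    b = p_bar - a * mu                             # linear coefficient b_k
    # average the coefficients over all windows covering each pixel, then apply the linear model
    return mean(a) * I + mean(b)
```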
Step 8, performing multi-scale aggregation on the per-scale cost aggregation results obtained in step 7 to obtain the final cost aggregation result Ĉ, which is obtained by solving, through the coefficient matrix A, the system that couples the aggregated costs of all scales, wherein Ĉ denotes the final aggregation cost, S the number of down-sampling layers, A the coefficient matrix of the solving process, s an individual layer, and C̃⁰ the matching cost matrix of layer 0. It should be noted that, since cost aggregation has already been performed on the image of each scale, this step is called multi-scale aggregation because it combines the results obtained at every scale.
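The following sketch illustrates cross-scale combination with a simple upsample-and-weighted-average rule; the actual method solves a system coupled through the coefficient matrix A, whose exact form is not reproduced above, so the equal weights and nearest-neighbour upsampling used here are assumptions for illustration only.

```python
import numpy as np

def multiscale_aggregate(cost_volumes, weights=None):
    """Combine per-scale aggregated cost volumes into a layer-0 cost volume (illustrative).

    cost_volumes: list of (H_s, W_s, D_s) arrays, finest scale (layer 0) first; each coarser
    layer is assumed to halve the resolution and the disparity range of the previous one.
    """
    H, W, D = cost_volumes[0].shape
    if weights is None:
        weights = [1.0 / len(cost_volumes)] * len(cost_volumes)
    total = np.zeros((H, W, D))
    for s, (cv, w) in enumerate(zip(cost_volumes, weights)):
        f = 2 ** s
        rows = np.minimum(np.arange(H) // f, cv.shape[0] - 1)
        cols = np.minimum(np.arange(W) // f, cv.shape[1] - 1)
        disps = np.minimum(np.arange(D) // f, cv.shape[2] - 1)
        up = cv[rows][:, cols][:, :, disps]   # nearest-neighbour upsampling to layer-0 geometry
        total += w * up
    return total
```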
Step 9, computing the disparity of each pixel from the cost aggregation result of step 8 with the winner-takes-all (WTA) method, i.e. among the cost values of a pixel over all candidate disparities, the disparity corresponding to the minimum cost is selected as the optimal disparity;
And step 10, optimizing the obtained disparities with adaptive-weight median filtering and left-right consistency checking to obtain the final disparity result, which is the final stereo matching result.
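A short sketch of the WTA disparity selection and the left-right consistency check follows; the one-pixel consistency threshold and the marking of inconsistent pixels as -1 are assumptions for the example, and the adaptive-weight median filtering is omitted for brevity.

```python
import numpy as np

def wta_disparity(cost_volume):
    """Winner-takes-all: for each pixel, pick the disparity with the minimum aggregated cost."""
    return np.argmin(cost_volume, axis=2)

def left_right_check(disp_left, disp_right, max_diff=1):
    """Invalidate (-1) pixels whose left and right disparities disagree by more than max_diff."""
    H, W = disp_left.shape
    xs = np.tile(np.arange(W), (H, 1))
    match_x = np.clip(xs - disp_left, 0, W - 1)        # corresponding column in the right image
    d_right = disp_right[np.arange(H)[:, None], match_x]
    out = disp_left.copy()
    out[np.abs(disp_left - d_right) > max_diff] = -1
    return out
```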
In summary, the robot vision stereo matching method of the invention specifically comprises: (1) obtaining the binocular images to be matched after distortion correction and stereo rectification; (2) improving the traditional gradient-based matching cost and fusing the gradients in the x and y directions in a reasonable, normalized way; (3) fusing the SAD, the sum of the absolute gray-level differences over all pixels in the neighborhood of the pixel to be matched, with the modified census transform MCT in which the center pixel of the support window is the pixel mean; (4) fusing the improved gradient-based matching cost with the SAD and MCT matching costs again to obtain the final matching cost; (5) down-sampling the binocular images to generate an image pyramid; (6) generating, on the image of each scale, an adaptive window whose size is controlled automatically by the gradient; (7) performing guided filtering within each window; (8) aggregating the results of the different scales of the image pyramid to obtain the final cost aggregation result; (9) computing the pixel disparities from the cost aggregation result of step 8 with the winner-takes-all (WTA) method; (10) optimizing the computed disparities with adaptive-weight median filtering and left-right consistency checking to obtain the final stereo matching result. Through this process, an adaptive window based on gradient change is introduced in the cost aggregation stage of stereo matching: the window size changes with the gradient, becoming larger where the gradient changes gently and smaller where it changes sharply, and guided-filtering cost aggregation is performed within the adaptive window. Compared with guided-filtering cost aggregation over the whole image, this places more emphasis on local computation, weakens the influence of regions where stereo matching easily fails, such as depth discontinuities, and strengthens the influence of regions where stereo matching works well, such as regions of smooth depth change.
Therefore, compared with the prior art, the invention has the advantages that:
(1) The introduction of noise in the calculation process is effectively weakened and the accuracy of stereo matching is improved.
In the cost calculation process the invention improves the traditional AD-Census cost (the combination of the absolute difference AD and the census transform) and proposes an SAD and MCT matching cost, fusing the pixel-sum cost over the neighborhood support window with the census-transform cost whose central value is the mean of the neighborhood support window. This takes the contribution of the other pixels in the neighborhood window into account together with pixel correlation, has a global character, weakens the influence of noise in the calculation process and reduces the introduction of noise.
(2) Better results are obtained in regions of discontinuous depth and rich texture change, and the accuracy of stereo matching is improved.
In the description above, numerous specific details are set forth in order to provide a thorough understanding of the present invention; however, the present invention may be practiced in other ways than those specifically described herein, and the scope of the present invention is therefore not limited by the specific embodiments disclosed above.
In conclusion, although the present invention has been described with reference to the preferred embodiments, various changes and modifications may be made by those skilled in the art without departing from the scope of the invention, and such changes and modifications shall fall within the scope of the present invention.
Claims (8)
1. A robot vision stereo matching method is characterized by comprising the following steps:
step 1, obtaining the binocular images to be matched after distortion correction and stereo rectification;
step 2, improving the traditional gradient-based matching cost calculation and fusing the gradients in the x and y directions;
step 3, fusing the SAD and MCT matching costs, wherein SAD denotes the sum of the absolute gray-level differences over all pixels in the neighborhood of the pixel to be matched in the binocular images to be matched, and MCT denotes a modified census transform in which the center pixel of the support window in the binocular images to be matched is replaced by the mean value of the window pixels;
step 4, fusing the improved gradient-based matching cost with the SAD and MCT matching costs again to obtain the final matching cost;
step 5, down-sampling the binocular images to be matched to generate an image pyramid;
step 6, generating, on the image of each scale of the generated image pyramid, an adaptive window whose size changes with the gradient;
step 7, obtaining the disparity space image corresponding to the image of each scale through the matching cost calculation of step 4, sliding the adaptive window obtained in step 6 over each disparity space image, and performing guided filtering within the adaptive window, which constitutes the cost aggregation of each scale;
step 8, performing multi-scale aggregation on the per-scale cost aggregation results obtained in step 7 to obtain the final cost aggregation result;
step 9, computing the pixel disparities from the cost aggregation result of step 8 with the winner-takes-all (WTA) method;
and step 10, optimizing the obtained disparities with adaptive-weight median filtering and left-right consistency checking to obtain the final disparity result, wherein the disparity result is the final stereo matching result.
2. The robot-vision stereo matching method according to claim 1, wherein the step 2 is embodied as follows:
Let ∇_x and ∇_y denote the gradient operators in the x and y directions, applied to the three-channel image whose pixel values in the R, G and B channels are G_R, G_G and G_B respectively; the improved gradient-based matching cost is then:
C_g(p,d) = (1 - α)·min(|∇_x I_l(p) - ∇_x I_r(p-d)|, τ_1) + α·min(|∇_y I_l(p) - ∇_y I_r(p-d)|, τ_2),
wherein p denotes a pixel, d a disparity value, α the proportion of the y-direction gradient in the gradient cost (a set value), ∇_x I_l(p) the gradient value of pixel p in the left image in the x direction, ∇_x I_r(p-d) the gradient value of pixel p-d in the right image in the x direction, ∇_y I_l(p) the gradient value of pixel p in the left image in the y direction, ∇_y I_r(p-d) the gradient value of pixel p-d in the right image in the y direction, and τ_1, τ_2 the set truncation values.
3. The robot visual stereo matching method according to claim 2, wherein the SAD matching cost in step 3 is calculated as
C_SAD(p,d) = Σ_{q∈N_p} |I_l(q) - I_r(q-d)|,
wherein I_l(p) denotes the sum of the three channel values of pixel p in the left image, I_r(p-d) the sum of the three channel values of pixel p-d in the right image, and N_p a neighborhood centered on pixel p; the matching cost of MCT is calculated as C_mcent(p,d) = Hamming(C_cl(p), C_cr(p-d)), wherein Hamming denotes the Hamming distance, obtained by applying an XOR operation to C_cl(p) and C_cr(p-d) and counting the number of 1 bits in the result, C_cl(p) denotes the bit string obtained by transforming pixel p in the left image, and C_cr(p-d) denotes the bit string obtained by transforming pixel p-d in the right image.
4. The robot visual stereo matching method according to claim 3, wherein the MCT matching cost is calculated as follows:
firstly, each neighborhood pixel is compared with the central value to obtain a Boolean value, and the Boolean values are mapped to a bit string, the central value being the mean of all pixels in the neighborhood window:
C_c(p) = ⊗_{q∈N_p} ξ(Ī(p), I(q)), where ξ(Ī(p), I(q)) = 1 if I(q) > Ī(p) and 0 otherwise,
wherein C_c(p) denotes the bit string obtained after transforming pixel p, ⊗ denotes bit-wise concatenation, N_p denotes the neighborhood of p, Ī(p) denotes the mean value of all pixels in the neighborhood, and I(p) denotes the value of pixel p, taken as the sum of its three channel values;
then the MCT matching cost is obtained as the Hamming distance between the two bit strings, the Hamming distance being the number of corresponding bits in which the two bit strings differ; specifically, an XOR operation is applied to the two bit strings and the number of 1 bits in the result is counted, and the resulting MCT matching cost is
C_mcent(p,d) = Hamming(C_cl(p), C_cr(p-d)).
5. The robot visual stereo matching method according to claim 4, wherein the final matching cost C(p,d) obtained in step 4 is a combination of the SAD cost, the MCT cost and the improved gradient-based cost, each normalized by its own control parameter, wherein λ_SAD denotes the control parameter of the SAD matching cost, λ_mcent the control parameter of the MCT matching cost, and λ_g the control parameter of the improved gradient-based matching cost; λ_SAD, λ_mcent and λ_g are all set values.
6. The robot-vision stereo matching method according to claim 5, wherein the step 6 is embodied as:
Step a: compute the gradients g_x and g_y of the binocular images to be matched in the horizontal and vertical directions respectively. The direction of the initial smoothing window centered on pixel p is θ_0(i,j), calculated as θ_0(i,j) = arctan(g_y(i,j)/g_x(i,j)), wherein i denotes the abscissa of pixel p, j its ordinate, g_x(i,j) the gradient of pixel p in the x direction and g_y(i,j) its gradient in the y direction. The size of the initial window is set as w_0(i,j) and H_0(i,j), wherein w_0(i,j) denotes the width of the initial window and H_0(i,j) its height, both determined from A, the maximum square window size of the smoothing window;
Step b: compute the algebraic sums of the gradients in the horizontal and vertical directions within the window,
Δ_x(i,j) = Σ_{(k,l)∈L} g_x(i+k, j+l) and Δ_y(i,j) = Σ_{(k,l)∈L} g_y(i+k, j+l),
wherein k denotes a unit offset in the horizontal direction, l a unit offset in the vertical direction, g_x(i+k,j+l) the horizontal gradient of the pixel with coordinates (i+k,j+l), g_y(i+k,j+l) the vertical gradient of that pixel, and L the adaptive window; the window direction is then updated from the algebraic sums as θ_m(i,j) = arctan(Δ_y(i,j)/Δ_x(i,j));
Step c: compute the sums of the absolute values of the gradients in the horizontal and vertical directions within the window, and update the window size according to these absolute sums, wherein w_m(i,j) denotes the horizontal size of the adaptive window, H_m(i,j) its vertical size and θ_m(i,j) its direction;
Step d: when the adaptive window size satisfies the stopping condition, updating of the window size and direction stops, wherein w_{m+1}(i,j) denotes the horizontal size of the adaptive window at the next iteration and H_{m+1}(i,j) its vertical size.
7. The robot vision stereo matching method according to claim 6, wherein the step 7 of conducting guided filtering within an adaptive window is implemented by:
firstly, the energy-function optimization model of the guided-filtering algorithm is established:
E(a_k, b_k) = Σ_{i∈N_k} ((a_k I_i + b_k - P_i)² + ε a_k²),
wherein a_k and b_k denote the linear coefficients of the guided filter, I_i the input (guidance) image, P_i the image to be filtered, i and k image indices, ε a_k² the regularization term placed in the energy function to prevent a_k from becoming too large, and N_k the adaptive support window of pixel k;
secondly, by minimizing the energy-function optimization model, the linear coefficients of the adaptive-window-based guided filtering are obtained as
a_k = ((1/|N_k|) Σ_{i∈N_k} I_i P_i - μ_k P̄_k) / (σ_k² + ε) and b_k = P̄_k - a_k μ_k,
wherein μ_k denotes the mean of I_i, P̄_k the mean of the image to be filtered within the adaptive window, and σ_k the standard deviation of I_i.
8. The method for robot visual stereo matching according to claim 7, wherein the final cost aggregation result obtained in step 8, denoted Ĉ, is obtained by solving, through the coefficient matrix A, the system that couples the aggregated costs of all scales, wherein Ĉ denotes the final aggregation cost, S the number of down-sampling layers, A the coefficient matrix of the solving process, s an individual layer, and C̃⁰ the matching cost matrix of layer 0.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110304658.1A CN112991421B (en) | 2021-03-23 | 2021-03-23 | Robot vision stereo matching method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110304658.1A CN112991421B (en) | 2021-03-23 | 2021-03-23 | Robot vision stereo matching method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112991421A true CN112991421A (en) | 2021-06-18 |
CN112991421B CN112991421B (en) | 2023-08-08 |
Family
ID=76334333
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110304658.1A Active CN112991421B (en) | 2021-03-23 | 2021-03-23 | Robot vision stereo matching method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112991421B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013126114A (en) * | 2011-12-14 | 2013-06-24 | Samsung Yokohama Research Institute Co Ltd | Stereo image processing method and stereo image processing apparatus |
US20130259360A1 (en) * | 2012-03-27 | 2013-10-03 | Fujitsu Limited | Method and system for stereo correspondence |
CN103440653A (en) * | 2013-08-27 | 2013-12-11 | 北京航空航天大学 | Binocular vision stereo matching method |
CN110473217A (en) * | 2019-07-25 | 2019-11-19 | 沈阳工业大学 | A kind of binocular solid matching process based on Census transformation |
CN112102382A (en) * | 2020-09-16 | 2020-12-18 | 北京邮电大学 | Electromechanical equipment visual information stereo matching algorithm based on multi-scale transformation and ADcensus-JWGF |
Non-Patent Citations (1)
Title |
---|
Wang Yunfeng; Wu Wei; Yu Xiaoliang; Wang Anran: "Binocular stereo matching based on adaptive-weight AD-Census transform", Advanced Engineering Sciences (工程科学与技术), no. 04 *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113822915A (en) * | 2021-07-30 | 2021-12-21 | 济宁安泰矿山设备制造有限公司 | Image stereo matching method for intelligent pump cavity endoscope fault diagnosis |
CN116071415A (en) * | 2023-02-08 | 2023-05-05 | 淮阴工学院 | Stereo matching method based on improved Census algorithm |
CN116071415B (en) * | 2023-02-08 | 2023-12-01 | 淮阴工学院 | Stereo matching method based on improved Census algorithm |
Also Published As
Publication number | Publication date |
---|---|
CN112991421B (en) | 2023-08-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106780590A (en) | The acquisition methods and system of a kind of depth map | |
CN107578430B (en) | Stereo matching method based on self-adaptive weight and local entropy | |
CN104616286B (en) | Quick semi-automatic multi views depth restorative procedure | |
CN102184540B (en) | Sub-pixel level stereo matching method based on scale space | |
CN109887021B (en) | Cross-scale-based random walk stereo matching method | |
CN104318576B (en) | Super-pixel-level image global matching method | |
CN112991421A (en) | Robot vision stereo matching method | |
CN111105452B (en) | Binocular vision-based high-low resolution fusion stereo matching method | |
CN106408596A (en) | Edge-based local stereo matching method | |
CN102740096A (en) | Space-time combination based dynamic scene stereo video matching method | |
CN112435267B (en) | Disparity map calculation method for high-resolution urban satellite stereo image | |
CN107945222A (en) | A kind of new Stereo matching cost calculates and parallax post-processing approach | |
CN112287824A (en) | Binocular vision-based three-dimensional target detection method, device and system | |
CN115601406A (en) | Local stereo matching method based on fusion cost calculation and weighted guide filtering | |
CN104980726B (en) | A kind of binocular video solid matching method of associated movement vector | |
CN113034681B (en) | Three-dimensional reconstruction method and device for spatial plane relation constraint | |
CN107274448B (en) | Variable weight cost aggregation stereo matching algorithm based on horizontal tree structure | |
CN113344989B (en) | NCC and Census minimum spanning tree aerial image binocular stereo matching method | |
CN113674415B (en) | Method for jointly manufacturing continuous and hollow-free DSM (digital image) by utilizing high-resolution seventh image and resource third image | |
CN109816711B (en) | Stereo matching method adopting adaptive structure | |
CN110910438B (en) | High-speed stereo matching algorithm for ultrahigh-resolution binocular image | |
CN113850293A (en) | Positioning method based on multi-source data and direction prior joint optimization | |
Sandström et al. | Splat-SLAM: Globally Optimized RGB-only SLAM with 3D Gaussians | |
CN114187208B (en) | Semi-global stereo matching method based on fusion cost and self-adaptive penalty term coefficient | |
CN117078982B (en) | Deep learning-based large-dip-angle stereoscopic image alignment dense feature matching method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||