CN113763451B - Hierarchical search method for binocular vision depth measurement of intelligent vehicle - Google Patents

Hierarchical search method for binocular vision depth measurement of intelligent vehicle

Info

Publication number
CN113763451B
CN113763451B (application CN202111117235.5A)
Authority
CN
China
Prior art keywords
region
template
point
image
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111117235.5A
Other languages
Chinese (zh)
Other versions
CN113763451A (en)
Inventor
白羚
李银国
周中奎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202111117235.5A priority Critical patent/CN113763451B/en
Publication of CN113763451A publication Critical patent/CN113763451A/en
Application granted granted Critical
Publication of CN113763451B publication Critical patent/CN113763451B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a hierarchical search method for binocular vision depth measurement of an intelligent vehicle. A vehicle-mounted embedded binocular camera in an intelligent-vehicle driver-assistance system is calibrated without requiring a specific mounting condition or the epipolar rectification of an ideal camera model; a region template and correlation coefficient functions are constructed; region-level stereo matching at the convolution-filtered region template positions is performed on time-aligned left-right stereo image pairs; a positioning-point projection relation model of the region template is constructed; and the region-level stereo matching is optimized by hierarchical search, so that the depth measurement and the three-dimensional space coordinates can be calculated. The method achieves salient region-level stereo matching of global view features, needs no complex pixel-level geometric calculation or nonlinear optimization, has low redundancy, and effectively alleviates the problems of weak image texture features, low reliability of conventional region matching, and limited image quality.

Description

Hierarchical search method for binocular vision depth measurement of intelligent vehicle
Technical Field
The invention relates to the technical field of image processing, and in particular to a hierarchical search method for binocular vision depth measurement of an intelligent vehicle.
Background
With the development and application of intelligent-vehicle environment perception and advanced driver-assistance systems, autonomous, fast and low-cost environment perception and target ranging are basic requirements for automating driver assistance. Binocular vision based on depth cameras, the most fundamental of these technologies, is widely applied in fields such as autonomous logistics vehicles, robot obstacle avoidance and intelligent-vehicle driver assistance, spanning smart-home and transportation scenarios. The technology combines computer vision, solid geometry, image processing and related disciplines. With the rapid development of mobile communication networks, intelligent-vehicle environment perception has produced applications closely tied to environment depth measurement, such as reversing cameras, localization and mapping, three-dimensional reconstruction of environment scenes, and path planning.
Computer binocular stereo-vision depth estimation based on a binocular camera is the current low-cost visual scheme for driver assistance and the mainstream direction of depth measurement for robot localization and mapping. Existing methods estimate depth through binocular disparity analysis and through extraction and matching of binocular image feature points followed by triangulation and depth recovery; they imitate human stereo vision and achieve relatively high depth accuracy through stereo-geometric computation. However, binocular depth estimation based on disparity analysis and traditional feature-point extraction performs three-dimensional depth measurement with pixel-level image feature points, is limited to the ideal binocular camera model, i.e., an imaging mode with the epipolar constraint, has weak feature saliency in weakly textured regions, and has low reliability of conventional gray-level feature matching, so the overall method suffers from high redundancy and strong dependence on image quality.
Disclosure of Invention
The invention aims to provide a hierarchical search method for binocular vision depth measurement of an intelligent vehicle, so as to solve the technical problems of the prior art in which binocular depth estimation based on disparity analysis and traditional feature-point extraction performs three-dimensional depth measurement with pixel-level image feature points: the imaging mode requires the epipolar constraint, feature saliency in weakly textured regions is poor, the reliability of conventional gray-level feature matching is low, the redundancy of the whole method is high, and the dependence on image quality is strong.
In order to achieve the above purpose, the hierarchical search method for binocular vision depth measurement of the intelligent vehicle adopted by the invention comprises the following steps:
initializing condition setting and calibration of a binocular camera model;
constructing a regional template for acquiring image data by a binocular camera and initializing a correlation coefficient;
constructing positioning point projection relation models of corresponding region templates in the left image and the right image;
performing stereo matching and hierarchical search on the binocular region templates to obtain the depth measurement of the best region-level stereo match;
and calculating three-dimensional space coordinates based on the depth measurement value of the optimal region stereo matching.
The method for initializing the condition setting and calibration of the binocular camera model comprises the following steps of:
setting two installation conditions of a binocular camera;
and calibrating the internal parameters and the external parameters of the binocular camera to obtain a projection matrix from the two-dimensional image to the three-dimensional world.
Wherein, the step of setting two mounting conditions of the binocular camera comprises:
the installed binocular left and right cameras are the same in model, namely the same in internal reference, and are transversely installed in parallel;
the installed binocular left and right cameras are of different models, i.e., have different intrinsic parameters, are not mounted in parallel, and require no stereo rectification.
The method comprises the steps of calibrating internal parameters and external parameters of a binocular camera to obtain a projection matrix from a two-dimensional image to a three-dimensional world:
and respectively calibrating and calculating projection matrixes of the left camera model and the right camera model relative to a world coordinate system by using a monocular camera calibration method or a binocular camera calibration method.
The method for constructing the regional template of the binocular camera for collecting image data and initializing the correlation coefficient comprises the following steps of:
defining and initializing and constructing an image plane area template;
and calculating and initializing correlation coefficients of the left and right region templates.
Wherein, in the steps of definition and initialization construction of the image plane area template:
defining a block area on the image data collected by the left camera and the right camera as an area template, wherein the block area comprises a positioning point, a core area and an edge area of the core area, and the positioning point is a pixel point at the left upper corner position of the core area of the area template, namely an initial point of the core area;
an area template with pixels of 0 is initialized and constructed.
The method for constructing the positioning point projection relation model of the corresponding region template in the left image and the right image comprises the following steps:
decomposing the projection matrix calculated in the left and right camera models;
calculating a joint constant matrix coefficient decomposed in the projection matrix;
initializing a positioning point in the left image, and calculating the locus of the corresponding positioning point in the right image.
The steps of stereo matching and hierarchical searching of the binocular region template comprise the following steps:
a left and right paired stereo image of the left and right cameras at the same moment is acquired, and an initial positioning point of the regional template is initialized;
performing hierarchical search and stereo matching on the template positions of the corresponding areas of the left and right images, and calculating correlation coefficients;
and judging the right graph region template with the largest correlation as an optimal stereo matching result, and calculating a depth estimation measured value.
The method for initializing the initial positioning point of the regional template comprises the following steps of:
the binocular camera collects video sequences simultaneously, and extracts left and right image data of the same time interval frame after calibration alignment to form left and right paired stereoscopic images;
the locating point of the region template of the initialized left image is calculated from the pixel point at the upper left corner of the left image according to the sliding sequence from left to right and from top to bottom.
The step of performing hierarchical search and stereo matching of the corresponding region template positions in the left and right images and calculating the correlation coefficients includes the following steps:
calculating the initialized region template of the left camera image plane at the current positioning point, and setting the calculation step size, i.e., the sliding step and its condition;
performing convolution filtering on the region at the current left-image template position, and substituting the prior estimate of the positioning-point depth together with the template positioning point into the projection relation model;
calculating the estimated matching region template position, i.e., the positioning-point position, on the right camera image plane;
taking the three-dimensional world point corresponding to the depth prior estimated at the current left-image template positioning point as the first starting point of the calculation, and selecting a fixed number of points as the basic search point set;
obtaining the positioning-point set of the right-image region templates corresponding to the candidate stereo matches according to the projection relation model;
calculating the region correlation coefficient between the current left-image region template and each estimated right-image template position to obtain a correlation coefficient set;
and taking the right-image region template with the largest correlation as the best stereo-matching result and calculating the depth estimation measurement.
The beneficial effects of the invention are as follows: the vehicle-mounted embedded binocular camera of an intelligent-vehicle driver-assistance system is calibrated without requiring a specific mounting condition or the epipolar rectification of an ideal camera model; a region template and correlation coefficient functions are constructed; region-level stereo matching at the convolution-filtered region template positions is performed on time-aligned left-right stereo image pairs; a positioning-point projection relation model of the region template is constructed; and the region-level stereo matching is optimized by hierarchical search, so that the depth measurement and the three-dimensional space coordinates can be calculated. Salient region-level stereo matching of global view features is achieved without the complex pixel-level geometric calculation and nonlinear optimization, the redundancy is low, the problems of weak image texture features, low reliability of conventional region matching and limited image quality are effectively alleviated, and a basis is provided for driver-assistance functions such as three-dimensional reconstruction of the intelligent-vehicle environment scene, reversing cameras, localization and mapping, and path planning.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the steps of a hierarchical search method for binocular vision depth measurement of a smart car of the present invention.
Fig. 2 is a schematic diagram of the composition of the left and right region templates of the present invention.
Fig. 3 is a schematic diagram of the right-image positioning-point trajectory line based on the projection relation of region stereo matching according to the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
Referring to fig. 1 to 3, the present invention provides a hierarchical search method for binocular vision depth measurement of an intelligent vehicle, which includes the following steps:
s1, initializing condition setting and calibration of a binocular camera model;
s2, constructing a regional template for acquiring image data by a binocular camera and initializing a correlation coefficient;
s3, constructing a positioning point projection relation model of a corresponding region template in the left and right images;
s4, three-dimensional matching and hierarchical searching are carried out on the binocular region template to obtain a depth measurement value of the optimal region three-dimensional matching;
and S5, calculating three-dimensional space coordinates based on depth measurement values of the optimal region stereo matching.
The method for initializing the condition setting and calibration of the binocular camera model comprises the following steps of:
setting two installation conditions of a binocular camera;
and calibrating the internal parameters and the external parameters of the binocular camera to obtain a projection matrix from the two-dimensional image to the three-dimensional world.
Further, the step of setting two mounting conditions of the binocular camera includes:
the installed binocular left and right cameras are the same in model, namely the same in internal reference, and are transversely installed in parallel;
the installed binocular left and right cameras are of different models, i.e., have different intrinsic parameters, are not mounted in parallel, and require no stereo rectification.
Specifically, when the installed left and right cameras are of the same model, i.e., have identical intrinsic parameters, and are mounted in parallel along the x axis of the world coordinate system (i.e., laterally) with baseline B, the correspondence between pixel points on the left and right image planes is
u^(1) − u^(2) = f_x · B / d^(1),  v^(1) = v^(2)
where (u^(1), v^(1)) is the position, in image-plane coordinates, of a pixel in the image data acquired by the left camera, (u^(2), v^(2)) is the position of the corresponding pixel acquired by the right camera, d^(1) is the depth of the three-dimensional world point corresponding to (u^(1), v^(1)) with respect to the left camera, and f_x = f / S_x is the x-direction scale factor of the camera intrinsic matrix, with f the focal length and S_x the physical size of an image pixel in the x direction;
the intrinsic matrix is
K = [ f_x 0 c_x ; 0 f_y c_y ; 0 0 1 ]
where (c_x, c_y) is the principal point and f_y = f / S_y is the y-direction scale factor of the camera intrinsic matrix, with S_y the physical size of an image pixel in the y direction;
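As an illustrative sketch of this parallel-baseline relation (Python; the function name and numeric values are hypothetical, and f_x and B must be expressed in consistent units):

```python
# Sketch: depth from disparity for the parallel, same-intrinsics case.
# fx = f / Sx is the x-direction scale factor, B the baseline (same length unit as the depth).
def depth_from_disparity(u_left: float, u_right: float, fx: float, B: float) -> float:
    """d = fx * B / (u_left - u_right); the v coordinates coincide for parallel mounting."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: point not in front of both cameras")
    return fx * B / disparity

# Example: fx = 800 px, B = 0.12 m, disparity of 16 px -> depth of 6.0 m
print(depth_from_disparity(640.0, 624.0, fx=800.0, B=0.12))
```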
the binocular left and right cameras are installed in different models, namely, internal parameters are different, and the binocular left and right cameras are installed without strict parallelism, namely, without stereo correction.
Further, in the step of calibrating the internal parameters and external parameters of the binocular camera to obtain the projection matrix from the two-dimensional image to the three-dimensional world:
and respectively calibrating and calculating projection matrixes of the left camera model and the right camera model relative to a world coordinate system by using a monocular camera calibration method or a binocular camera calibration method.
Specifically, the projection matrices of the left and right camera models with respect to the world coordinate system are calibrated and computed separately using a monocular camera calibration method or a binocular camera calibration method. In the resulting projection relations, d^(2) is the depth of the three-dimensional world point corresponding to (u^(2), v^(2)) with respect to the right camera, M^(1) is the projection matrix relating the left camera to the three-dimensional world coordinate system, and M^(2) is the projection matrix relating the right camera to the three-dimensional world coordinate system. Note that each projection matrix is composed of the extrinsic parameters of the corresponding camera model together with its intrinsic part, and that the intrinsic part is the inverse of the so-called intrinsic matrix in the calibration result.
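For illustration, the following sketch shows how such per-camera calibration results could be obtained and assembled with OpenCV; it builds the conventional projection matrix K·[R|t], whereas the patent states that its M^(1) and M^(2) use the inverse of the calibrated intrinsic matrix, so this is only an assumed standard construction, not the exact composition used in the method:

```python
import numpy as np
import cv2

def calibrate_and_build_projection(obj_points, img_points, image_size):
    """Monocular calibration sketch: returns the intrinsic matrix K, the distortion
    coefficients, and a conventional projection matrix P = K [R | t] for the first
    calibration view (illustrative; the patent's M matrices are composed differently)."""
    ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, image_size, None, None)
    R, _ = cv2.Rodrigues(rvecs[0])          # rotation of the first view
    Rt = np.hstack([R, tvecs[0]])           # 3x4 extrinsic block [R | t]
    P = K @ Rt                              # 3x4 projection matrix
    return K, dist, P

# obj_points / img_points are lists of corresponding 3D board points and 2D detections,
# e.g. collected with cv2.findChessboardCorners for each calibration image.
```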
The method for constructing the regional template of the binocular camera for collecting image data and initializing the correlation coefficient comprises the following steps of:
defining and initializing and constructing an image plane area template;
and calculating and initializing correlation coefficients of the left and right region templates.
Further, in the steps of defining and initializing the image plane region template:
defining a block area on the image data collected by the left camera and the right camera as an area template, wherein the block area comprises a positioning point, a core area and an edge area of the core area, and the positioning point is a pixel point at the left upper corner position of the core area of the area template, namely an initial point of the core area;
an area template with pixels of 0 is initialized and constructed.
Specifically, a block area on the image data collected by the left and right cameras is defined as an area template, which comprises a positioning point, an n×n core area and an extension boundary with the width of m, namely an edge area of the core area, wherein the positioning point is a pixel point at the upper left corner of the core area of the area template, namely an initial point of the core area, and can also be called a mark point;
an area template with a pixel of 0 and a width of n+2×m is initialized and constructed.
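A minimal sketch of this region template structure (Python; function and variable names are illustrative, not from the patent), with the positioning point at the top-left of the n×n core, an edge region of width m, and an overall width of n + 2m initialized to 0:

```python
import numpy as np

def make_region_template(image: np.ndarray, anchor_u: int, anchor_v: int,
                         n: int, m: int) -> np.ndarray:
    """Cut the (n + 2m) x (n + 2m) region template whose n x n core has its
    top-left pixel (the positioning / marking point) at column anchor_u, row anchor_v.
    Positions falling outside the image stay 0, matching the all-zero initialization."""
    size = n + 2 * m
    template = np.zeros((size, size), dtype=image.dtype)   # initialize with pixel value 0
    top, left = anchor_v - m, anchor_u - m                  # edge region surrounds the core
    for r in range(size):
        for c in range(size):
            rr, cc = top + r, left + c
            if 0 <= rr < image.shape[0] and 0 <= cc < image.shape[1]:
                template[r, c] = image[rr, cc]
    return template
```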
Further, in the step of calculating and initializing the correlation coefficients of the left and right region templates:
initializing positioning points, color parameters, core area width and expansion boundaries of a left area template and a right area template with equal widths contained in left and right image data acquired by a left camera and a right camera at the same moment;
initializing the minimum average error correlation coefficient of the left and right region templates;
initializing normalized average error correlation coefficients of the left and right region templates;
initializing the compressed cosine distance correlation coefficients of the left and right region templates.
Specifically, the positioning point of the left region template contained in the left image data acquired at a given moment is P(u, v) with corresponding color parameter w(u, v), the positioning point of the right region template contained in the right image data acquired at the same moment is Q(x, y) with corresponding color parameter f(x, y), the core regions of both templates have width n, and both extension boundaries have width m, so the two templates have equal overall width N = n + 2m;
initializing the minimum-average-error correlation coefficient of the left and right region templates:
M_ad = (1 / N²) · Σ_{i=0}^{N−1} Σ_{j=0}^{N−1} | w(u + i, v + j) − f(x + i, y + j) |
where i and j are the current row and column pixel indices of the region template in the loop, i.e., they run from 0 to N − 1;
M_ad is the average absolute difference over all corresponding pixel pairs of the left and right region templates and ranges from 0 to 255: the smaller M_ad, the greater the correlation of the corresponding template positions in the left and right images, and the larger M_ad, the smaller the correlation. A relative local minimum M_ad(min) is computed; in particular, when the images under the left and right region templates coincide exactly, M_ad = 0.
Initializing the normalized-average-error correlation coefficient C of the left and right region templates: in its expression, w̄ and f̄ are the mean values of all pixels at the left and right template image positions, and a threshold is set on C; the larger the value of C, the greater the correlation of the corresponding template positions in the left and right images, and the smaller the value, the smaller the correlation.
Initializing the compressed cosine-distance correlation coefficient of the left and right region templates:
take the subvectors w_u and w_v obtained by compressed projection of the left region template, acquired by the left camera, in the u direction and the v direction respectively; similarly, take the subvectors f_x and f_y obtained by compressed projection of the right region template, acquired by the right camera, in the x direction and the y direction respectively. The cosine-distance correlation coefficient of the compressed projections is
C = ⟨(w_u, w_v)ᵀ, (f_x, f_y)ᵀ⟩ / ( ‖(w_u, w_v)ᵀ‖ · ‖(f_x, f_y)ᵀ‖ )
i.e., the cosine of the angle between the two 2N-dimensional vectors (w_u, w_v)ᵀ and (f_x, f_y)ᵀ; it ranges from 0 to 1, and the larger the value, the greater the correlation of the corresponding template positions in the left and right images, and the smaller the value, the smaller the correlation.
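The following sketch illustrates two of these measures, the minimum-average-error coefficient and the compressed cosine-distance coefficient, for grayscale templates of equal size; the choice of row and column sums for the u- and v-direction compressed projections is an assumption, since the patent only names the subvectors:

```python
import numpy as np

def mad_coefficient(left_tpl: np.ndarray, right_tpl: np.ndarray) -> float:
    """Average absolute difference of corresponding pixels; 0 means the two
    template regions coincide, larger values mean lower correlation."""
    return float(np.mean(np.abs(left_tpl.astype(np.float64) -
                                right_tpl.astype(np.float64))))

def compressed_cosine_coefficient(left_tpl: np.ndarray, right_tpl: np.ndarray) -> float:
    """Project each template onto its column direction (sum over rows) and row
    direction (sum over columns), concatenate into 2N-dimensional vectors and
    return the cosine of the angle between them; values near 1 mean high correlation."""
    w = np.concatenate([left_tpl.sum(axis=0), left_tpl.sum(axis=1)]).astype(np.float64)
    f = np.concatenate([right_tpl.sum(axis=0), right_tpl.sum(axis=1)]).astype(np.float64)
    denom = np.linalg.norm(w) * np.linalg.norm(f)
    return float(w @ f / denom) if denom > 0 else 0.0
```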
Further, the step of constructing the positioning point projection relation model of the corresponding region template in the left and right images comprises the following steps:
decomposing the projection matrix calculated in the left and right camera models;
calculating a joint constant matrix coefficient decomposed in the projection matrix;
initializing a positioning point in the left image, and calculating the locus of the corresponding positioning point in the right image.
Specifically, the projection matrices M^(1) and M^(2) computed for the left and right camera models in S1 are decomposed into block form;
the joint constant matrix coefficients of the decomposed projection matrices are then calculated;
a positioning point P(u, v) is initialized in the left image, and the corresponding positioning point Q(x, y) in the right image is calculated.
The above states that, given the region template positioning point P(u, v) on the left image plane and a prior depth estimate d^(1) of the corresponding three-dimensional world point (X, Y, Z) with respect to the left camera, the corresponding region template positioning point Q(x, y) on the right image plane can be estimated; the mapping from P(u, v) to Q(x, y) is a perspective-transformation relation of two-dimensional solid geometry.
When the depth corresponding to the fixed positioning point P(u, v) of the current left image-plane region template is an unknown variable of unknown scale, the locus of the positioning point Q(x, y) of the corresponding stereo-matching region template on the right image plane is a straight line.
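One standard way to realize this positioning-point projection relation is to back-project P(u, v) with the prior depth through the left camera model and re-project the resulting world point into the right camera. The sketch below assumes conventional intrinsics K1, K2 and extrinsics (R1, t1), (R2, t2) in a common world frame, which is an assumption about how the calibrated quantities would be used rather than the patent's exact decomposition of M^(1) and M^(2); sweeping the depth prior traces the straight-line locus of Q(x, y) described above:

```python
import numpy as np

def right_anchor_from_left(P_uv, d, K1, R1, t1, K2, R2, t2):
    """Given the left-image positioning point P_uv = (u, v) and a prior depth d
    (z in the left camera frame), return the corresponding right-image point Q = (x, y).
    t1 and t2 are length-3 translation vectors; X_cam = R X_world + t is assumed."""
    u, v = P_uv
    ray = np.linalg.inv(K1) @ np.array([u, v, 1.0])      # normalized ray in the left camera
    X_cam_left = d * ray                                  # 3D point in the left camera frame
    X_world = R1.T @ (X_cam_left - t1)                    # back to the world frame
    X_cam_right = R2 @ X_world + t2                       # into the right camera frame
    q = K2 @ X_cam_right
    return q[0] / q[2], q[1] / q[2]

# Sweeping d over a range of prior values traces a straight line of Q positions
# in the right image, which is the search locus used by the hierarchical search.
```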
Further, the step of performing stereo matching and hierarchical search on the binocular region templates to obtain the depth measurement of the best region-level stereo match includes the following steps:
a left and right paired stereo image of the left and right cameras at the same moment is acquired, and an initial positioning point of the regional template is initialized;
and carrying out hierarchical search and stereo matching on the template positions of the corresponding areas of the left and right images, and calculating the correlation coefficient.
And judging the right graph region template with the largest correlation as an optimal stereo matching result, and calculating a depth estimation measured value.
Further, the step of collecting left and right paired stereo images of the same moment of the left and right cameras and initializing an initial positioning point of the region template comprises the following steps:
the binocular camera collects video sequences simultaneously, and extracts left and right image data of the same time interval frame after calibration alignment to form left and right paired stereoscopic images;
the locating point of the region template of the initialized left image is calculated from the pixel point at the upper left corner of the left image according to the sliding sequence from left to right and from top to bottom.
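A small sketch of this left-image positioning-point sweep (Python; names are illustrative), assuming the sliding step equals the core width n as in the later description:

```python
def left_anchor_sequence(image_width: int, image_height: int, n: int):
    """Yield left-image positioning points P(u, v) from the top-left pixel,
    left to right and top to bottom, stepping by the core width n and
    keeping the whole n x n core inside the image."""
    for v in range(0, image_height - n + 1, n):       # rows: top to bottom
        for u in range(0, image_width - n + 1, n):    # columns: left to right
            yield u, v
```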
Further, the step of performing hierarchical search and stereo matching of the corresponding region template positions in the left and right images and calculating the correlation coefficients includes the following steps:
calculating the initialized region template of the left camera image plane at the current positioning point, and setting the calculation step size, i.e., the sliding step and its condition;
performing convolution filtering on the region at the current left-image template position, and substituting the prior estimate of the positioning-point depth together with the template positioning point into the projection relation model;
calculating the estimated matching region template position, i.e., the positioning-point position, on the right camera image plane;
taking the three-dimensional world point corresponding to the depth prior estimated at the current left-image template positioning point as the first starting point of the calculation, and selecting a fixed number of points as the basic search point set;
obtaining the positioning-point set of the right-image region templates corresponding to the candidate stereo matches according to the projection relation model;
and calculating the region correlation coefficient between the current left-image region template and each estimated right-image template position to obtain a correlation coefficient set.
Further, the step of taking the right-image region template with the largest correlation as the best stereo-matching result and calculating the depth estimation measurement includes:
if the correlation coefficient set meets the preset condition, determining that the left and right image regions covered by the region templates at the left positioning point and the corresponding right positioning point realize left-right region stereo matching;
if the preset condition is not met, expanding the search range, adding the positioning points corresponding to the additional predicted depth estimates, and updating the point set;
then calculating a new correlation coefficient set from the new point set and jumping back to the step of taking the three-dimensional world point corresponding to the depth prior estimated at the current left-image template positioning point as the first starting point and selecting a fixed number of points as the basic search point set; otherwise, proceeding to the next step;
if the preset condition is still not met after the search range has been expanded, performing minimum-value suppression of the correlation coefficient; if the minimum reliability is not reached, the left-right region stereo matching is judged to have failed;
if the correlation coefficient set has a maximum but is not unimodal, taking the core region of the region template as a seed region and growing the edge region of the extension boundary stepwise, i.e., updating the correlation function of the region template;
if the extension boundary of the template core region reaches its maximum and the preset condition is still not met, the left-right region stereo matching is judged to be invalid;
if the left and right cameras are of the same model and mounted in parallel, the hierarchical search calculation can be simplified, and the pixel correspondence between the left and right image planes and the projection matrix are simplified and updated accordingly;
if a best stereo match exists, the measured depth value is output; if not, the region template position is updated according to the sliding step, and region stereo matching and depth estimation are performed at the next left-image template position, i.e., the method jumps back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients;
if the computed positioning points of the corresponding right-image template position are out of range, i.e., not within the right image-plane coordinates, the region template is updated and decomposed in a stepwise pyramid fashion; when the decomposition reaches order 0 from order s, i.e., the template degenerates to a pixel point, the method jumps back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients;
if the sliding step has not reached the right and lower boundaries of the left image, the region template position is updated according to the sliding step and the method jumps back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients; otherwise, all best stereo matches and measured depth values are output and the calculation stops.
Specifically, the initialized region template Mask(P(u, v), n, m) of the left camera image plane is computed at the current positioning point (i, j); the calculation step size is n and the sliding condition is i + n < image width;
convolution filtering is performed on the region at the current left-image template position, Mask(P(u, v), n, m)_{i,j}, to obtain the prior estimate d_0 of the positioning-point depth, and d_0 together with the template positioning point P(u, v) is substituted into the projection relation model of step S3;
the estimated matching region template Mask'(Q(x_0, y_0), n, m) on the right camera image plane, i.e., its positioning-point position, is computed;
d_0 is the initial depth, obtained as a prior estimate of the depth of the target region;
the three-dimensional world point corresponding to the depth prior d_0 estimated at the current left-image template positioning point is taken as the first starting point of the calculation, and 2k + 1 points are selected as the basic search point set:
d_i = d_0 ± i · Δd, i = 1, …, k
where Δd is the search step; in this embodiment k = 2;
according to the projection relation model of step S3, the positioning-point set of the right-image region templates corresponding to the 2k + 1 candidate stereo matches, {Q(x_i, y_i)}, i = 0, ±1, …, ±k, is obtained;
the correlation coefficient C(P(u, v), Q(x_i, y_i), n, m) between the current left-image template region Mask(P(u, v), n, m)_{i,j} and each estimated right-image template region Mask'(Q(x_i, y_i), n, m) is computed, yielding the correlation coefficient set C(x_i, y_i) = C_i, i = 0, ±1, …, ±k;
If the preset condition is satisfied, i.e., the distribution of C_{−k}, C_{−k+1}, …, C_0, C_1, …, C_k is unimodal with its maximum at the point (x*, y*) and |C(x*, y*)| ≥ δ, where δ is an acceptable confidence threshold, then the positioning point P(u, v) and the positioning point Q(x*, y*) are judged to give left-right region stereo matching of the image regions covered by the corresponding region templates;
accordingly, all pixel points in the regions Mask(P(u, v), n, m)_{i,j} and Mask'(Q(x*, y*), n, m) are matched element-wise, and their depth values take the region-level depth d_{i*} = d*, where C* = C(x*, y*) is the maximum of the correlation coefficient set {C_i};
if the preset condition is not satisfied, the search range is expanded by adding the predicted depth estimates d_{k+1}, …, d_K and the point set is updated, where K is the index of the maximum search range, chosen such that d_K does not exceed the maximum depth value;
a new correlation coefficient set C_{−k}, …, C_0, …, C_K is then calculated from the new point set and the method jumps back to the step of taking the three-dimensional world point corresponding to the depth prior d_0 estimated at the current left-image template positioning point as the first starting point and selecting 2k + 1 points as the basic search point set; otherwise, the next step is performed;
if the preset condition is still not satisfied after the search range has been expanded, minimum-value suppression is applied to C*; if |C*| < δ, the minimum reliability is not reached and the left-right region stereo matching is judged to have failed;
if the correlation coefficient set {C_i} has a maximum with |C*| ≥ δ but is not unimodal, for example there are several peaks or the distribution is flat-topped, the best stereo match is not distinctive enough, which indicates that the template position lies in a region with weak image texture detail, for example a uniform region such as sky; in that case the core region of the template is taken as a seed region and the edge region of the extension boundary is grown stepwise by a width increment add, i.e., the template correlation function is updated to C(P(u, v), Q(x*, y*), n, (m + add))_{i,j}; for example, add may be 2, giving a template size of n + 2(m + add);
if m + add reaches its maximum m_max and the preset condition is still not met, the left-right region stereo matching is judged to be invalid at this point;
if the left and right cameras are of the same model and mounted in parallel, the calculation can be simplified: under the camera model calibrated in S1, the pixel correspondence between the left and right image planes reduces to the parallel-baseline relation given above, and the projection matrix of the projection model in S3 can be updated and simplified accordingly;
if a best stereo match exists, the measured depth value is output; if not, the region template position is updated according to the sliding step n, and region stereo matching and depth estimation are performed at the next left-image template position, i.e., the method jumps back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients;
if the computed positioning point of the corresponding right-image template position is out of range, i.e., not within the right image-plane coordinates, the region template is updated and decomposed in a stepwise pyramid fashion according to its size; when the decomposition reaches order 0 from order s, i.e., the template degenerates to a pixel point, the method jumps back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients;
if i, j have not reached the right and lower boundaries of the left image under the sliding step, the region template is updated, its position is advanced by the sliding step n, and the method jumps back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients; otherwise, all best stereo matches and measured depth values are output and the calculation stops.
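A condensed sketch of the hierarchical depth search around a single left-image positioning point, following the candidate-generation and decision steps above; the correlation evaluation is abstracted into a callback, the unimodality test and thresholds are simplified stand-ins for the patent's preset conditions, the search range is expanded symmetrically for brevity, and the template-growth fallback is omitted:

```python
def hierarchical_depth_search(corr_at_depth, d0, delta_d, k=2, k_max=8,
                              delta=0.8, d_max=80.0):
    """corr_at_depth(d) -> (correlation C, right-image positioning point Q) for one
    candidate depth. Returns (depth, Q, C) of the best match, or None when matching
    fails. d0 is the prior depth estimate (assumed positive), delta_d the search step."""
    while True:
        depths = [d0 + i * delta_d for i in range(-k, k + 1)]
        depths = [d for d in depths if 0.0 < d <= d_max]
        scored = []
        for d in depths:
            c, q = corr_at_depth(d)
            scored.append((c, d, q))
        best_c, best_d, best_q = max(scored, key=lambda s: s[0])
        near_peak = sum(1 for c, _, _ in scored if c >= 0.98 * best_c)
        if best_c >= delta and near_peak == 1:       # roughly unimodal and confident enough
            return best_d, best_q, best_c
        if k >= k_max:                               # search range exhausted
            return None                              # minimum reliability not reached
        k += 1                                       # expand the depth search range
```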
Further, the step of calculating three-dimensional space coordinates based on depth measurement of the best region stereo matching includes:
and calculating three-dimensional space coordinates according to the optimal stereo matching result and the depth measurement.
Specifically, the three-dimensional space coordinates are calculated from the best stereo-matching result and the depth measurement output in step S4,
where (u, v) ranges over the pixel coordinates of the core region of the region template, and the three-dimensional coordinates (X, Y, Z) are obtained from (u, v), the measured depth and the left-camera projection matrix M^(1).
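A sketch of this back-projection of a core-region pixel and its region-level depth to world coordinates, under the same assumed camera convention as the earlier projection sketch (the exact expression in the patent uses its projection matrix M^(1) and is not reproduced here):

```python
import numpy as np

def pixel_to_world(u, v, d, K1, R1, t1):
    """Back-project left-image pixel (u, v) with measured depth d (z in the left
    camera frame) to world coordinates (X, Y, Z); t1 is a length-3 vector and
    the convention X_cam = R1 X_world + t1 is assumed."""
    X_cam = d * (np.linalg.inv(K1) @ np.array([u, v, 1.0]))
    return R1.T @ (X_cam - t1)           # invert X_cam = R1 X_world + t1
```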
According to the invention, the vehicle-mounted embedded binocular camera of an intelligent-vehicle driver-assistance system is calibrated without requiring a specific mounting condition or the epipolar rectification of an ideal camera model; a region template and correlation coefficient functions are constructed; region-level stereo matching at the convolution-filtered region template positions is performed on time-aligned left-right stereo image pairs; a positioning-point projection relation model of the region template is constructed; and the stereo matching is optimized by hierarchical search, so that the depth measurement and the three-dimensional space coordinates can be calculated. Salient region-level stereo matching of global view features is achieved without the complex pixel-level geometric calculation and nonlinear optimization, the redundancy is low, the problems of weak image texture features, low reliability of conventional region matching and limited image quality are effectively alleviated, and a basis is provided for driver-assistance functions such as three-dimensional reconstruction of the intelligent-vehicle environment scene, reversing cameras, localization and mapping, and path planning.
The above disclosure is only a preferred embodiment of the present invention and certainly cannot be taken to limit the scope of the invention. Those of ordinary skill in the art will understand that implementations realizing all or part of the procedures described above, as well as equivalent variations made according to the claims, still fall within the scope covered by the present invention.

Claims (5)

1. The hierarchical search method for binocular vision depth measurement of the intelligent vehicle is characterized by comprising the following steps of:
initializing condition setting and calibration of a binocular camera model;
the steps of constructing a regional template for the binocular camera to collect image data and initializing correlation coefficients include:
defining and initializing and constructing an image plane area template;
calculation and initialization of correlation coefficients of the left and right region templates,
the defining and initializing construction steps of the image plane area template include: defining a block area on the image data collected by the left camera and the right camera as an area template, wherein the block area comprises a positioning point, an n multiplied by n core area, an extension boundary with the width of m, namely an edge area of the core area, and the positioning point is a pixel point at the upper left corner position of the core area of the area template, namely an initial point of the core area; initializing and constructing a region template with a pixel of 0 and a width of n+2×m;
in the step of calculating and initializing the correlation coefficients of the left and right region templates: initializing positioning points, color parameters, core area width and expansion boundaries of a left area template and a right area template with equal widths contained in left and right image data acquired by a left camera and a right camera at the same moment; initializing the minimum average error correlation coefficient of the left and right region templates; initializing normalized average error correlation coefficients of the left and right region templates; initializing a compressed cosine distance correlation coefficient of the left and right region templates;
constructing positioning point projection relation models of corresponding region templates in the left image and the right image;
the step of obtaining the depth measurement value of the optimal region stereo matching comprises the following steps of:
a left and right paired stereo image of the left and right cameras at the same moment is acquired, and an initial positioning point of the regional template is initialized;
performing hierarchical search and stereo matching on the template positions of the corresponding areas of the left and right images, and calculating correlation coefficients;
judging the right graph region template with the largest correlation as the optimal stereo matching result, calculating a depth estimation measured value,
the step of collecting left and right paired stereo images of the left and right cameras at the same moment and initializing an initial positioning point of the regional template comprises the following steps: the binocular camera collects video sequences simultaneously, and extracts left and right image data of the same time interval frame after calibration alignment to form left and right paired stereoscopic images; the anchor points of the region template of the initialized left image are calculated from the left upper corner pixel point of the left image according to the sliding sequence from left to right and from top to bottom,
the step of performing hierarchical search and stereo matching of the corresponding region template positions in the left and right images and calculating the correlation coefficients includes: calculating the initialized region template of the left camera image plane at the current positioning point, and setting the calculation step size, i.e., the sliding step and its condition; performing convolution filtering on the region at the current left-image template position, and substituting the prior estimate of the positioning-point depth together with the template positioning point into the projection relation model; calculating the estimated matching region template position, i.e., the positioning-point position, on the right camera image plane; taking the three-dimensional world point corresponding to the depth prior estimated at the current left-image template positioning point as the first starting point of the calculation, and selecting a fixed number of points as the basic search point set; obtaining the positioning-point set of the right-image region templates corresponding to the candidate stereo matches according to the projection relation model; calculating the region correlation coefficient between the current left-image region template and each estimated right-image template position to obtain a correlation coefficient set,
the step of taking the right-image region template with the largest correlation as the best stereo-matching result and calculating the depth estimation measurement includes: if the correlation coefficient set meets the preset condition, determining that the left and right image regions covered by the region templates at the left positioning point and the corresponding right positioning point realize left-right region stereo matching; if the preset condition is not met, expanding the search range, adding the positioning points corresponding to the additional predicted depth estimates, and updating the point set; then calculating a new correlation coefficient set from the new point set and jumping back to the step of taking the three-dimensional world point corresponding to the depth prior estimated at the current left-image template positioning point as the first starting point and selecting a fixed number of points as the basic search point set, otherwise proceeding to the next step; if the preset condition is still not met after the search range has been expanded, performing minimum-value suppression of the correlation coefficient, and if the minimum reliability is not reached, judging that the left-right region stereo matching has failed; if the correlation coefficient set has a maximum but is not unimodal, taking the core region of the region template as a seed region and growing the edge region of the extension boundary stepwise, i.e., updating the correlation function of the region template; if the extension boundary of the template core region reaches its maximum and the preset condition is still not met, judging that the left-right region stereo matching is invalid; if the left and right cameras are of the same model and mounted in parallel, simplifying the hierarchical search calculation and simplifying and updating the pixel correspondence between the left and right image planes and the projection matrix; if a best stereo match exists, outputting the measured depth value, and if not, updating the region template position according to the sliding step and performing region stereo matching and depth estimation at the next left-image template position, i.e., jumping back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients; if the sliding step has not reached the right and lower boundaries of the left image, updating the region template position according to the sliding step and jumping back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients, otherwise outputting all best stereo matches and measured depth values and stopping the calculation;
and calculating three-dimensional space coordinates based on the depth measurement value of the optimal region stereo matching.
2. The hierarchical search method for binocular vision depth measurement of intelligent vehicles according to claim 1, wherein the step of initializing condition setting and calibration of the binocular camera model comprises:
setting two installation conditions of a binocular camera;
and calibrating the internal parameters and the external parameters of the binocular camera to obtain a projection matrix from the two-dimensional image to the three-dimensional world.
3. The hierarchical search method for binocular vision depth measurement of intelligent vehicles according to claim 2, wherein the step of setting two installation conditions of the binocular camera comprises:
the installed binocular left and right cameras are the same in model, namely the same in internal reference, and are transversely installed in parallel;
the installed binocular left and right cameras are of different models, i.e., have different intrinsic parameters, are not mounted in parallel, and require no stereo rectification.
4. The hierarchical search method for binocular vision depth measurement of intelligent vehicles according to claim 2, wherein the step of calibrating the internal and external parameters of the binocular camera to obtain the projection matrix of the two-dimensional image to the three-dimensional world comprises the steps of:
and respectively calibrating and calculating projection matrixes of the left camera model and the right camera model relative to a world coordinate system by using a monocular camera calibration method or a binocular camera calibration method.
5. The hierarchical search method for binocular vision depth measurement of intelligent vehicles according to claim 1, wherein the step of constructing the anchor point projection relation model of the corresponding region templates in the left and right images comprises:
decomposing the projection matrix calculated in the left and right camera models;
calculating a joint constant matrix coefficient decomposed in the projection matrix;
initializing a positioning point in the left image, and calculating the locus of the corresponding positioning point in the right image.
CN202111117235.5A 2021-09-23 2021-09-23 Hierarchical search method for binocular vision depth measurement of intelligent vehicle Active CN113763451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111117235.5A CN113763451B (en) 2021-09-23 2021-09-23 Hierarchical search method for binocular vision depth measurement of intelligent vehicle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111117235.5A CN113763451B (en) 2021-09-23 2021-09-23 Hierarchical search method for binocular vision depth measurement of intelligent vehicle

Publications (2)

Publication Number Publication Date
CN113763451A CN113763451A (en) 2021-12-07
CN113763451B true CN113763451B (en) 2024-01-02

Family

ID=78797139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111117235.5A Active CN113763451B (en) 2021-09-23 2021-09-23 Hierarchical search method for binocular vision depth measurement of intelligent vehicle

Country Status (1)

Country Link
CN (1) CN113763451B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014145856A1 (en) * 2013-03-15 2014-09-18 Pelican Imaging Corporation Systems and methods for stereo imaging with camera arrays
US10755428B2 (en) * 2017-04-17 2020-08-25 The United States Of America, As Represented By The Secretary Of The Navy Apparatuses and methods for machine vision system including creation of a point cloud model and/or three dimensional model

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101702751A (en) * 2009-11-03 2010-05-05 上海富瀚微电子有限公司 Three-dimensional block matching method in video noise-reduction treatment
CN102184410A (en) * 2011-05-06 2011-09-14 浙江工业大学 Three-dimensional recovered cranioface recognition method
CN105809717A (en) * 2016-03-10 2016-07-27 上海玮舟微电子科技有限公司 Depth estimation method, system and electronic equipment
CN106767399A (en) * 2016-11-11 2017-05-31 大连理工大学 The non-contact measurement method of the logistics measurement of cargo found range based on binocular stereo vision and dot laser
EP3418975A1 (en) * 2017-06-23 2018-12-26 Koninklijke Philips N.V. Depth estimation for an image
CN109360246A (en) * 2018-11-02 2019-02-19 哈尔滨工业大学 Stereo vision three-dimensional displacement measurement method based on synchronous sub-district search
CN110569704A (en) * 2019-05-11 2019-12-13 北京工业大学 Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
CN110246169A (en) * 2019-05-30 2019-09-17 华中科技大学 A kind of window adaptive three-dimensional matching process and system based on gradient
CN110414384A (en) * 2019-07-11 2019-11-05 东南大学 Intelligent rice and wheat harvester leading line tracking
CN112051853A (en) * 2020-09-18 2020-12-08 哈尔滨理工大学 Intelligent obstacle avoidance system and method based on machine vision
CN112770118A (en) * 2020-12-31 2021-05-07 展讯通信(天津)有限公司 Video frame image motion estimation method and related equipment

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
Dimitris G. Chachlakis et al. Minimum Mean-Squared-Error Autocorrelation Processing in Coprime Arrays. arXiv:2010.11073v1, 2020, 1-20. *
G. de Haan et al. True-motion estimation with 3-D recursive search block matching. IEEE Transactions on Circuits and Systems for Video Technology, 1993, 3(5): 369-379. *
K. Virk et al. Low complexity recursive search based motion estimation algorithm for video coding applications. 2005 13th European Signal Processing Conference, 2015, 1-4. *
Mostafa Mansour et al. Relative Importance of Binocular Disparity and Motion Parallax for Depth Estimation: A Computer Vision Approach. Remote Sensing, 2019, 11(17): 1990. *
N. Guo et al. A Computationally Efficient Path-Following Control Strategy of Autonomous Electric Vehicles With Yaw Motion Stabilization. IEEE Transactions on Transportation Electrification, 2020, 6(2): 728-739. *
Viny Saajan Victor et al. arXiv:2109.10123, 2021, 1-9. *
Zhou Zhongkui. Research on target detection and scene enhancement technology for intelligent vehicles based on machine learning. China Master's Theses Full-text Database (Engineering Science & Technology II), 2021, (02): C035-500. *
Zong Wenwen. Research and application of key technologies of feature point matching based on binocular stereo vision. China Master's Theses Full-text Database (Information Science & Technology), 2012, (03): I138-1928. *
Li Yinguo et al. Large-scale intelligent driving scene reconstruction based on binocular images. Computer Science, 2019, 46(S2): 251-254, 259. *
Wang Haoyu. Virtual scene reconstruction for intelligent driving. China Master's Theses Full-text Database (Engineering Science & Technology II), 2019, (12): C035-259. *

Also Published As

Publication number Publication date
CN113763451A (en) 2021-12-07

Similar Documents

Publication Publication Date Title
CN111462200B (en) Cross-video pedestrian positioning and tracking method, system and equipment
CN110569704B (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
Park et al. High-precision depth estimation using uncalibrated LiDAR and stereo fusion
EP2386998B1 (en) A Two-Stage Correlation Method for Correspondence Search
CN108537848B (en) Two-stage pose optimization estimation method for indoor scene reconstruction
US8913055B2 (en) Online environment mapping
EP2757524B1 (en) Depth sensing method and system for autonomous vehicles
CN109472820B (en) Monocular RGB-D camera real-time face reconstruction method and device
CN107843251B (en) Pose estimation method of mobile robot
CN111862234B (en) Binocular camera self-calibration method and system
CN105069804B (en) Threedimensional model scan rebuilding method based on smart mobile phone
Munoz-Banon et al. Targetless camera-lidar calibration in unstructured environments
EP3293700B1 (en) 3d reconstruction for vehicle
CN112083403B (en) Positioning tracking error correction method and system for virtual scene
CN111738033B (en) Vehicle driving information determination method and device based on plane segmentation and vehicle-mounted terminal
CN111862236B (en) Self-calibration method and system for fixed-focus binocular camera
Miksch et al. Automatic extrinsic camera self-calibration based on homography and epipolar geometry
CN115936029A (en) SLAM positioning method and device based on two-dimensional code
CN114812558A (en) Monocular vision unmanned aerial vehicle autonomous positioning method combined with laser ranging
CN112017259B (en) Indoor positioning and image building method based on depth camera and thermal imager
CN113763451B (en) Hierarchical search method for binocular vision depth measurement of intelligent vehicle
Le Besnerais et al. Dense height map estimation from oblique aerial image sequences
CN117011660A (en) Dot line feature SLAM method for fusing depth information in low-texture scene
Wirges et al. Self-supervised flow estimation using geometric regularization with applications to camera image and grid map sequences
CN117237431A (en) Training method and device of depth estimation model, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant