CN113763451B - Hierarchical search method for binocular vision depth measurement of intelligent vehicle - Google Patents

Hierarchical search method for binocular vision depth measurement of intelligent vehicle

Info

Publication number
CN113763451B
CN113763451B (application CN202111117235.5A)
Authority
CN
China
Prior art keywords
region
template
point
image
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111117235.5A
Other languages
Chinese (zh)
Other versions
CN113763451A (en)
Inventor
白羚
李银国
周中奎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202111117235.5A priority Critical patent/CN113763451B/en
Publication of CN113763451A publication Critical patent/CN113763451A/en
Application granted granted Critical
Publication of CN113763451B publication Critical patent/CN113763451B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a hierarchical search method for binocular vision depth measurement of an intelligent vehicle. A vehicle-mounted embedded binocular camera in an intelligent-vehicle driver-assistance system is calibrated without requiring a specific mounting condition or the epipolar rectification of an ideal camera model; a region template and correlation coefficient functions are constructed; region-level stereo matching at the convolution-filtered region template positions is performed on time-aligned left-right stereo image pairs; a positioning-point projection relation model of the region template is constructed; and the region-level stereo matching is optimized by hierarchical search, so that the depth measurement and the three-dimensional space coordinates can be calculated. The method achieves salient region-level stereo matching of global view features, needs no complex pixel-level geometric calculation or nonlinear optimization, has low redundancy, and effectively alleviates the problems of weak image texture features, low reliability of conventional region matching, and limited image quality.

Description

Hierarchical search method for binocular vision depth measurement of intelligent vehicle
Technical Field
The invention relates to the technical field of image processing, and in particular to a hierarchical search method for binocular vision depth measurement of an intelligent vehicle.
Background
With the development and application of intelligent-vehicle environment perception and advanced driver-assistance systems, autonomous, fast and low-cost environment perception and target ranging are basic requirements for automating driver assistance. Binocular vision based on depth cameras, the most fundamental of these technologies, is widely applied in fields such as autonomous logistics vehicles, robot obstacle avoidance and intelligent-vehicle driver assistance, spanning smart-home and transportation scenarios. The technology combines computer vision, solid geometry, image processing and related disciplines. With the rapid development of mobile communication networks, intelligent-vehicle environment perception has produced applications closely tied to environment depth measurement, such as reversing cameras, localization and mapping, three-dimensional reconstruction of environment scenes, and path planning.
Computer binocular stereo-vision depth estimation based on a binocular camera is the current low-cost visual scheme for driver assistance and the mainstream direction of depth measurement for robot localization and mapping. Existing methods estimate depth through binocular disparity analysis and through extraction and matching of binocular image feature points followed by triangulation and depth recovery; they imitate human stereo vision and achieve relatively high depth accuracy through stereo-geometric computation. However, binocular depth estimation based on disparity analysis and traditional feature-point extraction performs three-dimensional depth measurement with pixel-level image feature points, is limited to the ideal binocular camera model, i.e., an imaging mode with the epipolar constraint, has weak feature saliency in weakly textured regions, and has low reliability of conventional gray-level feature matching, so the overall method suffers from high redundancy and strong dependence on image quality.
Disclosure of Invention
The invention aims to provide a hierarchical search method for binocular vision depth measurement of an intelligent vehicle, so as to solve the technical problems of the prior art in which binocular depth estimation based on disparity analysis and traditional feature-point extraction performs three-dimensional depth measurement with pixel-level image feature points: the imaging mode requires the epipolar constraint, feature saliency in weakly textured regions is poor, the reliability of conventional gray-level feature matching is low, the redundancy of the whole method is high, and the dependence on image quality is strong.
In order to achieve the above purpose, the hierarchical search method for binocular vision depth measurement of the intelligent vehicle adopted by the invention comprises the following steps:
initializing condition setting and calibration of a binocular camera model;
constructing a regional template for acquiring image data by a binocular camera and initializing a correlation coefficient;
constructing positioning point projection relation models of corresponding region templates in the left image and the right image;
performing stereo matching and hierarchical search on the binocular region templates to obtain the depth measurement of the best region-level stereo match;
and calculating three-dimensional space coordinates based on the depth measurement value of the optimal region stereo matching.
The method for initializing the condition setting and calibration of the binocular camera model comprises the following steps of:
setting two installation conditions of a binocular camera;
and calibrating the internal parameters and the external parameters of the binocular camera to obtain a projection matrix from the two-dimensional image to the three-dimensional world.
Wherein, the step of setting two mounting conditions of the binocular camera comprises:
the installed binocular left and right cameras are the same in model, namely the same in internal reference, and are transversely installed in parallel;
the installed binocular left and right cameras are of different models, i.e., have different intrinsic parameters, are not mounted in parallel, and require no stereo rectification.
The method comprises the steps of calibrating internal parameters and external parameters of a binocular camera to obtain a projection matrix from a two-dimensional image to a three-dimensional world:
and respectively calibrating and calculating projection matrixes of the left camera model and the right camera model relative to a world coordinate system by using a monocular camera calibration method or a binocular camera calibration method.
The method for constructing the regional template of the binocular camera for collecting image data and initializing the correlation coefficient comprises the following steps of:
defining and initializing and constructing an image plane area template;
and calculating and initializing correlation coefficients of the left and right region templates.
Wherein, in the steps of definition and initialization construction of the image plane area template:
defining a block area on the image data collected by the left camera and the right camera as an area template, wherein the block area comprises a positioning point, a core area and an edge area of the core area, and the positioning point is a pixel point at the left upper corner position of the core area of the area template, namely an initial point of the core area;
an area template with pixels of 0 is initialized and constructed.
The method for constructing the positioning point projection relation model of the corresponding region template in the left image and the right image comprises the following steps:
decomposing the projection matrix calculated in the left and right camera models;
calculating a joint constant matrix coefficient decomposed in the projection matrix;
initializing a positioning point in the left image, and calculating the locus of the corresponding positioning point in the right image.
The steps of stereo matching and hierarchical searching of the binocular region template comprise the following steps:
a left and right paired stereo image of the left and right cameras at the same moment is acquired, and an initial positioning point of the regional template is initialized;
performing hierarchical search and stereo matching on the template positions of the corresponding areas of the left and right images, and calculating correlation coefficients;
and judging the right graph region template with the largest correlation as an optimal stereo matching result, and calculating a depth estimation measured value.
The method for initializing the initial positioning point of the regional template comprises the following steps of:
the binocular camera collects video sequences simultaneously, and extracts left and right image data of the same time interval frame after calibration alignment to form left and right paired stereoscopic images;
the locating point of the region template of the initialized left image is calculated from the pixel point at the upper left corner of the left image according to the sliding sequence from left to right and from top to bottom.
The step of performing hierarchical search and stereo matching of the corresponding region template positions in the left and right images and calculating the correlation coefficients includes the following steps:
calculating the initialized region template of the left camera image plane at the current positioning point, and setting the calculation step size, i.e., the sliding step and its condition;
performing convolution filtering on the region at the current left-image template position, and substituting the prior estimate of the positioning-point depth together with the template positioning point into the projection relation model;
calculating the estimated matching region template position, i.e., the positioning-point position, on the right camera image plane;
taking the three-dimensional world point corresponding to the depth prior estimated at the current left-image template positioning point as the first starting point of the calculation, and selecting a fixed number of points as the basic search point set;
obtaining the positioning-point set of the right-image region templates corresponding to the candidate stereo matches according to the projection relation model;
calculating the region correlation coefficient between the current left-image region template and each estimated right-image template position to obtain a correlation coefficient set;
and taking the right-image region template with the largest correlation as the best stereo-matching result and calculating the depth estimation measurement.
The beneficial effects of the invention are as follows: the vehicle-mounted embedded binocular camera of an intelligent-vehicle driver-assistance system is calibrated without requiring a specific mounting condition or the epipolar rectification of an ideal camera model; a region template and correlation coefficient functions are constructed; region-level stereo matching at the convolution-filtered region template positions is performed on time-aligned left-right stereo image pairs; a positioning-point projection relation model of the region template is constructed; and the region-level stereo matching is optimized by hierarchical search, so that the depth measurement and the three-dimensional space coordinates can be calculated. Salient region-level stereo matching of global view features is achieved without the complex pixel-level geometric calculation and nonlinear optimization, the redundancy is low, the problems of weak image texture features, low reliability of conventional region matching and limited image quality are effectively alleviated, and a basis is provided for driver-assistance functions such as three-dimensional reconstruction of the intelligent-vehicle environment scene, reversing cameras, localization and mapping, and path planning.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the steps of a hierarchical search method for binocular vision depth measurement of a smart car of the present invention.
Fig. 2 is a schematic diagram of the composition of the left and right region templates of the present invention.
Fig. 3 is a schematic diagram of the right-image positioning-point trajectory line based on the projection relation of region stereo matching according to the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
Referring to fig. 1 to 3, the present invention provides a hierarchical search method for binocular vision depth measurement of an intelligent vehicle, which includes the following steps:
s1, initializing condition setting and calibration of a binocular camera model;
s2, constructing a regional template for acquiring image data by a binocular camera and initializing a correlation coefficient;
s3, constructing a positioning point projection relation model of a corresponding region template in the left and right images;
s4, three-dimensional matching and hierarchical searching are carried out on the binocular region template to obtain a depth measurement value of the optimal region three-dimensional matching;
and S5, calculating three-dimensional space coordinates based on depth measurement values of the optimal region stereo matching.
The method for initializing the condition setting and calibration of the binocular camera model comprises the following steps of:
setting two installation conditions of a binocular camera;
and calibrating the internal parameters and the external parameters of the binocular camera to obtain a projection matrix from the two-dimensional image to the three-dimensional world.
Further, the step of setting two mounting conditions of the binocular camera includes:
the installed binocular left and right cameras are the same in model, namely the same in internal reference, and are transversely installed in parallel;
the installed binocular left and right cameras are of different models, i.e., have different intrinsic parameters, are not mounted in parallel, and require no stereo rectification.
Specifically, when the installed left and right cameras are of the same model, i.e., have identical intrinsic parameters, and are mounted in parallel along the x axis of the world coordinate system (i.e., laterally) with baseline B, the correspondence between pixel points on the left and right image planes is
u^(1) − u^(2) = f_x · B / d^(1),  v^(1) = v^(2)
where (u^(1), v^(1)) is the position, in image-plane coordinates, of a pixel in the image data acquired by the left camera, (u^(2), v^(2)) is the position of the corresponding pixel acquired by the right camera, d^(1) is the depth of the three-dimensional world point corresponding to (u^(1), v^(1)) with respect to the left camera, and f_x = f / S_x is the x-direction scale factor of the camera intrinsic matrix, with f the focal length and S_x the physical size of an image pixel in the x direction;
the intrinsic matrix is
K = [ f_x 0 c_x ; 0 f_y c_y ; 0 0 1 ]
where (c_x, c_y) is the principal point and f_y = f / S_y is the y-direction scale factor of the camera intrinsic matrix, with S_y the physical size of an image pixel in the y direction;
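As an illustrative sketch of this parallel-baseline relation (Python; the function name and numeric values are hypothetical, and f_x and B must be expressed in consistent units):

```python
# Sketch: depth from disparity for the parallel, same-intrinsics case.
# fx = f / Sx is the x-direction scale factor, B the baseline (same length unit as the depth).
def depth_from_disparity(u_left: float, u_right: float, fx: float, B: float) -> float:
    """d = fx * B / (u_left - u_right); the v coordinates coincide for parallel mounting."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: point not in front of both cameras")
    return fx * B / disparity

# Example: fx = 800 px, B = 0.12 m, disparity of 16 px -> depth of 6.0 m
print(depth_from_disparity(640.0, 624.0, fx=800.0, B=0.12))
```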
the binocular left and right cameras are installed in different models, namely, internal parameters are different, and the binocular left and right cameras are installed without strict parallelism, namely, without stereo correction.
Further, in the step of calibrating the internal parameters and external parameters of the binocular camera to obtain the projection matrix from the two-dimensional image to the three-dimensional world:
and respectively calibrating and calculating projection matrixes of the left camera model and the right camera model relative to a world coordinate system by using a monocular camera calibration method or a binocular camera calibration method.
Specifically, the projection matrices of the left and right camera models with respect to the world coordinate system are calibrated and computed separately using a monocular camera calibration method or a binocular camera calibration method. In the resulting projection relations, d^(2) is the depth of the three-dimensional world point corresponding to (u^(2), v^(2)) with respect to the right camera, M^(1) is the projection matrix relating the left camera to the three-dimensional world coordinate system, and M^(2) is the projection matrix relating the right camera to the three-dimensional world coordinate system. Note that each projection matrix is composed of the extrinsic parameters of the corresponding camera model together with its intrinsic part, and that the intrinsic part is the inverse of the so-called intrinsic matrix in the calibration result.
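For illustration, the following sketch shows how such per-camera calibration results could be obtained and assembled with OpenCV; it builds the conventional projection matrix K·[R|t], whereas the patent states that its M^(1) and M^(2) use the inverse of the calibrated intrinsic matrix, so this is only an assumed standard construction, not the exact composition used in the method:

```python
import numpy as np
import cv2

def calibrate_and_build_projection(obj_points, img_points, image_size):
    """Monocular calibration sketch: returns the intrinsic matrix K, the distortion
    coefficients, and a conventional projection matrix P = K [R | t] for the first
    calibration view (illustrative; the patent's M matrices are composed differently)."""
    ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, image_size, None, None)
    R, _ = cv2.Rodrigues(rvecs[0])          # rotation of the first view
    Rt = np.hstack([R, tvecs[0]])           # 3x4 extrinsic block [R | t]
    P = K @ Rt                              # 3x4 projection matrix
    return K, dist, P

# obj_points / img_points are lists of corresponding 3D board points and 2D detections,
# e.g. collected with cv2.findChessboardCorners for each calibration image.
```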
The method for constructing the regional template of the binocular camera for collecting image data and initializing the correlation coefficient comprises the following steps of:
defining and initializing and constructing an image plane area template;
and calculating and initializing correlation coefficients of the left and right region templates.
Further, in the steps of defining and initializing the image plane region template:
defining a block area on the image data collected by the left camera and the right camera as an area template, wherein the block area comprises a positioning point, a core area and an edge area of the core area, and the positioning point is a pixel point at the left upper corner position of the core area of the area template, namely an initial point of the core area;
an area template with pixels of 0 is initialized and constructed.
Specifically, a block area on the image data collected by the left and right cameras is defined as an area template, which comprises a positioning point, an n×n core area and an extension boundary with the width of m, namely an edge area of the core area, wherein the positioning point is a pixel point at the upper left corner of the core area of the area template, namely an initial point of the core area, and can also be called a mark point;
an area template with a pixel of 0 and a width of n+2×m is initialized and constructed.
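A minimal sketch of this region template structure (Python; function and variable names are illustrative, not from the patent), with the positioning point at the top-left of the n×n core, an edge region of width m, and an overall width of n + 2m initialized to 0:

```python
import numpy as np

def make_region_template(image: np.ndarray, anchor_u: int, anchor_v: int,
                         n: int, m: int) -> np.ndarray:
    """Cut the (n + 2m) x (n + 2m) region template whose n x n core has its
    top-left pixel (the positioning / marking point) at column anchor_u, row anchor_v.
    Positions falling outside the image stay 0, matching the all-zero initialization."""
    size = n + 2 * m
    template = np.zeros((size, size), dtype=image.dtype)   # initialize with pixel value 0
    top, left = anchor_v - m, anchor_u - m                  # edge region surrounds the core
    for r in range(size):
        for c in range(size):
            rr, cc = top + r, left + c
            if 0 <= rr < image.shape[0] and 0 <= cc < image.shape[1]:
                template[r, c] = image[rr, cc]
    return template
```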
Further, in the step of calculating and initializing the correlation coefficients of the left and right region templates:
initializing positioning points, color parameters, core area width and expansion boundaries of a left area template and a right area template with equal widths contained in left and right image data acquired by a left camera and a right camera at the same moment;
initializing the minimum average error correlation coefficient of the left and right region templates;
initializing normalized average error correlation coefficients of the left and right region templates;
initializing the compressed cosine distance correlation coefficients of the left and right region templates.
Specifically, the positioning point of the left region template contained in the left image data acquired at a given moment is P(u, v) with corresponding color parameter w(u, v), the positioning point of the right region template contained in the right image data acquired at the same moment is Q(x, y) with corresponding color parameter f(x, y), the core regions of both templates have width n, and both extension boundaries have width m, so the two templates have equal overall width N = n + 2m;
initializing the minimum-average-error correlation coefficient of the left and right region templates:
M_ad = (1 / N²) · Σ_{i=0}^{N−1} Σ_{j=0}^{N−1} | w(u + i, v + j) − f(x + i, y + j) |
where i and j are the current row and column pixel indices of the region template in the loop, i.e., they run from 0 to N − 1;
M_ad is the average absolute difference over all corresponding pixel pairs of the left and right region templates and ranges from 0 to 255: the smaller M_ad, the greater the correlation of the corresponding template positions in the left and right images, and the larger M_ad, the smaller the correlation. A relative local minimum M_ad(min) is computed; in particular, when the images under the left and right region templates coincide exactly, M_ad = 0.
Initializing the normalized-average-error correlation coefficient C of the left and right region templates: in its expression, w̄ and f̄ are the mean values of all pixels at the left and right template image positions, and a threshold is set on C; the larger the value of C, the greater the correlation of the corresponding template positions in the left and right images, and the smaller the value, the smaller the correlation.
Initializing the compressed cosine-distance correlation coefficient of the left and right region templates:
take the subvectors w_u and w_v obtained by compressed projection of the left region template, acquired by the left camera, in the u direction and the v direction respectively; similarly, take the subvectors f_x and f_y obtained by compressed projection of the right region template, acquired by the right camera, in the x direction and the y direction respectively. The cosine-distance correlation coefficient of the compressed projections is
C = ⟨(w_u, w_v)ᵀ, (f_x, f_y)ᵀ⟩ / ( ‖(w_u, w_v)ᵀ‖ · ‖(f_x, f_y)ᵀ‖ )
i.e., the cosine of the angle between the two 2N-dimensional vectors (w_u, w_v)ᵀ and (f_x, f_y)ᵀ; it ranges from 0 to 1, and the larger the value, the greater the correlation of the corresponding template positions in the left and right images, and the smaller the value, the smaller the correlation.
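The following sketch illustrates two of these measures, the minimum-average-error coefficient and the compressed cosine-distance coefficient, for grayscale templates of equal size; the choice of row and column sums for the u- and v-direction compressed projections is an assumption, since the patent only names the subvectors:

```python
import numpy as np

def mad_coefficient(left_tpl: np.ndarray, right_tpl: np.ndarray) -> float:
    """Average absolute difference of corresponding pixels; 0 means the two
    template regions coincide, larger values mean lower correlation."""
    return float(np.mean(np.abs(left_tpl.astype(np.float64) -
                                right_tpl.astype(np.float64))))

def compressed_cosine_coefficient(left_tpl: np.ndarray, right_tpl: np.ndarray) -> float:
    """Project each template onto its column direction (sum over rows) and row
    direction (sum over columns), concatenate into 2N-dimensional vectors and
    return the cosine of the angle between them; values near 1 mean high correlation."""
    w = np.concatenate([left_tpl.sum(axis=0), left_tpl.sum(axis=1)]).astype(np.float64)
    f = np.concatenate([right_tpl.sum(axis=0), right_tpl.sum(axis=1)]).astype(np.float64)
    denom = np.linalg.norm(w) * np.linalg.norm(f)
    return float(w @ f / denom) if denom > 0 else 0.0
```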
Further, the step of constructing the positioning point projection relation model of the corresponding region template in the left and right images comprises the following steps:
decomposing the projection matrix calculated in the left and right camera models;
calculating a joint constant matrix coefficient decomposed in the projection matrix;
initializing a positioning point in the left image, and calculating the locus of the corresponding positioning point in the right image.
Specifically, the projection matrices M^(1) and M^(2) computed for the left and right camera models in S1 are decomposed into block form;
the joint constant matrix coefficients of the decomposed projection matrices are then calculated;
a positioning point P(u, v) is initialized in the left image, and the corresponding positioning point Q(x, y) in the right image is calculated.
The above states that, given the region template positioning point P(u, v) on the left image plane and a prior depth estimate d^(1) of the corresponding three-dimensional world point (X, Y, Z) with respect to the left camera, the corresponding region template positioning point Q(x, y) on the right image plane can be estimated; the mapping from P(u, v) to Q(x, y) is a perspective-transformation relation of two-dimensional solid geometry.
When the depth corresponding to the fixed positioning point P(u, v) of the current left image-plane region template is an unknown variable of unknown scale, the locus of the positioning point Q(x, y) of the corresponding stereo-matching region template on the right image plane is a straight line.
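One standard way to realize this positioning-point projection relation is to back-project P(u, v) with the prior depth through the left camera model and re-project the resulting world point into the right camera. The sketch below assumes conventional intrinsics K1, K2 and extrinsics (R1, t1), (R2, t2) in a common world frame, which is an assumption about how the calibrated quantities would be used rather than the patent's exact decomposition of M^(1) and M^(2); sweeping the depth prior traces the straight-line locus of Q(x, y) described above:

```python
import numpy as np

def right_anchor_from_left(P_uv, d, K1, R1, t1, K2, R2, t2):
    """Given the left-image positioning point P_uv = (u, v) and a prior depth d
    (z in the left camera frame), return the corresponding right-image point Q = (x, y).
    t1 and t2 are length-3 translation vectors; X_cam = R X_world + t is assumed."""
    u, v = P_uv
    ray = np.linalg.inv(K1) @ np.array([u, v, 1.0])      # normalized ray in the left camera
    X_cam_left = d * ray                                  # 3D point in the left camera frame
    X_world = R1.T @ (X_cam_left - t1)                    # back to the world frame
    X_cam_right = R2 @ X_world + t2                       # into the right camera frame
    q = K2 @ X_cam_right
    return q[0] / q[2], q[1] / q[2]

# Sweeping d over a range of prior values traces a straight line of Q positions
# in the right image, which is the search locus used by the hierarchical search.
```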
Further, the step of performing stereo matching and hierarchical search on the binocular region templates to obtain the depth measurement of the best region-level stereo match includes the following steps:
a left and right paired stereo image of the left and right cameras at the same moment is acquired, and an initial positioning point of the regional template is initialized;
and carrying out hierarchical search and stereo matching on the template positions of the corresponding areas of the left and right images, and calculating the correlation coefficient.
And judging the right graph region template with the largest correlation as an optimal stereo matching result, and calculating a depth estimation measured value.
Further, the step of collecting left and right paired stereo images of the same moment of the left and right cameras and initializing an initial positioning point of the region template comprises the following steps:
the binocular camera collects video sequences simultaneously, and extracts left and right image data of the same time interval frame after calibration alignment to form left and right paired stereoscopic images;
the locating point of the region template of the initialized left image is calculated from the pixel point at the upper left corner of the left image according to the sliding sequence from left to right and from top to bottom.
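A small sketch of this left-image positioning-point sweep (Python; names are illustrative), assuming the sliding step equals the core width n as in the later description:

```python
def left_anchor_sequence(image_width: int, image_height: int, n: int):
    """Yield left-image positioning points P(u, v) from the top-left pixel,
    left to right and top to bottom, stepping by the core width n and
    keeping the whole n x n core inside the image."""
    for v in range(0, image_height - n + 1, n):       # rows: top to bottom
        for u in range(0, image_width - n + 1, n):    # columns: left to right
            yield u, v
```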
Further, the step of performing hierarchical search and stereo matching of the corresponding region template positions in the left and right images and calculating the correlation coefficients includes the following steps:
calculating the initialized region template of the left camera image plane at the current positioning point, and setting the calculation step size, i.e., the sliding step and its condition;
performing convolution filtering on the region at the current left-image template position, and substituting the prior estimate of the positioning-point depth together with the template positioning point into the projection relation model;
calculating the estimated matching region template position, i.e., the positioning-point position, on the right camera image plane;
taking the three-dimensional world point corresponding to the depth prior estimated at the current left-image template positioning point as the first starting point of the calculation, and selecting a fixed number of points as the basic search point set;
obtaining the positioning-point set of the right-image region templates corresponding to the candidate stereo matches according to the projection relation model;
and calculating the region correlation coefficient between the current left-image region template and each estimated right-image template position to obtain a correlation coefficient set.
Further, the step of taking the right-image region template with the largest correlation as the best stereo-matching result and calculating the depth estimation measurement includes:
if the correlation coefficient set meets the preset condition, determining that the left and right image regions covered by the region templates at the left positioning point and the corresponding right positioning point realize left-right region stereo matching;
if the preset condition is not met, expanding the search range, adding the positioning points corresponding to the additional predicted depth estimates, and updating the point set;
then calculating a new correlation coefficient set from the new point set and jumping back to the step of taking the three-dimensional world point corresponding to the depth prior estimated at the current left-image template positioning point as the first starting point and selecting a fixed number of points as the basic search point set; otherwise, proceeding to the next step;
if the preset condition is still not met after the search range has been expanded, performing minimum-value suppression of the correlation coefficient; if the minimum reliability is not reached, the left-right region stereo matching is judged to have failed;
if the correlation coefficient set has a maximum but is not unimodal, taking the core region of the region template as a seed region and growing the edge region of the extension boundary stepwise, i.e., updating the correlation function of the region template;
if the extension boundary of the template core region reaches its maximum and the preset condition is still not met, the left-right region stereo matching is judged to be invalid;
if the left and right cameras are of the same model and mounted in parallel, the hierarchical search calculation can be simplified, and the pixel correspondence between the left and right image planes and the projection matrix are simplified and updated accordingly;
if a best stereo match exists, the measured depth value is output; if not, the region template position is updated according to the sliding step, and region stereo matching and depth estimation are performed at the next left-image template position, i.e., the method jumps back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients;
if the computed positioning points of the corresponding right-image template position are out of range, i.e., not within the right image-plane coordinates, the region template is updated and decomposed in a stepwise pyramid fashion; when the decomposition reaches order 0 from order s, i.e., the template degenerates to a pixel point, the method jumps back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients;
if the sliding step has not reached the right and lower boundaries of the left image, the region template position is updated according to the sliding step and the method jumps back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients; otherwise, all best stereo matches and measured depth values are output and the calculation stops.
Specifically, the initialized region template Mask(P(u, v), n, m) of the left camera image plane is computed at the current positioning point (i, j); the calculation step size is n and the sliding condition is i + n < image width;
convolution filtering is performed on the region at the current left-image template position, Mask(P(u, v), n, m)_{i,j}, to obtain the prior estimate d_0 of the positioning-point depth, and d_0 together with the template positioning point P(u, v) is substituted into the projection relation model of step S3;
the estimated matching region template Mask'(Q(x_0, y_0), n, m) on the right camera image plane, i.e., its positioning-point position, is computed;
d_0 is the initial depth, obtained as a prior estimate of the depth of the target region;
the three-dimensional world point corresponding to the depth prior d_0 estimated at the current left-image template positioning point is taken as the first starting point of the calculation, and 2k + 1 points are selected as the basic search point set:
d_i = d_0 ± i · Δd, i = 1, …, k
where Δd is the search step; in this embodiment k = 2;
according to the projection relation model of step S3, the positioning-point set of the right-image region templates corresponding to the 2k + 1 candidate stereo matches, {Q(x_i, y_i)}, i = 0, ±1, …, ±k, is obtained;
the correlation coefficient C(P(u, v), Q(x_i, y_i), n, m) between the current left-image template region Mask(P(u, v), n, m)_{i,j} and each estimated right-image template region Mask'(Q(x_i, y_i), n, m) is computed, yielding the correlation coefficient set C(x_i, y_i) = C_i, i = 0, ±1, …, ±k;
If the preset condition is satisfied, i.e., the distribution of C_{−k}, C_{−k+1}, …, C_0, C_1, …, C_k is unimodal with its maximum at the point (x*, y*) and |C(x*, y*)| ≥ δ, where δ is an acceptable confidence threshold, then the positioning point P(u, v) and the positioning point Q(x*, y*) are judged to give left-right region stereo matching of the image regions covered by the corresponding region templates;
accordingly, all pixel points in the regions Mask(P(u, v), n, m)_{i,j} and Mask'(Q(x*, y*), n, m) are matched element-wise, and their depth values take the region-level depth d_{i*} = d*, where C* = C(x*, y*) is the maximum of the correlation coefficient set {C_i};
if the preset condition is not satisfied, the search range is expanded by adding the predicted depth estimates d_{k+1}, …, d_K and the point set is updated, where K is the index of the maximum search range, chosen such that d_K does not exceed the maximum depth value;
a new correlation coefficient set C_{−k}, …, C_0, …, C_K is then calculated from the new point set and the method jumps back to the step of taking the three-dimensional world point corresponding to the depth prior d_0 estimated at the current left-image template positioning point as the first starting point and selecting 2k + 1 points as the basic search point set; otherwise, the next step is performed;
if the preset condition is still not satisfied after the search range has been expanded, minimum-value suppression is applied to C*; if |C*| < δ, the minimum reliability is not reached and the left-right region stereo matching is judged to have failed;
if the correlation coefficient set {C_i} has a maximum with |C*| ≥ δ but is not unimodal, for example there are several peaks or the distribution is flat-topped, the best stereo match is not distinctive enough, which indicates that the template position lies in a region with weak image texture detail, for example a uniform region such as sky; in that case the core region of the template is taken as a seed region and the edge region of the extension boundary is grown stepwise by a width increment add, i.e., the template correlation function is updated to C(P(u, v), Q(x*, y*), n, (m + add))_{i,j}; for example, add may be 2, giving a template size of n + 2(m + add);
if m + add reaches its maximum m_max and the preset condition is still not met, the left-right region stereo matching is judged to be invalid at this point;
if the left and right cameras are of the same model and mounted in parallel, the calculation can be simplified: under the camera model calibrated in S1, the pixel correspondence between the left and right image planes reduces to the parallel-baseline relation given above, and the projection matrix of the projection model in S3 can be updated and simplified accordingly;
if a best stereo match exists, the measured depth value is output; if not, the region template position is updated according to the sliding step n, and region stereo matching and depth estimation are performed at the next left-image template position, i.e., the method jumps back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients;
if the computed positioning point of the corresponding right-image template position is out of range, i.e., not within the right image-plane coordinates, the region template is updated and decomposed in a stepwise pyramid fashion according to its size; when the decomposition reaches order 0 from order s, i.e., the template degenerates to a pixel point, the method jumps back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients;
if i, j have not reached the right and lower boundaries of the left image under the sliding step, the region template is updated, its position is advanced by the sliding step n, and the method jumps back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients; otherwise, all best stereo matches and measured depth values are output and the calculation stops.
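A condensed sketch of the hierarchical depth search around a single left-image positioning point, following the candidate-generation and decision steps above; the correlation evaluation is abstracted into a callback, the unimodality test and thresholds are simplified stand-ins for the patent's preset conditions, the search range is expanded symmetrically for brevity, and the template-growth fallback is omitted:

```python
def hierarchical_depth_search(corr_at_depth, d0, delta_d, k=2, k_max=8,
                              delta=0.8, d_max=80.0):
    """corr_at_depth(d) -> (correlation C, right-image positioning point Q) for one
    candidate depth. Returns (depth, Q, C) of the best match, or None when matching
    fails. d0 is the prior depth estimate (assumed positive), delta_d the search step."""
    while True:
        depths = [d0 + i * delta_d for i in range(-k, k + 1)]
        depths = [d for d in depths if 0.0 < d <= d_max]
        scored = []
        for d in depths:
            c, q = corr_at_depth(d)
            scored.append((c, d, q))
        best_c, best_d, best_q = max(scored, key=lambda s: s[0])
        near_peak = sum(1 for c, _, _ in scored if c >= 0.98 * best_c)
        if best_c >= delta and near_peak == 1:       # roughly unimodal and confident enough
            return best_d, best_q, best_c
        if k >= k_max:                               # search range exhausted
            return None                              # minimum reliability not reached
        k += 1                                       # expand the depth search range
```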
Further, the step of calculating three-dimensional space coordinates based on depth measurement of the best region stereo matching includes:
and calculating three-dimensional space coordinates according to the optimal stereo matching result and the depth measurement.
Specifically, the three-dimensional space coordinates are calculated from the best stereo-matching result and the depth measurement output in step S4,
where (u, v) ranges over the pixel coordinates of the core region of the region template, and the three-dimensional coordinates (X, Y, Z) are obtained from (u, v), the measured depth and the left-camera projection matrix M^(1).
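A sketch of this back-projection of a core-region pixel and its region-level depth to world coordinates, under the same assumed camera convention as the earlier projection sketch (the exact expression in the patent uses its projection matrix M^(1) and is not reproduced here):

```python
import numpy as np

def pixel_to_world(u, v, d, K1, R1, t1):
    """Back-project left-image pixel (u, v) with measured depth d (z in the left
    camera frame) to world coordinates (X, Y, Z); t1 is a length-3 vector and
    the convention X_cam = R1 X_world + t1 is assumed."""
    X_cam = d * (np.linalg.inv(K1) @ np.array([u, v, 1.0]))
    return R1.T @ (X_cam - t1)           # invert X_cam = R1 X_world + t1
```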
According to the invention, the vehicle-mounted embedded binocular camera of an intelligent-vehicle driver-assistance system is calibrated without requiring a specific mounting condition or the epipolar rectification of an ideal camera model; a region template and correlation coefficient functions are constructed; region-level stereo matching at the convolution-filtered region template positions is performed on time-aligned left-right stereo image pairs; a positioning-point projection relation model of the region template is constructed; and the stereo matching is optimized by hierarchical search, so that the depth measurement and the three-dimensional space coordinates can be calculated. Salient region-level stereo matching of global view features is achieved without the complex pixel-level geometric calculation and nonlinear optimization, the redundancy is low, the problems of weak image texture features, low reliability of conventional region matching and limited image quality are effectively alleviated, and a basis is provided for driver-assistance functions such as three-dimensional reconstruction of the intelligent-vehicle environment scene, reversing cameras, localization and mapping, and path planning.
The above disclosure is only a preferred embodiment of the present invention and certainly cannot be taken to limit the scope of the invention. Those of ordinary skill in the art will understand that implementations realizing all or part of the procedures described above, as well as equivalent variations made according to the claims, still fall within the scope covered by the present invention.

Claims (5)

1. The hierarchical search method for binocular vision depth measurement of the intelligent vehicle is characterized by comprising the following steps of:
initializing condition setting and calibration of a binocular camera model;
the steps of constructing a regional template for the binocular camera to collect image data and initializing correlation coefficients include:
defining and initializing and constructing an image plane area template;
calculation and initialization of correlation coefficients of the left and right region templates,
the defining and initializing construction steps of the image plane area template include: defining a block area on the image data collected by the left camera and the right camera as an area template, wherein the block area comprises a positioning point, an n multiplied by n core area, an extension boundary with the width of m, namely an edge area of the core area, and the positioning point is a pixel point at the upper left corner position of the core area of the area template, namely an initial point of the core area; initializing and constructing a region template with a pixel of 0 and a width of n+2×m;
in the step of calculating and initializing the correlation coefficients of the left and right region templates: initializing positioning points, color parameters, core area width and expansion boundaries of a left area template and a right area template with equal widths contained in left and right image data acquired by a left camera and a right camera at the same moment; initializing the minimum average error correlation coefficient of the left and right region templates; initializing normalized average error correlation coefficients of the left and right region templates; initializing a compressed cosine distance correlation coefficient of the left and right region templates;
constructing positioning point projection relation models of corresponding region templates in the left image and the right image;
the step of obtaining the depth measurement value of the optimal region stereo matching comprises the following steps of:
a left and right paired stereo image of the left and right cameras at the same moment is acquired, and an initial positioning point of the regional template is initialized;
performing hierarchical search and stereo matching on the template positions of the corresponding areas of the left and right images, and calculating correlation coefficients;
judging the right graph region template with the largest correlation as the optimal stereo matching result, calculating a depth estimation measured value,
the step of collecting left and right paired stereo images of the left and right cameras at the same moment and initializing an initial positioning point of the regional template comprises the following steps: the binocular camera collects video sequences simultaneously, and extracts left and right image data of the same time interval frame after calibration alignment to form left and right paired stereoscopic images; the anchor points of the region template of the initialized left image are calculated from the left upper corner pixel point of the left image according to the sliding sequence from left to right and from top to bottom,
the step of performing hierarchical search and stereo matching of the corresponding region template positions in the left and right images and calculating the correlation coefficients includes: calculating the initialized region template of the left camera image plane at the current positioning point, and setting the calculation step size, i.e., the sliding step and its condition; performing convolution filtering on the region at the current left-image template position, and substituting the prior estimate of the positioning-point depth together with the template positioning point into the projection relation model; calculating the estimated matching region template position, i.e., the positioning-point position, on the right camera image plane; taking the three-dimensional world point corresponding to the depth prior estimated at the current left-image template positioning point as the first starting point of the calculation, and selecting a fixed number of points as the basic search point set; obtaining the positioning-point set of the right-image region templates corresponding to the candidate stereo matches according to the projection relation model; calculating the region correlation coefficient between the current left-image region template and each estimated right-image template position to obtain a correlation coefficient set,
the step of taking the right-image region template with the largest correlation as the best stereo-matching result and calculating the depth estimation measurement includes: if the correlation coefficient set meets the preset condition, determining that the left and right image regions covered by the region templates at the left positioning point and the corresponding right positioning point realize left-right region stereo matching; if the preset condition is not met, expanding the search range, adding the positioning points corresponding to the additional predicted depth estimates, and updating the point set; then calculating a new correlation coefficient set from the new point set and jumping back to the step of taking the three-dimensional world point corresponding to the depth prior estimated at the current left-image template positioning point as the first starting point and selecting a fixed number of points as the basic search point set, otherwise proceeding to the next step; if the preset condition is still not met after the search range has been expanded, performing minimum-value suppression of the correlation coefficient, and if the minimum reliability is not reached, judging that the left-right region stereo matching has failed; if the correlation coefficient set has a maximum but is not unimodal, taking the core region of the region template as a seed region and growing the edge region of the extension boundary stepwise, i.e., updating the correlation function of the region template; if the extension boundary of the template core region reaches its maximum and the preset condition is still not met, judging that the left-right region stereo matching is invalid; if the left and right cameras are of the same model and mounted in parallel, simplifying the hierarchical search calculation and simplifying and updating the pixel correspondence between the left and right image planes and the projection matrix; if a best stereo match exists, outputting the measured depth value, and if not, updating the region template position according to the sliding step and performing region stereo matching and depth estimation at the next left-image template position, i.e., jumping back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients; if the sliding step has not reached the right and lower boundaries of the left image, updating the region template position according to the sliding step and jumping back to the step of performing hierarchical search and stereo matching of the corresponding template positions in the left and right images and calculating the correlation coefficients, otherwise outputting all best stereo matches and measured depth values and stopping the calculation;
and calculating three-dimensional space coordinates based on the depth measurement value of the optimal region stereo matching.
2. The hierarchical search method for binocular vision depth measurement of intelligent vehicles according to claim 1, wherein the step of initializing condition setting and calibration of the binocular camera model comprises:
setting two installation conditions of a binocular camera;
and calibrating the internal parameters and the external parameters of the binocular camera to obtain a projection matrix from the two-dimensional image to the three-dimensional world.
3. The hierarchical search method for binocular vision depth measurement of intelligent vehicles according to claim 2, wherein the step of setting two installation conditions of the binocular camera comprises:
the installed binocular left and right cameras are the same in model, namely the same in internal reference, and are transversely installed in parallel;
the installed binocular left and right cameras are of different models, i.e., have different intrinsic parameters, are not mounted in parallel, and require no stereo rectification.
4. The hierarchical search method for binocular vision depth measurement of intelligent vehicles according to claim 2, wherein the step of calibrating the internal and external parameters of the binocular camera to obtain the projection matrix of the two-dimensional image to the three-dimensional world comprises the steps of:
and respectively calibrating and calculating projection matrixes of the left camera model and the right camera model relative to a world coordinate system by using a monocular camera calibration method or a binocular camera calibration method.
5. The hierarchical search method for binocular vision depth measurement of intelligent vehicles according to claim 1, wherein the step of constructing the anchor point projection relation model of the corresponding region templates in the left and right images comprises:
decomposing the projection matrix calculated in the left and right camera models;
calculating a joint constant matrix coefficient decomposed in the projection matrix;
initializing a positioning point in the left image, and calculating the locus of the corresponding positioning point in the right image.
CN202111117235.5A 2021-09-23 2021-09-23 Hierarchical search method for binocular vision depth measurement of intelligent vehicle Active CN113763451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111117235.5A CN113763451B (en) 2021-09-23 2021-09-23 Hierarchical search method for binocular vision depth measurement of intelligent vehicle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111117235.5A CN113763451B (en) 2021-09-23 2021-09-23 Hierarchical search method for binocular vision depth measurement of intelligent vehicle

Publications (2)

Publication Number Publication Date
CN113763451A CN113763451A (en) 2021-12-07
CN113763451B true CN113763451B (en) 2024-01-02

Family

ID=78797139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111117235.5A Active CN113763451B (en) 2021-09-23 2021-09-23 Hierarchical search method for binocular vision depth measurement of intelligent vehicle

Country Status (1)

Country Link
CN (1) CN113763451B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014145856A1 (en) * 2013-03-15 2014-09-18 Pelican Imaging Corporation Systems and methods for stereo imaging with camera arrays
US10755428B2 (en) * 2017-04-17 2020-08-25 The United States Of America, As Represented By The Secretary Of The Navy Apparatuses and methods for machine vision system including creation of a point cloud model and/or three dimensional model

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101702751A (en) * 2009-11-03 2010-05-05 上海富瀚微电子有限公司 Three-dimensional block matching method in video noise-reduction treatment
CN102184410A (en) * 2011-05-06 2011-09-14 浙江工业大学 Three-dimensional recovered cranioface recognition method
CN105809717A (en) * 2016-03-10 2016-07-27 上海玮舟微电子科技有限公司 Depth estimation method, system and electronic equipment
CN106767399A (en) * 2016-11-11 2017-05-31 大连理工大学 The non-contact measurement method of the logistics measurement of cargo found range based on binocular stereo vision and dot laser
EP3418975A1 (en) * 2017-06-23 2018-12-26 Koninklijke Philips N.V. Depth estimation for an image
CN109360246A (en) * 2018-11-02 2019-02-19 哈尔滨工业大学 Stereo vision three-dimensional displacement measurement method based on synchronous sub-district search
CN110569704A (en) * 2019-05-11 2019-12-13 北京工业大学 Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
CN110246169A (en) * 2019-05-30 2019-09-17 华中科技大学 A kind of window adaptive three-dimensional matching process and system based on gradient
CN110414384A (en) * 2019-07-11 2019-11-05 东南大学 Intelligent rice and wheat harvester leading line tracking
CN112051853A (en) * 2020-09-18 2020-12-08 哈尔滨理工大学 Intelligent obstacle avoidance system and method based on machine vision
CN112770118A (en) * 2020-12-31 2021-05-07 展讯通信(天津)有限公司 Video frame image motion estimation method and related equipment

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
Dimitris G. Chachlakis et al. Minimum Mean-Squared-Error Autocorrelation Processing in Coprime Arrays. arXiv:2010.11073v1, 2020, 1-20. *
G. de Haan et al. True-motion estimation with 3-D recursive search block matching. IEEE Transactions on Circuits and Systems for Video Technology, 1993, 3(5): 369-379. *
K. Virk et al. Low complexity recursive search based motion estimation algorithm for video coding applications. 2005 13th European Signal Processing Conference, 2015, 1-4. *
Mostafa Mansour et al. Relative Importance of Binocular Disparity and Motion Parallax for Depth Estimation: A Computer Vision Approach. Remote Sensing, 2019, 11(17): 1990. *
N. Guo et al. A Computationally Efficient Path-Following Control Strategy of Autonomous Electric Vehicles With Yaw Motion Stabilization. IEEE Transactions on Transportation Electrification, 2020, 6(2): 728-739. *
Viny Saajan Victor et al. arXiv:2109.10123, 2021, 1-9. *
Zhou Zhongkui. Research on target detection and scene enhancement technology for intelligent vehicles based on machine learning. China Master's Theses Full-text Database (Engineering Science & Technology II), 2021, (02): C035-500. *
Zong Wenwen. Research and application of key technologies of feature point matching based on binocular stereo vision. China Master's Theses Full-text Database (Information Science & Technology), 2012, (03): I138-1928. *
Li Yinguo et al. Large-scale intelligent driving scene reconstruction based on binocular images. Computer Science, 2019, 46(S2): 251-254, 259. *
Wang Haoyu. Virtual scene reconstruction for intelligent driving. China Master's Theses Full-text Database (Engineering Science & Technology II), 2019, (12): C035-259. *

Also Published As

Publication number Publication date
CN113763451A (en) 2021-12-07

Similar Documents

Publication Publication Date Title
CN111462200B (en) Cross-video pedestrian positioning and tracking method, system and equipment
CN110569704B (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
Park et al. High-precision depth estimation using uncalibrated LiDAR and stereo fusion
EP2386998B1 (en) A Two-Stage Correlation Method for Correspondence Search
CN108537848B (en) Two-stage pose optimization estimation method for indoor scene reconstruction
US8913055B2 (en) Online environment mapping
EP2757524B1 (en) Depth sensing method and system for autonomous vehicles
CN109472820B (en) Monocular RGB-D camera real-time face reconstruction method and device
CN107843251B (en) Pose estimation method of mobile robot
CN111862234B (en) Binocular camera self-calibration method and system
CN105069804B (en) Threedimensional model scan rebuilding method based on smart mobile phone
Munoz-Banon et al. Targetless camera-lidar calibration in unstructured environments
EP3293700B1 (en) 3d reconstruction for vehicle
CN112083403B (en) Positioning tracking error correction method and system for virtual scene
CN111738033B (en) Vehicle driving information determination method and device based on plane segmentation and vehicle-mounted terminal
CN111862236B (en) Self-calibration method and system for fixed-focus binocular camera
Miksch et al. Automatic extrinsic camera self-calibration based on homography and epipolar geometry
CN115936029A (en) SLAM positioning method and device based on two-dimensional code
CN114812558A (en) Monocular vision unmanned aerial vehicle autonomous positioning method combined with laser ranging
CN112017259B (en) Indoor positioning and image building method based on depth camera and thermal imager
CN113763451B (en) Hierarchical search method for binocular vision depth measurement of intelligent vehicle
Le Besnerais et al. Dense height map estimation from oblique aerial image sequences
CN117011660A (en) Dot line feature SLAM method for fusing depth information in low-texture scene
Wirges et al. Self-supervised flow estimation using geometric regularization with applications to camera image and grid map sequences
CN117237431A (en) Training method and device of depth estimation model, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant