Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a global optimization method of a depth map which filters and denoises the depth map and fills large holes. The method converts the left-view and right-view disparity data into the RGB camera viewing angle and makes full use of RGB image edge information, and it is simple and efficient.
In order to achieve the purpose, the invention adopts the specific scheme that: a global optimization method of a depth map comprises the following steps:
Step one: perform region filtering on the initial left-view disparity data and the initial right-view disparity data, respectively, based on a region growing method, removing the erroneous disparity of isolated block regions to obtain optimized left-view disparity data and optimized right-view disparity data. The specific process of removing block regions with erroneous disparity based on the region growing method is as follows:
S1: create two images, Buff and Dst, of the same size as the initial left-view and right-view disparity data and initialized to zero, wherein Buff records the pixels that have already been grown and Dst marks the image block regions that satisfy the removal condition;
S2: set a first threshold and a second threshold, wherein the first threshold is a disparity difference value and the second threshold is the area below which a block region is regarded as erroneous disparity;
S3: traverse the image; for each pixel that has not yet been grown, take the current point as a seed point and pass it into the region growing function;
S4: create two stacks, vectorGrowPoints and resultPoints, and push the seed point into both. Take the tail point out of vectorGrowPoints and examine its eight neighbors in the directions {-1, -1}, {0, -1}, {1, -1}, {1, 0}, {1, 1}, {0, 1}, {-1, 1}, {-1, 0}. For each neighboring pixel that has not yet been grown, compare its disparity value with that of the seed point; if the difference is smaller than the first threshold, the condition is satisfied: push the pixel into vectorGrowPoints and resultPoints, and mark it as grown in Buff. Repeat this process until vectorGrowPoints is empty. If the number of points in resultPoints is smaller than the second threshold, mark the region in Dst;
S5: repeat steps S3 and S4 until all pixels have been traversed, and remove the regions marked in Dst from the disparity data to obtain the optimized left-view disparity data and optimized right-view disparity data;
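A minimal Python sketch of steps S1 to S5, assuming the preferred thresholds mentioned later in the description (a disparity difference of 10 and an area of 60) and treating zero disparity as invalid; the function name and array layout are illustrative, not from the original:

```python
import numpy as np

# Eight growth directions from step S4.
DIRS = [(-1, -1), (0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0)]

def remove_small_regions(disp, diff_thresh=10, area_thresh=60):
    """Region-growing filter (S1-S5): zero out isolated blocks whose pixels
    match the seed disparity but whose total area is below area_thresh."""
    h, w = disp.shape
    buff = np.zeros((h, w), dtype=bool)   # Buff: records grown pixels
    dst = np.zeros((h, w), dtype=bool)    # Dst: marks regions to remove
    for y in range(h):
        for x in range(w):
            if buff[y, x] or disp[y, x] == 0:
                continue
            seed = disp[y, x]
            grow = [(x, y)]               # stack vectorGrowPoints
            result = [(x, y)]             # stack resultPoints
            buff[y, x] = True
            while grow:
                cx, cy = grow.pop()       # take the tail point
                for dx, dy in DIRS:
                    nx, ny = cx + dx, cy + dy
                    if (0 <= nx < w and 0 <= ny < h and not buff[ny, nx]
                            and disp[ny, nx] != 0
                            and abs(float(disp[ny, nx]) - float(seed)) < diff_thresh):
                        buff[ny, nx] = True
                        grow.append((nx, ny))
                        result.append((nx, ny))
            if len(result) < area_thresh:  # small block: mark it in Dst
                for px, py in result:
                    dst[py, px] = True
    out = disp.copy()
    out[dst] = 0                           # remove marked regions
    return out
```

Note that, as in step S4, every neighbor is compared against the seed's disparity rather than its immediate neighbor's, so a region cannot drift away from the seed value.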
Step two: calculate left-view confidence coefficient data from the left-view disparity data and the right-view disparity data optimized in step one. The left-view confidence coefficient data is calculated as O_p = e^(-|ld - rd|), wherein ld is the left-view disparity data optimized in step one, rd is the corresponding right-view disparity data optimized in step one, and O_p is the left-view confidence coefficient data;
Step three: calculate left-view depth data from the left-view disparity data optimized in step one and the camera parameters; then apply a perspective projection transformation to the left-view depth data and the left-view confidence coefficient data obtained in step two to obtain the initial depth data and confidence coefficient data under the viewing angle of the RGB camera;
Step four: calculate edge constraint coefficient data using the RGB image edge information, and then generate optimized depth data through a global optimization objective function from the edge constraint coefficient data and the initial depth data and confidence coefficient data under the viewing angle of the RGB camera obtained in step three.
Preferably, an acquisition device is used in the process of acquiring the depth image, and the acquisition device comprises two near-infrared cameras and an RGB camera.
Preferably, in step three, the specific calculation process of the initial depth data under the viewing angle of the RGB camera is as follows:
T1: traverse the image pixels; with the baseline and focal length of the left and right near-infrared cameras known, convert each disparity value into left-view depth data;
T2: from the left-view depth data and the intrinsic parameters of the left or right near-infrared camera, calculate the three-dimensional coordinates of the corresponding space point in that camera's coordinate system;
T3: from the relative pose between the left or right near-infrared camera coordinate system and the RGB camera coordinate system, together with the stereo rectification matrix between the left and right near-infrared cameras, calculate the three-dimensional coordinates of the corresponding space point in the RGB camera coordinate system;
T4: from the intrinsic parameters of the RGB camera, calculate the projection and the depth value of the corresponding space point on the RGB image plane, obtaining the initial depth data under the viewing angle of the RGB camera.
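Steps T1 to T4 can be sketched as follows. This is a minimal NumPy sketch under assumed pinhole intrinsics; the function name, the rotation/translation pair (R, t) standing in for the relative pose and rectification, and the keep-the-nearest-point rule for pixels receiving multiple projections are illustrative assumptions, not details from the original:

```python
import numpy as np

def disparity_to_rgb_depth(disp, baseline, focal, K_nir, K_rgb, R, t, h_rgb, w_rgb):
    """T1-T4 sketch: disparity -> left-view depth -> 3D point in the NIR
    frame -> 3D point in the RGB frame -> depth on the RGB image plane."""
    h, w = disp.shape
    depth_rgb = np.zeros((h_rgb, w_rgb))
    fx, fy = K_nir[0, 0], K_nir[1, 1]
    cx, cy = K_nir[0, 2], K_nir[1, 2]
    for v in range(h):
        for u in range(w):
            d = disp[v, u]
            if d <= 0:                          # invalid disparity
                continue
            z = baseline * focal / d            # T1: depth from disparity
            X = np.array([(u - cx) * z / fx,    # T2: back-project to 3D
                          (v - cy) * z / fy,
                          z])
            Xr = R @ X + t                      # T3: into the RGB camera frame
            if Xr[2] <= 0:
                continue
            p = K_rgb @ Xr                      # T4: project onto RGB plane
            ur = int(round(p[0] / p[2]))
            vr = int(round(p[1] / p[2]))
            if 0 <= ur < w_rgb and 0 <= vr < h_rgb:
                # Keep the nearest point when several map to the same pixel.
                if depth_rgb[vr, ur] == 0 or Xr[2] < depth_rgb[vr, ur]:
                    depth_rgb[vr, ur] = Xr[2]
    return depth_rgb
```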
Preferably, the global optimization objective function adopted in step four is:

E(D) = Σ_p α_p (D_p − D̂_p)² + Σ_{(p,q)∈E} ω_qp (D_p − D_q)²

wherein D̂_p is the initial depth data of pixel point p on the image, D_p is the depth data to be solved, α_p is the confidence coefficient data of pixel point p under the viewing angle of the RGB camera, ω_qp is the edge constraint coefficient data, and q ranges over the four-neighborhood pixel points of p. The optimization ends when E(D) is minimal. Assuming the image has n pixel points, in order to minimize E(D), the right-hand side of the objective function is differentiated with respect to each D_p and set equal to zero, giving n equations; rearranging yields a linear system AX = B, wherein A is an n × n coefficient matrix related to α_p and ω_qp, B is an n × 1 constant matrix related to α_p and D̂_p, and X is the column vector of depth data to be solved, [D_1, D_2, …, D_n]^T. The optimized depth data is obtained through iterative calculation.
Preferably, for any pixel point p, the p-th row of AX = B is:

(α_p + Σ_{(p,q)∈E} (ω_pq + ω_qp)) D_p − Σ_{(p,q)∈E} (ω_pq + ω_qp) D_q = α_p D̂_p

from which the coefficient matrix A and the constant matrix B are calculated.
Preferably, the specific calculation process of the coefficient matrix A and the constant matrix B is as follows:
(1) first, take the gradient of the RGB image: ∇I_qp = I_q − I_p is the gray-scale difference between pixel points q and p; then the edge constraint coefficient is ω_qp = e^(−|∇I_qp|/β), whose value lies in the range [0, 1], wherein β is a tuning parameter and β = 20;
(2) from α_p and ω_qp, calculate the coefficient matrix A, whose p-th row reads (α_p + Σ_{(p,q)∈E} (ω_pq + ω_qp)) D_p − Σ_{(p,q)∈E} (ω_pq + ω_qp) D_q; the row has 5 non-zero values, namely the elements corresponding to pixel point p and its four-neighborhood pixel points: the element corresponding to pixel point p is α_p + Σ_{(p,q)∈E} (ω_pq + ω_qp), and the element corresponding to each four-neighborhood pixel point q of p is −(ω_pq + ω_qp);
(3) from α_p and the initial depth data D̂_p, calculate the constant matrix B, whose p-th row is α_p D̂_p.
Preferably, the linear system is solved by a successive over-relaxation (SOR) iteration method to obtain the optimized depth data.
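A generic SOR iteration is sketched below for a small dense system; the patent applies the same per-row update to the sparse five-point system AX = B. The relaxation factor ω = 1.5 and the iteration count are illustrative choices, not values from the original:

```python
import numpy as np

def sor_solve(A, b, omega=1.5, iters=200):
    """Successive over-relaxation for AX = B. For 0 < omega < 2 this
    converges on symmetric positive-definite systems such as the
    confidence-plus-smoothness system assembled in the patent."""
    n = len(b)
    x = np.zeros(n)
    for _ in range(iters):
        for i in range(n):
            # Sum of off-diagonal terms using the freshest values of x.
            sigma = A[i] @ x - A[i, i] * x[i]
            x[i] = (1 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
    return x
```

Because each row of the patent's A has only five non-zero entries, a production version would store A sparsely and visit only those five entries per update, which is what makes a real-time GPU implementation feasible.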
Advantageous effects:
(1) The invention provides a global optimization method of a depth map based on an acquisition device comprising two near-infrared (NIR) cameras and a visible-light (RGB) camera; the near-infrared cameras form a binocular stereo vision system that acquires the depth map in real time and registers it with the RGB image acquired by the visible-light camera. The method makes full use of the global information of the left-view and right-view disparity data and of the edge constraint of the color data to globally optimize the depth map, converting the left-view and right-view disparity data into the RGB camera viewing angle and utilizing RGB image edge information. When calculating the confidence coefficient data, an e^(−x) model is applied directly to the left-view and right-view disparity data, and experiments prove this simple and effective. Simple in the following sense: in existing methods, the confidence coefficient is determined by fitting a quadratic curve to the matching costs of three adjacent integer disparity values of a pixel, which requires recomputing the disparity matching cost, fitting the three matching-cost values quadratically, and determining α_p by judging the curve orientation; the present method is simpler by comparison. Effective in the following sense: the optimized depth map is smooth, edges are preserved, and large holes are filled well;
(2) The invention provides a global optimization method of a depth map in which a region growing method performs region filtering on the initial left-view disparity data and the initial right-view disparity data separately. Experiments prove that the method completes the marking after a single traversal of the image and effectively removes the erroneous disparity of small isolated regions whose disparity values are similar internally but obviously different from the surrounding disparity values.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to the flowchart of FIG. 1, the intrinsic and extrinsic parameters of all cameras of the invention are known, and the initial left-view disparity data and initial right-view disparity data are calculated by the prior art, which is not described herein again. A global optimization method of a depth map is used in the process of acquiring a depth image based on an acquisition device, wherein the acquisition device comprises two near-infrared cameras and an RGB camera, and the method comprises the following steps:
Step one: perform region filtering on the initial left-view disparity data and the initial right-view disparity data, respectively, based on a region growing method, removing the erroneous disparity of isolated block regions to obtain optimized left-view disparity data and optimized right-view disparity data. Generated disparity data is generally subjected to a left-right consistency check, which removes a large number of mismatched point disparities, but mismatched disparities forming small regions still remain. The method therefore first performs region filtering on the left-view and right-view disparity data separately and removes small isolated regions with similar disparity values, further improving disparity quality. The specific process of removing block regions with mismatched disparity based on the region growing method is as follows:
S1: create two images, Buff and Dst, of the same size as the initial left-view and right-view disparity data and initialized to zero, wherein Buff records the pixels that have already been grown and Dst marks the image block regions that satisfy the removal condition;
S2: set a first threshold and a second threshold, wherein the first threshold is a disparity difference value and the second threshold is the area below which a block region is regarded as erroneous disparity; preferably, the first threshold is 10 and the second threshold is 60;
S3: traverse the image; for each pixel that has not yet been grown, take the current point as a seed point and pass it into the region growing function;
S4: create two stacks, vectorGrowPoints and resultPoints, and push the seed point into both. Take the tail point out of vectorGrowPoints and examine its eight neighbors in the directions {-1, -1}, {0, -1}, {1, -1}, {1, 0}, {1, 1}, {0, 1}, {-1, 1}, {-1, 0}. For each neighboring pixel that has not yet been grown, compare its disparity value with that of the seed point; if the difference is smaller than the first threshold, the condition is satisfied: push the pixel into vectorGrowPoints and resultPoints, and mark it as grown in Buff. Repeat this process until vectorGrowPoints is empty. If the number of points in resultPoints is smaller than the second threshold, mark the region in Dst;
S5: repeat steps S3 and S4 until all pixels have been traversed, and remove the regions marked in Dst from the disparity data to obtain the optimized left-view disparity data and optimized right-view disparity data;
Step two: calculate left-view confidence coefficient data from the left-view disparity data and the right-view disparity data optimized in step one. The left-view confidence coefficient data is calculated as O_p = e^(-|ld - rd|), wherein ld is the left-view disparity data optimized in step one, rd is the corresponding right-view disparity data optimized in step one, and O_p is the left-view confidence coefficient data. Existing methods determine per-point disparity confidence by fitting a matching-cost curve, which is cumbersome to implement; the present method of calculating confidence coefficient data is simple and efficient. The left-view confidence coefficient data is decisive for the optimization effect, and the reliability of the O_p values is closely related to the accuracy of the disparity data: small blocks of erroneous disparity lead, after optimization, to large blocks of erroneous depth data in the corresponding region. The invention therefore removes block-shaped disparity errors with the region growing method to improve disparity quality;
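The confidence formula O_p = e^(-|ld - rd|) vectorizes directly. A minimal NumPy sketch, assuming rd has already been resampled to the left-view pixel grid so that ld and rd correspond pixel for pixel:

```python
import numpy as np

def confidence(ld, rd):
    """Left-view confidence O_p = exp(-|ld - rd|) per pixel.
    ld, rd: optimized left-view and corresponding right-view disparity maps
    (rd assumed warped into left-view coordinates)."""
    return np.exp(-np.abs(ld.astype(np.float64) - rd.astype(np.float64)))
```

Pixels where the two views agree get confidence 1, and the confidence decays exponentially as the left-right disparity disagreement grows.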
Step three: calculate left-view depth data from the left-view disparity data optimized in step one and the camera parameters; then apply a perspective projection transformation to the left-view depth data and the left-view confidence coefficient data obtained in step two to obtain the initial depth data and confidence coefficient data under the viewing angle of the RGB camera. The specific calculation process of the initial depth data under the viewing angle of the RGB camera is as follows:
T1: traverse the image pixels; with the baseline and focal length of the left and right near-infrared cameras known, convert each disparity value into left-view depth data;
T2: from the left-view depth data and the intrinsic parameters of the left or right near-infrared camera, calculate the three-dimensional coordinates of the corresponding space point in that camera's coordinate system;
T3: from the relative pose between the left or right near-infrared camera coordinate system and the RGB camera coordinate system, together with the stereo rectification matrix between the left and right near-infrared cameras, calculate the three-dimensional coordinates of the corresponding space point in the RGB camera coordinate system;
T4: from the intrinsic parameters of the RGB camera, calculate the projection and the depth value of the corresponding space point on the RGB image plane, obtaining the initial depth data under the viewing angle of the RGB camera;
Step four: calculate edge constraint coefficient data using the RGB image edge information, and then generate optimized depth data through a global optimization objective function from the edge constraint coefficient data and the initial depth data and confidence coefficient data under the viewing angle of the RGB camera obtained in step three. The adopted global optimization objective function is:

E(D) = Σ_p α_p (D_p − D̂_p)² + Σ_{(p,q)∈E} ω_qp (D_p − D_q)²

wherein D̂_p is the initial depth data of pixel point p on the image, D_p is the depth data to be solved, α_p is the confidence coefficient data of pixel point p under the viewing angle of the RGB camera, ω_qp is the edge constraint coefficient data, and q ranges over the four-neighborhood pixel points of p. The optimization ends when E(D) is minimal. Assuming the image has n pixel points, in order to minimize E(D), the right-hand side of the objective function is differentiated with respect to each D_p and set equal to zero, giving n equations; rearranging yields a linear system AX = B, wherein A is an n × n coefficient matrix related to α_p and ω_qp, B is an n × 1 constant matrix related to α_p and D̂_p, and X is the column vector of depth data to be solved, [D_1, D_2, …, D_n]^T. The optimized depth data is obtained through iterative calculation.

For any pixel point p, the p-th row of AX = B is:

(α_p + Σ_{(p,q)∈E} (ω_pq + ω_qp)) D_p − Σ_{(p,q)∈E} (ω_pq + ω_qp) D_q = α_p D̂_p

from which the coefficient matrix A and the constant matrix B are calculated.
After the initial depth data is acquired in step three, the coefficient matrix and the constant matrix are calculated as described below. For an image of megapixel resolution, the depth data volume reaches the millions and the coefficient matrix data volume is of squared order; to meet the requirement of real-time implementation on the GPU, the invention solves the linear system with the successive over-relaxation (SOR) iteration method to complete the depth data optimization, as shown in FIG. 2 and FIG. 3: FIG. 2 is a depth map with large holes before optimization, and FIG. 3 is the depth map optimized using the global optimization method of the invention. The specific calculation process of the coefficient matrix A and the constant matrix B is as follows:
(1) First, take the gradient of the RGB image: ∇I_qp = I_q − I_p is the gray-scale difference between pixel points q and p; then the edge constraint coefficient is ω_qp = e^(−|∇I_qp|/β), whose value lies in the range [0, 1], wherein β is a tuning parameter and β = 20. This step solves for ω_qp; its effect on the depth result is to keep depth edges from being overly smoothed;
(2) from α_p and ω_qp, calculate the coefficient matrix A, whose p-th row reads (α_p + Σ_{(p,q)∈E} (ω_pq + ω_qp)) D_p − Σ_{(p,q)∈E} (ω_pq + ω_qp) D_q; the row has 5 non-zero values, namely the elements corresponding to pixel point p and its four-neighborhood pixel points: the element corresponding to pixel point p is α_p + Σ_{(p,q)∈E} (ω_pq + ω_qp), and the element corresponding to each four-neighborhood pixel point q of p is −(ω_pq + ω_qp);
(3) from α_p and the initial depth data D̂_p, calculate the constant matrix B, whose p-th row is α_p D̂_p.
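The assembly of A and B from steps (1) to (3) can be sketched as follows. This is a dense NumPy sketch for clarity (a real implementation would store the five non-zero entries per row sparsely), and it assumes the concrete edge-weight form ω_qp = exp(−|I_q − I_p|/β), which is one plausible reading of step (1); the function name and symmetric-weight simplification are illustrative:

```python
import numpy as np

def build_system(alpha, depth0, gray, beta=20.0):
    """Assemble AX = B for an h x w grid. Row p:
    (alpha_p + sum(w_pq + w_qp)) D_p - sum(w_pq + w_qp) D_q = alpha_p * D0_p.
    Here w_qp = exp(-|I_q - I_p| / beta) is symmetric, so w_pq + w_qp = 2 w_qp."""
    h, w = alpha.shape
    n = h * w
    A = np.zeros((n, n))
    B = np.zeros(n)

    def idx(y, x):
        return y * w + x

    for y in range(h):
        for x in range(w):
            p = idx(y, x)
            A[p, p] = alpha[y, x]                  # confidence (data) term
            B[p] = alpha[y, x] * depth0[y, x]      # p-th row of B: alpha_p * D0_p
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # four-neighborhood
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    wq = np.exp(-abs(float(gray[ny, nx]) - float(gray[y, x])) / beta)
                    A[p, p] += 2.0 * wq            # + (w_pq + w_qp) on the diagonal
                    A[p, idx(ny, nx)] -= 2.0 * wq  # - (w_pq + w_qp) for neighbor q
    return A, B
```

Each interior row then has exactly the 5 non-zero values described in step (2), and the resulting system is symmetric positive definite, so SOR converges on it.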
(4) Solve the linear system by the SOR method to obtain the optimized depth data.
The invention provides a global optimization method of a depth map which performs global optimization on the initial depth of a scene and realizes real-time, high-precision depth acquisition. It mainly solves the problem that the calculated disparity data contains a large number of holes where scene texture is lacking or repetitive, such as hair, whose texture is uniform and which easily absorbs projected light and lacks features even when an active light source projects structured light. The method can be used in cases such as three-dimensional reconstruction and somatosensory interaction. In three-dimensional reconstruction, it provides high-quality depth data under each viewing angle for real-time high-precision reconstruction and can simplify subsequent optimization processing operations. In somatosensory interaction, a realistic picture is presented to the other party by establishing different interactor models.
It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.