CN108830895A - Segmentation-based local expansion move method for stereo matching - Google Patents

Segmentation-based local expansion move method for stereo matching

Info

Publication number
CN108830895A
CN108830895A (application CN201810687866.2A)
Authority
CN
China
Prior art keywords
parallax
expression formula
pixel
image
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810687866.2A
Other languages
Chinese (zh)
Inventor
赵玺
杨新宇
刘鹏康
苏振强
骆志伟
杜妍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN201810687866.2A priority Critical patent/CN108830895A/en
Publication of CN108830895A publication Critical patent/CN108830895A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/593 Depth or shape recovery from multiple images from stereo images
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10012 Stereo images
    • G06T2207/20 Special algorithmic details
    • G06T2207/20228 Disparity calculation for image-based rendering

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a segmentation-based local expansion move method for stereo matching. Because it incorporates image segmentation, the method can fully exploit the edge information of the image during processing and avoid the influence of abrupt disparity changes on the result. The method effectively reduces matching errors in poorly textured image regions, computes disparity at object edges more accurately than the plain local expansion move method, and achieves higher overall accuracy. It applies 3D labels when computing disparity, which yields more accurate disparities than discrete disparity labels; it computes disparity on top of a segmentation, which preserves object edges better; and it effectively reduces matching errors in textureless regions, with higher overall accuracy than existing methods.

Description

A Segmentation-Based Local Expansion Move Method for Stereo Matching

Technical Field

The invention relates to a segmentation-based local expansion move method for stereo matching.

Background Art

As an important branch of computer vision, stereo matching is the process of converting two-dimensional position information into three-dimensional depth information by finding corresponding pixels between two or more images, thereby estimating a three-dimensional model of the scene. Stereo matching algorithms were first applied in photogrammetry, to construct topographic height maps automatically from overlapping aerial images. Today, stereo matching is widely used in 3D navigation, 3D reconstruction, autonomous vehicles, intelligent video surveillance, remote-sensing image analysis, medical imaging, robot control, and many other fields.

Although stereo matching has advanced considerably in recent years, it still faces many problems. These problems are mainly matching ambiguities arising from three causes: occlusion, lack of texture (or repeated texture), and radiometric differences caused by illumination and noise. The following discussion focuses on the lack of texture. Textureless regions are common in the real world, for example white walls, the ground, or the interior of large uniformly colored objects. A lack of texture means the image has few feature points in that region: sparse stereo matching algorithms cannot match effectively because no feature points can be found, and even dense stereo matching algorithms make errors because a pixel's matching cost is indistinguishable from the costs of other pixels inside the textureless region. When computing a similarity measure, one pixel may have the same matching cost as many pixels in the region; the matching cost of a single pixel alone is then no longer sufficient.

To improve results in textureless regions, more information must be used. Local methods can enlarge the window used for cost aggregation so that more pixels are taken into account, but a larger window degrades the matching results in richly textured regions. Relying only on the information inside a local window, it is therefore hard to further improve the accuracy of stereo matching in textureless regions; moreover, the computational complexity of cost aggregation grows with the window size. Global methods take all pixels of the image into account and are more accurate than local methods, but their computational complexity is higher. Furthermore, with the wide adoption of 3D label methods in stereo matching, the disparity search space has changed from a discrete label space to a continuous one, which amounts to infinitely many candidate labels and further increases the complexity of global methods. Among global methods, graph cuts can use an expansion-move mechanism to update the labels of many pixels simultaneously, rather than pixel by pixel as in belief propagation, which helps avoid poor local optima. The local expansion move algorithm adopts this graph-cut-based expansion-move mechanism and applies it to local regions, so that pixels' 3D labels can be estimated efficiently under a global model while poor local optima are avoided; updating many pixels at once also speeds up the algorithm. However, that method relies only on the penalty of the MRF smoothness term and a filtering function to preserve object edges, which causes problems in the disparity computed at object boundaries.

Summary of the Invention

The object of the present invention is to overcome the above shortcomings of the prior art and provide a segmentation-based local expansion move method for stereo matching. Aimed at poorly textured regions of the image, the method effectively improves accuracy in textureless regions while better preserving object edges.

To achieve this object, the present invention adopts the following technical solution:

A segmentation-based local expansion move method for stereo matching, comprising the following steps:

Step 1) Initialization:

An image pair is input; the left and right images of the pair are taken as the reference image and the target image, respectively.

Step 2) Matching cost computation:

A CNN-based matching cost is used to compute the disparity space image (DSI).

Step 3) Disparity selection:

The segmentation-based local expansion move method disclosed by the invention is used to compute disparity, yielding the raw disparity map.

Step 4) Disparity refinement:

The final left and right disparity maps are generated by left-right consistency checking, hole filling, and weighted median filtering.

A further refinement of the invention is as follows:

The specific procedure of step 3) is:

Step 3-1) The image is segmented with SLIC. Through segmentation, every pixel of the image is assigned a label giving its segment number, from 1 up to the maximum number of segments K.

Step 3-2) For each segment, all pixels within the smallest rectangle containing that segment are traversed; only when a pixel lies inside the segment are an expansion region and a filter window constructed centered on it.

Step 3-3) Each pixel applies the iterative α-expansion algorithm within the expansion region built in step 3-2). The initial α is determined by the current 3D label of a pixel chosen at random inside the segment. The energy of the whole MRF system when the pixel's disparity label is set to α is compared with the energy under its current label; if taking α makes the system energy smaller, the pixel's label is updated to α, otherwise the current label is kept. Each iteration traverses all pixels in the segment, expanding α into the expansion region.

The energy function giving the system energy in step 3-3) is as follows:

The MRF energy function is defined in expression (1).

In expression (1), the first term is the data term, which measures how well the disparity function agrees with the input images. The second term is the smoothness term: if the disparity is discontinuous between a pair of neighboring pixels (p,q)∈N, this term penalizes it, ensuring that disparity varies smoothly inside an object and changes sharply only at object edges. λ is a weight balancing the data term and the smoothness term; its value is 10.

The data term is as follows:

A slanted-patch matching term is used to measure the agreement between the disparity function and the input images, and 3D labels are used to improve accuracy on slanted surfaces. A 3D label represents the disparity d_p of each pixel p with three parameters, which define a plane f_p; the data term is defined in expression (2).

In expression (2), the 3D label of the disparity at point p is given by expression (3).

In expression (3), p_x and p_y denote the x and y coordinates of p, and the three parameters form a triplet, so estimating the disparity d_p of pixel p becomes the problem of estimating the three parameter values of the triplet. These parameters are determined with the PatchMatch random search strategy: a value z_0 is chosen at random within the range of continuous disparity values as the disparity of the point (x_0, y_0), and a random unit vector n = (n_x, n_y, n_z) is taken as the normal of the point (x_0, y_0, z_0); the three parameters are then obtained from (x_0, y_0, z_0) and n.

In expression (2), W_p is a square window centered on p. The weight ω_ps follows ASW, using guided image filtering, and is defined in expression (7).

In expression (7), I_p = I_L(p)/255 is a normalized color vector, μ_k and Σ_k are the mean and covariance of I_p over the window W′_k, and e is a regularization constant to avoid overfitting.

Given a disparity plane, the function ρ(s|f_p) in expression (2) measures the similarity between a pixel s in the window W_p and its corresponding point s′ in the right image; s′ is defined in expression (8).

ρ(s|f_p) is defined in expression (9):

ρ(s|f_p) = min(C_CNN(s, s′), τ_CNN)   (9)

In expression (9), C_CNN is a CNN-based matching cost, and the truncation value τ_CNN makes the matching cost robust to occluded regions. When the data term of the right image is computed, the relative disparity shift between the left and right images is reversed: if p and s now denote pixels in the right image, the positions of s and s′ in expression (9) are swapped, and the minus sign in expression (8) becomes a plus sign.

The smoothness term is as follows:

The smoothness term measures the distance between the labels (f_p, f_q) assigned to a pixel pair {p,q} under the label function f. The smoothness term of expression (1) is defined in expression (10):

E_smooth(f_p, f_q) = max(w_pq, ε) min(δ_pq(f_p, f_q), τ_dis)   (10)

In expression (10), ε is a small constant giving a lower bound on the weight w_pq, which makes the weight robust to noise; δ_pq(f_p, f_q) penalizes disparity discontinuities between f_p and f_q; and the truncation value τ_dis still allows sharp disparity changes at edges. The weight w_pq is defined in expression (11).

In expression (11), the parameter γ controls the influence of color differences.

δ_pq(f_p, f_q) is defined in expression (12):

δ_pq(f_p, f_q) = |d_p(f_p) - d_p(f_q)| + |d_q(f_q) - d_q(f_p)|   (12)

In expression (12), the first term is the difference between the disparities of point p under the planes represented by the two labels f_p and f_q, and the second term is the corresponding difference for point q.

Compared with the prior art, the present invention has the following beneficial effects:

Because the segmentation-based local expansion move method disclosed by the invention incorporates image segmentation, it can fully exploit the edge information of the image during processing and avoid the influence of abrupt disparity changes on the result. The invention effectively reduces matching errors in poorly textured image regions, computes disparity at object edges more accurately than the local expansion move method, and achieves higher overall accuracy. It has the following advantages:

First: the invention applies 3D labels when computing disparity; compared with discrete disparity labels, the obtained disparities are more accurate and the results are better.

Second: the invention computes disparity on the basis of a segmentation, which preserves object edges better.

Third: the invention effectively reduces matching errors in textureless regions and, compared with existing methods, achieves higher overall accuracy.

Brief Description of the Drawings

Fig. 1 shows the flow of the present invention;

Fig. 2 shows the window selection process of the present invention;

Fig. 3 compares the results of the present invention with the existing method.

Detailed Description

The present invention is described in further detail below with reference to the accompanying drawings:

Referring to Fig. 1, to improve the accuracy of stereo matching on poorly textured images, the present invention adopts the idea of local expansion moves: the image is first divided into segments by an image segmentation method, and local expansion moves are then applied on the segments, so that the disparity computation of the stereo matching algorithm still obtains good results at object edges. The specific steps of the invention are as follows:

Step 1) Initialization

An image pair is input; the left and right images of the pair are taken as the reference image and the target image, respectively.

Step 2) Matching cost computation

A CNN-based matching cost is used to compute the disparity space image (DSI).

Step 3) Disparity selection:

The segmentation-based local expansion move method disclosed by the invention is used to compute disparity, yielding the raw disparity map. The specific steps are as follows:

Step 3-1) The image is segmented with SLIC. Through segmentation, every pixel of the image is assigned a label giving its segment number, from 1 up to the maximum number of segments K.
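Step 3-1) can be sketched as follows. The patent uses SLIC superpixels; the minimal stand-in below implements only SLIC's seeding stage (each pixel is assigned to its nearest regular-grid seed, with no color-distance k-means refinement), but it produces the same kind of per-pixel segment labels 1..K assumed by the later steps. Function and parameter names are illustrative, not from the patent.

```python
import numpy as np

def grid_superpixels(h, w, step):
    """Assign each pixel of an h-by-w image to the nearest seed on a
    regular grid, yielding segment labels 1..K as in step 3-1).

    This is only SLIC's seeding/assignment stage; the real SLIC also
    iterates a k-means-like refinement in combined color+space distance.
    """
    ys = np.arange(step // 2, h, step)                     # seed rows
    xs = np.arange(step // 2, w, step)                     # seed cols
    seeds = np.array([(y, x) for y in ys for x in xs])     # K seeds
    yy, xx = np.mgrid[0:h, 0:w]
    # squared spatial distance from every pixel to every seed: shape (K, h, w)
    d2 = (yy[None] - seeds[:, 0, None, None]) ** 2 + \
         (xx[None] - seeds[:, 1, None, None]) ** 2
    return d2.argmin(axis=0) + 1, len(seeds)               # labels in 1..K

labels, K = grid_superpixels(12, 12, 4)
```

In practice one would call a real SLIC implementation here; only the resulting label map (1..K) matters for the following steps.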

Step 3-2) As shown in Fig. 2, for each segment, all pixels within the smallest rectangle containing that segment (the rectangle bounded by the dashed line in Fig. 2) are traversed; only when a pixel lies inside the segment are an expansion region (the window of radius r in Fig. 2) and a filter window (the window of radius r+R in Fig. 2) constructed centered on it. Taking one segment as an example: for each pixel in the segment, an expansion region of radius r is built around it, and the expansion region is enlarged outward by R to serve as the guided-filtering window when the energy function is computed.

Step 3-3) Each pixel applies the iterative α-expansion algorithm within the expansion region built in step 3-2). The initial α is determined by the current 3D label of a pixel chosen at random inside the segment. The energy of the whole MRF system when the pixel's disparity label is set to α is compared with the energy under its current label; if taking α makes the system energy smaller, the pixel's label is updated to α, otherwise the current label is kept. Each iteration traverses all pixels in the segment, expanding α into the expansion region. Typically, after 7-9 iterations a pixel obtains a satisfactory disparity label.
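A minimal sketch of the accept/reject rule in step 3-3). The patent evaluates candidate moves via graph cuts over the whole MRF; the simplified stand-in below applies the same rule greedily per pixel (closer to ICM than to true graph-cut α-expansion), with scalar labels and an illustrative truncated smoothness, purely to show the move-acceptance logic. All names and the toy energy are assumptions.

```python
import random

def expansion_pass(labels, segment, neighbors, unary, lam=1.0, alpha=None):
    """One expansion pass over a segment: propose alpha to every pixel and
    accept it only where it lowers the energy, as in step 3-3).

    alpha defaults to the current label of a random pixel in the segment,
    mirroring the patent's initialization.  The real method compares
    whole-MRF energies via graph cuts; here each pixel compares a local
    energy (data cost plus truncated smoothness against its neighbors).
    """
    if alpha is None:
        alpha = labels[random.choice(segment)]
    for p in segment:
        def energy(lbl):
            smooth = sum(lam * min(abs(lbl - labels[q]), 2.0)
                         for q in neighbors[p])
            return unary(p, lbl) + smooth
        if energy(alpha) < energy(labels[p]):
            labels[p] = alpha        # accept the move, else keep current label
    return labels
```

Running several such passes per segment corresponds to the 7-9 iterations mentioned above.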

The energy function giving the system energy above is as follows:

The local expansion move algorithm is built on graph-cut-based α-expansion and is in essence still a global method, so solving it remains a matter of minimizing the overall energy of a Markov random field system. The MRF energy function is defined in expression (1).

In expression (1), the first term is the data term, which measures how well the disparity function agrees with the input images. The second term is the smoothness term: if the disparity is discontinuous between a pair of neighboring pixels (p,q)∈N, this term penalizes it, ensuring that disparity varies smoothly inside an object and changes sharply only at object edges. λ is a weight balancing the data term and the smoothness term; its value is 10. The data term and the smoothness term are defined below.
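Expression (1) itself did not survive this text extraction. In the local-expansion-move formulation the description follows, the MRF energy referenced here conventionally takes the form below (a reconstruction consistent with the surrounding text, not the patent's verbatim figure), with the data term of expression (2) and the smoothness term of expression (10):

```latex
E(f) = \sum_{p \in \Omega} E_{\mathrm{data}}\bigl(f_p\bigr)
     + \lambda \sum_{(p,q) \in \mathcal{N}} E_{\mathrm{smooth}}\bigl(f_p, f_q\bigr)
```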

1) Data term

A slanted-patch matching term is used to measure the agreement between the disparity function and the input images, and 3D labels are used to improve accuracy on slanted surfaces. A 3D label represents the disparity d_p of each pixel p with three parameters, which define a plane f_p; the data term is defined in expression (2).

In expression (2), the 3D label of the disparity at point p is given by expression (3).

In expression (3), p_x and p_y denote the x and y coordinates of p, and the three parameters form a triplet, so estimating the disparity d_p of pixel p becomes the problem of estimating the three parameter values of the triplet. These three parameters can be determined with the PatchMatch random search strategy: a value z_0 is chosen at random within the range of continuous disparity values as the disparity of the point (x_0, y_0), and a random unit vector n = (n_x, n_y, n_z) is taken as the normal of the point (x_0, y_0, z_0); the three parameters are then obtained from (x_0, y_0, z_0) and n.
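The conversion from (x_0, y_0, z_0) and n to the plane triplet is also missing from this extraction. The sketch below uses the standard PatchMatch-stereo formulas d_p = a·p_x + b·p_y + c with a = -n_x/n_z, b = -n_y/n_z, c = (n_x·x_0 + n_y·y_0 + n_z·z_0)/n_z; these are a reconstruction under that assumption, and the helper names are illustrative.

```python
import math
import random

def plane_from_point_normal(x0, y0, z0, n):
    """Convert a point (x0, y0, z0) and unit normal n = (nx, ny, nz) into
    the plane triplet (a, b, c), so the disparity of pixel p is
    d_p = a*p_x + b*p_y + c.  Standard PatchMatch-stereo conversion,
    reconstructed here because the patent's formulas did not survive
    extraction."""
    nx, ny, nz = n
    a = -nx / nz
    b = -ny / nz
    c = (nx * x0 + ny * y0 + nz * z0) / nz
    return a, b, c

def random_label(x0, y0, d_max):
    """PatchMatch-style random initialization of a 3D label for pixel (x0, y0)."""
    z0 = random.uniform(0.0, d_max)                         # random disparity
    theta = random.uniform(0.0, 2.0 * math.pi)
    phi = random.uniform(0.0, math.pi)
    n = (math.sin(phi) * math.cos(theta),                   # random unit vector
         math.sin(phi) * math.sin(theta),
         math.cos(phi))
    if abs(n[2]) < 1e-3:                                    # keep plane non-vertical
        n = (0.0, 0.0, 1.0)
    return plane_from_point_normal(x0, y0, z0, n)
```

With n = (0, 0, 1) the plane is fronto-parallel and every pixel gets disparity z_0.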

In expression (2), W_p is a square window centered on p. The weight ω_ps follows ASW; however, ASW uses bilateral filtering, whereas the present invention uses guided image filtering, which performs better. The weight is defined in expression (7).

In expression (7), I_p = I_L(p)/255 is a normalized color vector, μ_k and Σ_k are the mean and covariance of I_p over the window W′_k, and e is a regularization constant to avoid overfitting.
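Expression (7) is likewise missing from the extraction. For reference, the guided-image-filtering kernel that the text describes (mean μ_k, covariance Σ_k, regularizer e) conventionally has the form below; this is the standard guided-filter weight, reproduced here as an assumption about what expression (7) contained:

```latex
\omega_{ps} = \frac{1}{\lvert W' \rvert^{2}}
  \sum_{k \,:\, (p,s) \in W'_k}
  \Bigl( 1 + (I_p - \mu_k)^{\top} \bigl( \Sigma_k + e\,\mathrm{I} \bigr)^{-1} (I_s - \mu_k) \Bigr)
```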

Given a disparity plane, the function ρ(s|f_p) in expression (2) measures the similarity between a pixel s in the window W_p and its corresponding point s′ in the right image; s′ is defined in expression (8).
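A small illustration of expression (8), whose formula is missing here: under a plane label, the corresponding point shifts s horizontally by the disparity the plane predicts at s, with the sign flipped for the right image exactly as the text states. A sketch under that assumption:

```python
def correspond(s, plane, in_right_image=False):
    """Corresponding point s' for pixel s = (sx, sy) under plane (a, b, c).

    Reconstruction of expression (8): in the left image the plane's
    disparity at s is subtracted along x; for the right image the minus
    becomes a plus, as the text notes for expression (8).
    """
    sx, sy = s
    a, b, c = plane
    d = a * sx + b * sy + c              # disparity predicted by the 3D label
    return (sx + d, sy) if in_right_image else (sx - d, sy)
```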

ρ(s|f_p) is defined in expression (9):

ρ(s|f_p) = min(C_CNN(s, s′), τ_CNN)   (9)

In expression (9), C_CNN is a CNN-based matching cost computation method proposed by Zbontar and LeCun, and the truncation value τ_CNN makes the matching cost robust to occluded regions. When the data term of the right image is computed, the relative disparity shift between the left and right images is reversed: if p and s now denote pixels in the right image, the positions of s and s′ in expression (9) are swapped, and the minus sign in expression (8) becomes a plus sign.

2) Smoothness term

The smoothness term measures the distance (similarity, or degree of smoothness) between the labels (f_p, f_q) assigned to a pixel pair {p,q} under the label function f. The present invention uses a curvature-based second-order smoothness term; the smoothness term of expression (1) is defined in expression (10):

E_smooth(f_p, f_q) = max(w_pq, ε) min(δ_pq(f_p, f_q), τ_dis)   (10)

In expression (10), ε is a small constant giving a lower bound on the weight w_pq, which makes the weight robust to noise; δ_pq(f_p, f_q) penalizes disparity discontinuities between f_p and f_q; and the truncation value τ_dis still allows sharp disparity changes at edges. The weight w_pq is defined in expression (11).

In expression (11), the parameter γ controls the influence of color differences.

δ_pq(f_p, f_q) is defined in expression (12):

δ_pq(f_p, f_q) = |d_p(f_p) - d_p(f_q)| + |d_q(f_q) - d_q(f_p)|   (12)

In expression (12), the first term is the difference between the disparities of point p under the planes represented by the two labels f_p and f_q, and the second term is the corresponding difference for point q.
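The smoothness term of expressions (10)-(12) can be sketched directly. Expression (11) is missing from the extraction, so the edge-aware weight below uses the common form w_pq = exp(-||I_p - I_q||_1 / γ) as an assumption; the rest follows expressions (10) and (12) as given.

```python
import math

def plane_disparity(pt, plane):
    """Disparity of pixel pt = (x, y) under plane label (a, b, c)."""
    a, b, c = plane
    return a * pt[0] + b * pt[1] + c

def delta_pq(p, q, fp, fq):
    """Expression (12): disparity differences of p and q under each other's labels."""
    return (abs(plane_disparity(p, fp) - plane_disparity(p, fq))
            + abs(plane_disparity(q, fq) - plane_disparity(q, fp)))

def weight_pq(color_p, color_q, gamma=10.0):
    """Assumed form of expression (11): color-difference weight with parameter gamma."""
    l1 = sum(abs(cp - cq) for cp, cq in zip(color_p, color_q))
    return math.exp(-l1 / gamma)

def e_smooth(p, q, fp, fq, w_pq, eps=0.01, tau_dis=1.0):
    """Expression (10): lower-bounded weight times truncated label distance."""
    return max(w_pq, eps) * min(delta_pq(p, q, fp, fq), tau_dis)
```

Identical labels give zero penalty; strongly differing labels are truncated at τ_dis, so a genuine edge is not over-penalized.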

Step 4) Disparity refinement:

The final left and right disparity maps are generated by left-right consistency checking, hole filling, and weighted median filtering.
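Step 4's left-right consistency check and hole filling can be sketched as below (the weighted median filtering stage is omitted). The fill-from-the-left strategy and the 1-pixel threshold are common choices, not specified by the patent; names are illustrative.

```python
import numpy as np

def lr_check_and_fill(disp_left, disp_right, thresh=1.0):
    """Left-right consistency check plus simple hole filling (step 4 sketch).

    A pixel passes if |dL(y, x) - dR(y, x - dL)| <= thresh; failing pixels
    become holes and are filled from the nearest valid pixel to their left
    (leading holes take the first valid value in the row).
    """
    h, w = disp_left.shape
    out = disp_left.astype(float).copy()
    for y in range(h):
        for x in range(w):
            xr = int(round(x - disp_left[y, x]))         # matched column in right map
            ok = 0 <= xr < w and abs(disp_left[y, x] - disp_right[y, xr]) <= thresh
            if not ok:
                out[y, x] = np.nan                        # mark inconsistent pixel
        valid = np.nan
        for x in range(w):                                # propagate from the left
            if np.isnan(out[y, x]):
                out[y, x] = valid
            else:
                valid = out[y, x]
        if np.isnan(out[y, 0]):                           # leading holes in the row
            vals = out[y][~np.isnan(out[y])]
            fill = vals[0] if vals.size else 0.0
            out[y] = np.where(np.isnan(out[y]), fill, out[y])
    return out
```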

Fig. 3 compares the present invention with the existing local expansion move method: (a) is the input left image and (b) is the ground truth of the image. (c)-(f) show the depth maps and error maps generated by the segmentation-based local expansion move method proposed by the invention and by the plain local expansion move method (in the error maps, black pixels are erroneous; the larger the black area, the more errors). From (c) and (e) it can be seen that the depth map of the present method is clearly better than that of the local expansion move method, especially in the region around the clothes rack and on the wall. From (d) and (f) it can be seen that, compared with the local expansion move method, the present method has fewer erroneous pixels and better results. The reason is that the present method is based on the result of image segmentation, which preserves the edge information of objects: compared with applying local expansion moves directly, the segmentation-based method rarely expands across planes, whereas the plain local expansion move method imposes no such restriction. These results demonstrate that the present method outperforms the local expansion move algorithm in preserving edges.

The above content only illustrates the technical idea of the present invention and does not limit its scope of protection; any change made on the basis of the technical solution in accordance with the technical idea proposed by the present invention falls within the scope of protection of the claims of the present invention.

Claims (5)

1. A segmentation-based local expansion move method for stereo matching, characterized by comprising the following steps:
Step 1) initialization:
input an image pair, and take the left image and the right image of the pair as the reference image and the target image, respectively;
Step 2) matching cost computation:
compute the disparity space image (DSI) using a CNN-based matching cost;
Step 3) disparity selection:
compute disparities with the segmentation-based local expansion move method disclosed by the present invention, obtaining an initial disparity map;
Step 4) disparity refinement:
generate the final left and right disparity maps by left-right consistency checking, hole filling and weighted median filtering.
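As a concrete illustration of the refinement in step 4), the left-right consistency check and hole filling can be sketched as follows. This is a minimal sketch with hypothetical helper names: it uses a simple background-favouring 1-D fill in place of the patent's exact hole-filling scheme and omits the weighted median filtering stage.

```python
import numpy as np

def lr_consistency(disp_left, disp_right, tau=1.0):
    """Mark a pixel valid only if its left-view disparity agrees (within tau)
    with the right-view disparity at the corresponding column."""
    h, w = disp_left.shape
    valid = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            xr = int(round(x - disp_left[y, x]))  # corresponding column in the right view
            if 0 <= xr < w and abs(disp_left[y, x] - disp_right[y, xr]) <= tau:
                valid[y, x] = True
    return valid

def fill_holes(disp, valid):
    """Fill each invalid pixel with the smaller of the nearest valid disparities
    to its left and right on the same scanline (favouring the background)."""
    out = disp.copy()
    h, w = disp.shape
    for y in range(h):
        for x in range(w):
            if not valid[y, x]:
                left = next((out[y, i] for i in range(x - 1, -1, -1) if valid[y, i]), None)
                right = next((out[y, i] for i in range(x + 1, w) if valid[y, i]), None)
                cands = [v for v in (left, right) if v is not None]
                if cands:
                    out[y, x] = min(cands)  # smaller disparity = farther surface
    return out
```

A weighted median filter would then be applied to the filled disparity map to smooth the filled regions while respecting image edges.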
2. The segmentation-based local expansion move method for stereo matching according to claim 1, characterized in that the specific procedure of step 3) is as follows:
Step 3-1) segment the image using SLIC; through the segmentation, each pixel of the image is assigned a label denoting its segment number, ranging from 1 to the maximum number of segments K;
Step 3-2) for each segment, traverse all pixels within the smallest rectangular region containing that segment; only when a pixel lies inside the segment are an expansion region and a filter window constructed centered on that pixel;
Step 3-3) for each pixel, apply the iterative α-expansion algorithm within the expansion region established in step 3-2); the initial α is determined by the current 3D label of a pixel selected at random inside the segment; compare the energy of the whole MRF system when the pixel's disparity label takes the value α with the energy when it keeps its current value; if taking α yields a lower system energy, update the current pixel's label to α, otherwise keep the current label unchanged; each pass iterates over all pixels in the segment, expanding α outward within the expansion region.
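The per-segment traversal of steps 3-1) and 3-2) can be illustrated with a small sketch. A hypothetical integer label array stands in for the SLIC output, and the helper names are illustrative only:

```python
import numpy as np

def segment_bounding_boxes(labels):
    """For each segment label, return the smallest rectangle (y0, y1, x0, x1)
    that contains every pixel of the segment."""
    boxes = {}
    for lab in np.unique(labels):
        ys, xs = np.nonzero(labels == lab)
        boxes[lab] = (ys.min(), ys.max(), xs.min(), xs.max())
    return boxes

def expansion_centers(labels, lab):
    """Traverse the segment's bounding rectangle, but keep only pixels that lie
    inside the segment: only these become centers of expansion regions and
    filter windows, so proposals never originate from a neighbouring segment."""
    y0, y1, x0, x1 = segment_bounding_boxes(labels)[lab]
    centers = []
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):      # whole rectangle ...
            if labels[y, x] == lab:      # ... restricted to in-segment pixels
                centers.append((y, x))
    return centers
```

In step 3-3), each such center would seed an α-expansion whose candidate 3D label α is drawn from a randomly chosen pixel of the same segment.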
3. The segmentation-based local expansion move method for stereo matching according to claim 2, characterized in that the energy function of the system energy in step 3-3) is as follows:
the energy function of the MRF is defined as shown in expression (1);
in expression (1), the first term is the data term, which measures the agreement between the disparity function and the input images; the second term is the smoothness term, which penalizes discontinuities in disparity between adjacent pixels (p, q) ∈ N, ensuring that disparity varies smoothly inside objects and changes sharply only at object edges; λ is a weight balancing the data term and the smoothness term, and its value is 10.
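The formula image for expression (1) did not survive extraction. Based on the surrounding description (a data term plus a λ-weighted smoothness term over neighbouring pairs) and the local expansion move formulation of the cited Taniai et al. paper, it presumably has the standard MRF form:

```latex
E(f) = \sum_{p} E_{data}(f_p)
     + \lambda \sum_{(p,q) \in \mathcal{N}} E_{smooth}(f_p, f_q)
\tag{1}
```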
4. The segmentation-based local expansion move method for stereo matching according to claim 3, characterized in that the data term is as follows:
a slanted patch matching cost is used to measure the consistency between the disparity function and the input images, and 3D labels are used to improve the accuracy of the algorithm on slanted surfaces; a 3D label represents the disparity d_p of each pixel p with 3 parameters, which together define a plane f_p; the data term is defined as shown in expression (2);
in expression (2), the 3D label of the disparity of point p is given by expression (3);
in expression (3), p_x and p_y denote the x and y coordinates of p, and the 3 parameters form a triple (a_p, b_p, c_p); in this way the problem of estimating the disparity d_p of pixel p is converted into estimating the three parameter values of the triple; these three parameters are determined by the random search strategy of PatchMatch: a value z_0 is selected at random within the range of continuous disparity values as the disparity of a point (x_0, y_0), a random unit vector n = (n_x, n_y, n_z) is taken as the normal vector at the point (x_0, y_0, z_0), and the three parameters are then obtained as:
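The formula images for expressions (2) and (3) and for the parameter conversion are missing from the extracted text. From the surrounding definitions (window W_p, weights ω_ps, per-pixel similarity ρ, and the plane with normal n through (x_0, y_0, z_0)), they are presumably:

```latex
E_{data}(f_p) = \sum_{s \in W_p} \omega_{ps}\, \rho(s \mid f_p) \tag{2}
```

```latex
d_p = a_p\, p_x + b_p\, p_y + c_p \tag{3}
```

```latex
a_p = -\frac{n_x}{n_z}, \qquad
b_p = -\frac{n_y}{n_z}, \qquad
c_p = \frac{n_x x_0 + n_y y_0 + n_z z_0}{n_z}
```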
in expression (2), W_p is a square window centered on p, and the weight ω_ps follows the adaptive support weight (ASW) scheme, computed with the guided image filtering method; it is defined as shown in expression (7);
in expression (7), I_p = I_L(p)/255 is a normalized color vector, μ_k and Σ_k are respectively the mean and covariance of I_p within the window W′_k, and e is a regularization constant that prevents over-fitting;
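The formula image for expression (7) is likewise missing. Given that the text describes ω_ps as an ASW weight computed by guided image filtering with mean μ_k, covariance Σ_k and regularizer e, it presumably matches the guided-filter kernel (U denoting the identity matrix):

```latex
\omega_{ps} = \frac{1}{|W|^2}
\sum_{k :\, (p,s) \in W'_k}
\left( 1 + (I_p - \mu_k)^{\mathsf{T}} \left( \Sigma_k + e\,U \right)^{-1} (I_s - \mu_k) \right)
\tag{7}
```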
given a disparity plane f_p = (a_p, b_p, c_p), the function ρ(s | f_p) in expression (2) measures the similarity between a pixel s within the window W_p and its corresponding point s′ in the right image; s′ is defined as shown in expression (8);
ρ(s | f_p) is defined as shown in expression (9);
ρ(s | f_p) = min(C_CNN(s, s′), τ_CNN)    (9)
in expression (9), C_CNN is the CNN-based matching cost, and the truncation value τ_CNN increases the robustness of the matching cost in occluded regions; when computing the data term of the right image, the disparity of the left and right images changes in the opposite direction: if p and s now denote pixels in the right image, the positions of s and s′ in expression (9) are exchanged, and the minus sign in expression (8) is changed to a plus sign.
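The formula for expression (8) is also absent from the extracted text. Since s′ is the correspondence of s in the right image under the plane f_p, and the text states that the minus sign in (8) becomes a plus sign for the right-image data term, it presumably shifts s horizontally by the plane-induced disparity:

```latex
s' = s - \begin{pmatrix} d_s \\ 0 \end{pmatrix}, \qquad
d_s = a_p\, s_x + b_p\, s_y + c_p \tag{8}
```

that is, s′ = (s_x − d_s, s_y).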
5. The segmentation-based local expansion move method for stereo matching according to claim 3, characterized in that the smoothness term is as follows:
the smoothness term measures the distance between the labels (f_p, f_q) assigned to a pixel pair {p, q} under the label function f; the smoothness term in expression (1) is defined as shown in expression (10);
E_smooth(f_p, f_q) = max(w_pq, ε) · min(δ_pq(f_p, f_q), τ_dis)    (10)
in expression (10), ε is a small constant giving a lower bound on the weight w_pq, which increases robustness to noise; δ_pq(f_p, f_q) penalizes disparity discontinuities between f_p and f_q, and the truncation value τ_dis allows sharp disparity changes at edges; the weight w_pq is defined as shown in expression (11);
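The formula image for expression (11) is missing. Given that the parameter γ scales the influence of the color difference, the weight presumably takes the usual exponential color-similarity form:

```latex
w_{pq} = \exp\!\left( -\frac{\left\lVert I_L(p) - I_L(q) \right\rVert_1}{\gamma} \right) \tag{11}
```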
in expression (11), the parameter γ controls the influence of the color difference;
δ_pq(f_p, f_q) is defined as shown in expression (12);
δ_pq(f_p, f_q) = |d_p(f_p) − d_p(f_q)| + |d_q(f_q) − d_q(f_p)|    (12)
in expression (12), the first term is the difference between the disparities of point p under the planes represented by the two labels, and the second term is the difference between the disparities of point q under the two labels.
CN201810687866.2A 2018-06-28 2018-06-28 Segmentation-based local expansion move method in stereo matching Pending CN108830895A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810687866.2A CN108830895A (en) 2018-06-28 2018-06-28 Differentially expanding moving method based on segmentation in a kind of Stereo matching

Publications (1)

Publication Number Publication Date
CN108830895A true CN108830895A (en) 2018-11-16

Family

ID=64133519

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810687866.2A Pending CN108830895A (en) 2018-06-28 2018-06-28 Differentially expanding moving method based on segmentation in a kind of Stereo matching

Country Status (1)

Country Link
CN (1) CN108830895A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105787932A (en) * 2016-02-07 2016-07-20 哈尔滨师范大学 Stereo matching method based on segmentation cross trees
CN106709948A (en) * 2016-12-21 2017-05-24 浙江大学 Quick binocular stereo matching method based on superpixel segmentation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TATSUNORI TANIAI et al.: "Continuous 3D Label Stereo Matching using Local Expansion Moves", 《ARXIV》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109801324A (en) * 2019-01-07 2019-05-24 South China University of Technology A light-intensity-insensitive slanted-plane neighbor-propagation stereo matching method
CN109801324B (en) * 2019-01-07 2020-11-24 华南理工大学 A Light Intensity-Insensitive Stereo Matching Method for Near-neighbor Propagation on Inclined Surfaces
CN109903379A (en) * 2019-03-05 2019-06-18 电子科技大学 A 3D Reconstruction Method Based on Point Cloud Optimal Sampling
CN110223377A (en) * 2019-05-28 2019-09-10 上海工程技术大学 One kind being based on stereo visual system high accuracy three-dimensional method for reconstructing
CN114842010A (en) * 2022-07-04 2022-08-02 南通东方雨虹建筑材料有限公司 Building fireproof wood defect detection method based on Gaussian filtering

Similar Documents

Publication Publication Date Title
CN109387204B (en) Synchronous positioning and composition method of mobile robot for indoor dynamic environment
CN108986136B (en) Binocular scene flow determination method and system based on semantic segmentation
Menze et al. Object scene flow
CN112132897A (en) A visual SLAM method for semantic segmentation based on deep learning
Taniai et al. Fast multi-frame stereo scene flow with motion segmentation
CN106952286B (en) Object Segmentation Method Based on Motion Saliency Map and Optical Flow Vector Analysis in Dynamic Background
CN113538667B (en) Dynamic scene light field reconstruction method and device
CN106485690A Automatic registration and fusion method of point cloud data and optical images based on features
CN113985445A (en) 3D target detection algorithm based on data fusion of camera and laser radar
CN108830895A (en) Segmentation-based local expansion move method in stereo matching
US20190020861A1 (en) High-speed and tunable scene reconstruction systems and methods using stereo imagery
CN111998862B (en) BNN-based dense binocular SLAM method
CN109087394A A real-time indoor three-dimensional reconstruction method based on a low-cost RGB-D sensor
CN111340922A (en) Positioning and mapping method and electronic equipment
CN104166987B (en) Parallax estimation method based on improved adaptive weighted summation and belief propagation
CN111462030A (en) Multi-image fused stereoscopic set vision new angle construction drawing method
CN101765019B (en) Stereo matching algorithm for motion blur and illumination change image
Kong et al. A method for learning matching errors for stereo computation.
CN108510529A A graph-cut stereo matching method based on adaptive weights
CN108986150A An image optical flow estimation method and system based on non-rigid dense matching
Li et al. Two-stage adaptive object scene flow using hybrid cnn-crf model
CN111738061B (en) Binocular vision stereo matching method and storage medium based on regional feature extraction
Shahbazi et al. High-density stereo image matching using intrinsic curves
Le Besnerais et al. Dense height map estimation from oblique aerial image sequences
Barth et al. Vehicle tracking at urban intersections using dense stereo

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181116
