CN104778721A - Distance measuring method of significant target in binocular image
- Publication number: CN104778721A
- Application number: CN201510233157.3A
- Authority: CN (China)
- Legal status: Granted
Abstract
The invention relates to a distance measuring method for a salient target in a binocular image, and aims to solve the problem that existing target distance measuring methods are slow. The method comprises the following steps. Step 1: extract saliency features from the binocular image using a visual saliency model and mark a seed point and a background point. Step 2: build a weighted graph for the binocular image. Step 3: segment the salient target in the binocular image with a random walk image segmentation algorithm, using the seed point and background point from step 1 and the weighted graph from step 2. Step 4: match key points on the salient target using the SIFT algorithm. Step 5: substitute the disparity matrix K' obtained in step 4 into the binocular ranging model to obtain the distance of the salient target. The method can be applied to measuring the distance of salient targets in the image ahead of the field of view while an intelligent automobile is driving.
Description
Technical Field
The invention relates to a distance measuring method for a target in a binocular image, in particular to a distance measuring method for a significant target in the binocular image, and belongs to the technical field of image processing.
Background
In traffic image processing, distance information is mainly used to provide safety judgments for the control system of an automobile. In intelligent-vehicle research, the traditional approach to target measurement is to measure the distance of a target with radar or with a laser of a specific wavelength. Compared with radar and laser, a vision sensor is cheaper and has a wider viewing angle, and it can identify the specific content of a target while measuring its distance.
However, current traffic image information is relatively complex, and traditional target ranging algorithms have difficulty obtaining ideal results in complex images. Because they cannot locate the salient target in the image and instead perform detection over the whole image, their processing speed is low and a large amount of irrelevant data is introduced, so these algorithms cannot meet the requirements of practical application.
Disclosure of Invention
The invention aims to provide a distance measuring method for a salient target in a binocular image, in order to solve the problem that existing target distance measuring methods are slow.
The distance measuring method for a salient target in a binocular image according to the invention is realized through the following steps.
Step 1: extract saliency features from the binocular image using a visual saliency model and mark a seed point and a background point, specifically:
Step 1.1: preprocess the binocular image by performing edge detection to generate an edge map of the binocular image;
Step 1.2: perform saliency feature extraction on the binocular image using the visual saliency model to generate a saliency feature map;
Step 1.3: according to the saliency feature map, find the pixel with the maximum gray value in the image and mark it as the seed point; traverse the pixels in a 25 × 25 window centered on the seed point and mark the pixel whose gray value is less than 0.1 and which is farthest from the seed point as the background point;
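The following is a minimal sketch of the seed/background selection of Step 1.3, assuming the saliency map is a float array scaled to [0, 1]; the function name, the border clipping and the fallback for an all-bright window are illustrative additions, while the 25 × 25 window and the 0.1 threshold follow the text.

```python
import numpy as np

def pick_seed_and_background(sal, win=25, thresh=0.1):
    h, w = sal.shape
    sy, sx = np.unravel_index(np.argmax(sal), sal.shape)   # brightest pixel -> seed point

    half = win // 2
    y0, y1 = max(0, sy - half), min(h, sy + half + 1)       # clip the 25x25 window at borders
    x0, x1 = max(0, sx - half), min(w, sx + half + 1)
    window = sal[y0:y1, x0:x1]

    ys, xs = np.nonzero(window < thresh)                    # candidate background pixels
    if len(ys) == 0:                                        # fallback: darkest pixel in the window
        ys, xs = np.nonzero(window <= window.min())
    ys, xs = ys + y0, xs + x0
    i = np.argmax((ys - sy) ** 2 + (xs - sx) ** 2)          # farthest low-saliency pixel
    return (sy, sx), (ys[i], xs[i])
```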
Step 2: build a weighted graph for the binocular image;
build the weighted graph using the classical Gaussian weight function:
W_ij = e^(−β(g_i − g_j)²)   (1)
where W_ij is the weight of the edge between vertex i and vertex j, g_i is the brightness of vertex i, g_j is the brightness of vertex j, β is a free parameter, and e is the natural base;
The Laplace matrix L of the weighted graph is obtained by:
L_ij = d_i, if i = j; L_ij = −W_ij, if vertices i and j are adjacent; L_ij = 0, otherwise   (2)
where L_ij is the element of the Laplace matrix L corresponding to vertices i and j, and d_i is the sum of the weights between vertex i and its surrounding points, d_i = Σ_j W_ij;
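A minimal sketch of Step 2 follows, assuming a 4-connected pixel grid and a grayscale image in [0, 1]; the value of β, the small eps term and the use of scipy sparse matrices are illustrative choices, not taken from the text.

```python
import numpy as np
import scipy.sparse as sp

def graph_laplacian(img, beta=90.0, eps=1e-6):
    h, w = img.shape
    idx = np.arange(h * w).reshape(h, w)
    g = img.ravel()

    rows, cols = [], []
    for di, dj in ((0, 1), (1, 0)):                  # right and down neighbours (4-connectivity)
        rows.append(idx[: h - di, : w - dj].ravel())
        cols.append(idx[di:, dj:].ravel())
    rows = np.concatenate(rows)
    cols = np.concatenate(cols)
    wgt = np.exp(-beta * (g[rows] - g[cols]) ** 2) + eps   # W_ij = exp(-beta*(g_i - g_j)^2)

    W = sp.coo_matrix((np.r_[wgt, wgt], (np.r_[rows, cols], np.r_[cols, rows])),
                      shape=(h * w, h * w)).tocsr()
    d = np.asarray(W.sum(axis=1)).ravel()            # d_i = sum_j W_ij
    L = sp.diags(d) - W                               # Laplace matrix of the weighted graph
    return L, W
```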
Step 3: segment the salient target in the binocular image with the random walk image segmentation algorithm, using the seed point and background point from step 1 and the weighted graph from step 2;
Step 3.1: divide the pixels of the binocular image into two sets according to the seed point and background point marked in step 1, namely the marked point set V_M and the unmarked point set V_U, and order the Laplace matrix L so that the marked points come first and the unmarked points follow; L is then divided into the four blocks L_M, L_U, B and B^T, and the Laplace matrix is expressed as:
L = [ L_M  B ; B^T  L_U ]   (3)
where L_M is the Laplace matrix from marked points to marked points, L_U is the Laplace matrix from unmarked points to unmarked points, and B and B^T are the Laplace matrices from marked points to unmarked points and from unmarked points to marked points, respectively;
Step 3.2: solve the combined Dirichlet integral D[x] using the Laplace matrix and the marked points;
the combined Dirichlet integral is:
D[x] = (1/2) x^T L x = (1/2) Σ_ij W_ij (x_i − x_j)²   (4)
where x is the matrix of probabilities that vertices of the weighted graph reach the marked points, and x_i and x_j are the probabilities that vertices i and j reach the marked points, respectively;
according to the marked point set V_M and the unmarked point set V_U, x is divided into two parts x_M and x_U, where x_M is the probability matrix corresponding to the marked point set V_M and x_U is the probability matrix corresponding to the unmarked point set V_U; equation (4) is then decomposed into:
D[x_U] = (1/2) (x_M^T L_M x_M + 2 x_U^T B^T x_M + x_U^T L_U x_U)   (5)
For a marked point s, define the indicator vector m^s: if a marked vertex i belongs to label s then m_i^s = 1, otherwise m_i^s = 0. Differentiating D[x_U] with respect to x_U and setting the derivative to zero gives the minimizer of equation (5), namely the Dirichlet probability values for the marked point s:
L_U x^s = −B^T m^s   (6)
where x_i^s is the probability that vertex i first reaches the marked point s;
From the combined Dirichlet probabilities x^s, perform threshold segmentation according to equation (7) to generate a segmentation map:
where s_i is the pixel value at the position corresponding to vertex i in the segmentation map;
pixels with value 1 in the segmentation map represent the salient target in the image, and pixels with value 0 are the background;
Step 3.3: multiply the segmentation map pixel-wise with the original image to generate the target image, i.e., extract the segmented salient target:
t_i = s_i · I_i   (8)
where t_i is the gray value at vertex i of the target image T, and I_i is the gray value at the corresponding position of the input image I(σ);
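A minimal sketch of Step 3 under the same assumptions as the previous sketch: `L` is the sparse Laplacian, `seed` and `bg` are flat pixel indices of the two marked points, and the 0.5 threshold used for the binary segmentation of equation (7) is an illustrative choice, since the text does not reproduce the threshold value.

```python
import numpy as np
from scipy.sparse.linalg import spsolve

def random_walker_segment(L, img, seed, bg, thresh=0.5):
    n = L.shape[0]
    marked = np.array([seed, bg])
    unmarked = np.setdiff1d(np.arange(n), marked)

    L_U = L[unmarked][:, unmarked]                    # unmarked-to-unmarked block L_U
    B_T = L[unmarked][:, marked]                      # unmarked-to-marked block B^T
    m_s = np.array([1.0, 0.0])                        # indicator m^s: 1 at the seed, 0 at the background

    x_U = spsolve(L_U.tocsc(), -B_T @ m_s)            # solve L_U x^s = -B^T m^s

    prob = np.zeros(n)
    prob[marked] = m_s
    prob[unmarked] = x_U
    seg = (prob.reshape(img.shape) >= thresh).astype(img.dtype)   # segmentation map s
    return seg * img                                  # t_i = s_i * I_i, the target image
```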
Step 4: match key points on the salient target independently using the SIFT algorithm;
Step 4.1: build a Gaussian pyramid for the target image and take the difference of adjacent filtered images to obtain the DOG images, defined as D(x, y, σ):
D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * T(x, y) = C(x, y, kσ) − C(x, y, σ)   (9)
where G(x, y, σ) = (1/(2πσ²)) e^(−((x − p/2)² + (y − q/2)²)/(2σ²)) is the variable-scale Gaussian function, p and q are the dimensions of the Gaussian template, (x, y) is the position of a pixel in the Gaussian pyramid image, σ is the scale-space factor of the image, k is the scale multiplier between adjacent levels, and C(x, y, σ) is defined as the convolution of G(x, y, σ) with the target image T(x, y), i.e. C(x, y, σ) = G(x, y, σ) * T(x, y);
Step 4.2: find extreme points in adjacent DOG images, determine their positions and scales as key points by fitting a three-dimensional quadratic function, and perform stability detection on the key points with the Hessian matrix to eliminate edge responses, specifically:
(1) Obtain the curve fit D(X) of the DOG function in scale space by Taylor expansion:
D(X) = D + (∂D/∂X)^T X + (1/2) X^T (∂²D/∂X²) X   (10)
where X = (x, y, σ)^T and D is the curve fit; differentiating equation (10) and setting the derivative to 0 gives the offset of the extreme point:
X̂ = −(∂²D/∂X²)^(−1) (∂D/∂X)   (11)
To remove low-contrast extreme points, substitute equation (11) into equation (10) to obtain:
D(X̂) = D + (1/2) (∂D/∂X)^T X̂   (12)
If the value of equation (12) is greater than 0.03, the extreme point is retained and its accurate position and scale are obtained; otherwise the extreme point is discarded;
(2) Eliminate unstable key points by screening with the Hessian matrix at each key point:
compute the curvature from the ratio of the eigenvalues of the Hessian matrix;
judge edge points according to the curvature in the neighborhood of the key point;
the curvature ratio threshold is set to 10: if the ratio is greater than 10 the key point is deleted, otherwise it is retained as a stable key point;
Step 4.3: assign a direction parameter to each key point using the pixels in a 16 × 16 window around the key point;
for a key point detected in the DOG images, the magnitude and direction of the gradient are computed as:
m(x, y) = sqrt((C(x+1, y) − C(x−1, y))² + (C(x, y+1) − C(x, y−1))²)   (13)
θ(x, y) = tan⁻¹((C(x, y+1) − C(x, y−1)) / (C(x+1, y) − C(x−1, y)))
where C is the scale-space image in which the key point lies, m is the gradient magnitude and θ is the gradient direction at the point; taking the key point as the center, a 16 × 16 neighborhood is defined, the gradient magnitude and direction of each pixel are computed, and the gradients of the points in the neighborhood are collected in a histogram; the abscissa of the histogram is the direction, with 360° divided into 36 bins of 10° each; the ordinate is the gradient magnitude, obtained by summing the magnitudes of the points falling in each direction bin; the main direction is defined as the bin direction with the maximum gradient magnitude hm, and bins whose magnitude exceeds 0.8 × hm are used as auxiliary directions to enhance matching stability;
Step 4.4: build a descriptor to express the local feature information of each key point.
First, rotate the coordinates around the key point into the direction of the key point;
then select a 16 × 16 window around the key point, divide this neighborhood into 16 small 4 × 4 windows, compute the magnitude and direction of the gradient in each 4 × 4 window, collect the gradient information of each small window in an 8-bin histogram, and compute the descriptor of the 16 × 16 window around the key point with a Gaussian weighting algorithm according to the following formula:
where h is the descriptor, (a, b) is the position of the key point in the Gaussian pyramid image, m_g is the gradient magnitude of the key point, i.e. the magnitude of the main direction of the histogram in step 4.3, d is the side length of the window, i.e. 16, (x, y) is the position of a pixel in the Gaussian pyramid image, and (x', y') is the new coordinate of the pixel in the neighborhood after rotating the coordinates into the direction of the key point; the new coordinates are computed by:
where θ_g is the gradient direction of the key point;
a 128-dimensional feature vector is computed for each key point from its 16 × 16 window and recorded as H = (h_1, h_2, h_3, ..., h_128); the feature vector is normalized, and the normalized vector is recorded as L_g:
l_i = h_i / sqrt(Σ_{j=1}^{128} h_j²)   (16)
where L_g = (l_1, l_2, ..., l_i, ..., l_128) is the normalized feature vector of the key point and l_i, i = 1, 2, 3, ..., is a component of the normalized vector;
The key points in the two images of the binocular pair are matched using the Euclidean distance between their feature vectors as the similarity measure; the coordinate information of each pair of mutually matched key points is taken as one group of key information;
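A minimal sketch of the key-point detection and matching of Step 4, using OpenCV's SIFT implementation in place of the hand-written pipeline described in the text; the target images are assumed to be 8-bit grayscale, and the 0.75 ratio test is an illustrative choice not taken from the text.

```python
import cv2

def match_keypoints(target_left, target_right, ratio=0.75):
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(target_left, None)
    kp2, des2 = sift.detectAndCompute(target_right, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)              # Euclidean distance on the 128-d descriptors
    pairs = []
    for m, n in matcher.knnMatch(des1, des2, k=2):
        if m.distance < ratio * n.distance:           # keep only distinctive matches
            pairs.append((kp1[m.queryIdx].pt, kp2[m.trainIdx].pt))
    return pairs                                      # list of ((x1, y1), (x2, y2)) coordinate pairs
```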
Step 4.5: screen the generated key-point matches;
compute the horizontal disparity of each pair of key points to generate the disparity matrix, defined as K_n = {k_1, k_2, ..., k_n}, where n is the number of matched pairs and k_1, k_2, ..., k_n are the disparities of the individual matched points;
determine the median k_m of the disparity matrix and obtain the reference disparity matrix, denoted K_n':
K_n' = {k_1 − k_m, k_2 − k_m, ..., k_n − k_m}   (17)
set the disparity threshold to 3 and delete from K_n' the entries whose corresponding disparity deviation is larger than the threshold, giving the final disparity matrix K', where k_1', k_2', ..., k_{n'}' are the disparities of the screened correct matches and n' is the number of final correct matches:
K' = {k_1', k_2', ..., k_{n'}'}   (18)
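A minimal sketch of the screening of Step 4.5, using the match pairs in the format of the previous sketch; the 3-pixel threshold follows the text, everything else is illustrative.

```python
import numpy as np

def screen_disparities(pairs, thresh=3.0):
    k = np.array([xl - xr for (xl, _), (xr, _) in pairs])   # horizontal disparity k_i of each pair
    k_m = np.median(k)                                       # median disparity k_m
    keep = np.abs(k - k_m) <= thresh                         # drop entries of K_n' above the threshold
    return k[keep]                                           # screened disparity matrix K'
```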
Step 5: substitute the disparity matrix K' obtained in step 4 into the binocular ranging model to obtain the distance of the salient target;
the two identical imaging systems are separated by a distance J along the horizontal direction, their optical axes are parallel to the horizontal plane, and the image planes are parallel to the vertical plane;
suppose a target point M(X, Y, Z) in the scene whose left and right imaging points are Pl(x_1, y_1) and Pr(x_2, y_2), where x_1, y_1 and x_2, y_2 are the coordinates of Pl and Pr in the vertical image plane, X, Y and Z are the coordinates along the horizontal, vertical and longitudinal axes of the spatial coordinate system, and the disparity in the binocular model is defined as k = |Pl − Pr| = |x_2 − x_1|; from the triangle similarity relation, the distance formula is obtained:
Z = f · J / (k · dx')   (19)
where dx' is the physical width of one pixel along the horizontal axis of the imaging sensor, f is the focal length of the imaging system, and Z is the distance from the target point M to the line connecting the two imaging centers; substituting the disparity matrix obtained in step 4.5 into equation (19) together with the physical parameters of the binocular model gives the corresponding distance matrix Z' = {z_1, z_2, ..., z_{n'}}, where z_1, z_2, ..., z_{n'} are the distances computed from the individual matched disparities; finally the mean of the distance matrix is taken as the distance Z_f of the salient target in the binocular image:
Z_f = (1/n') Σ_{i=1}^{n'} z_i   (20)
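A minimal sketch of Step 5, turning the screened disparities into the target distance with Z = f·J/(k·dx'); `f`, `baseline` (J) and `pixel_size` (dx') are parameters of the binocular rig, and the numbers in the usage comment are purely illustrative.

```python
import numpy as np

def salient_target_distance(disparities, f, baseline, pixel_size):
    z = f * baseline / (disparities * pixel_size)    # z_i = f*J / (k_i * dx') per matched pair
    return float(np.mean(z))                         # Z_f: mean over the distance matrix Z'

# illustrative usage (units in metres):
# Z_f = salient_target_distance(K_prime, f=0.008, baseline=0.12, pixel_size=4.65e-6)
```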
The invention has the following beneficial effects:
1. The invention extracts the region of interest by simulating the human visual system; the salient target extracted by the algorithm is essentially consistent with what human eyes would detect, so the extraction automatically identifies the salient target as human eyes would.
2. The invention completes the distance measurement of the salient target automatically, without manually selecting the target.
3. The method matches key points on the same target, so the disparities of matched key points are similar; wrong matches can be screened out effectively, the matching accuracy approaches 100%, the relative error of the disparity is below 2%, and the accuracy of distance measurement is increased.
4. The method uses little matching information and avoids extra irrelevant computation: the matching computation is reduced by at least 75%, the introduction of irrelevant data is reduced, and the utilization rate of the matching data exceeds 90%, so the distance of a salient target can be measured in a complex image environment and image processing efficiency is improved.
5. The invention measures the distance of the salient target in the image ahead of the field of view while an intelligent automobile is driving, providing key information for safe driving, overcoming the drawback that traditional image ranging can only perform depth detection on the whole image, and avoiding large errors and excessive noise.
6. The method extracts the saliency features of the binocular image and segments the salient target, which narrows the target range, shortens the matching time and improves efficiency; it then matches the key points of the salient target and computes the disparity to realize distance measurement. Because the target lies in a vertical plane, wrong matching key points can be screened out well and the accuracy is improved, so the method can rapidly identify a salient target and accurately measure its distance.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a flow chart of visual saliency analysis;
FIG. 3 is a flow chart of a random walk algorithm;
FIG. 4 is a SIFT algorithm flow chart;
FIG. 5 shows the binocular measurement system, where X, Y and Z define the spatial coordinate system, M is a point in space, Pl and Pr are the imaging points of M on the two imaging planes, and f is the focal length of the imaging system.
Detailed Description
The embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
The first embodiment: this embodiment is described with reference to figs. 1 to 5; the method of this embodiment comprises the following steps.
Step 1: extract saliency features from the binocular image using a visual saliency model and mark a seed point and a background point, specifically:
Saliency extraction is performed on the binocular image with the visual saliency model: for each pixel, the three saliency features of brightness, color and orientation are computed and normalized to obtain a weighted saliency map of the image. Each pixel of the saliency map represents the saliency magnitude of the corresponding location in the image. The point with the maximum pixel value, i.e. the most salient point, is found and marked as the seed point; the range around the seed point is then gradually enlarged to find the least salient point, which is marked as the background point. The process of extracting image saliency with the visual saliency model is shown in fig. 2.
Step 1.1: preprocess the binocular image by performing edge detection to generate an edge map; edge information is important saliency information of the image;
Step 1.2: perform saliency feature extraction on the binocular image with the visual saliency model to generate a saliency feature map;
Step 1.3: according to the saliency feature map, find the pixel with the maximum brightness in the image and mark it as the seed point; traverse the pixels in a 25 × 25 window centered on the seed point and mark the pixel whose gray value is less than 0.1 and which is farthest from the seed point as the background point;
Step 2: build a weighted graph for the binocular image;
using the classical Gaussian weight function, assign a weight to the edge between each pixel of the binocular image and its surrounding pixels according to their gray-level differences, and build a weighted graph whose vertices are the pixels and whose edges carry these weights;
in graph-theoretic terms the whole image is regarded as an undirected weighted graph and each pixel as a vertex, the edges being weighted according to the pixel gray values; the classical Gaussian weight function is:
W_ij = e^(−β(g_i − g_j)²)   (1)
where W_ij is the weight of the edge between vertex i and vertex j, g_i is the brightness of pixel i, g_j is the brightness of pixel j, β is a free parameter, and e is the natural base;
the Laplace matrix L of the weighted graph is obtained by:
L_ij = d_i, if i = j; L_ij = −W_ij, if vertices i and j are adjacent; L_ij = 0, otherwise   (2)
where L_ij is the element of the Laplace matrix L corresponding to vertices i and j, and d_i is the sum of the weights between vertex i and its surrounding points, d_i = Σ_j W_ij;
Step 3: segment the salient target in the binocular image with the random walk image segmentation algorithm, using the seed point and background point from step 1 and the weighted graph from step 2;
Step 3.1: divide the pixels of the binocular image into two sets according to the seed point and background point marked in step 1, namely the marked point set V_M and the unmarked point set V_U, and order the Laplace matrix L so that the marked points come first and the unmarked points follow; L is then divided into the four blocks L_M, L_U, B and B^T, and the Laplace matrix is expressed as:
L = [ L_M  B ; B^T  L_U ]   (3)
where L_M is the Laplace matrix from marked points to marked points, L_U is the Laplace matrix from unmarked points to unmarked points, and B and B^T are the Laplace matrices from marked points to unmarked points and from unmarked points to marked points, respectively;
Step 3.2: solve the combined Dirichlet integral D[x] using the Laplace matrix and the marked points;
the combined Dirichlet integral is:
D[x] = (1/2) x^T L x = (1/2) Σ_ij W_ij (x_i − x_j)²   (4)
where x is the matrix of probabilities that vertices of the weighted graph reach the marked points, and x_i and x_j are the probabilities that vertices i and j reach the marked points, respectively;
according to the marked point set V_M and the unmarked point set V_U, x is divided into two parts x_M and x_U, where x_M is the probability matrix corresponding to the marked point set V_M and x_U is the probability matrix corresponding to the unmarked point set V_U; equation (4) is then decomposed into:
D[x_U] = (1/2) (x_M^T L_M x_M + 2 x_U^T B^T x_M + x_U^T L_U x_U)   (5)
For a marked point s, define the indicator vector m^s: if a marked vertex i belongs to label s then m_i^s = 1, otherwise m_i^s = 0. Differentiating D[x_U] with respect to x_U and setting the derivative to zero gives the minimizer of equation (5), namely the Dirichlet probability values for the marked point s:
L_U x^s = −B^T m^s   (6)
where x_i^s is the probability that vertex i first reaches the marked point s;
From the combined Dirichlet probabilities x^s, perform threshold segmentation according to equation (7) to generate a segmentation map:
where s_i is the pixel value at the position corresponding to vertex i in the segmentation map;
pixels with value 1 in the segmentation map represent the salient target in the image, and pixels with value 0 are the background;
Step 3.3: multiply the segmentation map pixel-wise with the original image to generate the target image, i.e., extract the segmented salient target:
t_i = s_i · I_i   (8)
where t_i is the gray value at the corresponding position i of the target image T, and I_i is the gray value at the corresponding position of the input image I(σ);
Step 4: match key points on the salient target independently using the SIFT algorithm;
key points of the segmented salient target are detected and matched independently with the SIFT algorithm, and the resulting match coordinates are screened so that wrong matches are discarded and correct matches are retained.
The process of matching the binocular image with the SIFT algorithm is shown in fig. 4.
Establishing a Gaussian pyramid for the target image, calculating the difference between every two filtered images to obtain a DOG image, wherein the DOG image is defined as D (x, y, sigma), and the formula is as follows:
D(x,y,σ)=(G(x,y,kσ)-G(x,y,σ))*T(x,y)
(9)
=C(x,y,kσ)-C(x,y,σ)
wherein,the method comprises the following steps that (1) a Gaussian function with a variable scale is adopted, p and q represent the dimension of a Gaussian template, (x and y) represent the positions of pixel points in a Gaussian pyramid image, sigma is a scale space factor of the image, k represents a specific scale value, and C (x, y and sigma) is defined as the convolution of G (x, y and sigma) and a target image T (x and y), namely C (x, y and sigma) is G (x, y and sigma) T (x and y);
Step 4.2: find extreme points in adjacent DOG images, determine their positions and scales as key points by fitting a three-dimensional quadratic function, and perform stability detection on the key points with the Hessian matrix to eliminate edge responses, specifically:
The key points are taken from the local extreme points of the DOG images. Every point of a DOG image is traversed and its gray value is compared with its 8 neighbors at the same scale and with the 2 × 9 points at the adjacent scales above and below, 26 points in total; if it is larger or smaller than all of these neighbors, it is an extreme point.
The extreme points found in this way are not yet true key points; to improve stability it is necessary to:
(1) obtain the curve fit D(X) of the DOG function in scale space by Taylor expansion:
D(X) = D + (∂D/∂X)^T X + (1/2) X^T (∂²D/∂X²) X   (10)
where X = (x, y, σ)^T and D is the curve fit; differentiating equation (10) and setting the derivative to 0 gives the offset of the extreme point:
X̂ = −(∂²D/∂X²)^(−1) (∂D/∂X)   (11)
To remove low-contrast extreme points, substitute equation (11) into equation (10) to obtain:
D(X̂) = D + (1/2) (∂D/∂X)^T X̂   (12)
If the value of equation (12) is greater than 0.03, the extreme point is retained and its accurate position (the original position plus the fitted offset) and scale are obtained; otherwise the extreme point is discarded;
(2) Eliminate unstable key points by screening with the Hessian matrix at each key point:
compute the curvature from the ratio of the eigenvalues of the Hessian matrix;
judge edge points according to the curvature in the neighborhood of the key point;
the curvature ratio threshold is set to 10: if the ratio is greater than 10 the key point is deleted, otherwise it is retained as a stable key point.
Step 4.3: after the positions and scales of the key points have been determined, assign a direction to each key point, so that the key-point descriptor can be defined relative to this direction; a direction parameter is assigned to each key point using the pixels in a 16 × 16 window around the key point;
for a key point detected in the DOG images, the magnitude and direction of the gradient are computed as:
m(x, y) = sqrt((C(x+1, y) − C(x−1, y))² + (C(x, y+1) − C(x, y−1))²)   (13)
θ(x, y) = tan⁻¹((C(x, y+1) − C(x, y−1)) / (C(x+1, y) − C(x−1, y)))
where C is the scale-space image in which the key point lies, m is the gradient magnitude and θ is the gradient direction of the key point; taking the key point as the center, a neighborhood is defined in the surrounding area and the gradients of the points in the neighborhood are collected in a histogram;
the abscissa of the histogram is the direction, with 360° divided into 36 bins of 10° each; the ordinate is the gradient magnitude, obtained by summing the magnitudes of the points falling in each direction bin; the main direction is defined as the bin direction with the maximum gradient magnitude hm, and other bins whose magnitude exceeds 0.8 × hm are used as auxiliary directions to enhance matching stability.
Step 4.4: after the above stages, each detected key point has three kinds of information: position, direction and scale. A descriptor is built for each key point to express its local feature information.
The coordinates around the key point are first rotated into the direction of the key point. Then a 16 × 16 window around the key point is selected and divided into 16 small 4 × 4 windows. In each 4 × 4 window the magnitude and direction of the corresponding gradients are computed and collected in an 8-bin histogram. The descriptor of the 16 × 16 window around the key point is computed with a Gaussian weighting algorithm according to the following formula:
where h is the descriptor, (a, b) is the position of the key point in the Gaussian pyramid image, d is the side length of the window, i.e. 16, (x, y) is the position of a pixel in the Gaussian pyramid image, and (x', y') is the new coordinate of the pixel in the neighborhood after rotating the coordinates into the direction of the key point; the new coordinates are computed by:
where θ is the direction of the key point.
A 128-dimensional feature vector is computed for each key point from its 16 × 16 window and recorded as H = (h_1, h_2, h_3, ..., h_128); to reduce the influence of illumination, the feature vector is normalized, and the normalized vector is recorded as L_g:
l_i = h_i / sqrt(Σ_{j=1}^{128} h_j²)   (16)
where L_g = (l_1, l_2, l_3, ..., l_128) is the normalized feature vector of the key point;
after the descriptors of the key points in the two images of the binocular pair have been generated, the key points are matched using the Euclidean distance between their feature vectors as the similarity measure, and the coordinate information of each pair of mutually matched key points is taken as one group of key information;
Step 4.5: screen the generated key-point matches in order to avoid errors as far as possible;
because the measuring system is a binocular model and the key points of the salient target in the two images lie in the same horizontal plane, the horizontal disparities of the matched pairs are theoretically equal. The horizontal disparity of each pair of key points is therefore computed to generate the disparity matrix, defined as K_n = {k_1, k_2, ..., k_n}, where n is the number of matched pairs and k_1, k_2, ..., k_n are the disparities of the individual matched points;
determine the median k_m of the disparity matrix and obtain the reference disparity matrix, denoted K_n':
K_n' = {k_1 − k_m, k_2 − k_m, ..., k_n − k_m}   (17)
set the disparity threshold to 3 and delete from K_n' the entries whose corresponding disparity deviation is larger than the threshold, giving the final disparity matrix K' and avoiding interference from mismatched key points; k_1', k_2', ..., k_{n'}' are the disparities of the screened correct matches and n' is the number of final correct matches:
K' = {k_1', k_2', ..., k_{n'}'}   (18)
Step 5: substitute the disparity matrix K' obtained in step 4 into the binocular ranging model to obtain the distance of the salient target;
the disparity of the salient target in the binocular image is obtained by subtracting the coordinates of the matched key points, and substituting this disparity into the binocular ranging model gives the distance of the salient target.
Binocular imaging acquires two images of the same scene from different viewing angles; the binocular model is shown in fig. 5.
The two identical imaging systems are separated by a distance B along the horizontal direction, their optical axes are parallel to the horizontal plane, and the image planes are parallel to the vertical plane;
suppose a point M(X, Y, Z) in the scene whose left and right imaging points are Pl(x_1, y_1) and Pr(x_2, y_2), where x_1, y_1 and x_2, y_2 are the coordinates of Pl and Pr in the vertical image plane, X, Y and Z are the coordinates along the horizontal, vertical and longitudinal axes of the spatial coordinate system, and the disparity in the binocular model is defined as k = |Pl − Pr| = |x_2 − x_1|; from the triangle similarity relation, the distance formula is obtained:
Z = f · B / (k · dx)   (19)
where dx is the physical width of one pixel along the horizontal axis of the imaging sensor, f is the focal length of the imaging system, and Z is the distance from the target point M to the line connecting the two imaging centers; substituting the disparity matrix obtained in step 4.5 into equation (19) together with the physical parameters of the binocular model gives the corresponding distance matrix Z' = {z_1, z_2, ..., z_{n'}}, where z_1, z_2, ..., z_{n'} are the distances computed from the individual matched disparities; finally the mean of the distance matrix is taken as the distance Z_f of the salient target in the binocular image:
Z_f = (1/n') Σ_{i=1}^{n'} z_i   (20)
The second embodiment: this embodiment is described with reference to the drawings and differs from the first embodiment in the specific process of performing edge detection on the image in step 1.1, which is as follows:
(1) perform a convolution with a 2D Gaussian filter template on each of the binocular images to eliminate image noise;
(2) compute the gradient magnitude and gradient direction of the pixels of the filtered binocular image I(x, y) from the differences of the first-order partial derivatives in the horizontal and vertical directions, where the partial derivatives dx and dy in the x and y directions are:
dx = [I(x+1, y) − I(x−1, y)] / 2   (21)
dy = [I(x, y+1) − I(x, y−1)] / 2   (22)
the gradient magnitude is then:
D' = (dx² + dy²)^(1/2)   (23)
and the gradient direction is:
θ' = arctan(dy / dx)   (24)
where D' and θ' are the gradient magnitude and gradient direction of the pixels of the filtered binocular image I(x, y);
(3) perform non-maximum suppression on the gradient, and then apply double-threshold processing to the image to generate the edge image; edge points of the edge image have gray value 255 and non-edge points have gray value 0.
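A minimal sketch of this edge-detection pre-processing: Gaussian filtering, central-difference gradients per equations (21)-(24), with the non-maximum suppression and double-threshold stage delegated to cv2.Canny as a stand-in for the hand-rolled version described above. The 5 × 5 filter size and the thresholds are illustrative, and `gray` is assumed to be an 8-bit grayscale image.

```python
import cv2
import numpy as np

def edge_map(gray, low=50, high=150):
    smoothed = cv2.GaussianBlur(gray, (5, 5), 1.4)          # 2D Gaussian filtering

    img = smoothed.astype(np.float32)
    dy, dx = np.gradient(img)                                # central differences, eqs. (21)-(22)
    magnitude = np.hypot(dx, dy)                             # D' = (dx^2 + dy^2)^(1/2), eq. (23)
    direction = np.arctan2(dy, dx)                           # gradient direction, eq. (24)

    edges = cv2.Canny(smoothed, low, high)                   # NMS + double threshold (255 / 0 output)
    return edges, magnitude, direction
```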
The third embodiment: this embodiment is described with reference to the drawings and differs from the first or second embodiment in the specific process of step 1.2, performing saliency feature extraction on the binocular image with the visual saliency model to generate the saliency feature map, which is as follows:
(1) after the edge detection of step 1.1, superimpose the original image and the edge image:
I1(σ) = 0.7 I(σ) + 0.3 C(σ)   (25)
where I(σ) is the original input binocular image, C(σ) is the edge image, and I1(σ) is the superimposed image;
(2) compute a nine-level Gaussian pyramid of the superimposed image using a Gaussian difference function, where level 0 is the input superimposed image and levels 1 to 8 are each obtained from the previous level by Gaussian filtering and downsampling, their sizes corresponding to 1/2 to 1/256 of the input image; extract the brightness, color and orientation features for each level of the Gaussian pyramid to generate the corresponding brightness, color and orientation pyramids;
the formula for extracting the brightness features is as follows:
I_n = (r + g + b) / 3   (26)
where r, g, b are the red, green and blue components of the input binocular image color and I_n is the brightness feature;
the formula for extracting the color features is as follows:
R=r-(g+b)/2 (27)
G=g-(r+b)/2 (28)
B=b-(r+g)/2 (29)
Y=r+g-2(|r-g|+b) (30)
where R, G, B and Y are the color components of the superimposed image;
O(σ, ω) is the orientation feature obtained by Gabor filtering of the brightness feature I_n at scale σ and direction ω, where σ is the Gaussian pyramid level, σ ∈ [0, 1, 2, …, 8], and ω is the direction of the Gabor function, ω ∈ [0°, 45°, 90°, 135°];
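A minimal sketch of the per-level feature extraction of equations (26)-(30), with a Gabor filter standing in for the orientation channel O(σ, ω); the Gabor kernel parameters are illustrative assumptions, and `bgr` is assumed to be a float BGR image of one pyramid level.

```python
import cv2
import numpy as np

def level_features(bgr, omega_deg=0.0):
    b, g, r = cv2.split(bgr.astype(np.float32))
    I_n = (r + g + b) / 3.0                                   # eq. (26), brightness
    R = r - (g + b) / 2.0                                     # eq. (27)
    G = g - (r + b) / 2.0                                     # eq. (28)
    B = b - (r + g) / 2.0                                     # eq. (29)
    Y = r + g - 2.0 * (np.abs(r - g) + b)                     # eq. (30)

    kernel = cv2.getGaborKernel((9, 9), 2.0, np.deg2rad(omega_deg), 5.0, 0.5)
    O = cv2.filter2D(I_n, cv2.CV_32F, kernel)                 # orientation feature O(sigma, omega)
    return I_n, (R, G, B, Y), O
```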
(3) take center-surround contrast differences of the brightness, color and orientation features at different scales of the Gaussian pyramid, specifically:
let scale c (c ∈ {2, 3, 4}) be the central scale and scale u (u = c + δ, δ ∈ {3, 4}) be the peripheral scale; within the nine-level Gaussian pyramid there are six combinations of central scale c and peripheral scale u: (2, 5), (2, 6), (3, 6), (3, 7), (4, 7), (4, 8);
the local feature contrast, i.e. the center-surround difference, is expressed as the difference between the feature maps at scale c and scale u as follows:
I_n(c, u) = |I_n(c) − I_n(u)|   (31)
RG(c, u) = |(R(c) − G(c)) − (G(u) − R(u))|   (32)
BY(c, u) = |(B(c) − Y(c)) − (Y(u) − B(u))|   (33)
O(c, u, ω) = |O(c, ω) − O(u, ω)|   (34)
before taking a difference, the two maps are interpolated to the same size, and the difference is then computed;
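A minimal sketch of one center-surround difference of equations (31)-(34): the surround level is interpolated back to the size of the center level before the absolute difference is taken, as required above; `pyr` is assumed to be a list of per-level feature maps of one channel.

```python
import cv2
import numpy as np

def center_surround(pyr, c, u):
    centre = pyr[c]
    surround = cv2.resize(pyr[u], (centre.shape[1], centre.shape[0]),
                          interpolation=cv2.INTER_LINEAR)      # match sizes before the difference
    return np.abs(centre - surround)                           # e.g. I_n(c, u) = |I_n(c) - I_n(u)|

# the six (c, u) scale pairs used in the text
PAIRS = [(2, 5), (2, 6), (3, 6), (3, 7), (4, 7), (4, 8)]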
(4) fuse the difference feature maps of the different features through normalization to generate the saliency feature map of the input binocular image, specifically:
first, the scale-contrast feature maps of each feature are normalized and fused to generate a comprehensive feature map for that feature: one for the brightness feature, one for the color feature and one for the orientation feature; the computation uses the normalization operator N(·), where, for a feature map to be processed, the feature value of each pixel is first normalized to the closed interval [0, 255], the global maximum saliency value A of the normalized map is found, the average a of its local maxima is computed, and finally the feature value of every pixel of the map is multiplied by (A − a)²;
then the comprehensive feature maps of the individual features are normalized and combined to obtain the final saliency feature map S.
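A minimal sketch of the normalization operator N(·) and the final fusion; the maps passed in are assumed to be already resized to a common size, and the final averaging of the three normalized channels follows the usual Itti-style fusion, which is an assumption since the exact fusion formula is not reproduced in the text.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def normalise(feat, peak=255.0, local=8):
    f = (feat - feat.min()) / (np.ptp(feat) + 1e-12) * peak    # map values into [0, 255]
    A = f.max()                                                # global maximum saliency value A
    local_max = maximum_filter(f, size=local)
    a = f[f == local_max].mean()                               # average a of the local maxima
    return f * (A - a) ** 2                                    # weight the map by (A - a)^2

def fuse(I_maps, C_maps, O_maps):
    I_bar = sum(normalise(m) for m in I_maps)                  # comprehensive map per feature
    C_bar = sum(normalise(m) for m in C_maps)
    O_bar = sum(normalise(m) for m in O_maps)
    return (normalise(I_bar) + normalise(C_bar) + normalise(O_bar)) / 3.0   # saliency map S
```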
Claims (3)
1. A method for distance measurement of a salient object in a binocular image, the method comprising the steps of:
Step 1: extract saliency features from the binocular image using a visual saliency model and mark a seed point and a background point, specifically:
Step 1.1: preprocess the binocular image by performing edge detection to generate an edge map of the binocular image;
Step 1.2: perform saliency feature extraction on the binocular image using the visual saliency model to generate a saliency feature map;
Step 1.3: according to the saliency feature map, find the pixel with the maximum gray value in the image and mark it as the seed point; traverse the pixels in a 25 × 25 window centered on the seed point and mark the pixel whose gray value is less than 0.1 and which is farthest from the seed point as the background point;
Step 2: build a weighted graph for the binocular image;
build the weighted graph using the classical Gaussian weight function:
W_ij = e^(−β(g_i − g_j)²)   (1)
where W_ij is the weight of the edge between vertex i and vertex j, g_i is the brightness of vertex i, g_j is the brightness of vertex j, β is a free parameter, and e is the natural base;
The Laplace matrix L of the weighted graph is obtained by:
L_ij = d_i, if i = j; L_ij = −W_ij, if vertices i and j are adjacent; L_ij = 0, otherwise   (2)
where L_ij is the element of the Laplace matrix L corresponding to vertices i and j, and d_i is the sum of the weights between vertex i and its surrounding points, d_i = Σ_j W_ij;
Step 3: segment the salient target in the binocular image with the random walk image segmentation algorithm, using the seed point and background point from step 1 and the weighted graph from step 2;
Step 3.1: divide the pixels of the binocular image into two sets according to the seed point and background point marked in step 1, namely the marked point set V_M and the unmarked point set V_U, and order the Laplace matrix L so that the marked points come first and the unmarked points follow; L is then divided into the four blocks L_M, L_U, B and B^T, and the Laplace matrix is expressed as:
L = [ L_M  B ; B^T  L_U ]   (3)
where L_M is the Laplace matrix from marked points to marked points, L_U is the Laplace matrix from unmarked points to unmarked points, and B and B^T are the Laplace matrices from marked points to unmarked points and from unmarked points to marked points, respectively;
Step 3.2: solve the combined Dirichlet integral D[x] using the Laplace matrix and the marked points;
the combined Dirichlet integral is:
D[x] = (1/2) x^T L x = (1/2) Σ_ij W_ij (x_i − x_j)²   (4)
where x is the matrix of probabilities that vertices of the weighted graph reach the marked points, and x_i and x_j are the probabilities that vertices i and j reach the marked points, respectively;
according to the marked point set V_M and the unmarked point set V_U, x is divided into two parts x_M and x_U, where x_M is the probability matrix corresponding to the marked point set V_M and x_U is the probability matrix corresponding to the unmarked point set V_U; equation (4) is then decomposed into:
D[x_U] = (1/2) (x_M^T L_M x_M + 2 x_U^T B^T x_M + x_U^T L_U x_U)   (5)
For a marked point s, define the indicator vector m^s: if a marked vertex i belongs to label s then m_i^s = 1, otherwise m_i^s = 0. Differentiating D[x_U] with respect to x_U and setting the derivative to zero gives the minimizer of equation (5), namely the Dirichlet probability values for the marked point s:
L_U x^s = −B^T m^s   (6)
where x_i^s is the probability that vertex i first reaches the marked point s;
From the combined Dirichlet probabilities x^s, perform threshold segmentation according to equation (7) to generate a segmentation map:
where s_i is the pixel value at the position corresponding to vertex i in the segmentation map;
pixels with value 1 in the segmentation map represent the salient target in the image, and pixels with value 0 are the background;
Step 3.3: multiply the segmentation map pixel-wise with the original image to generate the target image, i.e., extract the segmented salient target:
t_i = s_i · I_i   (8)
where t_i is the gray value at vertex i of the target image T, and I_i is the gray value at the corresponding position of the input image I(σ);
Step 4: match key points on the salient target independently using the SIFT algorithm;
Step 4.1: build a Gaussian pyramid for the target image and take the difference of adjacent filtered images to obtain the DOG images, defined as D(x, y, σ):
D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * T(x, y) = C(x, y, kσ) − C(x, y, σ)   (9)
where G(x, y, σ) = (1/(2πσ²)) e^(−((x − p/2)² + (y − q/2)²)/(2σ²)) is the variable-scale Gaussian function, p and q are the dimensions of the Gaussian template, (x, y) is the position of a pixel in the Gaussian pyramid image, σ is the scale-space factor of the image, k is the scale multiplier between adjacent levels, and C(x, y, σ) is defined as the convolution of G(x, y, σ) with the target image T(x, y), i.e. C(x, y, σ) = G(x, y, σ) * T(x, y);
Step 4.2: find extreme points in adjacent DOG images, determine their positions and scales as key points by fitting a three-dimensional quadratic function, and perform stability detection on the key points with the Hessian matrix to eliminate edge responses, specifically:
(1) Obtain the curve fit D(X) of the DOG function in scale space by Taylor expansion:
D(X) = D + (∂D/∂X)^T X + (1/2) X^T (∂²D/∂X²) X   (10)
where X = (x, y, σ)^T and D is the curve fit; differentiating equation (10) and setting the derivative to 0 gives the offset of the extreme point:
X̂ = −(∂²D/∂X²)^(−1) (∂D/∂X)   (11)
To remove low-contrast extreme points, substitute equation (11) into equation (10) to obtain:
D(X̂) = D + (1/2) (∂D/∂X)^T X̂   (12)
If the value of equation (12) is greater than 0.03, the extreme point is retained and its accurate position and scale are obtained; otherwise the extreme point is discarded;
(2) Eliminate unstable key points by screening with the Hessian matrix at each key point:
compute the curvature from the ratio of the eigenvalues of the Hessian matrix;
judge edge points according to the curvature in the neighborhood of the key point;
the curvature ratio threshold is set to 10: if the ratio is greater than 10 the key point is deleted, otherwise it is retained as a stable key point;
Step 4.3: assign a direction parameter to each key point using the pixels in a 16 × 16 window around the key point;
for a key point detected in the DOG images, the magnitude and direction of the gradient are computed as:
m(x, y) = sqrt((C(x+1, y) − C(x−1, y))² + (C(x, y+1) − C(x, y−1))²)   (13)
θ(x, y) = tan⁻¹((C(x, y+1) − C(x, y−1)) / (C(x+1, y) − C(x−1, y)))
where C is the scale-space image in which the key point lies, m is the gradient magnitude and θ is the gradient direction at the point; taking the key point as the center, a 16 × 16 neighborhood is defined, the gradient magnitude and direction of each pixel are computed, and the gradients of the points in the neighborhood are collected in a histogram; the abscissa of the histogram is the direction, with 360° divided into 36 bins of 10° each; the ordinate is the gradient magnitude, obtained by summing the magnitudes of the points falling in each direction bin; the main direction is defined as the bin direction with the maximum gradient magnitude hm, and bins whose magnitude exceeds 0.8 × hm are used as auxiliary directions to enhance matching stability;
Step 4.4: build a descriptor to express the local feature information of each key point.
First, rotate the coordinates around the key point into the direction of the key point;
then select a 16 × 16 window around the key point, divide this neighborhood into 16 small 4 × 4 windows, compute the magnitude and direction of the gradient in each 4 × 4 window, collect the gradient information of each small window in an 8-bin histogram, and compute the descriptor of the 16 × 16 window around the key point with a Gaussian weighting algorithm according to the following formula:
where h is the descriptor, (a, b) is the position of the key point in the Gaussian pyramid image, m_g is the gradient magnitude of the key point, i.e. the magnitude of the main direction of the histogram in step 4.3, d is the side length of the window, i.e. 16, (x, y) is the position of a pixel in the Gaussian pyramid image, and (x', y') is the new coordinate of the pixel in the neighborhood after rotating the coordinates into the direction of the key point; the new coordinates are computed by:
where θ_g is the gradient direction of the key point;
a 128-dimensional feature vector is computed for each key point from its 16 × 16 window and recorded as H = (h_1, h_2, h_3, ..., h_128); the feature vector is normalized, and the normalized vector is recorded as L_g:
l_i = h_i / sqrt(Σ_{j=1}^{128} h_j²)   (16)
where L_g = (l_1, l_2, ..., l_i, ..., l_128) is the normalized feature vector of the key point and l_i, i = 1, 2, 3, ..., is a component of the normalized vector;
The key points in the two images of the binocular pair are matched using the Euclidean distance between their feature vectors as the similarity measure; the coordinate information of each pair of mutually matched key points is taken as one group of key information;
Step 4.5: screen the generated key-point matches;
compute the horizontal disparity of each pair of key points to generate the disparity matrix, defined as K_n = {k_1, k_2, ..., k_n}, where n is the number of matched pairs and k_1, k_2, ..., k_n are the disparities of the individual matched points;
determine the median k_m of the disparity matrix and obtain the reference disparity matrix, denoted K_n':
K_n' = {k_1 − k_m, k_2 − k_m, ..., k_n − k_m}   (17)
set the disparity threshold to 3 and delete from K_n' the entries whose corresponding disparity deviation is larger than the threshold, giving the final disparity matrix K', where k_1', k_2', ..., k_{n'}' are the disparities of the screened correct matches and n' is the number of final correct matches:
K' = {k_1', k_2', ..., k_{n'}'}   (18)
Step 5: substitute the disparity matrix K' obtained in step 4 into the binocular ranging model to obtain the distance of the salient target;
the two identical imaging systems are separated by a distance J along the horizontal direction, their optical axes are parallel to the horizontal plane, and the image planes are parallel to the vertical plane;
suppose a target point M(X, Y, Z) in the scene whose left and right imaging points are Pl(x_1, y_1) and Pr(x_2, y_2), where x_1, y_1 and x_2, y_2 are the coordinates of Pl and Pr in the vertical image plane, X, Y and Z are the coordinates along the horizontal, vertical and longitudinal axes of the spatial coordinate system, and the disparity in the binocular model is defined as k = |Pl − Pr| = |x_2 − x_1|; from the triangle similarity relation, the distance formula is obtained:
Z = f · J / (k · dx')   (19)
where dx' is the physical width of one pixel along the horizontal axis of the imaging sensor, f is the focal length of the imaging system, and Z is the distance from the target point M to the line connecting the two imaging centers; substituting the disparity matrix obtained in step 4.5 into equation (19) together with the physical parameters of the binocular model gives the corresponding distance matrix Z' = {z_1, z_2, ..., z_{n'}}, where z_1, z_2, ..., z_{n'} are the distances computed from the individual matched disparities; finally the mean of the distance matrix is taken as the distance Z_f of the salient target in the binocular image:
Z_f = (1/n') Σ_{i=1}^{n'} z_i   (20)
2. The method for measuring the distance of a salient target in a binocular image according to claim 1, wherein the specific process of performing edge detection on the images in step 1.1 is:
(1) perform a convolution with a 2D Gaussian filter template on each of the binocular images to eliminate image noise;
(2) compute the gradient magnitude and gradient direction of the pixels of the filtered binocular image I(x, y) from the differences of the first-order partial derivatives in the horizontal and vertical directions, where the partial derivatives dx and dy in the x and y directions are:
dx = [I(x+1, y) − I(x−1, y)] / 2   (21)
dy = [I(x, y+1) − I(x, y−1)] / 2   (22)
the gradient magnitude is then:
D' = (dx² + dy²)^(1/2)   (23)
and the gradient direction is:
θ' = arctan(dy / dx)   (24)
where D' and θ' are the gradient magnitude and gradient direction of the pixels of the filtered binocular image I(x, y);
(3) perform non-maximum suppression on the gradient, and then apply double-threshold processing to the image to generate the edge image; edge points of the edge image have gray value 255 and non-edge points have gray value 0.
3. The method for measuring the distance between the salient objects in the binocular images according to claim 2, wherein the step two of extracting the salient features of the binocular images by using the visual saliency model comprises the following specific steps:
after the first binocular image edge detection, overlapping the original image and the edge image:
I1(σ)=0.7I(σ)+0.3C(σ) (25)
wherein I (sigma) is the original image of the input binocular image, C (sigma) is the edge image, I1(σ) is the image after the superimposition processing;
step 2.2: computing a nine-level Gaussian pyramid of the superimposed image with a Gaussian difference function, where level 0 is the input superimposed image, levels 1 to 8 are each obtained from the previous level by Gaussian filtering and down-sampling, and their sizes correspond to 1/2 down to 1/256 of the input image; intensity, color and orientation features are extracted at every pyramid level to generate the corresponding intensity, color and orientation pyramids;
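A sketch of the nine-level pyramid, with cv2.pyrDown standing in for the Gaussian filtering plus down-sampling described above:

```python
import cv2

def gaussian_pyramid(image, levels=9):
    """Level 0 is the input image; levels 1..8 are Gaussian-filtered,
    down-sampled copies, 1/2 down to 1/256 of the input size."""
    pyramid = [image]
    for _ in range(1, levels):
        pyramid.append(cv2.pyrDown(pyramid[-1]))
    return pyramid
```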
the formula for extracting the brightness features is as follows:
In=(r+g+b)/3 (26)
wherein r, g, b are the red, green and blue components of the input binocular image color, and In is the intensity feature;
the formula for extracting the color features is as follows:
R=r-(g+b)/2 (27)
G=g-(r+b)/2 (28)
B=b-(r+g)/2 (29)
Y=r+g-2(|r-g|+b) (30)
wherein R, G, B, Y are the red, green, blue and yellow color components of the superimposed image;
O(σ, ω) is the orientation feature obtained by filtering the intensity feature In at scale σ with a Gabor function in direction ω, where σ is the Gaussian pyramid level, σ ∈ [0,1,2,…,8], and ω is the Gabor direction, ω ∈ [0°, 45°, 90°, 135°];
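A sketch of the per-level feature extraction of equations (26)-(30) plus a Gabor-filtered orientation channel; the Gabor kernel parameters are illustrative assumptions, since the claim does not fix them:

```python
import cv2
import numpy as np

def level_features(bgr_level):
    """Intensity, color-opponent and orientation features for one pyramid level."""
    b, g, r = [c.astype(float) for c in cv2.split(bgr_level)]  # BGR as read by cv2
    I_n = (r + g + b) / 3.0                      # eq. (26)
    R = r - (g + b) / 2.0                        # eq. (27)
    G = g - (r + b) / 2.0                        # eq. (28)
    B = b - (r + g) / 2.0                        # eq. (29)
    Y = r + g - 2.0 * (np.abs(r - g) + b)        # eq. (30)
    # Orientation: Gabor responses of the intensity feature in four directions.
    O = []
    for deg in (0, 45, 90, 135):
        kernel = cv2.getGaborKernel((9, 9), 2.5, np.deg2rad(deg), 5.0, 0.5)
        O.append(cv2.filter2D(I_n, -1, kernel))
    return I_n, (R, G, B, Y), O
```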
step 2.3: computing center-surround contrast differences for the intensity, color and orientation features across different scales of the Gaussian pyramid, specifically:
let scale c (c ∈ {2,3,4}) be the central scale and scale u (u = c + δ, δ ∈ {3,4}) be the peripheral (surround) scale; within the 9-level Gaussian pyramid this gives 6 combinations (2-5, 2-6, 3-6, 3-7, 4-7, 4-8) of the central scale c and the surround scale u;
the center-surround contrast of the local features is represented by the difference between the feature maps at scale c and scale u, as follows:
In(c,u)=|In(c)-In(u)| (31)
RG(c,u)=|(R(c)-G(c))-(G(u)-R(u))| (32)
BY(c,u)=|(B(c)-Y(c))-(Y(u)-B(u))| (33)
O(c,u,ω)=|O(c,ω)-O(u,ω)| (34)
before taking each difference, the two maps are interpolated so that their sizes agree, and the difference is then computed;
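A sketch of one across-scale difference for equations (31)-(34): the surround map is interpolated up to the size of the centre map before the absolute difference is taken (names illustrative):

```python
import cv2

def center_surround(center_map, surround_map):
    """|center - surround| after resizing the surround map to the center size."""
    h, w = center_map.shape[:2]
    surround_up = cv2.resize(surround_map, (w, h), interpolation=cv2.INTER_LINEAR)
    return cv2.absdiff(center_map, surround_up)
```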
step 2.4: fusing the difference feature maps of the different features by normalization to generate the salient feature map of the input binocular image, specifically as follows:
for each feature, the contrast feature maps across all scale combinations are fused by normalization to generate a comprehensive feature map of that feature: a comprehensive intensity map from the normalized intensity feature maps, a comprehensive color map from the normalized color feature maps, and a comprehensive orientation map from the normalized orientation feature maps; the calculation process is shown in the following formula:
wherein N(·) denotes the normalization function: for the feature map to be processed, the value of every pixel is first normalized to the closed interval [0, 255]; the global maximum saliency A of the normalized map is then found, the mean a of its local maxima is computed, and finally every pixel of the map is multiplied by (A − a)²;
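A sketch of the normalization operator N(·) as described above; the dilation-based local-maximum detection is an implementation choice, not specified in the claim:

```python
import cv2
import numpy as np

def normalize_map(feature_map):
    """Itti-style normalization N(.) of a single feature map."""
    m = feature_map.astype(float)
    m = 255.0 * (m - m.min()) / (m.max() - m.min() + 1e-12)  # scale to [0, 255]
    A = m.max()                                              # global maximum
    dilated = cv2.dilate(m, np.ones((3, 3), np.uint8))       # 3x3 local maxima
    local_max = m[(m == dilated) & (m < A)]
    a = local_max.mean() if local_max.size else 0.0          # mean of local maxima
    return m * (A - a) ** 2
```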
then the comprehensive feature maps of the individual features are normalized and fused to obtain the final salient feature map S; the calculation process is as follows:
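The equation itself did not survive extraction; in the standard Itti-Koch saliency model, on which this step appears to be based, the salient feature map is the average of the three normalized comprehensive maps (writing Ī, C̄ and Ō for the comprehensive intensity, color and orientation maps), which would give
S = [N(Ī) + N(C̄) + N(Ō)] / 3
and this form is an assumption here, not a quotation of the claim.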
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510233157.3A CN104778721B (en) | 2015-05-08 | 2015-05-08 | The distance measurement method of conspicuousness target in a kind of binocular image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104778721A true CN104778721A (en) | 2015-07-15 |
CN104778721B CN104778721B (en) | 2017-08-11 |
Family
ID=53620167
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510233157.3A Active CN104778721B (en) | 2015-05-08 | 2015-05-08 | The distance measurement method of conspicuousness target in a kind of binocular image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104778721B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110065790B (en) * | 2019-04-25 | 2021-07-06 | 中国矿业大学 | Method for detecting blockage of coal mine belt transfer machine head based on visual algorithm |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008040945A1 (en) * | 2006-10-06 | 2008-04-10 | Imperial Innovations Limited | A method of identifying a measure of feature saliency in a sequence of images |
CN103824284A (en) * | 2014-01-26 | 2014-05-28 | 中山大学 | Key frame extraction method based on visual attention model and system |
Non-Patent Citations (2)
Title |
---|
刘英哲 et al.: "A Motion Estimation Algorithm Based on Adaptive Search Range Adjustment in H.264", 《电子与信息学报》 (Journal of Electronics & Information Technology) *
蒋寓文 et al.: "A Saliency Detection Model with Selective Background Priors", 《电子与信息学报》 (Journal of Electronics & Information Technology) *
Cited By (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105574928A (en) * | 2015-12-11 | 2016-05-11 | 深圳易嘉恩科技有限公司 | Driving image processing method and first electronic equipment |
CN106023198A (en) * | 2016-05-16 | 2016-10-12 | 天津工业大学 | Hessian matrix-based method for extracting aortic dissection of human thoracoabdominal cavity CT image |
CN107423739A (en) * | 2016-05-23 | 2017-12-01 | 北京陌上花科技有限公司 | Image characteristic extracting method and device |
CN107423739B (en) * | 2016-05-23 | 2020-11-13 | 北京陌上花科技有限公司 | Image feature extraction method and device |
CN106094516A (en) * | 2016-06-08 | 2016-11-09 | 南京大学 | A kind of robot self-adapting grasping method based on deeply study |
CN108460794A (en) * | 2016-12-12 | 2018-08-28 | 南京理工大学 | A kind of infrared well-marked target detection method of binocular solid and system |
CN108460794B (en) * | 2016-12-12 | 2021-12-28 | 南京理工大学 | Binocular three-dimensional infrared salient target detection method and system |
CN106780476A (en) * | 2016-12-29 | 2017-05-31 | 杭州电子科技大学 | A kind of stereo-picture conspicuousness detection method based on human-eye stereoscopic vision characteristic |
CN106920244A (en) * | 2017-01-13 | 2017-07-04 | 广州中医药大学 | A kind of method of background dot near detection image edges of regions |
CN106920244B (en) * | 2017-01-13 | 2019-08-02 | 广州中医药大学 | A kind of method of the neighbouring background dot of detection image edges of regions |
CN106918321A (en) * | 2017-03-30 | 2017-07-04 | 西安邮电大学 | A kind of method found range using object parallax on image |
CN107730521A (en) * | 2017-04-29 | 2018-02-23 | 安徽慧视金瞳科技有限公司 | The quick determination method of roof edge in a kind of image |
CN107730521B (en) * | 2017-04-29 | 2020-11-03 | 安徽慧视金瞳科技有限公司 | Method for rapidly detecting ridge type edge in image |
CN107392929A (en) * | 2017-07-17 | 2017-11-24 | 河海大学常州校区 | A kind of intelligent target detection and dimension measurement method based on human vision model |
CN107392929B (en) * | 2017-07-17 | 2020-07-10 | 河海大学常州校区 | Intelligent target detection and size measurement method based on human eye vision model |
WO2019029099A1 (en) * | 2017-08-11 | 2019-02-14 | 浙江大学 | Image gradient combined optimization-based binocular visual sense mileage calculating method |
CN107633498A (en) * | 2017-09-22 | 2018-01-26 | 成都通甲优博科技有限责任公司 | Image dark-state Enhancement Method, device and electronic equipment |
CN107644398A (en) * | 2017-09-25 | 2018-01-30 | 上海兆芯集成电路有限公司 | Image interpolation method and its associated picture interpolating device |
CN108036730A (en) * | 2017-12-22 | 2018-05-15 | 福建和盛高科技产业有限公司 | A kind of fire point distance measuring method based on thermal imaging |
CN108036730B (en) * | 2017-12-22 | 2019-12-10 | 福建和盛高科技产业有限公司 | Fire point distance measuring method based on thermal imaging |
CN108665740A (en) * | 2018-04-25 | 2018-10-16 | 衢州职业技术学院 | A kind of classroom instruction control system of feeling and setting happily blended Internet-based |
CN109300154A (en) * | 2018-11-27 | 2019-02-01 | 郑州云海信息技术有限公司 | A kind of distance measuring method and device based on binocular solid |
CN110060240A (en) * | 2019-04-09 | 2019-07-26 | 南京链和科技有限公司 | A kind of tyre contour outline measurement method based on camera shooting |
CN110060240B (en) * | 2019-04-09 | 2023-08-01 | 南京链和科技有限公司 | Tire contour measurement method based on image pickup |
CN110889866A (en) * | 2019-12-04 | 2020-03-17 | 南京美基森信息技术有限公司 | Background updating method for depth map |
CN112489104A (en) * | 2020-12-03 | 2021-03-12 | 海宁奕斯伟集成电路设计有限公司 | Distance measurement method and device, electronic equipment and readable storage medium |
CN112489104B (en) * | 2020-12-03 | 2024-06-18 | 海宁奕斯伟集成电路设计有限公司 | Ranging method, ranging device, electronic equipment and readable storage medium |
CN112784814A (en) * | 2021-02-10 | 2021-05-11 | 中联重科股份有限公司 | Posture recognition method for vehicle backing and warehousing and conveying vehicle backing and warehousing guide system |
CN112784814B (en) * | 2021-02-10 | 2024-06-07 | 中联重科股份有限公司 | Gesture recognition method for reversing and warehousing of vehicle and reversing and warehousing guiding system of conveying vehicle |
CN116523900A (en) * | 2023-06-19 | 2023-08-01 | 东莞市新通电子设备有限公司 | Hardware processing quality detection method |
CN116523900B (en) * | 2023-06-19 | 2023-09-08 | 东莞市新通电子设备有限公司 | Hardware processing quality detection method |
CN117152144A (en) * | 2023-10-30 | 2023-12-01 | 潍坊华潍新材料科技有限公司 | Guide roller monitoring method and device based on image processing |
CN117152144B (en) * | 2023-10-30 | 2024-01-30 | 潍坊华潍新材料科技有限公司 | Guide roller monitoring method and device based on image processing |
CN117889867A (en) * | 2024-03-18 | 2024-04-16 | 南京师范大学 | Path planning method based on local self-attention moving window algorithm |
CN117889867B (en) * | 2024-03-18 | 2024-05-24 | 南京师范大学 | Path planning method based on local self-attention moving window algorithm |
Also Published As
Publication number | Publication date |
---|---|
CN104778721B (en) | 2017-08-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104778721B (en) | The distance measurement method of conspicuousness target in a kind of binocular image | |
CN110569704B (en) | Multi-strategy self-adaptive lane line detection method based on stereoscopic vision | |
Awrangjeb et al. | Automatic detection of residential buildings using LIDAR data and multispectral imagery | |
EP2811423B1 (en) | Method and apparatus for detecting target | |
CN109598794B (en) | Construction method of three-dimensional GIS dynamic model | |
CN107424142B (en) | Weld joint identification method based on image significance detection | |
CN104835175B (en) | Object detection method in a kind of nuclear environment of view-based access control model attention mechanism | |
CN109584281B (en) | Overlapping particle layering counting method based on color image and depth image | |
CN110675408A (en) | High-resolution image building extraction method and system based on deep learning | |
CN112801074B (en) | Depth map estimation method based on traffic camera | |
CN107818303B (en) | Unmanned aerial vehicle oil and gas pipeline image automatic contrast analysis method, system and software memory | |
CN104517095B (en) | A kind of number of people dividing method based on depth image | |
CN104063702A (en) | Three-dimensional gait recognition based on shielding recovery and partial similarity matching | |
CN107492107B (en) | Object identification and reconstruction method based on plane and space information fusion | |
CN117036641A (en) | Road scene three-dimensional reconstruction and defect detection method based on binocular vision | |
CN105678318B (en) | The matching process and device of traffic sign | |
CN111126393A (en) | Vehicle appearance refitting judgment method and device, computer equipment and storage medium | |
CN106446785A (en) | Passable road detection method based on binocular vision | |
CN116452852A (en) | Automatic generation method of high-precision vector map | |
EP4287137A1 (en) | Method, device, equipment, storage media and system for detecting drivable space of road | |
Yousri et al. | A deep learning-based benchmarking framework for lane segmentation in the complex and dynamic road scenes | |
Zakharov et al. | Automatic building detection from satellite images using spectral graph theory | |
CN118429524A (en) | Binocular stereoscopic vision-based vehicle running environment modeling method and system | |
CN105005962B (en) | Islands and reefs Remote Sensing Image Matching method based on hierarchical screening strategy | |
Ebrahimikia et al. | True orthophoto generation based on unmanned aerial vehicle images using reconstructed edge points |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20170717; Address after: Room 245, No. 333 Jianshe Road, Jiufo, Sino-Singapore Guangzhou Knowledge City, Guangzhou, Guangdong 510000; Applicant after: Guangzhou Xiaopeng Automobile Technology Co. Ltd.; Address before: No. 92 West Dazhi Street, Nangang District, Harbin 150001; Applicant before: Harbin Institute of Technology |
TA01 | Transfer of patent application right | ||
GR01 | Patent grant | ||
GR01 | Patent grant |