CN111368882B - Stereo matching method based on simplified independent component analysis and local similarity - Google Patents

Stereo matching method based on simplified independent component analysis and local similarity Download PDF

Info

Publication number
CN111368882B
CN111368882B CN202010103827.0A CN202010103827A CN111368882B CN 111368882 B CN111368882 B CN 111368882B CN 202010103827 A CN202010103827 A CN 202010103827A CN 111368882 B CN111368882 B CN 111368882B
Authority
CN
China
Prior art keywords
loss function
component analysis
pixel
independent component
matching cost
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010103827.0A
Other languages
Chinese (zh)
Other versions
CN111368882A (en
Inventor
陈苏婷
张婧霖
邓仲
张闯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN202010103827.0A priority Critical patent/CN111368882B/en
Publication of CN111368882A publication Critical patent/CN111368882A/en
Application granted granted Critical
Publication of CN111368882B publication Critical patent/CN111368882B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2134Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on separation criteria, e.g. independent component analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a stereo matching method based on simplified independent component analysis and local similarity, which is used for the technical field of image processing and improves a DispNet network, wherein the method firstly proposes simplified Independent Component Analysis (ICA) cost aggregation, introduces a matching cost volume pyramid, simplifies the preprocessing process of an ICA algorithm and defines a simplified ICA loss function; secondly, introducing a regional loss function, and defining a local similarity loss function by combining a single-pixel point loss function so as to perfect the spatial structure of the disparity map; and finally, combining the simplified ICA loss function with the local similarity loss function, training the network to predict the disparity map, and making up the edge information of the disparity map. The method and the device improve the prediction accuracy of the edge and the detail part of the parallax image and reduce the dependence degree on a single pixel point in the prediction process while ensuring the prediction speed of the parallax image.

Description

一种基于简化独立成分分析和局部相似性的立体匹配方法A stereo matching method based on simplified independent component analysis and local similarity

技术领域Technical Field

本发明属于图像处理技术领域,尤其涉及一种基于简化独立成分分析和局部相似性的立体匹配方法。The invention belongs to the technical field of image processing, and in particular relates to a stereo matching method based on simplified independent component analysis and local similarity.

背景技术Background Art

立体匹配是立体视觉研究中的关键部分,在车辆的自动驾驶,3D模型重建,物体的检测与识别等方面有广泛的应用。立体匹配的目的是求出立体图像对中左右图像像素点之间的对应关系,以获得视差图。然而,立体匹配面临着很大的挑战,遇到遮挡、弱纹理、深度不连续等复杂场景,不易获取稠密且精细的视差图。因此,如何准确的从立体图对中获取密集视差,具有重大的研究意义。Stereo matching is a key part of stereo vision research and has a wide range of applications in autonomous driving, 3D model reconstruction, object detection and recognition, etc. The purpose of stereo matching is to find the correspondence between the pixels of the left and right images in a stereo image pair to obtain a disparity map. However, stereo matching faces great challenges. When encountering complex scenes such as occlusion, weak texture, and depth discontinuity, it is not easy to obtain dense and detailed disparity maps. Therefore, how to accurately obtain dense disparity from stereo image pairs is of great research significance.

传统的立体匹配方法匹配效果的好坏依赖于匹配代价的准确性,计算十分缓慢,十分依赖匹配窗口的合理性,对弱纹理区的处理效果不好,算法实现时收敛速度较慢。在传统的立体匹配算法中,采用手动的方法提取图像特征以及代价卷的设计,图像信息表达不全面,影响了后续步骤的实施,视差图精度受到影响。The quality of the traditional stereo matching method depends on the accuracy of the matching cost. The calculation is very slow and depends heavily on the rationality of the matching window. The processing effect on weak texture areas is not good, and the convergence speed of the algorithm is slow. In the traditional stereo matching algorithm, manual methods are used to extract image features and design cost volumes. The image information is not fully expressed, which affects the implementation of subsequent steps and the accuracy of the disparity map.

发明内容Summary of the invention

发明目的:针对在实际场景中现存的立体匹配网络在视差图的边缘,细节信息以及弱纹理区域的预测准确率较低问题,本发明提出一种基于简化独立成分分析(SICA)和局部相似性的立体匹配方法。该方法提高了视差图边缘以及细节部分的预测准确率,减少了在预测过程中对单像素点的依赖程度。Purpose of the invention: In order to solve the problem that the existing stereo matching network has low prediction accuracy in the edge, detail information and weak texture area of the disparity map in actual scenes, the present invention proposes a stereo matching method based on simplified independent component analysis (SICA) and local similarity. This method improves the prediction accuracy of the edge and detail parts of the disparity map and reduces the reliance on single pixels in the prediction process.

技术方案:为实现本发明的目的,本发明所采用的技术方案是:一种基于简化独立成分分析和局部相似性的立体匹配方法,包括以下步骤:Technical solution: To achieve the purpose of the present invention, the technical solution adopted by the present invention is: a stereo matching method based on simplified independent component analysis and local similarity, comprising the following steps:

步骤一,将双目相机拍摄的立体图像对输入DispNetC网络的卷积层,提取每个像素的特征,通过计算特征相关性构建初始匹配代价卷,完成初始匹配代价计算;Step 1: Input the stereo image pair taken by the binocular camera into the convolutional layer of the DispNetC network, extract the features of each pixel, construct the initial matching cost volume by calculating the feature correlation, and complete the initial matching cost calculation;

步骤二,将初始匹配代价卷输入DispNetC网络的编码-解码结构,进行简化独立成分分析匹配代价聚合,定义简化独立成分分析损失函数LSICA,更新像素点的权值;Step 2: Input the initial matching cost volume into the encoding-decoding structure of the DispNetC network, perform simplified independent component analysis matching cost aggregation, define the simplified independent component analysis loss function L SICA , and update the pixel weights;

步骤三,聚合后的匹配代价卷输入解码结构的最后一层反卷积层,反卷积的结果即为视差图,构建局部相似性损失函数Ll,并结合简化独立成分分析损失函数LSICA,得到总损失函数L;Step 3: The aggregated matching cost volume is input into the last deconvolution layer of the decoding structure. The result of the deconvolution is the disparity map. The local similarity loss function L l is constructed and combined with the simplified independent component analysis loss function L SICA to obtain the total loss function L;

步骤四,利用真实视差图和预测视差图以及定义的总损失函数L进行网络训练,更新网络参数,通过训练完成的网络预测得到全尺寸视差图。Step 4: Use the real disparity map, the predicted disparity map and the defined total loss function L to perform network training, update the network parameters, and obtain the full-size disparity map through the trained network prediction.

进一步地,所述步骤一,实现从特征表达到像素点相似性衡量的转换,初始匹配代价计算方法如下:Furthermore, in step 1, the conversion from feature expression to pixel similarity measurement is realized, and the initial matching cost calculation method is as follows:

通过DispNetC网络的卷积层提取立体图像对的特征,得到两张图像各自的特征图;将特征输入DispNetC网络的相关层,获取其在特征空间内对应位置的关系,获得初始匹配代价;通过DispNetC网络的相关层比较两个特征图各个块的关系,即计算各块之间的相关性,公式如下:The features of the stereo image pair are extracted through the convolutional layer of the DispNetC network to obtain the feature maps of the two images respectively; the features are input into the correlation layer of the DispNetC network to obtain the relationship between the corresponding positions in the feature space and obtain the initial matching cost; the relationship between the blocks of the two feature maps is compared through the correlation layer of the DispNetC network, that is, the correlation between the blocks is calculated, and the formula is as follows:

Figure BDA0002387805400000021
Figure BDA0002387805400000021

其中c(x1,x2)表示特征图的块的相关性,f1和f2分别表示两个特征图,x1表示特征图f1中以x1为中心的一块,x2表示特征图f2中以x2为中心的一块,k为图像块大小,d为图像位移范围,即视差搜索范围;Where c(x 1 ,x 2 ) represents the correlation of the feature map block, f 1 and f 2 represent two feature maps respectively, x 1 represents a block centered on x 1 in the feature map f 1 , x 2 represents a block centered on x 2 in the feature map f 2 , k is the image block size, and d is the image displacement range, that is, the disparity search range;

在求取匹配代价的过程中,将左图设为参考图像,在范围d内进行移动,计算相关性大小,得到初始匹配代价卷。In the process of obtaining the matching cost, the left image is set as the reference image, moved within the range d, the correlation size is calculated, and the initial matching cost volume is obtained.

进一步地,所述步骤二,将初始匹配代价卷输入DispNetC网络的编码-解码结构,将匹配代价卷堆叠成空间金字塔并结合简化独立成分分析损失函数,利用通道向量之间的相关性,完成像素点在所有视差搜索范围内与其邻域像素的重要性衡量,完成像素点的权值更新,具体如下:Furthermore, in step 2, the initial matching cost volume is input into the encoding-decoding structure of the DispNetC network, the matching cost volume is stacked into a spatial pyramid and combined with a simplified independent component analysis loss function, and the correlation between channel vectors is used to complete the importance measurement of the pixel point and its neighboring pixels in all disparity search ranges, and the weight update of the pixel point is completed, as follows:

(1)基于简化独立成分分析的代价聚合在解码阶段完成,匹配代价卷经过解码结构的若干反卷积层,每个反卷积层得到一个反卷积结果,即每层输出一个匹配代价卷,堆叠不同层的匹配代价卷fs构成空间金字塔;对每层匹配代价卷进行上采样,上采样的匹配代价卷的大小和最后一层输出的匹配代价卷fs'的大小相同;(1) Cost aggregation based on simplified independent component analysis is completed in the decoding stage. The matching cost volume passes through several deconvolution layers of the decoding structure. Each deconvolution layer obtains a deconvolution result, that is, each layer outputs a matching cost volume, and the matching cost volumes fs of different layers are stacked to form a spatial pyramid; each layer of the matching cost volume is upsampled, and the size of the upsampled matching cost volume is the same as the size of the matching cost volume fs ' output by the last layer;

(2)保持fs'的通道数不变,将fs'拉平成

Figure BDA0002387805400000022
其中Xj由WiHi个通道向量
Figure BDA0002387805400000023
组成,Wi、Hi分别表示匹配代价卷的长、宽,dj表示上采样后匹配代价卷的层数,i表示像素点的位置,j表示第j个视差搜索范围;(2) Keep the number of channels of f s ' unchanged and flatten f s ' into
Figure BDA0002387805400000022
where Xj consists of WiHi channel vectors
Figure BDA0002387805400000023
Composition, Wi and Hi represent the length and width of the matching cost volume respectively, dj represents the number of layers of the matching cost volume after upsampling, i represents the position of the pixel point, and j represents the jth disparity search range;

(3)由拉平的Xj中得到权重矩阵Yj,Yj由通道向量

Figure BDA0002387805400000024
的各个点的权重之和求得;(3) The weight matrix Y j is obtained from the flattened X j . Y j is composed of the channel vector
Figure BDA0002387805400000024
The sum of the weights of each point is obtained;

Figure BDA0002387805400000025
Figure BDA0002387805400000025

其中Wa和ba分别表示网络权重和偏置项;Where Wa and ba represent the network weight and bias term respectively;

(4)对权重矩阵Yj中对应位置i上的权重进行softmax规范化,得到归一化后的权重矩阵Ai,公式如下:(4) Perform softmax normalization on the weight at the corresponding position i in the weight matrix Yj to obtain the normalized weight matrix Ai , as shown in the following formula:

Figure BDA0002387805400000026
Figure BDA0002387805400000026

ai=softmax(Γ(y1,...,yi))a i =softmax(Γ(y 1 ,...,y i ))

其中ai表示归一化后像素点的权重,i表示像素点的位置,WiHi表示矩阵Ai元素个数,yi为权重矩阵Yj中的元素,表示未归一化之前位置i的像素点的权重,Γ是采用element-wise sum的融合函数,T表示矩阵转置;Where ai represents the weight of the pixel after normalization, i represents the position of the pixel, WiHi represents the number of elements in the matrix Ai , yi is the element in the weight matrix Yj , which represents the weight of the pixel at position i before normalization, Γ is the fusion function using element-wise sum, and T represents matrix transpose;

(5)将权重矩阵Ai与Xj相乘,得到聚合后的向量Mi,Mi=AiXj;将完成代价聚合后的向量

Figure BDA0002387805400000031
转换为代价卷
Figure BDA0002387805400000032
di表示代价聚和后的代价卷层数。(5) Multiply the weight matrix Ai and Xj to obtain the aggregated vector Mi , Mi = AiXj ;
Figure BDA0002387805400000031
Convert to price roll
Figure BDA0002387805400000032
d i represents the number of cost volume layers after cost aggregation.

进一步地,由于传统的独立成分分析(ICA)算法,需要进行预处理,特征提取等一系列操作,本发明中仅在构建匹配代价卷金字塔时,根据ICA损失函数定义新的简化独立成分分析(SICA)损失函数,将SICA损失函数参数对应于ICA损失函数中的参数;Furthermore, since the traditional independent component analysis (ICA) algorithm requires a series of operations such as preprocessing and feature extraction, in the present invention, only when constructing the matching cost volume pyramid, a new simplified independent component analysis (SICA) loss function is defined according to the ICA loss function, and the SICA loss function parameters correspond to the parameters in the ICA loss function;

权重矩阵Ai由通道向量

Figure BDA0002387805400000033
本身加权获得,考虑其他像素点的影响,结合独立成分分析损失函数,定义简化独立成分分析损失函数如下:The weight matrix Ai is composed of the channel vector
Figure BDA0002387805400000033
It is weighted itself, considering the influence of other pixels, combined with the independent component analysis loss function, and the simplified independent component analysis loss function is defined as follows:

Figure BDA0002387805400000034
Figure BDA0002387805400000034

其中LSICA表示简化独立成分分析损失函数,I表示单位矩阵,x表示平方和函数。Where L SICA represents the simplified independent component analysis loss function, I represents the identity matrix, and x represents the square sum function.

进一步地,所述步骤三,在单像素点损失函数的基础上结合区域损失函数,构建局部相似性损失函数,结合简化独立成分分析损失函数,得到总损失函数;Furthermore, in step three, based on the single pixel loss function, the regional loss function is combined to construct a local similarity loss function, and the total loss function is obtained by combining the simplified independent component analysis loss function;

在立体匹配中,通过计算预测视差图和真实视差图之间的差异,将该差异作为训练损失,其中单像素点的损失函数Ls表示为:In stereo matching, the difference between the predicted disparity map and the real disparity map is calculated and used as the training loss, where the loss function Ls of a single pixel is expressed as:

Figure BDA0002387805400000035
Figure BDA0002387805400000035

其中N是像素数量,dn

Figure BDA0002387805400000036
分别是第n个像素的预测视差以及真实的视差。where N is the number of pixels, d n and
Figure BDA0002387805400000036
are the predicted disparity and the actual disparity of the n-th pixel, respectively.

进一步地,采用KL散度衡量两个相邻像素之间的相似性,当像素n和其邻域像素t的真实视差相同,在训练网络时,使像素n和t的预测视差的差别越小,同时损失函数值越小越满足预期;当像素n和其邻域像素t的真实视差不同,在训练网络时,使像素n和t的预测视差的差别越大,同时损失函数越小越满足预期;根据两个相邻像素之间的相似性定义区域损失函数Lr为:Furthermore, KL divergence is used to measure the similarity between two adjacent pixels. When the real disparity of pixel n and its neighboring pixel t is the same, when training the network, the difference between the predicted disparity of pixels n and t is made smaller, and the smaller the loss function value is, the more it meets expectations; when the real disparity of pixel n and its neighboring pixel t is different, when training the network, the difference between the predicted disparity of pixels n and t is made larger, and the smaller the loss function is, the more it meets expectations; according to the similarity between two adjacent pixels, the regional loss function Lr is defined as:

Figure BDA0002387805400000037
Figure BDA0002387805400000037

其中Dkl()表示Kullback-Leibler散度,dn和dt分别是中心像素点n和其领域像素点t的预测视差值,

Figure BDA0002387805400000038
Figure BDA0002387805400000039
分别是中心像素点n和其领域像素点t的真实视差值,m为边界参数。Where D kl () represents the Kullback-Leibler divergence, d n and d t are the predicted disparity values of the center pixel n and its neighborhood pixel t, respectively.
Figure BDA0002387805400000038
and
Figure BDA0002387805400000039
are the true disparity values of the center pixel n and its surrounding pixel t, and m is the boundary parameter.

进一步地,在单像素点损失函数的基础上结合区域损失函数,将局部相似性损失函数定义Ll为:Furthermore, based on the single pixel loss function and combined with the regional loss function, the local similarity loss function is defined as:

Figure BDA00023878054000000310
Figure BDA00023878054000000310

其中N是像素数量,区域损失函数Lr中R(dn)代表区域内预测的视差值,

Figure BDA0002387805400000041
代表区域内真实的视差值,n代表区域的中心像素,R(*)代表p*q的邻域,R代表p*q邻域的面积。Where N is the number of pixels, and R(d n ) in the regional loss function L r represents the predicted disparity value within the region.
Figure BDA0002387805400000041
represents the true disparity value in the area, n represents the central pixel of the area, R(*) represents the neighborhood of p*q, and R represents the area of the neighborhood of p*q.

进一步地,结合简化独立成分分析损失函数LSICA以及局部相似性损失函数Ll,将总损失函数L定义为:Furthermore, combining the simplified independent component analysis loss function L SICA and the local similarity loss function L l , the total loss function L is defined as:

Figure BDA0002387805400000042
Figure BDA0002387805400000042

其中ω和λ为权重参数,用来控制简化独立成分分析损失函数LSICA和局部相似性损失函数Ll的重要性比例,R(*)代表p*q的邻域,R代表p*q邻域的面积。Where ω and λ are weight parameters used to control the importance ratio of the simplified independent component analysis loss function LSICA and the local similarity loss function Ll , R(*) represents the neighborhood of p*q, and R represents the area of the neighborhood of p*q.

进一步地,所述步骤四,利用BPTT算法实现网络参数更新,所述参数包括权重、偏置。Furthermore, in the step four, the BPTT algorithm is used to update the network parameters, and the parameters include weights and biases.

本发明对DispNetC进行改进。DispNetC网络结构用于立体匹配,求视差图,该网络包括三部分:特征提取,特征相关性计算,编解码结构。立体图像对输入DispNetC网络经过特征提取,特征相关性计算,编解码结构,就可以得到视差图。The present invention improves DispNetC. The DispNetC network structure is used for stereo matching and disparity map. The network includes three parts: feature extraction, feature correlation calculation, and encoding and decoding structure. The stereo image pair is input into the DispNetC network, and after feature extraction, feature correlation calculation, and encoding and decoding structure, the disparity map can be obtained.

本发明在DispNetC的编解码结构上引入ICA代价聚合以及对应的ICA损失函数,同时在DispNetC原本的单像素点损失函数的基础上加入了一个区域损失函数。首先提出了简化独立成分分析代价聚合,在DispNetC编码-解码结构的解码部分引入匹配代价卷金字塔,同时定义了简化独立成分分析损失函数,简化了独立成分分析算法的预处理过程;其次,引入区域损失函数,结合单像素点损失函数,定义局部相似性损失函数,以完善视差图的空间结构;最后,简化独立成分分析损失函数和局部相似性损失函数相结合,进行视差图预测,弥补视差图的边缘信息。The present invention introduces ICA cost aggregation and the corresponding ICA loss function on the encoding and decoding structure of DispNetC, and adds a regional loss function based on the original single-pixel loss function of DispNetC. First, a simplified independent component analysis cost aggregation is proposed, and a matching cost volume pyramid is introduced in the decoding part of the DispNetC encoding-decoding structure. At the same time, a simplified independent component analysis loss function is defined, which simplifies the preprocessing process of the independent component analysis algorithm; secondly, a regional loss function is introduced, combined with a single-pixel loss function, and a local similarity loss function is defined to improve the spatial structure of the disparity map; finally, the simplified independent component analysis loss function is combined with the local similarity loss function to perform disparity map prediction and compensate for the edge information of the disparity map.

有益效果:与现有技术相比,本发明的技术方案具有以下有益的技术效果:Beneficial effects: Compared with the prior art, the technical solution of the present invention has the following beneficial technical effects:

本发明构建了基于简化独立成分分析和局部相似性的立体匹配方法,提出了集匹配代价卷金字塔和简化独立成分分析损失函数为一体的简化独立成分分析匹配代价聚合以及局部相似性损失函数。这个立体匹配方法所提出的匹配代价聚合模型完善了视差图的场景结构以及细节部分。而局部相似性损失函数弥补单像素点损失函数的不足,从依靠独立像素点扩展到依靠邻域像素点信息,学习像素之间的内在关系,在保证视差图预测速度的同时,提高了视差图边缘以及细节部分的预测准确率,减少了在预测过程中对单像素点的依赖程度。The present invention constructs a stereo matching method based on simplified independent component analysis and local similarity, and proposes a simplified independent component analysis matching cost aggregation and a local similarity loss function that integrates a matching cost volume pyramid and a simplified independent component analysis loss function. The matching cost aggregation model proposed by this stereo matching method improves the scene structure and details of the disparity map. The local similarity loss function makes up for the shortcomings of the single pixel loss function, expanding from relying on independent pixels to relying on neighborhood pixel information, learning the intrinsic relationship between pixels, and improving the prediction accuracy of the edges and details of the disparity map while ensuring the prediction speed of the disparity map, reducing the degree of dependence on single pixels in the prediction process.

附图说明BRIEF DESCRIPTION OF THE DRAWINGS

图1是本发明方法的实现流程图;Fig. 1 is a flow chart of the implementation of the method of the present invention;

图2是简化独立成分分析匹配代价聚合示意图;FIG2 is a simplified schematic diagram of independent component analysis matching cost aggregation;

图3是构建局部相似性损失函数示意图。FIG3 is a schematic diagram of constructing a local similarity loss function.

具体实施方式DETAILED DESCRIPTION

下面结合附图和实施例对本发明的技术方案作进一步的说明。The technical solution of the present invention is further described below in conjunction with the accompanying drawings and embodiments.

本发明所提出的基于简化独立成分分析和局部相似性的立体匹配方法,实现流程如图1所示,具体实现步骤如下:The stereo matching method based on simplified independent component analysis and local similarity proposed in the present invention has an implementation process as shown in FIG1 , and the specific implementation steps are as follows:

步骤一,将双目相机拍摄的立体图像对输入DispNetC网络的卷积层,提取每个像素的特征,通过计算特征相关性构建初始匹配代价卷,完成初始匹配代价计算,实现从特征表达到像素点相似性衡量的转换;具体如下:Step 1: Input the stereo image pair taken by the binocular camera into the convolution layer of the DispNetC network, extract the features of each pixel, construct the initial matching cost volume by calculating the feature correlation, complete the initial matching cost calculation, and realize the conversion from feature expression to pixel similarity measurement; the details are as follows:

为了比较输入图像对中两个像素点的相似性,需要得到每个像素点的强有力的表达,通过DispNetC网络的卷积层提取立体图像对中左图Il和右图Ir的特征,得到左图像特征Fl和右图像特征Fr,其中I和F分别表示原图及特征图,l,r分别表示左和右,为匹配代价的构建做准备;In order to compare the similarity of two pixels in the input image pair, it is necessary to obtain a strong expression of each pixel. The features of the left image I l and the right image I r in the stereo image pair are extracted through the convolutional layer of the DispNetC network to obtain the left image feature F l and the right image feature F r , where I and F represent the original image and feature map respectively, l and r represent left and right respectively, in preparation for the construction of the matching cost;

将特征Fl和Fr输入DispNetC网络的相关层,获取Fl和Fr在特征空间内对应位置的关系,获得初始匹配代价Fc,完成从特征表达到像素点相似性衡量的转换;Input the features F l and F r into the relevant layer of the DispNetC network, obtain the relationship between the corresponding positions of F l and F r in the feature space, obtain the initial matching cost F c , and complete the conversion from feature expression to pixel similarity measurement;

DispNetC网络的相关层用于比较两个特征图各个块的关系,即计算各块之间的相关性,公式如下:The correlation layer of the DispNetC network is used to compare the relationship between the blocks of the two feature maps, that is, to calculate the correlation between the blocks. The formula is as follows:

Figure BDA0002387805400000051
Figure BDA0002387805400000051

其中c(x1,x2)表示特征图的块的相关性,f1和f2分别表示两个特征图,x1表示特征图f1中以x1为中心的一块,x2表示特征图f2中以x2为中心的一块,k为图像块大小,d为图像位移范围,即视差搜索范围;Where c(x 1 ,x 2 ) represents the correlation of the feature map block, f 1 and f 2 represent two feature maps respectively, x 1 represents a block centered on x 1 in the feature map f 1 , x 2 represents a block centered on x 2 in the feature map f 2 , k is the image block size, and d is the image displacement range, that is, the disparity search range;

在求取匹配代价的过程中,将左图设为参考图像,在范围d内进行移动,计算相关性大小,得到初始匹配代价卷。In the process of obtaining the matching cost, the left image is set as the reference image, moved within the range d, the correlation size is calculated, and the initial matching cost volume is obtained.

步骤二,将初始匹配代价卷输入DispNetC网络的编码-解码结构,将匹配代价卷堆叠成空间金字塔,进行简化独立成分分析匹配代价聚合,定义简化独立成分分析损失函数LSICA,利用通道向量之间的相关性,完成像素点在所有视差搜索范围内与其邻域像素的重要性衡量,完成像素点的权值更新;图2所示为简化独立成分分析匹配代价聚合的执行流程示意图,具体包括:Step 2: Input the initial matching cost volume into the encoding-decoding structure of the DispNetC network, stack the matching cost volume into a spatial pyramid, perform simplified independent component analysis matching cost aggregation, define the simplified independent component analysis loss function L SICA , and use the correlation between channel vectors to measure the importance of pixels and their neighboring pixels in all disparity search ranges, and complete the weight update of pixels; Figure 2 shows a schematic diagram of the execution process of simplified independent component analysis matching cost aggregation, which specifically includes:

(1)基于简化独立成分分析的代价聚合在解码阶段完成,匹配代价卷经过解码结构的若干反卷积层,每个反卷积层得到一个反卷积结果,即每层输出一个匹配代价卷,堆叠不同层的匹配代价卷fs构成空间金字塔;对每层匹配代价卷进行上采样,上采样的匹配代价卷的大小和最后一层输出的匹配代价卷fs'的大小相同;(1) Cost aggregation based on simplified independent component analysis is completed in the decoding stage. The matching cost volume passes through several deconvolution layers of the decoding structure. Each deconvolution layer obtains a deconvolution result, that is, each layer outputs a matching cost volume, and the matching cost volumes fs of different layers are stacked to form a spatial pyramid; each layer of the matching cost volume is upsampled, and the size of the upsampled matching cost volume is the same as the size of the matching cost volume fs ' output by the last layer;

(2)保持fs'的通道数不变,将fs'拉平成

Figure BDA0002387805400000052
其中Xj由WiHi个通道向量
Figure BDA0002387805400000061
组成,Wi、Hi分别表示匹配代价卷的长、宽,dj表示上采样后匹配代价卷的层数,i表示像素点的位置,j表示第j个视差搜索范围;(2) Keep the number of channels of f s ' unchanged and flatten f s ' into
Figure BDA0002387805400000052
where Xj consists of WiHi channel vectors
Figure BDA0002387805400000061
Composition, Wi and Hi represent the length and width of the matching cost volume respectively, dj represents the number of layers of the matching cost volume after upsampling, i represents the position of the pixel point, and j represents the jth disparity search range;

(3)由拉平的Xj中得到权重矩阵Yj,Yj由通道向量

Figure BDA0002387805400000062
的各个点的权重之和求得;(3) The weight matrix Y j is obtained from the flattened X j . Y j is composed of the channel vector
Figure BDA0002387805400000062
The sum of the weights of each point is obtained;

Figure BDA0002387805400000063
Figure BDA0002387805400000063

其中Wa和ba分别表示网络权重和偏置项;Where Wa and ba represent the network weight and bias term respectively;

(4)对权重矩阵Yj中对应位置i上的权重进行softmax规范化,得到归一化后的权重矩阵Ai,公式如下:(4) Perform softmax normalization on the weight at the corresponding position i in the weight matrix Yj to obtain the normalized weight matrix Ai , as shown in the following formula:

Figure BDA0002387805400000064
Figure BDA0002387805400000064

ai=softmax(Γ(y1,...,yi))a i =softmax(Γ(y 1 ,...,y i ))

其中ai表示归一化后像素点的权重,i表示像素点的位置,WiHi表示矩阵Ai元素个数,yi为权重矩阵Yj中的元素,表示未归一化之前位置i的像素点的权重,Γ是采用element-wise sum的融合函数,T表示矩阵转置;Where ai represents the weight of the pixel after normalization, i represents the position of the pixel, WiHi represents the number of elements in the matrix Ai , yi is the element in the weight matrix Yj , which represents the weight of the pixel at position i before normalization, Γ is the fusion function using element-wise sum, and T represents matrix transpose;

(5)将权重矩阵Ai与Xj相乘,得到聚合后的向量Mi,Mi=AiXj;将完成代价聚合后的向量

Figure BDA0002387805400000065
转换为代价卷
Figure BDA0002387805400000066
di表示代价聚和后的代价卷层数。(5) Multiply the weight matrix Ai and Xj to obtain the aggregated vector Mi , Mi = AiXj ;
Figure BDA0002387805400000065
Convert to price roll
Figure BDA0002387805400000066
d i represents the number of cost volume layers after cost aggregation.

由于传统的独立成分分析(ICA)算法,需要进行预处理,特征提取等一系列操作,本发明中仅在构建匹配代价卷金字塔时,根据ICA损失函数定义新的简化独立成分分析(SICA)损失函数,将SICA损失函数参数对应于ICA损失函数中的参数;Since the traditional independent component analysis (ICA) algorithm requires a series of operations such as preprocessing and feature extraction, in the present invention, only when constructing the matching cost volume pyramid, a new simplified independent component analysis (SICA) loss function is defined according to the ICA loss function, and the SICA loss function parameters correspond to the parameters in the ICA loss function;

以上获取通道本身的权重可以看成一个简化的独立成分分析过程:Xj可以看成独立成分分析处理过程中的待降维信号;由

Figure BDA0002387805400000067
获得的权重,在计算权值时
Figure BDA0002387805400000068
可以看作在独立成分分析处理过程中的中心化步骤,其中Wa和ba分别表示权重和偏置项,这个权重和偏置在网络训练的过程中更新;权重矩阵Ai与独立成分分析中变换矩阵W相对应;在匹配代价卷fj'给重要部分赋予权值类似于在独立成分分析中提取主要成分,所述重要部分是指有图像中具有特征的位置,比如图像的边缘,对于预测视差较为重要,这些有特征的位置被赋予的权重越高,视差准确性越高;在独立成分分析中提取主要成分是指独立成分分析适用于主成分分析,提取最有代表性的特征。The above acquisition of the channel weights can be regarded as a simplified independent component analysis process: Xj can be regarded as the signal to be reduced in the independent component analysis process;
Figure BDA0002387805400000067
The weight obtained, when calculating the weight
Figure BDA0002387805400000068
It can be regarded as a centralization step in the independent component analysis process, where Wa and ba represent weights and bias terms respectively, and the weights and biases are updated during the network training process; the weight matrix Ai corresponds to the transformation matrix W in the independent component analysis; assigning weights to important parts in the matching cost volume fj ' is similar to extracting the main components in the independent component analysis, and the important parts refer to the positions with features in the image, such as the edges of the image, which are more important for predicting disparity. The higher the weights assigned to these characteristic positions, the higher the disparity accuracy; extracting the main components in the independent component analysis means that the independent component analysis is applicable to the principal component analysis to extract the most representative features.

当前的权重矩阵Ai由通道向量

Figure BDA0002387805400000069
本身加权获得,并没有考虑到其他像素点的影响,因此需要结合独立成分分析重建损失函数,定义简化独立成分分析损失函数如下:The current weight matrix Ai is composed of the channel vector
Figure BDA0002387805400000069
The weighted result itself does not take into account the influence of other pixels. Therefore, it is necessary to combine the independent component analysis to reconstruct the loss function. The simplified independent component analysis loss function is defined as follows:

Figure BDA00023878054000000610
Figure BDA00023878054000000610

其中LSICA表示简化独立成分分析损失函数,I表示单位矩阵,x表示平方和函数。Where L SICA represents the simplified independent component analysis loss function, I represents the identity matrix, and x represents the square sum function.

步骤三,聚合后的匹配代价卷输入解码结构的最后一层反卷积层,反卷积的结果即为视差图,构建局部相似性损失函数Ll,并结合简化独立成分分析损失函数LSICA,得到总损失函数L;具体包括:Step 3: The aggregated matching cost volume is input into the last deconvolution layer of the decoding structure. The result of the deconvolution is the disparity map. The local similarity loss function L l is constructed and combined with the simplified independent component analysis loss function L SICA to obtain the total loss function L. Specifically, it includes:

在立体匹配中,通过计算预测视差图和真实视差图之间的差异,将该差异作为训练损失,其中单像素点的损失函数Ls表示为:In stereo matching, the difference between the predicted disparity map and the real disparity map is calculated and used as the training loss, where the loss function Ls of a single pixel is expressed as:

Figure BDA0002387805400000071
Figure BDA0002387805400000071

其中N是像素数量,dn

Figure BDA0002387805400000072
分别是第n个像素的预测视差以及真实的视差;where N is the number of pixels, d n and
Figure BDA0002387805400000072
are the predicted disparity and the actual disparity of the nth pixel respectively;

采用KL散度衡量两个相邻像素之间的相似性,当像素n和其邻域像素t的真实视差相同,在训练网络时,使像素n和t的预测视差的差别越小,同时损失函数值越小越满足预期;当像素n和其邻域像素t的真实视差不同,在训练网络时,使像素n和t的预测视差的差别越大,同时损失函数越小越满足预期;根据两个相邻像素之间的相似性定义区域损失函数Lr为:KL divergence is used to measure the similarity between two adjacent pixels. When the real disparity of pixel n and its neighboring pixel t is the same, when training the network, the difference between the predicted disparity of pixels n and t is made smaller, and the smaller the loss function value is, the more it meets expectations; when the real disparity of pixel n and its neighboring pixel t is different, when training the network, the difference between the predicted disparity of pixels n and t is made larger, and the smaller the loss function is, the more it meets expectations; according to the similarity between two adjacent pixels, the regional loss function Lr is defined as:

Figure BDA0002387805400000073
Figure BDA0002387805400000073

其中Dkl()表示Kullback-Leibler散度,dn和dt分别是中心像素点n和其领域像素点t的预测视差值,

Figure BDA0002387805400000074
Figure BDA0002387805400000075
分别是中心像素点n和其领域像素点t的真实视差值,m为边界参数;Where D kl () represents the Kullback-Leibler divergence, d n and d t are the predicted disparity values of the center pixel n and its neighborhood pixel t, respectively.
Figure BDA0002387805400000074
and
Figure BDA0002387805400000075
are the true disparity values of the center pixel n and its surrounding pixel t, and m is the boundary parameter;

在单像素点损失函数的基础上结合区域损失函数,构建局部相似性损失函数,将局部相似性损失函数定义Ll为:Based on the single pixel loss function and the regional loss function, a local similarity loss function is constructed and the local similarity loss function is defined as:

Figure BDA0002387805400000076
Figure BDA0002387805400000076

其中N是像素数量,区域损失函数Lr中R(dn)代表区域内预测的视差值,

Figure BDA0002387805400000077
代表区域内真实的视差值,n代表区域的中心像素,本实施例中,R(*)代表3*3的邻域,R代表3*3邻域的面积,局部相似性损失函数示意图如图3所示;Where N is the number of pixels, and R(d n ) in the regional loss function L r represents the predicted disparity value within the region.
Figure BDA0002387805400000077
represents the real disparity value in the area, n represents the central pixel of the area, in this embodiment, R(*) represents a 3*3 neighborhood, R represents the area of the 3*3 neighborhood, and the schematic diagram of the local similarity loss function is shown in FIG3 ;

综上,结合简化独立成分分析损失函数LSICA以及局部相似性损失函数Ll,将总损失函数L定义为:In summary, combined with the simplified independent component analysis loss function L SICA and the local similarity loss function L l , the total loss function L is defined as:

Figure BDA0002387805400000078
Figure BDA0002387805400000078

其中ω和λ为权重参数,用来控制简化独立成分分析损失函数LSICA和局部相似性损失函数Ll的重要性比例,本实施例中,R(*)代表3*3的邻域,R代表3*3邻域的面积。Wherein ω and λ are weight parameters, which are used to control the importance ratio of the simplified independent component analysis loss function L SICA and the local similarity loss function L l . In this embodiment, R(*) represents a 3*3 neighborhood, and R represents the area of the 3*3 neighborhood.

步骤四,利用真实视差图和预测视差图以及定义的总损失函数L进行网络训练,利用BPTT算法实现网络参数更新,所述参数包括权重、偏置,通过训练完成的网络预测得到全尺寸视差图。Step 4: Use the real disparity map and the predicted disparity map and the defined total loss function L to perform network training, and use the BPTT algorithm to update network parameters, wherein the parameters include weights and biases, and obtain a full-size disparity map through network prediction after training.

以上所述是本发明的优选实施方式,应当指出,对于本技术领域的普通技术人员来说,在不脱离本发明技术原理的前提下,还可以做出若干改进和变形,这些改进和变形也应视为本发明的保护范围。The above is a preferred embodiment of the present invention. It should be pointed out that for ordinary technicians in this technical field, several improvements and modifications can be made without departing from the technical principles of the present invention. These improvements and modifications should also be regarded as the scope of protection of the present invention.

Claims (5)

1.一种基于简化独立成分分析和局部相似性的图像立体匹配方法,其特征在于:该方法包括以下步骤:1. An image stereo matching method based on simplified independent component analysis and local similarity, characterized in that the method comprises the following steps: 步骤一,将双目相机拍摄的立体图像对输入DispNetC网络的卷积层,提取每个像素的特征,通过计算特征相关性构建初始匹配代价卷,完成初始匹配代价计算;Step 1: Input the stereo image pair taken by the binocular camera into the convolutional layer of the DispNetC network, extract the features of each pixel, construct the initial matching cost volume by calculating the feature correlation, and complete the initial matching cost calculation; 步骤二,将初始匹配代价卷输入DispNetC网络的编码-解码结构,进行简化独立成分分析匹配代价聚合,定义简化独立成分分析损失函数LSICA,更新像素点的权值;Step 2: Input the initial matching cost volume into the encoding-decoding structure of the DispNetC network, perform simplified independent component analysis matching cost aggregation, define the simplified independent component analysis loss function L SICA , and update the pixel weights; 步骤三,聚合后的匹配代价卷输入解码结构的最后一层反卷积层,反卷积的结果即为视差图,构建局部相似性损失函数Ll,并结合简化独立成分分析损失函数LSICA,得到总损失函数L;具体如下:Step 3: The aggregated matching cost volume is input into the last deconvolution layer of the decoding structure. The result of the deconvolution is the disparity map. The local similarity loss function L l is constructed and combined with the simplified independent component analysis loss function L SICA to obtain the total loss function L; the details are as follows: 在单像素点损失函数的基础上结合区域损失函数,构建局部相似性损失函数,结合简化独立成分分析损失函数,得到总损失函数;Based on the single pixel loss function and the regional loss function, a local similarity loss function is constructed, and combined with the simplified independent component analysis loss function, the total loss function is obtained; 在立体匹配中,通过计算预测视差图和真实视差图之间的差异,将该差异作为训练损失,其中单像素点的损失函数Ls表示为:In stereo matching, the difference between the predicted disparity map and the real disparity map is calculated and used as the training loss, where the loss function Ls of a single pixel is expressed as:
Figure FDA0004058404930000011
Figure FDA0004058404930000011
其中N是像素数量,dn
Figure FDA0004058404930000012
分别是第n个像素的预测视差以及真实的视差;
where N is the number of pixels, d n and
Figure FDA0004058404930000012
are the predicted disparity and the actual disparity of the nth pixel respectively;
采用KL散度衡量两个相邻像素之间的相似性,当像素n和其邻域像素t的真实视差相同,在训练网络时,使像素n和t的预测视差的差别越小,同时损失函数值越小越满足预期;当像素n和其邻域像素t的真实视差不同,在训练网络时,使像素n和t的预测视差的差别越大,同时损失函数越小越满足预期;根据两个相邻像素之间的相似性定义区域损失函数Lr为:KL divergence is used to measure the similarity between two adjacent pixels. When the real disparity of pixel n and its neighboring pixel t is the same, when training the network, the difference between the predicted disparity of pixels n and t is made smaller, and the smaller the loss function value is, the more it meets expectations; when the real disparity of pixel n and its neighboring pixel t is different, when training the network, the difference between the predicted disparity of pixels n and t is made larger, and the smaller the loss function is, the more it meets expectations; according to the similarity between two adjacent pixels, the regional loss function Lr is defined as:
Figure FDA0004058404930000013
Figure FDA0004058404930000013
其中Dkl()表示Kullback-Leibler散度,dn和dt分别是中心像素点n和其领域像素点t的预测视差值,
Figure FDA0004058404930000014
Figure FDA0004058404930000015
分别是中心像素点n和其领域像素点t的真实视差值,m为边界参数;
Where D kl () represents the Kullback-Leibler divergence, d n and d t are the predicted disparity values of the center pixel n and its neighborhood pixel t, respectively.
Figure FDA0004058404930000014
and
Figure FDA0004058404930000015
are the true disparity values of the center pixel n and its surrounding pixel t, and m is the boundary parameter;
在单像素点损失函数的基础上结合区域损失函数,将局部相似性损失函数定义Ll为:Based on the single pixel loss function and the regional loss function, the local similarity loss function is defined as:
Figure FDA0004058404930000016
Figure FDA0004058404930000016
其中N是像素数量,区域损失函数Lr中R(dn)代表区域内预测的视差值,
Figure FDA0004058404930000017
代表区域内真实的视差值,n代表区域的中心像素,R(*)代表p*q的邻域,R代表p*q邻域的面积;
Where N is the number of pixels, and R(d n ) in the regional loss function L r represents the predicted disparity value within the region.
Figure FDA0004058404930000017
represents the true disparity value in the area, n represents the central pixel of the area, R(*) represents the neighborhood of p*q, and R represents the area of the neighborhood of p*q;
结合简化独立成分分析损失函数LSICA以及局部相似性损失函数Ll,将总损失函数L定义为:Combining the simplified independent component analysis loss function L SICA and the local similarity loss function L l , the total loss function L is defined as:
Figure FDA0004058404930000021
Figure FDA0004058404930000021
其中ω和λ为权重参数,用来控制简化独立成分分析损失函数LSICA和局部相似性损失函数Ll的重要性比例,R(*)代表p*q的邻域,R代表p*q邻域的面积;Where ω and λ are weight parameters used to control the importance ratio of the simplified independent component analysis loss function L SICA and the local similarity loss function L l , R(*) represents the neighborhood of p*q, and R represents the area of the neighborhood of p*q; 步骤四,利用真实视差图和预测视差图以及定义的总损失函数L进行网络训练,更新网络参数,通过训练完成的网络预测得到全尺寸视差图。Step 4: Use the real disparity map, the predicted disparity map and the defined total loss function L to perform network training, update the network parameters, and obtain the full-size disparity map through the trained network prediction.
2.根据权利要求1所述的一种基于简化独立成分分析和局部相似性的图像立体匹配方法,其特征在于:所述步骤一,初始匹配代价计算方法如下:2. The image stereo matching method based on simplified independent component analysis and local similarity according to claim 1, characterized in that: in step 1, the initial matching cost calculation method is as follows: 通过DispNetC网络的卷积层提取立体图像对的特征,得到两张图像各自的特征图;将特征输入DispNetC网络的相关层,获取其在特征空间内对应位置的关系,获得初始匹配代价;通过DispNetC网络的相关层比较两个特征图各个块的关系,即计算各块之间的相关性,公式如下:The features of the stereo image pair are extracted through the convolutional layer of the DispNetC network to obtain the feature maps of the two images respectively; the features are input into the correlation layer of the DispNetC network to obtain the relationship between the corresponding positions in the feature space and obtain the initial matching cost; the relationship between the blocks of the two feature maps is compared through the correlation layer of the DispNetC network, that is, the correlation between the blocks is calculated, and the formula is as follows:
Figure FDA0004058404930000022
Figure FDA0004058404930000022
其中c(x1,x2)表示特征图的块的相关性,f1和f2分别表示两个特征图,x1表示特征图f1中以x1为中心的一块,x2表示特征图f2中以x2为中心的一块,k为图像块大小,d为图像位移范围,即视差搜索范围;Where c(x 1 ,x 2 ) represents the correlation of the feature map block, f 1 and f 2 represent two feature maps respectively, x 1 represents a block centered on x 1 in the feature map f 1 , x 2 represents a block centered on x 2 in the feature map f 2 , k is the image block size, and d is the image displacement range, that is, the disparity search range; 在求取匹配代价的过程中,将立体图像对中左图设为参考图像,在范围d内进行移动,计算相关性大小,得到初始匹配代价卷。In the process of obtaining the matching cost, the left image in the stereo image pair is set as the reference image, moved within the range d, and the correlation size is calculated to obtain the initial matching cost volume.
3.根据权利要求1所述的一种基于简化独立成分分析和局部相似性的图像立体匹配方法,其特征在于:所述步骤二,将初始匹配代价卷输入DispNetC网络的编码-解码结构,将匹配代价卷堆叠成空间金字塔并结合简化独立成分分析损失函数,利用通道向量之间的相关性,完成像素点在所有视差搜索范围内与其邻域像素的重要性衡量,完成像素点的权值更新,具体如下:3. According to claim 1, a method for image stereo matching based on simplified independent component analysis and local similarity is characterized in that: in step 2, the initial matching cost volume is input into the encoding-decoding structure of the DispNetC network, the matching cost volume is stacked into a spatial pyramid and combined with the simplified independent component analysis loss function, and the correlation between channel vectors is used to complete the importance measurement of the pixel point and its neighboring pixels in all disparity search ranges, and the weight update of the pixel point is completed, which is specifically as follows: (1)基于简化独立成分分析的代价聚合在解码阶段完成,匹配代价卷经过解码结构的若干反卷积层,每个反卷积层得到一个反卷积结果,即每层输出一个匹配代价卷,堆叠不同层的匹配代价卷fs构成空间金字塔;对每层匹配代价卷进行上采样,上采样的匹配代价卷的大小和最后一层输出的匹配代价卷fs'的大小相同;(1) Cost aggregation based on simplified independent component analysis is completed in the decoding stage. The matching cost volume passes through several deconvolution layers of the decoding structure. Each deconvolution layer obtains a deconvolution result, that is, each layer outputs a matching cost volume, and the matching cost volumes fs of different layers are stacked to form a spatial pyramid; each layer of the matching cost volume is upsampled, and the size of the upsampled matching cost volume is the same as the size of the matching cost volume fs ' output by the last layer; (2)保持fs'的通道数不变,将fs'拉平成
Figure FDA0004058404930000023
其中Xj由WiHi个通道向量
Figure FDA0004058404930000031
组成,Wi、Hi分别表示匹配代价卷的长、宽,dj表示上采样后匹配代价卷的层数,i表示像素点的位置,j表示第j个视差搜索范围;
(2) Keep the number of channels of f s ' unchanged and flatten f s ' into
Figure FDA0004058404930000023
where Xj consists of WiHi channel vectors
Figure FDA0004058404930000031
Composition, Wi and Hi represent the length and width of the matching cost volume respectively, dj represents the number of layers of the matching cost volume after upsampling, i represents the position of the pixel point, and j represents the jth disparity search range;
(3)由拉平的Xj中得到权重矩阵Yj,Yj由通道向量
Figure FDA0004058404930000032
的各个点的权重之和求得;
(3) The weight matrix Y j is obtained from the flattened X j . Y j is composed of the channel vector
Figure FDA0004058404930000032
The sum of the weights of each point is obtained;
Figure FDA0004058404930000033
Figure FDA0004058404930000033
其中Wa和ba分别表示网络权重和偏置项;Where Wa and ba represent the network weight and bias term respectively; (4)对权重矩阵Yj中对应位置i上的权重进行softmax规范化,得到归一化后的权重矩阵Ai,公式如下:(4) Perform softmax normalization on the weight at the corresponding position i in the weight matrix Yj to obtain the normalized weight matrix Ai , as shown in the following formula:
Figure FDA0004058404930000034
Figure FDA0004058404930000034
ai=softmax(Γ(y1,...,yi))a i =softmax(Γ(y 1 ,...,y i )) 其中ai表示归一化后像素点的权重,i表示像素点的位置,WiHi表示矩阵Ai元素个数,yi为权重矩阵Yj中的元素,表示未归一化之前位置i的像素点的权重,Γ是采用element-wisesum的融合函数,T表示矩阵转置;Where ai represents the weight of the pixel after normalization, i represents the position of the pixel, WiHi represents the number of elements in the matrix Ai , yi is the element in the weight matrix Yj , which represents the weight of the pixel at position i before normalization, Γ is the fusion function using element-wisesum, and T represents matrix transpose; (5)将权重矩阵Ai与Xj相乘,得到聚合后的向量Mi,Mi=AiXj;将完成代价聚合后的向量
Figure FDA0004058404930000035
转换为代价卷
Figure FDA0004058404930000036
di表示代价聚和后的代价卷层数。
(5) Multiply the weight matrix Ai and Xj to obtain the aggregated vector Mi , Mi = AiXj ;
Figure FDA0004058404930000035
Convert to price roll
Figure FDA0004058404930000036
d i represents the number of cost volume layers after cost aggregation.
4.根据权利要求3所述的一种基于简化独立成分分析和局部相似性的图像立体匹配方法,其特征在于:权重矩阵Ai由通道向量
Figure FDA0004058404930000037
本身加权获得,考虑其他像素点的影响,结合独立成分分析损失函数,定义简化独立成分分析损失函数如下:
4. The image stereo matching method based on simplified independent component analysis and local similarity according to claim 3 is characterized in that: the weight matrix Ai is composed of channel vectors
Figure FDA0004058404930000037
It is weighted itself, considering the influence of other pixels, combined with the independent component analysis loss function, and the simplified independent component analysis loss function is defined as follows:
Figure FDA0004058404930000038
Figure FDA0004058404930000038
其中LSICA表示简化独立成分分析损失函数,I表示单位矩阵,x表示平方和函数。Where L SICA represents the simplified independent component analysis loss function, I represents the identity matrix, and x represents the square sum function.
5.根据权利要求1-4任一所述的一种基于简化独立成分分析和局部相似性的图像立体匹配方法,其特征在于:所述步骤四,利用BPTT算法实现网络参数更新,所述参数包括权重、偏置。5. The image stereo matching method based on simplified independent component analysis and local similarity according to any one of claims 1 to 4, characterized in that: in the step 4, the BPTT algorithm is used to update the network parameters, and the parameters include weights and biases.
CN202010103827.0A 2020-02-20 2020-02-20 Stereo matching method based on simplified independent component analysis and local similarity Active CN111368882B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010103827.0A CN111368882B (en) 2020-02-20 2020-02-20 Stereo matching method based on simplified independent component analysis and local similarity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010103827.0A CN111368882B (en) 2020-02-20 2020-02-20 Stereo matching method based on simplified independent component analysis and local similarity

Publications (2)

Publication Number Publication Date
CN111368882A CN111368882A (en) 2020-07-03
CN111368882B true CN111368882B (en) 2023-04-18

Family

ID=71206367

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010103827.0A Active CN111368882B (en) 2020-02-20 2020-02-20 Stereo matching method based on simplified independent component analysis and local similarity

Country Status (1)

Country Link
CN (1) CN111368882B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149547B (en) * 2020-09-17 2023-06-02 南京信息工程大学 Water Body Recognition Method Based on Image Pyramid Guidance and Pixel Pair Matching
CN113470099B (en) * 2021-07-09 2022-03-25 北京的卢深视科技有限公司 Depth imaging method, electronic device and storage medium
CN114049510B (en) * 2021-10-26 2025-01-07 北京中科慧眼科技有限公司 Binocular camera stereo matching method, system and intelligent terminal based on loss function

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109584290A (en) * 2018-12-03 2019-04-05 北京航空航天大学 A kind of three-dimensional image matching method based on convolutional neural networks
CN110533712A (en) * 2019-08-26 2019-12-03 北京工业大学 A kind of binocular solid matching process based on convolutional neural networks

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109584290A (en) * 2018-12-03 2019-04-05 北京航空航天大学 A kind of three-dimensional image matching method based on convolutional neural networks
CN110533712A (en) * 2019-08-26 2019-12-03 北京工业大学 A kind of binocular solid matching process based on convolutional neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
基于卷积神经网络的立体匹配研究;王润;《中国优秀硕士学位论文全文数据库 信息科技辑》;全文 *
基于卷积神经网络的立体匹配算法研究;严邓涛;《中国优秀硕士学位论文全文数据库 信息科技辑》;全文 *

Also Published As

Publication number Publication date
CN111368882A (en) 2020-07-03

Similar Documents

Publication Publication Date Title
CN108510535B (en) High-quality depth estimation method based on depth prediction and enhancer network
CN110930342B (en) A network construction method for depth map super-resolution reconstruction based on color map guidance
CN111368882B (en) Stereo matching method based on simplified independent component analysis and local similarity
CN110163213B (en) Remote sensing image segmentation method based on disparity map and multi-scale depth network model
CN113592018A (en) Infrared light and visible light image fusion method based on residual dense network and gradient loss
CN111739082A (en) An Unsupervised Depth Estimation Method for Stereo Vision Based on Convolutional Neural Networks
CN111310598B (en) A Hyperspectral Remote Sensing Image Classification Method Based on 3D and 2D Hybrid Convolution
CN113592026A (en) Binocular vision stereo matching method based on void volume and cascade cost volume
CN112115951B (en) A RGB-D Image Semantic Segmentation Method Based on Spatial Relationship
CN110909615B (en) Target detection method based on multi-scale input mixed perception neural network
CN111402311A (en) Knowledge distillation-based lightweight stereo parallax estimation method
CN110826500B (en) Method for estimating 3D human body posture based on antagonistic network of motion link space
CN110070574A (en) A kind of binocular vision Stereo Matching Algorithm based on improvement PSMNet
CN113780389B (en) Deep learning semi-supervised dense matching method and system based on consistency constraint
CN110351548B (en) A Stereo Image Quality Evaluation Method Based on Deep Learning and Disparity Map Weighted Guidance
CN106056622A (en) Multi-view depth video recovery method based on Kinect camera
CN111462211A (en) Binocular parallax calculation method based on convolutional neural network
CN109087247A (en) The method that a kind of pair of stereo-picture carries out oversubscription
Li et al. No-reference stereoscopic image quality assessment based on local to global feature regression
CN108596831B (en) Super-resolution reconstruction method based on AdaBoost example regression
CN101739684B (en) Color segmentation and pixel significance estimation-based parallax estimation method
CN113807417B (en) Dense matching method and system based on deep learning visual field self-selection network
CN112907641A (en) Multi-view depth estimation method based on detail information preservation
CN115375746A (en) Stereo Matching Method Based on Dual Spatial Pooling Pyramid
CN113361375A (en) Vehicle target identification method based on improved BiFPN

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant