CN116823914A - Unsupervised focal stack depth estimation method based on all-focusing image synthesis - Google Patents
Unsupervised focal stack depth estimation method based on all-focusing image synthesis
- Publication number
- CN116823914A CN116823914A CN202311101094.7A CN202311101094A CN116823914A CN 116823914 A CN116823914 A CN 116823914A CN 202311101094 A CN202311101094 A CN 202311101094A CN 116823914 A CN116823914 A CN 116823914A
- Authority
- CN
- China
- Prior art keywords
- image
- focus
- representing
- focal stack
- pyramid
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
Description
Technical Field

The present invention relates to the technical field of monocular depth estimation, and in particular to an unsupervised focal stack depth estimation method based on all-in-focus image synthesis.

Background Art

Supervised methods achieve high accuracy on depth estimation tasks, but they are limited by their need for ground-truth depth, which can be difficult to obtain in practical application scenarios. In recent years, with the continuous development of deep learning and continued exploration in computer vision, unsupervised monocular depth estimation has made great progress. Unsupervised monocular depth estimation refers to inferring the depth information of a scene with computer vision algorithms in the absence of depth labels. Unsupervised focal stack depth estimation can be divided into two categories: reconstruction supervision and auxiliary supervision.

Reconstruction supervision trains the network with its reconstruction loss and thereby learns depth information. It treats unsupervised focal stack depth estimation as a special case of multi-view monocular depth estimation: scene depth is estimated by exploiting the blur differences within the focus sequence, the stack is then refocused using the focus map and the estimated intermediate depth, a focal stack is output, and the reconstruction loss provides the supervision. However, because depth estimation is an ill-posed task, the reconstruction model easily produces multiple competing depth solutions, making the optimal solution hard to determine, so the network is very unstable; at the same time, the intermediate representation is easily interpreted as a compressed encoding of the focal stack, which makes the model difficult to converge, so additional losses usually have to be introduced to constrain the intermediate representation.

Auxiliary supervision instead guides the learning process with auxiliary information in the unsupervised setting, using the all-in-focus image as auxiliary supervision. This approach first feeds the focal stack into an encoder-decoder structure and outputs a depth distribution probability for each focus distance; combining these probabilities with the focal stack and the focus distances respectively yields an all-in-focus image together with a relatively coarse depth map. However, the model has certain limitations, such as a large number of parameters and the requirement that the dataset itself provide all-in-focus images as supervision, so its applicability is severely limited. Therefore, how to provide an unsupervised focal stack depth estimation method based on all-in-focus image synthesis is an urgent problem for those skilled in the art.
Summary of the Invention

One object of the present invention is to propose an unsupervised focal stack depth estimation method based on all-in-focus image synthesis. The invention shows relatively high accuracy and good generalization in depth prediction, is suitable for depth estimation tasks in different scenarios, and is highly practical.

An unsupervised focal stack depth estimation method based on all-in-focus image synthesis according to an embodiment of the present invention includes:

S1. Compute all-in-focus images with both an image-pyramid-based all-in-focus image synthesis method and a focus-measure-operator-based all-in-focus image synthesis method, fuse the resulting all-in-focus images, and use the fused image as supervision information;

S2. Perform high-frequency noise filtering and preliminary feature extraction on the focal stack with a three-dimensional perception module to obtain coarsely extracted features; at the same time, pass the focal stack through a difference value calculation module to obtain features that encode the ambiguity of defocus blur; concatenate the coarsely extracted features with the blur-ambiguity features to obtain the focus volume;

S3. Introduce a three-dimensional polarized self-attention mechanism into the focal stack, dividing the input focus-volume features into a channel-polarized feature map and a spatial-polarized feature map;

S4. Pass the above channel-polarized and spatial-polarized feature maps through a depth probability prediction module, which locates the layer of maximum sharpness in the focal stack and outputs the corresponding probability values; the layer of best sharpness is determined and the all-in-focus image is obtained.

Optionally, the image pyramid specifically includes:
Gaussian pyramid downsampling: the original image is taken as the bottom level $G_0$ of the Gaussian pyramid, at the resolution of the input image, and the $i$-th level of the Gaussian pyramid is defined as:

$$G_i = \mathrm{Down}\left(w * G_{i-1}\right);$$

where $*$ denotes the convolution operation, $w$ denotes a convolution kernel of the given size, and $\mathrm{Down}(\cdot)$ denotes the downsampling process that removes the even rows and even columns of the input image;

Downsampling reduces the resolution of the input image to one quarter; by iterating the above step, the whole Gaussian pyramid is obtained;

Gaussian pyramid upsampling: the original image is enlarged to twice its size in each direction, with the newly added rows and columns filled with zeros, and the enlarged image is convolved with the same kernel as before multiplied by four, giving the reconstructed image;

The Laplacian pyramid is introduced on the reconstructed images; let $L_i$ denote the $i$-th level of the Laplacian pyramid:

$$L_i = G_i - 4\,w * \mathrm{Up}\left(G_{i+1}\right);$$

where $\mathrm{Up}(\cdot)$ denotes the upsampling process, i.e. the image is enlarged to twice its size in each direction and the newly added rows and columns are filled with zeros;

The original image is thus decomposed into a Gaussian pyramid and a Laplacian pyramid; the same decomposition is performed on every image in the focal stack, yielding a set of image pyramids.
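As an illustration of the decomposition described above, the following is a minimal NumPy sketch; the 5×5 Gaussian kernel, the four pyramid levels and the border handling are assumptions (the text does not fix them), and a colour image would be processed per channel:

```python
import numpy as np
from scipy.ndimage import convolve

def gaussian_kernel_2d(size=5, sigma=1.0):
    # Assumed 5x5 Gaussian kernel w; the text does not specify the kernel size.
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

def pyr_down(img, w):
    # Blur with w, then keep every other row/column (the Down operator).
    return convolve(img, w, mode="reflect")[::2, ::2]

def pyr_up(img, w, out_shape):
    # Zero-insertion upsampling followed by convolution with 4*w (the Up operator).
    up = np.zeros(out_shape, dtype=float)
    up[::2, ::2] = img
    return convolve(up, 4.0 * w, mode="reflect")

def build_pyramids(img, levels=4, w=None):
    # Returns the Gaussian pyramid [G_0..G_levels] and the Laplacian pyramid
    # [L_0..L_{levels-1}] with L_i = G_i - 4*w (*) Up(G_{i+1}).
    if w is None:
        w = gaussian_kernel_2d()
    gauss = [np.asarray(img, dtype=float)]
    for _ in range(levels):
        gauss.append(pyr_down(gauss[-1], w))
    lap = [gauss[i] - pyr_up(gauss[i + 1], w, gauss[i].shape) for i in range(levels)]
    return gauss, lap
```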
Optionally, the fusion process of the image pyramids specifically includes:

Given a focal stack sequence:

$$F = \left\{\, I_n(x, y) \mid n = 1, 2, \dots, N \,\right\};$$

where $(x, y)$ denotes the spatial coordinates of a pixel and $N$ denotes the number of images in the focus sequence, each image corresponding to a specific focus distance;

The focal stack $F$ is decomposed into image pyramids, giving Gaussian pyramids $G_n^l$ and Laplacian pyramids $L_n^l$, where $l$ indexes the pyramid levels;

A focus measure is computed at every position $(x, y)$ of the Laplacian pyramids, and the index map $D^l(x, y)$ corresponding to maximum sharpness is obtained; the all-in-focus Laplacian pyramid $L_{AiF}^l$ is generated from the index map and the Laplacian pyramids:

$$L_{AiF}^{l}(x, y) = L_{D^{l}(x, y)}^{l}(x, y);$$

The all-in-focus Laplacian pyramid $L_{AiF}^l$ is upsampled from top to bottom using $\mathrm{Up}(\cdot)$, giving the all-in-focus image corresponding to the focal stack.
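A sketch of this fusion step, reusing the pyramid helpers from the previous sketch (same imports); the concrete focus measure used here (a windowed sum of absolute Laplacian responses rather than the regional information entropy of the method) and the averaging of the coarsest Gaussian level across slices are simplifying assumptions:

```python
def focus_measure(lap_level, win=5):
    # Local sharpness proxy on one Laplacian level.
    return convolve(np.abs(lap_level), np.ones((win, win)), mode="reflect")

def fuse_focal_stack(stack, levels=4):
    # stack: list of N grayscale slices. Per-level argmax selection on the
    # Laplacian pyramids builds L_AiF, which is then collapsed top-down.
    w = gaussian_kernel_2d()
    pyramids = [build_pyramids(img, levels, w) for img in stack]
    fused_lap = []
    for l in range(levels):
        measures = np.stack([focus_measure(lap[l]) for _, lap in pyramids])
        index_map = np.argmax(measures, axis=0)                        # D^l(x, y)
        level_stack = np.stack([lap[l] for _, lap in pyramids])
        fused_lap.append(np.take_along_axis(level_stack, index_map[None], axis=0)[0])
    top = np.mean([gauss[levels] for gauss, _ in pyramids], axis=0)    # coarsest level (assumed averaged)
    recon = top
    for l in range(levels - 1, -1, -1):                                # top-down reconstruction with Up
        recon = pyr_up(recon, w, fused_lap[l].shape) + fused_lap[l]
    return recon
```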
Optionally, the image-pyramid-based all-in-focus image synthesis method specifically includes: performing image pyramid decomposition on the input focal stack $F$ to obtain Gaussian pyramids $G_n^l$ and Laplacian pyramids $L_n^l$; computing the regional information entropy on the Laplacian pyramids $L_n^l$ to obtain the focus-measure sharpness value at each level; extracting, at each level, the slice with the largest sharpness value as the all-in-focus content of that level; and reconstructing the final all-in-focus image.

Optionally, the focus-measure-operator-based all-in-focus image synthesis method includes: applying a small-neighborhood fusion operator to each image of the focus sequence to obtain the focus-measure sharpness value of each image, performing index maximization to determine the index corresponding to the best sharpness, and extracting the pixel values from the focal stack according to that index to form the all-in-focus image.

Optionally, the focus-measure-operator-based all-in-focus image synthesis method specifically includes:
Comprehensive features are obtained by converting the vector-valued image into a scalar-valued image through vector operations:

Let $\mathbf{v}$ denote a vector-valued pixel and $u$ a scalar-valued pixel. A small patch of the chosen size is selected in the vector-valued image, with $\mathbf{v}_c$ the central vector-valued pixel and $\mathbf{v}_i$ the vector-valued pixels inside the window $\Omega$;

The scalar-valued pixel $u$ corresponding to the vector-valued pixel $\mathbf{v}_c$ is obtained by scaling the lengths of the difference vectors within the window;

The difference vector $\mathbf{d}_i$ is obtained as the difference between each of the other vectors $\mathbf{v}_i$ in the window $\Omega$ and the central vector $\mathbf{v}_c$:

$$\mathbf{d}_i = \mathbf{v}_i - \mathbf{v}_c;$$

A scalar value is then formed from the dot products of the resulting vectors and scaled by a local adaptive scaling factor; the dot products between the difference vectors are computed to measure the similarity between features, and the cross-product length between the difference vectors $\mathbf{d}_i$ and the central vector $\mathbf{v}_c$ is also provided;

The resulting scalar-valued image is subjected to an index maximization operation to evaluate the sharpness of each slice, and the pixel values at the corresponding positions are extracted from the input focal stack according to the index of best sharpness, giving the corresponding all-in-focus image.
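The exact scalar-valued measure is not fully specified above; the following NumPy sketch shows one possible reading, in which the pairwise dot products of the difference vectors and the cross-product lengths against the central vector are combined under a local scaling factor. The combination rule, the window size and the scaling factor are assumptions made purely for illustration:

```python
import numpy as np

def vector_focus_measure(img_rgb, win=3, eps=1e-6):
    # img_rgb: (H, W, 3) vector-valued image -> (H, W) scalar-valued image.
    h, w, _ = img_rgb.shape
    r = win // 2
    padded = np.pad(np.asarray(img_rgb, dtype=float), ((r, r), (r, r), (0, 0)), mode="reflect")
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            patch = padded[y:y + win, x:x + win].reshape(-1, 3)
            vc = padded[y + r, x + r]
            d = patch - vc                                            # difference vectors d_i
            dots = np.abs(d @ d.T).sum()                              # pairwise dot products (similarity)
            crosses = np.linalg.norm(np.cross(d, vc), axis=1).sum()   # cross-product lengths |d_i x v_c|
            alpha = 1.0 / (np.linalg.norm(d, axis=1).sum() + eps)     # local adaptive scaling (assumed form)
            out[y, x] = alpha * (dots + crosses)
    return out

def all_in_focus_from_measure(stack_rgb):
    # stack_rgb: (N, H, W, 3). Index maximization over the N slices per pixel.
    stack_rgb = np.asarray(stack_rgb, dtype=float)
    measures = np.stack([vector_focus_measure(img) for img in stack_rgb])
    idx = np.argmax(measures, axis=0)
    return np.take_along_axis(stack_rgb, idx[None, ..., None], axis=0)[0]
```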
Optionally, the three-dimensional perception module performs the high-frequency noise filtering and preliminary feature extraction of the focal stack through a four-layer network structure; the three-dimensional perception module contains several parallel convolutional layers with different kernel sizes and strides, which capture blur features at different scales;

Step S2 specifically includes:

S21. Filter the focal stack with a 3D convolutional network and extract blur features;

S22. Introduce a difference value calculation module into the network structure and feed the blur features into it; the difference value calculation module computes the difference values between the three RGB channels and fuses them into an RGB channel-difference feature, where R, G and B denote the different colour dimensions of the input feature;

S23. Pass the result through a downsampling layer to obtain the RGB difference features, and fuse the RGB difference features with the blur features, constructing a focus volume that incorporates the ambiguity of defocus blur.
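A hedged PyTorch sketch of steps S21–S23; the exact fusion rule for the RGB channel differences and the layer configuration of the three-dimensional perception module are not given in the text, so the summed absolute pairwise differences and the simplified two-layer perception branch below are assumptions (even spatial sizes are assumed):

```python
import torch
import torch.nn as nn

class ChannelDifferenceModule(nn.Module):
    # Difference value calculation module: pairwise R/G/B differences fused by
    # summing their absolute values (assumed fusion rule), then downsampled.
    def __init__(self):
        super().__init__()
        self.down = nn.AvgPool3d(kernel_size=(1, 2, 2), stride=(1, 2, 2))

    def forward(self, stack):
        # stack: (B, 3, N, H, W) RGB focal stack with N focus slices.
        r, g, b = stack[:, 0], stack[:, 1], stack[:, 2]
        diff = (r - g).abs() + (g - b).abs() + (r - b).abs()
        return self.down(diff.unsqueeze(1))            # (B, 1, N, H/2, W/2)

class FocusVolumeBuilder(nn.Module):
    # Coarse 3D features from the perception branch concatenated with the
    # blur-ambiguity features from the difference branch (the focus volume).
    def __init__(self, out_ch=16):
        super().__init__()
        self.perception = nn.Sequential(               # stand-in for the four-layer module
            nn.Conv3d(3, out_ch, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.diff = ChannelDifferenceModule()

    def forward(self, stack):
        feat = self.perception(stack)                  # (B, out_ch, N, H/2, W/2)
        amb = self.diff(stack)                         # (B, 1,      N, H/2, W/2)
        return torch.cat([feat, amb], dim=1)           # focus volume
```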
Optionally, the channel-polarized feature map is obtained by applying a polarized transformation to the input feature map $x$:

The polarized transformation converts the input feature map $x$ into two sets of basis vectors $q$ and $k$;

where $q$ and $k$ correspond to the query and key at the channel level;

The similarity score $A^{ch}$ of $q$ and $k$ is computed as:

$$A^{ch}(x) = F_{SG}\!\left[\, W_{z}\!\left( \sigma_1\!\left(W_v(x)\right) \times F_{SM}\!\left(\sigma_2\!\left(W_q(x)\right)\right) \right) \right];$$

where $F_{SG}$ denotes the activation function, $F_{SM}$ denotes the softmax normalized exponential function, $W_v$, $W_q$ and $W_z$ denote 1×1 three-dimensional convolution layers, $\sigma_1$ and $\sigma_2$ denote two tensor reshaping operators, × denotes element-level multiplication, and the number of channels between $W_v$, $W_q$ and $W_z$ is the internal channel number;

Using the score $A^{ch}$ as weights, the input vectors are weighted and summed, giving the channel-polarized feature map $x^{ch}$ that carries the channel correlations:

$$x^{ch} = A^{ch}(x) \odot^{ch} x;$$

where $\odot^{ch}$ denotes the channel-wise multiplication operator.
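A possible 3D realization of the channel-polarized branch in PyTorch; reducing the internal channel number to C/2 and normalizing before the final activation follow the usual polarized self-attention layout and are assumptions about the exact architecture:

```python
import torch
import torch.nn as nn

class ChannelPolarizedAttention3d(nn.Module):
    def __init__(self, channels):
        super().__init__()
        mid = channels // 2                            # assumed internal channel number
        self.w_q = nn.Conv3d(channels, 1, kernel_size=1)
        self.w_v = nn.Conv3d(channels, mid, kernel_size=1)
        self.w_z = nn.Conv3d(mid, channels, kernel_size=1)
        self.softmax = nn.Softmax(dim=-1)
        self.norm = nn.LayerNorm(channels)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # x: (B, C, N, H, W) focus volume
        b, c, n, h, w = x.shape
        q = self.softmax(self.w_q(x).view(b, 1, n * h * w))       # F_SM(sigma_2(W_q(x)))
        v = self.w_v(x).view(b, c // 2, n * h * w)                 # sigma_1(W_v(x))
        z = torch.matmul(v, q.transpose(1, 2)).view(b, c // 2, 1, 1, 1)
        attn = self.w_z(z).view(b, c)                              # W_z(...)
        attn = self.sigmoid(self.norm(attn)).view(b, c, 1, 1, 1)   # F_SG
        return attn * x                                             # channel-wise reweighting of x
```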
Optionally, obtaining the spatial-polarized feature map includes:

The input channel-polarized feature map $x^{ch}$ is subjected to a polarized transformation, giving two sets of polarized vectors $q$ and $k$;

where $q$ is obtained by global pooling over the three channels to capture global spatial features, and $k$ rearranges the pixels of the input feature map through three-dimensional convolution to enhance features along different spatial directions;

The similarity matrix $A^{sp}$ is computed from the two sets of polarized vectors:

$$A^{sp}\!\left(x^{ch}\right) = F_{SG}\!\left[\, \sigma_3\!\left( F_{SM}\!\left( \sigma_1\!\left( F_{GP}\!\left( W_q\!\left(x^{ch}\right) \right) \right) \right) \times \sigma_2\!\left( W_v\!\left(x^{ch}\right) \right) \right) \right];$$

where $W_q$ and $W_v$ denote standard 1×1 three-dimensional convolution layers sharing an intermediate channel parameter, $\sigma_1$, $\sigma_2$ and $\sigma_3$ denote three tensor reshaping operations, × denotes the matrix dot-product operation, and $F_{GP}$ denotes global pooling;

The corresponding weights are obtained from the similarity matrix and combined with the input channel-polarized features by weighted summation, giving the comprehensive self-attention feature representation $z$ that associates channel and spatial features:

$$z = A^{sp}\!\left(x^{ch}\right) \odot^{sp} x^{ch};$$

where $\odot^{sp}$ denotes the spatial multiplication operator.
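A matching sketch of the spatial-polarized branch, plus a wrapper that applies the channel branch from the previous sketch followed by the spatial branch as in step S3; the internal channel reduction is again an assumption:

```python
import torch
import torch.nn as nn

class SpatialPolarizedAttention3d(nn.Module):
    def __init__(self, channels):
        super().__init__()
        mid = channels // 2                            # assumed internal channel number
        self.w_q = nn.Conv3d(channels, mid, kernel_size=1)
        self.w_v = nn.Conv3d(channels, mid, kernel_size=1)
        self.softmax = nn.Softmax(dim=-1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x_ch):
        # x_ch: (B, C, N, H, W) channel-polarized feature map
        b, c, n, h, w = x_ch.shape
        q = self.w_q(x_ch).mean(dim=(2, 3, 4)).view(b, 1, c // 2)   # sigma_1(F_GP(W_q(x)))
        q = self.softmax(q)                                          # F_SM
        v = self.w_v(x_ch).view(b, c // 2, n * h * w)                # sigma_2(W_v(x))
        attn = self.sigmoid(torch.matmul(q, v).view(b, 1, n, h, w))  # sigma_3 + F_SG
        return attn * x_ch                                            # spatial reweighting

class PolarizedSelfAttention3d(nn.Module):
    # Channel branch (previous sketch) followed by the spatial branch.
    def __init__(self, channels):
        super().__init__()
        self.channel = ChannelPolarizedAttention3d(channels)
        self.spatial = SpatialPolarizedAttention3d(channels)

    def forward(self, x):
        return self.spatial(self.channel(x))
```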
Optionally, step S4 specifically includes:

S41. After an encoder-decoder network with the pooling layers removed, divide the output of the focal stack depth estimation network into multiple layers, each layer corresponding to a specific focus distance;

S42. Apply a Softmax operation across the layers to determine the layer of best sharpness, obtaining the best focus position and hence the all-in-focus image;

S43. Obtain the final depth estimation result by a weighted sum of the multi-layer probability values.
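A minimal sketch of the layered prediction in S41–S43: a Softmax across the focus layers gives per-pixel probabilities, the depth is their weighted sum over the focus distances, and the same weights can compose an all-in-focus image; the tensor layout is assumed:

```python
import torch
import torch.nn as nn

class LayeredDepthHead(nn.Module):
    def forward(self, logits, focus_dists, stack):
        # logits:      (B, N, H, W) network output, one layer per focus distance
        # focus_dists: (N,) focus distance associated with each slice
        # stack:       (B, N, 3, H, W) input focal stack
        prob = torch.softmax(logits, dim=1)                          # S42: Softmax across layers
        depth = (prob * focus_dists.view(1, -1, 1, 1)).sum(dim=1)    # S43: probability-weighted depth
        aif = (prob.unsqueeze(2) * stack).sum(dim=1)                 # all-in-focus composition
        return depth, aif
```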
The beneficial effects of the present invention are:

The present invention first synthesizes an all-in-focus image and uses it as supervision information, and then performs depth estimation through a coarse feature extraction module, a polarized self-attention module and a layered depth estimation module. Synthesizing an all-in-focus image from the focal stack for use as supervision, and exploiting the association capability of the self-attention mechanism to obtain scene depth, give the invention relatively high accuracy and good generalization in depth prediction; it is suitable for depth estimation tasks in different scenarios and is highly practical.
Brief Description of the Drawings

The drawings are provided for further understanding of the present invention and constitute a part of the specification; together with the embodiments of the present invention they serve to explain the invention and do not limit it. In the drawings:

Figure 1 shows the unsupervised focal stack depth estimation model of the unsupervised focal stack depth estimation method based on all-in-focus image synthesis proposed by the present invention;

Figure 2 is a structural block diagram of the focus-measure sharpness computation in the proposed method;

Figure 3 is a qualitative comparison of all-in-focus image synthesis in the proposed method;

Figure 4 is a structural block diagram of the three-dimensional perception module in the proposed method;

Figure 5 is a structural block diagram of the channel difference module in the proposed method;

Figure 6 is a visual comparison of generalization performance on DefocusNet for the proposed method;

Figure 7 is a visual comparison of generalization performance on MobileDepth for the proposed method.
Detailed Description of the Embodiments

The present invention will now be described in further detail with reference to the accompanying drawings. The drawings are simplified schematic diagrams that illustrate the basic structure of the invention only schematically, and therefore show only the parts related to the invention.

Referring to Figure 1, an unsupervised focal stack depth estimation method based on all-in-focus image synthesis includes:

S1. Compute all-in-focus images with both an image-pyramid-based all-in-focus image synthesis method and a focus-measure-operator-based all-in-focus image synthesis method, fuse the resulting all-in-focus images, and use the fused image as supervision information;

Referring to Figure 2, this embodiment shows the process by which the two methods synthesize an all-in-focus image.
In the figure, $F$ denotes the focus sequence. Gaussian pyramid downsampling: the original image is taken as the bottom level $G_0$ of the Gaussian pyramid, at the resolution of the input image, and the $i$-th level of the Gaussian pyramid is defined as:

$$G_i = \mathrm{Down}\left(w * G_{i-1}\right);$$

where $*$ denotes the convolution operation, $w$ denotes a convolution kernel of the given size, and $\mathrm{Down}(\cdot)$ denotes the downsampling process that removes the even rows and even columns of the input image;

Downsampling reduces the resolution of the input image to one quarter; by iterating the above step, the whole Gaussian pyramid is obtained;

The $i$-th level $L_i$ of the Laplacian pyramid is then:

$$L_i = G_i - 4\,w * \mathrm{Up}\left(G_{i+1}\right);$$

where $\mathrm{Up}(\cdot)$ denotes the upsampling process, i.e. the image is enlarged to twice its size in each direction and the newly added rows and columns are filled with zeros;

The original image is thus decomposed into a Gaussian pyramid and a Laplacian pyramid; the same decomposition is performed on every image in the focal stack, yielding a set of image pyramids.
In this embodiment, the fusion process of the image pyramids specifically includes:

Given a focal stack sequence:

$$F = \left\{\, I_n(x, y) \mid n = 1, 2, \dots, N \,\right\};$$

where $(x, y)$ denotes the spatial coordinates of a pixel and $N$ denotes the number of images in the focus sequence, each image corresponding to a specific focus distance;

The focal stack $F$ is decomposed into image pyramids, giving Gaussian pyramids $G_n^l$ and Laplacian pyramids $L_n^l$, where $l$ indexes the pyramid levels;

A focus measure is computed at every position $(x, y)$ of the Laplacian pyramids, and the index map $D^l(x, y)$ corresponding to maximum sharpness is obtained; the all-in-focus Laplacian pyramid $L_{AiF}^l$ is generated from the index map and the Laplacian pyramids: $L_{AiF}^{l}(x, y) = L_{D^{l}(x, y)}^{l}(x, y)$;

The all-in-focus Laplacian pyramid $L_{AiF}^l$ is upsampled from top to bottom using $\mathrm{Up}(\cdot)$, giving the all-in-focus image corresponding to the focal stack.
In this embodiment, the image-pyramid-based all-in-focus image synthesis method specifically includes decomposing the input focal stack $F$ to obtain Gaussian pyramids $G_n^l$ and Laplacian pyramids $L_n^l$; since the whole decomposition process is fully reversible, this image transformation incurs no information loss. The regional information entropy is computed on the Laplacian pyramids $L_n^l$ to obtain the focus-measure sharpness value at each level, the slice with the largest sharpness value is extracted as the all-in-focus content of the corresponding level, and the final all-in-focus image is reconstructed.

In this embodiment, the focus-measure-operator-based all-in-focus image synthesis method includes applying a small-neighborhood fusion operator to each image of the focus sequence to obtain the focus-measure sharpness value of each image, performing index maximization to determine the index corresponding to the best sharpness, and extracting the pixel values from the focal stack according to that index to form the all-in-focus image.

The all-in-focus image fusion algorithm of the present invention, based on image pyramids and a small-window fusion operator, can synthesize high-quality all-in-focus images. The proposed model exploits a global association structure to effectively improve the accuracy of depth prediction, while the lightweight design gives the model real-time inference capability.

Referring to Figure 3, in this embodiment the focus-measure-operator-based all-in-focus image synthesis method specifically includes:
Comprehensive features are obtained by converting the vector-valued image into a scalar-valued image through vector operations:

Let $\mathbf{v}$ denote a vector-valued pixel and $u$ a scalar-valued pixel. A small patch of the chosen size is selected in the vector-valued image, with $\mathbf{v}_c$ the central vector-valued pixel and $\mathbf{v}_i$ the vector-valued pixels inside the window $\Omega$;

The scalar-valued pixel $u$ corresponding to the vector-valued pixel $\mathbf{v}_c$ is obtained by scaling the lengths of the difference vectors within the window;

The difference vector $\mathbf{d}_i$ is obtained as the difference between each of the other vectors $\mathbf{v}_i$ in the window $\Omega$ and the central vector $\mathbf{v}_c$:

$$\mathbf{d}_i = \mathbf{v}_i - \mathbf{v}_c;$$

A scalar value is then formed from the dot products of the resulting vectors and scaled by a local adaptive scaling factor, which plays an important role in computing the scalar feature image; the dot products between the difference vectors are computed to measure the similarity between features, and the cross-product length between the difference vectors $\mathbf{d}_i$ and the central vector $\mathbf{v}_c$ is also provided;

The resulting scalar-valued image is subjected to an index maximization operation to evaluate the sharpness of each slice, and the pixel values at the corresponding positions are extracted from the input focal stack according to the index of best sharpness, giving the corresponding all-in-focus image. With this method, high-quality all-in-focus images can be synthesized from the focus sequence.
S2. Perform high-frequency noise filtering and preliminary feature extraction on the focal stack with a three-dimensional perception module to obtain coarsely extracted features; at the same time, pass the focal stack through a difference value calculation module to obtain features that encode the ambiguity of defocus blur; concatenate the coarsely extracted features with the blur-ambiguity features to obtain the focus volume;

In this embodiment, the three-dimensional perception module performs the high-frequency noise filtering and preliminary feature extraction of the focal stack through a four-layer network structure; the three-dimensional perception module contains several parallel convolutional layers with different kernel sizes and strides, which capture blur features at different scales;

Referring to Figure 4, S2 specifically includes:

S21. Filter the focal stack with a 3D convolutional network and extract blur features;

S22. Introduce a difference value calculation module into the network structure and feed the blur features into it; the difference value calculation module computes the difference values between the three RGB channels and fuses them into an RGB channel-difference feature, where R, G and B denote the different colour dimensions of the input feature;

S23. Pass the result through a downsampling layer to obtain the RGB difference features, and fuse the RGB difference features with the blur features, constructing a focus volume that incorporates the ambiguity of defocus blur.

S3. Introduce a three-dimensional polarized self-attention mechanism into the focal stack, dividing the input focus-volume features into a channel-polarized feature map and a spatial-polarized feature map;
In this embodiment, the channel-polarized feature map is obtained by applying a polarized transformation to the input feature map $x$:

The polarized transformation converts the input feature map $x$ into two sets of basis vectors $q$ and $k$;

where $q$ and $k$ correspond to the query and key at the channel level;

The similarity score $A^{ch}$ of $q$ and $k$ is computed as:

$$A^{ch}(x) = F_{SG}\!\left[\, W_{z}\!\left( \sigma_1\!\left(W_v(x)\right) \times F_{SM}\!\left(\sigma_2\!\left(W_q(x)\right)\right) \right) \right];$$

where $F_{SG}$ denotes the activation function, $F_{SM}$ denotes the softmax normalized exponential function, $W_v$, $W_q$ and $W_z$ denote 1×1 three-dimensional convolution layers, $\sigma_1$ and $\sigma_2$ denote two tensor reshaping operators, × denotes element-level multiplication, and the number of channels between $W_v$, $W_q$ and $W_z$ is the internal channel number;

Using the score $A^{ch}$ as weights, the input vectors are weighted and summed, giving the channel-polarized feature map $x^{ch}$ that carries the channel correlations:

$$x^{ch} = A^{ch}(x) \odot^{ch} x;$$

where $\odot^{ch}$ denotes the channel-wise multiplication operator.
In this embodiment, obtaining the spatial-polarized feature map includes:

The input channel-polarized feature map $x^{ch}$ is subjected to a polarized transformation, giving two sets of polarized vectors $q$ and $k$;

where $q$ is obtained by global pooling over the three channels to capture global spatial features, and $k$ rearranges the pixels of the input feature map through three-dimensional convolution to enhance features along different spatial directions;

The similarity matrix $A^{sp}$ is computed from the two sets of polarized vectors:

$$A^{sp}\!\left(x^{ch}\right) = F_{SG}\!\left[\, \sigma_3\!\left( F_{SM}\!\left( \sigma_1\!\left( F_{GP}\!\left( W_q\!\left(x^{ch}\right) \right) \right) \right) \times \sigma_2\!\left( W_v\!\left(x^{ch}\right) \right) \right) \right];$$

where $W_q$ and $W_v$ denote standard 1×1 three-dimensional convolution layers sharing an intermediate channel parameter, $\sigma_1$, $\sigma_2$ and $\sigma_3$ denote three tensor reshaping operations, × denotes the matrix dot-product operation, and $F_{GP}$ denotes global pooling;

The corresponding weights are obtained from the similarity matrix and combined with the input channel-polarized features by weighted summation, giving the comprehensive self-attention feature representation $z$ that associates channel and spatial features:

$$z = A^{sp}\!\left(x^{ch}\right) \odot^{sp} x^{ch};$$

where $\odot^{sp}$ denotes the spatial multiplication operator.
It should be noted that all of the convolution operations and tensor reshaping operations above are carried out over the three channel dimensions; the three-dimensional polarized self-attention mechanism can therefore take channel correlation and spatial blur correlation into account simultaneously.

The model proposed by the present invention performs well on relatively small focal stacks while also having excellent generalization ability.
S4. Pass the above channel-polarized and spatial-polarized feature maps through a depth probability prediction module, which locates the layer of maximum sharpness in the focal stack and outputs the corresponding probability values; the layer of best sharpness is determined and the all-in-focus image is obtained.

In this embodiment, S4 specifically includes:

S41. After an encoder-decoder network with the pooling layers removed, divide the output of the focal stack depth estimation network into multiple layers, each layer corresponding to a specific focus distance;

S42. Apply a Softmax operation across the layers to determine the layer of best sharpness, obtaining the best focus position and hence the all-in-focus image;

During testing, the blur information in the input focus sequence is used to determine the layer in which the target depth lies, and the depth probability value is computed with the probability density function of the corresponding layer.

S43. Obtain the final depth estimation result by a weighted sum of the multi-layer probability values.
In Embodiment 1:

The present invention was evaluated quantitatively on the 4D Light Field, DefocusNet and FlyingThings3D datasets:

As can be seen from Table 1 above, the proposed all-in-focus image synthesis method can synthesize fairly accurate all-in-focus images from relatively small focal stacks.

Tables 2 to 4 above give the quantitative comparison between the present invention and the latest methods on the 4D Light Field, DefocusNet and FlyingThings3D datasets.

As can be seen from Tables 1 to 4 above, the results on the 4D Light Field dataset show that, in unsupervised depth estimation, the present invention improves on the AiFDepthNet method by 42.5% and 26.3% in the MSE and RMSE metrics respectively. In comparison with supervised methods, this method surpasses most of them, including VDFF, PSPNet and DDFF; even compared with the DefocusNet method, the performance gaps in MSE and RMSE are only 15.0% and 4.6%. The results on the DefocusNet and FlyingThings3D datasets show that, relative to the AiFDepthNet method, this method achieves higher accuracy in the MAE, MSE and RMSE metrics. Compared with the 16M parameters of the AiFDepthNet method, this method also has a smaller parameter count of 3.3M, and is therefore more computationally efficient.
The present invention first synthesizes an all-in-focus image and uses it as supervision information, and then performs depth estimation through a coarse feature extraction module, a polarized self-attention module and a layered depth estimation module. Synthesizing an all-in-focus image from the focal stack for use as supervision, and exploiting the association capability of the self-attention mechanism to obtain scene depth, give the invention relatively high accuracy and good generalization in depth prediction; it is suitable for depth estimation tasks in different scenarios and is highly practical.

The above are only preferred specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any equivalent substitution or modification made, within the technical scope disclosed by the present invention, by a person skilled in the art according to the technical solutions of the present invention and their inventive concept shall fall within the protection scope of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311101094.7A CN116823914B (en) | 2023-08-30 | 2023-08-30 | Unsupervised focal stack depth estimation method based on all-focusing image synthesis |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116823914A true CN116823914A (en) | 2023-09-29 |
CN116823914B CN116823914B (en) | 2024-01-09 |
Family
ID=88141360
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311101094.7A Active CN116823914B (en) | 2023-08-30 | 2023-08-30 | Unsupervised focal stack depth estimation method based on all-focusing image synthesis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116823914B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120218386A1 (en) * | 2011-02-28 | 2012-08-30 | Duke University | Systems and Methods for Comprehensive Focal Tomography |
CN110246172A (en) * | 2019-06-18 | 2019-09-17 | 首都师范大学 | A kind of the light field total focus image extraction method and system of the fusion of two kinds of Depth cues |
CN110751160A (en) * | 2019-10-30 | 2020-02-04 | 华中科技大学 | Method, device and system for detecting object in image |
CN112465796A (en) * | 2020-12-07 | 2021-03-09 | 清华大学深圳国际研究生院 | Light field feature extraction method fusing focus stack and full-focus image |
US20220309696A1 (en) * | 2021-03-23 | 2022-09-29 | Mediatek Inc. | Methods and Apparatuses of Depth Estimation from Focus Information |
CN114792430A (en) * | 2022-04-24 | 2022-07-26 | 深圳市安软慧视科技有限公司 | Pedestrian re-identification method, system and related equipment based on polarized self-attention |
CN115830240A (en) * | 2022-12-14 | 2023-03-21 | 山西大学 | An Unsupervised Deep Learning 3D Reconstruction Method Based on Image Fusion Perspective |
Non-Patent Citations (3)
Title |
---|
TIAN, B, ET.AL: "Fine-grained multi-focus image fusion based on edge features", 《SCIENTIFIC REPORTS 》, vol. 13, no. 1 * |
ZHOU Meng et al.: "Depth estimation from focal stacks based on defocus blur characteristics", Journal of Computer Applications, page 2
ZHANG Xuefei: "Research on unsupervised deep learning models for monocular depth estimation", China Master's Theses Full-text Database
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118570070A (en) * | 2024-06-05 | 2024-08-30 | 深圳市斯贝达电子有限公司 | A super-resolution image enhancement method based on focal stacking |
Also Published As
Publication number | Publication date |
---|---|
CN116823914B (en) | 2024-01-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |