CN114820344A - Depth map enhancement method and device - Google Patents
Depth map enhancement method and device
- Publication number: CN114820344A
- Application number: CN202210295510.0A
- Authority: CN (China)
- Prior art keywords: feature, depth map, map, depth, view
- Prior art date: 2022-03-23
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T5/00 — Image enhancement or restoration
- G06T5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06F18/253 — Pattern recognition; fusion techniques of extracted features
- G06N3/045 — Neural networks; combinations of networks
- G06N3/08 — Neural networks; learning methods
- G06T2207/10028 — Image acquisition modality: range image; depth image; 3D point clouds
- G06T2207/20081 — Special algorithmic details: training; learning
- G06T2207/20084 — Special algorithmic details: artificial neural networks [ANN]
- G06T2207/20221 — Special algorithmic details: image fusion; image merging
Description
Technical Field
The present application relates to the field of computer vision, and in particular to a depth map enhancement method and device.
Background
With advances in technology, acquiring stereoscopic object models has become increasingly easy; depth maps in particular have attracted growing research attention due to their applications in fields such as autonomous driving and robotics.
Owing to factors such as low device resolution and occlusion, the depth data directly acquired by depth scanning devices such as lidar and depth cameras is usually of poor quality, mainly manifesting as incomplete depth spatial structure and information redundancy. Enhancement methods and systems for depth maps are therefore becoming increasingly important in practical engineering applications.
In the related art, completion and densification methods based on geometric features require the target object to satisfy some specific geometric symmetry before feature extraction and prediction can be performed on the acquired images. However, real-world objects differ greatly in geometry, so it is difficult to use a general enhancement model to complete and densify geometric features during depth acquisition, which leads to incomplete information and low accuracy and urgently needs improvement.
Summary of the Invention
The present application provides a depth map enhancement method and device to solve the technical problem in the related art that, because the target object must satisfy some specific geometric symmetry before feature extraction and prediction can be performed, it is difficult to use a general enhancement model to complete and densify geometric features during depth acquisition, resulting in incomplete information and low accuracy.
An embodiment of the first aspect of the present application provides a depth map enhancement method, including the following steps: obtaining an initial depth map from raw visual data; performing multi-scale feature extraction on the initial depth map with alternating convolution and deconvolution modules, and progressively restoring resolution with the deconvolution modules to obtain a feature view whose three channels are the three-dimensional coordinates of the depth map; performing scale compression and convolution on the feature view to obtain two-stage feature vectors in sequence; performing spatial transformation and feature extraction on the initial depth map to obtain depth map features, enhancing the depth map features based on the two-stage feature vectors, and restoring the depth structure to generate a first three-channel feature map; and mapping the feature view through a multilayer perceptron to obtain a second three-channel feature map, fusing the first three-channel feature map with the second three-channel feature map, and mapping the result through a multilayer perceptron to obtain the final depth map.
Optionally, in an embodiment of the present application, obtaining the initial depth map from the raw visual data includes: obtaining from the raw visual data a low-quality depth map that satisfies a preset condition; and performing preprocessing and feature extraction on the low-quality depth map and the raw infrared and depth map data to generate the initial depth map.
Optionally, in an embodiment of the present application, obtaining from the raw visual data a low-quality depth map that satisfies a preset condition includes: acquiring the raw visual data based on a preset acquisition viewing-angle threshold to obtain the low-quality depth map.
Optionally, in an embodiment of the present application, performing spatial transformation and feature extraction on the initial depth map to obtain depth map features, enhancing the depth map features based on the two-stage feature vectors, and restoring the depth structure to generate the first three-channel feature map includes: performing a spatial three-dimensional transformation on the initial depth map to obtain a three-dimensionally transformed depth; extracting a first feature from the transformed depth and fusing it with the view feature vector of the feature view to obtain a first-stage fused feature; extracting a second feature from the first-stage fused feature and fusing it with the view feature vector to obtain a second-stage fused feature; and mapping the second-stage fused feature through a multilayer perceptron to obtain the first three-channel feature map.
An embodiment of the second aspect of the present application provides a depth map enhancement device, including: an acquisition module for obtaining an initial depth map from raw visual data; a feature extraction module for performing multi-scale feature extraction on the initial depth map with alternating convolution and deconvolution modules and progressively restoring resolution with the deconvolution modules to obtain a feature view whose three channels are the three-dimensional coordinates of the depth map; a computation module for performing scale compression and convolution on the feature view to obtain two-stage feature vectors in sequence; an enhancement module for performing spatial transformation and feature extraction on the initial depth map to obtain depth map features, enhancing the depth map features based on the two-stage feature vectors, and restoring the depth structure to generate a first three-channel feature map; and a fusion module for mapping the feature view through a multilayer perceptron to obtain a second three-channel feature map, fusing the first three-channel feature map with the second three-channel feature map, and mapping the result through a multilayer perceptron to obtain the final depth map.
Optionally, in an embodiment of the present application, the acquisition module includes: an acquisition unit for obtaining from the raw visual data a low-quality depth map that satisfies a preset condition; and a preprocessing unit for performing preprocessing and feature extraction on the low-quality depth map and the raw infrared and depth map data to generate the initial depth map.
Optionally, in an embodiment of the present application, the acquisition unit is further configured to acquire the raw visual data based on a preset acquisition viewing-angle threshold to obtain the low-quality depth map.
Optionally, in an embodiment of the present application, the enhancement module includes: a three-dimensional transformation unit for performing a spatial three-dimensional transformation on the initial depth map to obtain a three-dimensionally transformed depth; a first fusion unit for extracting a first feature from the transformed depth and fusing it with the view feature vector of the feature view to obtain a first-stage fused feature; a second fusion unit for extracting a second feature from the first-stage fused feature and fusing it with the view feature vector to obtain a second-stage fused feature; and a mapping unit for mapping the second-stage fused feature through a multilayer perceptron to obtain the first three-channel feature map.
An embodiment of the third aspect of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor executing the program to implement the depth map enhancement method described in the above embodiments.
An embodiment of the fourth aspect of the present application provides a computer-readable storage medium on which a computer program is stored, the program being executed by a processor to implement the depth map enhancement method according to any one of claims 1-4.
In the embodiments of the present application, an initial depth map can be obtained from raw visual data; two-stage feature vectors and depth map features are extracted from the initial depth map and enhanced; the depth structure is restored; and the final depth map is obtained through multilayer perceptron mapping. While preserving the original depth geometry, the method can complete regions with missing shape in the raw data and densify the overall depth map, adapting to shape incompleteness and sparsity of different degrees and from different angles, and effectively improving the effective output accuracy of low-precision acquisition devices and the quality of data acquisition. This solves the technical problem in the related art that, because the target object must satisfy some specific geometric symmetry before feature extraction and prediction, it is difficult to use a general enhancement model to complete and densify geometric features during depth acquisition, resulting in incomplete information and low accuracy.
Additional aspects and advantages of the present application will be set forth in part in the following description, and in part will become apparent from the following description or be learned by practice of the present application.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flowchart of a depth map enhancement method provided according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the principle of a depth map enhancement method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of the enhancement network of a depth map enhancement method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the generator of a depth map enhancement method according to an embodiment of the present application;
FIG. 5 is a flowchart of a depth map enhancement method according to a specific embodiment of the present application;
FIG. 6 is a schematic structural diagram of a depth map enhancement device provided according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an electronic device provided according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, where the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, intended to explain the present application, and should not be construed as limiting it.
The depth map enhancement method and device of the embodiments of the present application are described below with reference to the accompanying drawings. Addressing the problem noted in the Background that, because the target object must satisfy some specific geometric symmetry before feature extraction and prediction, it is difficult to use a general enhancement model to complete and densify geometric features during depth acquisition, resulting in incomplete information and low accuracy, the present application provides a depth map enhancement method. In this method, an initial depth map can be obtained from raw visual data; two-stage feature vectors and depth map features are extracted from the initial depth map and enhanced; the depth structure is restored; and the final depth map is obtained through multilayer perceptron mapping. While preserving the original depth geometry, the method can complete regions with missing shape in the raw data and densify the overall depth map, adapting to shape incompleteness and sparsity of different degrees and from different angles, thereby effectively improving the effective output accuracy of low-precision acquisition devices and the quality of data acquisition; this solves the technical problem described above.
Specifically, FIG. 1 is a schematic flowchart of a depth map enhancement method provided by an embodiment of the present application.
As shown in FIG. 1, the depth map enhancement method includes the following steps:
In step S101, an initial depth map is obtained from raw visual data.
In practice, an embodiment of the present application may use a depth camera together with an ordinary camera for paired acquisition to obtain raw visual data; after the views and a sparse, incomplete low-quality depth map are obtained from the raw visual data, the initial depth map is obtained through processing including, but not limited to, data preprocessing.
Optionally, in an embodiment of the present application, obtaining the initial depth map from the raw visual data includes: obtaining from the raw visual data a low-quality depth map that satisfies a preset condition; and performing preprocessing and feature extraction on the low-quality depth map and the raw infrared and depth map data to generate the initial depth map.
As a possible implementation, an embodiment of the present application may obtain from the raw visual data a low-quality depth map that satisfies a preset condition, and perform preprocessing and feature extraction on the view, infrared, and depth map raw data to generate the initial depth map for processing in subsequent steps. The embodiments of the present application use infrared data as an aid to enhance the data output of low-power, low-performance depth cameras and lidar, as described in detail below.
It should be noted that the preset condition for the low-quality depth map may be set by those skilled in the art according to the actual situation and is not specifically limited here.
Optionally, in an embodiment of the present application, obtaining from the raw visual data a low-quality depth map that satisfies a preset condition includes: acquiring the raw visual data based on a preset acquisition viewing-angle threshold to obtain the low-quality depth map.
Specifically, an embodiment of the present application may use a depth camera together with an ordinary camera for paired acquisition and acquire the raw visual data based on a preset acquisition viewing-angle threshold to obtain the low-quality depth map. Acquiring the raw visual data based on a preset viewing-angle threshold facilitates the subsequent enhancement and densification of the low-quality depth map and lays the foundation for generating high-quality images.
It should be noted that the preset viewing-angle threshold varies with the acquisition target and may be set by those skilled in the art according to the actual situation; it is not specifically limited here.
In step S102, multi-scale feature extraction is performed on the initial depth map with alternating convolution and deconvolution modules, and the deconvolution modules progressively restore resolution to obtain a feature view whose three channels are the three-dimensional coordinates of the depth map.
Further, an embodiment of the present application may perform multi-scale feature extraction on the initial depth map obtained in the above steps with alternating convolution and deconvolution modules, and progressively restore resolution with the deconvolution modules to obtain the feature view whose three channels are the three-dimensional coordinates of the depth map. By progressively extracting geometric features at different levels, the embodiments of the present application realize multi-modal feature extraction, facilitating subsequent feature fusion and the restoration of high-precision images.
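The patent does not give exact layer counts or widths; as a rough illustration of an alternating convolution/deconvolution extractor that ends in a three-channel feature view, a minimal PyTorch sketch (all dimensions are assumptions) might look like this:

```python
import torch
import torch.nn as nn

class ConvDeconvExtractor(nn.Module):
    """Multi-scale feature extractor: convolutions downsample the initial
    depth map, then deconvolutions progressively restore resolution, ending
    in a 3-channel "feature view" (one channel per 3D coordinate).
    Layer counts and widths are illustrative assumptions, not the patent's."""

    def __init__(self, in_ch: int = 1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(  # step-by-step restoration
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),  # 3 channels = (x, y, z)
        )

    def forward(self, depth: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(depth))

# e.g. a 1x1x256x256 initial depth map -> 1x3x256x256 feature view
feature_view = ConvDeconvExtractor()(torch.randn(1, 1, 256, 256))
```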
In step S103, scale compression and convolution are performed on the feature view to obtain two-stage feature vectors in sequence.
In practice, an embodiment of the present application may perform scale compression and convolution on the feature view and the infrared features to obtain the two-stage feature vectors, realizing multi-modal feature extraction and facilitating subsequent feature fusion and the restoration of high-precision images.
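A minimal sketch of this step, again with assumed widths: two convolution stages compress the feature view in scale, and global pooling turns each stage's maps into one feature vector, yielding the two-stage vectors in sequence.

```python
import torch
import torch.nn as nn

class TwoStageCompressor(nn.Module):
    """Compresses the 3-channel feature view into two feature vectors in
    sequence: the first-stage vector at an intermediate scale, the second
    after further compression. All dimensions are assumptions."""

    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.stage2 = nn.Sequential(
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)  # scale compression to a vector

    def forward(self, view):
        f1 = self.stage1(view)
        f2 = self.stage2(f1)
        v1 = self.pool(f1).flatten(1)   # first-stage feature vector  (B, 128)
        v2 = self.pool(f2).flatten(1)   # second-stage feature vector (B, 256)
        return v1, v2

v1, v2 = TwoStageCompressor()(torch.randn(1, 3, 256, 256))
```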
In step S104, spatial transformation and feature extraction are performed on the initial depth map to obtain depth map features; the depth map features are enhanced based on the two-stage feature vectors; and the depth structure is restored to generate a first three-channel feature map.
Specifically, an embodiment of the present application may perform spatial transformation and feature extraction on the initial depth map obtained in step S101, input the two-stage feature vectors obtained in step S103, enhance the depth map features, and restore the depth structure to generate the first three-channel feature map. Enhancing the depth map features and restoring the depth structure makes it possible, while preserving the original depth geometry, to complete regions with missing shape in the raw data and densify the overall depth map, adapting to shape incompleteness and sparsity of different degrees and from different angles, which effectively improves the effective output accuracy of low-precision acquisition devices and the quality of data acquisition.
Optionally, in an embodiment of the present application, performing spatial transformation and feature extraction on the initial depth map to obtain depth map features, enhancing the depth map features based on the two-stage feature vectors, and restoring the depth structure to generate the first three-channel feature map includes: performing a spatial three-dimensional transformation on the initial depth map to obtain a three-dimensionally transformed depth; extracting a first feature from the transformed depth and fusing it with the view feature vector of the feature view to obtain a first-stage fused feature; extracting a second feature from the first-stage fused feature and fusing it with the view feature vector to obtain a second-stage fused feature; and mapping the second-stage fused feature through a multilayer perceptron to obtain the first three-channel feature map.
For example, in practice, an embodiment of the present application performs depth feature enhancement as follows (the sketch after this list illustrates steps 2-4):
1. Perform a spatial three-dimensional transformation on the initial depth map to obtain the three-dimensionally transformed depth;
2. Extract a first feature from the transformed depth and fuse it with the view feature vector of the feature view to obtain the first-stage fused feature;
3. Extract a second feature from the first-stage fused feature and fuse it with the view feature vector to obtain the second-stage fused feature;
4. Map the second-stage fused feature through a multilayer perceptron to obtain the first three-channel feature map.
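A minimal PyTorch sketch of steps 2-4; the feature widths and the exact pairing of the two view vectors with the two fusion stages are assumptions (the patent specifies only that fusion happens twice and ends in an MLP):

```python
import torch
import torch.nn as nn

def mlp(sizes):
    """Small helper: an MLP applied to per-point features (B, N, C)."""
    layers = []
    for a, b in zip(sizes, sizes[1:]):
        layers += [nn.Linear(a, b), nn.ReLU(inplace=True)]
    return nn.Sequential(*layers[:-1])  # no activation after the last layer

class TwoStageEnhancer(nn.Module):
    """Extracts depth features, fuses them twice with view feature vectors,
    then maps to a first three-channel feature map. Widths are assumed."""

    def __init__(self, v1_dim=128, v2_dim=256):
        super().__init__()
        self.feat1 = mlp([3, 64, 64])            # first feature extraction
        self.fuse1 = mlp([64 + v1_dim, 128])     # first-stage fusion
        self.feat2 = mlp([128, 128, 128])        # second feature extraction
        self.fuse2 = mlp([128 + v2_dim, 128])    # second-stage fusion
        self.head = mlp([128, 64, 3])            # first three-channel map

    def forward(self, pts, v1, v2):              # pts: (B, N, 3)
        B, N, _ = pts.shape
        f1 = self.fuse1(torch.cat([self.feat1(pts), v1[:, None].expand(B, N, -1)], -1))
        f2 = self.fuse2(torch.cat([self.feat2(f1), v2[:, None].expand(B, N, -1)], -1))
        return self.head(f2)                     # (B, N, 3)

out = TwoStageEnhancer()(torch.randn(2, 1024, 3), torch.randn(2, 128), torch.randn(2, 256))
```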
For example, as shown in FIG. 2, depth feature enhancement works as follows:
The point branch extracts features from the points and generates enhanced point features using an attention mask generated from the image features in the image branch;
The enhanced point features are then forwarded to a fully connected layer to reconstruct a point set representing the global geometry, which contributes to another subset of the final predicted depth.
Specifically, an embodiment of the present application may convert the original N input points, represented in 3D in Euclidean space (N×3), into a fixed dimension C in feature space (N×C), using the EdgeConv learning module proposed in DGCNN to extract point features.
In the spatial transformation layer, an embodiment of the present application may align the input point set to a canonical space using an estimated 3×3 matrix; to estimate the 3×3 matrix, a tensor concatenating the coordinates of each point with the coordinate differences to its k neighboring points may be used. The image features, point features, and global features labeled in FIG. 2 belong to the feature enhancement module in the point branch.
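A T-Net-style sketch of such a spatial transformation layer, with k and the MLP widths assumed (neither is specified here); it builds the point-plus-neighbour-difference tensor, regresses a 3×3 matrix, and applies it to the point set:

```python
import torch
import torch.nn as nn

class SpatialTransform(nn.Module):
    """Estimates a 3x3 matrix from each point's coordinates concatenated
    with the coordinate differences to its k nearest neighbours, then aligns
    the point set to a canonical space. Widths and k are assumptions."""

    def __init__(self, k: int = 16):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(nn.Linear(6, 64), nn.ReLU(inplace=True),
                                 nn.Linear(64, 128), nn.ReLU(inplace=True))
        self.head = nn.Linear(128, 9)

    def forward(self, pts):                      # pts: (B, N, 3)
        d = torch.cdist(pts, pts)                # pairwise distances (B, N, N)
        idx = d.topk(self.k + 1, largest=False).indices[:, :, 1:]  # drop self
        nbrs = torch.gather(pts.unsqueeze(1).expand(-1, pts.size(1), -1, -1),
                            2, idx.unsqueeze(-1).expand(-1, -1, -1, 3))
        # per-point coordinates concatenated with neighbour coordinate differences
        edge = torch.cat([pts.unsqueeze(2).expand_as(nbrs), nbrs - pts.unsqueeze(2)], -1)
        feat = self.mlp(edge).max(dim=2).values.max(dim=1).values   # (B, 128)
        m = self.head(feat).view(-1, 3, 3) + torch.eye(3, device=pts.device)
        return pts @ m                           # aligned point set (B, N, 3)

aligned = SpatialTransform()(torch.randn(2, 1024, 3))
```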
It can be understood that the feature enhancement module fuses the global features from the view modality with the geometric local features F_p extracted from the point modality through an attention mechanism. Specifically, the K-dimensional feature vector from the view branch of the first enhancement unit (shown in the top row of FIG. 2) is concatenated with the point features repeated N times, compressed into an N×1 vector using an MLP, and normalized to the range [0, 1] by a sigmoid function, yielding an attention mask m_a (N×1); the enhanced local point features F′_p are then obtained by element-wise multiplication of m_a with the point features F_p.
Further, the embodiments of the present application obtain local features enhanced by the global image features:
The final enhanced point features F_e (N×2C) are obtained by concatenating the locally enhanced point features F′_p with an N-fold repetition of the global point feature obtained by average pooling over the point features.
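A sketch of this attention-based enhancement; the text is ambiguous (the subscripts are garbled) about whether the pooled global feature comes from the raw or the enhanced point features, so the sketch pools the enhanced ones as an assumption:

```python
import torch
import torch.nn as nn

def attention_fuse(view_vec, point_feat, mask_mlp):
    """The K-dim view vector is repeated N times, concatenated with the
    point features, compressed to an Nx1 attention mask by an MLP + sigmoid,
    and multiplied element-wise into the point features; the result is then
    concatenated with the N-fold repeated, average-pooled global feature.

    view_vec:   (B, K)     global view-branch feature
    point_feat: (B, N, C)  local point features F_p
    mask_mlp:   maps K + C -> 1
    """
    B, N, C = point_feat.shape
    rep = view_vec[:, None, :].expand(B, N, -1)                       # (B, N, K)
    mask = torch.sigmoid(mask_mlp(torch.cat([rep, point_feat], -1)))  # m_a: (B, N, 1)
    local = mask * point_feat                                         # F'_p: (B, N, C)
    glob = local.mean(dim=1, keepdim=True).expand(B, N, C)            # pooled global
    return torch.cat([local, glob], dim=-1)                           # F_e: (B, N, 2C)

K, C = 256, 64
mask_mlp = nn.Sequential(nn.Linear(K + C, 64), nn.ReLU(inplace=True), nn.Linear(64, 1))
fe = attention_fuse(torch.randn(2, K), torch.randn(2, 1024, C), mask_mlp)  # (2, 1024, 128)
```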
In step S105, the feature view is mapped through a multilayer perceptron to obtain a second three-channel feature map; the first three-channel feature map and the second three-channel feature map are fused; and the fused result is mapped through a multilayer perceptron to obtain the final depth map.
As a possible implementation, an embodiment of the present application may map the compressed feature vectors obtained in the above steps through a multilayer perceptron to obtain the second three-channel feature map, fuse it with the first three-channel feature map obtained above, and map the result through a multilayer perceptron to obtain a restored dense, complete depth map. While preserving the original depth geometry, the embodiments of the present application can complete regions with missing shape in the raw data and densify the overall depth map, adapting to shape incompleteness and sparsity of different degrees and from different angles, which effectively improves the effective output accuracy of low-precision acquisition devices and the quality of data acquisition.
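A minimal sketch of this final fusion stage, with the MLP widths and the point count assumed:

```python
import torch
import torch.nn as nn

class DepthFusionHead(nn.Module):
    """Maps the compressed view features to a second three-channel map,
    fuses it with the first three-channel map, and maps the fused result
    to the final depth through an MLP. All dimensions are assumptions."""

    def __init__(self, view_dim=256, n_points=1024):
        super().__init__()
        self.n = n_points
        self.to_map2 = nn.Linear(view_dim, n_points * 3)   # second 3-channel map
        self.final = nn.Sequential(nn.Linear(6, 64), nn.ReLU(inplace=True),
                                   nn.Linear(64, 3))       # fused -> final depth

    def forward(self, map1, view_vec):                     # map1: (B, N, 3)
        map2 = self.to_map2(view_vec).view(-1, self.n, 3)
        return self.final(torch.cat([map1, map2], dim=-1)) # (B, N, 3)

depth = DepthFusionHead()(torch.randn(2, 1024, 3), torch.randn(2, 256))
```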
An embodiment of the present application is described in detail below with reference to FIGS. 2 to 5.
The embodiment of the present application includes the following steps:
Step S501: raw visual data acquisition. In practice, an embodiment of the present application may use a depth camera together with an ordinary camera, acquire the raw visual data in a paired manner based on a preset acquisition viewing-angle threshold, and obtain from the raw visual data the views and a sparse, incomplete low-quality depth map.
It should be noted that the preset viewing-angle threshold varies with the acquisition target and may be set by those skilled in the art according to the actual situation; it is not specifically limited here.
Step S502: preprocessing and feature extraction of the raw data. As a possible implementation, an embodiment of the present application may obtain from the raw visual data a low-quality depth map that satisfies a preset condition, and perform preprocessing and feature extraction on the view, infrared, and depth map raw data to generate the initial depth map for processing in subsequent steps.
It should be noted that the preset condition for the low-quality depth map may be set by those skilled in the art according to the actual situation and is not specifically limited here.
Step S503: obtaining a feature view whose three channels are the three-dimensional coordinates of the depth map. Further, an embodiment of the present application may perform multi-scale feature extraction on the initial depth map obtained in the above steps with alternating convolution and deconvolution modules, and progressively restore resolution with the deconvolution modules to obtain the feature view whose three channels are the three-dimensional coordinates of the depth map. By progressively extracting geometric features at different levels, multi-modal feature extraction is realized, facilitating subsequent feature fusion and the restoration of high-precision images.
Step S504: obtaining the two-stage feature vectors. In practice, an embodiment of the present application may perform scale compression and convolution on the feature view and the infrared features to obtain the two-stage feature vectors, realizing multi-modal feature extraction and facilitating subsequent feature fusion and the restoration of high-precision images.
Step S505: enhancing the depth map features and restoring the depth structure. Specifically, an embodiment of the present application may perform spatial transformation and feature extraction on the initial depth map obtained in step S501, input the two-stage feature vectors obtained in step S504, enhance the depth map features, and restore the depth structure to generate the first three-channel feature map. Enhancing the depth map features and restoring the depth structure makes it possible, while preserving the original depth geometry, to complete regions with missing shape in the raw data and densify the overall depth map, adapting to shape incompleteness and sparsity of different degrees and from different angles, which effectively improves the effective output accuracy of low-precision acquisition devices and the quality of data acquisition.
In practice, an embodiment of the present application performs depth feature enhancement as follows:
1. Perform a spatial three-dimensional transformation on the initial depth map to obtain the three-dimensionally transformed depth;
2. Extract a first feature from the transformed depth and fuse it with the view feature vector of the feature view to obtain the first-stage fused feature;
3. Extract a second feature from the first-stage fused feature and fuse it with the view feature vector to obtain the second-stage fused feature;
4. Map the second-stage fused feature through a multilayer perceptron to obtain the first three-channel feature map.
For example, as shown in FIG. 2, depth feature enhancement works as follows:
The point branch extracts features from the points and generates enhanced point features using an attention mask generated from the image features in the image branch;
The enhanced point features are then forwarded to a fully connected layer to reconstruct a point set representing the global geometry, which contributes to another subset of the final predicted depth.
Specifically, an embodiment of the present application may convert the original N input points, represented in 3D in Euclidean space (N×3), into a fixed dimension C in feature space (N×C), using the EdgeConv learning module proposed in DGCNN to extract point features.
In the spatial transformation layer, an embodiment of the present application may align the input point set to a canonical space using an estimated 3×3 matrix; to estimate the 3×3 matrix, a tensor concatenating the coordinates of each point with the coordinate differences to its k neighboring points may be used. The image features, point features, and global features labeled in FIG. 2 belong to the feature enhancement module in the point branch.
It can be understood that the feature enhancement module fuses the global features from the view modality with the geometric local features F_p extracted from the point modality through an attention mechanism. Specifically, the K-dimensional feature vector from the view branch of the first enhancement unit (shown in the top row of FIG. 2) is concatenated with the point features repeated N times, compressed into an N×1 vector using an MLP, and normalized to the range [0, 1] by a sigmoid function, yielding an attention mask m_a (N×1); the enhanced local point features F′_p are then obtained by element-wise multiplication of m_a with the point features F_p.
Further, the embodiments of the present application obtain local features enhanced by the global image features:
The final enhanced point features F_e (N×2C) are obtained by concatenating the locally enhanced point features F′_p with an N-fold repetition of the global point feature obtained by average pooling over the point features.
Step S506: fusing the three-channel feature maps and obtaining the restored dense, complete depth map through multilayer perceptron mapping. As a possible implementation, an embodiment of the present application may map the compressed feature vectors obtained in the above steps through a multilayer perceptron to obtain the second three-channel feature map, fuse the first three-channel feature map obtained above with the second three-channel feature map, and map the result through a multilayer perceptron to obtain the restored dense, complete depth map. While preserving the original depth geometry, the embodiments of the present application can complete regions with missing shape in the raw data and densify the overall depth map, adapting to shape incompleteness and sparsity of different degrees and from different angles, which effectively improves the effective output accuracy of low-precision acquisition devices and the quality of data acquisition.
Further, the depth map enhancement method of the embodiments of the present application mainly involves the following aspects:
1. Multi-modal feature extraction, fusion, and restoration. The depth enhancement network is an adversarial architecture comprising a generator and a discriminator, as shown in FIG. 3. As shown in FIG. 4, the generator is a cascade of two enhancement units with similar structure, taking a low-quality depth map X and a single-view reference image I as input. Each enhancement unit consists of three parallel functional branches: a view branch, a point branch, and a fusion branch, which predict point sets by processing the view and infrared features, the depth features, and the fused multi-modal image-point features, respectively.
As shown in FIG. 4, the depth generator has two cascaded enhancement units. The first unit takes the original image I and the depth map X directly as input, maps them into a latent embedding space, and forwards the features to the next unit; the second unit is functionally similar to the first but outputs a three-dimensional point set X. From the data-flow perspective, the generator comprises three parallel branches that predict depth from the view features, the depth features, and the multi-modal image-point features, respectively, and the output depth is the union of the point sets predicted by the three branches.
View branch: the view branch takes the image and the infrared image as input and progressively extracts geometric features at different levels.
Depth branch: the point branch extracts features from the points and generates enhanced point features using an attention mask generated from the image features in the image branch; the enhanced point features are then forwarded to a fully connected layer to reconstruct a point set representing the global geometry, which contributes to another subset of the final predicted depth.
Fusion branch: the fusion branch mainly fuses the two feature streams. A structural sketch of one such enhancement unit follows.
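Every layer size in this sketch is an assumption; the patent fixes only the three-branch layout and the fact that the output is the union of the three predicted point sets.

```python
import torch
import torch.nn as nn

class EnhancementUnit(nn.Module):
    """One enhancement unit of the generator: a view branch, a point branch
    (attention-enhanced, cf. FIG. 2), and a fusion branch over the two
    feature streams; each predicts a point subset and the outputs are
    concatenated (their union)."""

    def __init__(self, feat=128, n_out=512):
        super().__init__()
        self.view_branch = nn.Sequential(nn.Linear(feat, feat), nn.ReLU(inplace=True),
                                         nn.Linear(feat, n_out * 3))
        self.point_branch = nn.Sequential(nn.Linear(feat, feat), nn.ReLU(inplace=True),
                                          nn.Linear(feat, n_out * 3))
        self.fusion_branch = nn.Sequential(nn.Linear(2 * feat, feat), nn.ReLU(inplace=True),
                                           nn.Linear(feat, n_out * 3))

    def forward(self, view_feat, point_feat):    # both (B, feat)
        parts = [self.view_branch(view_feat),
                 self.point_branch(point_feat),
                 self.fusion_branch(torch.cat([view_feat, point_feat], -1))]
        return torch.cat([p.view(p.size(0), -1, 3) for p in parts], dim=1)

pts = EnhancementUnit()(torch.randn(2, 128), torch.randn(2, 128))  # (2, 1536, 3)
```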
2. Generative depth discriminator. Adversarial training has driven a range of representation and generation applications in image modeling, but little work has applied this architecture to depth completion. An embodiment of the present application therefore adds a discriminator to the framework for adversarial training, with the goal of telling real depth from generated depth. An embodiment of the present application may first train the network with a joint loss function that considers completeness and distribution uniformity to obtain a "coarse" complete depth, then fine-tune the network with an adversarial loss to achieve a "fine" complete depth, using PointNet as the binary classification network of the discriminator to distinguish whether a prediction comes from the generated set X or the real set Y. Embodiments of the present application may train with the adversarial loss of the improved Wasserstein GAN:
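The loss expression itself does not survive in this text; a standard WGAN-GP objective consistent with the surrounding description would be:

$$\mathcal{L}_{adv} = \Big(\mathbb{E}_{\tilde{y}\sim P_{\tilde{X}}}\big[D(\tilde{y})\big] - \mathbb{E}_{y\sim P_{Y}}\big[D(y)\big]\Big) + \lambda\,\mathbb{E}_{\hat{y}\sim P_{\hat{y}}}\Big[\big(\lVert\nabla_{\hat{y}}D(\hat{y})\rVert_{2}-1\big)^{2}\Big]$$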
where D denotes the set of 1-Lipschitz functions, ỹ ∼ P_X̃ and y ∼ P_Y are point-set samples drawn from the generated data and the real data, respectively, and the polynomial in the second term is the gradient penalty.
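As a concrete illustration of the gradient-penalty term, a short PyTorch sketch; here `discriminator` is any critic mapping a point set (B, N, 3) to a score per sample, and `lam` plays the role of λ:

```python
import torch

def gradient_penalty(discriminator, real_pts, fake_pts, lam=10.0):
    """WGAN-GP gradient penalty on point sets: the critic's gradient norm
    at random interpolates between real and generated point sets is driven
    toward 1, enforcing the 1-Lipschitz constraint softly."""
    B = real_pts.size(0)
    eps = torch.rand(B, 1, 1, device=real_pts.device)            # per-sample mix
    inter = (eps * real_pts + (1 - eps) * fake_pts).requires_grad_(True)
    score = discriminator(inter).sum()
    grad, = torch.autograd.grad(score, inter, create_graph=True)
    return lam * ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
```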
According to the depth map enhancement method proposed in the embodiments of the present application, an initial depth map can be obtained from raw visual data; two-stage feature vectors and depth map features are extracted from the initial depth map and enhanced; the depth structure is restored; and the final depth map is obtained through multilayer perceptron mapping. While preserving the original depth geometry, the method can complete regions with missing shape in the raw data and densify the overall depth map, adapting to shape incompleteness and sparsity of different degrees and from different angles, and effectively improving the effective output accuracy of low-precision acquisition devices and the quality of data acquisition. This solves the technical problem in the related art that, during data acquisition, the large geometric differences between objects make feature completion through geometric symmetry difficult, so that the acquired data suffers from incomplete information and low accuracy and high-precision image output is hard to achieve.
Next, the depth map enhancement device proposed according to the embodiments of the present application is described with reference to the accompanying drawings.
FIG. 6 is a schematic block diagram of a depth map enhancement device according to an embodiment of the present application.
As shown in FIG. 6, the depth map enhancement device 10 includes: an acquisition module 100, a feature extraction module 200, a computation module 300, an enhancement module 400, and a fusion module 500.
Specifically, the acquisition module 100 is configured to obtain an initial depth map from raw visual data.
The feature extraction module 200 is configured to perform multi-scale feature extraction on the initial depth map with alternating convolution and deconvolution modules and to progressively restore resolution with the deconvolution modules, obtaining a feature view whose three channels are the three-dimensional coordinates of the depth map.
The computation module 300 is configured to perform scale compression and convolution on the feature view to obtain two-stage feature vectors in sequence.
The enhancement module 400 is configured to perform spatial transformation and feature extraction on the initial depth map to obtain depth map features, enhance the depth map features based on the two-stage feature vectors, and restore the depth structure to generate a first three-channel feature map.
The fusion module 500 is configured to map the feature view through a multilayer perceptron to obtain a second three-channel feature map, fuse the first three-channel feature map with the second three-channel feature map, and map the result through a multilayer perceptron to obtain the final depth map.
Optionally, in an embodiment of the present application, the acquisition module 100 includes an acquisition unit and a preprocessing unit.
The acquisition unit is configured to obtain from the raw visual data a low-quality depth map that satisfies a preset condition.
The preprocessing unit is configured to perform preprocessing and feature extraction on the low-quality depth map and the raw infrared and depth map data to generate the initial depth map.
Optionally, in an embodiment of the present application, the acquisition unit is further configured to acquire the raw visual data based on a preset acquisition viewing-angle threshold to obtain the low-quality depth map.
Optionally, in an embodiment of the present application, the enhancement module 400 includes a three-dimensional transformation unit, a first fusion unit, a second fusion unit, and a mapping unit.
The three-dimensional transformation unit is configured to perform a spatial three-dimensional transformation on the initial depth map to obtain the three-dimensionally transformed depth.
The first fusion unit is configured to extract a first feature from the transformed depth and fuse it with the view feature vector of the feature view to obtain a first-stage fused feature.
The second fusion unit is configured to extract a second feature from the first-stage fused feature and fuse it with the view feature vector to obtain a second-stage fused feature.
The mapping unit is configured to map the second-stage fused feature through a multilayer perceptron to obtain the first three-channel feature map.
It should be noted that the foregoing explanation of the depth map enhancement method embodiments also applies to the depth map enhancement device of this embodiment and is not repeated here.
According to the depth map enhancement device proposed in the embodiments of the present application, an initial depth map can be obtained from raw visual data; two-stage feature vectors and depth map features are extracted from the initial depth map and enhanced; the depth structure is restored; and the final depth map is obtained through multilayer perceptron mapping. While preserving the original depth geometry, the device can complete regions with missing shape in the raw data and densify the overall depth map, adapting to shape incompleteness and sparsity of different degrees and from different angles, and effectively improving the effective output accuracy of low-precision acquisition devices and the quality of data acquisition. This solves the technical problem in the related art that, during data acquisition, the large geometric differences between objects make feature completion through geometric symmetry difficult, so that the acquired data suffers from incomplete information and low accuracy and high-precision image output is hard to achieve.
FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. The electronic device may include:
a memory 701, a processor 702, and a computer program stored in the memory 701 and executable on the processor 702.
The processor 702 implements the depth map enhancement method provided in the above embodiments when executing the program.
Further, the electronic device also includes:
a communication interface 703 for communication between the memory 701 and the processor 702;
the memory 701 for storing a computer program executable on the processor 702.
The memory 701 may include high-speed RAM and may also include non-volatile memory, such as at least one disk memory.
If the memory 701, the processor 702, and the communication interface 703 are implemented independently, they may be connected to one another and communicate via a bus. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is drawn in FIG. 7, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 701, the processor 702, and the communication interface 703 are integrated on a single chip, they may communicate with one another through internal interfaces.
The processor 702 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
This embodiment also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the above depth map enhancement method is implemented.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or N embodiments or examples. In addition, without contradiction, those skilled in the art may combine the different embodiments or examples described in this specification and the features thereof.
In addition, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "N" means at least two, for example two or three, unless otherwise expressly and specifically defined.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code comprising one or N executable instructions for implementing the steps of a custom logical function or process; and the scope of the preferred embodiments of the present application includes alternative implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
The logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from an instruction execution system, apparatus, or device). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) with one or N wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it as necessary, and then stored in a computer memory.
It should be understood that the various parts of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the N steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any of the following techniques known in the art, or a combination thereof, may be used: a discrete logic circuit having logic gates for implementing logical functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or part of the steps carried by the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module, as sketched below. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
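As a purely illustrative sketch of this modular arrangement — not the apparatus defined by the claims — the following Python example assumes a PyTorch environment and uses unit names (`FeatureExtractionUnit`, `FeatureFusionUnit`, `DepthMapEnhancer`) invented here for illustration. It shows two functional units that can stand alone as software modules or be integrated into one processing module that refines a depth map using features fused from the depth map and a corresponding view:

```python
# Illustrative sketch only: unit names and structure are assumptions for this
# example, not the apparatus defined by the claims.
import torch
import torch.nn as nn


class FeatureExtractionUnit(nn.Module):
    """Hypothetical functional unit: extracts a feature map from an input image."""

    def __init__(self, in_ch: int, feat_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, feat_ch, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.conv(x))


class FeatureFusionUnit(nn.Module):
    """Hypothetical functional unit: fuses two feature maps into a correction map."""

    def __init__(self, feat_ch: int):
        super().__init__()
        self.fuse = nn.Conv2d(2 * feat_ch, 1, kernel_size=3, padding=1)

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([a, b], dim=1))


class DepthMapEnhancer(nn.Module):
    """Integrated processing module: the two units above bundled into one module."""

    def __init__(self, feat_ch: int = 16):
        super().__init__()
        self.depth_branch = FeatureExtractionUnit(1, feat_ch)  # depth-map features
        self.view_branch = FeatureExtractionUnit(3, feat_ch)   # view (RGB) features
        self.fusion = FeatureFusionUnit(feat_ch)

    def forward(self, depth: torch.Tensor, view: torch.Tensor) -> torch.Tensor:
        # Residual refinement: fused features predict a correction to the depth map.
        return depth + self.fusion(self.depth_branch(depth), self.view_branch(view))


if __name__ == "__main__":
    enhancer = DepthMapEnhancer()
    d = torch.rand(1, 1, 64, 64)  # coarse input depth map
    v = torch.rand(1, 3, 64, 64)  # corresponding RGB view
    print(enhancer(d, v).shape)   # torch.Size([1, 1, 64, 64])
```

Packaged this way, the integrated module can also be serialized to a computer-readable storage medium as a standalone software functional module, consistent with the paragraph above.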
The aforementioned storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like. Although the embodiments of the present application have been shown and described above, it should be understood that the above embodiments are exemplary and shall not be construed as limiting the present application; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present application.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210295510.0A CN114820344B (en) | 2022-03-23 | 2022-03-23 | Depth map enhancement method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114820344A true CN114820344A (en) | 2022-07-29 |
CN114820344B CN114820344B (en) | 2025-06-10 |
Family
ID=82530110
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210295510.0A (Active, granted as CN114820344B) | Depth map enhancement method and device | 2022-03-23 | 2022-03-23
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114820344B (en) |
- 2022-03-23: application CN202210295510.0A filed; subsequently granted as CN114820344B (status: Active)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190362511A1 (en) * | 2018-05-23 | 2019-11-28 | Apple Inc. | Efficient scene depth map enhancement for low power devices |
US20210150726A1 (en) * | 2019-11-14 | 2021-05-20 | Samsung Electronics Co., Ltd. | Image processing apparatus and method |
CN111080688A (en) * | 2019-12-25 | 2020-04-28 | 左一帆 | Depth map enhancement method based on depth convolution neural network |
CN113436220A (en) * | 2021-05-28 | 2021-09-24 | 华东师范大学 | Image background estimation method based on depth map segmentation |
Non-Patent Citations (1)
Title |
---|
Sarah Xu: "Light Field Depth Estimation With Multi-Layer Perceptron", https://github.com/ysx001/ee367-lightfield-depth, 31 December 2021 (2021-12-31), pages 1-5 *
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102691078B1 (en) * | 2023-02-27 | 2024-08-05 | 고려대학교 산학협력단 | Method and apparatus for generating image based on denoising diffusion model reflecting geometric information |
Also Published As
Publication number | Publication date |
---|---|
CN114820344B (en) | 2025-06-10 |
Similar Documents
Publication | Title
---|---
CN108805170B | Forming a dataset for fully supervised learning
CN111160214B | 3D target detection method based on data fusion
Tulsiani et al. | Learning category-specific deformable 3d models for object reconstruction
CN115439694A | High-precision point cloud completion method and device based on deep learning
CN114758337B | Semantic instance reconstruction method, device, equipment and medium
CN113034581B | Relative pose estimation method of space targets based on deep learning
WO2016175150A1 | Template creation device and template creation method
WO2021164887A1 | 6d pose and shape estimation method
CN114638866B | A point cloud registration method and system based on local feature learning
JP2019159940A | Point group feature extraction device, point group feature extraction method, and program
Jeon et al. | Struct-MDC: Mesh-refined unsupervised depth completion leveraging structural regularities from visual SLAM
CN118071999B | Multi-view 3D target detection method based on sampling self-adaption continuous NeRF
CN114693951A | An RGB-D Saliency Object Detection Method Based on Global Context Information Exploration
CN114820344A | Depth map enhancement method and device
WO2023019478A1 | Three-dimensional reconstruction method and apparatus, electronic device, and readable storage medium
CN115775214B | Point cloud completion method and system based on multi-stage fractal combination
CN118247346A | Target pose estimation method based on multimodal feature fusion deep neural network
CN117934840A | Point cloud scene segmentation method integrating double neighborhood features and global space perception
CN116229577A | Three-dimensional human body pose estimation method and device based on RGBD multi-mode information
CN117689990A | Three-tributary bidirectional fusion network method based on 6D attitude estimation
CN117351078A | Target size and 6D pose estimation method based on shape prior
CN113361609B | Template matching method based on anisotropic filtering and applied to man-machine cooperation
CN112818965B | Multi-scale image target detection method and system, electronic equipment and storage medium
CN116385369A | Depth image quality evaluation method and device, electronic equipment and storage medium
WO2023231173A1 | Binocular stereo matching method, device, and storage medium
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant