CN113936117A - High-frequency region enhanced photometric stereo three-dimensional reconstruction method based on deep learning - Google Patents

Info

Publication number
CN113936117A
Authority
CN
China
Prior art keywords: surface normal, attention weight, reconstructed, images, layers
Prior art date
Legal status
Granted
Application number
CN202111524515.8A
Other languages
Chinese (zh)
Other versions
CN113936117B (en)
Inventor
举雅琨
董军宇
高峰
Current Assignee
Ocean University of China
Original Assignee
Ocean University of China
Priority date
Filing date
Publication date
Application filed by Ocean University of China
Priority to CN202111524515.8A
Publication of CN113936117A
Application granted
Publication of CN113936117B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/30 Polynomial surface description
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Software Systems (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

A photometric stereo system captures a plurality of images of an object to be reconstructed, and a deep learning algorithm outputs an accurate surface-normal three-dimensional reconstruction. A surface normal generation network is designed to generate the surface normal of the object to be reconstructed from the images and the illumination; an attention weight generation network generates an attention weight map of the object from the images; an attention weight loss function is applied pixel by pixel; the trained network is then used for surface-normal reconstruction from photometric stereo images. Through the proposed surface normal generation network and attention weight generation network, the invention learns the surface normal and high-frequency information respectively, and trains with the proposed attention weight loss, thereby improving reconstruction accuracy on high-frequency surface regions such as wrinkles and edges. Compared with traditional photometric stereo methods, three-dimensional reconstruction accuracy is improved, especially in the surface details of the object to be reconstructed.

Description

A Photometric Stereo 3D Reconstruction Method with High-Frequency Region Enhancement Based on Deep Learning

Technical Field

The invention relates to a deep-learning-based, high-frequency-region-enhanced photometric stereo three-dimensional reconstruction method, and belongs to the field of photometric stereo three-dimensional reconstruction.

Background

Three-dimensional reconstruction is a very important and fundamental problem in computer vision. Photometric stereo is a high-precision, pixel-by-pixel 3D reconstruction method that recovers the surface normals of an object from the intensity-variation cues provided by images taken under different illumination directions. Photometric stereo holds an irreplaceable position in many high-precision 3D reconstruction tasks; for example, it has important application value in archaeological exploration, pipeline inspection, and fine seabed mapping.

However, existing deep-learning-based photometric stereo methods produce large errors in high-frequency regions of the object surface, such as wrinkles and edges, generating blurred reconstruction results precisely in the areas that receive the most attention and require accurate reconstruction.

Summary of the Invention

In view of the above problems, the purpose of the present invention is to propose a deep-learning-based, high-frequency-region-enhanced photometric stereo three-dimensional reconstruction method that overcomes the deficiencies of the prior art.

The deep-learning-based high-frequency-region-enhanced photometric stereo 3D reconstruction method is characterized by comprising the following steps:

1) Using a photometric stereo system, capture several images of the object to be reconstructed:

Images of the object to be reconstructed are captured under a single parallel white light source. A Cartesian coordinate system is established with the center of the object as the origin of the coordinate axes, and the position of the white light source is then represented by the vector l = [x, y, z] in this coordinate system;

The light source position is changed to capture an image under another illumination direction. Typically, at least 10 images under different illumination directions are required, denoted m_1, m_2, ..., m_j, with the corresponding light source positions denoted l_1, l_2, ..., l_j, where j is a natural number greater than or equal to 10;

2) A deep learning algorithm takes m_1, m_2, ..., m_j and l_1, l_2, ..., l_j as input and outputs an accurate surface-normal three-dimensional reconstruction:

The deep learning algorithm is divided into the following four parts: (1) a surface normal generation network, (2) an attention weight generation network, (3) joint training with an attention weight loss function, and (4) network training; wherein:

(1) The surface normal generation network is designed to generate the surface normal n̂ of the object to be reconstructed from the images m_1, m_2, ..., m_j and the illuminations l_1, l_2, ..., l_j;

(2) The attention weight generation network is designed to generate the attention weight map P of the object to be reconstructed from the images m_1, m_2, ..., m_j;

(3) The attention weight loss L is a pixel-by-pixel loss function, computed as the average of the loss L_k at each pixel: L = (1/(p*q)) · Σ_{k=1}^{p*q} L_k, where p*q is the resolution of image m, with p, q ≥ 2^n and n ≥ 4;

The loss L_k at each pixel position consists of two parts: the first is a gradient loss L_gradient with a coefficient term, and the second is a normal loss L_normal with a coefficient term, i.e. L_k = P_k · L_gradient + λ(1 − P_k) · L_normal;

where L_gradient = ‖∇_ζ n_k − ∇_ζ n̂_k‖; here ∇_ζ n_k is the gradient of the true surface normal n of the object to be reconstructed at position k, ζ is the neighborhood pixel range used when computing the gradient and may be set to 1, 2, 3, 4 or 5, and ∇_ζ n̂_k is the gradient of the predicted surface normal n̂ at position k; n̂ denotes the surface normal predicted by the network, and n denotes the true surface normal;

In the network, the gradient loss sharpens the high-frequency expression of the surface normal; P_k is the value of the attention weight map at pixel position k;

Second, L_normal = 1 − n_k · n̂_k, where · denotes the dot product; λ is a hyperparameter whose purpose is to balance the gradient loss and the normal loss, with a setting range of {7, 8, 9, 10};

Through the attention weight loss of (3), the surface normal generation network of (1) and the attention weight generation network of (2) are linked;

(4) Network training

When training the network, the back-propagation algorithm is used to continuously adjust and optimize, minimizing the above loss function; training stops when the set number of cycles is reached, to achieve the best effect; alternatively, when L_normal falls below 0.03, training is considered to have reached its best effect and is stopped;

3) The trained network is used for surface-normal reconstruction from photometric stereo images:

First capture s or more images under different illumination directions, with s ≥ 10; input m_1, m_2, ..., m_s and l_1, l_2, ..., l_s into the trained network to obtain the predicted surface normal n̂.

The surface normal generation network of (1), designed to generate the surface normal n̂ of the object to be reconstructed from the images m_1, m_2, ..., m_j and the illuminations l_1, l_2, ..., l_j, operates as follows:

The resolution of image m is denoted p*q, with p, q ≥ 2^n and n ≥ 4, so m ∈ ℝ^(p*q*3), where 3 denotes the RGB channels. The surface normal generation network first repeat-fills the illumination l = [x, y, z] ∈ ℝ^3 to the resolution p*q of m, into the space ℝ^(p*q*3); the filled illumination is denoted h, with h ∈ ℝ^(p*q*3), so that h and m have the same spatial size. h and m are then concatenated along the third dimension into a new tensor belonging to ℝ^(p*q*6); with j input images and illuminations, j fused tensors are obtained;

These tensors are each passed through 4 convolutional layers. Convolutional layers 1, 2, 3 and 4 all use 3*3 kernels and the relu activation function; layers 2 and 4 are convolutions with stride 2, layers 1 and 3 are convolutions with stride 1; and the numbers of feature channels of layers 1, 2, 3 and 4 are 64, 128, 128 and 256 respectively;

A max-pooling layer is then used to pool the j four-layer-convolved tensors in ℝ^(p/4*q/4*256) into a single tensor in ℝ^(p/4*q/4*256);

The result is then computed through convolutional layers 5, 6, 7 and 8, which all use 3*3 kernels and the relu activation function; layers 5 and 7 are transposed convolutions, layers 6 and 8 are convolutions with stride 1; and the numbers of feature channels of layers 5, 6, 7 and 8 are 128, 128, 64 and 3;

Finally, the tensor obtained from the layer-8 convolution is normalized to unit norm, giving the surface normal n̂ of the object to be reconstructed.

The attention weight generation network of (2), designed to generate the attention weight map P of the object to be reconstructed from the images m_1, m_2, ..., m_j, operates as follows:

The attention weight generation network computes the gradient of each image m ∈ ℝ^(p*q*3); the gradient also belongs to the space ℝ^(p*q*3). The gradient is concatenated with the image along the third dimension into a new tensor belonging to ℝ^(p*q*6); with j input images, j fused tensors are obtained;

These fused tensors are first each passed through 3 convolutional layers, all with 3*3 kernels and the relu activation function; layer 2 has stride 2, layers 1 and 3 have stride 1; and the numbers of feature channels of the three layers are 64, 128 and 128 respectively;

A max-pooling layer is then used to pool the j three-layer-convolved tensors in ℝ^(p/2*q/2*128) into a single tensor in ℝ^(p/2*q/2*128);

The result is then computed through convolutional layers 5, 6 and 7, all with 3*3 kernels and the relu activation function; layer 6 is a transposed convolution, layers 5 and 7 are convolutions with stride 1; and the numbers of feature channels of layers 5, 6 and 7 are 128, 64 and 1, yielding the attention weight map P of the object to be reconstructed.

The described deep-learning-based high-frequency-region-enhanced photometric stereo 3D reconstruction method is characterized in that, in the resolution p*q of the image m, p takes a value of 16, 32, 48 or 64, and q takes a value of 16, 32, 48 or 64.

The described deep-learning-based high-frequency-region-enhanced photometric stereo 3D reconstruction method is characterized in that ζ is set to 1.

The described deep-learning-based high-frequency-region-enhanced photometric stereo 3D reconstruction method is characterized in that λ is set to 8.

The described deep-learning-based high-frequency-region-enhanced photometric stereo 3D reconstruction method is characterized in that the number of cycles is set to 30 epochs.

The described deep-learning-based high-frequency-region-enhanced photometric stereo 3D reconstruction method is characterized in that p is 32 and q is 32.

The deep-learning-based high-frequency-region-enhanced photometric stereo 3D reconstruction method proposed by the present invention learns the surface normal and high-frequency information separately through the proposed surface normal generation network and attention weight generation network, and trains with the proposed attention weight loss, which improves the reconstruction accuracy of surfaces in high-frequency regions such as wrinkles and edges. Compared with previous traditional photometric stereo methods, the accuracy of 3D reconstruction is improved, especially in the details of the surface of the object to be reconstructed.

The attention weight loss proposed by the present invention can be applied to a variety of low-level vision tasks to improve task accuracy and enrich image details, for example depth estimation, image deblurring and image dehazing.

Brief Description of the Drawings

Fig. 1 is a flowchart of the present invention.

Fig. 2 is a schematic diagram of the surface normal generation network in step 2).

Fig. 3 is a schematic diagram of the attention weight generation network in step 2).

Fig. 4 illustrates the application effect of the present invention: the first row shows the input images, the second row the generated weight maps, and the third row the generated surface normals.

Detailed Description

As shown in Fig. 1, the deep-learning-based high-frequency-region-enhanced photometric stereo 3D reconstruction method comprises the following steps:

1) Using a photometric stereo system, capture several images of the object to be reconstructed:

Images of the object to be reconstructed are captured under a single parallel white light source. A Cartesian coordinate system is established with the center of the object as the origin of the coordinate axes, and the position of the white light source is then represented by the vector l = [x, y, z] in this coordinate system;

The light source position is changed to capture an image under another illumination direction. Typically, at least 10 images under different illumination directions are required, denoted m_1, m_2, ..., m_j, with the corresponding light source positions denoted l_1, l_2, ..., l_j, where j is a natural number greater than or equal to 10;

2) A deep learning algorithm takes m_1, m_2, ..., m_j and l_1, l_2, ..., l_j as input and outputs an accurate surface-normal three-dimensional reconstruction:

The deep learning algorithm is divided into the following four parts: (1) a surface normal generation network, (2) an attention weight generation network, (3) joint training with an attention weight loss function, and (4) network training;

(1) The surface normal generation network is designed to generate the surface normal n̂ of the object to be reconstructed from the images m_1, m_2, ..., m_j and the illuminations l_1, l_2, ..., l_j;

The resolution of image m is denoted p*q, with p, q ≥ 2^n and n ≥ 4, so m ∈ ℝ^(p*q*3), where 3 denotes the RGB channels. As shown in Fig. 2, the surface normal generation network first repeat-fills the illumination l = [x, y, z] ∈ ℝ^3 to the resolution p*q of m, into the space ℝ^(p*q*3); the filled illumination is denoted h, with h ∈ ℝ^(p*q*3), so that h and m have the same spatial size. h and m are then concatenated along the third dimension into a new tensor belonging to ℝ^(p*q*6); with j input images and illuminations, j fused tensors are obtained;
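The fusion step above can be made concrete with a short sketch. The following minimal PyTorch snippet is illustrative only; the function name fuse_image_and_light and the channel ordering (image first, tiled light second) are assumptions chosen to mirror the description, not part of the invention.

```python
import torch

def fuse_image_and_light(m: torch.Tensor, l: torch.Tensor) -> torch.Tensor:
    """m: (p, q, 3) RGB image; l: (3,) light direction l = [x, y, z].
    Returns the fused (p, q, 6) tensor described above."""
    p, q, _ = m.shape
    h = l.view(1, 1, 3).expand(p, q, 3)  # repeat-fill l into R^(p*q*3)
    return torch.cat([m, h], dim=2)      # concatenate along the third dimension
```

Applying this function to each of the j image-light pairs yields the j fused tensors that enter the convolutional layers below.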

These tensors are each passed through 4 convolutional layers. Convolutional layers 1, 2, 3 and 4 all use 3*3 kernels and the relu activation function; layers 2 and 4 are convolutions with stride 2, layers 1 and 3 are convolutions with stride 1; and the numbers of feature channels of layers 1, 2, 3 and 4 are 64, 128, 128 and 256 respectively;

A max-pooling layer is then used to pool the j four-layer-convolved tensors in ℝ^(p/4*q/4*256) into a single tensor in ℝ^(p/4*q/4*256);

The result is then computed through convolutional layers 5, 6, 7 and 8, which all use 3*3 kernels and the relu activation function; layers 5 and 7 are transposed convolutions, layers 6 and 8 are convolutions with stride 1; and the numbers of feature channels of layers 5, 6, 7 and 8 are 128, 128, 64 and 3;

Finally, the tensor obtained from the layer-8 convolution is normalized to unit norm, giving the predicted surface normal n̂;
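Assembling the layers above, one plausible PyTorch rendering of the surface normal generation network is sketched below. The text fixes the 3*3 kernels, channel counts, the strides of the plain convolutions, the relu activations, the max-pooling over the j observations, and the final unit-norm normalization; the padding choices and the stride-2 assumption for the two transposed convolutions (so the output returns to p*q) are mine.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SurfaceNormalNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Conv layers 1-4: 3x3 kernels, relu; strides 1, 2, 1, 2;
        # feature channels 64, 128, 128, 256 as specified in the text.
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 64, 3, stride=1, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, 3, stride=1, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Conv layers 5-8: feature channels 128, 128, 64, 3; layers 5 and 7
        # are transposed convolutions (stride 2 assumed, restoring the p*q
        # resolution). The text applies relu to all eight layers, kept here.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, 3, stride=1, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, 3, stride=1, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        """fused: (j, 6, p, q), one image-plus-light tensor per illumination."""
        feats = self.encoder(fused)                     # (j, 256, p/4, q/4)
        pooled = feats.max(dim=0, keepdim=True).values  # max-pool over the j observations
        out = self.decoder(pooled)                      # (1, 3, p, q)
        return F.normalize(out, dim=1)                  # unit-norm surface normal map
```

For the preferred resolution p = q = 32, an input of shape (j, 6, 32, 32) yields a (1, 3, 32, 32) unit-normal map.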

(2) The attention weight generation network is designed to generate the attention weight map of the object to be reconstructed from the images m_1, m_2, ..., m_j:

The attention weight generation network computes the gradient of each image m ∈ ℝ^(p*q*3); the gradient also belongs to the space ℝ^(p*q*3). As shown in Fig. 3, the gradient is concatenated with the image along the third dimension into a new tensor belonging to ℝ^(p*q*6); with j input images, j fused tensors are obtained;

These fused tensors are first each passed through 3 convolutional layers, all with 3*3 kernels and the relu activation function; layer 2 has stride 2, layers 1 and 3 have stride 1; and the numbers of feature channels of the three layers are 64, 128 and 128 respectively;

A max-pooling layer is then used to pool the j three-layer-convolved tensors in ℝ^(p/2*q/2*128) into a single tensor in ℝ^(p/2*q/2*128);

The result is then computed through convolutional layers 5, 6 and 7, all with 3*3 kernels and the relu activation function; layer 6 is a transposed convolution, layers 5 and 7 are convolutions with stride 1; and the numbers of feature channels of layers 5, 6 and 7 are 128, 64 and 1, yielding the attention weight map P of the object to be reconstructed;
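Under the same assumptions, a minimal sketch of the attention weight generation network follows. The finite-difference image gradient is one plausible reading of "computes the gradient of the image" (the text only requires a gradient map of the same shape as the image), and the layer numbering 5-7 follows the text's own numbering.

```python
import torch
import torch.nn as nn

def image_gradient(m: torch.Tensor) -> torch.Tensor:
    """m: (j, 3, p, q). Forward-difference gradient magnitude, same shape as m."""
    gx = torch.zeros_like(m)
    gy = torch.zeros_like(m)
    gx[..., :, :-1] = m[..., :, 1:] - m[..., :, :-1]
    gy[..., :-1, :] = m[..., 1:, :] - m[..., :-1, :]
    return torch.sqrt(gx ** 2 + gy ** 2)

class AttentionWeightNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Conv layers 1-3: 3x3 kernels, relu; strides 1, 2, 1; channels 64, 128, 128.
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 64, 3, stride=1, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, 3, stride=1, padding=1), nn.ReLU(inplace=True),
        )
        # Conv layers 5-7: channels 128, 64, 1; layer 6 is a transposed
        # convolution (stride 2 assumed, restoring the p*q resolution).
        # The loss form suggests P_k should lie in [0, 1]; the text
        # specifies relu for every layer, which is kept here.
        self.decoder = nn.Sequential(
            nn.Conv2d(128, 128, 3, stride=1, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 3, stride=1, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, m: torch.Tensor) -> torch.Tensor:
        """m: (j, 3, p, q) input images. Returns the attention weight map P, (1, 1, p, q)."""
        fused = torch.cat([m, image_gradient(m)], dim=1)  # (j, 6, p, q)
        feats = self.encoder(fused)                       # (j, 128, p/2, q/2)
        pooled = feats.max(dim=0, keepdim=True).values    # max-pool over the j observations
        return self.decoder(pooled)
```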

(3) The attention weight loss L is a pixel-by-pixel loss function, computed as the average of the loss L_k at each pixel: L = (1/(p*q)) · Σ_{k=1}^{p*q} L_k;

The loss L_k at each pixel position consists of two parts: the first is a gradient loss L_gradient with a coefficient term, and the second is a normal loss L_normal with a coefficient term, i.e. L_k = P_k · L_gradient + λ(1 − P_k) · L_normal;

where L_gradient = ‖∇_ζ n_k − ∇_ζ n̂_k‖; here ∇_ζ n_k is the gradient of the true surface normal n of the object to be reconstructed at position k, ζ is the neighborhood pixel range used when computing the gradient and may be set to 1, 2, 3, 4 or 5 (the default in the present invention is 1), and ∇_ζ n̂_k is the gradient of the predicted surface normal n̂ at position k; n̂ denotes the surface normal predicted by the network, and n denotes the true surface normal;

In the network, the gradient loss sharpens the high-frequency expression of the surface normal; P_k is the value of the attention weight map at pixel position k, i.e., the attention weight provides the weight of the first, gradient-loss term L_gradient of the per-pixel loss L_k: where the attention weight is large, the weight of the gradient loss is large;

Second, L_normal = 1 − n_k · n̂_k, where · denotes the dot product; λ is a hyperparameter whose purpose is to balance the gradient loss and the normal loss; here it is set to 8. In general it can be set to {7, 8, 9, 10}, with 8 giving good results;
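The loss just described can be written compactly over whole normal maps. The sketch below assumes the forms L_gradient = ‖∇_ζ n_k − ∇_ζ n̂_k‖ and L_normal = 1 − n_k · n̂_k reconstructed above, with a forward finite difference of step ζ as the neighborhood gradient; the helper names are hypothetical.

```python
import torch

def zeta_gradient(n: torch.Tensor, zeta: int = 1) -> torch.Tensor:
    """n: (1, 3, p, q) normal map. Finite-difference gradient with step zeta."""
    gx = torch.zeros_like(n)
    gy = torch.zeros_like(n)
    gx[..., :, :-zeta] = n[..., :, zeta:] - n[..., :, :-zeta]
    gy[..., :-zeta, :] = n[..., zeta:, :] - n[..., :-zeta, :]
    return torch.cat([gx, gy], dim=1)  # (1, 6, p, q)

def attention_weight_loss(n_pred, n_true, P, lam: float = 8.0, zeta: int = 1):
    """n_pred, n_true: (1, 3, p, q) unit normal maps; P: (1, 1, p, q) weight map."""
    # L_gradient at each pixel: norm of the difference of the zeta-gradients.
    l_grad = (zeta_gradient(n_true, zeta) - zeta_gradient(n_pred, zeta)).norm(dim=1, keepdim=True)
    # L_normal at each pixel: 1 minus the dot product of true and predicted normals.
    l_norm = 1.0 - (n_true * n_pred).sum(dim=1, keepdim=True)
    # L_k = P_k * L_gradient + lam * (1 - P_k) * L_normal, averaged over all p*q pixels.
    return (P * l_grad + lam * (1.0 - P) * l_norm).mean()
```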

Through the attention weight loss of (3), the surface normal generation network of (1) and the attention weight generation network of (2) are linked;

(4) Network training

When training the network, the back-propagation algorithm is used to continuously adjust and optimize, minimizing the above loss function; training stops at 30 epochs (cycles) to achieve the best effect; alternatively, when L_normal falls below 0.03, training is considered to have reached its best effect and is stopped;

In the present invention, training of the network ends after 30 epochs, at which point training is considered to have reached its optimal effect;
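A hedged sketch of such a joint training loop, reusing the modules and loss sketched above, is given below; the Adam optimizer, the learning rate, and the training_set iterator are assumptions, since the text fixes only the stopping criteria (30 epochs, or L_normal below 0.03).

```python
import torch

normal_net = SurfaceNormalNet()
attn_net = AttentionWeightNet()
optimizer = torch.optim.Adam(
    list(normal_net.parameters()) + list(attn_net.parameters()), lr=1e-3)

for epoch in range(30):                          # stop at the set number of cycles
    for images, lights, n_true in training_set:  # hypothetical data iterator
        fused = torch.stack([fuse_image_and_light(m, l).permute(2, 0, 1)
                             for m, l in zip(images, lights)])  # (j, 6, p, q)
        n_pred = normal_net(fused)               # (1, 3, p, q)
        P = attn_net(fused[:, :3])               # the attention net sees only the images
        loss = attention_weight_loss(n_pred, n_true, P)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():                        # early stop once L_normal < 0.03,
        l_normal = (1.0 - (n_true * n_pred).sum(dim=1)).mean()  # checked on the last sample
    if l_normal < 0.03:
        break
```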

(5) The trained network is used for surface-normal reconstruction from photometric stereo images:

First capture s or more images under different illumination directions, with s ≥ 10; input m_1, m_2, ..., m_s and l_1, l_2, ..., l_s into the trained network to obtain the predicted surface normal n̂.

其中pq∈{16、32、48、64},λ∈{7,8,910},ζ可以取1,2,3,4,5。where p , q ∈ {16, 32, 48, 64}, λ ∈ {7, 8, 910}, ζ can take 1, 2, 3, 4, 5.

The reconstruction effect is shown in Fig. 4. The first row shows the images captured of the object to be reconstructed, the second row shows the generated attention weight maps P, and the third row shows the generated surface normals n̂.

Claims (8)

1. A deep-learning-based high-frequency-region-enhanced photometric stereo three-dimensional reconstruction method, characterized by comprising the following steps:

1) using a photometric stereo system, capturing several images of the object to be reconstructed: images of the object to be reconstructed are captured under a single parallel white light source; a Cartesian coordinate system is established with the center of the object as the origin of the coordinate axes, and the position of the white light source is represented by the vector l = [x, y, z] in this coordinate system; the light source position is changed to capture an image under another illumination direction; typically, at least 10 images under different illumination directions are required, denoted m_1, m_2, ..., m_j, with the corresponding light source positions denoted l_1, l_2, ..., l_j, where j is a natural number greater than or equal to 10;

2) using a deep learning algorithm that takes m_1, m_2, ..., m_j and l_1, l_2, ..., l_j as input and outputs an accurate surface-normal three-dimensional reconstruction, the algorithm being divided into four parts: (1) a surface normal generation network, (2) an attention weight generation network, (3) joint training with an attention weight loss function, and (4) network training; wherein:

(1) the surface normal generation network is designed to generate the surface normal n̂ of the object to be reconstructed from the images m_1, m_2, ..., m_j and the illuminations l_1, l_2, ..., l_j;

(2) the attention weight generation network is designed to generate the attention weight map P of the object to be reconstructed from the images m_1, m_2, ..., m_j;

(3) the attention weight loss L is a pixel-by-pixel loss function, computed as the average of the loss L_k at each pixel, L = (1/(p*q)) · Σ_{k=1}^{p*q} L_k, where p*q is the resolution of image m, with p, q ≥ 2^n and n ≥ 4; the loss L_k at each pixel position consists of two parts, a gradient loss L_gradient with a coefficient term and a normal loss L_normal with a coefficient term, i.e. L_k = P_k · L_gradient + λ(1 − P_k) · L_normal; where L_gradient = ‖∇_ζ n_k − ∇_ζ n̂_k‖, ∇_ζ n_k is the gradient of the true surface normal n of the object to be reconstructed at position k, ζ is the neighborhood pixel range used when computing the gradient and may be set to 1, 2, 3, 4 or 5, and ∇_ζ n̂_k is the gradient of the predicted surface normal n̂ at position k; n̂ denotes the surface normal predicted by the network, n denotes the true surface normal, and P_k is the value of the attention weight map at pixel position k; further, L_normal = 1 − n_k · n̂_k, where · denotes the dot product, and λ is a hyperparameter whose purpose is to balance the gradient loss and the normal loss, with a setting range of {7, 8, 9, 10}; through the attention weight loss of (3), the surface normal generation network of (1) and the attention weight generation network of (2) are linked;

(4) network training: when training the network, the back-propagation algorithm is used to continuously adjust and optimize, minimizing the above loss function; training stops when the set number of cycles is reached, to achieve the best effect; alternatively, when L_normal falls below 0.03, training is considered to have reached its best effect and is stopped;

3) using the trained network for surface-normal reconstruction from photometric stereo images: first capturing s or more images under different illumination directions, with s ≥ 10, and inputting m_1, m_2, ..., m_s and l_1, l_2, ..., l_s into the trained network to obtain the predicted surface normal n̂.

2. The deep-learning-based high-frequency-region-enhanced photometric stereo three-dimensional reconstruction method of claim 1, characterized in that the surface normal generation network of (1) generates the surface normal n̂ of the object to be reconstructed from the images m_1, m_2, ..., m_j and the illuminations l_1, l_2, ..., l_j as follows: the resolution of image m is denoted p*q, with p, q ≥ 2^n and n ≥ 4, so m ∈ ℝ^(p*q*3), where 3 denotes the RGB channels; the surface normal generation network first repeat-fills the illumination l = [x, y, z] ∈ ℝ^3 to the resolution p*q of m, into the space ℝ^(p*q*3); the filled illumination is denoted h, with h ∈ ℝ^(p*q*3), so that h and m have the same spatial size; h and m are concatenated along the third dimension into a new tensor belonging to ℝ^(p*q*6); with j input images and illuminations, j fused tensors are obtained; these tensors are each passed through 4 convolutional layers, where layers 1, 2, 3 and 4 all use 3*3 kernels and the relu activation function, layers 2 and 4 are convolutions with stride 2, layers 1 and 3 are convolutions with stride 1, and the numbers of feature channels of layers 1, 2, 3 and 4 are 64, 128, 128 and 256 respectively; a max-pooling layer is then used to pool the j four-layer-convolved tensors in ℝ^(p/4*q/4*256) into a single tensor in ℝ^(p/4*q/4*256); the result is then computed through convolutional layers 5, 6, 7 and 8, which all use 3*3 kernels and the relu activation function, where layers 5 and 7 are transposed convolutions, layers 6 and 8 are convolutions with stride 1, and the numbers of feature channels of layers 5, 6, 7 and 8 are 128, 128, 64 and 3; finally, the tensor obtained from the layer-8 convolution is normalized to unit norm, giving the surface normal n̂ of the object to be reconstructed.

3. The deep-learning-based high-frequency-region-enhanced photometric stereo three-dimensional reconstruction method of claim 1, characterized in that the attention weight generation network of (2) generates the attention weight map P of the object to be reconstructed from the images m_1, m_2, ..., m_j as follows: the attention weight generation network computes the gradient of each image m ∈ ℝ^(p*q*3); the gradient also belongs to the space ℝ^(p*q*3); the gradient is concatenated with the image along the third dimension into a new tensor belonging to ℝ^(p*q*6); with j input images, j fused tensors are obtained; these fused tensors are first each passed through 3 convolutional layers, all with 3*3 kernels and the relu activation function, where layer 2 has stride 2, layers 1 and 3 have stride 1, and the numbers of feature channels of the three layers are 64, 128 and 128 respectively; a max-pooling layer is then used to pool the j three-layer-convolved tensors in ℝ^(p/2*q/2*128) into a single tensor in ℝ^(p/2*q/2*128); the result is then computed through convolutional layers 5, 6 and 7, all with 3*3 kernels and the relu activation function, where layer 6 is a transposed convolution, layers 5 and 7 are convolutions with stride 1, and the numbers of feature channels of layers 5, 6 and 7 are 128, 64 and 1, yielding the attention weight map P of the object to be reconstructed.

4. The deep-learning-based high-frequency-region-enhanced photometric stereo three-dimensional reconstruction method of claim 1, characterized in that, in the resolution p*q of the image m, p takes a value of 16, 32, 48 or 64, and q takes a value of 16, 32, 48 or 64.

5. The deep-learning-based high-frequency-region-enhanced photometric stereo three-dimensional reconstruction method of claim 1, characterized in that ζ is set to 1.

6. The deep-learning-based high-frequency-region-enhanced photometric stereo three-dimensional reconstruction method of claim 1, characterized in that λ is set to 8.

7. The deep-learning-based high-frequency-region-enhanced photometric stereo three-dimensional reconstruction method of claim 1, characterized in that the number of cycles is set to 30 epochs.

8. The deep-learning-based high-frequency-region-enhanced photometric stereo three-dimensional reconstruction method of claim 4, characterized in that p is 32 and q is 32.
CN202111524515.8A 2021-12-14 2021-12-14 High-frequency region enhanced photometric stereo three-dimensional reconstruction method based on deep learning Active CN113936117B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111524515.8A CN113936117B (en) 2021-12-14 2021-12-14 High-frequency region enhanced photometric stereo three-dimensional reconstruction method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111524515.8A CN113936117B (en) 2021-12-14 2021-12-14 High-frequency region enhanced photometric stereo three-dimensional reconstruction method based on deep learning

Publications (2)

Publication Number Publication Date
CN113936117A 2022-01-14
CN113936117B CN113936117B (en) 2022-03-08

Family

ID=79288969

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111524515.8A Active CN113936117B (en) 2021-12-14 2021-12-14 High-frequency region enhanced photometric stereo three-dimensional reconstruction method based on deep learning

Country Status (1)

Country Link
CN (1) CN113936117B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107862741A (en) * 2017-12-10 2018-03-30 中国海洋大学 A kind of single-frame images three-dimensional reconstruction apparatus and method based on deep learning
CN108510573A (en) * 2018-04-03 2018-09-07 南京大学 A method of the multiple views human face three-dimensional model based on deep learning is rebuild
CN109146934A (en) * 2018-06-04 2019-01-04 成都通甲优博科技有限责任公司 A kind of face three-dimensional rebuilding method and system based on binocular solid and photometric stereo
CN110060212A (en) * 2019-03-19 2019-07-26 中国海洋大学 A kind of multispectral photometric stereo surface normal restoration methods based on deep learning
US20210241478A1 (en) * 2020-02-03 2021-08-05 Nanotronics Imaging, Inc. Deep Photometric Learning (DPL) Systems, Apparatus and Methods
CN113538675A (en) * 2021-06-30 2021-10-22 同济人工智能研究院(苏州)有限公司 Neural network for calculating attention weight for laser point cloud and training method
CN113762358A (en) * 2021-08-18 2021-12-07 江苏大学 Semi-supervised learning three-dimensional reconstruction method based on relative deep training

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHENG-JIAN LIN et al.: "A Constrained Independent Component Analysis Based Photometric Stereo for 3D Human Face Reconstruction", 2012 International Symposium on Computer, Consumer and Control *
CHEN Jia et al.: "Application of deep learning in three-dimensional reconstruction of objects from a single image", Acta Automatica Sinica (自动化学报) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114998507A (en) * 2022-06-07 2022-09-02 天津大学 Luminosity three-dimensional reconstruction method based on self-supervision learning
CN115098563A (en) * 2022-07-14 2022-09-23 中国海洋大学 Time sequence abnormity detection method and system based on GCN and attention VAE
CN118628371A (en) * 2024-08-12 2024-09-10 南开大学 Surface normal recovery method, device and storage medium based on photometric stereo

Also Published As

Publication number Publication date
CN113936117B (en) 2022-03-08

Similar Documents

Publication Publication Date Title
CN113936117B (en) High-frequency region enhanced photometric stereo three-dimensional reconstruction method based on deep learning
CN114549731B (en) Method and device for generating visual angle image, electronic equipment and storage medium
Chen et al. Point-based multi-view stereo network
CN106355570B (en) A kind of binocular stereo vision matching method of combination depth characteristic
CN109377530B (en) A Binocular Depth Estimation Method Based on Deep Neural Network
CN110427968B (en) A binocular stereo matching method based on detail enhancement
CN109598754B (en) Binocular depth estimation method based on depth convolution network
CN111915660B (en) Binocular disparity matching method and system based on shared features and attention up-sampling
CN111833393A (en) A binocular stereo matching method based on edge information
CN113313732A (en) Forward-looking scene depth estimation method based on self-supervision learning
CN112802078A (en) Depth map generation method and device
CN112288788B (en) Monocular image depth estimation method
CN107133914A (en) For generating the device of three-dimensional color image and method for generating three-dimensional color image
CN111553296B (en) A Binary Neural Network Stereo Vision Matching Method Based on FPGA
CN112509021A (en) Parallax optimization method based on attention mechanism
CN102903111B (en) Large area based on Iamge Segmentation low texture area Stereo Matching Algorithm
CN115631223A (en) Multi-view stereo reconstruction method based on self-adaptive learning and aggregation
CN115511708A (en) Depth map super-resolution method and system based on uncertainty-aware feature transmission
CN112991504B (en) An Improved Hole Filling Method Based on TOF Camera 3D Reconstruction
JP7398938B2 (en) Information processing device and its learning method
CN117152330B (en) Point cloud 3D model mapping method and device based on deep learning
Chen et al. Pisr: Polarimetric neural implicit surface reconstruction for textureless and specular objects
CN109934863B (en) A light field depth information estimation method based on densely connected convolutional neural network
Yang et al. A new RBF reflection model for shape from shading
JP7508673B2 (en) Computer vision method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant